Machine learning (ML) is often described as the most promising area of AI, and the ML market is projected to reach $39.98 billion by 2025. But ML is a complex process that requires a team with data expertise, technology, and tools. Some of the work involved in building ML models can be automated with AutoML tools.
The Main Task Of AutoML Is Routine Automation
AutoML tools are designed to automate machine learning processes, and they work well on common, repetitive tasks. According to the participants of the discussion, AutoML tools are needed to free data engineers and data scientists from routine work. So far, however, they cannot completely replace data scientists.
“Ideally, AutoML should fully automate every stage of working with ML. But in practice, it is impossible to automate everything. Therefore, AutoML solutions are needed to automate the routine and quickly get adequate solutions without wasting the efforts of data engineers and, even more so, data scientists.”
There Is No Universal Tool For Everyone
Each type of task requires a specific AutoML tool. At the same time, it is necessary to consider the goals of using machine learning and the teams’ competencies.
For example, if a company has data scientists who work with code, know how to write queries, and call the necessary libraries, industrial-grade solutions are more suitable:
- H2O;
- DataRobot;
- auto-sklearn;
- AutoGluon;
- TPOT.
These tools require expertise, but in return they tend to produce higher-quality, more accurate models.
For teams without that expertise, no-code platforms are a better fit – for example, Pecan. You don’t need programming skills to work with them, but their accuracy tends to be lower.
AutoML Tools Are Effective Only In Conjunction With A Data Scientist
Automation tools simplify a person’s work and help to find hidden dependencies and patterns. They also allow you to find non-obvious but effective solutions when building models. Even so, data engineers and data scientists are still needed at many stages of working with machine learning models: they must formalize the task, select variables, adjust parameters, and interpret the result.
“It is too early to talk about AutoML in the broad sense, as a system that completely solves a business problem using machine learning. The work of an ML specialist remains largely decisive. The questions of productizing the resulting machine learning model and A/B testing it remain open.”
When Choosing An AutoML Solution For Business, There Are Many Parameters To Consider
- Initial machine learning problem. There are no universal AutoML tools, so you need to understand what exactly the solution should help with.
- Team composition. If the company does not have a large data science (DS) team, it is better to choose a no-code solution – even analysts can work with it. If there are specialists, industrial-grade solutions will be the best option – they give higher accuracy, although they require programming and fine-tuning.
- Modularity. An ML automation tool needs to be able to fully solve specific business tasks. Therefore, it is better to choose modular tools – solutions that you can adapt yourself by adding objective functions, heuristics, rules, or other parameters.
Modularity makes an AutoML tool flexible and able to handle many kinds of tasks. It allows you to cover a wide range of tasks using just one framework. The modular tool adapts to the tasks of the business, rather than the business adapting to the capabilities of the tool.
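To make the idea of a pluggable objective function concrete, here is a minimal sketch in plain Python. It is not the API of any tool named in this article: the function names (`tune_threshold`, `make_cost_objective`), the toy scores and labels, and the cost values are all assumptions made up for illustration. The same search routine is run twice, once with plain accuracy and once with a business-specific cost objective.

```python
from typing import Callable, Sequence

# An objective maps (true labels, predicted labels) to a score; higher is better.
Objective = Callable[[Sequence[int], Sequence[int]], float]

def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Generic objective: share of correct predictions."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def make_cost_objective(fp_cost: float, fn_cost: float) -> Objective:
    """Hypothetical business objective: negated total cost of the errors."""
    def objective(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
        fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
        fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
        return -(fp * fp_cost + fn * fn_cost)
    return objective

def tune_threshold(scores, y_true, objective: Objective, candidates):
    """The 'framework' part: it only knows how to search, not what is good."""
    def value(threshold: float) -> float:
        y_pred = [int(s >= threshold) for s in scores]
        return objective(y_true, y_pred)
    return max(candidates, key=value)

# Toy model scores and true labels (illustrative values only).
scores = [0.1, 0.3, 0.4, 0.45, 0.6, 0.7, 0.8, 0.9]
labels = [0, 1, 0, 0, 0, 1, 1, 1]
candidates = [0.2, 0.35, 0.5, 0.65]

by_accuracy = tune_threshold(scores, labels, accuracy, candidates)
# Assume a missed positive costs ten times more than a false alarm.
by_cost = tune_threshold(
    scores, labels, make_cost_objective(fp_cost=1, fn_cost=10), candidates
)
print(by_accuracy, by_cost)
```

On this toy data the two objectives pick different thresholds, which is the point of modularity: the search code stays the same, and only the objective plugged into it changes to reflect the business task.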
The panelists noted that it is better to choose open-source tools: this lets you study how the solution works and change it yourself.
AutoML Solutions Will Continue To Evolve
As machine learning and artificial intelligence develop, AutoML tools are in growing demand. But existing solutions are limited in the data types, formats, and other parameters they support, so they still need to be improved. Representatives of Russian companies expect that in the near future:
AutoML approaches will be able to cover new areas – time series, signal processing, SVI, NLP, and others. This will expand the scope of AutoML.
“The current state of AutoML libraries makes it possible to reduce the complexity of ML solutions by automating individual stages of the machine-learning pipeline. The libraries provide a convenient API to the parametric families of algorithms and automate the selection of hyperparameters, feature selection, ensembling, and model selection. But algorithms for data preprocessing and quality control, automatic generation of features specific to a particular applied task, budget management, and learning strategy still need to be developed. Open-source solutions are designed primarily for tabular data and only partially cover the processing of time series and natural language.”
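The automated selection of models and hyperparameters mentioned in the quote can be illustrated with a small sketch in plain Python. It is a deliberately simplified stand-in, not how any particular AutoML library works: the two model families (`constant` and `knn`), the search space, the synthetic data, and all function names are assumptions chosen for illustration. Every candidate is scored with cross-validation, and the best one is kept.

```python
import random
from statistics import mean

def constant_predict(train, x):
    """Baseline family: always predict the mean training target."""
    return mean(y for _, y in train)

def knn_predict(train, x, k):
    """k-nearest-neighbours family: average the k closest training targets."""
    nearest = sorted(train, key=lambda point: abs(point[0] - x))[:k]
    return mean(y for _, y in nearest)

# Search space: each model family with its hyperparameter candidates.
PREDICTORS = {"constant": constant_predict, "knn": knn_predict}
SEARCH_SPACE = {
    "constant": [{}],
    "knn": [{"k": k} for k in (1, 3, 5, 9)],
}

def cv_mse(data, family, params, folds=5):
    """Mean squared error of one candidate under interleaved k-fold CV."""
    predict = PREDICTORS[family]
    errors = []
    for i in range(folds):
        valid = data[i::folds]
        train = [p for j, p in enumerate(data) if j % folds != i]
        errors.extend((predict(train, x, **params) - y) ** 2 for x, y in valid)
    return mean(errors)

def auto_select(data):
    """Score every (family, hyperparameters) pair and return the best one."""
    scored = [
        (cv_mse(data, family, params), family, params)
        for family, grid in SEARCH_SPACE.items()
        for params in grid
    ]
    return min(scored, key=lambda item: item[0])

# Synthetic tabular data: y is roughly 2 * x plus seeded noise.
rng = random.Random(0)
data = [(float(x), 2.0 * x + rng.gauss(0, 1)) for x in range(40)]

best_score, best_family, best_params = auto_select(data)
print(best_family, best_params, round(best_score, 2))
```

Real libraries search far larger spaces with smarter strategies and time budgets, but the loop has the same shape. The parts the quote lists as still open (preprocessing, task-specific features, budget management) are exactly what this sketch leaves to a person.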