First contact with what they call Artificial Intelligence, or sometimes also Machine Learning, or wait … and Deep Learning?
As you have probably seen, the use of these terms has grown enormously in recent years, even though they have existed for a long time. They are used so often that the concepts frequently get confused with one another; in case you don’t know yet, the three terms have different, although related, meanings.
The most general of the three is Artificial Intelligence (AI), which includes the other two. Rather than a technique to be applied, AI is defined as a branch of science (specifically, computer science) concerned with the ability of a machine to solve a problem as a person would. It tries to handle a situation coherently and as accurately as possible given the environment in which it operates. As you can see, it has been around for a long time. Who has never played chess or checkers against a machine? However, will that machine be just as good at solving another problem in a different environment? This ability to adapt from one environment to another is the biggest challenge AI faces, and it remains out of reach for now.
If, in addition to solving a problem as a person would, we want the machine to learn by itself, then we are talking about the second and most familiar term: Machine Learning (ML). Once the three concepts are clear, we will look at it in more detail, since I consider it the most important part.
Finally, Deep Learning (DL) is a specific part of Machine Learning. Some articles consider Deep Learning the closest thing to a practical simplification of the biological processes that happen in our neurons, since each layer extracts increasingly abstract features. Other sources consider Deep Learning the most flexible current technique for machine learning, but also possibly the most demanding in its consumption of resources. It is generally accepted, however, that it makes accurate and very specific decisions only when the amount of data is large enough, and that it is still far from the real processes of the human brain.
Now that it is clear that Artificial Intelligence, Machine Learning and Deep Learning are related terms, that each one contains the next, and that they nevertheless mean different things, we can go a little deeper into Machine Learning.
Simplifying as much as possible, applying ML consists of taking a data set, feeding it as input into one or several ML models to train them, and verifying that the result is consistent with the objective we have in mind.
However, it is not as simple as it may seem. Let us briefly study the steps to follow.
Since information is power, taking the time to look at the data, understand what we have, and see how the variables relate to each other will give us a great advantage in the next steps. These are some of the techniques used to clean and prepare the selected data set:
- Representation of the variables in different graphs. This can be very useful to get a sense of the values of each variable. With it we can detect possible errors in the data, such as negative values in variables that do not admit them, find outliers, see whether the variables follow a normal distribution or not … In short, information that helps us understand the data.
- Null value detection. Sometimes it is convenient to replace null values with the average of the other values in the same column, or to use other variables to compute an average per group. In other cases it may be better to enter a new default value, or even to leave them null as they came. Again, this will be easier if we know the data well.
- Reduction of the number of variables. Some variables may provide no new information, or their information may be exactly deducible from other variables, so we could do without them. Conversely, even when a variable provides a small amount of extra information, it may be worth giving up that contribution to avoid overfitting and to obtain a simpler or more appropriate model. In any case, these candidate variables can be detected either manually (if there are not too many input variables) or automatically, with the help of other models created for this purpose.
- Regularization. Instead of reducing the number of variables, this method reduces the values or ranges of the parameters. It aims to control the complexity of the algorithm by adding a penalty term to the objective function of the model. It is used when we have many input variables and each of them provides useful information to predict the output variable, and/or when our model fits the input sample almost perfectly but performs much worse on new predictions, that is to say, when it is not able to generalize correctly.
- Create new variables from the data. Here is an example to explain it better. Suppose the data has a column “countries” that only takes the values “Spain”, “France” and “Portugal”. Instead of adding the variable “countries” of type String, we could add the binary variables “country_Spain”, “country_France” and “country_Portugal”, each taking the value one or zero depending on whether the input belongs to that country or not. Everything will always depend on the model to be applied.
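Two of the techniques above, replacing null values with the column average and turning a text column into binary variables, can be sketched in a few lines of plain Python. This is only an illustration under assumptions: the records, the "age" column and the helper names `fill_missing_with_mean` and `one_hot` are made up for the example, and the text column is called "country" so that the generated names match “country_Spain” and friends.

```python
from statistics import mean

# Hypothetical toy records; None marks a missing (null) value.
rows = [
    {"age": 34, "country": "Spain"},
    {"age": None, "country": "France"},
    {"age": 28, "country": "Portugal"},
    {"age": 40, "country": "Spain"},
]


def fill_missing_with_mean(rows, column):
    """Return copies of the rows with None in `column` replaced by the column mean."""
    known = [row[column] for row in rows if row[column] is not None]
    column_mean = mean(known)
    return [
        {**row, column: column_mean if row[column] is None else row[column]}
        for row in rows
    ]


def one_hot(rows, column):
    """Replace a text column with one binary (0/1) column per distinct value."""
    categories = sorted({row[column] for row in rows})
    encoded = []
    for row in rows:
        # Copy every column except the one being encoded.
        new_row = {key: value for key, value in row.items() if key != column}
        for category in categories:
            new_row[f"{column}_{category}"] = 1 if row[column] == category else 0
        encoded.append(new_row)
    return encoded


prepared = one_hot(fill_missing_with_mean(rows, "age"), "country")
for row in prepared:
    print(row)
```

With real data you would more likely rely on a library; Pandas, for instance, offers `fillna` and `get_dummies` for these same two steps.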
As a result, I would dare to say that this is the most complex part of the process: achieving good data quality in the chosen set. The better the quality, knowledge and cleanliness of the data, the better the results we can expect. It is worth spending time on this part to avoid, as the saying in this field goes, “Garbage in, garbage out”.
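The penalty-term idea from the list above (commonly called regularization) is easier to see with numbers. The sketch below assumes one specific form, a squared (ridge-style) penalty on a single weight with no intercept; the data and the function name `fit_ridge_1d` are hypothetical. It shows how a larger penalty shrinks the parameter value.

```python
def fit_ridge_1d(xs, ys, penalty):
    """Fit y ~ w * x by minimizing sum((y - w*x)^2) + penalty * w^2.

    Setting the derivative with respect to w to zero gives the closed form
    w = sum(x*y) / (sum(x*x) + penalty).
    """
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_xx = sum(x * x for x in xs)
    return sum_xy / (sum_xx + penalty)


# Hypothetical data that roughly follows y = 2 * x.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]

for penalty in (0.0, 1.0, 10.0, 100.0):
    weight = fit_ridge_1d(xs, ys, penalty)
    print(f"penalty={penalty:6.1f}  weight={weight:.3f}")
```

With a penalty of zero this is ordinary least squares; as the penalty grows, the weight is pulled toward zero, which trades some fit on the training sample for a simpler model.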
This part is not about inventing a new mathematical model, but about training models that already exist and that learn in one way or another depending on the data they receive as input. Since there are countless models, we will explain the main types and name the most common ones.
There are mainly three types of Machine Learning algorithms:
- Supervised learning: These algorithms take as input a sample of data for which the result to be predicted is known. The algorithm is trained to identify patterns in that sample and build a prediction model, which is corrected whenever its predictions are wrong. Some examples are decision trees, Naive Bayes classifiers, least squares regression, logistic regression, SVM (Support Vector Machines) …
- Unsupervised learning: These algorithms organize and group data based on properties the model has detected by studying the relationships and correlations in the data, without known results to guide them. Some examples are clustering algorithms, principal component analysis, singular value decomposition …
- Reinforcement learning: Given an objective, this type of algorithm tries to achieve it through a series of defined actions. The model is trained through rewards associated with these actions, and it also learns from its mistakes, since it feeds on its own results. Over many training runs, the algorithm is gradually optimized to reach its goal in the best possible way.
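To make the difference between the first two types concrete, here is a minimal sketch in plain Python. The supervised function receives inputs together with their known results and fits a least squares line; the unsupervised one receives only unlabeled numbers and groups them by itself (a one-dimensional k-means with two groups). Both function names and all the data are hypothetical, chosen just for illustration.

```python
def fit_line(xs, ys):
    """Supervised: least squares fit of y ~ slope * x + intercept from labeled pairs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    return slope, mean_y - slope * mean_x


def two_means(values, iterations=10):
    """Unsupervised: split unlabeled numbers into two groups (1-D k-means, k=2).

    Assumes the list contains at least two distinct values.
    """
    low, high = min(values), max(values)  # initial centers: the two extremes
    for _ in range(iterations):
        # Assign each value to its nearest center, then move the centers.
        group_low = [v for v in values if abs(v - low) <= abs(v - high)]
        group_high = [v for v in values if abs(v - low) > abs(v - high)]
        low = sum(group_low) / len(group_low)
        high = sum(group_high) / len(group_high)
    return group_low, group_high


# Supervised: the known results (ys) guide the training.
slope, intercept = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
print("prediction for x=5:", slope * 5 + intercept)

# Unsupervised: no results are given, only the values themselves.
print(two_means([1.0, 1.2, 0.8, 9.0, 9.5, 10.1]))
```

In practice, libraries such as scikit-learn provide tested implementations of these and many other algorithms, so hand-written versions like these are mainly useful for understanding what happens inside.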
CHECK THE RESULT
After running the model multiple times to refine its learning, you must analyze the result (error analysis) and study how good or bad the model’s predictions have been.
Perhaps the model we are using is not the right one and we need a more complex one, or even a simpler one. It may also be that more data is needed to learn from, or more variables from which to draw relationships or information, or both.
That is why, if we fail to achieve what we expected, we may have to return to the first step, where we emphasized the importance of knowing the data sample in detail.
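One common way to carry out this check is to keep part of the data aside, train on the rest, and compare the error on both parts against a trivial baseline. The sketch below does this with a least squares line and the mean squared error. Everything in it is an assumption made for illustration: the "hours studied" data, the choice of held-out points and the helper names.

```python
def mean_squared_error(actual, predicted):
    """Average of the squared differences; lower is better."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def fit_line(xs, ys):
    """Least squares fit of y ~ slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    return slope, mean_y - slope * mean_x


# Hypothetical data: hours studied -> exam score.
hours = [1, 2, 3, 4, 5, 6, 7, 8]
scores = [52, 55, 61, 64, 70, 73, 79, 82]

# Hold out two points for testing; train on the remaining six.
test_positions = {3, 7}
train_x = [h for i, h in enumerate(hours) if i not in test_positions]
train_y = [s for i, s in enumerate(scores) if i not in test_positions]
test_x = [h for i, h in enumerate(hours) if i in test_positions]
test_y = [s for i, s in enumerate(scores) if i in test_positions]

slope, intercept = fit_line(train_x, train_y)

train_error = mean_squared_error(train_y, [slope * x + intercept for x in train_x])
test_error = mean_squared_error(test_y, [slope * x + intercept for x in test_x])

# Baseline: always predict the average training score, ignoring the input.
baseline = sum(train_y) / len(train_y)
baseline_error = mean_squared_error(test_y, [baseline] * len(test_y))

print(f"train error:    {train_error:.2f}")
print(f"test error:     {test_error:.2f}")
print(f"baseline error: {baseline_error:.2f}")
```

As a rough guide, a test error much larger than the training error suggests that the model is not generalizing (a simpler model or more data may help), while high errors on both parts, close to the baseline, suggest that the model is too simple or that the variables carry too little information.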
TOOLS TO BEGIN
I didn’t want to finish without naming some of the most common tools used to start “messing around” with Machine Learning.
First, if you do not have a data set to start from, there are some ready to download on the Internet, such as Titanic passengers, MNIST, Fashion-MNIST and the classification of plants according to their petals (the Iris data set).
After finding a data set, the best way to understand how everything works is to use notebooks. Jupyter, Google Colaboratory and RStudio are all quite useful, and hosted options such as Colaboratory do not require any previous installation.
Regarding the programming language, there is and will continue to be a debate about whether to use Python, R or Matlab, but any of them is perfectly valid for getting started.
Finally, if the language chosen is Python, these are some of the most useful libraries: NumPy, Pandas, Matplotlib, Seaborn, TensorFlow, NLTK, scikit-learn (Sklearn) …
To summarize, remember that Deep Learning is a specific part of Machine Learning, which in turn belongs to the field of Artificial Intelligence.
On the other hand, we have also seen that entering the world of Machine Learning is not as simple as it might seem, since it requires paying close attention to the details and knowing the data we are going to use in our model. And it is not only the data that has to be “pampered”, but the whole process from beginning to end: from data preparation, through the choice of the model and its training, to the assessment of whether the results are consistent with the objective set at the beginning.