Machine Learning — Key Terminology

Arjit Sharma
4 min read · Jun 27, 2019

Below are key Machine Learning terms that I thought I would list down, as I believe these are must-knows; being able to differentiate between them helps you learn ML in a better way.

I understand this is not the entire list of terms, but I will keep updating this article as I come across new terminology.

  1. Hypothesis: A hypothesis is a certain function that we believe (or hope) is similar to the true function, the target function that we want to model. In the context of email spam classification, it would be the rule we came up with that allows us to separate spam from non-spam emails. In a single phrase, you could call it an "educated guess".
  2. Model: A model is something that, when you give it an input, produces an output. In ML, any 'object' created by training an ML algorithm is a model. To generate a machine learning model, you need to provide training data to a machine learning algorithm to learn from. For example, the linear regression algorithm is a technique to fit points to a line y = mx + c.
    After fitting, you might get, say, y = 10x + 4 as the best match for your problem. That is a model. I want every model to take me as close as possible to my target function; that is my success.
  3. Target Function: The target function f(x) = y is the actual function f that we want to model, using a hypothesis. In one word, this is the result we want. In order to do machine learning, there should exist a relationship (pattern) between the input and output values; let's say this is the function y = f(x), known as the target function.
  4. Training Sample: The training set data that we use to train our learning algorithm. While training our machine learning algorithm, we pass training data to it. The learning algorithm finds patterns in the training data such that the input parameters correspond to the target.
    The output of the training process is a machine learning "model", which we can then use to make predictions. This process is also called "learning".
  5. Learning Algorithm: Our goal is to find or approximate the target function, and the learning algorithm is a set of instructions that tries to model the target function using our training dataset. When we say "linear regression algorithm", we mean a set of functions with the characteristics defined by linear regression, and from that set we will choose the one function that best fits the training data.
  6. Regression: A simple technique to model or predict a dependent variable (y) using independent variables (x1, x2, etc.). In simple linear regression, there is only one independent variable, x, and one dependent variable, y.
    For example: calculating the price (y) of a ticket for a World Cup cricket match based on parameters like the nations playing (x1), the host stadium (x2), the type of match, league or qualifier (x3), and so on.
  7. Classification: A technique for determining the class the dependent variable belongs to, based on one or more independent variables. Given one or more inputs, a classification model will try to predict the value of one or more outcomes.
    Classification is used for predicting discrete responses.
    For example: identifying whether a fruit is an apple or a pear; here, apple and pear are the classes.
  8. Features: The key identified properties based on which we would like to predict the result of our problem statement.
    For example, my problem statement: will the price of Ether go up next year? ;-)
    Features influencing that might be: x1: number of Ethereum use cases implemented, x2: governments endorsing crypto-currencies, x3: investment of private firms in blockchain, x4: use cases implemented in production, and so on and so forth.
  9. Cost Function: Whenever we train a model with our data, we are actually producing new (predicted) values for a specific feature. However, that feature already has real values in the dataset. We know that the closer the predicted values are to their corresponding real values, the better the model.
    We use a cost function to measure how close the predicted values are to their corresponding real values. Our main aim is to minimise the cost function for our model.
  10. Overfitting: A model is overfitting if it fits the training data too well but generalises poorly to new data; the function our algorithm produces is unable to predict the correct output for unseen inputs.
    For example: to predict the winner of a cricket match, we consider not only the team strength, home ground, and pitch, but also, say, the size of the audience or what the players ate for breakfast.
  11. Under-fitting: The opposite of overfitting: when we do not enrich our algorithm with a sufficient amount of input variables (or a flexible enough model), the output does not give the correct outcome.
  12. Gradient Descent: An optimisation algorithm (there are other optimisation algorithms as well) whose responsibility is to find the minimum cost value by trying the model with different weights, or rather, by iteratively updating the weights.
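The linear regression model from point 2 can be sketched in a few lines of plain Python. This is a minimal illustration of fitting y = mx + c by ordinary least squares; the data points are invented for illustration and follow y = 10x + 4 exactly, so the learned model should recover those coefficients.

```python
# Fit y = m*x + c to sample points by ordinary least squares.

def fit_line(xs, ys):
    """Return slope m and intercept c of the least-squares line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Closed-form solution for simple linear regression.
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    m = cov_xy / var_x
    c = mean_y - m * mean_x
    return m, c

xs = [0, 1, 2, 3, 4]
ys = [10 * x + 4 for x in xs]   # target function: y = 10x + 4
m, c = fit_line(xs, ys)
print(m, c)  # the learned model: m = 10.0, c = 4.0
```

Here the target function generates the training sample, the least-squares formula plays the role of the learning algorithm, and the pair (m, c) it returns is the model.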
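The cricket-ticket example from point 6 can be sketched as a hypothetical regression model: the price is predicted from several independent variables. The features, weights, and intercept here are all invented purely for illustration; a real model would learn the weights from historical ticket data.

```python
# A hypothetical regression model for ticket price: y = b + w1*x1 + w2*x2 + w3*x3.

def predict_ticket_price(nation_rank, stadium_capacity, is_qualifier):
    # x1: popularity rank of the playing nations (lower = more popular)
    # x2: host stadium capacity, in thousands of seats
    # x3: 1 for a qualifier match, 0 for a league match
    return 50.0 - 2.0 * nation_rank + 0.5 * stadium_capacity + 30.0 * is_qualifier

print(predict_ticket_price(1, 60, 1))   # a big qualifier in a large stadium: 108.0
print(predict_ticket_price(8, 20, 0))   # a smaller league fixture: 44.0
```

Note the output is a continuous number, which is what distinguishes regression from classification.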
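The apple-vs-pear example from point 7 can be sketched as a toy classifier. The single feature ("elongation", height divided by width) and its threshold are invented for illustration; a real classifier would learn such a decision boundary from labelled training data.

```python
# A toy classifier: map one input feature to a discrete class label.

def classify_fruit(elongation):
    """Predict 'apple' or 'pear' from the fruit's height-to-width ratio."""
    return "pear" if elongation > 1.2 else "apple"

print(classify_fruit(1.0))  # apple: roughly as tall as it is wide
print(classify_fruit(1.5))  # pear: noticeably taller than wide
```

In contrast to regression, the output here is one of a fixed set of discrete classes rather than a continuous value.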
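One common cost function, mean squared error, makes point 9 concrete: it averages the squared gaps between predicted and real values, so a model with predictions closer to reality gets a smaller cost. The numbers below are made up for illustration.

```python
# Mean squared error: measures how close predictions are to the real values.

def mse(predicted, actual):
    n = len(predicted)
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / n

actual    = [3.0, 5.0, 7.0]   # real values from the dataset
good_pred = [3.1, 4.9, 7.2]   # close to reality -> small cost
bad_pred  = [1.0, 8.0, 2.0]   # far from reality -> large cost

print(mse(good_pred, actual))  # small
print(mse(bad_pred, actual))   # much larger
```

Training a model then amounts to searching for the weights that minimise this number.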
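Gradient descent from point 12 can be sketched for the simple line model y = mx + c with the mean-squared-error cost: at each step the weights move a small amount (the learning rate) against the gradient of the cost, so the cost shrinks over iterations. The data, learning rate, and step count below are chosen only for illustration.

```python
# Gradient descent minimising the MSE cost for the model y = m*x + c.

def gradient_descent(xs, ys, lr=0.01, steps=5000):
    m, c = 0.0, 0.0                      # initial weights
    n = len(xs)
    for _ in range(steps):
        preds = [m * x + c for x in xs]
        # Partial derivatives of MSE with respect to m and c.
        grad_m = (2 / n) * sum((p - y) * x for p, y, x in zip(preds, ys, xs))
        grad_c = (2 / n) * sum(p - y for p, y in zip(preds, ys))
        m -= lr * grad_m                 # step against the gradient
        c -= lr * grad_c
    return m, c

xs = [0, 1, 2, 3, 4]
ys = [2 * x + 1 for x in xs]             # true relationship: y = 2x + 1
m, c = gradient_descent(xs, ys)
print(round(m, 2), round(c, 2))          # approaches m = 2, c = 1
```

Unlike the closed-form least-squares fit, this iterative search generalises to models and cost functions where no closed-form solution exists.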

I have tried to gather this material from different sources on the internet (including Medium blogs, Quora, and Stack Overflow) along with some of my own understanding, and place it on a single plate. I will make sure to enrich it as and when I come across new terminology or better definitions of the above terms. If it helps you, please make sure it reaches others; a clap may help :-).
