Linear Regression is a statistical method for measuring the relationship between variables. One variable is the explanatory or independent variable, and the other is the dependent variable, also called the response or outcome variable.
It was developed in the field of statistics and is both a statistical algorithm and a machine learning algorithm. Over the years it has become one of the most basic yet powerful tools of Data Science.
The variable being predicted is called the criterion variable and is referred to as Y. The variable that the prediction is based on is called the predictor variable and is referred to as X. When there is a single input variable (X), the method is referred to as simple linear regression; when there are multiple input variables, it is referred to as multiple linear regression.
Linear regression consists of finding the best-fitting straight line through the points. The best-fitting line is called a regression line.
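For simple linear regression, this best-fitting line is usually written as below, with the intercept and slope chosen as the standard least-squares estimates (this notation is a common convention, not something defined in the original article):

```latex
\hat{y} = b_0 + b_1 x, \qquad
b_1 = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sum_i (x_i - \bar{x})^2}, \qquad
b_0 = \bar{y} - b_1 \bar{x}
```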
Let us understand this with a simple example:
Example: The height of a tree depends on the amount of water it receives.
In this case water is the independent variable, whereas the height of the tree is the dependent variable, as it depends on the amount of water it receives.
“X” is the independent variable (water in gallons) = 2, 3, 4, 5, 6
“Y” is the dependent variable (height in feet) = 3, 5, 8, 10, 11
[Figure: scatter plot of the example data with the fitted regression line]
The black diagonal line in the above figure is the regression line; it gives the predicted score on Y for each possible value of X. The points are the actual data. As we can see, some points lie very near the regression line and have a small error of prediction, while points that lie farther away have a larger error.
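As a minimal sketch, the regression line and its prediction errors for the small water/height example above can be computed with NumPy as follows (the variable names and the use of NumPy are illustrative assumptions, not part of the original article):

```python
import numpy as np

# Example data from the article: water received (gallons) and tree height (feet).
water = np.array([2, 3, 4, 5, 6], dtype=float)     # independent variable X
height = np.array([3, 5, 8, 10, 11], dtype=float)  # dependent variable Y

# Least-squares estimates of slope and intercept for the line: height = b0 + b1 * water.
b1 = np.sum((water - water.mean()) * (height - height.mean())) / np.sum((water - water.mean()) ** 2)
b0 = height.mean() - b1 * water.mean()
print(f"slope b1 = {b1:.2f}, intercept b0 = {b0:.2f}")

# Predicted heights on the regression line and the errors of prediction (residuals).
predicted = b0 + b1 * water
residuals = height - predicted
print("predicted:", np.round(predicted, 2))
print("residuals:", np.round(residuals, 2))
```

Points with small residuals sit close to the fitted line, while points with large residuals are the ones lying farther from it.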
Conclusion: Linear regression is widely used in industry for predictive analysis. The overall idea of regression is to examine two things: whether the predictor variables do a good job of predicting the outcome variable, and which variables in particular are significant predictors of it. It is a simple approach for making predictions from data that follows a linear trend.
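In practice, such predictive models are usually fitted with a library rather than by hand. The snippet below is a sketch using scikit-learn's LinearRegression, which is one common choice; the article itself does not name a specific library, so this is only an illustrative assumption:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

water = np.array([[2], [3], [4], [5], [6]])  # X must be 2-D (samples x features) for scikit-learn
height = np.array([3, 5, 8, 10, 11])

model = LinearRegression().fit(water, height)
print(model.intercept_, model.coef_[0])  # same b0 and b1 as the manual least-squares calculation
print(model.predict([[7]]))              # predicted height for a tree receiving 7 gallons of water
```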