The equation y = mx + b represents a straight line and is foundational in linear regression.
Here’s what the terms mean:
m: The slope, determining the steepness of the line.
b: The intercept, where the line crosses the Y-axis.
x: The input variable or predictor.
y: The output or predicted value.
Use the sliders to adjust the slope and intercept, and observe how the line dynamically updates on the graph.
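As a minimal sketch of what the line equation computes, the snippet below evaluates y = mx + b for a few inputs. The function name predict and the slope and intercept values are assumptions chosen for illustration; they are not taken from the tool.

```python
def predict(x, m, b):
    """Return the predicted y for input x on the line y = m*x + b."""
    return m * x + b

# Hypothetical slider settings: slope 2, intercept 1.
xs = [0, 1, 2, 3]
ys = [predict(x, m=2.0, b=1.0) for x in xs]
print(ys)  # [1.0, 3.0, 5.0, 7.0]
```

Raising m makes consecutive outputs spread further apart (a steeper line), while changing b shifts every output by the same amount.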
Visualizing Errors
In linear regression, an error is the difference between the actual value and the predicted value of the dependent variable (Y). Errors help us measure how well our regression line fits the data, and they are critical for improving the model.
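The definition above can be sketched in a few lines: for each data point, subtract the line's prediction from the actual value. The data points, the candidate line, and the helper name residuals are made up for illustration.

```python
def residuals(xs, ys, m, b):
    """Return actual minus predicted for each point, given the line y = m*x + b."""
    return [y - (m * x + b) for x, y in zip(xs, ys)]

# Hypothetical data points and a candidate line with slope 2, intercept 0.
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.5, 5.0]

errs = residuals(xs, ys, m=2.0, b=0.0)
total_abs = sum(abs(e) for e in errs)

print(errs)       # [0.0, 0.5, -1.0]
print(total_abs)  # 1.5
```

A positive error means the point lies above the line and a negative error means it lies below. Summing the absolute values keeps errors of opposite sign from cancelling.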
Why Errors Are Important:
Errors quantify how far our predictions are from the actual values.
They help us choose the best regression line by minimizing the overall error.
They are crucial in optimizing and evaluating machine learning models.
Types of Errors:
Mean Absolute Error (MAE): This is the average of the absolute differences between the predicted values and actual values:
MAE = (1/n) Σ |yᵢ - ŷᵢ|
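MAE is simple to interpret because it is expressed in the same units as Y. Every error counts in proportion to its size, so MAE is less sensitive to outliers than MSE; the trade-off is that one large deviation is penalized no more than several small ones of the same total size.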
Mean Squared Error (MSE): This is the average of the squared differences between the predicted values and actual values:
MSE = (1/n) Σ (yᵢ - ŷᵢ)²
MSE is widely used because squaring penalizes large errors more strongly than MAE does. For the same reason, a few outliers can dominate its value.
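The two formulas can be compared side by side with a small sketch. The function names and the data are assumptions for illustration; the two prediction sets are chosen so that the MAE is identical but one of them concentrates all of the error in a single point.

```python
def mae(actual, predicted):
    """Mean Absolute Error: average of |y_i - y_hat_i|."""
    n = len(actual)
    return sum(abs(y - y_hat) for y, y_hat in zip(actual, predicted)) / n

def mse(actual, predicted):
    """Mean Squared Error: average of (y_i - y_hat_i)^2."""
    n = len(actual)
    return sum((y - y_hat) ** 2 for y, y_hat in zip(actual, predicted)) / n

actual = [1.0, 2.0, 3.0, 4.0]
spread_out = [2.0, 3.0, 4.0, 5.0]  # every prediction is off by 1
one_big = [1.0, 2.0, 3.0, 8.0]     # three exact predictions, one off by 4

print(mae(actual, spread_out), mse(actual, spread_out))  # 1.0 1.0
print(mae(actual, one_big), mse(actual, one_big))        # 1.0 4.0
```

Both prediction sets have the same MAE, but the MSE is four times larger when the error is concentrated in one point, which is the sensitivity to large errors described above.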
Key Points:
Lower MAE or MSE indicates a better fit of the regression line to the data.
MSE is commonly used for its mathematical properties and sensitivity to large errors.
Both metrics help us evaluate and compare regression models.
Visual Regression Trainer
Click Reset first to refresh data points.
Understanding Linear Regression and Gradient Descent
The Role of Gradient Descent
Gradient descent is an optimization algorithm used to minimize the error (or cost) function. The cost function in linear regression is defined as:
J(m, b) = (1/(2n)) Σ (yᵢ - (m·xᵢ + b))²
Where:
J(m, b): The cost function (half the mean squared error; the extra factor of 1/2 cancels when the square is differentiated).
n: The number of data points.
(xᵢ, yᵢ): The input and actual output values for the i-th data point.
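A minimal sketch of this cost function, assuming the data is held in two plain Python lists. The function name cost and the sample points (which lie exactly on y = 2x + 1) are made up for illustration.

```python
def cost(xs, ys, m, b):
    """J(m, b): sum of squared errors divided by 2n."""
    n = len(xs)
    return sum((y - (m * x + b)) ** 2 for x, y in zip(xs, ys)) / (2 * n)

# Hypothetical points that lie exactly on y = 2x + 1.
xs = [1.0, 2.0, 3.0]
ys = [3.0, 5.0, 7.0]

print(cost(xs, ys, m=2.0, b=1.0))  # 0.0, the line passes through every point
print(cost(xs, ys, m=0.0, b=0.0))  # (9 + 25 + 49) / 6, roughly 13.83
```

The cost is zero only when the line passes through every point, and it grows as the line moves away from the data.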
The goal is to adjust m (slope) and b (intercept) iteratively to minimize J(m, b). This is achieved by updating them using the gradients:
m = m - α * ∂J/∂m
b = b - α * ∂J/∂b
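Here, α is the learning rate, which controls how large each step is. Because the gradient is subtracted, each update moves m and b in the direction opposite to the gradient, which is the direction in which J decreases fastest.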
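Differentiating J(m, b) with respect to each parameter gives the two gradients used in the update rule:

∂J/∂m = -(1/n) Σ xᵢ (yᵢ - (m·xᵢ + b))
∂J/∂b = -(1/n) Σ (yᵢ - (m·xᵢ + b))

The sketch below applies a single update step using these expressions. The function names gradients and step, the data, and the learning rate are assumptions for illustration, not the tool's own code.

```python
def gradients(xs, ys, m, b):
    """Return (dJ/dm, dJ/db) for the cost J = (1/(2n)) * sum of squared errors."""
    n = len(xs)
    dm = -sum(x * (y - (m * x + b)) for x, y in zip(xs, ys)) / n
    db = -sum(y - (m * x + b) for x, y in zip(xs, ys)) / n
    return dm, db

def step(xs, ys, m, b, alpha):
    """Apply one gradient descent update with learning rate alpha."""
    dm, db = gradients(xs, ys, m, b)
    return m - alpha * dm, b - alpha * db

# Hypothetical points on y = 2x + 1, starting from m = 0, b = 0.
xs = [1.0, 2.0, 3.0]
ys = [3.0, 5.0, 7.0]

m, b = step(xs, ys, m=0.0, b=0.0, alpha=0.01)
print(m, b)  # both increase, moving toward the true slope and intercept
```

At the starting point both gradients are negative, so subtracting them increases m and b. At the exact solution (m = 2, b = 1) both gradients are zero and the update leaves the parameters unchanged.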
How to Use This Visualization Tool
This tool demonstrates how linear regression works with gradient descent:
Best-Fit Line Plot: Shows how the line adjusts to fit the data points.
Error Plot: Visualizes the decrease in error over iterations.
Iteration Count: Displays the current iteration and the error value.
You can interact with the tool using these controls:
Learning Rate: Adjust the value to control the step size of gradient descent. Common values are 0.01 or 0.001. Higher values can speed up convergence, but a value that is too high may overshoot the minimum and cause the error to grow instead of shrink.
Max Iterations: Sets the maximum number of gradient descent iterations. More iterations give the line more updates to approach the best fit, though the improvement per iteration shrinks as the error levels off.
Start Training: Begins the gradient descent process.
Stop Training: Pauses the gradient descent process.
Reset: Resets the best-fit line and incorporates any new data points added.
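The controls above map naturally onto a small training loop. The following is a sketch under stated assumptions, not the tool's actual implementation: the function name train, the parameter names learning_rate and max_iterations, the starting point of m = 0 and b = 0, and the data are all chosen for illustration.

```python
def train(xs, ys, learning_rate=0.01, max_iterations=1000):
    """Fit y = m*x + b by gradient descent; return m, b, and the cost per iteration."""
    m, b = 0.0, 0.0
    n = len(xs)
    history = []  # cost before each update, the series an error plot would show
    for _ in range(max_iterations):
        residuals = [y - (m * x + b) for x, y in zip(xs, ys)]
        history.append(sum(r * r for r in residuals) / (2 * n))
        # Gradients of J(m, b) = (1/(2n)) * sum of squared residuals.
        dm = -sum(x * r for x, r in zip(xs, residuals)) / n
        db = -sum(residuals) / n
        m -= learning_rate * dm
        b -= learning_rate * db
    return m, b, history

# Hypothetical points on y = 2x + 1.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

m, b, history = train(xs, ys, learning_rate=0.05, max_iterations=5000)
print(round(m, 4), round(b, 4))
print(history[0], history[-1])
```

On this data the fitted values approach m = 2 and b = 1, and the recorded cost falls from its starting value toward zero. Rerunning with a much larger learning_rate is a simple way to see the overshooting behavior described for the Learning Rate control.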
Explore the sliders and buttons to observe how gradient descent adjusts the line and reduces error. This interactive visualization is a powerful way to understand the inner workings of linear regression.