Machine Learning: Predictive Algorithms

Predictive Algorithms in Machine Learning
Mative applies the following predictive algorithms to its time series data. Each one is described briefly below, with a note on how it can be used.
Linear Regression
Linear regression is a statistical method that models the relationship between a dependent variable (target) and one or more independent variables (predictors) using a linear function, which is a straight line when there is a single predictor. The formula is:
\[ y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \ldots + \beta_n x_n + \epsilon \]
Where:
- \( y \) is the dependent variable.
- \( x_i \) are the independent variables.
- \( \beta_0 \) is the intercept, and the remaining \( \beta_i \) are the coefficients representing the impact of each independent variable.
- \( \epsilon \) is the residual error.
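As a minimal sketch of how the formula is evaluated, the snippet below computes a prediction and a residual for one observation. The function name `predict` and all coefficient and data values are hypothetical, chosen only for illustration; they are not taken from Mative.

```python
def predict(beta0, betas, xs):
    """Return beta0 + beta_1*x_1 + ... + beta_n*x_n for one observation."""
    return beta0 + sum(b * x for b, x in zip(betas, xs))


# Hypothetical model: intercept 2.0, coefficients 0.5 and -1.0.
beta0 = 2.0
betas = [0.5, -1.0]

# One hypothetical observation with two predictors and an observed target.
xs = [4.0, 1.0]
y_observed = 3.5

y_hat = predict(beta0, betas, xs)   # 2.0 + 0.5*4.0 - 1.0*1.0
residual = y_observed - y_hat       # the part the model does not explain
print(y_hat, residual)
```

The residual is the observed value minus the predicted value, which corresponds to the error term in the formula once the coefficients have been estimated.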
OLS Linear Regression
OLS (Ordinary Least Squares) is the standard method for estimating the coefficients of a linear regression: it chooses the coefficients that minimize the sum of the squared differences between the observed values and the predicted values. For a fixed number of observations this is equivalent to minimizing the mean squared error, so OLS finds the best-fit line in the least-squares sense:
\[ \text{Minimize} \sum_i (y_i - \hat{y}_i)^2 \]
Where:
- \( y_i \) are the observed values.
- \( \hat{y}_i \) are the predicted values.
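The following sketch fits a single-predictor OLS line to a short series, using the time index as the predictor, and then extrapolates one step ahead. It uses the closed-form solution for one predictor; the helper names and the sample values are assumptions made for this example, not Mative's implementation.

```python
def ols_fit(xs, ys):
    """Closed-form OLS for y = beta0 + beta1 * x with a single predictor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    beta1 = sxy / sxx
    beta0 = mean_y - beta1 * mean_x
    return beta0, beta1


def sum_squared_residuals(xs, ys, beta0, beta1):
    """The quantity OLS minimizes: sum of (observed - predicted)^2."""
    return sum((y - (beta0 + beta1 * x)) ** 2 for x, y in zip(xs, ys))


# Hypothetical series: time index t and observed values y.
t = [0, 1, 2, 3, 4]
y = [1.0, 3.1, 4.9, 7.2, 8.8]

beta0, beta1 = ols_fit(t, y)
ssr = sum_squared_residuals(t, y, beta0, beta1)
forecast = beta0 + beta1 * 5  # extrapolate to the next time step
print(beta0, beta1, ssr, forecast)
```

Any other choice of intercept or slope gives a sum of squared residuals at least as large as `ssr`, which is what "best fit" means here. With several predictors the same idea applies, but the coefficients are usually obtained with a linear algebra routine rather than by hand.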
ARIMA
ARIMA (AutoRegressive Integrated Moving Average) is a model used for time series analysis that combines three components:
- AutoRegressive (AR): the model uses past values of the series to predict future values.
- Integrated (I): it differences the data (replaces each value with its change from the previous value) to make the series stationary.
- Moving Average (MA): it uses past forecast errors in predicting future values.
The model is often written as ARIMA(p, d, q), where:
- \( p \) is the order of the autoregressive term.
- \( d \) is the number of times the series is differenced to make it stationary.
- \( q \) is the order of the moving average term.
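To make the roles of p and d concrete, here is a deliberately simplified sketch of an ARIMA(1, 1, 0) forecast: the series is differenced once (d = 1), a single autoregressive coefficient is estimated on the differences by least squares (p = 1), and there is no moving average term (q = 0). This is an illustration under those assumptions, with hypothetical helper names and data; full ARIMA estimation, including the MA component, is normally done with a dedicated statistics library.

```python
def difference(series):
    """First difference: the change from each value to the next (d = 1)."""
    return [b - a for a, b in zip(series, series[1:])]


def fit_ar1(series):
    """Least-squares estimate of phi in z_t = phi * z_(t-1) + error."""
    num = sum(prev * cur for prev, cur in zip(series, series[1:]))
    den = sum(prev ** 2 for prev in series[:-1])
    return num / den


def forecast_next(series):
    """One-step forecast: predict the next difference, then undo differencing."""
    diffs = difference(series)
    phi = fit_ar1(diffs)
    next_diff = phi * diffs[-1]
    return series[-1] + next_diff


# Hypothetical series whose increments halve at every step.
series = [10.0, 12.0, 13.0, 13.5, 13.75]
phi = fit_ar1(difference(series))
forecast = forecast_next(series)
print(phi, forecast)
```

Adding the predicted difference back to the last observed value is the "integration" step that reverses the differencing.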
Fourier Transformation
The Fourier transform is a mathematical method for converting a function from the time domain to the frequency domain. It is used to analyze the frequency components of signals. In the context of time series, it can be used to identify cycles or periodic patterns. For sampled data, the discrete Fourier transform (DFT) is:
\[ F(k) = \sum_{n=0}^{N-1} x(n) e^{-2\pi i \frac{kn}{N}} \]
Where:
- \( x(n) \) is the signal in the time domain.
- \( F(k) \) is the representation in the frequency domain.
- \( k \) is the index of the frequency component, from 0 to \( N-1 \).
- \( N \) is the number of samples.
- \( i \) is the imaginary unit.
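The sum above can be evaluated directly, as in the sketch below, which then picks the strongest frequency component to recover the period of a cycle. The function name `dft` and the test signal are assumptions for illustration. This direct form takes on the order of N squared operations; real workloads typically use a fast Fourier transform (FFT) routine instead.

```python
import cmath
import math


def dft(x):
    """Direct evaluation of F(k) = sum_n x(n) * exp(-2*pi*i*k*n/N)."""
    n_samples = len(x)
    return [
        sum(
            x[n] * cmath.exp(-2j * cmath.pi * k * n / n_samples)
            for n in range(n_samples)
        )
        for k in range(n_samples)
    ]


# Hypothetical signal: 16 samples of a sine wave that repeats every 4 samples.
signal = [math.sin(2 * math.pi * 4 * n / 16) for n in range(16)]

spectrum = dft(signal)

# For a real-valued signal only the first half of the spectrum is distinct.
magnitudes = [abs(c) for c in spectrum[: len(signal) // 2]]

# Skip k = 0 (the mean level) and find the strongest frequency index.
dominant_k = max(range(1, len(magnitudes)), key=lambda k: magnitudes[k])
period = len(signal) / dominant_k  # samples per cycle
print(dominant_k, period)
```

Here the strongest component is at k = 4, meaning four full cycles fit in the 16 samples, so the detected period is 4 samples.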
If you have any questions about these algorithms, feel free to ask!