LearningModel

Base class of a predictive model used in ChronoEpilogi.

Notes

This class is a template for creating custom models. ChronoEpilogi expects a properly implemented subclass of LearningModel.

The base class does not distinguish between one-level and two-level column indexes. A subclass may support only one of these data formats. See the data format documentation for details.

__init__(config: dict, target: str) -> None

Initialize a learning model.

Parameters:

Name Type Description Default
config dict

Contains the parameter settings of the model. Free for each class to define.

required
target str

The name of the target: a string for single-level DataFrames, or a tuple of strings for two-level DataFrames.

required

Returns:

Type Description
None
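
As an illustration of the two target forms described above, the following sketch selects the target column from a single-level and from a two-level DataFrame. The column names and level values are made up for this example and are not part of ChronoEpilogi.

```python
import numpy as np
import pandas as pd

# Single-level columns: the target is addressed by one label.
flat = pd.DataFrame(np.arange(12.0).reshape(4, 3), columns=["x1", "x2", "y"])
flat_target = "y"

# Two-level columns: the target is a tuple with one entry per level.
# The level values below are hypothetical.
nested = flat.copy()
nested.columns = pd.MultiIndex.from_tuples(
    [("sensor", "a"), ("sensor", "b"), ("out", "y")]
)
nested_target = ("out", "y")

# In both cases, indexing with the target yields a 1D Series.
flat_series = flat[flat_target]
nested_series = nested[nested_target]
```

A subclass that stores the target unchanged in self.target can then use the same indexing expression, data[self.target], for either format.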

fit(data: pd.DataFrame) -> None

Fit the model on the provided data.

Parameters:

Name Type Description Default
data DataFrame

2D DataFrame containing the multivariate time series. The index of the DataFrame should correspond to timesteps, and the columns to covariates. The column index may have one or two levels depending on user choice.

required

Returns:

Type Description
None

Notes

If the model requires a validation split, it may handle it internally. The method may save the fitted model in self.model. The method MUST save the data used to train the model in self.data.
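
A minimal sketch of a fit method that follows the contract above. The class is a hypothetical stand-in defined here so that the example runs on its own; a real model would subclass LearningModel. The least-squares regression on all non-target columns is an illustrative choice, not the library's method.

```python
import numpy as np
import pandas as pd


class LeastSquaresSketch:
    """Hypothetical stand-in; a real model would subclass LearningModel."""

    def __init__(self, config: dict, target: str) -> None:
        self.config = config
        self.target = target

    def fit(self, data: pd.DataFrame) -> None:
        # Use every non-target column as a regressor (illustrative choice).
        features = data.drop(columns=[self.target])
        coef, *_ = np.linalg.lstsq(
            features.to_numpy(dtype=float),
            data[self.target].to_numpy(dtype=float),
            rcond=None,
        )
        # Optional: keep the fitted parameters in self.model.
        self.model = pd.Series(coef, index=features.columns)
        # Required: keep the training data in self.data.
        self.data = data


train = pd.DataFrame({"x1": [1.0, 0.0, 2.0, 1.0], "x2": [0.0, 1.0, 1.0, 3.0]})
train["y"] = 2.0 * train["x1"] + 3.0 * train["x2"]

model = LeastSquaresSketch({}, "y")
model.fit(train)
```

Storing the DataFrame in self.data is what allows the other methods to fall back on the training data when they are called with data=None.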

fittedvalues(data: None | pd.DataFrame = None) -> pd.Series

Produce the predicted values.

Parameters:

Name Type Description Default
data None | DataFrame

The data used for prediction. It must have the same columns as the data passed to fit. If None is passed, the training data should be used to compute the predictions.

None

Returns:

Name Type Description
fittedvalues Series

1D Series containing the predictions, with an index aligned with the original data.

Notes

Corresponds to a .predict operation in other libraries.

has_too_many_parameters(ratio: float) -> bool

(Legacy) Flag specifying whether the model is too large to be trained.

Parameters:

Name Type Description Default
ratio float

Originally, the minimum acceptable value of the ratio (sample size)/(number of parameters).

required

Returns:

Name Type Description
flag bool

True if the model is too large for the available data. False otherwise.

Notes

Legacy method from linear model applications. Originally, it was used to prevent models from being fitted on datasets whose sample size is too small relative to the number of model parameters. It may always return False if this is not a concern for the model at hand.
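
One possible reading of the ratio check, written as a standalone helper. The function name and its arguments are hypothetical; in a subclass, the sample size and parameter count would come from the stored data and the model itself.

```python
def too_many_parameters(n_samples: int, n_parameters: int, ratio: float) -> bool:
    """Return True when n_samples / n_parameters falls below the minimum ratio."""
    # A model without parameters can never be too large (assumption).
    if n_parameters == 0:
        return False
    return n_samples / n_parameters < ratio


# 100 samples for 20 parameters gives a ratio of 5, below a minimum of 10.
small_sample = too_many_parameters(n_samples=100, n_parameters=20, ratio=10.0)
# 100 samples for 5 parameters gives a ratio of 20, which is acceptable.
large_sample = too_many_parameters(n_samples=100, n_parameters=5, ratio=10.0)
```

A model for which this check is irrelevant can simply return False from has_too_many_parameters, as the note above allows.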

residuals(data: None | pd.DataFrame = None) -> pd.DataFrame

Compute modeling residuals.

Parameters:

Name Type Description Default
data None | DataFrame

The data used for prediction. It must have the same columns as the data passed to fit. If None is passed, the training data should be used.

None

Returns:

Name Type Description
residuals_df DataFrame

DataFrame with a single column named after the target time series. The index must correspond to the index of the fitted values.

Notes

Examples:

>>> import numpy as np
>>> import pandas as pd
>>> class NewLearningModel(LearningModel):
...     def fittedvalues(self, data=None):
...         return data[self.target].iloc[100:]  # dummy
>>> data = pd.DataFrame(np.ones((101,12)))
>>> data.columns = pd.MultiIndex.from_product([[1,2,3],[1,2,3,4]])
>>> model = NewLearningModel({},(1,1))
>>> model.residuals(data)
       1
       1
100  0.0

With a single level of columns:

>>> data = pd.DataFrame(np.ones((101,12)))
>>> data.columns = list(range(12))
>>> model = NewLearningModel({},1)
>>> model.residuals(data)
       1
100  0.0
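
For intuition, the output above is consistent with a computation along the following lines. This is a sketch, not the library's implementation: the helper name is hypothetical, and the sign convention (observed minus fitted) is an assumption, since the examples above only show zero residuals.

```python
import pandas as pd


def residuals_sketch(data: pd.DataFrame, fitted: pd.Series, target: str) -> pd.DataFrame:
    # Restrict the observed target to the timesteps that have a prediction.
    observed = data[target].loc[fitted.index]
    # Assumed convention: residual = observed - fitted.
    return (observed - fitted).to_frame(name=target)


data = pd.DataFrame({"y": [1.0, 2.0, 3.0, 4.0]})
fitted = pd.Series([1.5, 3.0], index=[1, 2])

residuals_df = residuals_sketch(data, fitted, "y")
```

The result has one column named after the target and the same index as the fitted values, which matches the return description above.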

stopping_metric(previous_model: LearningModel, method: str = '') -> float

Metric by which model equivalence is tested.

Parameters:

Name Type Description Default
previous_model LearningModel

An instance of the same model class, containing a model fitted on data with a subset of the covariates of the current model.

required
method str

Optional parameter that specifies the method by which to obtain the metric, if several such methods are implemented.

''

Returns:

Name Type Description
metric float

The metric of model equivalence.

Notes

The returned metric will be compared to the threshold provided to the ChronoEpilogi algorithm. If the metric is higher than the threshold, we consider the models equivalent. If the metric is lower than the threshold, we consider the current model more powerful than the previous model.
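
The comparison described above can be sketched as follows. The helper name is hypothetical, and the handling of a metric exactly equal to the threshold is an assumption, since that case is not specified here.

```python
def models_are_equivalent(metric: float, threshold: float) -> bool:
    # Metric strictly higher than the threshold: the models are equivalent.
    # A tie is treated as "not equivalent" here (assumption).
    return metric > threshold


# Metric above the threshold: the larger model is treated as no better.
equivalent = models_are_equivalent(metric=0.30, threshold=0.05)
# Metric below the threshold: the current model is considered more powerful.
more_powerful = not models_are_equivalent(metric=0.01, threshold=0.05)
```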