LearningModel
Base class of a predictive model used in ChronoEpilogi.
Notes
This class is a template for creating custom models. ChronoEpilogi expects a properly implemented subclass of LearningModel.
The base class does not distinguish between a one-level and a two-level column index. A subclass may handle only one of these data formats. See the data format documentation for details.
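The two column layouts can be illustrated with plain pandas. This is a minimal sketch: the column names, the meaning of the two levels (group, variable), and the variable names are assumptions made for illustration, not part of ChronoEpilogi.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
values = rng.normal(size=(5, 4))

# One-level column index: each column is identified by a single string.
flat = pd.DataFrame(values, columns=["temp", "load", "wind", "price"])
flat_target = "price"

# Two-level column index: each column is identified by a tuple of strings.
# The level semantics (group, variable) are an assumption for this sketch.
two_level = pd.DataFrame(
    values,
    columns=pd.MultiIndex.from_product([["siteA", "siteB"], ["temp", "load"]]),
)
two_level_target = ("siteB", "load")

# Selecting the target yields a 1D Series in both layouts.
flat_series = flat[flat_target]
two_level_series = two_level[two_level_target]
print(flat.columns.nlevels, two_level.columns.nlevels)
```

In both layouts the target is whatever key selects exactly one column, which is why it is a string in the first case and a tuple in the second.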
__init__(config: dict, target: str) -> None
Initialize a learning model.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| config | dict | Parameter settings of the model. Each subclass is free to define its contents. | required |
| target | str | Name of the target: a string for a single-level DataFrame, a tuple of strings for a two-level DataFrame. | required |
Returns:
| Type | Description |
|---|---|
| None | |
fit(data: pd.DataFrame) -> None
Fit the model on the provided data.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| data | DataFrame | 2D DataFrame containing the multivariate time series. The index of the DataFrame should correspond to timesteps, and the columns to covariates. The column index may have one or two levels depending on user choice. | required |
Returns:
| Type | Description |
|---|---|
| None | |
Notes
If the model requires a validation split, it may handle it internally. The method may save the fitted model in self.model. The method MUST save the data used to train the model in self.data.
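A minimal sketch of a `fit` method that follows these notes. `MeanModel` and the `val_fraction` config key are hypothetical, and storing the full frame passed to `fit` in `self.data` (rather than only the training part) is an assumption. In real use the class would subclass LearningModel.

```python
import numpy as np
import pandas as pd


class MeanModel:
    """Hypothetical model; in practice it would subclass LearningModel."""

    def __init__(self, config: dict, target: str) -> None:
        self.config = config
        self.target = target
        self.model = None
        self.data = None

    def fit(self, data: pd.DataFrame) -> None:
        # Optional internal validation split ("val_fraction" is a made-up key).
        val_fraction = self.config.get("val_fraction", 0.0)
        n_train = len(data) - int(len(data) * val_fraction)
        train = data.iloc[:n_train]  # chronological split, no shuffling
        # The "fitted model" is simply the training mean of the target.
        self.model = float(train[self.target].mean())
        # Required by the notes: keep the data the model was fitted on.
        # Assumption: the whole frame passed to fit is stored.
        self.data = data


data = pd.DataFrame({"x": np.arange(10.0), "y": np.arange(10.0) * 2})
model = MeanModel({"val_fraction": 0.2}, "y")
model.fit(data)
```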
fittedvalues(data: None | pd.DataFrame = None) -> pd.Series
Produce the predicted values.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| data | None \| DataFrame | The data used for prediction. It must have the same columns as the data passed to fit. If None is passed, the training data should be used to compute the predictions. | None |
Returns:
| Name | Type | Description |
|---|---|---|
| fittedvalues | Series | 1D series with index aligned on the original data, containing the predictions. |
Notes
This corresponds to the .predict method in other libraries.
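A minimal sketch of the `data=None` fallback and the index alignment, using ordinary least squares without an intercept. `LeastSquaresModel` is a hypothetical class for a single-level column index. In real use it would subclass LearningModel.

```python
from typing import Optional

import numpy as np
import pandas as pd


class LeastSquaresModel:
    """Hypothetical model; in practice it would subclass LearningModel."""

    def __init__(self, config: dict, target: str) -> None:
        self.config = config
        self.target = target

    def fit(self, data: pd.DataFrame) -> None:
        X = data.drop(columns=[self.target]).to_numpy()
        y = data[self.target].to_numpy()
        # Ordinary least squares without an intercept term.
        self.model, *_ = np.linalg.lstsq(X, y, rcond=None)
        self.data = data

    def fittedvalues(self, data: Optional[pd.DataFrame] = None) -> pd.Series:
        # Fall back to the training data when no data is passed.
        if data is None:
            data = self.data
        X = data.drop(columns=[self.target]).to_numpy()
        # Reuse the input index so predictions stay aligned with the data.
        return pd.Series(X @ self.model, index=data.index, name=self.target)


train = pd.DataFrame({"x": [1.0, 2.0, 3.0, 4.0], "y": [2.0, 4.0, 6.0, 8.0]})
model = LeastSquaresModel({}, "y")
model.fit(train)
in_sample = model.fittedvalues()  # uses the stored training data

new = pd.DataFrame({"x": [5.0, 6.0], "y": [0.0, 0.0]}, index=[10, 11])
out_of_sample = model.fittedvalues(new)  # same columns as the training data
```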
has_too_many_parameters(ratio: float) -> bool
(Legacy) Flag specifying whether the model is too large to be trained.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| ratio | float | Originally, the minimum acceptable ratio (sample size)/(number of parameters). | required |
Returns:
| Name | Type | Description |
|---|---|---|
| flag | bool | True if the model is too large for the available data. False otherwise. |
Notes
Legacy method from linear model applications. It was originally used to prevent models from being fitted on datasets whose sample size is too small relative to the number of model parameters. An implementation may always return False if this is not a concern for the model.
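The original check can be sketched as a free function. This is an illustration only: the function name and its `n_samples` and `n_parameters` arguments are hypothetical. In a subclass, those two quantities would presumably come from `self.data` and the fitted model, and the method would take only `ratio`.

```python
def too_many_parameters(n_samples: int, n_parameters: int, ratio: float) -> bool:
    """Return True if (sample size)/(number of parameters) is below `ratio`."""
    if n_parameters == 0:
        # A model without parameters can never be too large.
        return False
    return n_samples / n_parameters < ratio


# 100 samples for 20 parameters gives a ratio of 5, below the required 10.
flag_large = too_many_parameters(100, 20, ratio=10.0)
# 100 samples for 5 parameters gives a ratio of 20, which is acceptable.
flag_small = too_many_parameters(100, 5, ratio=10.0)
```

A model for which sample size is not a concern can skip the computation and return False unconditionally.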
residuals(data: None | pd.DataFrame = None) -> pd.DataFrame
Compute modeling residuals.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| data | None \| DataFrame | The data used for prediction. It must have the same columns as the data passed to fit. If None is passed, the training data should be used. | None |
Returns:
| Name | Type | Description |
|---|---|---|
| residuals_df | DataFrame | DataFrame with a single column named after the target time series. The index must correspond to the index of the fitted values. |
Notes
Examples:
>>> import numpy as np
>>> import pandas as pd
>>> class NewLearningModel(LearningModel):
...     def fittedvalues(self, data=None):
...         return data[self.target].iloc[100:]  # dummy
>>> data = pd.DataFrame(np.ones((101, 12)))
>>> data.columns = pd.MultiIndex.from_product([[1, 2, 3], [1, 2, 3, 4]])
>>> model = NewLearningModel({}, (1, 1))
>>> model.residuals(data)
       1
       1
100  0.0
With a single level of columns:
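A minimal sketch of this case. The real base class is not imported here, so `StandInLearningModel` is a hypothetical stand-in, and its `residuals` logic (observed target minus fitted values, aligned on the index of the fitted values) is an assumption about what the base class does.

```python
import numpy as np
import pandas as pd


class StandInLearningModel:
    """Stand-in for the real base class; its residuals logic is assumed."""

    def __init__(self, config, target):
        self.config = config
        self.target = target

    def residuals(self, data=None):
        # Assumed behaviour: observed target minus fitted values, aligned on
        # the index of the fitted values, returned as a one-column DataFrame.
        fitted = self.fittedvalues(data)
        observed = data[self.target].loc[fitted.index]
        return (observed - fitted).to_frame(name=self.target)


class NewLearningModel(StandInLearningModel):
    def fittedvalues(self, data=None):
        return data[self.target].iloc[100:]  # dummy: echoes the last target value


data = pd.DataFrame(np.ones((101, 3)), columns=["a", "b", "c"])
model = NewLearningModel({}, "a")  # the target is a plain string here
residuals_df = model.residuals(data)
print(residuals_df)
```

Under these assumptions the result has one column named `a` and a single row at index 100 holding a residual of 0.0.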
stopping_metric(previous_model: LearningModel, method: str = '') -> float
Metric by which model equivalence is tested.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
previous_model
|
LearningModel
|
An instance of the same model class, containing a model fitted on data with a subset of the covariates of the current model. |
required |
method
|
str
|
Optional parameter that specifies the method by which to obtain the metric, if several such methods are implemented. |
''
|
Returns:
| Name | Type | Description |
|---|---|---|
metric |
float
|
The metric of model equivalence. |
Notes
The returned metric will be compared to the threshold provided to the ChronoEpilogi algorithm. If the metric is higher than the threshold, we consider the models equivalent. If the metric is lower than the threshold, we consider the current model more powerful than the previous model.
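The comparison rule can be sketched as follows. The metric used here, the ratio of residual sums of squares, is purely illustrative and is not claimed to be a metric implemented by ChronoEpilogi. The function names are hypothetical, and treating a metric exactly equal to the threshold as "not equivalent" is an assumption.

```python
import numpy as np


def rss_ratio_metric(current_residuals, previous_residuals) -> float:
    """Illustrative metric: RSS(current model) / RSS(previous model)."""
    rss_current = float(np.sum(np.square(current_residuals)))
    rss_previous = float(np.sum(np.square(previous_residuals)))
    return rss_current / rss_previous


def is_equivalent(metric: float, threshold: float) -> bool:
    # Rule from the notes: a metric above the threshold means the models are
    # considered equivalent; below it, the current model is more powerful.
    return metric > threshold


previous = np.array([1.0, -1.0, 1.0, -1.0])  # residuals of the smaller model
current = np.array([0.5, -0.5, 0.5, -0.5])   # residuals of the larger model

metric = rss_ratio_metric(current, previous)  # 1.0 / 4.0
equivalent = is_equivalent(metric, threshold=0.9)
```

With this illustrative metric, a value close to 1 means the extra covariates barely reduce the error, so the two models count as equivalent. A low value means the current model fits markedly better.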