solver_ops
Solver generic methods
- wip.modules.solver_ops.adjust_real_cost(real_cost, features, mult_coef=1, div_coef=3)
Adjust the real cost of some features
- wip.modules.solver_ops.custom_format(x)
Format numeric value as a 5 decimal places string.
If x is not a numeric value, the function returns the original value unchanged.
- Parameters
x (Any) – The value to try to format as a string with 5 decimal places.
- Returns
The formatted value, or the original value if x is not a numeric value.
- Return type
Any
Examples
>>> custom_format(1.2345678)
'1.23457'
>>> custom_format('1.2345678')
'1.2345678'
>>> custom_format(0)
'0.00000'
>>> custom_format('some text')
'some text'
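The behavior described above can be approximated with a short sketch. This is not the library's implementation: the name custom_format_sketch is hypothetical, and treating only int and float (but not bool) as numeric is an assumption made for illustration.

```python
def custom_format_sketch(x):
    """Hypothetical stand-in for custom_format, written for illustration only."""
    # Assumption: "numeric" means int or float; bool is excluded on purpose,
    # because bool is a subclass of int in Python.
    if isinstance(x, (int, float)) and not isinstance(x, bool):
        # Fixed-point formatting with exactly 5 decimal places.
        return f"{x:.5f}"
    # Anything else (strings, None, objects) is returned unchanged.
    return x


formatted = custom_format_sketch(1.2345678)
untouched = custom_format_sketch("some text")
```

Returning non-numeric input unchanged lets the same helper be applied to a column that mixes numbers and text without raising.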
- wip.modules.solver_ops.define_range_constraints(token: str, range_start: int, range_end: int, step: int = 1) -> List[str]
Generate a list of strings by applying a token format over a defined range.
This function receives a formatting token and applies it to each number in the range specified by range_start, range_end, and step. The formatted strings are returned in a list.
- Parameters
token (str) – A string with {} as a placeholder for the integer to be formatted.
range_start (int) – The start of the range to which the token is applied.
range_end (int) – The end of the range to which the token is applied. This value is not included in the output list.
step (int, optional) – The step between consecutive integers in the range. Defaults to 1.
- Returns
A list of strings obtained by applying the token to each number in the specified range.
- Return type
List[str]
Examples
>>> token = "{}_formatted"
>>> define_range_constraints(token, 1, 4)
['1_formatted', '2_formatted', '3_formatted']
>>> token = "prefix_{}_suffix"
>>> define_range_constraints(token, 1, 5, 2)
['prefix_1_suffix', 'prefix_3_suffix']
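The documented behavior matches applying str.format over a half-open range. A minimal sketch follows, assuming that is all the function does; the name define_range_constraints_sketch is hypothetical and the real implementation may differ.

```python
from typing import List


def define_range_constraints_sketch(
    token: str, range_start: int, range_end: int, step: int = 1
) -> List[str]:
    """Hypothetical stand-in for define_range_constraints (illustration only)."""
    # range() excludes range_end, matching the documented behavior.
    return [token.format(i) for i in range(range_start, range_end, step)]


names = define_range_constraints_sketch("prefix_{}_suffix", 1, 5, 2)
```

Because the token is an ordinary format string, any text around the {} placeholder is repeated for every generated name.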
- wip.modules.solver_ops.get_pi_system_tag_names(dataset: DataFrame) -> DataFrame
Get the PI System tag names from the PIMS tag names.
This function takes a DataFrame and returns a copy of it with a new column containing the PI System tag names.
- Parameters
dataset (pd.DataFrame) – pandas.DataFrame containing the optimization model results, with PIMS tag names in a column named ‘Tag’.
- Returns
A pandas.DataFrame with the PI System tag names in a column named ‘Tag PI’.
- Return type
pd.DataFrame
- wip.modules.solver_ops.retrieve_best_model(model, models_results, metric='mape')
Sort the models by their metrics.
- wip.modules.solver_ops.retrieve_model_coeficients(model: str, models_results: dict)
Retrieve the ridge regression model coefficients.
- wip.modules.solver_ops.save_solver_results(solver_path, df, resultado_otimizador_filename: str = 'resultado_otimizador-US08_2024-04-19.xlsx')
Save optimization results to Azure Container Storage, or to a local filepath.
The default file name that is used to save the optimization results to Azure Data Lake is:
f"resultado_otimizador-US{US_SUFIX}_{datetime.today().strftime('%Y-%m-%d')}.csv"
For example, the file name should be similar to the following:
"resultado_otimizador-US08_2024-03-21.csv"
- Parameters
solver_path (str) – The Azure Data Lake container URL path or a local folder path where the optimization results will be saved.
df (pd.DataFrame) – A pandas.DataFrame containing the optimization results for all production ranges.
resultado_otimizador_filename (str, default wip.constants.RESULTADO_OTIMIZADOR_FILENAME) – The name of the file to use for saving the optimization results.
Changed in version 2.8.11: Fixed a bug that saved the optimization results in “.csv” format under a filename with the “.xlsx” suffix. This caused the Databricks job “Integração SensUP” to read the optimization results in the wrong format and fail before finishing.
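The default file name pattern quoted above can be reproduced with a small helper. This is only a sketch: the function name default_results_filename and its parameters are hypothetical, and the suffix value "08" is taken from the example file name rather than from the package constants.

```python
from datetime import date
from typing import Optional


def default_results_filename(
    us_sufix: str, day: Optional[date] = None, extension: str = "csv"
) -> str:
    """Hypothetical helper that rebuilds the documented file name pattern."""
    # Fall back to today's date, mirroring datetime.today() in the pattern.
    day = day if day is not None else date.today()
    return f"resultado_otimizador-US{us_sufix}_{day.strftime('%Y-%m-%d')}.{extension}"


# "08" is an assumed suffix, chosen to match the documented example.
example_name = default_results_filename("08", date(2024, 3, 21))
```

Keeping the extension as an explicit argument makes it harder for the file suffix and the written format to drift apart, which is the kind of mismatch the 2.8.11 change note describes.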
- wip.modules.solver_ops.unnormalize_optimization_tags(scalers: dict, real_cost: dict) -> dict
Unnormalize optimization tags using scalers.
This function receives the scalers and real cost dictionaries, filters the optimization keys, and unnormalizes the real cost using the provided scalers. For each optimization key, the real_cost dictionary is updated by dividing the key's value by its corresponding data range in scalers.
- Parameters
scalers (dict) – Dictionary containing the scaler objects, where each key represents a column, and the corresponding value is the scaler object associated with that column.
real_cost (dict) – Dictionary containing real cost data, where each key is a column name, and the corresponding value is the real cost associated with that column.
- Returns
real_cost – Updated dictionary containing the unnormalized real cost data.
- Return type
dict
Notes
This function assumes that the scalers dictionary contains scaler objects with a data_range_ attribute. The function does not return a new dictionary object; it updates the input real_cost dictionary in place and returns it.
The keys in real_cost that are in constants.TARGETS_IN_MODEL.values() or not in scalers.keys() are not considered optimization keys and are therefore not processed.
Examples
>>> from sklearn.preprocessing import MinMaxScaler
>>> scaler = MinMaxScaler().fit([[0], [10]])  # fitted so that data_range_ is 10
>>> scalers = {'key1': scaler}
>>> real_cost = {'key1': 5, 'key2': 10}
>>> unnormalize_optimization_tags(scalers, real_cost)
{'key1': 0.5, 'key2': 10}
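The in-place update described in the notes can be sketched without scikit-learn. Everything below is illustrative: unnormalize_sketch, the targets argument (standing in for constants.TARGETS_IN_MODEL.values()), and the SimpleNamespace objects with a scalar data_range_ are assumptions, not the package's code.

```python
from types import SimpleNamespace


def unnormalize_sketch(scalers: dict, real_cost: dict, targets=()) -> dict:
    """Hypothetical stand-in for unnormalize_optimization_tags."""
    for key in real_cost:
        # Skip target tags and tags without a scaler: per the notes, these
        # are not treated as optimization keys.
        if key in targets or key not in scalers:
            continue
        # Assumption: data_range_ is a plain scalar here. A fitted
        # scikit-learn scaler exposes it as an array with one entry per feature.
        real_cost[key] = real_cost[key] / scalers[key].data_range_
    # The same dictionary object is returned after being updated in place.
    return real_cost


scalers = {"key1": SimpleNamespace(data_range_=10.0)}
real_cost = {"key1": 5, "key2": 10}
result = unnormalize_sketch(scalers, real_cost)
```

Because the dictionary is modified in place, callers that still need the normalized costs would have to copy real_cost before calling the function.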
- wip.modules.solver_ops.write_constraint(file, constraint, terms, target=False, description=False)
Write a set of constraints to a file.
- Parameters
file (_io.TextIOWrapper) – Open text file object to write the constraints to.
constraint (str) – Name of the constraint to add to the file.
terms (tuple) – Terms of the constraint equation.
target (bool, default False) – Features that are target variables are handled by the write_descriptive_constraints function.
description (bool, default False) – Whether to write a description of the constraint to the file.
- wip.modules.solver_ops.write_descriptive_constraints(file, model_target, datasets, df_detailed, scalers, models_coeficients, features_coeficient, models_results)
Write constraint built from the Ridge Regression model coefficients.
The model target is the name of the selected model.
Some target constraints have a different treatment when compared to other features.
- Parameters
file (TextIOWrapper) – File to write the constraints to.
model_target (str) – Name of the constraint being written. This is the same as the name of the Ridge model.
datasets (Dict[str, pd.DataFrame]) – Dictionary with the datasets.
df_detailed (pd.DataFrame) – Table with descriptions for each term of the constraint.
scalers (Dict[str, sklearn.preprocessing.MinMaxScaler]) – Dictionary with the scalers for each tag (column).
models_coeficients (Dict[str, Dict[str, float]]) – Dictionary with the coefficients of each model. The dictionary should have the following structure:
    {
        "model-name": {
            "tag": coefficient,
            # ...
        },
        "model-name-2": {
            "tag": coefficient,
            # ...
        },
        # ...
    }
features_coeficient (zip) – Iterable of tuples with the features and their coefficients.
models_results (Dict[str, List[Dict[str, Any]]]) – Dictionary with the results of each model.
- Returns
Dictionary with the coefficients of each model.
- Return type
Dict[str, Dict[str, float]]
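To make the nested models_coeficients structure and the zip-style features_coeficient argument more concrete, here is a small sketch. The model names, tag names, coefficient values, and the helper pair_features_with_coefficients are all hypothetical; how the package actually builds these objects is not shown in this reference.

```python
from typing import Dict, Iterator, Tuple

# Hypothetical coefficients following the documented nested structure:
# model name -> {tag name -> coefficient}.
models_coeficients: Dict[str, Dict[str, float]] = {
    "model-a": {"tag_1": 0.25, "tag_2": -1.5},
    "model-b": {"tag_3": 2.0},
}


def pair_features_with_coefficients(
    coefs_by_model: Dict[str, Dict[str, float]], model_name: str
) -> Iterator[Tuple[str, float]]:
    """Hypothetical helper: yield (feature, coefficient) pairs for one model."""
    model_coefs = coefs_by_model[model_name]
    # zip returns a lazy iterator, so it can only be consumed once.
    return zip(model_coefs.keys(), model_coefs.values())


pairs = list(pair_features_with_coefficients(models_coeficients, "model-a"))
```

Since a zip object is exhausted after one pass, code that needs the pairs more than once would typically materialize them into a list first, as done here.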