# Example for custom parameter passing in surrogate models

This example shows how to define surrogate models with custom model parameters.
It also shows how these parameters are validated and how to specify them through a configuration.

This example assumes some basic familiarity with using BayBE.
We thus refer to [`campaign`](./../Basics/campaign.md) for a basic example.

## Necessary imports

```python
import numpy as np
```

```python
from baybe.campaign import Campaign
from baybe.objectives import SingleTargetObjective
from baybe.parameters import (
    CategoricalParameter,
    NumericalDiscreteParameter,
    SubstanceParameter,
)
from baybe.recommenders import (
    BotorchRecommender,
    FPSRecommender,
    TwoPhaseMetaRecommender,
)
from baybe.searchspace import SearchSpace
from baybe.surrogates import NGBoostSurrogate
from baybe.targets import NumericalTarget
from baybe.utils.dataframe import add_fake_results
```

## Experiment Setup

```python
parameters = [
    CategoricalParameter(
        name="Granularity",
        values=["coarse", "medium", "fine"],
        encoding="OHE",
    ),
    NumericalDiscreteParameter(
        name="Pressure[bar]",
        values=[1, 5, 10],
        tolerance=0.2,
    ),
    NumericalDiscreteParameter(
        name="Temperature[degree_C]",
        values=np.linspace(100, 200, 10),
    ),
    SubstanceParameter(
        name="Solvent",
        data={
            "Solvent A": "COC",
            "Solvent B": "CCC",
            "Solvent C": "O",
            "Solvent D": "CS(=O)C",
        },
        encoding="MORDRED",
    ),
]
```

## Create a surrogate model with custom model parameters

Note that `model_params` is an optional argument: if it is not specified, the model's default parameters are used.

```python
surrogate_model = NGBoostSurrogate(model_params={"n_estimators": 50, "verbose": True})
```

## Validation of model parameters

```python
try:
    invalid_surrogate_model = NGBoostSurrogate(model_params={"NOT_A_PARAM": None})
except ValueError as e:
    print("The validator will give an error here:")
    print(e)
```

    The validator will give an error here:
    Invalid model params for NGBoostSurrogate: NOT_A_PARAM.

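To illustrate the kind of check that produces the error above, here is a minimal sketch of key-based parameter validation. It is not BayBE's actual validator: `DummyRegressor` and `validate_model_params` are hypothetical names, and the sketch assumes that the valid keys are simply the keyword arguments of the wrapped model's constructor.

```python
import inspect


class DummyRegressor:
    """Hypothetical stand-in for a wrapped model class (not part of BayBE)."""

    def __init__(self, n_estimators: int = 500, verbose: bool = False):
        self.n_estimators = n_estimators
        self.verbose = verbose


def validate_model_params(model_cls: type, model_params: dict) -> None:
    """Raise a ValueError if a key is not accepted by the model's constructor."""
    # Collect the constructor's argument names, excluding `self`
    accepted = set(inspect.signature(model_cls.__init__).parameters) - {"self"}
    invalid = sorted(set(model_params) - accepted)
    if invalid:
        raise ValueError(
            f"Invalid model params for {model_cls.__name__}: {', '.join(invalid)}."
        )


# Valid keys pass silently
validate_model_params(DummyRegressor, {"n_estimators": 50, "verbose": True})

# An unknown key is rejected before any model is built
try:
    validate_model_params(DummyRegressor, {"NOT_A_PARAM": None})
except ValueError as e:
    message = str(e)
    print(message)
```

Validating at construction time means that a typo in a parameter name surfaces immediately, rather than only when the model is first fitted.
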
## Links for documentation

[`RandomForestModel`](https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.RandomForestRegressor.html)

[`NGBoostModel`](https://stanfordmlgroup.github.io/ngboost/1-useage.html)

[`BayesianLinearModel`](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.ARDRegression.html)

## Creating the campaign

```python
campaign = Campaign(
    searchspace=SearchSpace.from_product(parameters=parameters, constraints=None),
    objective=SingleTargetObjective(target=NumericalTarget(name="Yield", mode="MAX")),
    recommender=TwoPhaseMetaRecommender(
        recommender=BotorchRecommender(surrogate_model=surrogate_model),
        initial_recommender=FPSRecommender(),
    ),
)
```

    ________________________________________________________________________________
    [Memory] Calling baybe.utils.chemistry._smiles_to_mordred_features...
    _smiles_to_mordred_features('COC')
    _______________________________________smiles_to_mordred_features - 0.1s, 0.0min
    ________________________________________________________________________________
    [Memory] Calling baybe.utils.chemistry._smiles_to_mordred_features...
    _smiles_to_mordred_features('CCC')
    _______________________________________smiles_to_mordred_features - 0.0s, 0.0min
    ________________________________________________________________________________
    [Memory] Calling baybe.utils.chemistry._smiles_to_mordred_features...
    _smiles_to_mordred_features('O')
    _______________________________________smiles_to_mordred_features - 0.0s, 0.0min
    ________________________________________________________________________________
    [Memory] Calling baybe.utils.chemistry._smiles_to_mordred_features...
    _smiles_to_mordred_features('CS(=O)C')
    _______________________________________smiles_to_mordred_features - 0.0s, 0.0min

## Iterate with recommendations and measurements

We can print the surrogate model object.

```python
print("The model object in json format:")
print(surrogate_model.to_json(), end="\n" * 3)
```

    The model object in json format:
    {"type": "NGBoostSurrogate", "model_params": {"n_estimators": 50, "verbose": true}}

```python
# Let's do a first round of recommendation
recommendation = campaign.recommend(batch_size=1)
```

```python
print("Recommendation from campaign:")
print(recommendation)
```

    Recommendation from campaign:
       Granularity  Pressure[bar]  Temperature[degree_C]    Solvent
    3       coarse            1.0                  100.0  Solvent D

```python
# Add some fake results
add_fake_results(recommendation, campaign.targets)
campaign.add_measurements(recommendation)
```

## Model Outputs

Note that the surrogate model is only used once measurements are available.

```python
print("Here you will see some model outputs as we set verbose to True")
```

    Here you will see some model outputs as we set verbose to True

```python
# Do another round of recommendation
recommendation = campaign.recommend(batch_size=1)
```

Print the second round of recommendations.

```python
print("Recommendation from campaign:")
print(recommendation)
```

    Recommendation from campaign:
          Granularity  Pressure[bar]  Temperature[degree_C]    Solvent
    index
    0          coarse            1.0                  100.0  Solvent A

## Using configuration instead

Note that this can be placed inside an overall campaign config.
Refer to [`create_from_config`](./../Serialization/create_from_config.md) for an example.

Note that the following explicit call to `str()` is not strictly necessary.
It is included because our method of converting this example to a markdown file does not interpret this part of the code as `python` code if the call is omitted.

```python
CONFIG = str(
    """
{
    "type": "NGBoostSurrogate",
    "model_params": {
        "n_estimators": 50,
        "verbose": true
    }
}
"""
)
```

```python
### Model creation from json
recreate_model = NGBoostSurrogate.from_json(CONFIG)
```

This configuration creates the same model

```python
assert recreate_model == surrogate_model
```
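
As a quick sanity check that needs only the standard library, the sketch below parses the same configuration string with `json` and compares it with the values used in this example. It does not touch BayBE itself. It only illustrates that the JSON literal `true` maps to Python's `True` and that the parsed content corresponds to the `model_params` dictionary passed to the surrogate above.

```python
import json

# Same configuration string as in the example above
CONFIG = """
{
    "type": "NGBoostSurrogate",
    "model_params": {
        "n_estimators": 50,
        "verbose": true
    }
}
"""

config = json.loads(CONFIG)
model_params = config["model_params"]

# JSON `true` becomes Python `True`, so the parsed parameters equal the
# dictionary passed as `model_params` when the surrogate was created directly
assert model_params == {"n_estimators": 50, "verbose": True}

# Re-serializing yields the compact one-line form printed by `to_json()` earlier
serialized = json.dumps(config)
print(serialized)
```
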