Constraints¶
Experimental campaigns often have naturally arising constraints on the parameters and their combinations. Such constraints could for example be:
When optimizing a mixture, the relative concentrations of the used ingredients must add up to 1.0.
For chemical reactions, a reagent might be incompatible with high temperatures, hence these combinations must be excluded.
Certain settings are dependent on other parameters, e.g. a set of parameters only becomes relevant if another parameter called
"Switch"
has the value"on"
.
Similar to parameters, BayBE distinguishes two families of constraints, derived from the
abstract Constraint
class: discrete and
continuous constraints (DiscreteConstraint
,
ContinuousConstraint
).
A constraint is called discrete/continuous if it operates on a set of exclusively
discrete/continuous parameters.
Hybrid constraints
Currently, BayBE does not support hybrid constraints, that is, constraints which operate on a mixed set of discrete and continuous parameters. If such a constraint is necessary, it is possible to rephrase the parametrization so that the parameter set is exclusively discrete or continuous in most cases.
Continuous Constraints¶
Warning
Not all surrogate models are able to treat continuous constraints. In such situations the constraints are currently silently ignored.
ContinuousLinearEqualityConstraint¶
The ContinuousLinearEqualityConstraint
asserts that the following equation is true (up to numerical rounding errors):
where \(x_i\) is the value of the \(i\)’th parameter affected by the constraint, \(c_i\) is the coefficient for that parameter, and \(\text{rhs}\) is a user-chosen number.
As an example, let’s assume we have three parameters named x_1
, x_2
and
x_3
, which describe the relative concentrations in a mixture campaign.
The constraint assuring that they always sum up to 1.0 would look like this:
from baybe.constraints import ContinuousLinearEqualityConstraint
ContinuousLinearEqualityConstraint(
parameters=["x_1", "x_2", "x_3"], # these parameters must exist in the search space
coefficients=[1.0, 1.0, 1.0],
rhs=1.0,
)
A more detailed example can be found here.
ContinuousLinearInequalityConstraint¶
The ContinuousLinearInequalityConstraint
asserts that the following equation is true (up to numerical rounding errors):
where \(x_i\) is the value of the \(i\)’th parameter affected by the constraint, \(c_i\) is the coefficient for that parameter, and \(\text{rhs}\) is a user-chosen number.
Reversing the inequality
You can specify a constraint involving <=
instead of >=
by multiplying
both sides, i.e. the coefficients and rhs, by -1.
Let us amend the example from ContinuousLinearEqualityConstraint
and assume
that there is always a fourth component to the mixture that serves as a “filler”.
In such a case, we might want to ensure that the first three components only make up to
80% of the mixture.
The following constraint would achieve this:
from baybe.constraints import ContinuousLinearInequalityConstraint
ContinuousLinearInequalityConstraint(
parameters=["x_1", "x_2", "x_3"], # these parameters must exist in the search space
coefficients=[-1.0, -1.0, -1.0],
rhs=-0.8, # coefficients and rhs are negated because we model a `<=` constraint
)
A more detailed example can be found here.
Conditions¶
Conditions are elements used within discrete constraints. While discrete constraints can operate on one or multiple parameters, a condition always describes the relation of a single parameter to its possible values. It is through chaining several conditions in constraints that we can build complex logical expressions for them.
ThresholdCondition¶
For numerical parameters, we might want to select a certain range, which can be
achieved with a ThresholdCondition
:
from baybe.constraints import ThresholdCondition
ThresholdCondition( # will select all values above 150
threshold=150,
operator=">",
)
SubSelectionCondition¶
In case a specific subset of values needs to be selected, it can be done with the
SubSelectionCondition
:
from baybe.constraints import SubSelectionCondition
SubSelectionCondition( # will select two solvents identified by their labels
selection=["Ethanol", "DMF"]
)
Discrete Constraints¶
Discrete constraints currently do not affect the optimization process directly.
Instead, they act as a filter on the search space.
For instance, a search space created via from_product
might include invalid combinations, which can be removed again by applying constraints.
Discrete constraints have in common that they operate on one or more parameters,
identified by the parameters
member, which expects a list of parameter names as
strings.
All of these parameters must be present in the search space specification.
DiscreteExcludeConstraint¶
The DiscreteExcludeConstraint
constraint simply removes a set of search space elements, according to its
specifications.
The following example would exclude entries where “Ethanol” and “DMF” are combined with temperatures above 150, which might be due to their chemical instability at those temperatures:
from baybe.constraints import (
DiscreteExcludeConstraint,
ThresholdCondition,
SubSelectionCondition,
)
DiscreteExcludeConstraint(
parameters=["Temperature", "Solvent"], # names of the affected parameters
combiner="AND", # specifies how the conditions are logically combined
conditions=[ # requires one condition for each entry in parameters
ThresholdCondition(threshold=150, operator=">"),
SubSelectionCondition(selection=["Ethanol", "DMF"]),
],
)
A more detailed example can be found here.
DiscreteSumConstraint and DiscreteProductConstraint¶
DiscreteSumConstraint
and DiscreteProductConstraint
impose conditions on sums or products of numerical parameters.
In the example from ContinuousLinearEqualityConstraint
, we
had three continuous parameters x_1
, x_2
and x_3
, which needed to sum
up to 1.0.
If these parameters were instead discrete, the corresponding constraint would look like:
from baybe.constraints import DiscreteSumConstraint, ThresholdCondition
DiscreteSumConstraint(
parameters=["x_1", "x_2", "x_3"],
condition=ThresholdCondition( # set condition that should apply to the sum
threshold=1.0,
operator="=",
tolerance=0.001, # optional; here, everything between 0.999 and 1.001 would also be considered valid
),
)
An end to end example can be found here.
DiscreteNoLabelDuplicatesConstraint¶
Sometimes, duplicated labels in several parameters are undesirable.
Consider an example with two solvents that describe different mixture
components.
These might have the exact same or overlapping sets of possible values, e.g.
["Water", "THF", "Octanol"]
.
It would not necessarily be reasonable to allow values in which both solvents show the
same label/component.
We can exclude such occurrences with the
DiscreteNoLabelDuplicatesConstraint
:
from baybe.constraints import DiscreteNoLabelDuplicatesConstraint
DiscreteNoLabelDuplicatesConstraint(parameters=["Solvent_1", "Solvent_2"])
Without this constraint, combinations like below would be possible:
Solvent_1 |
Solvent_2 |
With DiscreteNoLabelDuplicatesConstraint |
|
---|---|---|---|
1 |
Water |
Water |
would be excluded |
2 |
THF |
Water |
|
3 |
Octanol |
Octanol |
would be excluded |
The usage of DiscreteNoLabelDuplicatesConstraint
is part of the
example on mixtures.
DiscreteLinkedParametersConstraint¶
The DiscreteLinkedParametersConstraint
is, in a sense, the opposite of the
DiscreteNoLabelDuplicatesConstraint
.
It will ensure that only entries with duplicated labels are present.
This can be useful, for instance, in situations where we have one parameter but would
like to include it with several encodings:
from baybe.parameters import SubstanceParameter
from baybe.constraints import DiscreteLinkedParametersConstraint
dict_solvents = {"Water": "O", "THF": "C1CCOC1", "Octanol": "CCCCCCCCO"}
solvent_encoding1 = SubstanceParameter(
name="Solvent_RDKIT_enc",
data=dict_solvents,
encoding="RDKIT",
)
solvent_encoding2 = SubstanceParameter(
name="Solvent_MORDRED_enc",
data=dict_solvents,
encoding="MORDRED",
)
DiscreteLinkedParametersConstraint(
parameters=["Solvent_RDKIT_enc", "Solvent_MORDRED_enc"]
)
Solvent_RDKIT_enc |
Solvent_MORDRED_enc |
With DiscreteLinkedParametersConstraint |
|
---|---|---|---|
1 |
Water |
Water |
|
2 |
THF |
Water |
would be excluded |
3 |
Octanol |
Octanol |
DiscreteDependenciesConstraint¶
A dependency is a situation where parameters depend on other parameters.
Let’s say an experimental setup has a parameter called "Switch"
, which turns on
pieces of equipment that are optional.
This means the other parameters (called affected_parameters
) are only relevant if
the switch parameter has the value "on"
. If the switch is "off"
, the affected
parameters are irrelevant.
You can specify such a dependency with the
DiscreteDependenciesConstraint
, which requires:
A list
parameters
with the names of the parameters upon which others depend.A list
conditions
, specifying the values of the corresponding entries inparameters
that “activate” the dependent parameters.A list of lists, each containing the
affected_parameters
, which become relevant only if the corresponding entry inparameters
is active as specified by the entry inconditions
.
Internally, BayBE drops elements from the SearchSpace
where affected parameters are
irrelevant. Since in our example "off"
is still a valid value for the switch, the
SearchSpace
will still retain one configuration for that setting, showing arbitrary
values for the affected_parameters
(which can be ignored).
Important
BayBE requires that all dependencies are declared in a single
DiscreteDependenciesConstraint
. Creating a SearchSpace
from multiple
DiscreteDependenciesConstraint
’s will throw a validation error.
In the example below, we mimic a situation where there are two switches and each switch
activates two other parameters that are only relevant if the first switch is "on"
/ the
second switch is set to "right"
, respectively.
from baybe.constraints import DiscreteDependenciesConstraint, SubSelectionCondition
DiscreteDependenciesConstraint(
parameters=["Switch_1", "Switch_2"], # the two parameters upon which others depend
conditions=[
SubSelectionCondition(
# values of Switch_1 that activate the affected parameters
selection=["on"]
),
SubSelectionCondition(
# values of Switch_2 that activate the affected parameters
selection=["right"]
),
],
affected_parameters=[
["Solvent", "Fraction"], # parameters affected by Switch_1
["Frame_1", "Frame_2"], # parameters affected by Switch_2
],
)
An end to end example can be found here.
DiscretePermutationInvarianceConstraint¶
Permutation invariance, enabled by the
DiscretePermutationInvarianceConstraint
, is a property where combinations of values of multiple
parameters do not depend on their order due to some symmetry in the experiment.
Suppose we create a mixture containing up to three solvents, i.e. parameters
“Solvent_1”, “Solvent_2”, “Solvent_3”.
In this situation, all combinations from the following table would be equivalent,
hence the SearchSpace
should effectively only contain one of them.
Solvent_1 |
Solvent_2 |
Solvent_3 |
|
---|---|---|---|
1 |
Substance_43 |
Substance_3 |
Substance_12 |
2 |
Substance_43 |
Substance_12 |
Substance_3 |
3 |
Substance_3 |
Substance_12 |
Substance_43 |
4 |
Substance_3 |
Substance_43 |
Substance_12 |
5 |
Substance_12 |
Substance_43 |
Substance_3 |
6 |
Substance_12 |
Substance_3 |
Substance_43 |
Note
Complex properties such as permutation invariance not only affect the search space but should ideally also constrain the surrogate model. For instance, the kernels in a Gaussian process can be made permutation-invariant to reflect this constraint, which generally results in a better learning curve. Note that at this stage no surrogate model provided by BayBE takes care of these invariances. This means the invariance is ignored during model fitting and these models do not benefit from a priori known constraints and invariances between parameters. However, generally, the optimization will still work. We are in the process of enabling this as new feature, but in the meantime the user can introduce their own custom surrogate model to include these.
Let’s add to the mixture example the fact that not only the choice of substance but also
their relative mixture fractions are parameters, i.e. “Fraction_1”, “Fraction_2” and
“Fraction_3”.
This also implies that the solvent parameters depend on their corresponding
fraction being > 0.0
, because in the case == 0.0
the choice of solvent is
irrelevant. This models a scenario that allows “up to, but not necessarily,
three solvents”.
Important
If some of the parameters
of the DiscretePermutationInvarianceConstraint
are
dependent on other parameters, we require that the dependencies are provided as a
DiscreteDependenciesConstraint
to the dependencies
argument of the
DiscretePermutationInvarianceConstraint
. This
DiscreteDependenciesConstraint
will not count towards the maximum limit of one
DiscreteDependenciesConstraint
discussed here.
The DiscretePermutationInvarianceConstraint
below applies to our example and
removes permutation-invariant combinations of solvents that have additional
dependencies as well:
from baybe.constraints import (
DiscretePermutationInvarianceConstraint,
DiscreteDependenciesConstraint,
ThresholdCondition,
)
DiscretePermutationInvarianceConstraint(
parameters=["Solvent_1", "Solvent_2", "Solvent_3"],
# `dependencies` is optional; it is only required if some of the permutation
# invariant entries in `parameters` have dependencies on other parameters
dependencies=DiscreteDependenciesConstraint(
parameters=["Fraction_1", "Fraction_2", "Fraction_3"],
conditions=[
ThresholdCondition(threshold=0.0, operator=">"),
ThresholdCondition(threshold=0.0, operator=">"),
ThresholdCondition(threshold=0.0, operator=">"),
],
affected_parameters=[["Solvent_1"], ["Solvent_2"], ["Solvent_3"]],
),
)
The usage of DiscretePermutationInvarianceConstraint
is also part of the
example on mixtures.
DiscreteCustomConstraint¶
With a DiscreteCustomConstraint
constraint, you can specify a completely custom filter:
import pandas as pd
import numpy as np
from baybe.constraints import DiscreteCustomConstraint
def custom_filter(df: pd.DataFrame) -> pd.Series: # this signature is required
"""
In this example, we exclude entries where the square root of the
temperature times the cubed pressure are larger than 5.6.
"""
mask_good = np.sqrt(df["Temperature"]) * np.power(df["Pressure"], 3) <= 5.6
return mask_good
DiscreteCustomConstraint(
parameters=[ # the custom function will have access to these variables
"Pressure",
"Temperature",
],
validator=custom_filter,
)
Find a detailed example here.
Warning
Due to the arbitrary nature of code and dependencies that can be used in the
DiscreteCustomConstraint
, de-/serializability cannot be guaranteed. As a consequence,
using a DiscreteCustomConstraint
results in an error if you attempt to serialize
the corresponding object or higher-level objects containing it.