baybe.utils.augmentation.df_apply_dependency_augmentation

baybe.utils.augmentation.df_apply_dependency_augmentation(df: DataFrame, causing: tuple[str, Sequence], affected: Collection[tuple[str, Sequence]])

Augment a dataframe if dependency-invariant columns are present.

This works with column-values pairs for the causing column and for the affected columns. Every row in which the causing column holds one of the provided values triggers an augmentation of the affected columns: the row is repeated once for each invariant value of an affected column and, when several affected columns are given, once for each combination of their invariant values.
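The mechanism described above can be sketched in plain pandas. This is an illustrative re-implementation rather than BayBE's actual code: the function name `sketch_dependency_augmentation` is hypothetical, and the sketch assumes that the original values of the affected columns are among the listed invariant values, as is the case in all examples on this page.

```python
from itertools import product

import pandas as pd


def sketch_dependency_augmentation(df, causing, affected):
    """Illustrative sketch only; hypothetical helper, not BayBE's implementation."""
    causing_col, causing_vals = causing
    affected_cols = [col for col, _ in affected]
    affected_vals = [vals for _, vals in affected]

    records, index = [], []
    for idx, record in zip(df.index, df.to_dict("records")):
        if record[causing_col] in causing_vals:
            # Triggered row: emit one copy per combination of invariant values.
            # Assumption: the original values are part of these combinations;
            # otherwise this sketch would not emit the original row.
            for combo in product(*affected_vals):
                records.append({**record, **dict(zip(affected_cols, combo))})
                index.append(idx)  # augmented rows keep the original index
        else:
            # Untriggered row: passed through unchanged.
            records.append(record)
            index.append(idx)
    return pd.DataFrame(records, index=index, columns=df.columns)


df = pd.DataFrame({"A": [0, 1], "B": [2, 3], "C": [5, 5], "D": [6, 7]})
result = sketch_dependency_augmentation(
    df, ("A", [0]), [("B", [2, 3]), ("C", [5, 6])]
)
```

With these inputs, the row with `A == 0` expands into four rows (two values for `B` times two values for `C`), all carrying index `0`, while the row with `A == 1` is left untouched.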

Parameters:

df (DataFrame) – The dataframe that should be augmented.
causing (tuple[str, Sequence]) – Causing column name and its causing values.
affected (Collection[tuple[str, Sequence]]) – Affected columns and their invariant values.

Return type:

DataFrame

Returns:

The augmented dataframe, which contains the original rows. Each augmented row keeps the index of the row it was derived from.
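Because augmented rows share the index of their source row, the returned dataframe generally has repeated index labels. The following sketch shows one possible way to work with that, using a hand-built stand-in for an augmented frame instead of calling the function; the column name `source_row` is an arbitrary choice made for illustration.

```python
import pandas as pd

# Hand-built stand-in for an augmented frame: index label 0 appears three times.
augmented = pd.DataFrame(
    {"A": [0, 0, 0, 1], "B": [2, 3, 4, 3], "C": [5, 5, 5, 5], "D": [6, 6, 6, 7]},
    index=[0, 0, 0, 1],
)

# Count how many rows each original row expanded into.
rows_per_original = augmented.groupby(level=0).size()

# Switch to a fresh unique index while keeping the original label as a column.
flat = augmented.rename_axis("source_row").reset_index()
```

Grouping on the index level recovers which rows belong together, and `reset_index` is useful when downstream code requires unique row labels.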

Examples

>>> import pandas as pd
>>> from baybe.utils.augmentation import df_apply_dependency_augmentation
>>> df = pd.DataFrame({'A':[0,1],'B':[2,3], 'C': [5, 5], 'D': [6, 7]})
>>> df
   A  B  C  D
0  0  2  5  6
1  1  3  5  7

>>> causing = ('A', [0])
>>> affected = [('B', [2,3,4])]
>>> dfa = df_apply_dependency_augmentation(df, causing, affected)
>>> dfa
   A  B  C  D
0  0  2  5  6
0  0  3  5  6
0  0  4  5  6
1  1  3  5  7

>>> causing = ('A', [0, 1])
>>> affected = [('B', [2,3])]
>>> dfa = df_apply_dependency_augmentation(df, causing, affected)
>>> dfa
   A  B  C  D
0  0  2  5  6
0  0  3  5  6
1  1  2  5  7
1  1  3  5  7

>>> causing = ('A', [0])
>>> affected = [('B', [2,3]), ('C', [5, 6])]
>>> dfa = df_apply_dependency_augmentation(df, causing, affected)
>>> dfa
   A  B  C  D
0  0  2  5  6
0  0  2  6  6
0  0  3  5  6
0  0  3  6  6
1  1  3  5  7