Manager
OctoManager.
ExecutionStrategy
Bases: Protocol
Protocol for outer split execution strategies.
Source code in octopus/manager/execution.py
execute(outer_split_data, run_fn)
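Because ExecutionStrategy is a Protocol, any class with a matching execute method conforms without inheriting from it. The sketch below illustrates that idea only. The type hints, the None return type, the RecordingStrategy class, and the three-argument call shape of run_fn (split id, split data, CPU count) are assumptions made for illustration, not taken from the library.

```python
from __future__ import annotations

from typing import Any, Callable, Protocol


class ExecutionStrategy(Protocol):
    """Structural interface: any object with a matching execute() conforms."""

    def execute(self, outer_split_data: dict[int, Any], run_fn: Callable[..., Any]) -> None:
        ...


class RecordingStrategy:
    """Hypothetical strategy; note that it does not inherit from the Protocol."""

    def execute(self, outer_split_data, run_fn):
        for split_id, split in outer_split_data.items():
            # Assumed call shape: run_fn(split_id, split, n_cpus)
            run_fn(split_id, split, 1)


def run_with(strategy: ExecutionStrategy, data: dict[int, Any], run_fn: Callable[..., Any]) -> None:
    # A static type checker accepts RecordingStrategy here via structural typing.
    strategy.execute(data, run_fn)


calls = []
run_with(RecordingStrategy(), {0: "split-0", 1: "split-1"}, lambda i, s, n: calls.append((i, s, n)))
```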
OctoManager
Orchestrates the execution of outer splits.
Source code in octopus/manager/core.py
n_cpus = field(validator=(validators.instance_of(int)))
class-attribute
instance-attribute
Number of CPUs to use for parallel processing. n_cpus=0 uses all available CPUs. A negative value leaves abs(n_cpus) CPUs free, e.g. -1 uses all but one CPU. Set to 1 to disable all parallel processing and run sequentially.
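The n_cpus convention can be sketched as a small helper that maps the configured value to a concrete CPU count. The function name resolve_n_cpus, the total parameter, and the floor of one CPU for large negative values are assumptions for illustration; the library may resolve the value differently.

```python
from __future__ import annotations

import os
from typing import Optional


def resolve_n_cpus(n_cpus: int, total: Optional[int] = None) -> int:
    """Map the documented n_cpus convention to a concrete CPU count."""
    if total is None:
        total = os.cpu_count() or 1  # os.cpu_count() can return None
    if n_cpus == 0:
        return total  # 0 means all available CPUs
    if n_cpus < 0:
        # Leave abs(n_cpus) CPUs free; never go below one (assumption).
        return max(1, total + n_cpus)
    return n_cpus  # positive values are used as given; 1 means sequential
```

On an 8-CPU machine, for example, resolve_n_cpus(-1, total=8) yields 7 and resolve_n_cpus(0, total=8) yields 8.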
outer_split_data = field(validator=[validators.instance_of(dict)])
class-attribute
instance-attribute
Preprocessed data for each outer split, keyed by outer split identifier.
single_outer_split = field(validator=(validators.optional(validators.and_(validators.instance_of(int), validators.ge(0)))))
class-attribute
instance-attribute
Index of single outer split to run (None for all).
study_context = field(validator=[validators.instance_of(StudyContext)])
class-attribute
instance-attribute
Frozen runtime context containing study configuration.
workflow = field(validator=[validators.instance_of(list)])
class-attribute
instance-attribute
Workflow tasks to execute.
run_outer_splits()
Run all outer splits.
Source code in octopus/manager/core.py
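One plausible reading of how the documented fields map to the strategies on this page is sketched below. The function choose_strategy and its precedence order are assumptions inferred from the attribute descriptions (single_outer_split selects one split, n_cpus=1 disables parallelism); the actual selection logic lives in octopus/manager/core.py and may differ.

```python
from __future__ import annotations

from typing import Optional


def choose_strategy(n_cpus: int, single_outer_split: Optional[int]) -> str:
    """Return the name of the strategy that the documented fields suggest."""
    if single_outer_split is not None:
        # A specific outer split was requested.
        return "SingleOuterSplitStrategy"
    if n_cpus == 1:
        # n_cpus=1 is documented as disabling all parallel processing.
        return "SequentialStrategy"
    return "ParallelRayStrategy"
```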
ParallelRayStrategy
Bases: ExecutionStrategy
Run outer splits in parallel using Ray.
This strategy starts as many parallel workers as allowed by the resource configuration set up in ray_parallel.init() and executes one outer split per worker.
Source code in octopus/manager/execution.py
n_cpus_per_worker = field(validator=[validators.instance_of(int), validators.ge(1)])
class-attribute
instance-attribute
Number of CPUs to use for parallel processing within each parallel worker.
execute(outer_split_data, run_fn)
Execute all outer splits in parallel using Ray.
Source code in octopus/manager/execution.py
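The fan-out pattern described above (one outer split per worker, each worker told how many CPUs it may use) can be sketched with the standard library. This example deliberately uses concurrent.futures.ThreadPoolExecutor as a stand-in for Ray so that it runs without extra dependencies; it is not the library's implementation. The function name, the n_workers parameter, the returned dictionary, and the run_fn call shape are assumptions.

```python
from concurrent.futures import ThreadPoolExecutor


def execute_parallel(outer_split_data, run_fn, n_workers, n_cpus_per_worker):
    """Submit one task per outer split and wait for all of them to finish."""
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        futures = {
            split_id: pool.submit(run_fn, split_id, split, n_cpus_per_worker)
            for split_id, split in outer_split_data.items()
        }
        # result() re-raises any exception that occurred inside a worker.
        return {split_id: fut.result() for split_id, fut in futures.items()}


results = execute_parallel(
    {0: [1, 2], 1: [3, 4], 2: [5]},
    lambda split_id, split, n_cpus: sum(split) * n_cpus,  # toy stand-in for run_fn
    n_workers=2,
    n_cpus_per_worker=2,
)
```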
SequentialStrategy
Bases: ExecutionStrategy
Run outer splits one after another.
Source code in octopus/manager/execution.py
n_cpus = field(validator=[validators.instance_of(int), validators.ge(1)])
class-attribute
instance-attribute
Number of CPUs to use for parallel processing in each sequential step.
execute(outer_split_data, run_fn)
Execute all outer splits sequentially.
Source code in octopus/manager/execution.py
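Sequential execution amounts to a plain loop over the split dictionary, passing the full CPU budget to each split in turn. The sketch below assumes run_fn is called as run_fn(split_id, split, n_cpus) and that splits are visited in dictionary insertion order; neither detail is confirmed by this page.

```python
def execute_sequential(outer_split_data, run_fn, n_cpus):
    """Run every outer split one after another with the same CPU budget."""
    for split_id, split in outer_split_data.items():
        run_fn(split_id, split, n_cpus)


order = []
execute_sequential(
    {2: "c", 0: "a", 1: "b"},
    lambda split_id, split, n_cpus: order.append((split_id, n_cpus)),
    n_cpus=4,
)
```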
SingleOuterSplitStrategy
Bases: ExecutionStrategy
Run a single outer split by index.
Source code in octopus/manager/execution.py
n_cpus = field(validator=[validators.instance_of(int), validators.ge(1)])
class-attribute
instance-attribute
Number of CPUs to use for parallel processing within the single outer split.
execute(outer_split_data, run_fn)
Execute only the outer split at outer_split_index.
Source code in octopus/manager/execution.py
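Running a single split reduces to one lookup and one call. The sketch below treats outer_split_index as a key into outer_split_data and raises a ValueError for unknown keys; both choices, along with the function name and the run_fn call shape, are assumptions for illustration.

```python
def execute_single(outer_split_data, run_fn, outer_split_index, n_cpus):
    """Run only the requested outer split, failing loudly if it does not exist."""
    if outer_split_index not in outer_split_data:
        available = sorted(outer_split_data)
        raise ValueError(
            f"Outer split {outer_split_index} not found; available: {available}"
        )
    return run_fn(outer_split_index, outer_split_data[outer_split_index], n_cpus)


seen = []
execute_single(
    {0: "a", 1: "b"},
    lambda split_id, split, n_cpus: seen.append((split_id, split, n_cpus)),
    outer_split_index=1,
    n_cpus=2,
)
```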
WorkflowTaskRunner
Runs workflow tasks for a single outer split.
Handles the lifecycle of processing workflow tasks:

- Saving split data
- Running tasks with dependencies
- Saving task results
Attributes:
| Name | Type | Description |
|---|---|---|
| study_context | StudyContext | Frozen runtime context containing study configuration. |
| workflow | Sequence[Task] | List of workflow tasks to execute. |
Source code in octopus/manager/workflow_runner.py
run(outer_split_id, outer_split, n_assigned_cpus)
Process all workflow tasks for a single outer split.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| outer_split_id | int | Current outer split ID | required |
| outer_split | OuterSplit | OuterSplit containing traindev and test DataFrames | required |
| n_assigned_cpus | int | Number of CPUs assigned to this outer split for inner parallel processing | required |
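The three lifecycle steps (save split data, run tasks with dependencies, save task results) can be illustrated with a toy, in-memory version of run. Everything here is hypothetical: ToyTask, its depends_on field, the in-memory store, and the assumption that the workflow is already listed in dependency order are not the library's Task class or persistence layer.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass
class ToyTask:
    """Hypothetical task used for illustration only."""

    task_id: int
    depends_on: Optional[int]  # task_id whose output feeds this task, or None
    fn: Callable[[Any, int], Any]  # (input_data, n_cpus) -> output


def run_split(outer_split_id, outer_split, n_assigned_cpus, workflow):
    store = {("split", outer_split_id): outer_split}  # step 1: save split data
    outputs = {}
    for task in workflow:  # step 2: run tasks, assuming dependency order
        source = outer_split if task.depends_on is None else outputs[task.depends_on]
        outputs[task.task_id] = task.fn(source, n_assigned_cpus)
        # step 3: save the task result
        store[("task", outer_split_id, task.task_id)] = outputs[task.task_id]
    return store


workflow = [
    ToyTask(0, None, lambda data, n_cpus: [x * 2 for x in data]),
    ToyTask(1, 0, lambda data, n_cpus: sum(data)),
]
store = run_split(3, [1, 2, 3], n_assigned_cpus=2, workflow=workflow)
```

In this toy run, task 1 consumes the output of task 0 rather than the raw split, which is the dependency behavior the lifecycle list describes.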