Workflows
=========

SO Campaign Manager supports several types of workflows for different analysis tasks.

Overview
--------

Workflows are the fundamental units of computation in SO Campaign Manager. Each workflow:

* Defines a specific analysis task
* Specifies resource requirements
* Includes environment configuration
* Can have dependencies on other workflows

Available Workflows
-------------------

ML Mapmaking
~~~~~~~~~~~~

Maximum likelihood mapmaking creates maps from time-ordered data using iterative algorithms.

**Purpose:** Generate high-quality maps with proper noise modeling and systematics mitigation.

**Configuration Example:**

.. code-block:: toml

   [campaign.ml-mapmaking]
   context = "file:///path/to/context.yaml"
   area = "file:///path/to/area.fits"
   output_dir = "/path/to/output"
   bands = "f090"
   wafer = "ws0"
   comps = "TQU"
   maxiter = 10
   query = "obs_id='1575600533.1575611468.ar5_1'"
   tiled = 1
   site = "act"

**Key Parameters:**

* ``context``: Context file defining data selection and processing parameters
* ``area``: FITS file defining the sky area to map
* ``output_dir``: Directory where output is written
* ``bands``: Frequency bands to process (e.g. ``f090``, ``f150``)
* ``wafer``: Detector wafer identifier
* ``comps``: Map components (``T`` for temperature only, ``TQU`` for temperature and polarization)
* ``maxiter``: Maximum number of iterations for convergence
* ``query``: SQL-like query for data selection
* ``tiled``: Whether to use tiled processing (0 or 1)
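
The constraints listed above can be checked before a campaign is submitted. The
following is a minimal sketch, not part of SO Campaign Manager:
``check_ml_mapmaking`` is a hypothetical helper, and it accepts only the values
documented here (the real workflow may accept others).

.. code-block:: python

   def check_ml_mapmaking(cfg: dict) -> list[str]:
       """Return a list of problems found in an ML mapmaking table (hypothetical helper)."""
       problems = []
       # Only the two component strings documented above are accepted (assumption).
       if cfg.get("comps") not in ("T", "TQU"):
           problems.append(f"comps must be 'T' or 'TQU', got {cfg.get('comps')!r}")
       # tiled is documented as a 0/1 switch.
       if cfg.get("tiled") not in (0, 1):
           problems.append(f"tiled must be 0 or 1, got {cfg.get('tiled')!r}")
       # maxiter is an iteration cap, so it should be a positive integer.
       maxiter = cfg.get("maxiter")
       if not isinstance(maxiter, int) or maxiter < 1:
           problems.append(f"maxiter must be a positive integer, got {maxiter!r}")
       return problems

   # Same values as the [campaign.ml-mapmaking] example above.
   cfg = {"comps": "TQU", "tiled": 1, "maxiter": 10}
   print(check_ml_mapmaking(cfg))  # an empty list means no problems were found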

**Resource Requirements:**

* Memory-intensive (typically 64-128 GB per process)
* Can benefit from multiple cores for linear algebra operations
* Disk I/O intensive for large datasets

SAT Simulation
~~~~~~~~~~~~~~

Small Aperture Telescope (SAT) simulation workflows for generating synthetic observations
using ``toast_so_sim``.

**Purpose:** Create realistic simulated timestreams for validation and systematics studies.

**Configuration Example:**

.. code-block:: toml

   [campaign.sat-sims]
   output_dir = "/path/to/output"
   schedule = "/path/to/schedule.txt"
   bands = "SAT_f090"
   wafer_slots = "w25"
   sample_rate = 37
   sim_noise = false
   scan_map = false
   sim_atmosphere = false
   sim_sss = false
   sim_hwpss = false

**Key Parameters:**

* ``output_dir``: Directory for simulation output
* ``schedule``: Observation schedule file
* ``bands``: Frequency band (e.g. ``SAT_f090``, ``SAT_f150``)
* ``wafer_slots``: Wafer slot identifier (e.g. ``w25``)
* ``sample_rate``: Detector sample rate in Hz (default: 37)
* ``sim_noise``: Enable noise simulation (boolean)
* ``scan_map``: Enable scanning of an input sky map into the simulated timestreams (boolean)
* ``sim_atmosphere``: Enable atmosphere simulation (boolean)
* ``sim_sss``: Enable scan-synchronous signal simulation (boolean)
* ``sim_hwpss``: Enable HWP synchronous signal simulation (boolean)
* ``pixels_healpix_radec_nside``: HEALPix resolution (default: 512)
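
To illustrate how the documented defaults combine with an explicit table, here
is a small sketch. ``with_defaults`` and ``enabled_components`` are hypothetical
helpers written for this page, and only the two defaults quoted in the list above
are assumed.

.. code-block:: python

   # Defaults quoted in the parameter list above.
   SAT_SIM_DEFAULTS = {"sample_rate": 37, "pixels_healpix_radec_nside": 512}

   def with_defaults(table: dict) -> dict:
       """Return a copy of a [campaign.sat-sims] table with documented defaults filled in."""
       return {**SAT_SIM_DEFAULTS, **table}

   def enabled_components(table: dict) -> list[str]:
       """List the sim_* switches that are turned on."""
       return sorted(k for k, v in table.items() if k.startswith("sim_") and v is True)

   table = with_defaults({"bands": "SAT_f090", "wafer_slots": "w25", "sim_noise": True})
   print(table["sample_rate"], table["pixels_healpix_radec_nside"])  # 37 512
   print(enabled_components(table))  # ['sim_noise']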

Power Spectra
~~~~~~~~~~~~~

Power spectrum estimation workflow using PSpipe.

**Purpose:** Compute power spectra from maps produced by the mapmaking pipeline.

**Configuration Example:**

.. code-block:: toml

   [campaign.power-spectra]
   subcommand = "/path/to/script.py"
   script_args = ["/path/to/paramfile.dict"]
   script_flags = ["simulate-syst", "simulate-lens"]

**Key Parameters:**

* ``subcommand``: Path to the PSpipe Python script to run
* ``script_args``: Positional arguments passed to the script (list)
* ``script_flags``: Boolean flags passed as ``--flag`` (list)
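
The way these three keys combine into a command line can be sketched as follows.
This is an illustration under assumptions: ``build_command`` is a hypothetical
helper, and the ordering (positional arguments first, then flags) is a guess, not
necessarily how SO Campaign Manager assembles the call.

.. code-block:: python

   import shlex

   def build_command(subcommand: str, script_args: list[str], script_flags: list[str]) -> str:
       """Join script path, positional arguments, and --flags into one shell-safe string."""
       argv = [subcommand, *script_args, *(f"--{flag}" for flag in script_flags)]
       return shlex.join(argv)

   print(build_command(
       "/path/to/script.py",
       ["/path/to/paramfile.dict"],
       ["simulate-syst", "simulate-lens"],
   ))
   # /path/to/script.py /path/to/paramfile.dict --simulate-syst --simulate-lens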

**Resource Requirements:**

* Scales with the number of map products being cross-correlated
* Some stages (e.g. mode-coupling matrix) are MPI-parallel and benefit from many ranks

ML Null Tests
~~~~~~~~~~~~~

Statistical tests to validate mapmaking results by creating maps from data splits.

**Purpose:** Detect systematic errors and validate noise models by checking that null maps (differences between splits) are consistent with noise.
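
The idea of a null map being "consistent with noise" can be made concrete with a
toy calculation. The sketch below is not the pipeline's statistic: it assumes
independent Gaussian noise with the same standard deviation ``sigma`` in every
pixel of both splits, and ``null_chi2_per_pixel`` is a hypothetical helper.

.. code-block:: python

   import random

   def null_chi2_per_pixel(split_a, split_b, sigma):
       """Reduced chi-square of the null map (a - b) / 2.

       With independent noise of variance sigma**2 in each split, the null
       map has variance sigma**2 / 2 per pixel.
       """
       null = [(a - b) / 2 for a, b in zip(split_a, split_b)]
       return sum(d * d for d in null) / (len(null) * sigma**2 / 2)

   rng = random.Random(0)
   sigma = 0.1
   sky = [rng.uniform(-1, 1) for _ in range(5000)]    # common signal, cancels in the null
   split_a = [s + rng.gauss(0, sigma) for s in sky]
   split_b = [s + rng.gauss(0, sigma) for s in sky]
   print(null_chi2_per_pixel(split_a, split_b, sigma))   # close to 1 for pure noise

   biased_b = [b + 0.5 for b in split_b]               # inject a toy systematic offset
   print(null_chi2_per_pixel(split_a, biased_b, sigma))  # well above 1

A value near 1 means the split difference is explained by the assumed noise
level; a much larger value points to residual signal or a systematic.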

All null tests share the following common parameters:

* ``chunk_nobs``: Number of observations per chunk used to define splits
* ``context``, ``area``, ``output_dir``, ``query``: Same as ML Mapmaking

**Types of Null Tests:**

Mission Tests
^^^^^^^^^^^^^

Splits observations in time to test for time-dependent systematics.

.. code-block:: toml

   [campaign.ml-null-tests.mission-tests]
   chunk_nobs = 10
   nsplits = 8

Observations are sorted by timestamp, grouped into chunks of ``chunk_nobs``, and
distributed across ``nsplits`` splits in a time-interleaved fashion.
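
The sort, chunk, and interleave steps can be sketched as follows. This is an
illustration only: ``time_interleaved_splits`` is a hypothetical function, and
assigning chunks to splits round-robin is one plausible reading of
"time-interleaved", not necessarily the exact rule the workflow uses.

.. code-block:: python

   def time_interleaved_splits(obs, chunk_nobs, nsplits):
       """Assign observations to splits.

       obs is an iterable of (obs_id, timestamp) pairs; the result is a list
       of nsplits lists of obs_ids.
       """
       ordered = sorted(obs, key=lambda o: o[1])                   # sort by timestamp
       chunks = [ordered[i:i + chunk_nobs]                          # group into chunks
                 for i in range(0, len(ordered), chunk_nobs)]
       splits = [[] for _ in range(nsplits)]
       for idx, chunk in enumerate(chunks):                         # deal chunks round-robin
           splits[idx % nsplits].extend(obs_id for obs_id, _ in chunk)
       return splits

   obs = [(f"obs{t}", t) for t in (5, 1, 4, 2, 3, 0, 7, 6)]
   print(time_interleaved_splits(obs, chunk_nobs=2, nsplits=2))
   # [['obs0', 'obs1', 'obs4', 'obs5'], ['obs2', 'obs3', 'obs6', 'obs7']]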

Wafer Tests
^^^^^^^^^^^

Splits observations by detector wafer to test for detector-dependent systematics.

.. code-block:: toml

   [campaign.ml-null-tests.wafer-tests]
   chunk_nobs = 10
   nsplits = 8

Observations are grouped by wafer slot and maps are produced per-wafer for comparison.

Direction Tests
^^^^^^^^^^^^^^^

Splits observations by scan direction (rising, setting, or middle azimuth) to test for
scan-synchronous systematics. The number of splits is fixed at ``nsplits = 2``, so the key
is omitted from the configuration.

.. code-block:: toml

   [campaign.ml-null-tests.direction-tests]
   chunk_nobs = 10

Observations are classified by azimuth center into rising (az < 180°), setting (az > 180°),
or middle (az ≈ 180°) groups, and time-interleaved splits are created within each group.
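
The classification step can be sketched as below. The width of the "middle"
band is not specified here, so ``tol_deg`` is a hypothetical tolerance chosen
purely for illustration, as is the function name.

.. code-block:: python

   def classify_direction(az_center_deg: float, tol_deg: float = 10.0) -> str:
       """Label an observation by its azimuth center (tol_deg is an assumed tolerance)."""
       az = az_center_deg % 360.0          # wrap into [0, 360)
       if abs(az - 180.0) <= tol_deg:
           return "middle"
       return "rising" if az < 180.0 else "setting"

   for az in (90.0, 182.0, 270.0):
       print(az, classify_direction(az))
   # 90.0 rising
   # 182.0 middle
   # 270.0 setting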

PWV Tests
^^^^^^^^^

Splits observations by precipitable water vapour (PWV) level to test for
atmosphere-dependent systematics.

.. code-block:: toml

   [campaign.ml-null-tests.pwv-tests]
   chunk_nobs = 10
   nsplits = 2

Observations are ordered by PWV value and interleaved into splits, separating
low-PWV from high-PWV conditions.

Day/Night Tests
^^^^^^^^^^^^^^^

Splits observations into daytime and nighttime subsets to test for solar-related
systematics.

.. code-block:: toml

   [campaign.ml-null-tests.day-night-tests]
   chunk_nobs = 10
   nsplits = 2

Observations are classified as day or night based on their timestamp and maps are
produced separately for each condition.

Elevation Tests
^^^^^^^^^^^^^^^

Splits observations by telescope elevation to test for elevation-dependent systematics
such as ground pickup or atmospheric gradients.

.. code-block:: toml

   [campaign.ml-null-tests.elevation-tests]
   chunk_nobs = 10
   nsplits = 2

Observations are sorted by elevation center and distributed across splits.

Moon Rise/Set Tests
^^^^^^^^^^^^^^^^^^^

Splits observations by whether the Moon is rising or setting during the observation,
to test for Moon-related contamination correlated with lunar phase angle.

.. code-block:: toml

   [campaign.ml-null-tests.moonrise-set-tests]
   chunk_nobs = 10
   nsplits = 2

Moon Close Tests
^^^^^^^^^^^^^^^^

Splits observations by proximity to the Moon to test for near-field Moon sidelobe
contamination.

.. code-block:: toml

   [campaign.ml-null-tests.moon-close-tests]
   chunk_nobs = 10
   nsplits = 2

Sun Close Tests
^^^^^^^^^^^^^^^

Splits observations by proximity to the Sun to test for near-field Sun sidelobe
contamination.

.. code-block:: toml

   [campaign.ml-null-tests.sun-close-tests]
   chunk_nobs = 10
   nsplits = 2

Creating Custom Workflows
--------------------------

To create a new workflow type:

1. **Inherit from the base Workflow class:**

   .. code-block:: python

      from socm.core.models import Workflow

      class MyCustomWorkflow(Workflow):
          # Define additional parameters
          custom_param: str
          threshold: float = 0.5

2. **Implement the required methods** (defined inside ``MyCustomWorkflow``):

   .. code-block:: python

      def get_command(self, **kwargs) -> str:
          """Return the command to execute."""
          return f"{self.executable} {self.subcommand}"

      def get_arguments(self, **kwargs) -> str:
          """Return command arguments."""
          return f"--param {self.custom_param} --threshold {self.threshold}"

3. **Register the workflow:**

   .. code-block:: python

      from socm.workflows import registered_workflows

      registered_workflows['my-custom'] = MyCustomWorkflow
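
Put together, the three steps might look like the following self-contained
sketch. To keep it runnable without ``socm`` installed, it uses stand-ins: the
``Workflow`` dataclass and the ``registered_workflows`` dict below are simplified
substitutes for the real objects, whose fields and constructor may differ.

.. code-block:: python

   from dataclasses import dataclass

   @dataclass
   class Workflow:
       """Stand-in for socm.core.models.Workflow (assumed fields, for illustration)."""
       executable: str
       subcommand: str = ""

   @dataclass
   class MyCustomWorkflow(Workflow):
       custom_param: str = "default"
       threshold: float = 0.5

       def get_command(self, **kwargs) -> str:
           """Return the command to execute."""
           return f"{self.executable} {self.subcommand}".strip()

       def get_arguments(self, **kwargs) -> str:
           """Return command arguments."""
           return f"--param {self.custom_param} --threshold {self.threshold}"

   # Stand-in for socm.workflows.registered_workflows.
   registered_workflows = {}
   registered_workflows["my-custom"] = MyCustomWorkflow

   # Look the class up by name, instantiate it, and build the full command line.
   wf = registered_workflows["my-custom"](
       executable="python", subcommand="analyze.py", custom_param="abc"
   )
   print(wf.get_command(), wf.get_arguments())
   # python analyze.py --param abc --threshold 0.5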

Workflow Dependencies
---------------------

Workflows can depend on outputs from other workflows. The campaign manager handles:

* **Dependency resolution** - Ensures workflows run in the correct order
* **Resource optimization** - Schedules dependent workflows as early as possible using HEFT (Heterogeneous Earliest Finish Time)

For TOML-based campaigns, subcampaigns provide a grouping mechanism. For explicit
stage-by-stage dependency graphs, use the DAG YAML format:

.. code-block:: yaml

   stages:
     preprocess:
       executable: python -u
       script: preprocess.py
       depends: null
       resources:
         memory: 48G
         ranks: 1
         threads: 4
         runtime: 10m

     mapmaking:
       executable: python -u
       script: mapmaking.py
       depends:
         - preprocess
       resources:
         ranks: 14
         threads: 8
         memory: 128G
         runtime: 60m

     spectra:
       executable: python -u
       script: spectra.py
       depends:
         - mapmaking
       resources:
         ranks: 4
         threads: 4
         memory: 32G
         runtime: 20m

See the :doc:`user_guide` DAG section and ``examples/dag.yml`` for a full annotated example.
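
The run order implied by the ``depends`` entries can be derived with a plain
topological sort. The sketch below uses Python's standard ``graphlib`` module on
a dict shaped like the parsed ``stages`` mapping. It illustrates dependency
resolution only; it is not the HEFT scheduler and ignores the ``resources``
fields. ``execution_order`` is a hypothetical helper.

.. code-block:: python

   from graphlib import TopologicalSorter

   def execution_order(stages: dict) -> list[str]:
       """Return stage names in an order that respects every ``depends`` entry."""
       # TopologicalSorter expects {node: iterable of predecessors};
       # ``depends: null`` becomes an empty list.
       graph = {name: (spec.get("depends") or []) for name, spec in stages.items()}
       return list(TopologicalSorter(graph).static_order())

   stages = {
       "spectra": {"depends": ["mapmaking"]},
       "preprocess": {"depends": None},
       "mapmaking": {"depends": ["preprocess"]},
   }
   print(execution_order(stages))  # ['preprocess', 'mapmaking', 'spectra']

A circular dependency makes ``static_order()`` raise ``graphlib.CycleError``,
which is a convenient early check for a malformed graph.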

Best Practices
--------------

Resource Sizing
~~~~~~~~~~~~~~~

* **Memory:** Allocate 20-50% more than the estimated need
* **Runtime:** Request more time than you expect to need, so jobs are not killed at the scheduler's time limit
* **Cores:** Balance between parallelization and memory per core
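
The memory and core guidelines reduce to simple arithmetic, sketched below. Both
helpers are hypothetical, and ``memory_per_core_gb`` assumes the memory figure
is the total shared by all cores of the job.

.. code-block:: python

   import math

   def padded_memory_gb(estimate_gb: float, headroom: float = 0.25) -> int:
       """Add 20-50% headroom to a memory estimate and round up to whole GB."""
       if not 0.2 <= headroom <= 0.5:
           raise ValueError("headroom should be between 0.2 and 0.5")
       return math.ceil(estimate_gb * (1 + headroom))

   def memory_per_core_gb(total_gb: float, cores: int) -> float:
       """Memory available to each core if total_gb is shared evenly."""
       return total_gb / cores

   print(padded_memory_gb(96))          # 120
   print(memory_per_core_gb(128, 16))   # 8.0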

Data Management
~~~~~~~~~~~~~~~

* Use fast local storage for temporary files
* Ensure output directories have sufficient space
* Clean up intermediate files when possible

Configuration
~~~~~~~~~~~~~

* Use descriptive workflow names for tracking
* Document custom parameters in configuration files
* Test workflows on small datasets first

Monitoring
~~~~~~~~~~

* Check log files for workflow progress
* Monitor resource usage to optimize future runs
* Validate outputs before proceeding to dependent workflows
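
A basic output check before launching dependent workflows can be as simple as
confirming that the expected files exist and are non-empty. The sketch below is
a generic illustration: ``missing_outputs`` is a hypothetical helper, and the
file names a workflow actually produces depend on its configuration.

.. code-block:: python

   from pathlib import Path

   def missing_outputs(paths):
       """Return the expected output files that are absent or empty."""
       return [
           str(p)
           for p in map(Path, paths)
           if not p.is_file() or p.stat().st_size == 0
       ]

   # Usage: pass the files the next stage needs and stop if any are reported.
   # problems = missing_outputs(["/path/to/output/map.fits"])
   # if problems:
   #     raise SystemExit(f"missing or empty outputs: {problems}")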

Troubleshooting
---------------

Common Issues
~~~~~~~~~~~~~

**Memory Errors:**
   * Increase memory allocation
   * Reduce data chunk size
   * Use tiled processing for large areas

**Timeout Errors:**
   * Increase runtime estimates
   * Check for hung processes
   * Optimize algorithm parameters

**Dependency Errors:**
   * Verify input file paths
   * Check workflow ordering
   * Ensure dependent outputs exist

**Environment Issues:**
   * Verify environment variables
   * Check module availability
   * Validate file permissions

Performance Tips
~~~~~~~~~~~~~~~~

* Use SSD storage for temporary files
* Tune the ratio of MPI ranks to threads per rank
* Consider memory bandwidth limitations
* Profile workflows to identify bottlenecks