Analysis
1 Overview
The Omega Analysis module provides in-situ computation of desired analysis fields from the ocean model state. Analysis fields are computed on-the-fly during simulation runtime and written to output streams at user-specified intervals, providing an alternative to extensive offline post-processing.
The framework is built on a composable operator architecture where operators
can be chained together to produce analysis outputs. This approach enables
user flexibility, avoids the proliferation of hard-coded analysis routines,
and supports future extensibility without architecture changes. The initial
delivery (v1) provides a set of bundled AnalysisGroup types; full
user-configurable operator composition is planned for subsequent updates.
2 Requirements
2.1 Requirement: Composable operator framework
The Analysis system depends on simple, composable operators where each operator performs a single, well-defined transformation. This enables:
New analysis outputs via configuration rather than new code
Testing of individual operations in isolation
Reuse of common operations (spatial and temporal reductions, binary operations) across analysis computations
2.2 Requirement: Availability of all model variables
All simulation variables produced by the model and available for I/O in Omega must be available to the Analysis module. Variables produced by the Analysis system should also be available for further Analysis computation.
2.3 Requirement: Field access via dependency declaration
Operators must declare their input field dependencies at construction time.
During initialization, the orchestrator resolves dependencies and provides
operators with persistent pointers/references to input fields (from simulation
model fields or upstream operators). Operators retain these references and
access fields directly during compute().
2.4 Requirement: Operator registration and factory
New operators must be registerable via a factory pattern, without changes to the core analysis architecture. Operators self-register during initialization, and the orchestrator queries the factory for operators by name. This facilitates future extensibility; new operators integrate into the analysis framework without modifying orchestration code.
2.5 Requirement: Multi-input and multi-output operators
Operators must be able to accept multiple input fields and produce multiple output fields. Multi-input capability enables operators that combine fields (e.g., binary operations, vector operations requiring multiple components). Multi-output capability allows operators to simultaneously return separable results (e.g., components of a vector field, or the components of a spatial gradient).
2.6 Requirement: Computation caching
When multiple output streams or analysis fields depend on the same intermediate result, that result must be computed once per timestep and cached. Timestamp-based cache validation prevents stale results.
2.7 Requirement: Time operators
Time-based operations (mean, min, max over a period) must be regular operators within the analysis framework, enabling composition with spatial operations. Time period specification should be flexible (not limited to hard-coded groups).
2.8 Requirement: Stream integration
Analysis fields must be integrated into the Omega output stream framework. Configurable output stream parameters (filename, precision, period, etc.) must be provided for fields produced by the analysis system. Fields will be written to output with associated metadata.
2.9 Requirement: Polaris compatibility
Output from the Analysis module must be compatible with Polaris for post-processing.
2.10 Requirement: Requested initial analysis capability
Initial delivery of the Analysis system will supply operators necessary for computing a specified set of Analysis outputs:
Global stats: global reduction to mean, min, max, and standard deviation of configurable fields
AMOC: stream function for Atlantic meridional overturning circulation
Eddy stats
3 Algorithmic Formulation
3.1 Operator Composition and Dependency Resolution
The Analysis system represents computations as a directed acyclic graph (DAG) where nodes are operators and edges represent data dependencies. A single Analysis field computation is defined by a string name that may expand into multiple operators forming a chain.
3.1.1 Operator dependencies
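Each operator \(\mathcal{O}_i\) produces one or more output fields and requires zero or more input fields:
\[\mathcal{O}_i : \{\mathcal{I}_{i,1}, \ldots, \mathcal{I}_{i,n_i}\} \to \text{outputs}(\mathcal{O}_i)\]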
where each input \(\mathcal{I}_{i,j}\) is either:
A simulation field from the model (terminal node, no incoming operator edge)
An output of another operator \(\mathcal{O}_k\) (creating dependency edge \(\mathcal{O}_k \to \mathcal{O}_i\))
Operator chains: A single Analysis field name \(a\) may parse into an ordered sequence of operators:
\[a \mapsto (\mathcal{O}_1, \mathcal{O}_2, \ldots, \mathcal{O}_m)\]
where intermediate operators produce fields consumed by subsequent operators in the chain, and only the terminal operator \(\mathcal{O}_m\) writes to the output stream.
Shared intermediates: When multiple Analysis fields require the same intermediate result, the dependency resolver identifies structurally equivalent operators via signature matching:
Two operators with identical signatures are merged into a single node in the DAG, preventing redundant computation.
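The merging step can be sketched as a lookup in a signature cache. This is only an illustration: the document does not define the contents of a signature, so the sketch assumes it is the tuple (operator type, ordered input names, parameter string). `OpSignature`, `SignatureCache`, and `getOrCreateNode` are hypothetical names, not part of Omega.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <tuple>
#include <vector>

// Hypothetical signature: assumed here to be
// (operator type, ordered input names, parameter string).
struct OpSignature {
   std::string Type;
   std::vector<std::string> Inputs;
   std::string Params;

   // Lexicographic ordering so the signature can be used as a map key.
   bool operator<(const OpSignature &Other) const {
      return std::tie(Type, Inputs, Params) <
             std::tie(Other.Type, Other.Inputs, Other.Params);
   }
};

// Signature cache (Sigma in the algorithm): maps a signature to a node index.
class SignatureCache {
 public:
   // Returns the existing node index for Sig, or allocates a new one.
   int getOrCreateNode(const OpSignature &Sig) {
      auto It = Cache.find(Sig);
      if (It != Cache.end())
         return It->second; // structurally equivalent operator: reuse node
      int NewId = static_cast<int>(Cache.size());
      Cache.emplace(Sig, NewId);
      return NewId;
   }

   std::size_t numNodes() const { return Cache.size(); }

 private:
   std::map<OpSignature, int> Cache;
};
```

Two requests for `SpatialMean` of `Temperature` with identical parameters would resolve to the same node index, while `SpatialMax` of the same field would receive a new one.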
v1 implementation note: The full DAG construction algorithm below is the target design. The v1 implementation uses a simpler approximation: operator chains are parsed left-to-right and nodes are appended in natural dependency order; dependency edges are resolved post-hoc by matching operator input names against other operators’ output names. Signature-based deduplication, cycle detection, and formal topological sort are planned for subsequent updates.
3.1.2 Dependency graph construction
Algorithm: \(\texttt{Analysis::buildDependencyGraph}\)
Input: Set of requested Analysis field names \(\mathcal{A} = \{a_1, a_2, \ldots, a_n\}\) from all output streams
Output: Directed acyclic graph \(\mathcal{G} = (\mathcal{V}, \mathcal{E})\) where \(\mathcal{V}\) are operator nodes and \(\mathcal{E}\) are data dependency edges, with topological ordering \(\pi : \mathcal{V} \to \mathbb{N}\)
Phase 1: Parse and expand operator chains
Initialize: \(\mathcal{V} \leftarrow \emptyset\), \(\mathcal{E} \leftarrow \emptyset\), \(\Sigma \leftarrow \emptyset\) (signature cache)
For each analysis field \(a \in \mathcal{A}\):
Parse string into chain of operators: \(\{\mathcal{O}_1, \ldots, \mathcal{O}_m\} \leftarrow \texttt{parseOperatorChain}(a)\)
For \(i = 1\) to \(m\):
Compute signature: \(s \leftarrow \text{sig}(\mathcal{O}_i)\)
If \(s \in \Sigma\) (operator already exists):
Retrieve existing node: \(v \leftarrow \Sigma[s]\)
If \(i = m\) (final operator): Add \(a\) to \(v\)’s output list
Else (create new node):
Create node: \(v \leftarrow \text{OperatorNode}(\mathcal{O}_i)\)
If \(i = m\) (final operator): add output for node \(v\) to stream for \(a\), set alarm period
Else (intermediate operator): no stream output, computed on-demand when downstream alarm rings
Add to graph: \(\mathcal{V} \leftarrow \mathcal{V} \cup \{v\}\)
Cache signature: \(\Sigma \leftarrow \Sigma \cup \{(s, v)\}\)
Phase 2: Resolve dependencies
For each operator node \(v \in \mathcal{V}\):
Let \(\mathcal{I}(v) = \{\mathcal{I}_1, \ldots, \mathcal{I}_n\}\) be input fields for \(v\)
For each required input \(\mathcal{I}_j \in \mathcal{I}(v)\):
If \(\mathcal{I}_j\) is a simulation field from the model: terminal dependency, no edge needed
Else if \(\exists\ u \in \mathcal{V}\) such that \(\mathcal{I}_j \in \text{outputs}(u)\):
Add dependency edge: \(\mathcal{E} \leftarrow \mathcal{E} \cup \{(u, v)\}\)
Propagate alarms: For each \(\text{Alarm} \in v.\text{ComputeAlarms}\):
If \(\text{Alarm} \notin u.\text{ComputeAlarms}\):
\(u.\text{ComputeAlarms} \leftarrow u.\text{ComputeAlarms} \cup \{\text{Alarm}\}\) (upstream nodes observe all downstream alarms)
Else (field not found): ERROR
Phase 3: Validate acyclicity
Detect cycles using depth-first search with recursion stack:
\[\text{hasCycle}(\mathcal{G}) = \begin{cases} \texttt{true} & \text{if } \exists \text{ path } v_1 \to v_2 \to \cdots \to v_n \to v_1 \\ \texttt{false} & \text{otherwise} \end{cases}\]
If cycle detected: ERROR
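A minimal sketch of depth-first cycle detection with a recursion stack is shown here. It assumes nodes are identified by integer indices and edges are stored as an adjacency list; `Graph`, `VisitState`, and `dfsVisit` are illustrative names rather than Omega types.

```cpp
#include <vector>

// Adjacency list: Adj[u] holds every v with an edge u -> v.
using Graph = std::vector<std::vector<int>>;

enum class VisitState { Unvisited, OnStack, Done };

// Depth-first visit; returns true if a cycle is reachable from U.
static bool dfsVisit(const Graph &Adj, int U, std::vector<VisitState> &State) {
   State[U] = VisitState::OnStack;
   for (int V : Adj[U]) {
      if (State[V] == VisitState::OnStack)
         return true; // V is on the current recursion stack: back edge
      if (State[V] == VisitState::Unvisited && dfsVisit(Adj, V, State))
         return true;
   }
   State[U] = VisitState::Done; // fully explored, no cycle through U
   return false;
}

bool hasCycle(const Graph &Adj) {
   std::vector<VisitState> State(Adj.size(), VisitState::Unvisited);
   for (int U = 0; U < static_cast<int>(Adj.size()); ++U)
      if (State[U] == VisitState::Unvisited && dfsVisit(Adj, U, State))
         return true;
   return false;
}
```

The three-state marking distinguishes nodes on the current path (`OnStack`) from nodes that were fully explored earlier (`Done`), so shared intermediates reached along two different paths are not misreported as cycles.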
Phase 4: Topological sort
Compute topological ordering \(\pi : \mathcal{V} \to \{0, 1, \ldots, |\mathcal{V}|-1\}\) using Kahn’s algorithm:
\(\text{inDegree}(v) \leftarrow |\{u \in \mathcal{V} : (u,v) \in \mathcal{E}\}|\) for all \(v\)
\(Q \leftarrow \{v \in \mathcal{V} : \text{inDegree}(v) = 0\}\); \(\text{order} \leftarrow 0\); \(\text{sorted} \leftarrow [\,]\)
While \(Q \neq \emptyset\):
Remove \(v\) from \(Q\); assign \(\pi(v) \leftarrow \text{order}\); increment order; append \(v\) to sorted list
For each \((v, w) \in \mathcal{E}\): decrement \(\text{inDegree}(w)\); if zero, add \(w\) to \(Q\)
If \(|\text{sorted}| \neq |\mathcal{V}|\): ERROR (cycle)
Return \(\mathcal{G}\) with ordering \(\pi\)
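Phase 4 can be sketched as follows, again assuming integer node indices and an adjacency-list graph (both are simplifications for illustration). The position of a node in the returned vector plays the role of \(\pi(v)\).

```cpp
#include <optional>
#include <queue>
#include <vector>

// Adjacency list: Adj[v] holds every w with an edge v -> w.
using Graph = std::vector<std::vector<int>>;

// Returns nodes in dependency order, or std::nullopt if a cycle exists.
std::optional<std::vector<int>> topologicalSort(const Graph &Adj) {
   const int N = static_cast<int>(Adj.size());

   // Count incoming edges for every node.
   std::vector<int> InDegree(N, 0);
   for (const auto &Outs : Adj)
      for (int W : Outs)
         ++InDegree[W];

   // Nodes with no unresolved dependencies are ready to be ordered.
   std::queue<int> Ready;
   for (int V = 0; V < N; ++V)
      if (InDegree[V] == 0)
         Ready.push(V);

   std::vector<int> Sorted;
   while (!Ready.empty()) {
      int V = Ready.front();
      Ready.pop();
      Sorted.push_back(V); // pi(V) is V's position in Sorted
      for (int W : Adj[V])
         if (--InDegree[W] == 0)
            Ready.push(W);
   }

   if (static_cast<int>(Sorted.size()) != N)
      return std::nullopt; // some node never reached in-degree zero: cycle
   return Sorted;
}
```

Using a FIFO queue keeps the ordering deterministic for a given node numbering, which may help when comparing execution order across runs.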
3.2 Operator Factory and Registration
The operator factory provides a runtime registry that maps operator type names to constructor functions. This enables:
Decentralized registration: Operators register themselves via a template helper before main() executes
Dynamic instantiation: The orchestrator creates operators by name without hard-coded switch statements
Type-safe dispatch: The factory selects the correct templated specialization based on the input field’s runtime metadata (scalar type, rank, memory location)
Extensibility: New operators can be added without modifying orchestration code
3.2.1 Templated operator specializations
Analysis operators are class templates parameterized on the concrete Kokkos
array type ArrayT of their primary input field:
template<typename ArrayT>
class SpatialMaxOp : public AnalysisOperator { ... };
The factory registers all combinations of scalar type (I4/I8/R4/R8), rank (1–5), and memory location (Device/Host/Both) for each operator template:
AnalysisOpFactory::registerAllArrayVariants<SpatialMaxOp>("SpatialMax");
At operator creation time, the factory inspects the primary upstream Field’s metadata to select the matching specialization.
3.2.2 Registration
Algorithm: AnalysisOpFactory::registerAllArrayVariants<OpT>(BaseName)
Expand the OMEGA_ANALYSIS_ARRAY_TYPES macro over all (DType, Rank, MemLoc, ArrayT) combinations
For each combination, call registerOperator with key BaseName + "_" + ArrayT + "_" + memloc and a lambda that constructs OpT<ArrayT>(UpstreamNames, Options)
Validate key is unique (abort on duplicate)
All base analysis operators are registered at program startup by
registerAllBaseAnalysisOperators(), called from Analysis::init().
3.2.3 Factory operator creation
Algorithm: AnalysisOpFactory::createOp
Input: Operator type name, upstream field names, configuration options
Output: unique_ptr<AnalysisOperator> with the correct typed specialization
Retrieve the primary upstream Field from the Field registry
Extract ArrayDataType, rank, and ArrayMemLoc from Field metadata
Build fully-qualified type key: OpType + "_" + ArrayTypeName + "_" + MemLoc
Look up constructor in registry; abort if not found
Invoke constructor with (UpstreamNames, Options) and return result
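The registration and lookup steps above can be illustrated with a reduced factory. This is a sketch under several assumptions: `OperatorBase`, `MaxOp`, and `OpFactory` are stand-ins for the real classes, the array type and memory location are passed in as strings instead of being read from Field metadata, the `Config` argument is omitted, and errors throw exceptions where the design calls for an abort.

```cpp
#include <functional>
#include <map>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Stand-in for the AnalysisOperator base class.
struct OperatorBase {
   virtual ~OperatorBase() = default;
   virtual std::string describe() const = 0;
};

// Hypothetical operator template; ScalarT stands in for the Kokkos ArrayT.
template <typename ScalarT> struct MaxOp : OperatorBase {
   explicit MaxOp(std::vector<std::string> InUpstreams)
       : Upstreams(std::move(InUpstreams)) {}
   std::string describe() const override { return Upstreams.at(0) + "_Max"; }
   std::vector<std::string> Upstreams;
};

class OpFactory {
 public:
   using CreatorFunc = std::function<std::unique_ptr<OperatorBase>(
       const std::vector<std::string> &)>;

   // Register one variant; a duplicate key is treated as an error.
   static void registerOperator(const std::string &Key, CreatorFunc Creator) {
      if (!registry().emplace(Key, std::move(Creator)).second)
         throw std::runtime_error("duplicate operator key: " + Key);
   }

   // Build the fully-qualified key and invoke the matching constructor.
   static std::unique_ptr<OperatorBase>
   createOp(const std::string &OpType, const std::string &ArrayTypeName,
            const std::string &MemLoc,
            const std::vector<std::string> &Upstreams) {
      const std::string Key = OpType + "_" + ArrayTypeName + "_" + MemLoc;
      auto It = registry().find(Key);
      if (It == registry().end())
         throw std::runtime_error("unregistered operator: " + Key);
      return It->second(Upstreams);
   }

 private:
   // Function-local static: constructed on first use.
   static std::map<std::string, CreatorFunc> &registry() {
      static std::map<std::string, CreatorFunc> Registry;
      return Registry;
   }
};
```

Keeping the registry behind a function-local static avoids static-initialization-order problems if registration is ever triggered from another translation unit's static initializers.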
3.3 Runtime Dispatch
The main Analysis computational loop is executed every timestep. Dependencies are traversed recursively so that upstream operators are always fresh when a downstream operator needs them. Caching prevents redundant work when multiple downstream operators share an upstream.
v1 implementation note: The loop over SortedOperators below assumes a topological ordering computed by buildDependencyGraph. In v1, nodes are iterated in insertion order (which is naturally dependency-correct for linearly chained operators). The full topological sort and computeRecursive are the target design.
Algorithm: Analysis::computeAll
Input: Topologically sorted operator list, current timestamp
Output: Updated Analysis fields written to registered output streams
For each \(\texttt{Op} \in \texttt{SortedOperators}\):
If any alarm in \(\texttt{Op.ComputeAlarms}\) is ringing:
\(\texttt{computeRecursive(Op, TimeStamp)}\)
\(\texttt{computeRecursive(Op, TimeStamp)}\):
If \(\texttt{Op.FieldComputed}\) AND \(\texttt{Op.LastComputed == TimeStamp}\): return (cache hit)
For each \(\texttt{UpstreamOp} \in \texttt{Op.Upstreams}\):
\(\texttt{computeRecursive(UpstreamOp, TimeStamp)}\)
\(\texttt{Op.compute(TimeStamp)}\)
\(\texttt{Op.LastComputed} \leftarrow \texttt{TimeStamp}\); \(\texttt{Op.FieldComputed} \leftarrow \texttt{true}\)
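The recursive traversal with timestamp-based caching can be sketched as below. The `Node` struct is a reduced stand-in for an operator node, the timestamp is a plain integer step index instead of a `TimeInstant`, and `ComputeCount` exists only to make the caching behaviour observable.

```cpp
#include <vector>

// Minimal stand-in for an operator node.
struct Node {
   std::vector<Node *> Upstreams; // non-owning upstream dependencies
   long LastComputed = -1;        // step index of the last compute() call
   bool FieldComputed = false;
   int ComputeCount = 0; // counts real compute() calls, for illustration

   void compute(long /*TimeStamp*/) { ++ComputeCount; }
};

void computeRecursive(Node &Op, long TimeStamp) {
   // Cache hit: this node was already computed for the current timestamp.
   if (Op.FieldComputed && Op.LastComputed == TimeStamp)
      return;
   // Make sure every upstream result is fresh before computing this node.
   for (Node *Up : Op.Upstreams)
      computeRecursive(*Up, TimeStamp);
   Op.compute(TimeStamp);
   Op.LastComputed = TimeStamp;
   Op.FieldComputed = true;
}
```

With a shared upstream (for example a spatial mean feeding both a time mean and a standard deviation), triggering both downstream nodes in the same step calls the upstream `compute()` only once; the cache is invalidated implicitly when the timestamp advances.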
3.4 Alarm Model
Each OperatorNode holds a vector of non-owning alarm pointers
(vector<Alarm*> ComputeAlarms). An operator is triggered when any of its
alarms rings.
Discrete-sampling (non-temporal-reduction) terminal operators: borrow a raw pointer to the write alarm of the associated output stream. The stream owns this alarm.
Temporal reduction terminal operators: require two alarms. An accumulation alarm controls how frequently a sample is added to the running sum; its interval is a user-configurable AccumulationInterval parameter (defaulting to every timestep in v1). An output alarm (borrowed from the associated stream, as for discrete-sampling operators) controls when the accumulated sum is divided by the sample count and written to output. Accumulation alarms are owned by the Analysis object as vector<unique_ptr<Alarm>> AccumulationAlarms. Each temporal reduction operator’s ComputeAlarms vector contains two pointers: a raw pointer to its accumulation alarm and a raw pointer to its output alarm.
Intermediate (non-terminal) operators: receive alarm pointers propagated from their downstream operators. Propagation is performed by Analysis::propagateAlarmsUpstream(), which iterates until no further changes occur.
This design ensures alarms have a clear single owner (a stream or the Analysis class) while allowing any number of operator nodes to observe them.
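The ownership and propagation rules can be sketched with reduced stand-in types. `Alarm` here is a single flag and `Node` holds only what the propagation needs; the free functions mirror the roles of `propagateAlarmsUpstream()` and the "any alarm rings" trigger but are not the Omega implementations. The sketch assumes the graph is acyclic.

```cpp
#include <algorithm>
#include <memory>
#include <vector>

struct Alarm {
   bool Ringing = false; // stand-in for the Omega Alarm state
};

struct Node {
   std::vector<Node *> Upstreams;      // non-owning upstream dependencies
   std::vector<Alarm *> ComputeAlarms; // non-owning alarm observers
};

// Copy each node's alarms to its upstream nodes until nothing changes.
void propagateAlarmsUpstream(std::vector<std::unique_ptr<Node>> &Nodes) {
   bool Changed = true;
   while (Changed) {
      Changed = false;
      for (auto &N : Nodes)
         for (Node *Up : N->Upstreams)
            for (Alarm *A : N->ComputeAlarms) {
               auto &UpAlarms = Up->ComputeAlarms;
               if (std::find(UpAlarms.begin(), UpAlarms.end(), A) ==
                   UpAlarms.end()) {
                  UpAlarms.push_back(A); // upstream now observes this alarm
                  Changed = true;
               }
            }
   }
}

// A node is triggered when any observed alarm is ringing.
bool isTriggered(const Node &N) {
   return std::any_of(N.ComputeAlarms.begin(), N.ComputeAlarms.end(),
                      [](const Alarm *A) { return A->Ringing; });
}
```

Because nodes only hold raw pointers, the owner of each alarm (a stream or the Analysis object) must outlive every node that observes it.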
v1 constraint: Each temporal reduction period must divide the restart interval evenly, so that the restart interval is a whole number of reduction periods. This is validated during createAnalysisGroupStreams() to ensure proper checkpoint/restart behavior.
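As a worked illustration of the constraint, the check reduces to a remainder test when both intervals are expressed in a fixed unit. The sketch assumes intervals given in seconds, which does not cover calendar-based periods such as months; `periodDividesRestart` is a hypothetical helper, not the Omega `TimeInterval` API.

```cpp
#include <cstdint>
#include <stdexcept>

// Returns true if RestartSeconds is a whole multiple of PeriodSeconds.
bool periodDividesRestart(std::int64_t PeriodSeconds,
                          std::int64_t RestartSeconds) {
   if (PeriodSeconds <= 0 || RestartSeconds <= 0)
      throw std::invalid_argument("intervals must be positive");
   return RestartSeconds % PeriodSeconds == 0;
}
```

For example, a 1-day period (86400 s) is accepted with a 10-day restart interval, while a 7-hour period is rejected with a 1-day restart interval because 86400 is not a multiple of 25200.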
3.5 AnalysisGroup Configuration
Each child node of the Analysis: group in the configuration YAML file
represents an AnalysisGroup. The orchestrator iterates over these nodes during
initialization and dispatches to the appropriate handler:
Named pre-defined groups (e.g. GlobalStats): Dispatched by name to a derived AnalysisGroup subclass. The subclass reads its own config parameters and constructs the appropriate operator chains and output streams internally.
Custom user-defined groups (future): Config nodes not matching a pre-defined name will be parsed as user-defined groups of composable operator chains using the full chain-parsing and DAG machinery described in Section 3.1.
Example config structure:
Omega:
  Analysis:
    GlobalStats: # pre-defined bundled group
      Enable: true
      Fields: ["NormalVelocity", "Temperature", "Salinity"]
      SpatialStats: ["Max", "Min", "Mean", "StdDev"]
      ReductionPeriod: ["1Day", "1Month"]
      SampleFreq: ["1Hour"]
      Filename: global.stats.$Y
      Stream: # define optional stream parameters
        FileFreq: 1
        FileFreqUnits: years
    MyCustomGroup: # future: user-defined composable group
      Enable: false
      OperatorChains: # final DSL syntax to be determined
        - "FieldA_Op1_Op2(FieldB)"
        - "Op3(FieldC,FieldD)_Op4"
      Filename: custom.analysis.$Y.$M
4 Design
4.1 Data types and parameters
4.1.1 Configuration
The Analysis config node is a map of group names to group-specific config
sub-nodes. Each group sub-node must contain at minimum an Enable boolean
key. Additional keys are group-specific. The AnalysisGroup base class
provides a StreamParams helper for translating group config options into
IOStream::create arguments. Each group may generate multiple output
streams depending on its configuration; for example, a GlobalStats group
with multiple reduction periods (e.g., ["1Day", "1Month"]) will create
separate streams for each period, grouping operator chains by their output
frequency and whether they perform temporal reduction (e.g., TimeMeanOp) or
discrete sampling (i.e., instantaneous snapshots).
4.1.2 Classes
AnalysisOperator
The AnalysisOperator class is the abstract base class from which all
concrete operators are derived. The base class itself is not templated;
derived classes are class templates parameterized on the Kokkos array type
ArrayT. Output field data arrays are members of the derived class and are
allocated in its constructor, which also creates the corresponding Field
registry entry. The initialize() method is called after all fields exist,
primarily to store the mesh/env pointers needed by compute().
// Temporal operators have an accumulation phase and an operation/output phase
class AnalysisOperator {
 public:
   AnalysisOperator();
   /// Virtual destructor: operators are owned and destroyed through
   /// base-class pointers (std::unique_ptr<AnalysisOperator>)
   virtual ~AnalysisOperator();
   /// Return name for this operator type
   std::string getOperatorType() const;
   /// Return unique name for this operator instance.
   /// Derived from the concatenated upstream field names and operator type,
   /// e.g. "Temperature_SpatialMean_TimeMean1Day"
   std::string getName() const;
   /// Return names of fields required by this operator
   std::vector<std::string> getInputFieldNames() const;
   /// Return names of output fields produced by this operator
   std::vector<std::string> getOutputFieldNames() const;
   /// Returns true if the output field has already been computed for TimeStamp
   bool isCacheValid(const TimeInstant &TimeStamp) const;
   /// Initialize operator: store mesh/env pointers needed by compute().
   virtual void initialize(const MachEnv *InEnv,
                           const HorzMesh *Mesh,
                           const VertCoord *VCoord,
                           Config Options);
   /// Set period alarm for temporal reduction operators
   /// Default implementation does nothing (non-temporal operators ignore this)
   virtual void setPeriodAlarm(Alarm *PeriodAlarm);
   /// Perform computation of Analysis fields. Retrieves input data from the
   /// Field registry using input field names. Writes to operator-owned output
   /// arrays which are attached to the Field registry.
   virtual void compute(const TimeInstant &TimeStamp) = 0;
 protected:
   std::string OperatorTypeName;
   std::string InstanceName;
   std::vector<std::string> InputNames;
   std::vector<std::string> OutputNames;
   TimeInstant LastComputed;
   bool FieldComputed;
};
Helper utilities for building operator Config objects inline:
// Create a Config from key-value pairs
// Usage: makeOpConfig(opParam("Period", "1day"), opParam("Layer", 10))
template<typename T>
OpParam<T> opParam(std::string Key, T&& Value);
template<typename T, typename... Args>
Config makeOpConfig(const std::pair<std::string, T>& Param, Args... OtherArgs);
These helpers enable in-code construction of Config objects for passing
parameters to operator constructors, using the same YAML-based Config
interface that reads from configuration files. This provides a uniform
parameter-passing mechanism: operators receive a Config object whether
instantiated from user config or programmatically by a bundled
AnalysisGroup. The pattern avoids constructor signature proliferation as
operators gain parameters, maintains type safety via Config::get<T>(), and
allows operator-specific validation and defaults to be centralized in the
constructor.
Example derived operator — SpatialMaxOp
template<typename ArrayT>
class SpatialMaxOp : public AnalysisOperator {
public:
using ScalarT = typename ArrayT::non_const_value_type;
/// Constructor: sets InputNames, creates output Field and data array.
/// InstanceName = UpstreamNames[0] + "_SpatialMax"
SpatialMaxOp(const std::vector<std::string> &UpstreamNames,
Config Options);
/// Retrieves typed input array from the Field registry and calls
/// globalMaxVal() to compute the MPI-global maximum.
void compute(const TimeInstant &TimeStamp) override;
private:
const HorzMesh *Mesh;
const VertCoord *VCoord;
MPI_Comm Comm;
/// Output data — one scalar value stored as a 1D Array of length 1
typename Array1D<ScalarT>::type OutputData;
ScalarT SpatialMax;
};
AnalysisOpFactory
Factory class for creating AnalysisOperator instances. The class exposes
only static methods; internally it keeps the registry map in a Meyers
singleton (a function-local static that is constructed on first use). The
factory dispatches to the correct templated specialization at runtime by
inspecting the primary upstream Field’s metadata.
class AnalysisOpFactory {
public:
using CreatorFunc = std::function<std::unique_ptr<AnalysisOperator>(
const std::vector<std::string> &UpstreamNames, Config Options)>;
/// Register a single operator variant by string label
static void registerOperator(const std::string &Label,
CreatorFunc Creator);
/// Create an operator instance. Inspects Field metadata of UpstreamNames[0]
/// to select the correct templated specialization.
static std::unique_ptr<AnalysisOperator> createOp(
const std::string &OpType,
const std::vector<std::string> &UpstreamNames,
Config Options
);
/// Register all scalar type × rank × memory location variants of a
/// templated operator class.
/// Usage: registerAllArrayVariants<SpatialMaxOp>("SpatialMax");
template<template<typename> class OperatorTemplate>
static void registerAllArrayVariants(const std::string &BaseName);
/// Check if operator type is registered
static bool hasOperator(const std::string &Type);
private:
static std::map<std::string, CreatorFunc>& registry(); // Meyers singleton
static std::string getArrayTypeName(ArrayDataType DType,
I4 Rank,
ArrayMemLoc MemLoc);
};
All base analysis operators are registered by calling:
void Analysis::registerAllBaseAnalysisOperators();
from Analysis::init() before any operators are instantiated.
OperatorNode
Internal representation of a node in the Analysis operator graph.
struct OperatorNode {
std::unique_ptr<AnalysisOperator> Op; ///< Operator instance (owned)
std::vector<OperatorNode*> Upstreams; ///< Upstream dependencies (non-owning)
std::vector<std::string> StreamNames; ///< Associated output stream names
std::vector<Alarm*> ComputeAlarms; ///< Alarms triggering compute (non-owning)
};
Operators with a non-empty StreamNames vector are terminal nodes whose
output is written to one or more output streams. Operators with an empty
StreamNames vector are intermediate nodes computed on demand when a
downstream alarm rings.
AnalysisGroup
AnalysisGroup is the abstract base class for bundled analysis groups. In
v1, concrete derived classes (e.g. GlobalStats) encapsulate the config
parsing, operator construction, and stream creation for a named analysis
group. In the future, the same base class will support user-defined custom
groups specified entirely in config, where the user supplies composable
operator chains within the group’s config node.
The base class provides a StreamParams helper for translating group config
into IOStream::create arguments, and createAnalysisGroupStreams() which
groups operator chains by their output period and type, validates temporal
reduction periods against the restart interval, and creates the associated
IOStream objects.
class AnalysisGroup {
public:
virtual ~AnalysisGroup() = default;
std::string getName();
/// Groups operator chains by stream characteristics, creates IOStream
/// objects, associates operator output fields with streams, and stores
/// AnalysisStream metadata on the Analysis orchestrator.
void createAnalysisGroupStreams(
const std::string &GroupName,
Config &AnalysisGroupOptions,
Analysis *AnalysisMgr
);
protected:
/// Metadata about a single operator chain within this group
struct OpChainInfo {
std::string ChainStr; ///< Operator instance name (output field name)
std::string FreqStr; ///< Period/frequency string, e.g. "1day", "6hour"
bool IsTimeReduction; ///< true = temporal reduction; false = discrete sample
};
/// Template for constructing an IOStream config for this group's output
struct StreamParams {
StreamParams(); // default values for all IOStream options
void apply(const std::map<std::string, std::string> &Overrides);
Config toConfig() const;
std::map<std::string, std::string> Params;
};
std::string GroupName;
std::vector<OpChainInfo> OpChainInfos; ///< All operator chains in this group
};
GlobalStats (derived AnalysisGroup)
GlobalStats is the first concrete AnalysisGroup subclass. It reads
Fields, SpatialStats, ReductionPeriod, and SampleFreq from the group
config and constructs a matrix of spatial-reduction operator chains, each
optionally followed by a temporal reduction operator. The ReductionPeriod
parameter specifies temporal reduction intervals (e.g., “1Day”, “1Month”)
for outputs computed by temporal reduction operators such as TimeMeanOp,
while the SampleFreq parameter specifies discrete sampling intervals for
instantaneous snapshots of the analysis fields.
class GlobalStats : public AnalysisGroup {
public:
GlobalStats(const std::string &GroupName,
Config &AnalysisGroupOptions,
Analysis *AnalysisMgr);
~GlobalStats() = default;
};
For each (field, stat, period) combination, the constructor builds a chain
string of the form FieldName_SpatialStat_TimeMeanPeriod and calls
AnalysisMgr->parseChainAndBuildOps(). For each
(field, stat, samplefreq) combination, it builds FieldName_SpatialStat chains.
After all chains are registered, it calls createAnalysisGroupStreams().
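The chain-string construction can be sketched as a pair of nested loops. This is an illustration only: `buildChainStrings` is a hypothetical helper, and the `Spatial` and `TimeMean` name prefixes are inferred from the naming examples in this document. Because sampled chains for different sample frequencies share the same name, the sketch emits each one once.

```cpp
#include <string>
#include <vector>

// Builds chain strings for every (field, stat) combination, followed by
// one temporal-reduction chain per (field, stat, period) combination.
std::vector<std::string>
buildChainStrings(const std::vector<std::string> &Fields,
                  const std::vector<std::string> &Stats,
                  const std::vector<std::string> &ReductionPeriods,
                  bool IncludeSamples) {
   std::vector<std::string> Chains;
   for (const auto &Field : Fields)
      for (const auto &Stat : Stats) {
         const std::string Base = Field + "_Spatial" + Stat;
         if (IncludeSamples)
            Chains.push_back(Base); // discrete-sampling chain
         for (const auto &Period : ReductionPeriods)
            Chains.push_back(Base + "_TimeMean" + Period);
      }
   return Chains;
}
```

With the example config (three fields, four statistics, two reduction periods, sampling enabled) this would yield 3 × 4 × (1 + 2) = 36 chain strings, each of which would then be handed to the chain parser.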
Analysis
Analysis is the top-level orchestrator class. It owns the OperatorNode
list and the accumulation alarms for temporal reduction operators. It is
responsible for reading the config, constructing AnalysisGroup instances,
resolving operator dependencies, and scheduling compute calls via the
alarm system.
class Analysis {
public:
/// Initialize the Analysis module: register all base operators,
/// retrieve mesh/vertcoord/clock, create the Default Analysis instance.
/// Must be called after HorzMesh, VertCoord, and TimeStepper are initialized.
static void init();
/// Create a named Analysis instance
static Analysis *create(const std::string &Name,
const MachEnv *Env,
const HorzMesh *Mesh,
const VertCoord *VCoord,
Clock *ModelClock,
Config *Options);
/// Called each timestep to trigger all operators whose alarms are ringing
void computeAll();
/// Parse an underscore-delimited operator chain string and register all
/// operators in the chain that do not yet exist as Fields
void parseChainAndBuildOps(const std::string &OpChainStr);
/// Instantiate a single operator and append it as an OperatorNode
void registerAnalysisOp(const std::string &OpName,
const std::vector<std::string> &UpstreamNames,
Config Options);
/// Get a pointer to the model clock (used by AnalysisGroup for stream creation)
Clock *&getModelClock();
/// Check whether a node with FullOpName is already registered
bool OpNodeExists(const std::string &FullOpName);
static Analysis *getDefault();
static void finalize();
~Analysis();
private:
/// Accumulation alarms owned by Analysis for temporal reduction operators
std::vector<std::unique_ptr<Alarm>> AccumulationAlarms;
static Analysis *DefAnalysis;
static std::map<std::string, std::unique_ptr<Analysis>> AllAnalysisObjects;
Analysis(const std::string &Name,
const MachEnv *Env,
const HorzMesh *Mesh,
const VertCoord *VCoord,
Clock *ModelClock,
Config *Options);
std::string Name;
Clock *ModelClock;
const HorzMesh *Mesh;
const VertCoord *VCoord;
/// All registered operator nodes
std::vector<std::unique_ptr<OperatorNode>> OpNodes;
// Private Methods
/// Register all built-in operator types with the AnalysisOpFactory
static void registerAllBaseAnalysisOperators();
/// Post-hoc dependency resolution: match input field names against
/// other nodes' output field names to populate Upstreams vectors.
void buildOperatorDependencies();
/// Set ComputeAlarms on terminal nodes and propagate alarms upstream.
void setComputeAlarms();
/// Iteratively propagate downstream alarms to upstream nodes
void propagateAlarmsUpstream();
Analysis(const Analysis &) = delete;
Analysis(Analysis &&) = delete;
};
4.2 Operator chain string convention
Operator instance names (and the names of the Fields they produce) follow the convention that each component is separated by an underscore character:
FieldName_Op1[Params]_Op2[Params]...
Examples:
Temperature_SpatialMax: spatial maximum of Temperature
NormalVelocity_SpatialMean_TimeMean1day: 1-day time average of the spatial mean of NormalVelocity
PseudoThickness_SpatialStdDev: spatial standard deviation of PseudoThickness (implicitly requires PseudoThickness_SpatialMean as a shared intermediate)
The parseChainAndBuildOps() method splits on _, reconstructs the running
prefix at each node, and creates an operator only if the corresponding output
Field does not already exist — enabling natural sharing of intermediate
results without an explicit signature cache.
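The prefix-reconstruction idea can be sketched as follows. The function is a simplified stand-in for `parseChainAndBuildOps()`: a `std::set` of names plays the role of the Field registry, the function only reports which operators would be created, and it assumes that neither field names nor operator tokens contain underscores.

```cpp
#include <set>
#include <sstream>
#include <string>
#include <vector>

// Splits ChainStr on '_' and returns the running prefixes for which a new
// operator would be created. ExistingFields stands in for the Field
// registry and is updated as operators are "created".
std::vector<std::string> parseChain(const std::string &ChainStr,
                                    std::set<std::string> &ExistingFields) {
   std::vector<std::string> Created;
   std::istringstream Stream(ChainStr);
   std::string Token, Prefix;
   bool First = true;
   while (std::getline(Stream, Token, '_')) {
      if (First) {
         Prefix = Token; // leading token is the source field name
         First = false;
         continue;
      }
      Prefix += "_" + Token; // running prefix names this node's output
      if (ExistingFields.insert(Prefix).second)
         Created.push_back(Prefix); // field did not exist: new operator
   }
   return Created;
}
```

Parsing `Temperature_SpatialMean_TimeMean1day` first and `Temperature_SpatialMean` afterwards would create two operators on the first call and none on the second, since the intermediate field already exists.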
Note on operator chain syntax: The exact form of operator chain strings shown in examples throughout this document represents a preliminary syntax for the v1 implementation. The final syntax for fully composable user-defined operator chains will be refined in future versions. The current v1 implementation focuses on pre-defined bundled groups (e.g., GlobalStats) with group-specific configuration parameters.
5 Verification and Testing
5.1 Test: Individual operator correctness
For each operator type (SpatialMax, SpatialMin, SpatialMean, SpatialStdDev,
TimeMean in the first batch), construct a small test mesh with analytic field
values. Call compute() directly and verify output against a known-answer
solution. For TimeMean specifically, verify accumulation over multiple
timesteps, verify correct mean calculation at period end, and test with
different AccumulationInterval settings. This unit test validates each
operator in isolation before integration testing.
5.2 Test: Dependency resolution and execution order
Create configurations with shared intermediate operators (e.g.,
Field_SpatialMean_TimeMean1day and Field_SpatialStdDev both requiring
Field_SpatialMean). Verify that buildOperatorDependencies() correctly
populates the Upstreams vectors, that intermediate results are computed
exactly once per timestep (cache validation), and that upstream operators
complete before downstream operators execute (correct execution order). This
test verifies DAG construction and cache-based deduplication.
5.3 Test: Alarm system
Create operators with multiple downstream consumers at different frequencies.
Verify that propagateAlarmsUpstream() correctly propagates alarms from
terminal nodes to all upstream dependencies. Verify that setPeriodAlarm()
correctly injects period alarms for temporal reduction operators. Verify that
TimeMeanOp correctly accumulates samples during the accumulation phase and
finalizes when the period alarm rings. Verify that operators with multiple
alarms in ComputeAlarms trigger when ANY alarm rings. Verify that
intermediate (non-terminal) operators with empty StreamNames are computed
on-demand when downstream alarms ring and do not create output files. This
test verifies the alarm-driven scheduling mechanism.
5.4 Test: Factory registration and type dispatch
Verify that all base analysis operators register correctly via
registerAllBaseAnalysisOperators(). Verify that the factory can instantiate
operators for all supported array types (I4/I8/R4/R8, ranks 1-5,
Device/Host/Both). Verify that AnalysisOpFactory::createOp() correctly
inspects upstream Field metadata (scalar type, rank, memory location) and
selects the matching template specialization. Verify that appropriate errors
are produced when requesting unregistered operator types or array type
combinations. This test verifies the extensibility mechanism and type-safe
dispatch.
5.5 Test: Configuration parsing and validation
Verify that parseChainAndBuildOps() correctly handles valid operator chain
strings and reuses existing intermediate Fields rather than creating
duplicates. Verify that parseChainAndBuildOps() produces informative error
messages for unrecognized operator names or missing input fields. Verify that
makeOpConfig() and opParam() helper functions correctly construct Config
objects for inline parameter passing. Verify that operator constructors
correctly extract and validate parameters from Config objects, with appropriate
error handling for missing required parameters or invalid types. Verify that
createAnalysisGroupStreams() correctly groups operator chains by period and
type, validates temporal reduction periods against the restart interval via
TimeInterval::isDivisibleBy(), and creates the expected set of IOStream
objects. Verify that StreamParams::apply() correctly overrides default stream
parameters with group-specific configuration. This test verifies the user
interface and configuration system.
5.6 Test: End-to-end integration
Complete system test exercising all components from configuration parsing through NetCDF output for global statistics. Advance the clock through one or more output periods, and verify that output files contain the expected fields with correct values. This test validates the complete workflow with real mesh and I/O.
5.7 Test: Advanced DAG features (future)
Once the full DAG construction algorithm is implemented, create configurations with circular dependencies and verify that cycle detection produces appropriate errors. Test signature-based deduplication to ensure structurally equivalent operators are merged into single nodes. Verify formal topological sort produces correct execution ordering for complex DAGs. This test validates future enhancements to dependency resolution.