# E3SM Bundled Ensemble Simulation Runner

---

## Overview

This workflow provides `run_ensemble.sh`, a script designed to run large E3SM ensembles efficiently on Slurm-based HPC systems.

The key feature is a bundled execution strategy: a single large Slurm allocation runs many ensemble members concurrently as background processes. This avoids per-member queue overhead and significantly improves throughput.
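The pattern can be sketched in plain bash (`run_member` is a placeholder for the real per-member launch, not a function from the script):

```bash
# Minimal sketch of bundled execution: each ensemble member runs as a
# background process inside one Slurm allocation, and the job exits only
# after all of them finish. run_member stands in for
# `cd $CASEDIR/$1 && ./case.submit --no-batch`.
run_member() { echo "member $1 finished"; }

for m in EN00 EN01 EN02 EN03; do
  run_member "$m" &   # launch in the background; no new Slurm submission
done
wait                  # hold the allocation until every member completes
```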

The workflow consists of three stages:

1. Compilation
2. Case setup
3. Execution

---

## Key Features

### Bundled execution

Runs multiple simulations inside one Slurm job using:

```bash
./case.submit --no-batch
```

### Compile-once strategy

Only EN00 builds `e3sm.exe`; all other members reuse it.
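One common CIME pattern for this kind of reuse looks like the sketch below; the `EXE_BLD` path and the `reuse_build` helper are illustrative, not names taken from the script:

```bash
# Point a member's case at the executable already built by EN00 instead
# of rebuilding. EXE_BLD is a hypothetical path to EN00's bld directory.
EXE_BLD=/path/to/EN00/bld

reuse_build() {
  cd "$1" || return 1
  ./xmlchange EXEROOT="$EXE_BLD"    # reuse EN00's build directory
  ./xmlchange BUILD_COMPLETE=TRUE   # tell CIME the build is already done
}
```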

### Parallel setup

Ensemble members are configured concurrently with a job limit.

### Flexible workflow control

Controlled by flags:

* `do_e3sm_compile`
* `do_ensemble_setup`
* `do_continue_run`
* `do_ensemble_run`

### Restart handling

Automatically copies and links restart files for continuation runs.
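A minimal sketch of the staging step, assuming the usual E3SM layout of netCDF restart files plus the `rpointer.*` files that name them (`stage_restarts` and the paths are illustrative):

```bash
# Copy a reference case's restart files and rpointer files into a
# member's run directory ahead of a continuation run.
stage_restarts() {
  local refdir=$1 rundir=$2
  mkdir -p "$rundir"
  cp "$refdir"/*.r*.nc   "$rundir"/ 2>/dev/null   # component restart files
  cp "$refdir"/rpointer.* "$rundir"/              # pointers to the restarts
}
```

The `rpointer.*` files tell each model component which restart file to read at startup.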

### Hook support

* `eam_boundle_extra.sh` (runtime monitoring)
* `eam_boundle_cycling.sh` (post-processing)

---

## Prerequisites

* Slurm-based HPC system
* Working E3SM environment (`xmlchange`, `case.setup`, etc.)
* Required configuration file:
  `create_and_setup_case.sh`
* Template script:
  `compile_and_setup_e3sm_<resolution>.<compset>.sh`

---

## Configuration

All configuration is defined in:

```bash
create_and_setup_case.sh
```

### Main control flags

* `do_e3sm_compile="true"` / `"false"`
* `do_ensemble_setup="true"` / `"false"`
* `do_continue_run="true"` / `"false"`
* `do_ensemble_run="true"` / `"false"`

### Key variables

#### Ensemble

* `my_ensnum` (number of members)
* `my_leadymd` (initialization dates)

#### Paths

* `my_workdir`
* `my_runpath`

#### E3SM

* `my_compset`
* `my_resolution`

#### Run length

* `my_runopt`
* `my_runn`

#### Restart

* `my_refdir`
* `my_refcase1`
* `my_refcase2`

#### HPC

* `my_project`
* `my_jobqueue`
* `my_walltime`
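A hypothetical filled-in example; every value below is illustrative, not a default shipped with the script:

```bash
# Example values for create_and_setup_case.sh (all illustrative)
my_ensnum=10                          # ensemble members EN00..EN09
my_leadymd="20140401 20141001"        # initialization dates (YYYYMMDD)
my_workdir=/path/to/case/dirs         # where case directories are created
my_runpath=/path/to/scratch/runs      # run/archive root
my_compset=F2010                      # E3SM compset (illustrative)
my_resolution=ne30pg2_EC30to60E2r2    # grid (illustrative)
my_runopt=nmonths                     # run-length unit
my_runn=12                            # run length in my_runopt units
my_refdir=/path/to/restarts           # restart source for continuation runs
my_project=MYPROJECT                  # Slurm account
my_jobqueue=regular                   # Slurm partition/queue
my_walltime=12:00:00                  # allocation wall time
```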

---

## Workflow Description

## Stage 1: Compilation

### EN00

Create + build case → produces `e3sm.exe`

### Other members

Setup only (no build)

All members reuse the same executable.

---

## Stage 2: Case Setup

For each ensemble member:

* create directories (case / run / archive)
* copy restart files
* configure XML settings
* modify namelists (`user_nl_*`)
* run `case.setup`

Supports:

* startup runs (`do_ensemble_setup`)
* continuation runs (`do_continue_run`)

Setup is parallelized with a concurrency limit.
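The throttling pattern can be sketched as follows; `setup_member` is a stand-in for the real per-member work (directory creation, restart copy, `xmlchange`, `case.setup`), and `MAXJOBS` is an illustrative limit:

```bash
# Configure members in parallel, with at most MAXJOBS running at once.
MAXJOBS=4
setup_member() { echo "setup $1"; }   # placeholder for the real setup steps

for m in EN00 EN01 EN02 EN03 EN04 EN05; do
  while [ "$(jobs -rp | wc -l)" -ge "$MAXJOBS" ]; do
    sleep 1                           # throttle until a slot frees up
  done
  setup_member "$m" &
done
wait                                  # make sure every setup has finished
```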

---

## Stage 3: Bundled Execution

### Compute concurrency

```bash
# Members that can run concurrently within the allocation
NMAXPS=$(( total_nodes / nodes_per_member ))
```

For example, a 28-node allocation with 7 nodes per member gives `NMAXPS=4` concurrent members.

### Launch runs

```bash
./case.submit --no-batch &
```

### Batch execution

* launch up to `NMAXPS` jobs
* wait for completion
* continue

### Monitor

* jobs
* Slurm status

Exit when all jobs complete.
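The batched launch-and-wait loop can be sketched as below; `run_member` again stands in for `cd $CASEDIR/$m && ./case.submit --no-batch`, and a real loop would also poll `jobs` and `squeue` for monitoring:

```bash
# Launch up to NMAXPS members, wait for that batch to finish, repeat.
NMAXPS=2
members=(EN00 EN01 EN02 EN03 EN04)
run_member() { echo "ran $1"; }       # placeholder for the real launch

for ((i = 0; i < ${#members[@]}; i += NMAXPS)); do
  for m in "${members[@]:i:NMAXPS}"; do
    run_member "$m" &                 # start this batch in the background
  done
  wait                                # next batch begins only when this one is done
done
```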

---

## Usage

## Step 1: Compile and Setup

Edit `create_and_setup_case.sh`:

```bash
do_compile_setup_only="true"
do_continue_run="false"
```

Setting `do_compile_setup_only="true"` enables:

* `do_e3sm_compile="true"`
* `do_ensemble_setup="true"`
* `do_ensemble_run="false"`

Run:

```bash
./run_ensemble.sh
```

---

## Step 2: Run Ensemble (initial run)

Edit `create_and_setup_case.sh`:

```bash
do_compile_setup_only="false"
do_continue_run="false"
```

This enables:

* `do_e3sm_compile="false"`
* `do_ensemble_setup="false"`
* `do_ensemble_run="true"`

Submit:

```bash
sbatch run_ensemble.sh
```

---

## Step 3: Run Ensemble (restart run)

```bash
do_compile_setup_only="false"
do_continue_run="true"
```

This enables:

* `do_e3sm_compile="false"`
* `do_ensemble_setup="false"`
* `do_ensemble_run="false"`
* `do_continue_run="true"`

Submit:

```bash
sbatch run_ensemble.sh
```

---

## Notes for S2D / Hindcast Workflows

Designed for large ensemble hindcasts spanning multiple initialization dates × multiple members, the runner works well with bundled node layouts such as 7-node or 28-node allocations.

Suitable for:

* S2D experiments
* FOSIRL initialization
* large ensemble campaigns

### Key advantage

Maximizes node utilization while minimizing queue overhead.

---

## Special Note for S2D Simulations Beyond 2014

The S2D simulation period extends beyond 2014; however, the prescribed historical external forcing dataset is only available from 1850 through 2014. As a result, simulations crossing this boundary must switch from historical forcing to future scenario forcing (SSP245) starting in 2015.

This transition introduces a potential discontinuity between 2014 and 2015 because the forcing source changes from historical to scenario-based forcing. Differences in forcing structure, long-term trends, background states, or implementation details between the historical and SSP245 datasets may create an artificial jump or inconsistency in the simulation. Such discontinuities can affect the continuity of the model evolution and should be carefully considered when interpreting results across the transition period, particularly for subseasonal-to-decadal (S2D) hindcast analyses and anomaly assessments.

Users are advised to pay special attention to diagnostics near the 2014–2015 boundary and to account for this forcing transition when evaluating long-term consistency, drift behavior, and forecast skill.

## Summary

This workflow provides a scalable and efficient solution for running large E3SM ensembles on HPC systems.

### Main strengths

* efficient resource utilization
* reduced queue overhead
* robust parallel workflow
* suitable for production S2D simulations
