# FastSpecFit Code Flow

### Overview

`FastSpecFit` (see documentation [here](https://fastspecfit.readthedocs.io/en/latest/) and web-app [here](https://fastspecfit.desi.lbl.gov/)) is written in pure Python and depends on `numpy`, `scipy`, `astropy`, and `numba`, as well as `desiutil`, `desimodel`, `desitarget`, and `desispec`. The QA code further depends on `matplotlib` and `seaborn`.

The only required inputs to the code are a `Redrock` FITS file (documented [here](https://desidatamodel.readthedocs.io/en/latest/DESI_SPECTRO_REDUX/SPECPROD/healpix/SURVEY/PROGRAM/PIXGROUP/PIXNUM/redrock-SURVEY-PROGRAM-PIXNUM.html)) and the corresponding `coadd-` FITS file (see [here](https://desidatamodel.readthedocs.io/en/latest/DESI_SPECTRO_REDUX/SPECPROD/healpix/SURVEY/PROGRAM/PIXGROUP/PIXNUM/coadd-SURVEY-PROGRAM-PIXNUM.html)), which contains the coadded spectra. (`FastSpecFit` works equally well on healpix coadds as on `cumulative`, `pernight`, or `perexp` coadds.)

The code is parallelized over *files* using `MPI` and over *targets* using Python `multiprocessing`. Benchmark tests carried out on Perlmutter found that using `mp=16` multiprocessing cores (physical, not threaded) and `128 * N / mp` MPI tasks, where `N` is the number of requested CPU cores, provided the best balance between I/O and compute time.

Load-balancing between MPI tasks is done using `desispec.parallel.weighted_partition`, where the *weight* of each file is given by the number of spectra to be fitted.
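The idea behind weighted load-balancing can be illustrated with a minimal, pure-Python sketch. This is a generic greedy heuristic written for illustration; it is not the implementation in `desispec.parallel.weighted_partition`, and the function name `greedy_weighted_partition` and the per-file spectrum counts are hypothetical.

```python
import heapq


def greedy_weighted_partition(weights, ngroups):
    """Split item indices into `ngroups` groups with roughly equal total weight.

    Greedy heuristic: visit items from heaviest to lightest and always
    assign the next item to the group with the smallest running total.
    """
    # Min-heap of (running total weight, group index).
    heap = [(0, g) for g in range(ngroups)]
    heapq.heapify(heap)
    groups = [[] for _ in range(ngroups)]

    order = sorted(range(len(weights)), key=lambda i: weights[i], reverse=True)
    for i in order:
        total, g = heapq.heappop(heap)
        groups[g].append(i)
        heapq.heappush(heap, (total + weights[i], g))
    return groups


# Hypothetical number of spectra to fit in each of eight Redrock files.
nspec_per_file = [500, 120, 480, 60, 300, 310, 20, 450]

# One group of file indices per (hypothetical) MPI task.
groups = greedy_weighted_partition(nspec_per_file, ngroups=3)
loads = [sum(nspec_per_file[i] for i in group) for group in groups]

for rank, (group, load) in enumerate(zip(groups, loads)):
    print(f"task {rank}: files {group}, {load} spectra")
```

Weighting by the number of spectra, rather than splitting the file list evenly, keeps a task that happens to receive several large files from dominating the wall-clock time.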

### Sequence

Given a `Redrock` filename, the following steps are carried out in sequence:

* Initialize `io.DESISpectra()`. This class subclasses the cosmology class, stores basic environment variables, and includes a method for gathering photometry for the targets being fitted.
* Call `io.DESISpectra().select()` to determine which targets to fit given the input `Redrock` file, and to gather broadband photometry for the full sample.
* Call `io.DESISpectra().read_and_unpack()` to read the individual spectra, generate coadded spectra, correct for dust attenuation, build a per-spectrum emission-line mask, and synthesize photometry from the data. This method is multiprocessed over CPU cores.
* Call `io.init_fastspec_output()` to initialize the output tables and data model.
* Call `fastspecfit.fastspec_one()` to fit each spectrum, parallelized over CPU cores. In this function:
  * Read the continuum templates from disk. (Note: it was found to be faster to have each process read the templates than to read them in the main function and transfer them via pickling.)
  * Fit the stellar continuum using `continuum.continuum_specfit()`, which, under the hood, uses `scipy.optimize.nnls()`.
  * Fit the residual emission-line spectrum using `emlines.emline_specfit()`, which carries out bounded, non-linear least-squares fitting with regularization using `scipy.optimize.least_squares()` and its `trf` method.
* Write out the results using `io.write_fastspecfit()`.
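
The two fitting steps above can be sketched on synthetic data as follows. This is a toy illustration of the underlying `scipy` calls only, not the logic inside `continuum.continuum_specfit()` or `emlines.emline_specfit`: the wavelength grid, the three smooth "templates", the single Gaussian line, and the helper `gaussian` are all assumptions made for the example, and the regularization mentioned above is omitted.

```python
import numpy as np
from scipy.optimize import least_squares, nnls

rng = np.random.default_rng(seed=1)
wave = np.linspace(4000.0, 6000.0, 2000)  # toy wavelength grid [Angstrom]


def gaussian(params, wave):
    """Single Gaussian line; params = (amplitude, center, sigma)."""
    amp, center, sigma = params
    return amp * np.exp(-0.5 * ((wave - center) / sigma) ** 2)


# Toy stand-ins for continuum templates: one column per template.
templates = np.column_stack([
    np.ones_like(wave),
    (wave / 5000.0) ** -2,
    (wave / 5000.0) ** 2,
])
true_coeff = np.array([0.5, 2.0, 0.0])
true_line = (4.0, 5007.0, 3.0)

# Synthetic spectrum: continuum + one emission line + Gaussian noise.
flux = templates @ true_coeff + gaussian(true_line, wave)
flux += rng.normal(scale=0.05, size=wave.size)

# Step 1: continuum. Mask the line region, then solve for non-negative
# template coefficients with non-negative least squares.
good = np.abs(wave - 5007.0) > 30.0
coeff, _ = nnls(templates[good], flux[good])
continuum = templates @ coeff

# Step 2: emission line. Fit the continuum-subtracted residual with
# bounded, non-linear least squares (trust-region reflective method).
residual_flux = flux - continuum


def line_residuals(params):
    return gaussian(params, wave) - residual_flux


fit = least_squares(
    line_residuals,
    x0=[1.0, 5005.0, 5.0],  # initial guess, strictly inside the bounds
    bounds=([0.0, 4997.0, 0.5], [np.inf, 5017.0, 20.0]),
    method="trf",
)
amp, center, sigma = fit.x
print(f"coefficients: {coeff}")
print(f"line: amp={amp:.2f}, center={center:.2f}, sigma={sigma:.2f}")
```

Fitting the continuum first with a linear, non-negative solver leaves only a few non-linear line parameters for the second step, and the bounds keep those parameters physical (for example, a positive line width).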

### Known bottlenecks

* 

### Desired features

* 

### Conclusions

* 