The DESC ELAsTiCC Challenge

[DESC Logo]

Current Status

The ELAsTiCC2 training set is available.

ELAsTiCC2 pre-streaming is currently in progress. The alerts being sent here are the same ones from the original ELAsTiCC. None of the data returned from this streaming will be used for evaluating classifiers; the entire purpose of this is to test connections and make sure the code that ingests and produces alerts is all working. There have been a few schema and taxonomy changes since ELAsTiCC. The plan for ELAsTiCC2 is to stream at 3× the rate of the first ELAsTiCC campaign, so that all three simulated years of LSST will be streamed in about 1 month.

The current pre-stream test can be found on the Kafka server at public.alerts.ztf.uw.edu:9092 in topic elasticc2-pre-test-2023july-i1. A histogram of how many alerts are sent out each hour each night (updated through the day before the current day) can be found on the temporary TOM used for the pre-streaming. (The desc-tom-2 server is being used just for these pre-stream tests; when actual ELAsTiCC2 begins, we will be back to using desc-tom.lbl.gov for everything.)
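
As a minimal sketch, connecting to the pre-stream test with the confluent_kafka Python client might look like the following. The server and topic are the ones given above; the group id, the offset-reset policy, the helper names, and the assumption that the server accepts unauthenticated connections are illustrative choices, not settings prescribed by the ELAsTiCC team.

```python
SERVER = "public.alerts.ztf.uw.edu:9092"
TOPIC = "elasticc2-pre-test-2023july-i1"


def consumer_config(group_id):
    """Build a consumer configuration for the pre-stream test server."""
    return {
        "bootstrap.servers": SERVER,
        "group.id": group_id,             # assumption: choose an id unique to you
        "auto.offset.reset": "earliest",  # assumption: read the topic from its start
    }


def fetch_alerts(group_id, max_alerts=10, timeout=10.0):
    """Return up to max_alerts raw alert payloads (bytes) from the topic."""
    # Imported here so consumer_config() works without the client installed.
    from confluent_kafka import Consumer

    consumer = Consumer(consumer_config(group_id))
    consumer.subscribe([TOPIC])
    payloads = []
    try:
        while len(payloads) < max_alerts:
            msg = consumer.poll(timeout)
            if msg is None:    # nothing arrived within the timeout; stop
                break
            if msg.error():    # skip messages that carry an error
                continue
            payloads.append(msg.value())
    finally:
        consumer.close()
    return payloads
```

Calling fetch_alerts("my-test-group") would return a short list of undecoded payloads, which is enough to confirm that the connection works.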

ELAsTiCC2 will start streaming for real on August 31.

#elasticc-comms

This channel on the LSST Slack is where you can contact the ELAsTiCC team and discuss the campaign. If you are not on the LSST Slack, you can join this one channel using Slack Connect. (That link expires every 14 days, and we need to renew it, so if you find the link expired, please email raknop@lbl.gov and ask me to update the link.)

Status & Metrics Web Page

The ELAsTiCC page on the DESC TOM has links to diagnostics and metrics for ELAsTiCC. You need an account on the TOM to load this page; contact Rob Knop for an account if you're a DESC member and you don't have an account but need one.

About ELAsTiCC

The purpose of ELAsTiCC ("Extended LSST Astronomical Time-series Classification Challenge") is to spur the creation and testing of an end-to-end real-time pipeline for time-domain science. The challenge starts with a simulation of ~5 million detected events that includes ~50 million alerts. These alerts will be streamed from LSST to brokers, who will classify the events and send new alerts with classifications back to DESC. A talk about ELAsTiCC given at the LSSTC Enabling Science Broker Workshop in 2021 can be found on YouTube.

For discussion or questions about the challenge, use the #elasticc-comms channel on the DESC Slack.

The first ELAsTiCC campaign ran from September 2022 until early January 2023. Metrics and diagnostics from that campaign can be found on the ELAsTiCC page of the DESC TOM (login required). A second campaign (which we're calling ELAsTiCC2) will start in summer 2023.

There is a new github repository for ELAsTiCC-related code and information: LSSTDESC/elasticc.


Timeline

ELAsTiCC2

Original ELAsTiCC Campaign


ELAsTiCC Poster at the Jan 2023 AAS

[PDF of Poster]

(Click image for PDF.)


Participants

For questions, message #elasticc-comms on the DESC Slack.

ELAsTiCC Lead: Gautham Narayan (UIUC)

ELAsTiCC team members: Alex Gagliano (UIUC), Alex Malz (Ruhr-Universität Bochum), Catarina Alves (University College London), Deep Chatterjee (UIUC), Emille Ishida (Université Clermont-Ferrand), Heather Kelly (SLAC), John Franklin Crenshaw (U. Washington), Konstantin Malanchev (UIUC), Laura Salo (UMN), Maria Vincenzi (ICG Portsmouth), Martine Lokken (U. Toronto), Qifeng Cheng (UIUC), Rahul Biswas (Oskar Klein Center), Renée Hložek (U. Toronto), Rick Kessler (U. Chicago), Robert Knop (LBNL), Ved Shah Gautam (UIUC)

Brokers:

Technical info and data for participants

The DESC TOM

ELAsTiCC results are being collected at the DESC TOM. An account on this TOM is required to go to the ELAsTiCC pages (linked in the navbar at the top); contact Rob Knop if you need an account.

Some relevant pages on the DESC TOM:

For ELAsTiCC2 pre-tests, we are currently using desc-tom-2.lbl.gov. However, this is a temporary server that is only being used for these pre-tests. Once ELAsTiCC2 starts, we'll do everything on the main DESC TOM server.


Alert Schema

(Links updated for ELAsTiCC2.)

The alert schema can be found in the alert_schema subdirectory of the LSSTDESC/elasticc github repository: https://github.com/LSSTDESC/elasticc/tree/main/alert_schema.

Brokers will ingest alerts in the elasticc.v0_9_1.alert.avsc format. (A perusal of the schema will reveal that some of the other schema in that directory are embedded in this.) They will issue alerts, which DESC will then ingest, in the elasticc.v0_9_1.brokerClassification.avsc schema. The mapping of event type to classId can be found in a Jupyter notebook in the taxonomy subdirectory of the github archive.

All alerts will be published without embedded schema on Kafka servers (both to and from brokers). As such, for things to work, everybody needs to be using the same version of the alerts. The schema are fairly stable, although we do anticipate a change in the brokerClassification alert on Wednesday, June 29.
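
Because the messages carry no embedded schema, the reader has to supply the schema itself. A minimal sketch using the fastavro library is shown here; the schema file names come from the text above, while the helper names and the local schema_dir (a checkout of the alert_schema subdirectory) are assumptions made for illustration.

```python
import io
import os

SCHEMA_VERSION = "0_9_1"


def schema_filename(kind, version=SCHEMA_VERSION):
    """Name of a schema file, e.g. kind='alert' or 'brokerClassification'."""
    return f"elasticc.v{version}.{kind}.avsc"


def load_elasticc_schema(schema_dir, kind):
    """Parse one schema from a local copy of the alert_schema directory."""
    # fastavro is imported lazily so schema_filename() has no dependencies.
    from fastavro.schema import load_schema

    return load_schema(os.path.join(schema_dir, schema_filename(kind)))


def decode_alert(payload, schema):
    """Decode one schemaless Avro message (bytes) into a dict."""
    from fastavro import schemaless_reader

    return schemaless_reader(io.BytesIO(payload), schema)


def encode_classification(record, schema):
    """Encode one brokerClassification dict as a schemaless Avro message."""
    from fastavro import schemaless_writer

    buf = io.BytesIO()
    schemaless_writer(buf, schema, record)
    return buf.getvalue()
```

A message written with one schema version and read with another will generally fail to decode or decode incorrectly, which is why everybody needs to use the same version.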

Forced Photometry in Alerts

The first detection of a transient will not have any forced photometry. The model is that the project will need time to produce that forced photometry.

All detections at least one night later than the first detection will have forced photometry going back to 30 days before the first detection.

For example, suppose object 42 is detected on MJD 60305, 60306, 60310, and 60340:
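
Applying the two rules above to these four detections can be sketched in Python. This only illustrates the stated rules; the function name and the use of integer MJDs to represent nights are assumptions.

```python
FORCED_LOOKBACK_DAYS = 30  # forced photometry reaches this far before first detection


def forced_photometry_start(detection_mjds):
    """Map each detection night to the MJD where its forced photometry starts.

    The value is None when the alert carries no forced photometry.
    """
    nights = sorted(detection_mjds)
    first = nights[0]
    starts = {}
    for mjd in nights:
        if mjd >= first + 1:
            # At least one night after the first detection.
            starts[mjd] = first - FORCED_LOOKBACK_DAYS
        else:
            # The first detection itself.
            starts[mjd] = None
    return starts


for mjd, start in forced_photometry_start([60305, 60306, 60310, 60340]).items():
    if start is None:
        print(f"MJD {mjd}: no forced photometry")
    else:
        print(f"MJD {mjd}: forced photometry from MJD {start}")
# MJD 60305: no forced photometry
# MJD 60306: forced photometry from MJD 60275
# MJD 60310: forced photometry from MJD 60275
# MJD 60340: forced photometry from MJD 60275
```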


Training Set

ELAsTiCC2

Current ELAsTiCC2 training set (updated 2023 July 13):

The ELAsTiCC2 training set has a similar set of models to the original ELAsTiCC training set. The AGN model has been replaced with a "changing-look AGN" (CLAGN) model, where a fraction of them will jump in intensity at some point during the three years. The Ia sample now uses SALT3 rather than SALT2 for its lightcurves. The cadence has been updated (and is close to, but not the same as, what will be used in the actual ELAsTiCC2 set).

The old (original) ELAsTiCC2 training set (released 2023 June 6) can be found in the ELASTICC2_TRAINING_SAMPLE directory.

Original ELAsTiCC Challenge

The training set is a set of simulated events for classification teams to use to train their models.

The training set was updated on 2022 June 25 to include a few bug fixes and a small fraction of host-less extragalactic events.

Where to find the training set

Format of training set files

The format of the training set files is outlined in the file A_FORMAT.TXT (found in the same directory as the training set). A log of the models produced by the SNANA simulation is in the file A_MODEL_SUMMARY.TXT.

This Jupyter notebook has a demo of using the ELAsTiCC photo-z quantiles.


Truth Tables

Original ELAsTiCC Challenge (Sep 2022-Jan 2023)

The truth tables are now available to all. They are in the TOM database in the tables elasticc_diatruth and elasticc_diaobjecttruth. For the schema of the tables and an example using the Web API for sending SQL queries to the TOM db, see sql_query_tom_db.py in the GitHub tom_desc archive.

Here are CSV files of the truth tables:

The "OBJECT" truth tables have truth for each object; the column SNID corresponds to the field diaObjectId from the alerts. The "ALERT" truth tables have information for each source (there was one alert for each source); the column SourceID corresponds to the field diaSourceId from the alerts. The object type is in the column GENTYPE (in the OBJECT tables) or TRUE_GENTYPE (in the ALERT tables). These do not correspond directly to the taxonomy brokers used to classify objects, but are internal types corresponding to SNANA models. The definitions of these types may be found in the file elasticc_origmap.txt in the alert_schema subdirectory of the elasticc GitHub archive.

Broker classifications used the Taxonomy. The mappings between SNANA gentype and taxonomy id are in the tables elasticc_gentypeofclassid and elasticc_classidofgentype in the DESC TOM database; those tables are below in CSV format:
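
As a rough sketch, translating the GENTYPE column of an OBJECT truth table into taxonomy ids could be done as follows. The column names SNID and GENTYPE come from the description above; the mapping file's column names (gentype, classId) are assumptions, and every row below is a made-up placeholder rather than real ELAsTiCC data.

```python
import csv
import io

# Placeholder data only: these ids and types are invented for the example.
OBJECT_TRUTH_CSV = """SNID,GENTYPE
1001,1
1002,2
1003,3
"""

# Assumed column names for a classidofgentype-style CSV file.
MAPPING_CSV = """gentype,classId
1,901
2,902
"""


def load_mapping(text):
    """Read a gentype -> classId mapping from CSV text."""
    reader = csv.DictReader(io.StringIO(text))
    return {int(row["gentype"]): int(row["classId"]) for row in reader}


def true_class_ids(truth_text, mapping):
    """Return {diaObjectId: classId} for each row of an OBJECT truth table.

    A gentype missing from the mapping that was passed in yields None.
    """
    result = {}
    for row in csv.DictReader(io.StringIO(truth_text)):
        result[int(row["SNID"])] = mapping.get(int(row["GENTYPE"]))
    return result


print(true_class_ids(OBJECT_TRUTH_CSV, load_mapping(MAPPING_CSV)))
# {1001: 901, 1002: 902, 1003: None}
```

For the ALERT truth tables the same approach would apply with SourceID and TRUE_GENTYPE in place of SNID and GENTYPE.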

There were some types of objects that were in the ELAsTiCC set that were deliberately not in the training set. These have SNANA GENTYPE 71-74 and 98. 71-74 represent strongly lensed SN Ia/II/Ib/Ic, and 98 is...special. Here is a note by Rick Kessler and Justin Pierel describing the strongly lensed SNe.


Set of 5000 alerts sent previously

5000alerts.tar.gz

That tar file has 5000 .avro.gz files for the first 5000 alerts sent out for simulated night MJD 60562. These alerts were sent out as part of ELAsTiCC on October 31. They are here for brokers to use for debugging purposes.
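
One way to read the tar file without unpacking it to disk is sketched below using only the Python standard library. The function name is an assumption, and how the decompressed bytes should then be decoded is not covered here.

```python
import gzip
import tarfile


def iter_alert_payloads(tar_path):
    """Yield (member name, decompressed bytes) for each .avro.gz file."""
    with tarfile.open(tar_path, "r:gz") as tar:
        for member in tar:
            # Skip directories and anything that is not a gzipped alert.
            if not (member.isfile() and member.name.endswith(".avro.gz")):
                continue
            with tar.extractfile(member) as fh:
                yield member.name, gzip.decompress(fh.read())
```

For example, sum(1 for _ in iter_alert_payloads("5000alerts.tar.gz")) counts the alert files in a local copy of the tar file.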

One question we're interested in: are alerts making it all the way through the system? That is, are brokers successfully receiving them, are classifications successfully running, and is the DESC TOM successfully receiving the classifications that the broker meant to send? A suggested use for this test set is to run them through the classification machinery for a classifier and decide whether or not a classification message would have been sent for each alert. Let me (Rob) know which ones should have had a classification sent; I will check the TOM database to see if those alerts do in fact have a classification for that classifier.


Classification codes and models

Classification teams: If you have classification code you wish to make available for brokers to run, please provide links to the code and the trained models on #elasticc-comms on the DESC Slack. If you wish to have those links published on this web page, tag Rob Knop on that channel with the information, or email me at raknop@lbl.gov.


Database Access Resources

All the alerts that we are streaming are also archived on a database in the DESC TOM. This database will also ingest all of the broker messages from the ELAsTiCC brokers. If you wish to analyze the results, you will need an account on the TOM; if you don't have one, contact Rob Knop and ask him to make you one. Here are some resources to help you access this database:

If you are curious, here are three slides about the ELAsTiCC TOM database that diagram the relationships between the tables holding information for ELAsTiCC and provide some estimates of the ultimate sizes of these tables.