trace-data/
    Traces used for training. THIS DIRECTORY IS ALSO SHARED ON THE WEB, AND THE URL IS IN THE PAPER APPENDIX.
    The 3M-trace set was used for physics plots and HPO. The 15M-trace set was used for scaling plots.

/data
    Data for etalumis SC19 paper plots.

    SingleNode Haswell Performance
        NERSC_SingleNode/train_loss_log_15Mdata_BZPT-64-Nodes-1-LR-10_scaling-19941335

    SingleSocket Comparison of Default (Container) PyTorch with Optimized PyTorch
        NERSC_SingleNode/train_loss_log_15Mdata_BZPT-64-Nodes-1-LR-10_scaling-20288428_singlenode_Cori_Redo_171MNetwork
        NERSC_SingleNode/train_loss_log_15Mdata_BZPT-64-Nodes-1-LR-10_scaling-20288111_singlenode_Cori_ContainerPyTorch_Redo_171MNetwork

    SingleSocket Smaller 3M dataset, 151M Param Network (not used in paper, but note little difference in rates)
        data/NERSC_SingleNode/151M_ParamNetwork/

    Larry_Diamond_Data - data for Diamond single socket numbers and for imbalance plot in paper

    physics - numpy files used to make chan2 (and 3) IC plots and RMH numpys
        - camera ready updated to use Aug5_masterwithprior_chan2_0-*

    /data/convergence/train_loss_log* - used for convergence loss plot

    data/training-logs/logs - copy of all training logs as of Jun 4 2019,
        e.g. data/training-logs/logs/rebutrerun_v{run}_cat for the 5 runs averaged for the rebuttal

    data/training-logs/Apr12-Runs/logs/hpo/ - logs used for network HPO search

    Networks for 1k Edison and Cori runs are at /project/projectdirs/dasrepo/etalumis/important_files
    (also at https://portal.nersc.gov/project/dasrepo/etalumis/important_files/).
        networks_1kEmu_nodes64_40 in there was used for physics plots.

    Final physics plot in camera ready (Aug5_masterwithprior) used
        data/networks/lstm_dim_512_lstm_depth_1_proposal_mixture_components_5/

    All plots and numpy files are also in /global/project/projectdirs/dasrepo/etalumis/plots

/scripts
    scripts/NERSC-SingleNodeThroughput.ipynb - all single socket IVB and HSW numbers and ratio of optimized to default PyTorch

    scripts/NERSC_single_socket/ - Single Socket Execution Scripts
        Run with:
            srun --nodes=1 --ntasks-per-node=1 shifter ./TrainNN_new_allinOne_shifter_BB_scaling_nompi_BB.sh
        TrainNN_new_allinOne_shifter_BB_scaling_nompi_BB.sh : Burst Buffer Single Socket (Paper table number)
        TrainNN_new_allinOne_shifter_BB_scaling_nompi_ContainerPyTorch.sh : Container PyTorch (Lustre)
        TrainNN_new_allinOne_shifter_BB_scaling_nompi.sh : Optimized PyTorch (Lustre)

    scripts/lucas_physicsplots - used for final physics plots

libraries/
    torch-1.0.0a0+9256bf1-cp37-cp37m-linux_x86_64.whl - optimized wheel with intel opts and 'larry's' distributed optimizations

SHOULD ADD:
    Scripts used for Edison and Cori scaling and convergence runs in reservation
    Trained network used for physics plots (it's at /project/projectdirs/dasrepo/etalumis/important_files/networks_1kEmu_nodes64_40/)
    Network and files used for loss plot
    Path of scaling plots (data is in the directory)
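
Note on the wheel in libraries/: its filename follows the standard wheel naming scheme (name-version-python tag-abi tag-platform tag), so the tags show which interpreter it targets. The sketch below is only an illustration, not part of the original scripts. The helper names parse_wheel_filename and interpreter_matches are hypothetical, and the parser assumes the filename carries no optional build tag. It checks the CPython version only, not the ABI or platform.

```python
import sys


def parse_wheel_filename(filename):
    """Split a wheel filename into its five tag fields.

    Assumes the optional build tag is absent, as it is for the
    wheel listed in libraries/.
    """
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel filename: " + filename)
    parts = filename[: -len(".whl")].split("-")
    if len(parts) != 5:
        raise ValueError("expected name-version-python-abi-platform")
    keys = ("name", "version", "python_tag", "abi_tag", "platform_tag")
    return dict(zip(keys, parts))


def interpreter_matches(python_tag, version_info=sys.version_info):
    """Return True if a cpXY tag matches the given interpreter version.

    Only the major/minor version is compared; the ABI and platform
    tags are not checked here.
    """
    expected = "cp{}{}".format(version_info[0], version_info[1])
    return python_tag == expected


WHEEL = "torch-1.0.0a0+9256bf1-cp37-cp37m-linux_x86_64.whl"
info = parse_wheel_filename(WHEEL)
print(info)
print("matches this interpreter:", interpreter_matches(info["python_tag"]))
```

The cp37 and linux_x86_64 tags indicate the wheel targets CPython 3.7 on 64-bit Linux. In such an environment it would normally be installed with pip install libraries/torch-1.0.0a0+9256bf1-cp37-cp37m-linux_x86_64.whl.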