DSI Examples
============

PENNANT mini-app
----------------

`PENNANT` is an unstructured mesh physics mini-application developed at Los Alamos National Laboratory
for advanced architecture research.
It contains mesh data structures and a few
physics algorithms from radiation hydrodynamics and serves as an example of
typical memory access patterns for an HPC simulation code.

This DSI PENNANT example is used to show a common use case: create and query a set of metadata derived from an ensemble of simulation runs. 
The example GitHub directory includes 10 PENNANT runs using the PENNANT *Leblanc* test problem.

In the first step, a python script is used to parse the slurm output files and create a CSV (comma separated value) file with the output metadata.

.. code-block:: unixconfig

   python3 parse_slurm_output.py

In the second step, another python script,

.. code-block:: unixconfig

   python3 dsi_pennant.py

reads in the CSV file and creates a database:

.. literalinclude:: ../examples/pennant/dsi_pennant.py

Resulting in the output of the query:

..  figure:: images/example-pennant-output.png
    :alt: Screenshot of computer program output.
    :class: with-shadow

    The output of the PENNANT example.


Wildfire Dataset
----------------

This example highlights the use of the DSI framework with QUIC-Fire simulation data and resulting images. 
QUIC-Fire is a fire-atmosphere modeling framework for prescribed fire burn analysis. 
It is light-weight (able to run on a laptop), allowing scientists to generate ensembles of thousands of simulations in weeks. 
This QUIC-fire dataset is an ensemble of prescribed fire burns for the Wawona region of Yosemite National Park.

The original file, wildfire.csv, lists 1889 runs of a wildfire simulation. Each row is a unique run with input and output values and associated image url. 
The columns list the various parameters of interest. 
The input columns are: wild_speed, wdir (wind direction), smois (surface moisture), fuels, ignition, safe_unsafe_ignition_pattern, 
safe_unsafe_fire_behavior, does_fire_meet_objectives, and rationale_if_unsafe. 
The output of the simulation (and post-processing steps) include the burned_area and the url to the wildfire images stored on the San Diego Super Computer.

After loading dsi, run this example within the ``dsi/examples/wildfire/`` folder as all filepaths are relative to that location:

.. code-block:: unixconfig

   python3 wildfire.py

.. literalinclude:: ../examples/wildfire/wildfire.py


.. _user_schema_example_label:

Cloverleaf (Complex Schemas)
-------------------------------

This example shows how to use DSI with ensemble data from 8 Cloverleaf_Serial runs, and how to create a complex schema compatible with DSI.

The directory with this sample input and output data can be found in ``examples/clover3d/`` where each run has its own subfolder.
Each run's input file is ``clover.in`` and the output is ``clover.out`` and the associated VTK files.

After loading dsi, run this example within the ``dsi/examples/user/`` folder as all filepaths are relative to that location:

.. code-block:: unixconfig

   python3 7.schema.py

This workflow uses a custom Cloverleaf reader to load the data, along with a complex schema that maps the input data, output data, and VTK files to the respective simulation runs.
Once executing the workflow, users can see that the state2_density value is the only input parameter changed for each run.

.. literalinclude:: ../examples/user/7.schema.py

where ``examples/test/example_schema.json`` is:

.. code-block:: json

   {
      "simulation": {
         "primary_key": "sim_id"
      }, 
      "input": {
         "foreign_key": {
               "sim_id": ["simulation", "sim_id"]
         }
      }, 
      "output": {
         "foreign_key": {
               "sim_id": ["simulation", "sim_id"]
         }
      },
      "viz_files": {
         "foreign_key": {
               "sim_id": ["simulation", "sim_id"]
         }
      }
   }
   
and the generated ER diagram is:

..  figure:: images/schema_erd.png
    :scale: 35%
    :align: center

    Entity Relationship Diagram of Cloverleaf data. 
    Displays relations between the simulation, input, output, and viz_files tables.

This section explains how to define primary and foreign key relationships in a JSON file for ``schema()``, such as ``examples/test/example_schema.json``

For futher clarity, each schema file must be structured as a dictionary where:

   - Each table with a relation is a key whose value is a nested dictionary storing primary and foreign key information

      - Ex from above: "input" : { ... }
   - The nested dictionary has 2 keys: 'primary_key' and 'foreign_key' which must be spelled exactly the same to be processed:
   - The value of 'primary_key' is this table's column that is a primary key

      - Ex from above: "primary_key" : "sim_id"
   - The value of 'foreign_key' is another inner dictionary, since a table can have multiple foreign keys:

      - Each key in this dictionary is a column in this table that serves as a foreign key
      - Each value is a list with 2 elements - the table storing the associated primary key, and the column in that table which is the primary key
      - Ex: "foreign_key" : { "name" : ["table1", "table1_id"] , "age" : ["table2", "table2_id"] }
   - If a table does not have a primary or foreign key, you do not have to include them in the table's nested dictionary

For example, if we update the Cloverleaf schema by adding a new primary and foreign key relation (assuming the columns exist):

.. code-block:: json

   {
      "simulation": {
         "primary_key": "sim_id"
      }, 
      "input": {
         "primary_key": "input_id",                  // <--- new primary key
         "foreign_key": {
               "sim_id": ["simulation", "sim_id"]
         }
      }, 
      "output": {
         "foreign_key": {
               "sim_id": ["simulation", "sim_id"],
               "input_id": ["input", "input_id"]   // <--- new foreign key
         }
      },
      "viz_files": {
         "foreign_key": {
               "sim_id": ["simulation", "sim_id"]
         }
      }
   }

our new ER diagram would be:

..  figure:: images/schema_erd_added.png
    :scale: 35%
    :align: center
   
    ER Diagram of same data. However, there is now an additional primary/foreign key relation from "input" to "output"