Derivatives on PyTorch Nested Cylinder Models

Feature derivatives have been implemented for the PyTorch nested cylinder models.

Table of Contents:

Feature Derivatives Script
- Arguments
Calico Model Diagram
Calico Model Creation
- Unit Tests
- Arguments
Calico Dataset & Dataloader Creation
- Unit Tests
- Arguments

Feature Derivatives Script 

Calculates the derivatives of model outputs with respect to scaling an internal feature using a calico network.

Only operational on PyTorch models trained on the nested cylinder experiment.

Exports derivatives and other sample information as a pandas-readable .csv, including:

- Difference between the scaled and unscaled outputs
- Derivative of the outputs; equivalent to the difference divided by dScale
- Prediction of the unscaled network
- Truth PTW scaling corresponding to the input sample, and identifying sample information
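The relationship between the exported difference and derivative values can be checked once the file is loaded. The sketch below is only an illustration: the inline CSV stands in for the script's output, and the column names (`sample`, `difference`, `derivative`, `prediction`) are hypothetical, so the real header names should be read from the exported file.

```python
import csv
import io

D_SCALE = 0.001  # default value of --D_SCALE

# Inline stand-in for an exported file; column names and values are hypothetical.
csv_text = """sample,difference,derivative,prediction
sample_a,0.002,2.0,0.51
sample_b,-0.0005,-0.5,0.47
"""

rows = list(csv.DictReader(io.StringIO(csv_text)))

# The derivative is the output difference divided by dScale.
recomputed = [float(row["difference"]) / D_SCALE for row in rows]

# Largest disagreement between the recomputed and the stored derivative.
max_error = max(
    abs(value - float(row["derivative"])) for value, row in zip(recomputed, rows)
)
print(max_error)
```

Because the export is described as pandas-readable, `pandas.read_csv` on the exported path should give the same table as a DataFrame when pandas is installed.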

Input Line: python ftderivatives_pyt_nestedcyl.py -M ../examples/pyt_nestedcyl/trained_rho2PTW_model.pth -IF rho -ID ../examples/pyt_nestedcyl/data/ -DF ../examples/pyt_nestedcyl/nestedcyl_design_file.csv -L interp_module.interpActivations.10 -T 1 -S ../examples/pyt_nestedcyl/figures/

Arguments 

Calculates the derivatives of model outputs with respect to scaling an internal feature using a calico network.

usage: python ftderivatives_pyt_nestedcyl [-h] [--MODEL] [--INPUT_FIELD]
                                          [--INPUT_DIR] [--FILE_LIST]
                                          [--DESIGN_FILE] [--PRINT_LAYERS]
                                          [--PRINT_FEATURES] [--PRINT_FIELDS]
                                          [--PRINT_KEYS] [--PRINT_SAMPLES]
                                          [--LAYER] [--FEATURES  [...]]
                                          [--D_SCALE] [--FIXED_KEY]
                                          [--NUM_SAMPLES] [--SAVE_FIG]

Named Arguments

--MODEL, -M

Model file

Default: “../examples/tf_coupon/trained_pRad2TePla_model.h5”

--INPUT_FIELD, -IF

The radiographic/hydrodynamic field the model is trained on

Default: “pRad”

--INPUT_DIR, -ID

Directory path where all of the .npz files are stored

Default: “../examples/tf_coupon/data/”

--FILE_LIST, -FL

The .txt file containing a list of .npz file paths; use “MAKE” to generate a file list given an input directory (passed with -ID) and a number of samples (passed with -NS).

Default: “MAKE”
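As a rough illustration of what generating a file list from an input directory and a sample count could look like, the sketch below uses a hypothetical helper, `make_file_list_sketch`. It is not the script's own routine, and the random selection of a subset is an assumption; the script may choose samples differently.

```python
import random
import tempfile
from pathlib import Path


def make_file_list_sketch(input_dir, num_samples="All", seed=0):
    """Hypothetical helper: return .npz file names found in input_dir."""
    names = sorted(p.name for p in Path(input_dir).glob("*.npz"))
    if num_samples != "All":
        # Assumption: a random subset is drawn when a sample count is given.
        rng = random.Random(seed)
        names = sorted(rng.sample(names, int(num_samples)))
    return names


# Demonstrate on a temporary directory holding empty placeholder files.
with tempfile.TemporaryDirectory() as tmp:
    for i in range(5):
        (Path(tmp) / f"sample_idx{i:05d}.npz").touch()
    (Path(tmp) / "notes.txt").touch()  # ignored: not an .npz file
    all_files = make_file_list_sketch(tmp)
    subset = make_file_list_sketch(tmp, num_samples=3)

print(all_files)
print(subset)
```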

--DESIGN_FILE, -DF

The .csv file with master design study parameters

Default: “../examples/tf_coupon/coupon_design_file.csv”

--PRINT_LAYERS, -PL

Prints list of layer names in a model (passed with -M) and quits program

Default: False

--PRINT_FEATURES, -PT

Prints number of features extracted by a layer (passed with -L) and quits program

Default: False

--PRINT_FIELDS, -PF

Prints list of hydrodynamic/radiographic fields present in a given .npz file (passed with -IN) and quits program

Default: False

--PRINT_KEYS, -PK

Prints list of choices for the fixed key available in a given input directory (passed with -ID) and quits program

Default: False

--PRINT_SAMPLES, -PS

Prints number of samples in a directory (passed with -ID) matching a fixed key (passed with -XK) and quits program

Default: False

--LAYER, -L

Name of model layer that features will be extracted from

Default: “None”

--FEATURES, -T

List of features to include; “Grid” plots all features in one figure using subplots; “All” plots all features each in a new figure; a list of integers can be passed to plot those features each in a new figure. Integer convention starts at 1.

Default: [‘1’]

--D_SCALE, -DS

Scaling factor used in feature derivatives.

Default: 0.001

--FIXED_KEY, -XK

The identifying string for some subset of all data samples; pass “None” to consider all samples

Default: “None”

--NUM_SAMPLES, -NS

Number of samples to use; pass “All” to use all samples in a given input directory (passed with -ID)

Default: “All”

--SAVE_FIG, -S

Directory to save the outputs to.

Default: “../examples/tf_coupon/figures/”

Calico Model Diagram 

Calico Model Creation 

Defines the calico model for the single branch pytorch nested cylinder models

Execution will print unit test information, perform unit tests, and print the results to the terminal.

Input Line: python pyt_nestedcyl_calico_model.py -M ../../../examples/pyt_nestedcyl/trained_rho2PTW_model.pth -IF rho -IN ../../../examples/pyt_nestedcyl/data/nc231213_Sn_id0643_pvi_idx00112.npz -DF ../../../examples/pyt_nestedcyl/nestedcyl_design_file.csv -L interp_module.interpActivations.10

fns.derivatives.pyt_nestedcyl_calico_model.recurr_getattr(obj, attr, default=None)

Recursive getattr function; allows ‘attr’ to contain ‘.’

Code from: https://programanddesign.com/python-2/recursive-getsethas-attr/

Parameters

obj (object) – object to read the attribute from
attr (str) – attribute to obtain; can contain ‘.’, which denotes a recursive (nested) attribute
default – value that is returned when the named attribute is not found

Returns

Attribute of Object
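A minimal sketch of a recursive getattr is shown below to make the dotted-name lookup concrete. The name `recursive_getattr_sketch` and the toy objects are hypothetical; this is one common way to write such a helper with `functools.reduce`, not necessarily the code used in `fns.derivatives`.

```python
from functools import reduce
from types import SimpleNamespace


def recursive_getattr_sketch(obj, attr, default=None):
    """Resolve a dotted attribute path one segment at a time."""
    try:
        # "a.b.c" becomes getattr(getattr(getattr(obj, "a"), "b"), "c")
        return reduce(getattr, attr.split("."), obj)
    except AttributeError:
        # Any missing segment along the path yields the default.
        return default


# Toy nested object standing in for a model with nested submodules.
toy_model = SimpleNamespace(
    interp_module=SimpleNamespace(activation=SimpleNamespace(name="relu"))
)

found = recursive_getattr_sketch(toy_model, "interp_module.activation.name")
missing = recursive_getattr_sketch(toy_model, "interp_module.absent.name", "n/a")
print(found, missing)
```

A layer name such as the one passed with -L in the input line above is resolved in the same segment-by-segment way.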

class fns.derivatives.pyt_nestedcyl_calico_model.make_calico(model, lay, ftIDX=0, dScale=0.01)

Pytorch model class that creates a “calico network” from an existing nested cylinder neural net

Parameters

model (loaded pytorch model) – model to copy into calico
lay (str) – the name of the layer in model that will become the split layer
ftIDX (int) – index of the feature to scale; the feature with respect to which the derivative is taken
dScale (float) – derivative scaling factor

forward(x)

Forward pass of pytorch neural network class

Parameters

x (Union[torch.FloatTensor, torch.cuda.FloatTensor]) – input to layer

Returns

splitx (torch.tensor[float]) – prediction from original model
diff (torch.tensor[float]) – difference in prediction between original model and calico model

fns.derivatives.pyt_nestedcyl_calico_model.load_calico(model, checkpoint, device, lay, ftIDX=0, dScale=0.01)

Function to create a pytorch nested cylinder calico model and load in the correct weights

Parameters

model (loaded pytorch model) – model to copy into calico
checkpoint (str) – path to model checkpoint with original model weights
device (torch.device) – device index to select
lay (str) – the name of the layer in model that will become the split layer
ftIDX (int) – index of the feature to scale; the feature with respect to which the derivative is taken
dScale (float) – derivative scaling factor

Returns

calico (pytorch model) – calico network

Unit Tests 

Unit Test for the Difference Output: The calico difference output is the difference between the original branch and the multiply branch. This test sets the dScale value to zero, meaning the multiply branch is scaled by 1. Therefore, the difference should be zero.
Unit Test for the Prediction Output: The calico prediction output is the output from the original branch. This test compares the calico prediction output to the original model prediction output. The difference should be zero.
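The two-branch idea and both checks above can be illustrated on a toy network. The sketch below is an assumption-laden stand-in, not the `make_calico` implementation: the class `TinyCalicoSketch`, the explicit `head`/`tail` split, and the sign convention (scaled minus unscaled) are all hypothetical. It only relies on the documented behavior that one feature at the split layer is multiplied by 1 + dScale.

```python
import torch
from torch import nn


class TinyCalicoSketch(nn.Module):
    """Hypothetical two-branch wrapper around a model split into head and tail."""

    def __init__(self, head, tail, ft_idx=0, d_scale=0.01):
        super().__init__()
        self.head = head  # layers up to and including the split layer
        self.tail = tail  # layers after the split layer
        self.ft_idx = ft_idx
        self.d_scale = d_scale

    def forward(self, x):
        feats = self.head(x)
        # Per-channel multiplier: 1 everywhere except the chosen feature.
        scale = torch.ones(feats.shape[1], dtype=feats.dtype, device=feats.device)
        scale[self.ft_idx] = 1.0 + self.d_scale
        scaled_feats = feats * scale.view(1, -1, *([1] * (feats.dim() - 2)))
        pred = self.tail(feats)                 # original branch
        diff = self.tail(scaled_feats) - pred   # multiply branch minus original
        return pred, diff


torch.manual_seed(0)
head = nn.Sequential(nn.Conv2d(1, 4, kernel_size=3, padding=1), nn.ReLU())
tail = nn.Sequential(nn.Flatten(), nn.Linear(4 * 8 * 8, 1))
x = torch.randn(2, 1, 8, 8)

with torch.no_grad():
    # dScale = 0 scales the feature by 1, so the difference should vanish.
    pred_zero, diff_zero = TinyCalicoSketch(head, tail, ft_idx=1, d_scale=0.0)(x)
    # The prediction output should match the unmodified model.
    original = tail(head(x))
```

With a nonzero `d_scale`, dividing `diff` by `d_scale` gives the finite-difference derivative estimate that the feature derivatives script exports.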

Arguments 

Creates and tests a calico network given an input model

usage: python pyt_nestedcyl_calico_model [-h] [--MODEL] [--INPUT_FIELD]
                                         [--INPUT_NPZ] [--DESIGN_FILE]
                                         [--PRINT_LAYERS] [--PRINT_FIELDS]
                                         [--LAYER]

Named Arguments

--MODEL, -M

Model file

Default: “../examples/tf_coupon/trained_pRad2TePla_model.h5”

--INPUT_FIELD, -IF

The radiographic/hydrodynamic field the model is trained on

Default: “pRad”

--INPUT_NPZ, -IN

The .npz file with an input image to the model

Default: “../examples/tf_coupon/data/r60um_tpl112_complete_idx00110.npz”

--DESIGN_FILE, -DF

The .csv file with master design study parameters

Default: “../examples/tf_coupon/coupon_design_file.csv”

--PRINT_LAYERS, -PL

Prints list of layer names in a model (passed with -M) and quits program

Default: False

--PRINT_FIELDS, -PF

Prints list of hydrodynamic/radiographic fields present in a given .npz file (passed with -IN) and quits program

Default: False

--LAYER, -L

Name of model layer that features will be extracted from

Default: “None”

Calico Dataset & Dataloader Creation 

Defines the pytorch dataset class for the single branch pytorch nested cylinder models

Execution will print test information, perform tests, and print the results to the terminal.

Input Line: python pyt_nestedcyl_calico_dataloader.py -M ../../../examples/pyt_nestedcyl/trained_rho2PTW_model.pth -IF rho -ID ../../../examples/pyt_nestedcyl/data/ -DF ../../../examples/pyt_nestedcyl/nestedcyl_design_file.csv

class fns.derivatives.pyt_nestedcyl_calico_dataloader.calico_DataSet(input_field='rho', input_dir='../examples/pyt_nestedcyl/data/', filelist='filelist', design_file='../examples/pyt_nestedcyl/nestedcyl_design_file.csv')

The definition of a dataset object used as input to the pytorch nested cylinder calico neural networks.

Parameters

input_field (str) – The radiographic/hydrodynamic field the model is trained on
input_dir (str) – The directory path where all of the .npz files are stored
filelist (str) – Text file listing file names to read.
design_file (str) – .csv file with master design study parameters

__len__(): Return number of samples in dataset.

__getitem__(index): Return a tuple of a batch’s input and output data for training at a given index.

fns.derivatives.pyt_nestedcyl_calico_dataloader.calico_dataloader(input_field='rho', input_dir='../examples/pyt_nestedcyl/data/', filelist='filelist', design_file='../examples/pyt_nestedcyl/nestedcyl_design_file.csv', batch_size=8)

Function to create a pytorch dataloader from the pytorch nested cylinder calico model dataset

Parameters

input_field (str) – The radiographic/hydrodynamic field the model is trained on
input_dir (str) – The directory path where all of the .npz files are stored
filelist (str) – Text file listing file names to read.
design_file (str) – .csv file with master design study parameters
batch_size (int) – Number of samples per batch

Returns

dataloader (torch.utils.data.DataLoader) – pytorch dataloader made from calico model dataset

Unit Tests 

Unit Test of Length Method: The unit tests print the length of the dataset to confirm that it is the same as the number of samples provided.
Unit Test for Input and Output Shapes: The unit tests print the shapes of the batched input and ground truth. The user must check that these sizes are correct. Batch size 8 is used.
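The same two checks can be reproduced on a toy dataset. The sketch below is hypothetical: `ToyCalicoDataSet` serves random tensors rather than reading .npz files, and the image shape and scalar ground truth are assumptions chosen only to show the expected batching behavior with batch size 8.

```python
import torch
from torch.utils.data import DataLoader, Dataset


class ToyCalicoDataSet(Dataset):
    """Hypothetical stand-in that serves random tensors instead of .npz samples."""

    def __init__(self, num_samples=20, image_shape=(1, 16, 16)):
        self.inputs = torch.randn(num_samples, *image_shape)
        self.truths = torch.randn(num_samples)

    def __len__(self):
        # Number of samples in the dataset.
        return self.inputs.shape[0]

    def __getitem__(self, index):
        # One (input, ground truth) pair; the DataLoader stacks these into batches.
        return self.inputs[index], self.truths[index]


dataset = ToyCalicoDataSet()
loader = DataLoader(dataset, batch_size=8, shuffle=False)
batch_inputs, batch_truths = next(iter(loader))

# Length check and batched shape check, mirroring the two unit tests.
print(len(dataset), tuple(batch_inputs.shape), tuple(batch_truths.shape))
```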

Arguments 

Creates and tests a calico dataloader (for input to a calico model) given an input model, layer, and feature

usage: python pyt_nestedcyl_calico_dataloader [-h] [--MODEL] [--INPUT_FIELD]
                                              [--INPUT_DIR] [--FILE_LIST]
                                              [--DESIGN_FILE]
                                              [--PRINT_SAMPLES] [--D_SCALE]
                                              [--NUM_SAMPLES]

Named Arguments

--MODEL, -M

Model file

Default: “../examples/tf_coupon/trained_pRad2TePla_model.h5”

--INPUT_FIELD, -IF

The radiographic/hydrodynamic field the model is trained on

Default: “pRad”

--INPUT_DIR, -ID

Directory path where all of the .npz files are stored

Default: “../examples/tf_coupon/data/”

--FILE_LIST, -FL

The .txt file containing a list of .npz file paths; use “MAKE” to generate a file list given an input directory (passed with -ID) and a number of samples (passed with -NS).

Default: “MAKE”

--DESIGN_FILE, -DF

The .csv file with master design study parameters

Default: “../examples/tf_coupon/coupon_design_file.csv”

--PRINT_SAMPLES, -PS

Prints number of samples in a directory (passed with -ID) matching a fixed key (passed with -XK) and quits program

Default: False

--D_SCALE, -DS

Scaling factor used in feature derivatives.

Default: 0.001

--NUM_SAMPLES, -NS

Number of samples to use; pass “All” to use all samples in a given input directory (passed with -ID)

Default: “All”