Derivatives on Tensorflow Coupon Models
Feature derivatives have been implemented on the Tensorflow Coupon Models.
Feature Derivatives Script
Calculates the derivatives of model outputs with respect to scaling an internal feature using a calico network.
Only operational on tensorflow models trained on the coupon experiment.
- Exports derivatives and other sample information as a pandas-readable .csv, including:
Difference (diff_) between the scaled and unscaled outputs
Derivative (derv_) of the outputs; equivalent to the difference divided by the dScale
Prediction (pred_) of the unscaled network
Truth (true_) TePla parameters corresponding to the input sample, and identifying sample information
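The relationship between the diff_ and derv_ columns can be checked after loading the exported file with pandas. The snippet below is a minimal sketch, not the script's own code: the column names (sample, true_prm, pred_prm, diff_prm, derv_prm) and the values are hypothetical placeholders, and only the four prefixes and the default dScale of 0.001 come from this page.

```python
import io

import pandas as pd

D_SCALE = 0.001  # default value of --D_SCALE

# Hypothetical stand-in for an exported .csv; real column names and values will differ.
csv_text = """sample,true_prm,pred_prm,diff_prm,derv_prm
idx00001,0.50,0.48,0.00002,0.02
idx00002,0.75,0.77,-0.00001,-0.01
"""

df = pd.read_csv(io.StringIO(csv_text))

# Group the columns by the documented prefixes.
prefixes = ("diff_", "derv_", "pred_", "true_")
columns_by_prefix = {p: [c for c in df.columns if c.startswith(p)] for p in prefixes}

# The derivative should equal the difference divided by dScale.
recomputed = df["diff_prm"] / D_SCALE
max_error = (recomputed - df["derv_prm"]).abs().max()
print(columns_by_prefix)
print("largest mismatch between derv_ and diff_/dScale:", max_error)
```

With a real export, replace the in-memory text with the path to the .csv written to the -S directory.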
Input Line:
python ftderivatives_tf_coupon.py -M ../examples/tf_coupon/trained_pRad2TePla_model.h5 -IF pRad -ID ../examples/tf_coupon/data/ -DF ../examples/tf_coupon/coupon_design_file.csv -NF ../examples/tf_coupon/coupon_normalization.npz -L activation_15 -T 1 -S ../examples/tf_coupon/figures/
Arguments
Calculates the derivatives of model outputs with respect to scaling an internal feature using a calico network.
usage: python ftderivatives_tf_coupon [-h] [--MODEL] [--INPUT_FIELD] [--INPUT_DIR]
[--FILE_LIST] [--DESIGN_FILE] [--NORM_FILE]
[--PRINT_LAYERS] [--PRINT_FEATURES]
[--PRINT_FIELDS] [--PRINT_KEYS]
[--PRINT_SAMPLES] [--LAYER] [--FEATURES [...]]
[--D_SCALE] [--FIXED_KEY] [--NUM_SAMPLES]
[--SAVE_FIG]
Named Arguments
- --MODEL, -M
Model file
Default: “../examples/tf_coupon/trained_pRad2TePla_model.h5”
- --INPUT_FIELD, -IF
The radiographic/hydrodynamic field the model is trained on
Default: “pRad”
- --INPUT_DIR, -ID
Directory path where all of the .npz files are stored
Default: “../examples/tf_coupon/data/”
- --FILE_LIST, -FL
The .txt file containing a list of .npz file paths; use “MAKE” to generate a file list given an input directory (passed with -ID) and a number of samples (passed with -NS).
Default: “MAKE”
- --DESIGN_FILE, -DF
The .csv file with master design study parameters
Default: “../examples/tf_coupon/coupon_design_file.csv”
- --NORM_FILE, -NF
The .npz file with normalization values
Default: “../examples/tf_coupon/coupon_normalization.npz”
- --PRINT_LAYERS, -PL
Prints list of layer names in a model (passed with -M) and quits program
Default: False
- --PRINT_FEATURES, -PT
Prints number of features extracted by a layer (passed with -L) and quits program
Default: False
- --PRINT_FIELDS, -PF
Prints list of hydrodynamic/radiographic fields present in a given .npz file (passed with -IN) and quits program
Default: False
- --PRINT_KEYS, -PK
Prints list of choices for the fixed key available in a given input directory (passed with -ID) and quits program
Default: False
- --PRINT_SAMPLES, -PS
Prints number of samples in a directory (passed with -ID) matching a fixed key (passed with -XK) and quits program
Default: False
- --LAYER, -L
Name of model layer that features will be extracted from
Default: “None”
- --FEATURES, -T
List of features to include; “Grid” plots all features in one figure using subplots; “All” plots all features each in a new figure; A list of integers can be passed to plot those features each in a new figure. Integer convention starts at 1.
Default: [‘1’]
- --D_SCALE, -DS
Scaling factor used in feature derivatives.
Default: 0.001
- --FIXED_KEY, -XK
The identifying string for some subset of all data samples; pass “None” to consider all samples
Default: “None”
- --NUM_SAMPLES, -NS
Number of samples to use; pass “All” to use all samples in a given input directory (passed with -ID)
Default: “All”
- --SAVE_FIG, -S
Directory to save the outputs to.
Default: “../examples/tf_coupon/figures/”
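Because the -T integers start at 1 while the sequence class takes a feature index (ftIDX, default 0), some conversion between the two is needed. The following is a sketch of one way to do it, under assumptions: parse_features is a hypothetical helper, not a function from the script, and it treats “Grid” and “All” identically because they differ only in how figures are laid out.

```python
import argparse


def parse_features(features, n_features):
    """Hypothetical helper: turn a -T style list into zero-based feature indices."""
    if len(features) == 1 and features[0] in ("Grid", "All"):
        # Both keywords select every feature; they differ only in plotting.
        return list(range(n_features))
    indices = []
    for item in features:
        number = int(item)
        if not 1 <= number <= n_features:
            raise ValueError(f"feature {number} is outside 1..{n_features}")
        indices.append(number - 1)  # the command line counts from 1
    return indices


parser = argparse.ArgumentParser()
parser.add_argument("--FEATURES", "-T", nargs="+", default=["1"])
args = parser.parse_args(["-T", "1", "3"])

selected = parse_features(args.FEATURES, n_features=12)
print(selected)
```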
Calico Model Diagram
Calico Model Creation
Defines the calico model for the branched tensorflow coupon models
Execution will print unit test information, perform unit tests, and print the results to the terminal.
Input Line:
python tf_coupon_calico_model.py -M ../../../examples/tf_coupon/trained_pRad2TePla_model.h5 -IF pRad -IN ../../../examples/tf_coupon/data/r60um_tpl112_complete_idx00110.npz -DF ../../../examples/tf_coupon/coupon_design_file.csv -L activation_15
- fns.derivatives.tf_coupon_calico_model.make_calico(model, lay)
Function that creates a tensorflow “calico network” from an existing coupon neural net
- Parameters
model (loaded keras model) – model to copy into calico
lay (str) – the name of the layer in model that will become the split layer
- Returns
calico (keras.model) – calico network
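The idea of a split layer with an original branch and a multiplied branch can be illustrated without tensorflow. The sketch below is a NumPy toy under stated assumptions, not the make_calico implementation: front, back, toy_calico, and the random weights are all hypothetical, and the real coupon models are branched Keras networks with additional inputs.

```python
import numpy as np

rng = np.random.default_rng(0)
W_front = rng.normal(size=(5, 4))  # toy layers up to the split: 5 inputs -> 4 features
W_back = rng.normal(size=(4, 2))   # toy layers after the split: 4 features -> 2 outputs


def front(x):
    """Toy stand-in for the model up to and including the split layer."""
    return np.tanh(x @ W_front)


def back(features):
    """Toy stand-in for the rest of the model after the split layer."""
    return features @ W_back


def toy_calico(x, const, truth):
    """Return (difference, prediction, truth), mirroring the three calico outputs."""
    features = front(x)
    pred = back(features)            # original branch
    scaled = back(features * const)  # branch with the features multiplied by const
    return scaled - pred, pred, truth


x = rng.normal(size=(3, 5))
truth = rng.normal(size=(3, 2))
d_scale = 0.001

const = np.ones(4)
const[1] = 1.0 + d_scale  # scale only feature index 1

diff, pred, truth_out = toy_calico(x, const, truth)
derv = diff / d_scale  # derivative of the outputs with respect to the scaling
print(derv)
```

Passing a const of all ones makes both branches identical, which is the situation the difference-output unit test relies on.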
Unit Tests
Unit Test for the Difference Output: The Calico difference output is the difference between the original branch and the multiple branch. This test sets the tensor multiplier to one. Therefore, the difference should be zero.
Unit Test for the Prediction Output: The Calico prediction output is the output from the original branch. This test compares the Calico prediction output to the original model prediction output. The difference should be zero.
Unit Test for the Truth Output: The Calico truth output is the same as the ground truth passed into Calico. This test compares the truth input to Calico with the truth output. The difference should be zero.
It is commonly observed that this difference is \(O(10^{-8})\). The developers hypothesize that this is due to the conversion between float64 and float32 that occurs when loading the inputs onto and off of GPUs.
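The size of a float64 to float32 round-trip error is easy to reproduce in NumPy. This sketch only illustrates that such a conversion yields differences of the observed order; it does not confirm that the conversion is the actual cause. The values are random numbers in [0, 1), which is an assumption about the scale of the normalized truth values.

```python
import numpy as np

rng = np.random.default_rng(0)
truth64 = rng.random(1000)  # float64 values in [0, 1); assumed scale of the truth

# Convert to float32 and back, as might happen when moving data to and from a GPU.
round_trip = truth64.astype(np.float32).astype(np.float64)

max_abs_diff = np.abs(round_trip - truth64).max()
print(f"largest round-trip difference: {max_abs_diff:.2e}")
```

For values below 1, float32 rounding changes each number by at most 2^-25, roughly 3e-8, which is consistent with a difference of order 10^-8.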
Arguments
Creates and tests a calico network given an input model
usage: python tf_coupon_calico_model [-h] [--MODEL] [--INPUT_FIELD] [--INPUT_NPZ]
[--DESIGN_FILE] [--PRINT_LAYERS] [--PRINT_FIELDS]
[--LAYER]
Named Arguments
- --MODEL, -M
Model file
Default: “../examples/tf_coupon/trained_pRad2TePla_model.h5”
- --INPUT_FIELD, -IF
The radiographic/hydrodynamic field the model is trained on
Default: “pRad”
- --INPUT_NPZ, -IN
The .npz file with an input image to the model
Default: “../examples/tf_coupon/data/r60um_tpl112_complete_idx00110.npz”
- --DESIGN_FILE, -DF
The .csv file with master design study parameters
Default: “../examples/tf_coupon/coupon_design_file.csv”
- --PRINT_LAYERS, -PL
Prints list of layer names in a model (passed with -M) and quits program
Default: False
- --PRINT_FIELDS, -PF
Prints list of hydrodynamic/radiographic fields present in a given .npz file (passed with -IN) and quits program
Default: False
- --LAYER, -L
Name of model layer that features will be extracted from
Default: “None”
Calico Sequence Creation
Contains the keras sequence classes to use when calculating feature derivatives with the calico models
Execution will print unit test information, perform unit tests, and print the results to the terminal.
Input Line:
python tf_coupon_calico_seq.py -M ../../../examples/tf_coupon/trained_pRad2TePla_model.h5 -IF pRad -ID ../../../examples/tf_coupon/data/ -DF ../../../examples/tf_coupon/coupon_design_file.csv -NF ../../../examples/tf_coupon/coupon_normalization.npz -L activation_15 -T 1 -NS 40
- class fns.derivatives.tf_coupon_calico_seq.calicoSEQ(input_field='rho', input_dir='/data/coupon_data/', filelist='../../coupon_ml/yellow_r60um_tpl_testing.txt', design_file='../../coupon_ml/design_res60um_tepla_study220620_MASTER.csv', normalization_file='../../coupon_ml/r60um_normalization.npz', batch_size=8, epoch_length=10, layshape=(300, 1000, 12), ftIDX=0, dScale=0.001)
The definition of a sequence object used as input to the tensorflow coupon calico neural networks.
- Parameters
input_field (str) – The radiographic/hydrodynamic field the model is trained on
input_dir (str) – The directory path where all of the .npz files are stored
filelist (str) – Text file listing file names to read.
design_file (str) – .csv file with master design study parameters
normalization_file (str) – Full-path to file containing normalization information.
batch_size (int) – Number of samples in each batch
epoch_length (int) – Number of batches in an epoch
layshape (3 tuple) – the size of the output of the specified layer (outY, outX, Nfeatures)
ftIDX (int) – index of the feature to scale, i.e. the feature with respect to which the derivative is taken
dScale (float) – derivative scaling factor
- __len__()
Return the epoch length. Epoch length is the number of batches that will be trained on per epoch.
- __getitem__(idx)
Return a tuple of a batch’s input and output data for training at a given index within the epoch.
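The two methods follow the usual indexing protocol of a keras sequence: the length is the number of batches per epoch, and indexing returns one batch of inputs and outputs. The class below is a plain-Python stand-in written under assumptions, not calicoSEQ itself: ToyBatchSequence is hypothetical, it loads nothing from disk, and the wrap-around choice of files per batch is a guess at one reasonable policy.

```python
class ToyBatchSequence:
    """Hypothetical stand-in that mimics the __len__/__getitem__ protocol."""

    def __init__(self, file_names, batch_size=8, epoch_length=10):
        self.file_names = list(file_names)
        self.batch_size = batch_size
        self.epoch_length = epoch_length

    def __len__(self):
        # Number of batches per epoch.
        return self.epoch_length

    def __getitem__(self, idx):
        if not 0 <= idx < len(self):
            raise IndexError(idx)
        start = idx * self.batch_size
        # Assumption: wrap around the file list so that every batch is full.
        batch_files = [
            self.file_names[(start + i) % len(self.file_names)]
            for i in range(self.batch_size)
        ]
        inputs = batch_files  # stand-in for the loaded input arrays
        outputs = [f"truth for {name}" for name in batch_files]
        return inputs, outputs


seq = ToyBatchSequence([f"sample_{i:02d}.npz" for i in range(20)])
batch_inputs, batch_outputs = seq[0]
print(len(seq), len(batch_inputs), batch_inputs[0])
```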
Unit Tests
Unit Test of Input and Output Shapes: The calico sequence creates four inputs to the calico network: [img_input, const_input, prms_input, truth_input]; and one output: truth_output. The unit tests print the shapes of all the inputs and outputs. The user must determine if the shapes are correct.
Unit Test for Const_Input Construction: The unit tests check if the const_input contains all ones in unselected features and contains 1+dScale for the selected feature.
Unit Test for the Truth Output: The unit tests check if the truth input and truth output are identical.
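The const_input check can be sketched in NumPy. This is an illustration under assumptions rather than the unit test shipped with the module: make_const_input and check_const_input are hypothetical names, a leading batch dimension is assumed, and a small layshape is used in place of the default (300, 1000, 12) to keep the arrays small.

```python
import numpy as np


def make_const_input(layshape, ft_idx, d_scale, batch_size):
    """Hypothetical builder: ones everywhere, 1 + dScale in the selected feature."""
    const = np.ones((batch_size, *layshape), dtype=np.float32)
    const[..., ft_idx] = 1.0 + d_scale  # features are the last axis of layshape
    return const


def check_const_input(const, ft_idx, d_scale):
    """Return True if only the selected feature is scaled by 1 + dScale."""
    selected = const[..., ft_idx]
    unselected = np.delete(const, ft_idx, axis=-1)
    return bool(np.allclose(selected, 1.0 + d_scale) and np.all(unselected == 1.0))


layshape = (3, 4, 12)  # (outY, outX, Nfeatures), shrunk for the example
const = make_const_input(layshape, ft_idx=0, d_scale=0.001, batch_size=2)

print(const.shape)
print(check_const_input(const, ft_idx=0, d_scale=0.001))
```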
Arguments
Creates and tests a calico sequence (for input to a calico model) given an input model, layer, and feature
usage: python tf_coupon_calico_seq [-h] [--MODEL] [--INPUT_FIELD] [--INPUT_DIR]
[--FILE_LIST] [--DESIGN_FILE] [--NORM_FILE]
[--PRINT_LAYERS] [--PRINT_FEATURES]
[--PRINT_SAMPLES] [--LAYER] [--FEATURES [...]]
[--D_SCALE] [--NUM_SAMPLES]
Named Arguments
- --MODEL, -M
Model file
Default: “../examples/tf_coupon/trained_pRad2TePla_model.h5”
- --INPUT_FIELD, -IF
The radiographic/hydrodynamic field the model is trained on
Default: “pRad”
- --INPUT_DIR, -ID
Directory path where all of the .npz files are stored
Default: “../examples/tf_coupon/data/”
- --FILE_LIST, -FL
The .txt file containing a list of .npz file paths; use “MAKE” to generate a file list given an input directory (passed with -ID) and a number of samples (passed with -NS).
Default: “MAKE”
- --DESIGN_FILE, -DF
The .csv file with master design study parameters
Default: “../examples/tf_coupon/coupon_design_file.csv”
- --NORM_FILE, -NF
The .npz file with normalization values
Default: “../examples/tf_coupon/coupon_normalization.npz”
- --PRINT_LAYERS, -PL
Prints list of layer names in a model (passed with -M) and quits program
Default: False
- --PRINT_FEATURES, -PT
Prints number of features extracted by a layer (passed with -L) and quits program
Default: False
- --PRINT_SAMPLES, -PS
Prints number of samples in a directory (passed with -ID) matching a fixed key (passed with -XK) and quits program
Default: False
- --LAYER, -L
Name of model layer that features will be extracted from
Default: “None”
- --FEATURES, -T
List of features to include; “Grid” plots all features in one figure using subplots; “All” plots all features each in a new figure; A list of integers can be passed to plot those features each in a new figure. Integer convention starts at 1.
Default: [1]
- --D_SCALE, -DS
Scaling factor used in feature derivatives.
Default: 0.001
- --NUM_SAMPLES, -NS
Number of samples to use; pass “All” to use all samples in a given input directory (passed with -ID)
Default: “All”
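The “MAKE” option of -FL builds a file list from the input directory and the number of samples. A rough sketch of that behaviour follows, with several assumptions: make_file_list is a hypothetical helper, taking the first N files in sorted order is a guess (the script may select samples differently), and writing full paths rather than bare file names is also assumed.

```python
import glob
import os
import tempfile


def make_file_list(input_dir, num_samples="All", list_path="file_list.txt"):
    """Hypothetical helper: write a list of .npz paths found in input_dir."""
    paths = sorted(glob.glob(os.path.join(input_dir, "*.npz")))
    if num_samples != "All":
        paths = paths[: int(num_samples)]  # assumption: keep the first N sorted files
    with open(list_path, "w") as handle:
        for path in paths:
            handle.write(path + "\n")
    return paths


# Demonstrate on a temporary directory holding empty placeholder files.
with tempfile.TemporaryDirectory() as tmp:
    for i in range(5):
        open(os.path.join(tmp, f"sample_idx{i:05d}.npz"), "w").close()
    list_path = os.path.join(tmp, "file_list.txt")
    chosen = make_file_list(tmp, num_samples=3, list_path=list_path)
    with open(list_path) as handle:
        written = handle.read().splitlines()

print(len(chosen), [os.path.basename(p) for p in chosen])
```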