pyCP_APR.applications.ktensor_utils module

This module contains the utility functions for KRUSKAL tensor M.

@author: Maksim Ekin Eren

pyCP_APR.applications.ktensor_utils.get_X_hat(components, indices)

Calculate X hat from KRUSKAL tensor M, given the non-zero indices.

components: KRUSKAL tensor components

indices: non-zero coordinates

  • components (dict) -- KRUSKAL Tensor M in dict format.

  • indices (array) -- Array of indices in X hat.


lambdas -- Array of lambdas in X calculated from M using the indices.
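As a rough illustration of the calculation described above, the sketch below evaluates a KRUSKAL tensor at a list of coordinates by summing, over the components, the component weight times the product of the matching factor entries. The function name `x_hat_at` and the plain-list layout of `factors` and `weights` are assumptions made for this example; the dict format that `get_X_hat` expects is not shown here.

```python
from math import prod


def x_hat_at(factors, weights, indices):
    """Evaluate a KRUSKAL tensor at the given coordinates (illustrative only).

    factors: one factor matrix per mode, indexed as factors[d][i][r]
    weights: list of R component weights
    indices: list of coordinate tuples, one entry per mode
    """
    lambdas = []
    for coord in indices:
        total = 0.0
        for r, w in enumerate(weights):
            # Product of the r-th column entries selected by this coordinate.
            total += w * prod(factors[d][i][r] for d, i in enumerate(coord))
        lambdas.append(total)
    return lambdas


# Hypothetical rank-2 decomposition of a 2 x 2 tensor.
factors = [
    [[1.0, 0.0], [0.5, 2.0]],  # mode 0
    [[2.0, 1.0], [1.0, 3.0]],  # mode 1
]
weights = [1.0, 0.5]
lambdas = x_hat_at(factors, weights, [(0, 0), (1, 1)])
```

Only the requested coordinates are evaluated, so the dense tensor never has to be materialized.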

Get tensor shape from the components.


components (dict) -- KRUSKAL Tensor M in dict format.


shape -- Tensor X shape.

pyCP_APR.applications.sptensor_utils module

This module contains the utility functions for tensor X.

@author: Maksim Ekin Eren


Returns the shape of X. i.e. size of each mode.


X (array) -- Tensor X in COO format. i.e. X is the coordinates of the non-zero values.


size -- Tensor X shape.

Returns the number of dimensions that tensor X has.


X (array) -- Tensor X in COO format. i.e. X is the coordinates of the non-zero values.


dimensions -- Number of dimensions that X has.

Calculates the number of non-zero elements in X.


X (array) -- Tensor X in COO format. i.e. X is the coordinates of the non-zero values.


non-zeros -- Number of non-zeros in X.

Calculates the total number of zeros in X.


X (array) -- Tensor X in COO format. i.e. X is the coordinates of the non-zero values.


zeros -- Number of zeros in X.

Calculates the total number of elements in X including non-zeros and zeros.


X (array) -- Tensor X in COO format. i.e. X is the coordinates of the non-zero values.


size -- Number of elements in X.
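The quantities above can all be derived from the coordinate list alone. The sketch below is a minimal pure-Python version; the helper names are hypothetical and are not the names used in `sptensor_utils`. It assumes that coordinates are zero-based and unique, and that the size of each mode is the largest coordinate seen in that mode plus one.

```python
from math import prod


def coo_shape(coords):
    """Size of each mode, inferred as the largest coordinate plus one."""
    return tuple(max(column) + 1 for column in zip(*coords))


def coo_ndim(coords):
    """Number of modes (dimensions) of the tensor."""
    return len(coords[0])


def coo_nnz(coords):
    """Number of non-zero entries, one per coordinate row."""
    return len(coords)


def coo_size(coords):
    """Total number of entries, zeros included."""
    return prod(coo_shape(coords))


def coo_zeros(coords):
    """Number of zero entries: total entries minus non-zeros."""
    return coo_size(coords) - coo_nnz(coords)


coords = [(0, 0, 1), (1, 2, 0), (3, 1, 1)]
summary = (coo_shape(coords), coo_ndim(coords), coo_nnz(coords),
           coo_zeros(coords), coo_size(coords))
```

For these three coordinates the inferred shape is 4 x 3 x 2, so 21 of the 24 entries are zero.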

pyCP_APR.applications.stat_utils module

This module contains the tensor statistic utilities.

@author: Maksim Ekin Eren

pyCP_APR.applications.stat_utils.mrr_fuse_ranks(x, weights=None, axis=0, k=60.0, y=None)

Calculates Mean Reciprocal Rank (MRR).

Under development.

  • x (array) -- Tensor x.

  • weights (array, optional) -- Array of weights. The default is None.

  • axis (int, optional) -- Dimension number. The default is 0.

  • k (int, optional) -- Top k. The default is 60.

  • y (array, optional) -- Labels. The default is None.


result -- MRR score.

pyCP_APR.applications.tensor_anomaly_detection module

This module performs p-value scoring over the tensor decomposition, i.e. the KRUSKAL tensor M. The calculated p-values are used to detect anomalies.

This method was introduced by Eren et al. in [1].

CyberToaster, Project 1, Summer 2020

Los Alamos National Laboratory

Anomaly detection using Tensors and their Decompositions.

Student: Maksim E. Eren

Primary Mentor: Juston Moore

Secondary Mentors: Boian Alexandrov and Patrick Avery


[1] M. E. Eren, J. S. Moore and B. S. Alexandrov, "Multi-Dimensional Anomalous Entity Detection via Poisson Tensor Factorization," 2020 IEEE International Conference on Intelligence and Security Informatics (ISI), 2020, pp. 1-6, doi: 10.1109/ISI49825.2020.9280524.

@author: Maksim Ekin Eren, Juston S. Moore

class pyCP_APR.applications.tensor_anomaly_detection.PoissonTensorAnomaly(dimensions={}, weights=[], objective='p_value', lambda_method='single_tensor', p_value_fusion_index=[0], ensemble_dimensions={}, ensemble_weights=[], ensemble_significance=[0.1, 0.9], mode_weights=[1], ignore_dimensions_indx=[])

Anomaly detection using Poisson Distribution and Canonical Polyadic (CP) with Alternating Poisson Regression tensor decomposition (CP-APR).

The components of the CP-APR decomposition are used to calculate the p-value of each instance through the Poisson cumulative distribution function (CDF).

p-values are then used to determine if the event is an anomaly. Lower p-values are more anomalous.
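To make the scoring idea concrete, here is a minimal sketch of a Poisson p-value for one observed count given its rate lambda. It uses the upper tail, P(X >= count), which is one common convention for "how surprising is a count this large"; the exact tail and any edge-case handling used by `PoissonTensorAnomaly` are not stated here, so treat both the convention and the function names as assumptions.

```python
from math import exp, lgamma, log


def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam), computed in log space for stability."""
    return exp(k * log(lam) - lam - lgamma(k + 1))


def poisson_cdf(k, lam):
    """P(X <= k) by direct summation of the probability mass function."""
    if k < 0:
        return 0.0
    return min(1.0, sum(poisson_pmf(i, lam) for i in range(k + 1)))


def upper_tail_p_value(count, lam):
    """P(X >= count): small when the count is large relative to lam."""
    if lam <= 0:
        raise ValueError("lam must be positive")
    return 1.0 - poisson_cdf(count - 1, lam)


# The same rate with two different observed counts.
p_typical = upper_tail_p_value(1, 2.0)
p_unusual = upper_tail_p_value(10, 2.0)
```

With a rate of 2.0, a count of 10 receives a far smaller p-value than a count of 1, which matches the rule that lower p-values are more anomalous.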

v2: Utilizes Numpy vectorization for the calculations.


1) Chi, Eric C. and Tamara G. Kolda. “On Tensors, Sparsity, and Nonnegative Factorizations.” SIAM J. Matrix Anal. Appl. 33 (2012): 1272-1299.

2) Turcotte, Melissa J. M. et al. “Unified Host and Network Data Set.” ArXiv abs/1708.07518 (2017): n. pag.

3) Wikipedia contributors. "Poisson distribution." Wikipedia, The Free Encyclopedia. Wikipedia, The Free Encyclopedia, 29 Jun. 2020. Web. 6 Jul. 2020.

Initialize the anomaly detector class.

  • dimensions (dict, required) -- Components of the KRUSKAL Tensor Decomposition. Each element is a dimension (the factors of a component), and each dimension has (n x K) elements for that factor for rank K. The default is dict().

  • weights (list, required) -- Weights of each component of parameter dimensions. The default is list().

  • objective (string, optional) -- What to calculate. Options: p_value, p_value_fusion_harmonic, p_value_fusion_harmonic_observed, p_value_fusion_chi2, p_value_fusion_chi2_observed, p_value_fusion_arithmetic, log_likelihood. The default is 'p_value'.

  • lambda_method (string, optional) -- How to calculate lambda. If 'single_tensor', the single ktensor passed in dimensions is used when calculating lambda. If 'ensemble', two ktensors are used: parameter dimensions is a ktensor of rank K>=1 with lambda weight ensemble_significance[0], and parameter ensemble_dimensions is a ktensor of rank K>1 with lambda weight ensemble_significance[1]. The default is 'single_tensor'.

  • p_value_fusion_index (list) -- Index to fix, or calculate the p-value fusions. Only used when objective is set to p_value_fusion. The default is [0].

  • ensemble_dimensions (dict, optional) -- Components of the KRUSKAL Tensor Decomposition. Each element is a dimension (the factors of a component), and each dimension has (n x K) elements for that factor for rank K. This is the second ktensor passed. It is used only if lambda_method is set to 'ensemble'. Its lambda weight is ensemble_significance[1]. The default is dict().

  • ensemble_weights (list, optional) -- Weights of each component of ensemble_dimensions. The default is list(). Only used if lambda_method is 'ensemble'.

  • ensemble_significance (list, optional) -- Lambda weight of each ktensor when using the 'ensemble' lambda_method. Weight of dimensions: ensemble_significance[0]. Weight of ensemble_dimensions: ensemble_significance[1]. The default is [0.1, 0.9].

  • mode_weights (list, optional) -- Weight of each dimension. The default is [1].

  • ignore_dimensions_indx (list, optional) -- Indices of any dimensions in the latent factors that should be ignored when calculating the lambdas. The default is [].
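One plausible reading of the 'ensemble' lambda_method is a weighted sum of the lambdas produced by the two ktensors, with ensemble_significance supplying the two weights. The sketch below shows that reading only; the function name is hypothetical, and the weighted-sum form is an assumption rather than something confirmed by the parameter descriptions.

```python
def ensemble_lambda(lambda_a, lambda_b, significance=(0.1, 0.9)):
    """Blend two lambda arrays entry by entry (assumed weighted sum).

    lambda_a: lambdas from the ktensor passed as `dimensions`
    lambda_b: lambdas from the ktensor passed as `ensemble_dimensions`
    significance: (weight of lambda_a, weight of lambda_b)
    """
    w_a, w_b = significance
    return [w_a * a + w_b * b for a, b in zip(lambda_a, lambda_b)]


# Hypothetical lambdas for the same two coordinates under each ktensor.
mixed = ensemble_lambda([2.0, 4.0], [1.0, 0.0])
```

With the default weights, the second ktensor dominates the blended rate.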

predict(coords, values, from_matlab=False)

Get the scores using the KRUSKAL components given the non-zero coordinates and values and the objective.

  • coords (list of list) -- Coordinates of the non-zero elements within the sparse tensor.

  • values (list) -- Non-zero values that are in the sparse tensor.

  • from_matlab (bool) --

    Set to True to subtract 1 from the coordinates, since MATLAB indexing starts at 1.

    The default is False.


prediction -- Dictionary of calculated objective.
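The from_matlab flag addresses the index-base mismatch between MATLAB (1-based) and Python (0-based). A minimal sketch of the shift it describes, using a hypothetical helper name, could look like this:

```python
def to_zero_based(coords):
    """Shift 1-based (MATLAB-style) coordinates to 0-based coordinates."""
    return [[index - 1 for index in coord] for coord in coords]


# Two non-zero entries of a 3-mode tensor, exported with 1-based indices.
matlab_coords = [[1, 1, 2], [3, 2, 1]]
python_coords = to_zero_based(matlab_coords)
```

Without the shift, every lookup into the factor matrices would be off by one row, and the largest index in each mode would fall out of range.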

pyCP_APR.applications.tensor_anomaly_detection_v2 module

This module performs p-value scoring over the tensor decomposition, i.e. the KRUSKAL tensor M. The calculated p-values are used to detect anomalies.

This method was introduced by Eren et al. in [1].

The second version performs faster calculation of the inner products of the components to extract the lambdas.

This version also provides dimension fusion methods for lambda calculations.


[1] M. E. Eren, J. S. Moore and B. S. Alexandrov, "Multi-Dimensional Anomalous Entity Detection via Poisson Tensor Factorization," 2020 IEEE International Conference on Intelligence and Security Informatics (ISI), 2020, pp. 1-6, doi: 10.1109/ISI49825.2020.9280524.

@author: Juston S. Moore, Maksim Ekin Eren

class pyCP_APR.applications.tensor_anomaly_detection_v2.PoissonTensorAnomaly_v2(components, indicies, tensor_weights=[1])

Initialize the anomaly detection class.

Calculates the lambdas, and obtains tensor information.

  • components (dict) -- KRUSKAL Tensor M in dict format.

  • indicies (array) -- Non-zero coordinates.

  • tensor_weights (list, optional) --

    Weight of each lambda for the tensors.

    Used only when an ensemble of tensors is used in the lambda calculations. The default is [1].

get_dimension_fusion_scores(axis_map, y_true)

Calculates the prediction scores given the fused lambdas and the true labels y.

Fusion is performed for the dimensions in axis_map.

  • axis_map (list) -- Which dimensions to fuse.

  • y_true (list) -- List of true labels for each entry.


df -- Fusion scores.

Return type:

Pandas DataFrame


Returns the lambda values that are calculated.


lambdas -- Array of lambda values for the indices.

Calculates the prediction scores given lambdas and the true labels y.


y (list) --

True labels.

Label of each index.


score -- Prediction scores. {"roc_auc": float, "pr_auc": float}
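For orientation, the sketch below computes the two metrics named in the return value from binary labels and anomaly scores, in pure Python. It assumes that higher scores mean "more anomalous" and uses average precision as the PR AUC estimate; how the class turns lambdas into scores, and which estimator it uses for pr_auc, are not specified here. The function names are hypothetical.

```python
def roc_auc(y_true, scores):
    """Probability that a random positive outranks a random negative."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    # Ties between a positive and a negative count as half a win.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(y_true, scores):
    """Mean of the precision values at the rank of each positive."""
    ranked = sorted(zip(scores, y_true), key=lambda pair: -pair[0])
    hits, total = 0, 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / rank
    return total / hits


y = [0, 0, 1, 1]
anomaly_scores = [0.1, 0.4, 0.35, 0.8]
score = {"roc_auc": roc_auc(y, anomaly_scores),
         "pr_auc": average_precision(y, anomaly_scores)}
```

Both functions need at least one positive and one negative label; with tied scores, the ordering inside `average_precision` follows the input order.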

