TELF.factorization.decompositions.utilities package#

Submodules#

TELF.factorization.decompositions.utilities.bool_clustering module#

TELF.factorization.decompositions.utilities.bool_noise module#

TELF.factorization.decompositions.utilities.clustering module#

TELF.factorization.decompositions.utilities.clustering.custom_bool_clustering(W_all, centroids=None, max_iters=100, distance='hamming', use_gpu=False)[source]#

Options for distance: ‘false negative’, ‘false positive’, or a distance metric from cdist (such as the default, ‘hamming’). Change this function to use a different distance or different centroids.

TELF.factorization.decompositions.utilities.clustering.custom_k_means(W_all, centroids=None, max_iters=100, use_gpu=False)[source]#

Greedy algorithm to approximate a quadratic assignment problem to cluster vectors. Given p groups of k vectors, construct k clusters, each cluster containing a single vector from each of the p groups. This clustering approximation uses cosine distances and mean centroids.

Parameters:
  • W_all (ndarray) – Order three tensor of shape m by k by p, where m is the ambient dimension of the vectors, k is the number of vectors in each group, and p is the number of groups of vectors.

  • centroids (ndarray) – The m by k initialization of the centroids of the clusters. None corresponds to using the first slice, W_all[:,:,0], as the initial centroids. Defaults to None.

  • max_iters (int) – The maximum number of iterations of the algorithm. If a stable point has not been reached within max_iters iterations, a warning is given. Defaults to 100.

Returns:

  • centroids (ndarray) – The m by k centroids of the clusters.

  • W_all (ndarray) – Clustered organization of the vectors. W_all[:,i,:] contains the p m-dimensional vectors in the ith cluster.
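The greedy assignment described for custom_k_means can be illustrated with a short NumPy sketch. This is not TELF's implementation: the function name `greedy_cluster` is hypothetical, the input is assumed to be a dense array with no zero vectors, and the "pick the largest remaining cosine similarity" rule is one plausible reading of the greedy step.

```python
import numpy as np

def greedy_cluster(W_all, max_iters=100):
    """Illustrative sketch only (hypothetical helper, not TELF's code)."""
    W = W_all.copy()
    m, k, p = W.shape
    centroids = W[:, :, 0].copy()  # first slice as initial centroids
    for _ in range(max_iters):
        changed = False
        c_unit = centroids / np.linalg.norm(centroids, axis=0, keepdims=True)
        for j in range(p):
            w_unit = W[:, :, j] / np.linalg.norm(W[:, :, j], axis=0, keepdims=True)
            sim = c_unit.T @ w_unit  # k x k cosine similarities
            order = np.empty(k, dtype=int)
            for _step in range(k):
                # Greedily match the most similar (centroid, vector) pair left.
                c, v = np.unravel_index(np.argmax(sim), sim.shape)
                order[c] = v
                sim[c, :] = -np.inf
                sim[:, v] = -np.inf
            if not np.array_equal(order, np.arange(k)):
                changed = True
                W[:, :, j] = W[:, order, j]
        centroids = W.mean(axis=2)  # mean centroids
        if not changed:
            break  # stable point: no vector changed cluster
    return centroids, W
```

After the call, `W[:, i, :]` holds one vector from each of the p groups, all assigned to cluster i.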

TELF.factorization.decompositions.utilities.concensus_matrix module#

TELF.factorization.decompositions.utilities.concensus_matrix.compute_connectivity_mat(h)[source]#
TELF.factorization.decompositions.utilities.concensus_matrix.compute_consensus_matrix(H_all, pruned=False, pruned_cols=None)[source]#
TELF.factorization.decompositions.utilities.concensus_matrix.reorder_con_mat(C, k, return_index=False, method='HC')[source]#

TELF.factorization.decompositions.utilities.data_reshaping module#

TELF.factorization.decompositions.utilities.data_reshaping.fold(X, axis, shape)[source]#

Create a tensor from a matrix.

Parameters:
  • X (ndarray/sparse array) – An unfolded array.

  • axis (int) – Dimension number to fold on.

  • shape – Target tensor shape.

TELF.factorization.decompositions.utilities.data_reshaping.move_axis(X, source=None, target=None)[source]#

Create a matricized tensor.

Parameters:
  • X (ndarray/sparse array) – A tensor as a Numpy/Sparse Array.

  • axis (int) – Dimension number to unfold on.

TELF.factorization.decompositions.utilities.data_reshaping.unfold(X, axis=0)[source]#

Create a matricized tensor.

Parameters:
  • X (ndarray/sparse array) – A tensor as a Numpy/Sparse Array.

  • axis (int) – Dimension number to unfold on.
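As a rough illustration of what unfolding (matricization) and folding mean, here is a minimal dense NumPy sketch. The names `unfold_sketch` and `fold_sketch` are hypothetical, and the column ordering (C-order after moving the chosen axis to the front) is an assumption; TELF's own functions may order columns differently and also accept sparse arrays.

```python
import numpy as np

def unfold_sketch(X, axis=0):
    # Move the chosen mode to the front, then flatten the remaining modes.
    return np.reshape(np.moveaxis(X, axis, 0), (X.shape[axis], -1))

def fold_sketch(M, axis, shape):
    # Inverse of unfold_sketch: restore the remaining modes, then move
    # the leading mode back to its original position.
    full_shape = [shape[axis]] + [s for i, s in enumerate(shape) if i != axis]
    return np.moveaxis(np.reshape(M, full_shape), 0, axis)
```

Under this convention, `fold_sketch(unfold_sketch(X, axis), axis, X.shape)` recovers `X` for any valid axis.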

TELF.factorization.decompositions.utilities.generic_utils module#

TELF.factorization.decompositions.utilities.generic_utils.bary_proj(num_vars)[source]#

Computes the angles of the extreme rays for a generalized Barycentric plot.

Parameters:

num_vars (int) – Number of extreme rays in barycentric plot.

Returns:

theta (ndarray) – Angles of extreme points.

TELF.factorization.decompositions.utilities.generic_utils.get_cupyx(use_gpu)[source]#
TELF.factorization.decompositions.utilities.generic_utils.get_np(*args, **kwargs)[source]#
TELF.factorization.decompositions.utilities.generic_utils.get_scipy(*args, **kwargs)[source]#
TELF.factorization.decompositions.utilities.generic_utils.grid_eval(fn, grid, num_cpus=None, **kwargs)[source]#
TELF.factorization.decompositions.utilities.generic_utils.update_opts(defaults, custom)[source]#

TELF.factorization.decompositions.utilities.math_utils module#

TELF.factorization.decompositions.utilities.math_utils.bary_coords(H)[source]#

Computes generalized Barycentric coordinates from an activation matrix.

Parameters:

H (ndarray) – Nonnegative k by n activation matrix

Returns:

  • x (ndarray) – Generalized Barycentric x coordinate.

  • y (ndarray) – Generalized Barycentric y coordinate.

TELF.factorization.decompositions.utilities.math_utils.fro_norm(X, use_gpu=False)[source]#
TELF.factorization.decompositions.utilities.math_utils.get_pac(C, use_gpu=False, verbose=False)[source]#

Calculates the PAC (proportion of ambiguous clustering) score from consensus matrices.

Parameters:

C (ndarray, dense matrix) – 3D consensus matrix where dimensions are (num. of k, N, N)

Returns:

cdf – PAC calculation

Return type:

1d np.array

TELF.factorization.decompositions.utilities.math_utils.kl_divergence(X, Y)[source]#
TELF.factorization.decompositions.utilities.math_utils.masked_nmf(X, W, H, mask, itr=1000)[source]#
TELF.factorization.decompositions.utilities.math_utils.nan_to_num(X, num, copy=False)[source]#

Replaces NaN values in X with num.

TELF.factorization.decompositions.utilities.math_utils.norm_X(X)[source]#
TELF.factorization.decompositions.utilities.math_utils.nz_indices(X, use_gpu=False)[source]#
TELF.factorization.decompositions.utilities.math_utils.prune(X, use_gpu=False, other=None, keys_to_check_other=['MASK'])[source]#

Removes zero rows and columns from a matrix.

Parameters:
  • X (ndarray, sparse matrix) – Matrix to prune

  • use_gpu (Boolean) – Flag for whether decomposition is being performed on GPU or not.

Returns:

  • Y (scipy.sparse._csr.csr_matrix) – Pruned matrix

  • rows (ndarray) – Boolean array for all rows; True if non-zero row, else False

  • cols (ndarray) – Boolean array for all cols; True if non-zero col, else False
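The pruning step can be pictured with a small dense NumPy sketch. The helper name `prune_sketch` is hypothetical, and it ignores the `use_gpu`, `other`, and `keys_to_check_other` arguments as well as sparse inputs, all of which the real function handles.

```python
import numpy as np

def prune_sketch(X):
    # Boolean masks: True where a row / column has at least one nonzero.
    rows = np.any(X != 0, axis=1)
    cols = np.any(X != 0, axis=0)
    # Keep only the nonzero rows, then only the nonzero columns.
    return X[rows][:, cols], rows, cols
```

The returned masks are what make it possible to map factors of the pruned matrix back onto the original row and column positions.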

TELF.factorization.decompositions.utilities.math_utils.relative_error(X, W, H, MASK=None, normX=None)[source]#
Parameters:
  • X (sparse array, ndarray) – Shape $m \times n$ array or sparse array.

  • W (ndarray) – Shape $m \times k$ left factor of X.

  • H (ndarray) – Shape $k \times n$ right factor of X.

  • MASK (ndarray, optional) – Shape $m \times n$ matrix. Only consider errors where MASK == 1.

  • normX (double, optional) – Optional argument if you already know the norm of X.

Returns:

rel_err (double) – The relative error $\|X-WH\|_F/\|X\|_F$.
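The relative error formula can be written out in a few lines of NumPy. This is a dense-only sketch with a hypothetical name, `relative_error_sketch`; applying the mask to the residual only, and not to the norm of X, is an assumption about how MASK is used.

```python
import numpy as np

def relative_error_sketch(X, W, H, mask=None, norm_X=None):
    R = X - W @ H  # residual of the factorization
    if mask is not None:
        # Assumption: only residual entries where mask == 1 count.
        R = np.where(mask == 1, R, 0.0)
    if norm_X is None:
        norm_X = np.linalg.norm(X, "fro")
    return np.linalg.norm(R, "fro") / norm_X
```

Passing a precomputed `norm_X` avoids recomputing the Frobenius norm of X on every iteration of a factorization loop.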

TELF.factorization.decompositions.utilities.math_utils.relative_error_rescal(X, A, R, normX=None)[source]#
Parameters:
  • X (sparse array, ndarray) – Shape $m \times n$ array or sparse array.

  • W (ndarray) – Shape $m \times k$ left factor of X.

  • H (ndarray) – Shape $k \times n$ right factor of X.

  • normX (double, optional) – Optional argument if you already know the norm of X.

Returns:

rel_err (double) – The relative error $\|X-WH\|_F/\|X\|_F$.

TELF.factorization.decompositions.utilities.math_utils.relative_trinmf_error(X, W, S, H)[source]#
Parameters:
  • X (sparse array, ndarray) – Shape $m \times n$ array or sparse array.

  • W (ndarray) – Shape $m \times kw$ left factor of X.

  • S (ndarray) – Shape $kw \times kh$ middle factor of X.

  • H (ndarray) – Shape $kh \times n$ right factor of X.

Returns:

rel_err (double) – The relative error $\|X-WSH\|_F/\|X\|_F$.

TELF.factorization.decompositions.utilities.math_utils.sparse_divide_product(X, A, B, nz_rows=None, nz_cols=None, use_gpu=False)[source]#

Efficiently computes X/(A@B).

TELF.factorization.decompositions.utilities.math_utils.sparse_dot_product(X, A, B, nz_rows=None, nz_cols=None)[source]#

Efficiently computes (A@B).
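One common way to make these two operations cheap is to evaluate A@B only at the nonzero positions of X, rather than forming the full dense product. The sketch below illustrates that idea with dense NumPy arrays; the helper names are hypothetical, and restricting the computation to the nonzeros of X is an assumption about what "efficiently" means here, not a description of TELF's code.

```python
import numpy as np

def product_at_nonzeros(X, A, B):
    rows, cols = np.nonzero(X)
    # (A @ B)[r, c] for each nonzero position (r, c) of X only.
    ab = np.einsum("ik,ki->i", A[rows, :], B[:, cols])
    out = np.zeros(X.shape, dtype=float)
    out[rows, cols] = ab
    return out

def divide_at_nonzeros(X, A, B):
    rows, cols = np.nonzero(X)
    ab = np.einsum("ik,ki->i", A[rows, :], B[:, cols])
    out = np.zeros(X.shape, dtype=float)
    # X / (A @ B), evaluated only where X is nonzero.
    out[rows, cols] = X[rows, cols] / ab
    return out
```

The cost scales with the number of nonzeros in X times the inner dimension k, instead of with the full m by n size.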

TELF.factorization.decompositions.utilities.math_utils.unprune(A, indices, axis, use_gpu=False)[source]#

TELF.factorization.decompositions.utilities.nnsvd module#

TELF.factorization.decompositions.utilities.nnsvd.nnsvd(X, k, use_gpu=False)[source]#

Nonnegative SVD algorithm for NMF initialization, based on Gillis et al., https://arxiv.org/pdf/1807.04020.pdf.

Parameters:
  • X (ndarray) – Nonnegative m by n matrix to approximate with nnsvd.

  • k (int) – The desired rank of the nonnegative approximation.

Returns:

  • W (ndarray) – Nonnegative m by k left factor of X.

  • H (ndarray) – Nonnegative k by n right factor of X.

TELF.factorization.decompositions.utilities.resample module#

TELF.factorization.decompositions.utilities.resample.boolean(X, epsilon, use_gpu=False, random_state=None)[source]#

Perturbs a Boolean matrix by flipping entries. Positive noise flips 0s to 1s (additive noise); negative noise flips 1s to 0s (subtractive noise).

Parameters:
  • X (ndarray, sparse matrix) – Array of which to find a perturbation.

  • epsilon (float) – The perturbation amount.

  • random_state (int) – Random seed

Returns:

Y (ndarray) – The perturbed matrix.

TELF.factorization.decompositions.utilities.resample.poisson(X, use_gpu=False, random_state=None)[source]#

Resamples each element of a matrix from a Poisson distribution with the mean set by that element. Y_{i,j} = Poisson(X_{i,j})

Parameters:
  • X (ndarray, sparse matrix) – Array of which to find a perturbation.

  • random_state (int) – Random seed

Returns:

Y (ndarray) – The perturbed matrix.
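Poisson resampling as described, Y_{i,j} = Poisson(X_{i,j}), can be sketched with NumPy's random generator. The name `poisson_resample` is hypothetical, and the sketch handles only dense, nonnegative arrays on the CPU.

```python
import numpy as np

def poisson_resample(X, random_state=None):
    rng = np.random.default_rng(random_state)
    # Each entry is drawn from a Poisson distribution whose mean is X[i, j].
    return rng.poisson(X).astype(X.dtype)
```

Entries equal to zero stay zero under this scheme, so the sparsity pattern of X is never enlarged.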

TELF.factorization.decompositions.utilities.resample.uniform_product(X, epsilon, use_gpu=False, random_state=None)[source]#

Multiplies each element of X by a uniform random number in (1-epsilon, 1+epsilon).

Parameters:
  • X (ndarray, sparse matrix) – Array of which to find a perturbation.

  • epsilon (float) – The perturbation amount.

  • random_state (int) – Random seed

Returns:

Y (ndarray) – The perturbed matrix.
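The multiplicative uniform perturbation can be illustrated in the same way. The name `uniform_product_sketch` is hypothetical, and the sketch covers only dense arrays on the CPU.

```python
import numpy as np

def uniform_product_sketch(X, epsilon, random_state=None):
    rng = np.random.default_rng(random_state)
    # Independent multiplier per entry, drawn from (1 - epsilon, 1 + epsilon).
    noise = rng.uniform(1.0 - epsilon, 1.0 + epsilon, size=X.shape)
    return X * noise
```

Because the noise is multiplicative, each entry changes by at most a fraction epsilon of its own value, and zero entries are left unchanged.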

TELF.factorization.decompositions.utilities.silhouettes module#

TELF.factorization.decompositions.utilities.silhouettes.silhouettes(W_all, use_gpu=False)[source]#

Computes the cosine-distance silhouettes of a clustering of vectors.

Parameters:

W_all (ndarray) – Order three tensor of clustered vectors of shape m by k by p, where m is the ambient dimension of the vectors, k is the number of vectors in each group, and p is the number of groups of vectors.

Returns:

sils (ndarray) – The k by p array of silhouettes where sils[i,j] is the silhouette measure for the vector W_all[:,i,j].
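For readers unfamiliar with silhouettes, the following NumPy sketch computes the standard silhouette score (b - a) / max(a, b) with cosine distance on a clustered tensor of the same m by k by p layout. The name `cosine_silhouettes` is hypothetical; the sketch assumes k > 1, p > 1, and no zero vectors, and it may differ in detail from TELF's function.

```python
import numpy as np

def cosine_silhouettes(W_all):
    m, k, p = W_all.shape
    U = W_all / np.linalg.norm(W_all, axis=0, keepdims=True)
    # dist[i, a, j, b]: cosine distance between U[:, i, a] and U[:, j, b].
    dist = 1.0 - np.einsum("mia,mjb->iajb", U, U)
    sils = np.empty((k, p))
    for i in range(k):
        # a: mean distance to the other members of the same cluster
        # (the self-distance is zero, so divide by p - 1).
        own = dist[i, :, i, :].sum(axis=1) / (p - 1)
        # b: smallest mean distance to the members of any other cluster.
        others = [dist[i, :, j, :].mean(axis=1) for j in range(k) if j != i]
        nearest = np.min(others, axis=0)
        sils[i] = (nearest - own) / np.maximum(own, nearest)
    return sils
```

Values close to 1 indicate tight, well-separated clusters; values near 0 or below suggest that a vector sits between clusters or is assigned to the wrong one.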

TELF.factorization.decompositions.utilities.silhouettes.silhouettes_with_distance(W_all, distance='hamming', use_gpu=False)[source]#

Module contents#