TELF.factorization.decompositions.utilities package#
Submodules#
TELF.factorization.decompositions.utilities.bool_clustering module#
TELF.factorization.decompositions.utilities.bool_noise module#
TELF.factorization.decompositions.utilities.clustering module#
- TELF.factorization.decompositions.utilities.clustering.custom_bool_clustering(W_all, centroids=None, max_iters=100, distance='hamming', use_gpu=False)[source]#
Options for distance: ‘false negative’, ‘false positive’, or a distance metric from cdist. Change this function to use a different distance or different centroids.
- TELF.factorization.decompositions.utilities.clustering.custom_k_means(W_all, centroids=None, max_iters=100, use_gpu=False)[source]#
Greedy algorithm to approximate a quadratic assignment problem to cluster vectors. Given p groups of k vectors, construct k clusters, each cluster containing a single vector from each of the p groups. This clustering approximation uses cosine distances and mean centroids.
- Parameters:
W_all (ndarray) – Order three tensor of shape m by k by p, where m is the ambient dimension of the vectors, k is the number of vectors in each group, and p is the number of groups of vectors.
centroids (ndarray) – The m by k initialization of the centroids of the clusters. None corresponds to using the first slice, W_all[:,:,0], as the initial centroids. Defaults to None.
max_iters (int) – The maximum number of iterations of the algorithm. If a stable point has not been reached within max_iters iterations, a warning is given. Defaults to 100.
- Returns:
centroids (ndarray) – The m by k centroids of the clusters.
W_all (ndarray) – Clustered organization of the vectors. W_all[:,i,:] holds all p m-dimensional vectors in the ith cluster.
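The greedy assignment described above can be illustrated with a minimal NumPy sketch. This is not the TELF implementation: the function name `greedy_cluster` and the tie-breaking behaviour are assumptions made for illustration only. It matches each group's vectors to the current centroids by repeatedly taking the most similar remaining pair, then recomputes mean centroids.

```python
import numpy as np

def greedy_cluster(W_all, max_iters=100):
    """Hypothetical sketch: align p groups of k vectors to k mean centroids."""
    m, k, p = W_all.shape
    W_all = W_all.copy()
    centroids = W_all[:, :, 0].copy()  # first slice as the initial centroids

    def unit(A):
        # Normalize columns to unit length so dot products are cosines.
        return A / np.maximum(np.linalg.norm(A, axis=0, keepdims=True), 1e-12)

    for _ in range(max_iters):
        changed = False
        for q in range(p):
            sim = unit(centroids).T @ unit(W_all[:, :, q])  # k by k cosine similarities
            perm = np.empty(k, dtype=int)
            for _step in range(k):
                # Greedy step: take the best remaining (centroid, vector) pair.
                i, j = np.unravel_index(np.argmax(sim), sim.shape)
                perm[i] = j
                sim[i, :] = -np.inf
                sim[:, j] = -np.inf
            if not np.array_equal(perm, np.arange(k)):
                changed = True
                W_all[:, :, q] = W_all[:, perm, q]
        centroids = W_all.mean(axis=2)  # mean centroids
        if not changed:
            break  # stable point reached
    return centroids, W_all
```

The greedy pairing is only an approximation to the underlying assignment problem; an exact per-slice matching would need a linear assignment solver instead.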
TELF.factorization.decompositions.utilities.concensus_matrix module#
TELF.factorization.decompositions.utilities.data_reshaping module#
- TELF.factorization.decompositions.utilities.data_reshaping.fold(X, axis, shape)[source]#
Create a tensor from a matrix.
- Parameters:
X (ndarray, sparse array) – An unfolded array.
axis (int) – Dimension number to fold on.
shape – Target tensor shape.
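As an illustration of what folding means, the sketch below pairs a mode-`axis` unfolding with its inverse in plain NumPy. The helper names `unfold_sketch` and `fold_sketch` and the column ordering of the unfolded matrix are assumptions; the ordering used by TELF may differ.

```python
import numpy as np

def unfold_sketch(T, axis):
    # Move `axis` to the front, then flatten the remaining dimensions.
    return np.moveaxis(T, axis, 0).reshape(T.shape[axis], -1)

def fold_sketch(X, axis, shape):
    # Inverse of unfold_sketch: restore the moved-axis layout, then move the axis back.
    moved = [shape[axis]] + [s for i, s in enumerate(shape) if i != axis]
    return np.moveaxis(X.reshape(moved), 0, axis)

T = np.arange(24).reshape(2, 3, 4)
X = unfold_sketch(T, axis=1)          # 3 by 8 matrix
T_back = fold_sketch(X, axis=1, shape=T.shape)
```

Folding is only well defined relative to a fixed unfolding convention, which is why the two helpers are shown together.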
TELF.factorization.decompositions.utilities.generic_utils module#
- TELF.factorization.decompositions.utilities.generic_utils.bary_proj(num_vars)[source]#
Computes the angles of the extreme points for a generalized Barycentric plot.
- Parameters:
num_vars (int) – Number of extreme rays in barycentric plot.
- Returns:
Angles of extreme points.
- Return type:
theta (ndarray)
TELF.factorization.decompositions.utilities.math_utils module#
- TELF.factorization.decompositions.utilities.math_utils.bary_coords(H)[source]#
Computes generalized Barycentric coordinates from an activation matrix.
- Parameters:
H (ndarray) – Nonnegative k by n activation matrix
- Returns:
x (ndarray) – Generalized Barycentric x coordinate.
y (ndarray) – Generalized Barycentric y coordinate.
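One common way to obtain 2D coordinates of this kind is to place the k extreme points evenly on the unit circle and take the convex combination given by each normalized column of H. The sketch below follows that construction; the evenly spaced angles, the column normalization, and the name `bary_coords_sketch` are assumptions rather than a description of the TELF code.

```python
import numpy as np

def bary_coords_sketch(H):
    """Hypothetical sketch: map each column of a nonnegative k by n matrix to (x, y)."""
    k = H.shape[0]
    theta = 2 * np.pi * np.arange(k) / k        # assumed angles of the extreme points
    weights = H / H.sum(axis=0, keepdims=True)  # each column sums to one (no zero columns)
    x = np.cos(theta) @ weights
    y = np.sin(theta) @ weights
    return x, y
```

A column concentrated on a single component lands on the corresponding extreme point, while a column with equal weights lands at the origin.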
- TELF.factorization.decompositions.utilities.math_utils.get_pac(C, use_gpu=False, verbose=False)[source]#
Calculates PAC score from consensus matrices
- Parameters:
C (ndarray, dense matrix) – 3D consensus matrix where dimensions are (num. of k, N, N)
- Returns:
cdf – PAC calculation
- Return type:
1d np.array
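For orientation, the proportion of ambiguous clustering (PAC) is usually defined as the fraction of sample pairs whose consensus value falls strictly between a lower and an upper threshold. The sketch below computes that quantity for each k; the thresholds 0.1 and 0.9 and the name `pac_sketch` are assumptions, and the TELF function may use different bounds or a different estimator.

```python
import numpy as np

def pac_sketch(C, lower=0.1, upper=0.9):
    """Hypothetical sketch: PAC score per k for a (num_k, N, N) consensus stack."""
    n = C.shape[1]
    iu = np.triu_indices(n, k=1)  # each unordered pair once, diagonal excluded
    scores = np.empty(C.shape[0])
    for idx, Ck in enumerate(C):
        vals = Ck[iu]
        # Empirical CDF at `upper` minus empirical CDF at `lower`.
        scores[idx] = np.mean(vals <= upper) - np.mean(vals <= lower)
    return scores
```

A score near 0 indicates that almost every pair is consistently together or consistently apart; a score near 1 indicates an unstable clustering.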
- TELF.factorization.decompositions.utilities.math_utils.nan_to_num(X, num, copy=False)[source]#
Replaces nan in X with num
- TELF.factorization.decompositions.utilities.math_utils.prune(X, use_gpu=False, other=None, keys_to_check_other=['MASK'])[source]#
Removes zero rows and columns from a matrix
- Parameters:
X (ndarray, sparse matrix) – Matrix to prune
use_gpu (Boolean) – Flag for whether decomposition is being performed on GPU or not.
- Returns:
Y (scipy.sparse._csr.csr_matrix) – Pruned matrix
rows (ndarray) – Boolean array for all rows; True if non-zero row, else False
cols (ndarray) – Boolean array for all cols; True if non-zero col, else False
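The pruning step and its two Boolean masks can be sketched for a dense array as follows. This is a minimal illustration with a hypothetical name, `prune_sketch`; it ignores the `use_gpu`, `other`, and `keys_to_check_other` arguments and returns a dense array rather than a CSR matrix.

```python
import numpy as np

def prune_sketch(X):
    """Hypothetical dense-only sketch: drop all-zero rows and columns."""
    rows = np.any(X != 0, axis=1)  # True where the row has a nonzero entry
    cols = np.any(X != 0, axis=0)  # True where the column has a nonzero entry
    return X[rows][:, cols], rows, cols

X = np.array([[0, 0, 0],
              [1, 0, 2],
              [0, 0, 0]])
Y, rows, cols = prune_sketch(X)
```

Keeping `rows` and `cols` makes it possible to scatter factors computed on the pruned matrix back into the original index space.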
- TELF.factorization.decompositions.utilities.math_utils.relative_error(X, W, H, MASK=None, normX=None)[source]#
- input:
X (sparse array, ndarray): shape $m \times n$ array or sparse array.
W (ndarray): shape $m \times k$ left factor of X.
H (ndarray): shape $k \times n$ right factor of X.
MASK (ndarray, optional): shape $m \times n$ matrix. Only consider errors where MASK == 1.
normX (double, optional): Optional argument if you already know the norm of X.
- output:
rel_err (double): the relative error $||X-WH||_F/||X||_F$.
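The formula above translates directly into NumPy for dense inputs. The sketch below, with the hypothetical name `relative_error_sketch`, omits the MASK argument and sparse handling; it only shows how a precomputed `normX` avoids recomputing the norm of X.

```python
import numpy as np

def relative_error_sketch(X, W, H, normX=None):
    """Hypothetical dense-only sketch of ||X - WH||_F / ||X||_F."""
    if normX is None:
        normX = np.linalg.norm(X, "fro")
    return np.linalg.norm(X - W @ H, "fro") / normX

W = np.array([[1.0, 0.0], [0.0, 2.0], [1.0, 1.0]])
H = np.array([[1.0, 2.0], [3.0, 4.0]])
exact = relative_error_sketch(W @ H, W, H)          # exact factorization
worst = relative_error_sketch(W @ H, 0 * W, H)      # zero approximation
```

Passing `normX` is worthwhile when the same X is compared against many candidate factorizations.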
- TELF.factorization.decompositions.utilities.math_utils.relative_error_rescal(X, A, R, normX=None)[source]#
- input:
X (sparse array, ndarray): shape $m \times n$ array or sparse array.
W (ndarray): shape $m \times k$ left factor of X.
H (ndarray): shape $k \times n$ right factor of X.
normX (double, optional): Optional argument if you already know the norm of X.
- output:
rel_err (double): the relative error $||X-WH||_F/||X||_F$.
- TELF.factorization.decompositions.utilities.math_utils.relative_trinmf_error(X, W, S, H)[source]#
- input:
X (sparse array, ndarray): shape $m \times n$ array or sparse array.
W (ndarray): shape $m \times kw$ left factor of X.
S (ndarray): shape $kw \times kh$ middle factor of X.
H (ndarray): shape $kh \times n$ right factor of X.
- output:
rel_err (double): the relative error $||X-WSH||_F/||X||_F$.
- TELF.factorization.decompositions.utilities.math_utils.sparse_divide_product(X, A, B, nz_rows=None, nz_cols=None, use_gpu=False)[source]#
Efficiently computes X/(A@B).
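The efficiency gain for sparse X typically comes from evaluating the product A@B only at the positions where X is nonzero, instead of forming the full dense product. The sketch below shows that idea with plain NumPy index arrays; the name `divide_at_nonzeros` is hypothetical, and the way TELF uses `nz_rows` and `nz_cols` is not shown here.

```python
import numpy as np

def divide_at_nonzeros(X, A, B):
    """Hypothetical sketch: X / (A @ B), evaluated only where X is nonzero."""
    rows, cols = np.nonzero(X)
    # Dot product A[i, :] . B[:, j] for each nonzero position (i, j) only.
    denom = np.einsum("ij,ij->i", A[rows], B[:, cols].T)
    out = np.zeros(X.shape, dtype=float)
    out[rows, cols] = X[rows, cols] / denom
    return out

X = np.array([[0.0, 2.0], [3.0, 0.0]])
A = np.array([[1.0, 1.0], [1.0, 2.0]])
B = np.array([[1.0, 2.0], [3.0, 4.0]])
Q = divide_at_nonzeros(X, A, B)
```

The cost scales with the number of nonzeros times the inner dimension rather than with the full size of X.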
TELF.factorization.decompositions.utilities.nnsvd module#
- TELF.factorization.decompositions.utilities.nnsvd.nnsvd(X, k, use_gpu=False)[source]#
Nonnegative SVD algorithm for NMF initialization based off of Gillis et al. in https://arxiv.org/pdf/1807.04020.pdf.
- Parameters:
X (ndarray) – Nonnegative m by n matrix to approximate with nnsvd.
k (int) – The desired rank of the nonnegative approximation.
- Returns:
W (ndarray) – Nonnegative m by k left factor of X.
H (ndarray) – Nonnegative k by n right factor of X.
TELF.factorization.decompositions.utilities.resample module#
- TELF.factorization.decompositions.utilities.resample.boolean(X, epsilon, use_gpu=False, random_state=None)[source]#
Perturbs a Boolean matrix by flipping entries. Positive noise flips 0s to 1s (additive noise); negative noise flips 1s to 0s (subtractive noise).
- Parameters:
X (ndarray, sparse matrix) – Array of which to find a perturbation.
epsilon (float) – The perturbation amount.
random_state (int) – Random seed
- Returns:
The perturbed matrix.
- Return type:
Y (ndarray)
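A simple way to picture this kind of perturbation is to flip each entry independently with some probability. The sketch below does that with a single flip probability for both directions; treating `epsilon` as that probability and the name `boolean_flip_sketch` are assumptions, and the TELF function may parameterize positive and negative noise separately.

```python
import numpy as np

def boolean_flip_sketch(X, epsilon, random_state=None):
    """Hypothetical sketch: flip each Boolean entry with probability epsilon."""
    rng = np.random.default_rng(random_state)
    flips = rng.random(X.shape) < epsilon  # True where an entry is flipped
    # XOR turns 0 into 1 (additive noise) and 1 into 0 (subtractive noise).
    return np.logical_xor(X.astype(bool), flips).astype(X.dtype)

X = np.array([[1, 0, 0], [0, 1, 1]])
Y = boolean_flip_sketch(X, epsilon=0.1, random_state=0)
```

Fixing `random_state` makes the perturbation reproducible across runs.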
- TELF.factorization.decompositions.utilities.resample.poisson(X, use_gpu=False, random_state=None)[source]#
Resamples each element of a matrix from a Poisson distribution with the mean set by that element: $Y_{i,j} \sim \mathrm{Poisson}(X_{i,j})$.
- Parameters:
X (ndarray, sparse matrix) – Array of which to find a perturbation.
random_state (int) – Random seed
- Returns:
The perturbed matrix.
- Return type:
Y (ndarray)
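For a dense array, this resampling can be written in a few lines with NumPy's random generator. The sketch below uses the hypothetical name `poisson_sketch` and does not cover sparse or GPU inputs.

```python
import numpy as np

def poisson_sketch(X, random_state=None):
    """Hypothetical dense-only sketch: Y[i, j] ~ Poisson(X[i, j])."""
    rng = np.random.default_rng(random_state)
    return rng.poisson(X).astype(X.dtype)

X = np.array([[0.0, 5.0], [20.0, 1.0]])
Y = poisson_sketch(X, random_state=0)
```

Entries equal to zero stay zero, because a Poisson variable with mean zero is always zero.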
- TELF.factorization.decompositions.utilities.resample.uniform_product(X, epsilon, use_gpu=False, random_state=None)[source]#
Multiplies each element of X by a uniform random number in (1-epsilon, 1+epsilon).
- Parameters:
X (ndarray, sparse matrix) – Array of which to find a perturbation.
epsilon (float) – The perturbation amount.
random_state (int) – Random seed
- Returns:
The perturbed matrix.
- Return type:
Y (ndarray)
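The multiplicative perturbation can be sketched for a dense array as below. The name `uniform_product_sketch` is hypothetical, and sparse or GPU inputs are not handled.

```python
import numpy as np

def uniform_product_sketch(X, epsilon, random_state=None):
    """Hypothetical dense-only sketch: scale each entry by U(1 - epsilon, 1 + epsilon)."""
    rng = np.random.default_rng(random_state)
    return X * rng.uniform(1 - epsilon, 1 + epsilon, size=X.shape)

X = np.array([[1.0, 0.0], [4.0, 10.0]])
Y = uniform_product_sketch(X, epsilon=0.05, random_state=0)
```

Because the noise is multiplicative, zero entries remain zero and the sparsity pattern of X is preserved.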
TELF.factorization.decompositions.utilities.silhouettes module#
- TELF.factorization.decompositions.utilities.silhouettes.silhouettes(W_all, use_gpu=False)[source]#
Computes the cosine-distance silhouettes of a clustering of vectors.
- Parameters:
W_all (ndarray) – Order three tensor of clustered vectors of shape m by k by p, where m is the ambient dimension of the vectors, k is the number of vectors in each group, and p is the number of groups of vectors.
- Returns:
The k by p array of silhouettes where sils[i,j] is the silhouette measure for the vector W_all[:,i,j].
- Return type:
sils (ndarray)
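The standard silhouette definition, applied with cosine distance to the m by k by p layout above, can be sketched as follows. The name `cosine_silhouettes_sketch` is hypothetical; the sketch assumes k ≥ 2, p ≥ 2, nonzero vectors, and clusters that are not all identical, and it may differ in detail from the TELF implementation.

```python
import numpy as np

def cosine_silhouettes_sketch(W_all):
    """Hypothetical sketch: silhouette of each vector W_all[:, i, j] in cluster i."""
    m, k, p = W_all.shape
    U = W_all / np.linalg.norm(W_all, axis=0, keepdims=True)  # unit vectors
    # dist[i, a, j, b] = cosine distance between vectors (i, a) and (j, b).
    dist = 1.0 - np.einsum("mia,mjb->iajb", U, U)
    sils = np.empty((k, p))
    for i in range(k):
        # a: mean distance to the other p - 1 members of cluster i (self-distance is 0).
        a = dist[i, :, i, :].sum(axis=1) / (p - 1)
        # b: smallest mean distance to the members of any other cluster.
        others = [dist[i, :, j, :].mean(axis=1) for j in range(k) if j != i]
        b = np.min(others, axis=0)
        sils[i] = (b - a) / np.maximum(a, b)
    return sils
```

Values close to 1 indicate tight, well-separated clusters; values near 0 or below indicate that a vector sits as close to another cluster as to its own.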