TELF.post_processing.Fox: Report generation tool for text data from HNMFk using local LLMs

Report generation tool for text data from HNMFk using local LLMs.

Available Functions

Fox.__init__([summary_model, api_key, ...])

Fox.post_process(npz_path, vocabulary_path, ...)

Fox.post_process_stats(processing_path[, ...])

Fox.makeSummariesAndLabelsOpenAi(processing_path)

Fox.rename_cluster_dirs_from_stats(...)

Fox.getApiKey()

Fox.setApiKey(api_key)

Fox.getSummaryModel()

Fox.setSummaryModel(summary_model)

ClusterSummarizer.__init__([api_key, ...])

ClusterSummarizer.generate_labels_and_summaries(...)

DirectoryManager.__init__()

DirectoryManager.rename_from_stats(...)

Renames cluster directories using label and document count from stats.csv.

Module Contents

class TELF.post_processing.Fox.fox.ClusterSummarizer(api_key=None, summary_model=None)

Bases: object

generate_labels_and_summaries(processing_path: str)

class TELF.post_processing.Fox.fox.DirectoryManager

Bases: object

rename_from_stats(post_processed_df_path: str)

Renames cluster directories using label and document count from stats.csv.

Format:

<cluster>-<label>_<num_papers>-documents OR if no label: <cluster>-unlabeled_<num_papers>-documents
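The naming format above can be illustrated with a small standalone sketch. The helper `cluster_dir_name` is hypothetical and is not part of TELF; it only reproduces the documented pattern, and it assumes that a missing, empty, or non-string label (for example a NaN read from stats.csv) counts as "no label". Whether TELF sanitizes spaces or other characters in labels is not stated here, so the sketch leaves the label untouched.

```python
def cluster_dir_name(cluster, label, num_papers):
    """Build a directory name following the documented format.

    Hypothetical helper for illustration only, not TELF's implementation.
    """
    # Assumption: anything that is not a non-empty string means "no label".
    has_label = isinstance(label, str) and label.strip() != ""
    shown_label = label.strip() if has_label else "unlabeled"
    return f"{cluster}-{shown_label}_{num_papers}-documents"


if __name__ == "__main__":
    print(cluster_dir_name(3, "graph neural networks", 120))
    print(cluster_dir_name(7, None, 45))
```

The labeled and unlabeled cases differ only in the middle segment, so a directory name can be split back into cluster, label, and document count by the first `-` and the last `_`.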

class TELF.post_processing.Fox.fox.Fox(summary_model=None, api_key=None, verbose=False, debug=False)

Bases: object

getApiKey()

getSummaryModel()

makeSummariesAndLabelsOpenAi(processing_path: str)

post_process(npz_path: str, vocabulary_path: str, src_decomp_data_path: str, output_dir: str = None, top_words_per_cluster: int = 50, clean_cols_name: str = 'clean_title_abstract', terms: list = None) -> str

post_process_stats(processing_path: str, clean_cols_name: str = 'clean_title_abstract')

rename_cluster_dirs_from_stats(processing_path: str)

setApiKey(api_key)

setSummaryModel(summary_model)
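A minimal sketch of how the Fox methods listed above might be chained is shown below. It uses only the signatures documented on this page, but the call order and the use of the string returned by `post_process` as the `processing_path` for the later steps are assumptions, not documented behavior. The wrapper `run_fox_pipeline` and its argument handling are hypothetical.

```python
def run_fox_pipeline(npz_path, vocabulary_path, src_decomp_data_path,
                     output_dir=None, summary_model=None, api_key=None):
    """Hypothetical wrapper chaining the documented Fox methods."""
    # Imported lazily so this sketch can be loaded without TELF installed.
    from TELF.post_processing.Fox.fox import Fox

    fox = Fox(summary_model=summary_model, api_key=api_key, verbose=True)

    # Assumption: the returned string is the directory that the
    # later steps expect as processing_path.
    processing_path = fox.post_process(
        npz_path=npz_path,
        vocabulary_path=vocabulary_path,
        src_decomp_data_path=src_decomp_data_path,
        output_dir=output_dir,
    )

    # Assumption: stats are computed before labels and summaries,
    # and directories are renamed last.
    fox.post_process_stats(processing_path)
    fox.makeSummariesAndLabelsOpenAi(processing_path)
    fox.rename_cluster_dirs_from_stats(processing_path)
    return processing_path
```

The remaining keyword arguments of `post_process` (`top_words_per_cluster`, `clean_cols_name`, `terms`) are left at their documented defaults here; pass them explicitly if the source data uses a different text column.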