TELF.post_processing.ArcticFox: Report generation tool for text data from HNMFk using local LLMs
Report generation tool for text data from HNMFk using local LLMs.
Available Functions
| run_full_pipeline | Run any subset of the pipeline while preserving order. |
Module Contents
- class TELF.post_processing.ArcticFox.arcticfox.ArcticFox(model, embedding_model='SCINCL', distance_metric='cosine', center_metric='centroid', text_cols=None, top_n_words=50, clean_cols_name='clean_title_abstract', col_year='year', col_type='type', col_cluster='cluster', col_cluster_coords='cluster_coordinates', col_similarity='similarity_to_cluster_centroid')
Bases: object
- run_full_pipeline(vocab, data_df, text_column: str | None = None, ollama_model: str = 'llama3.2:3b-instruct-fp16', label_clusters: bool = True, generate_stats: bool = True, generate_visuals: bool = True, process_parents: bool = True, skip_completed: bool = True, label_criteria=None, label_info=None, number_of_labels: int = 5, steps: Sequence[Literal['post', 'label', 'stats']] | None = None)
Run any subset of the pipeline while preserving order:
‘post’ → post_process_hnmfk
‘label’ → _label_all_clusters (requires ‘post’ artifacts)
‘stats’ → generate_cluster_stats (requires ‘post’ artifacts)
- Rules:
‘label’ and/or ‘stats’ can be run without ‘post’ only if the ‘post’ artifacts already exist.
The order is always post → label → stats, regardless of the order in which the steps are requested.
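The ordering and artifact rules above can be illustrated with a small standalone sketch. This is not TELF's implementation: `resolve_steps` and its `artifacts_exist` flag are hypothetical names introduced here, and treating `steps=None` as "run every step" is an assumption, since the docstring does not state the default behavior.

```python
from typing import List, Optional, Sequence

# Canonical order taken from the docstring: post -> label -> stats.
CANONICAL_ORDER = ("post", "label", "stats")


def resolve_steps(steps: Optional[Sequence[str]], artifacts_exist: bool) -> List[str]:
    """Hypothetical helper: return the requested steps in canonical order.

    `artifacts_exist` stands in for whatever check detects output
    from an earlier 'post' run.
    """
    # Assumption: steps=None means "run everything".
    requested = set(CANONICAL_ORDER if steps is None else steps)

    unknown = requested - set(CANONICAL_ORDER)
    if unknown:
        raise ValueError(f"Unknown steps: {sorted(unknown)}")

    # 'label' and 'stats' need 'post' artifacts, produced either by
    # this run or by a previous one.
    if requested and "post" not in requested and not artifacts_exist:
        raise RuntimeError("'label'/'stats' require 'post' artifacts; run 'post' first.")

    # Filter the canonical tuple so the caller's ordering is ignored.
    return [step for step in CANONICAL_ORDER if step in requested]
```

Under these assumptions, requesting `["stats", "post"]` resolves to `["post", "stats"]`, while requesting `["label"]` alone succeeds only when artifacts from an earlier 'post' run are present.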