AdaptNLP training API

class SequenceClassifierTrainer

SequenceClassifierTrainer(corpus:Union[Corpus, Path, str], encoder:Union[EasyDocumentEmbeddings, Path, str], column_name_map:None, corpus_in_memory:bool=True, predictive_head:str='flair', **kwargs)

Trainer for sequence classification models, using either a Flair or a Transformers prediction head.

Usage:

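>>> # doc_embeddings is an EasyDocumentEmbeddings object created beforehand
>>> sc_trainer = SequenceClassifierTrainer(corpus="/Path/to/data/dir", encoder=doc_embeddings, column_name_map={0: "text", 1: "label"})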

Parameters:

  • corpus - A Flair Corpus object, or a Path/string pointing to a directory containing train.csv, test.csv, and dev.csv
  • encoder - An EasyDocumentEmbeddings object when training with a Flair prediction head, or a Path/string when training with a Transformers prediction model
  • column_name_map - Required if corpus is not a Corpus object. A dictionary mapping CSV column indices to the text and label columns, e.g. {1: "text", 2: "label"}
  • corpus_in_memory - Boolean; whether to store corpus embeddings in memory
  • predictive_head - The prediction head to use; currently either "flair" or "transformers"
  • **kwargs - Keyword arguments passed to Flair's TextClassifier model class
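
To make the column_name_map convention concrete, here is a minimal standard-library sketch of how a dictionary of column indices can be applied to CSV rows. The helper read_labeled_rows and the sample data are hypothetical and are not part of AdaptNLP or Flair; the sketch assumes zero-based column indices and a CSV without a header row.

```python
import csv
import io


def read_labeled_rows(csv_text, column_name_map):
    """Map each CSV row to a dict keyed by the names in column_name_map.

    Hypothetical helper for illustration only. Assumes zero-based column
    indices and no header row.
    """
    reader = csv.reader(io.StringIO(csv_text))
    rows = []
    for row in reader:
        if not row:  # skip blank lines
            continue
        rows.append({name: row[index] for index, name in column_name_map.items()})
    return rows


# Text in the first column, label in the second.
sample_csv = "great movie,positive\nterrible plot,negative\n"
rows = read_labeled_rows(sample_csv, {0: "text", 1: "label"})

# With an extra leading id column, the indices shift by one.
sample_with_id = "17,great movie,positive\n"
rows_with_id = read_labeled_rows(sample_with_id, {1: "text", 2: "label"})
```

The same dictionary shape is what the column_name_map parameter expects; only the indices change with the layout of the CSV files in the corpus directory.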

Usage Examples:

Sample code from: 20c_tutorial.flair_seq_class_trainer.ipynb (View Notebook for more context)

sc_configs = {
    "corpus": corpus,
    "encoder": doc_embeddings,
    "column_name_map": {0: "text", 1: "label"},
    "corpus_in_memory": True,
    "predictive_head": "flair",
}
sc_trainer = SequenceClassifierTrainer(**sc_configs)
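
Before constructing the trainer, it can help to check a config dictionary against the parameter rules documented for this class. The sketch below uses a hypothetical helper, check_sc_configs, which is not part of AdaptNLP. It encodes only the two documented constraints: predictive_head must be "flair" or "transformers", and column_name_map is required when corpus is a path or string.

```python
from pathlib import Path

# The two prediction heads named in the parameter documentation.
ALLOWED_HEADS = {"flair", "transformers"}


def check_sc_configs(configs):
    """Return a list of problems found in a trainer config dict.

    Hypothetical helper for illustration; an empty list means the two
    documented constraints are satisfied.
    """
    problems = []

    head = configs.get("predictive_head", "flair")
    if head not in ALLOWED_HEADS:
        problems.append(
            f"predictive_head must be one of {sorted(ALLOWED_HEADS)}, got {head!r}"
        )

    corpus = configs.get("corpus")
    if isinstance(corpus, (str, Path)) and not configs.get("column_name_map"):
        problems.append("column_name_map is required when corpus is a path or string")

    return problems


# A path-based corpus without a column map violates the second rule.
incomplete = {"corpus": "/Path/to/data/dir", "predictive_head": "flair"}
complete = dict(incomplete, column_name_map={0: "text", 1: "label"})

problems_incomplete = check_sc_configs(incomplete)
problems_complete = check_sc_configs(complete)
```

Returning a list of messages rather than raising on the first problem lets a caller report every issue at once.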
