AdaptNLP Embeddings Module

class EasyWordEmbeddings[source]

EasyWordEmbeddings()

Word embeddings from the latest language models

Usage:

>>> import adaptnlp
>>> embeddings = adaptnlp.EasyWordEmbeddings()
>>> embeddings.embed_text("text you want embeddings for", model_name_or_path="bert-base-cased")

Usage Examples:

Sample code from: 04a_tutorial.embeddings.ipynb (View Notebook for more context)

embeddings = EasyWordEmbeddings()

EasyWordEmbeddings.embed_text[source]

EasyWordEmbeddings.embed_text(text:Union[List[Sentence], Sentence, List[str], str], model_name_or_path:Union[str, HFModelResult, FlairModelResult]='bert-base-cased')

Produces embeddings for text

Parameters:

  • text - Text input: a string, a list of strings, a Flair Sentence, or a list of Sentences
  • model_name_or_path - The hosted model name key, model path, or an instance of either HFModelResult or FlairModelResult

Return:

  • A list of Flair's Sentences
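
The signature accepts four input shapes but always returns a list. A minimal sketch of how such inputs could be normalized, assuming nothing about AdaptNLP's internals: the `Sentence` dataclass here is a hypothetical stand-in for Flair's Sentence, and `to_sentences` is an illustrative helper, not a library function.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class Sentence:
    """Hypothetical stand-in for flair.data.Sentence; it only holds raw text."""
    text: str


TextInput = Union[List[Sentence], Sentence, List[str], str]


def to_sentences(text: TextInput) -> List[Sentence]:
    """Normalize any accepted input form into a list of Sentence objects."""
    if isinstance(text, str):
        return [Sentence(text)]
    if isinstance(text, Sentence):
        return [text]
    # Remaining case: a list whose items are strings or Sentence objects.
    return [Sentence(t) if isinstance(t, str) else t for t in text]


print(len(to_sentences("one string")))         # 1
print(len(to_sentences(["first", "second"])))  # 2
```

Whatever the input shape, the caller can index the result the same way, which is why the samples on this page use `sentences[0]` even for a single string.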

EasyWordEmbeddings.embed_all[source]

EasyWordEmbeddings.embed_all(text:Union[List[Sentence], Sentence, List[str], str], *model_names_or_paths:str)

Embeds text with all embedding models loaded

Parameters:

  • text - Text input: a string, a list of strings, a Flair Sentence, or a list of Sentences
  • model_names_or_paths - A variable number of model names or paths to embed the text with

Return:

  • A list of Flair's Sentences
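
embed_all is described as embedding text with every loaded model, while also taking an optional list of names. The sketch below shows one plausible reading of that contract and is not the AdaptNLP implementation: `ToyWordEmbeddings`, its `load` method, and the toy embedders are all hypothetical, and the fallback to every loaded model when no names are passed is an assumption.

```python
from typing import Callable, Dict, List

# Hypothetical stand-in: an "embedder" maps a token to a vector of floats.
Embedder = Callable[[str], List[float]]


class ToyWordEmbeddings:
    """Illustrative only; not the AdaptNLP implementation."""

    def __init__(self) -> None:
        self.models: Dict[str, Embedder] = {}

    def load(self, name: str, embedder: Embedder) -> None:
        """Register an embedder under a model name."""
        self.models[name] = embedder

    def embed_all(self, text: str, *model_names: str) -> Dict[str, List[List[float]]]:
        # Assumption: with no names given, use every model loaded so far.
        names = model_names or tuple(self.models)
        tokens = text.split()
        return {name: [self.models[name](tok) for tok in tokens] for name in names}


toy = ToyWordEmbeddings()
toy.load("length", lambda tok: [float(len(tok))])
toy.load("vowels", lambda tok: [float(sum(c in "aeiou" for c in tok.lower()))])

result = toy.embed_all("I like physics")
print(sorted(result))  # ['length', 'vowels']
```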

class EasyStackedEmbeddings[source]

EasyStackedEmbeddings(*embeddings:str)

Word Embeddings that have been concatenated and "stacked" as specified by flair

Usage:

>>> import adaptnlp
>>> embeddings = adaptnlp.EasyStackedEmbeddings("bert-base-cased", "gpt2", "xlnet-base-cased")

Parameters:

  • *embeddings - Non-keyword variable number of strings specifying the embeddings you want to stack

Usage Examples:

Sample code from: 04a_tutorial.embeddings.ipynb (View Notebook for more context)

embeddings = EasyStackedEmbeddings("bert-base-cased", "distilbert-base-cased")

Sample code from: 04_embeddings.ipynb (View Notebook for more context)

import torch
from fastcore.test import test_eq
from adaptnlp import EasyStackedEmbeddings
embeddings = EasyStackedEmbeddings("bert-base-cased", "xlnet-base-cased")
sentences = embeddings.embed_text("This is Albert.  My last name is Einstein.  I like physics and atoms.")
test_eq(sentences[0][0].get_embedding().shape, torch.Size([1536]))
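
The expected size of 1536 in the sample above follows from stacking as concatenation: bert-base-cased and xlnet-base-cased each have a hidden size of 768. A minimal sketch of that arithmetic, using placeholder vectors instead of real model output; the `stack` helper is hypothetical and not part of AdaptNLP or Flair.

```python
from typing import List


def stack(*vectors: List[float]) -> List[float]:
    """Concatenate per-token vectors from several models into one vector."""
    stacked: List[float] = []
    for vec in vectors:
        stacked.extend(vec)
    return stacked


# Placeholder vectors; 768 is the hidden size of both base models.
bert_vec = [0.0] * 768
xlnet_vec = [1.0] * 768

token_vec = stack(bert_vec, xlnet_vec)
print(len(token_vec))  # 1536
```

Stacking a third model would add that model's hidden size to the total in the same way.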

EasyStackedEmbeddings.embed_text[source]

EasyStackedEmbeddings.embed_text(text:Union[List[Sentence], Sentence, List[str], str])

Produces stacked word embeddings for text

Parameters:

  • text - Text input: a string, a list of strings, a Flair Sentence, or a list of Sentences

Return:

  • A list of Flair's Sentences

Usage Examples:

Sample code from: 04_embeddings.ipynb (View Notebook for more context)

import torch
from fastcore.test import test_eq
from adaptnlp import EasyStackedEmbeddings
embeddings = EasyStackedEmbeddings("bert-base-cased", "xlnet-base-cased")
sentences = embeddings.embed_text("This is Albert.  My last name is Einstein.  I like physics and atoms.")
test_eq(sentences[0][0].get_embedding().shape, torch.Size([1536]))

class EasyDocumentEmbeddings[source]

EasyDocumentEmbeddings(*embeddings:str, methods:List[str]=['rnn', 'pool'], configs:Dict[KT, VT]={'pool_configs': {'fine_tune_mode': 'linear', 'pooling': 'mean'}, 'rnn_configs': {'hidden_size': 512, 'rnn_layers': 1, 'reproject_words': True, 'reproject_words_dimension': 256, 'bidirectional': False, 'dropout': 0.5, 'word_dropout': 0.0, 'locked_dropout': 0.0, 'rnn_type': 'GRU', 'fine_tune': True}})

Document Embeddings generated by pool and rnn methods applied to the word embeddings of text

Usage:

>>> import adaptnlp
>>> embeddings = adaptnlp.EasyDocumentEmbeddings("bert-base-cased", "xlnet-base-cased", methods=["rnn"])

Parameters:

  • *embeddings - Non-keyword variable number of strings referring to model names or paths
  • methods - A list of strings specifying which document embeddings to use, e.g. ["rnn", "pool"] (avoids unnecessary loading of models if only one is used)
  • configs - A dictionary of configurations for flair's rnn and pool document embeddings
    >>> example_configs = {"pool_configs": {"fine_tune_mode": "linear", "pooling": "mean", },
    ...                   "rnn_configs": {"hidden_size": 512,
    ...                                   "rnn_layers": 1,
    ...                                   "reproject_words": True,
    ...                                   "reproject_words_dimension": 256,
    ...                                   "bidirectional": False,
    ...                                   "dropout": 0.5,
    ...                                   "word_dropout": 0.0,
    ...                                   "locked_dropout": 0.0,
    ...                                   "rnn_type": "GRU",
    ...                                   "fine_tune": True, },
    ...                  }
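
The text does not say whether a partial `configs` dictionary is merged with the defaults, so a cautious approach is to pass a complete one. The sketch below builds a full dictionary from the defaults shown in the signature and applies per-section overrides. `DEFAULT_CONFIGS` and `with_overrides` are hypothetical names introduced for illustration, not part of AdaptNLP.

```python
import copy
from typing import Any, Dict

# Defaults copied from the EasyDocumentEmbeddings signature.
DEFAULT_CONFIGS: Dict[str, Dict[str, Any]] = {
    "pool_configs": {"fine_tune_mode": "linear", "pooling": "mean"},
    "rnn_configs": {
        "hidden_size": 512,
        "rnn_layers": 1,
        "reproject_words": True,
        "reproject_words_dimension": 256,
        "bidirectional": False,
        "dropout": 0.5,
        "word_dropout": 0.0,
        "locked_dropout": 0.0,
        "rnn_type": "GRU",
        "fine_tune": True,
    },
}


def with_overrides(**overrides: Dict[str, Any]) -> Dict[str, Dict[str, Any]]:
    """Return a complete configs dict: defaults plus per-section overrides."""
    configs = copy.deepcopy(DEFAULT_CONFIGS)  # never mutate the defaults
    for section, values in overrides.items():
        if section not in configs:
            raise KeyError(f"unknown config section: {section}")
        configs[section].update(values)
    return configs


configs = with_overrides(rnn_configs={"rnn_type": "LSTM", "bidirectional": True})
print(configs["rnn_configs"]["rnn_type"])     # LSTM
print(configs["rnn_configs"]["hidden_size"])  # 512
```

The result can then be passed as the `configs` argument, leaving every key that was not overridden at its default value.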
    

Usage Examples:

Sample code from: 04a_tutorial.embeddings.ipynb (View Notebook for more context)

embeddings = EasyDocumentEmbeddings("bert-base-cased", "distilbert-base-cased")

Sample code from: 20c_tutorial.flair_seq_class_trainer.ipynb (View Notebook for more context)

OUTPUT_DIR = "Path/to/model/output/directory"
FINETUNED_MODEL_DIR = "Path/to/finetuned/model/directory"
doc_embeddings = EasyDocumentEmbeddings(FINETUNED_MODEL_DIR, methods = ["rnn"])

Sample code from: 04_embeddings.ipynb (View Notebook for more context)

import torch
from fastcore.test import test_eq
from adaptnlp import EasyDocumentEmbeddings
embeddings = EasyDocumentEmbeddings("bert-base-cased", "xlnet-base-cased")
text = embeddings.embed_pool("This is Albert.  My last name is Einstein.  I like physics and atoms.")
test_eq(text[0].get_embedding().shape, torch.Size([1536]))

EasyDocumentEmbeddings.embed_pool[source]

EasyDocumentEmbeddings.embed_pool(text:Union[List[Sentence], Sentence, List[str], str])

Generate document embeddings from the stacked word embeddings with DocumentPoolEmbeddings

Parameters:

  • text - Text input: a string, a list of strings, a Flair Sentence, or a list of Sentences

Return:

  • A list of Flair's Sentences
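
With the default `"pooling": "mean"` setting, a pooled document embedding can be pictured as the element-wise average of the token vectors, so it keeps the token dimensionality (1536 for the two stacked base models used in the samples on this page). A minimal sketch of mean pooling on toy vectors; `mean_pool` is a hypothetical helper, and the sketch ignores the `fine_tune_mode` setting.

```python
from typing import List


def mean_pool(token_vectors: List[List[float]]) -> List[float]:
    """Average token vectors element-wise into one document vector."""
    if not token_vectors:
        raise ValueError("need at least one token vector")
    n = len(token_vectors)
    # zip(*...) walks the vectors column by column (one column per dimension).
    return [sum(column) / n for column in zip(*token_vectors)]


tokens = [[1.0, 2.0, 3.0], [3.0, 4.0, 5.0]]
doc_vec = mean_pool(tokens)
print(doc_vec)  # [2.0, 3.0, 4.0]
```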

Usage Examples:

Sample code from: 04_embeddings.ipynb (View Notebook for more context)

import torch
from fastcore.test import test_eq
from adaptnlp import EasyDocumentEmbeddings
embeddings = EasyDocumentEmbeddings("bert-base-cased", "xlnet-base-cased")
text = embeddings.embed_pool("This is Albert.  My last name is Einstein.  I like physics and atoms.")
test_eq(text[0].get_embedding().shape, torch.Size([1536]))

EasyDocumentEmbeddings.embed_rnn[source]

EasyDocumentEmbeddings.embed_rnn(text:Union[List[Sentence], Sentence, List[str], str])

Generate document embeddings from the stacked word embeddings with DocumentRNNEmbeddings

Parameters:

  • text - Text input: a string, a list of strings, a Flair Sentence, or a list of Sentences

Return:

  • A list of Flair's Sentences