AdaptiveModel implementations for token tagging and classification using Transformers and Flair

class TransformersTokenTagger[source]

TransformersTokenTagger(tokenizer:PreTrainedTokenizer, model:PreTrainedModel) :: AdaptiveModel

Adaptive model for a Transformers token tagger model

Usage:

>>> tagger = TransformersTokenTagger.load("dbmdz/bert-large-cased-finetuned-conll03-english")
>>> tagger.predict(text="Example text", mini_batch_size=32)

Parameters:

  • tokenizer - A tokenizer object from Hugging Face's transformers or tokenizers libraries
  • model - A transformers token tagger model
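
Since the constructor takes a tokenizer/model pair directly, the tagger can also be built from objects you have already instantiated rather than through load. A minimal sketch, assuming any transformers token-classification checkpoint paired with its matching tokenizer works here (the model name is just an example):

from transformers import AutoTokenizer, AutoModelForTokenClassification

# Instantiate the Hugging Face pieces yourself...
name = "dbmdz/bert-large-cased-finetuned-conll03-english"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForTokenClassification.from_pretrained(name)

# ...then hand them to the adaptive model
tagger = TransformersTokenTagger(tokenizer, model)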

Usage Examples:

Sample code from: 05_token_classification.ipynb (View Notebook for more context)

tagger = TransformersTokenTagger.load("dbmdz/bert-large-cased-finetuned-conll03-english")
pred = tagger.predict(text='Novetta Solutions is the best. Albert Einstein used to be employed at Novetta Solutions. The Wright brothers loved to visit the JBF headquarters, and they would have a chat with Albert.', mini_batch_size=32)
baseline = [[{'entity_group': 'I-ORG',
   'score': 0.998292068640391,
   'word': 'Novetta Solutions',
   'offsets': (0, 3)},
  {'entity_group': 'I-PER',
   'score': 0.9985582232475281,
   'word': 'Albert Einstein',
   'offsets': (7, 9)},
  {'entity_group': 'I-ORG',
   'score': 0.9970489343007406,
   'word': 'Novetta Solutions',
   'offsets': (14, 17)},
  {'entity_group': 'I-PER',
   'score': 0.9961656928062439,
   'word': 'Wright',
   'offsets': (19, 20)},
  {'entity_group': 'I-ORG',
   'score': 0.9933501183986664,
   'word': 'JBF',
   'offsets': (25, 27)}]]

# test_eq and test_close come from fastcore.test (imported earlier in the notebook)
for base, p in zip(baseline, pred):
    for base_items, p_items in zip(base, p):
        test_eq(base_items['entity_group'], p_items['entity_group'])
        test_close(base_items['score'], p_items['score'], 1e-3)
        test_eq(base_items['word'], p_items['word'])
        test_eq(base_items['offsets'], p_items['offsets'])

TransformersTokenTagger.load[source]

TransformersTokenTagger.load(model_name_or_path:str)

Class method for loading and constructing this tagger

  • model_name_or_path - A key string for one of Transformers' pre-trained token tagger models, or an HFModelResult

Note: To search for valid models, you should use the AdaptNLP model_hub API
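
For instance, a hedged sketch of that search flow, assuming adaptnlp's HFModelHub exposes a task-based search (the method names here are assumptions and may differ between versions, so check the model_hub docs):

from adaptnlp import HFModelHub

hub = HFModelHub()
# Assumed helper: search the Hugging Face hub for token-classification checkpoints
models = hub.search_model_by_task("token-classification")

# Either a returned HFModelResult or a plain name string can be passed to load
tagger = TransformersTokenTagger.load(models[0])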

Usage Examples:

Sample code from: 05_token_classification.ipynb (View Notebook for more context)

tagger = TransformersTokenTagger.load("dbmdz/bert-large-cased-finetuned-conll03-english")
pred = tagger.predict(text='Novetta Solutions is the best. Albert Einstein used to be employed at Novetta Solutions. The Wright brothers loved to visit the JBF headquarters, and they would have a chat with Albert.', mini_batch_size=32)
baseline = [[{'entity_group': 'I-ORG',
   'score': 0.998292068640391,
   'word': 'Novetta Solutions',
   'offsets': (0, 3)},
  {'entity_group': 'I-PER',
   'score': 0.9985582232475281,
   'word': 'Albert Einstein',
   'offsets': (7, 9)},
  {'entity_group': 'I-ORG',
   'score': 0.9970489343007406,
   'word': 'Novetta Solutions',
   'offsets': (14, 17)},
  {'entity_group': 'I-PER',
   'score': 0.9961656928062439,
   'word': 'Wright',
   'offsets': (19, 20)},
  {'entity_group': 'I-ORG',
   'score': 0.9933501183986664,
   'word': 'JBF',
   'offsets': (25, 27)}]]

for base, p in zip(baseline, pred):
    for base_items, p_items in zip(base, p):
        test_eq(base_items['entity_group'], p_items['entity_group'])
        test_close(base_items['score'], p_items['score'], 1e-3)
        test_eq(base_items['word'], p_items['word'])
        test_eq(base_items['offsets'], p_items['offsets'])

TransformersTokenTagger.predict[source]

TransformersTokenTagger.predict(text:Union[List[str], str], mini_batch_size:int=32, grouped_entities:bool=True, **kwargs)

Predict method for running inference using the pre-trained token tagger model. Returns a list of lists of tagged entities.

  • text - A string or list of strings to run inference on
  • mini_batch_size - Mini batch size
  • grouped_entities - Set to True to return whole entity span strings rather than per-token tags (default True); see the sketch below
  • **kwargs (Optional) - Optional arguments passed to the Transformers tagger
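
A minimal sketch (not from the notebook) contrasting the two modes; with grouped_entities=False the output follows the underlying per-token format of the transformers pipeline rather than the merged spans shown in the baseline below:

tagger = TransformersTokenTagger.load("dbmdz/bert-large-cased-finetuned-conll03-english")

# Default: sub-tokens are merged into whole entity spans ('Albert Einstein', ...)
grouped = tagger.predict(text="Albert Einstein worked at Novetta Solutions.")

# grouped_entities=False: one prediction per token/word piece instead
ungrouped = tagger.predict(
    text="Albert Einstein worked at Novetta Solutions.",
    grouped_entities=False,
)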

Usage Examples:

Sample code from: 05_token_classification.ipynb (View Notebook for more context)

tagger = TransformersTokenTagger.load("dbmdz/bert-large-cased-finetuned-conll03-english")
pred = tagger.predict(text='Novetta Solutions is the best. Albert Einstein used to be employed at Novetta Solutions. The Wright brothers loved to visit the JBF headquarters, and they would have a chat with Albert.', mini_batch_size=32)
baseline = [[{'entity_group': 'I-ORG',
   'score': 0.998292068640391,
   'word': 'Novetta Solutions',
   'offsets': (0, 3)},
  {'entity_group': 'I-PER',
   'score': 0.9985582232475281,
   'word': 'Albert Einstein',
   'offsets': (7, 9)},
  {'entity_group': 'I-ORG',
   'score': 0.9970489343007406,
   'word': 'Novetta Solutions',
   'offsets': (14, 17)},
  {'entity_group': 'I-PER',
   'score': 0.9961656928062439,
   'word': 'Wright',
   'offsets': (19, 20)},
  {'entity_group': 'I-ORG',
   'score': 0.9933501183986664,
   'word': 'JBF',
   'offsets': (25, 27)}]]

for base, p in zip(baseline, pred):
    for base_items, p_items in zip(base, p):
        test_eq(base_items['entity_group'], p_items['entity_group'])
        test_close(base_items['score'], p_items['score'], 1e-3)
        test_eq(base_items['word'], p_items['word'])
        test_eq(base_items['offsets'], p_items['offsets'])

class FlairTokenTagger[source]

FlairTokenTagger(model_name_or_path:str) :: AdaptiveModel

A basic adaptive model for Flair's token tagger

Usage:

>>> tagger = FlairTokenTagger.load("flair/chunk-english-fast")
>>> tagger.predict(text="Example text", mini_batch_size=32)

Parameters:

  • model_name_or_path - A key string for one of Flair's pre-trained token tagger models

To find a list of available models, see here
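
Because the constructor itself takes model_name_or_path, the tagger can also be built directly; per the signatures above this should be equivalent to going through load. A minimal sketch, not from the notebook:

# Direct construction with a Flair model key
tagger = FlairTokenTagger("flair/chunk-english-fast")
preds = tagger.predict(text="Example text", mini_batch_size=32)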

Usage Examples:

Sample code from: 05_token_classification.ipynb (View Notebook for more context)

tagger = FlairTokenTagger.load("flair/chunk-english-fast")
preds = tagger.predict(text="Example text", mini_batch_size=32)[0]
test_eq(preds.tokens[0].text, 'Example')
test_eq(preds.tokens[1].text, 'text')

FlairTokenTagger.load[source]

FlairTokenTagger.load(model_name_or_path:str)

Class method for loading and constructing this tagger

  • model_name_or_path - A key string for one of Flair's pre-trained token tagger models

To find a list of available models, see here

Usage Examples:

Sample code from: 05_token_classification.ipynb (View Notebook for more context)

tagger = FlairTokenTagger.load("flair/chunk-english-fast")
preds = tagger.predict(text="Example text", mini_batch_size=32)[0]
test_eq(preds.tokens[0].text, 'Example')
test_eq(preds.tokens[1].text, 'text')

FlairTokenTagger.predict[source]

FlairTokenTagger.predict(text:Union[List[Sentence], Sentence, List[str], str], mini_batch_size:int=32, **kwargs)

Predict method for running inference using the pre-trained token tagger model. Returns a list of tagged Flair Sentence objects (see the sketch below).

  • text - String, list of strings, sentences, or list of sentences to run inference on
  • mini_batch_size - Mini batch size
  • **kwargs (Optional) - Optional arguments passed to the Flair tagger
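
Beyond inspecting individual tokens (as in the sample below), the returned Flair Sentence objects expose tagged spans. A minimal sketch, assuming this chunking model writes its labels under the 'np' tag type (check the model card if unsure):

tagger = FlairTokenTagger.load("flair/chunk-english-fast")
sentence = tagger.predict(text="The quick brown fox jumps over the lazy dog.", mini_batch_size=32)[0]

# Each span carries the chunk text and its predicted label
for span in sentence.get_spans('np'):
    print(span)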

Usage Examples:

Sample code from: 05_token_classification.ipynb (View Notebook for more context)

tagger = FlairTokenTagger.load("flair/chunk-english-fast")
preds = tagger.predict(text="Example text", mini_batch_size=32)[0]
test_eq(preds.tokens[0].text, 'Example')
test_eq(preds.tokens[1].text, 'text')

class EasyTokenTagger[source]

EasyTokenTagger()

Token-level classification models

Usage:

>>> tagger = adaptnlp.EasyTokenTagger()
>>> tagger.tag_text(text="text you want to tag", model_name_or_path="ner-ontonotes")
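
tag_text also accepts a list of strings, so a batch of sentences (like the sentences list defined in the notebook samples below) can be tagged in one call. A minimal sketch, not from the notebook:

tagger = EasyTokenTagger()
sentences = ["Jack walked through the park on a Sunday.",
             "Jack was going to meet Jill for dinner."]

# A list of strings is tagged as a batch; one tagged Flair Sentence is returned per input
results = tagger.tag_text(text=sentences, model_name_or_path="ner-ontonotes")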

Usage Examples:

Sample code from: 05a_tutorial.token_tagging.ipynb (View Notebook for more context)

tagger = EasyTokenTagger()
_ = tagger.tag_text(text=example_text, model_name_or_path="ner-ontonotes")
_ = tagger.tag_text(text=example_text, model_name_or_path="pos")

Sample code from: 05_token_classification.ipynb (View Notebook for more context)

tagger = EasyTokenTagger()
example_text = '''Novetta Solutions is the best. Albert Einstein used to be employed at Novetta Solutions. 
The Wright brothers loved to visit the JBF headquarters, and they would have a chat with Albert.'''
sentences = ["Jack walked through the park on a Sunday.", "Sunday was a nice and breezy afternoon.", "Jack was going to meet Jill for dinner."]
text = tagger.tag_text(text=example_text, model_name_or_path="ner-ontonotes")
text = tagger.tag_text(text=example_text, model_name_or_path="pos")

Sample code from: 05a_tutorial.token_tagging.ipynb (View Notebook for more context)

example_text = '''Novetta Solutions is the best. Albert Einstein used to be employed at Novetta Solutions. 
The Wright brothers loved to visit the JBF headquarters, and they would have a chat with Albert.'''
tagger = EasyTokenTagger()

EasyTokenTagger.tag_text[source]

EasyTokenTagger.tag_text(text:Union[List[Sentence], Sentence, List[str], str], model_name_or_path:Union[str, FlairModelResult, HFModelResult]='ner-ontonotes', mini_batch_size:int=32, **kwargs)

Tags tokens with labels the token classification models have been trained on

  • text - Text input; it can be a string or any of Flair's Sentence input formats
  • model_name_or_path - The hosted model name key or model path
  • mini_batch_size - The mini batch size for running inference
  • **kwargs - Keyword arguments for Flair's SequenceTagger.predict() method

Return - A list of Flair Sentence objects (see the sketch below for reading the tags back)
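
Since each returned Sentence is a standard Flair object, tagged entities can be read back with Flair's span API. A minimal sketch, assuming the ner-ontonotes model stores its predictions under the 'ner' tag type:

tagger = EasyTokenTagger()
sentences = tagger.tag_text(
    text="Albert Einstein used to be employed at Novetta Solutions.",
    model_name_or_path="ner-ontonotes",
)

# Each tagged Sentence exposes its entity spans
for sentence in sentences:
    for entity in sentence.get_spans('ner'):
        print(entity)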

Usage Examples:

Sample code from: 05_token_classification.ipynb (View Notebook for more context)

tagger = EasyTokenTagger()
example_text = '''Novetta Solutions is the best. Albert Einstein used to be employed at Novetta Solutions. 
The Wright brothers loved to visit the JBF headquarters, and they would have a chat with Albert.'''
sentences = ["Jack walked through the park on a Sunday.", "Sunday was a nice and breezy afternoon.", "Jack was going to meet Jill for dinner."]
text = tagger.tag_text(text=example_text, model_name_or_path="ner-ontonotes")
text = tagger.tag_text(text=example_text, model_name_or_path="pos")

Sample code from: 05a_tutorial.token_tagging.ipynb (View Notebook for more context)

tagger = EasyTokenTagger()
_ = tagger.tag_text(text=example_text, model_name_or_path="ner-ontonotes")
_ = tagger.tag_text(text=example_text, model_name_or_path="pos")

EasyTokenTagger.tag_all[source]

EasyTokenTagger.tag_all(text:Union[List[Sentence], Sentence, List[str], str], mini_batch_size:int=32, **kwargs)

Tags tokens with labels from all token classification models currently loaded in this tagger

  • text - Text input; it can be a string or any of Flair's Sentence input formats
  • mini_batch_size - The mini batch size for running inference
  • **kwargs - Keyword arguments for Flair's SequenceTagger.predict() method

Return - A list of Flair Sentence objects (see the sketch below)
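
tag_all re-applies every tagger this EasyTokenTagger instance has already loaded, so it is typically called after one or more tag_text calls. A minimal sketch, not from the notebooks:

tagger = EasyTokenTagger()
example_text = "Albert Einstein used to be employed at Novetta Solutions."

# Using a model once via tag_text loads it into the tagger
_ = tagger.tag_text(text=example_text, model_name_or_path="ner-ontonotes")
_ = tagger.tag_text(text=example_text, model_name_or_path="pos")

# tag_all then runs both the NER and POS models over the same input
sentences = tagger.tag_all(text=example_text, mini_batch_size=32)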