Using the text generation API with AdaptNLP
 

What is Text Generation?

Text generation is the NLP task of generating a coherent sequence of words, usually from a language model. The current leading methods, most notably OpenAI’s GPT-2 and GPT-3, feed seed tokens (words or characters) into a pre-trained language model, which then extends that seed into a longer sequence of text. AdaptNLP provides simple methods to fine-tune these state-of-the-art models and generate text for any use case.
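
To make the idea of "seed tokens in, longer sequence out" concrete, here is a minimal sketch of greedy, token-by-token generation. It is not how AdaptNLP or GPT-2 is implemented: the TOY_MODEL table and the generate_greedy helper are hypothetical names made up for illustration, the probabilities are invented, and the toy model looks only at the previous token, whereas a real model such as GPT-2 conditions on the whole preceding context.

```python
# Toy stand-in for a language model (illustrative assumption, not a real model):
# maps the previous token to candidate next tokens with made-up probabilities.
TOY_MODEL = {
    "China": {"and": 0.9, "will": 0.1},
    "and": {"the": 0.8, "China": 0.2},
    "the": {"U.S.": 0.7, "world": 0.3},
    "U.S.": {"will": 0.6, "and": 0.4},
    "will": {"begin": 0.75, "see": 0.25},
    "begin": {"to": 1.0},
}


def generate_greedy(seed, num_tokens_to_produce):
    """Extend the seed one token at a time, always taking the likeliest next token."""
    tokens = seed.split()
    if not tokens:
        return ""
    for _ in range(num_tokens_to_produce):
        candidates = TOY_MODEL.get(tokens[-1])
        if not candidates:
            # The toy model has no continuation for this token, so stop early.
            break
        # Greedy choice: pick the candidate with the highest probability.
        tokens.append(max(candidates, key=candidates.get))
    return " ".join(tokens)


print(generate_greedy("China and", 5))
```

Greedy selection is only one possible decoding strategy; sampling-based strategies trade some determinism for more varied text.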

Below, we'll walk through how to use AdaptNLP's EasyTextGenerator module to generate text that completes a given string.

Getting Started with Text Generation

We'll get started by importing the EasyTextGenerator class from AdaptNLP.

from adaptnlp import EasyTextGenerator

Then we'll write some sample text to use:

text = "China and the U.S. will begin to"

And finally, we'll instantiate our EasyTextGenerator:

generator = EasyTextGenerator()

Generating Text

Now that we have the generator instantiated, we are ready to load in a model and generate text with the built-in generate() method.

Here is one example using the gpt2 model:

generated_text = generator.generate(text, model_name_or_path="gpt2", mini_batch_size=2, num_tokens_to_produce=50)

print(generated_text)
Special tokens have been added in the vocabulary, make sure the associated word embedding are fine-tuned or trained.
100.00% [1/1 00:00<00:00]
['China and the U.S. will begin to see the effects of the new sanctions on the Russian economy.\n\n"The U.S. is going to be the first to see the effects of the new sanctions," said Michael O\'Hanlon, a senior fellow at the Center for Strategic']
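
As the output above shows, generate() returns a list of strings, and each string starts with the original prompt. If only the newly generated continuation is needed, the prompt can be removed afterwards. The sketch below uses plain Python; strip_prompt is a hypothetical helper name, not part of AdaptNLP, and the sample output is shortened from the result shown above.

```python
def strip_prompt(prompt, generated):
    """Return only the newly generated continuation for each output string."""
    continuations = []
    for output in generated:
        # Drop the prompt only when the output really begins with it.
        if output.startswith(prompt):
            output = output[len(prompt):]
        continuations.append(output.strip())
    return continuations


text = "China and the U.S. will begin to"
# Shortened copy of the output shown above, used here as sample data.
generated_text = [
    "China and the U.S. will begin to see the effects of the new sanctions on the Russian economy."
]

print(strip_prompt(text, generated_text))
```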

Finding Models with the HFModelHub

Rather than searching through Hugging Face for models to use, we can use AdaptNLP's HFModelHub to search for valid text generation models.

First, let's import it:

from adaptnlp.model_hub import HFModelHub

And then search for some models by task:

hub = HFModelHub()
models = hub.search_model_by_task('text-generation'); models
[Model Name: distilgpt2, Tasks: [text-generation],
 Model Name: gpt2-large, Tasks: [text-generation],
 Model Name: gpt2-medium, Tasks: [text-generation],
 Model Name: gpt2-xl, Tasks: [text-generation],
 Model Name: gpt2, Tasks: [text-generation],
 Model Name: openai-gpt, Tasks: [text-generation],
 Model Name: transfo-xl-wt103, Tasks: [text-generation],
 Model Name: xlnet-base-cased, Tasks: [text-generation],
 Model Name: xlnet-large-cased, Tasks: [text-generation]]

We'll use our gpt2 model again:

model = models[4]; model
Model Name: gpt2, Tasks: [text-generation]

And pass it into our generator:

generated_text = generator.generate(text, model_name_or_path=model, mini_batch_size=2, num_tokens_to_produce=50)

print(generated_text)
100.00% [1/1 00:00<00:00]
['China and the U.S. will begin to see the effects of the new sanctions on the Russian economy.\n\n"The U.S. is going to be the first to see the effects of the new sanctions," said Michael O\'Hanlon, a senior fellow at the Center for Strategic']
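
Selecting the model with models[4] depends on the order of the search results, which may change as models are added to the hub. Looking the model up by name is more robust. The sketch below assumes each search result exposes a name attribute; that is an assumption to check against your installed AdaptNLP version. FakeModelResult and find_model are hypothetical stand-ins so the example runs without AdaptNLP installed.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class FakeModelResult:
    """Hypothetical stand-in for a hub search result; not part of AdaptNLP."""
    name: str
    tasks: List[str] = field(default_factory=lambda: ["text-generation"])


def find_model(models, name):
    """Return the first search result whose name matches exactly."""
    for model in models:
        # Assumes each result has a `name` attribute (see the note above).
        if model.name == name:
            return model
    raise ValueError(f"No model named {name!r} in search results")


# Sample data mirroring the first five names from the search output above.
models = [
    FakeModelResult(n)
    for n in ["distilgpt2", "gpt2-large", "gpt2-medium", "gpt2-xl", "gpt2"]
]

model = find_model(models, "gpt2")
print(model.name)
```

With real search results, the returned object could then be passed as model_name_or_path in the same way as models[4] above.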