Tuning a Base Language Model on the IMDB Dataset
 

Introduction

This tutorial walks through an end-to-end example of fine-tuning a Transformer language model on a custom dataset stored as a CSV file.

By the end of this you should be able to:

  1. Build a dataset with the LanguageModelDatasets class, along with its DataLoaders
  2. Build a LanguageModelTuner quickly, find a good learning rate, and train with the One-Cycle Policy
  3. Save the model so it can be used for deployment or with other HuggingFace libraries
  4. Run inference using both the Tuner's predict function and the EasyTextGenerator class within AdaptNLP

Installing the Library

This tutorial uses the latest AdaptNLP version, as well as parts of the fastai library. Please run the code below to install them:

!pip install adaptnlp -U

(or pip3)

Getting the Dataset

First we need a dataset. We will use the fastai library to download the IMDB_SAMPLE dataset, a subset of IMDB Movie Reviews.

from fastai.data.external import URLs, untar_data

URLs holds a namespace of many data endpoints, and untar_data is a function that can download and extract any data from a given URL.

Combining both, we can download the data:

data_path = untar_data(URLs.IMDB_SAMPLE)

If we look at what was downloaded, we will find a texts.csv file:

data_path.ls()
(#1) [Path('/root/.fastai/data/imdb_sample/texts.csv')]

This is the data we want to use. The CSV contains three columns: label, text, and is_valid, where is_valid indicates whether a row belongs to the validation set.
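To make the column layout concrete, here is a minimal sketch that reads a CSV in this shape and separates rows by the is_valid column. The two sample rows are hypothetical stand-ins (the real texts.csv has many more rows), and the sketch assumes is_valid is stored as the strings True and False. Note that the tutorial itself relies on a random split later on rather than this column.

```python
import csv
import io

# Hypothetical two-row sample in the same layout as texts.csv
# (columns: label, text, is_valid); the real file has many more rows.
SAMPLE_CSV = """label,text,is_valid
negative,"Dull plot, weak acting.",False
positive,"A charming, funny film.",True
"""

def split_by_is_valid(csv_text):
    """Return (train_texts, valid_texts) based on the is_valid column."""
    train, valid = [], []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # csv yields strings, so compare against the literal "True"
        target = valid if row["is_valid"] == "True" else train
        target.append(row["text"])
    return train, valid

train_texts, valid_texts = split_by_is_valid(SAMPLE_CSV)
print(len(train_texts), len(valid_texts))
```

Only the text column matters for language modeling; the label column is not needed for this task.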

Now that we have the dataset and know its format, let's pick a suitable model to train with.

Picking a Model with the Hub

AdaptNLP has an HFModelHub class that lets you communicate with the HuggingFace Hub and pick a model from it, as well as an HF_TASKS namespace class listing the valid tasks we can search by.

Let's try to find one suitable for text generation.

First we need to import the class and generate an instance of it:

from adaptnlp import HFModelHub, HF_TASKS
hub = HFModelHub()

Next we can search for a model:

models = hub.search_model_by_task(HF_TASKS.TEXT_GENERATION)

Let's look at a few:

models[:10]
[Model Name: distilgpt2, Tasks: [text-generation],
 Model Name: gpt2-large, Tasks: [text-generation],
 Model Name: gpt2-medium, Tasks: [text-generation],
 Model Name: gpt2-xl, Tasks: [text-generation],
 Model Name: gpt2, Tasks: [text-generation],
 Model Name: openai-gpt, Tasks: [text-generation],
 Model Name: transfo-xl-wt103, Tasks: [text-generation],
 Model Name: xlnet-base-cased, Tasks: [text-generation],
 Model Name: xlnet-large-cased, Tasks: [text-generation]]

These are models specifically tagged with the text-generation tag, so you may not see a few models you would expect, such as bert-base-cased.

We'll use that first model, distilgpt2:

model = models[0]
model
Model Name: distilgpt2, Tasks: [text-generation]

Now that we have picked a model, let's use the data API to prepare our data.

Each task has a high-level data wrapper around the TaskDatasets class. In our case this is the LanguageModelDatasets class:

from adaptnlp import LanguageModelDatasets

There are multiple different constructors for the LanguageModelDatasets class, and you should never call the main constructor directly.

We will be using from_csvs, which wraps around the from_dfs constructor:

LanguageModelDatasets.from_csvs

LanguageModelDatasets.from_csvs(train_csv:Path, text_col:str, tokenizer_name:str, block_size:int=128, masked_lm:bool=False, valid_csv:Path=None, split_func:callable=None, split_pct:float=0.1, tokenize_kwargs:dict={}, auto_kwargs:dict={}, **kwargs)

Builds LanguageModelDatasets from a single csv or set of csvs. A convenience constructor for from_dfs.

Parameters:

  • train_csv : <class 'pathlib.Path'>

    A training csv file

  • text_col : <class 'str'>

    The name of the text column

  • tokenizer_name : <class 'str'>

    The name of the tokenizer

  • block_size : <class 'int'>, optional

    The size of each block

  • masked_lm : <class 'bool'>, optional

    Whether the language model is a MLM

  • valid_csv : <class 'pathlib.Path'>, optional

    An optional validation csv

  • split_func : <built-in function callable>, optional

    Optionally a splitting function similar to RandomSplitter

  • split_pct : <class 'float'>, optional

    What % to split the df between training and validation

  • tokenize_kwargs : <class 'dict'>, optional

    kwargs for the tokenize function

  • auto_kwargs : <class 'dict'>, optional

    kwargs for the AutoTokenizer.from_pretrained constructor

  • kwargs : <class 'inspect._empty'>

Anything you would normally pass to the tokenizer call (such as max_length, padding) should go in tokenize_kwargs, and anything going to the AutoTokenizer.from_pretrained constructor should be passed to the auto_kwargs.

In our case we only have a train_csv and a tokenizer name. We also want a 90%/10% train/validation split, which is the default (split_pct=0.1).

We will also set a block_size of 128 and set masked_lm to False, since distilgpt2 is not a masked language model:

dsets = LanguageModelDatasets.from_csvs(
    train_csv=data_path/'texts.csv',
    text_col='text',
    tokenizer_name=model.name,
    block_size=128,
    masked_lm=False
)
No value for `max_length` set, automatically adjusting to the size of the model and including truncation
Sequence length set to: 1024
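As a rough illustration of what a block size means for language modeling, the sketch below shows one common way token sequences are grouped: all tokenized texts are concatenated and then cut into fixed-size blocks, dropping the incomplete tail. This is an assumption about the general technique, not a description of AdaptNLP's internals, and the token ids are hypothetical.

```python
from itertools import chain

def group_into_blocks(tokenized_texts, block_size):
    """Concatenate token lists, then cut them into equal-sized blocks."""
    all_tokens = list(chain.from_iterable(tokenized_texts))
    # Drop the tail that does not fill a whole block
    usable = (len(all_tokens) // block_size) * block_size
    return [all_tokens[i:i + block_size] for i in range(0, usable, block_size)]

# Toy "token ids" for three short documents (hypothetical values)
docs = [[1, 2, 3], [4, 5, 6, 7], [8, 9, 10]]
blocks = group_into_blocks(docs, block_size=4)
print(blocks)  # [[1, 2, 3, 4], [5, 6, 7, 8]]
```

With this kind of grouping a single block can span the boundary between two documents, which would explain why one training example can contain the end of one review followed by the start of another.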




Finally, we turn it into some AdaptiveDataLoaders.

These are just fastai's DataLoaders class with a few functions overridden so that it works nicely with HuggingFace's Dataset class.

LanguageModelDatasets.dataloaders

LanguageModelDatasets.dataloaders(batch_size=8, shuffle_train=True, collate_fn=default_data_collator, mlm_probability:float=0.15, path='.', device=None)

Build DataLoaders from self

Parameters:

  • batch_size : <class 'int'>, optional

    A batch size

  • shuffle_train : <class 'bool'>, optional

    Whether to shuffle the training dataset

  • collate_fn : <class 'function'>, optional

    A custom collation function

  • mlm_probability : <class 'float'>, optional

    Token masking probability for Masked Language Models

  • path : <class 'str'>, optional

  • device : <class 'NoneType'>, optional

dls = dsets.dataloaders(batch_size=8)

Finally, let's view a batch of data with the show_batch function:

dls.show_batch()
Input Labels
0 2".<br /><br />It starts out trying to borrow its comic relief style of Star Wars, but mercifully (since the humor doesn't work) gives up on comedy and plays it serious. In that sense, it's superior to the Star Wars franchise, which started with a clever sense of humor, and eventually deteriorated to Jar-Jar's annoying silliness.<br /><br />The agricultural details were apparently drawn by someone who had never seen a farm. The harvester was driving through the unharvested middle of a field, dumping silage onto unharvested crops, rather than working from one side to the other and dumping the silage onto already-harvested rows or into a truck. Corn (maize) was pouring out the grain chute, but the farm lands were drawn like a wheat field.<br /><br />When it was time for Kim's father had to face his fate, there wasn't any dramatic weight to the scene. That could have been partly the fault of the English-language voice actor, but the drawings didn't show much weight either. Kim's reactions in that scene were similarly unconvincing.<br /><br />Similarly, when a character named Henderson was killed, Chris showed very little reaction, even though they were apparently supposed to have been close. (Henderson's death is no spoiler; his name isn't revealed until his death scene.) She seems to promptly forget him. Someone's expression of sympathy shows more feeling than she does. I think the voice actor deserves most of the blame in that case; there's at least a hint of feeling in the drawings of Chris.<br /><br />On several occasions, villains fail to accomplish their orders. A villain leader often punishes those failures with miserable deaths. I can't say whether that's lifted from Star Wars, or if that comes from an earlier source -- possibly the Lensman books.<br /><br />There's a scene where a space ship crash-lands. As it plunges toward the ground, parts are break off the ship. 
But so many pieces are fall off that there should be nothing left of it by the time it lands.<br /><br />While in most cases Chris seems like a competent, tough space hero, there's a scene where she shrieks like an incompetent damsel in distress. Someone tough enough to get over Henderson's death so quickly should at least be able to shout, "help, it's got me and I can't 2".<br /><br />It starts out trying to borrow its comic relief style of Star Wars, but mercifully (since the humor doesn't work) gives up on comedy and plays it serious. In that sense, it's superior to the Star Wars franchise, which started with a clever sense of humor, and eventually deteriorated to Jar-Jar's annoying silliness.<br /><br />The agricultural details were apparently drawn by someone who had never seen a farm. The harvester was driving through the unharvested middle of a field, dumping silage onto unharvested crops, rather than working from one side to the other and dumping the silage onto already-harvested rows or into a truck. Corn (maize) was pouring out the grain chute, but the farm lands were drawn like a wheat field.<br /><br />When it was time for Kim's father had to face his fate, there wasn't any dramatic weight to the scene. That could have been partly the fault of the English-language voice actor, but the drawings didn't show much weight either. Kim's reactions in that scene were similarly unconvincing.<br /><br />Similarly, when a character named Henderson was killed, Chris showed very little reaction, even though they were apparently supposed to have been close. (Henderson's death is no spoiler; his name isn't revealed until his death scene.) She seems to promptly forget him. Someone's expression of sympathy shows more feeling than she does. I think the voice actor deserves most of the blame in that case; there's at least a hint of feeling in the drawings of Chris.<br /><br />On several occasions, villains fail to accomplish their orders. 
A villain leader often punishes those failures with miserable deaths. I can't say whether that's lifted from Star Wars, or if that comes from an earlier source -- possibly the Lensman books.<br /><br />There's a scene where a space ship crash-lands. As it plunges toward the ground, parts are break off the ship. But so many pieces are fall off that there should be nothing left of it by the time it lands.<br /><br />While in most cases Chris seems like a competent, tough space hero, there's a scene where she shrieks like an incompetent damsel in distress. Someone tough enough to get over Henderson's death so quickly should at least be able to shout, "help, it's got me and I can't
1 a script that gives the heroine some intelligence and agency, and an actress who can convey those qualities.<br /><br />Hugh Jackman is similarly cheated by the script. Allen apparently can't stand it that Jackman is so stunningly good looking and young, and so he gives Jackman nothing to say or do. Like Johansson, he is used merely for his good looks. This is a shame, because, as Jackman has shown in any number of productions, from "Oklahoma" to "X Men," he CAN act.<br /><br />Here's the big plot twist -- Jackman, suave, charming English Lord, really is a killer. So, though the movie says it is all about letting someone else, other than Allen, get the girl, she doesn't get anyone. Jackman, the man she's been making love to, is a man who murdered a prostitute. Nice, Woody. Nice way to punish your heroine for being beyond your grasp.<br /><br />In a passive aggressive touch, Allen deprives his heroine of his own presence, as well, killing off his character, the magician, leaving Scarlett Johansson all alone at the end of the film.<br /><br />A final note: at my screening, not a single audience member laughed at any point during the film. Always a bad sign when a film is advertised as a comedy.The movie is plain bad. Simply awful. The string of bad movies from Bollywood has no end! They must be running out of excuses for making such awful movies (or not).<br /><br />The problem seems to be with mainly the directors. This movie has 2 good actors who have proved in the past that the have the ability to deliver great performance...but they were directed so poorly. The poor script did not help either.<br /><br />This movie has plenty of ridiculous moments and very bad editing in the first half. For instance :<br /><br />After his 1st big concert, Ajay Devgan, meets up with Om Puri (from whom he ran away some 30 years ago and talked to again) and all Om Puri finds to say is to beware of his friendship with Salman!!! What a load of crap. Seriously. 
Not to mention the baaad soundtrack. Whatever happened to Shankar Ehsaan Loy?<br /><br />Ajay Devgun is total miscast for portraying a rockstar.<br /><br />Only saving grace a script that gives the heroine some intelligence and agency, and an actress who can convey those qualities.<br /><br />Hugh Jackman is similarly cheated by the script. Allen apparently can't stand it that Jackman is so stunningly good looking and young, and so he gives Jackman nothing to say or do. Like Johansson, he is used merely for his good looks. This is a shame, because, as Jackman has shown in any number of productions, from "Oklahoma" to "X Men," he CAN act.<br /><br />Here's the big plot twist -- Jackman, suave, charming English Lord, really is a killer. So, though the movie says it is all about letting someone else, other than Allen, get the girl, she doesn't get anyone. Jackman, the man she's been making love to, is a man who murdered a prostitute. Nice, Woody. Nice way to punish your heroine for being beyond your grasp.<br /><br />In a passive aggressive touch, Allen deprives his heroine of his own presence, as well, killing off his character, the magician, leaving Scarlett Johansson all alone at the end of the film.<br /><br />A final note: at my screening, not a single audience member laughed at any point during the film. Always a bad sign when a film is advertised as a comedy.The movie is plain bad. Simply awful. The string of bad movies from Bollywood has no end! They must be running out of excuses for making such awful movies (or not).<br /><br />The problem seems to be with mainly the directors. This movie has 2 good actors who have proved in the past that the have the ability to deliver great performance...but they were directed so poorly. The poor script did not help either.<br /><br />This movie has plenty of ridiculous moments and very bad editing in the first half. 
For instance :<br /><br />After his 1st big concert, Ajay Devgan, meets up with Om Puri (from whom he ran away some 30 years ago and talked to again) and all Om Puri finds to say is to beware of his friendship with Salman!!! What a load of crap. Seriously. Not to mention the baaad soundtrack. Whatever happened to Shankar Ehsaan Loy?<br /><br />Ajay Devgun is total miscast for portraying a rockstar.<br /><br />Only saving grace
2 to entice Robert into sex. Robert wants none of it, and puts on a jazz record. Ellen turns on the radio; Robert turns up the music; Ellen turns on the TV; Robert turns on another TV. Cacophony ensues. Ellen goes up on the roof, Robert joins her. Ellen confesses that she needs to experience more men, men other than Robert. Robert says that he too needs to experience men.<br /><br />We next follow Robert as he visits an artist, Martin, played by Steve Buscemi. I wish Buscemi could have more roles like this, where he is a sexy, smart, totally desirable guy. Robert praises Martin's work, much more than it deserves, promises to get it into a show. Martin is excited, until it turns out that Robert is speaking out of his groin, it is all a mating dance. Robert tries to kiss Martin, on the lips, and Martin pulls back, saying that he is not gay. Robert asserts that he's not gay either, Martin scoffs. Both admit that the artworks are bad. Robert is about to leave, when Martin allows Robert to kiss him. They make out, and Robert goes down on Martin.<br /><br />Next we follow Martin, as he prepares for an art show at a Manhattan gallery. He is smitten by the receptionist, Anna, played by Rosario Dawson. (I had to cut some of this review to keep it under 1000 words)... and they make love to each other.<br /><br />We next follow Anna, who is sitting at a lunch stand. Her boyfriend, Nick (Adrian Grenier), enters, bearing flowers. She is cold toward him; he tries to figure out why. He coaxes out of her the information that she has had sex with someone while he was in San Francisco. She coaxes out of him the fact that he has stayed with his ex-gf while in San Francisco, and had sex with her. The latter revelation turns out to be a lie. The two of them make out in the luncheonette, but she decides that they must break up. Nick is heartbroken.<br /><br />And we follow Nick, who confesses his troubles to an older woman who he meets on a park bench, Joey (Carol Kane). 
Joey is sort of weird and child-like, but is a good audience for Nick, who needs a sympathetic ear. The two of them go to Coney Island to entice Robert into sex. Robert wants none of it, and puts on a jazz record. Ellen turns on the radio; Robert turns up the music; Ellen turns on the TV; Robert turns on another TV. Cacophony ensues. Ellen goes up on the roof, Robert joins her. Ellen confesses that she needs to experience more men, men other than Robert. Robert says that he too needs to experience men.<br /><br />We next follow Robert as he visits an artist, Martin, played by Steve Buscemi. I wish Buscemi could have more roles like this, where he is a sexy, smart, totally desirable guy. Robert praises Martin's work, much more than it deserves, promises to get it into a show. Martin is excited, until it turns out that Robert is speaking out of his groin, it is all a mating dance. Robert tries to kiss Martin, on the lips, and Martin pulls back, saying that he is not gay. Robert asserts that he's not gay either, Martin scoffs. Both admit that the artworks are bad. Robert is about to leave, when Martin allows Robert to kiss him. They make out, and Robert goes down on Martin.<br /><br />Next we follow Martin, as he prepares for an art show at a Manhattan gallery. He is smitten by the receptionist, Anna, played by Rosario Dawson. (I had to cut some of this review to keep it under 1000 words)... and they make love to each other.<br /><br />We next follow Anna, who is sitting at a lunch stand. Her boyfriend, Nick (Adrian Grenier), enters, bearing flowers. She is cold toward him; he tries to figure out why. He coaxes out of her the information that she has had sex with someone while he was in San Francisco. She coaxes out of him the fact that he has stayed with his ex-gf while in San Francisco, and had sex with her. The latter revelation turns out to be a lie. The two of them make out in the luncheonette, but she decides that they must break up. 
Nick is heartbroken.<br /><br />And we follow Nick, who confesses his troubles to an older woman who he meets on a park bench, Joey (Carol Kane). Joey is sort of weird and child-like, but is a good audience for Nick, who needs a sympathetic ear. The two of them go to Coney Island
3 I thought this film was alright; much better than I expected it to be. I was skeptical at first - the idea of a computer virus that can also infect people seemed a little ludicrous to me. But in the end, I thought the film handled the concept well (even if some scenes were a little clichéd).<br /><br />The cast was quite good, and the two leads seemed to take their roles very seriously. I couldn't help thinking, though, that Janine Turner is a bit of a Geena Davis look-a-like. Maybe it's just her face or the make-up, hair and clothes she had in this movie but it just kept nagging at the back of my mind the whole time.<br /><br />While it's not a'must see' or a great film by any standard, 'Fatal Error' is an entertaining flick that will keep you watching until the end.While I count myself as a fan of the Babylon 5 television series, the original movie that introduced the series was a weak start. Although many of the elements that would later mature and become much more compelling in the series are there, the pace of The Gathering is slow, the makeup somewhat inadequate, and the plot confusing. Worse, the characterization in the premiere episode is poor. Although the ratings chart shows that many fans are willing to overlook these problems, I remember The Gathering almost turned me off off what soon grew into a spectacular series.How unfortunate, to have so many of my "a" list, and good "b" list actors agree to do this movie, but they did, and that is what sucked me into watching it. I had never heard of this movie, but there was Cuba Gooding Jr. right on the DVD cover, and James Woods in the background how bad can it be? In a word Very! This movie starts o.k. has some twists and turns, then just lays an egg. The ending was so weak, it was as if the writer got called away and his 4 year old son sat down at the type writer and hacked out the ending. How ironic a for a movie titled "The end game" to have such a poor one. 
These are the types of movies that can move "a" list actors to the "b" list in hurry. I hope Cuba Gooding JR, and James Woods don't make a habit of this.A definite no. A resounding NO. This movie is an absolute dud.<br /><br />Having I thought this film was alright; much better than I expected it to be. I was skeptical at first - the idea of a computer virus that can also infect people seemed a little ludicrous to me. But in the end, I thought the film handled the concept well (even if some scenes were a little clichéd).<br /><br />The cast was quite good, and the two leads seemed to take their roles very seriously. I couldn't help thinking, though, that Janine Turner is a bit of a Geena Davis look-a-like. Maybe it's just her face or the make-up, hair and clothes she had in this movie but it just kept nagging at the back of my mind the whole time.<br /><br />While it's not a'must see' or a great film by any standard, 'Fatal Error' is an entertaining flick that will keep you watching until the end.While I count myself as a fan of the Babylon 5 television series, the original movie that introduced the series was a weak start. Although many of the elements that would later mature and become much more compelling in the series are there, the pace of The Gathering is slow, the makeup somewhat inadequate, and the plot confusing. Worse, the characterization in the premiere episode is poor. Although the ratings chart shows that many fans are willing to overlook these problems, I remember The Gathering almost turned me off off what soon grew into a spectacular series.How unfortunate, to have so many of my "a" list, and good "b" list actors agree to do this movie, but they did, and that is what sucked me into watching it. I had never heard of this movie, but there was Cuba Gooding Jr. right on the DVD cover, and James Woods in the background how bad can it be? In a word Very! This movie starts o.k. has some twists and turns, then just lays an egg. 
The ending was so weak, it was as if the writer got called away and his 4 year old son sat down at the type writer and hacked out the ending. How ironic a for a movie titled "The end game" to have such a poor one. These are the types of movies that can move "a" list actors to the "b" list in hurry. I hope Cuba Gooding JR, and James Woods don't make a habit of this.A definite no. A resounding NO. This movie is an absolute dud.<br /><br />Having
4 ><br />All in all pretty dull slasher flick that doesn't go anywhere I'd definitely wouldn't recommend it to Slasher fans.From a plot and movement standpoint, this movie was terrible. I found myself looking at the clock in theater hoping it would end and relieved after 80 long minutes that it mercifully did. Basically, five characters appear in the movie, A Son & Father, son's girl friend, and two male characters of the son's age who appear and then disappear without context or explanation. The movie and scenes seemed to suggest homo-eroticism, but nothing ever actually happened to reveal this one way or another. There were a couple of brilliant scenes. At the beginning of the movie, the son's girl friend shows up at a window outside his room and they engage in an odd conversation. The photography and acting lent an incredible seductiveness to the interaction between the two, ending with her admitting to having another man who was "older". End of that story.Watching It Lives By Night makes you wonder, just who in the world greenlit this crap. A newlywed couple go spelunking on their honeymoon, get attacked by bats and the husband starts to run around in his pajamas attacking various people. And where exactly are they? They're in the desert, then they're skiing, then they're in a small town that looks like it has mountains nearby. The town is run by a sheriff who likes to watch and has a personal vendetta against whiny doctor boy. The ski hospital is run by a really groovy guy with a nice thick mustache and the wife looks like Mary Tyler Moore or Marilyn Quayle. There's no dramatic tension and the ending will leave you filled with anger. Special effects and makeup guru Stan Winston did the effects for this movie. I guess you have to start somewhere.I remember I saw this cartoon when I was 6 or 7. My grandfather picked up the video of it for free at the mall. I remember that it really sucked. The plot had no sense. I hated the fox that became Casper's friend. 
He was so stupid! Casper cried his head off if he couldn't find a friend. So what? Get over it! The only good part and I don't want to sound mean-spirited was when the fox got shot and died at the end. I laughed my head off in payback because this cartoon sucked so much. The bad news is the fox resurrects and becomes a ghost. I wish he ><br />All in all pretty dull slasher flick that doesn't go anywhere I'd definitely wouldn't recommend it to Slasher fans.From a plot and movement standpoint, this movie was terrible. I found myself looking at the clock in theater hoping it would end and relieved after 80 long minutes that it mercifully did. Basically, five characters appear in the movie, A Son & Father, son's girl friend, and two male characters of the son's age who appear and then disappear without context or explanation. The movie and scenes seemed to suggest homo-eroticism, but nothing ever actually happened to reveal this one way or another. There were a couple of brilliant scenes. At the beginning of the movie, the son's girl friend shows up at a window outside his room and they engage in an odd conversation. The photography and acting lent an incredible seductiveness to the interaction between the two, ending with her admitting to having another man who was "older". End of that story.Watching It Lives By Night makes you wonder, just who in the world greenlit this crap. A newlywed couple go spelunking on their honeymoon, get attacked by bats and the husband starts to run around in his pajamas attacking various people. And where exactly are they? They're in the desert, then they're skiing, then they're in a small town that looks like it has mountains nearby. The town is run by a sheriff who likes to watch and has a personal vendetta against whiny doctor boy. The ski hospital is run by a really groovy guy with a nice thick mustache and the wife looks like Mary Tyler Moore or Marilyn Quayle. 
There's no dramatic tension and the ending will leave you filled with anger. Special effects and makeup guru Stan Winston did the effects for this movie. I guess you have to start somewhere.I remember I saw this cartoon when I was 6 or 7. My grandfather picked up the video of it for free at the mall. I remember that it really sucked. The plot had no sense. I hated the fox that became Casper's friend. He was so stupid! Casper cried his head off if he couldn't find a friend. So what? Get over it! The only good part and I don't want to sound mean-spirited was when the fox got shot and died at the end. I laughed my head off in payback because this cartoon sucked so much. The bad news is the fox resurrects and becomes a ghost. I wish he

When training a language model, the inputs and labels are identical, so the two columns show no noticeable difference here.
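Identical inputs and labels still give a useful training signal because causal language models in HuggingFace (GPT-2 style models such as distilgpt2) shift the labels internally, so the prediction at position t is scored against the token at position t + 1. A minimal sketch of that pairing, using hypothetical token ids:

```python
def next_token_pairs(input_ids, labels):
    """Pair each position's context token with the token it should predict."""
    # Position t is scored against the label at position t + 1
    return list(zip(input_ids[:-1], labels[1:]))

input_ids = [10, 11, 12, 13]   # hypothetical token ids
labels = list(input_ids)       # labels are a copy of the inputs
pairs = next_token_pairs(input_ids, labels)
print(pairs)  # [(10, 11), (11, 12), (12, 13)]
```

The last input token has no target, and the first label is never predicted, so a sequence of n tokens yields n - 1 training pairs.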

Building Tuner

Next we need to build a compatible Tuner for our problem. These tuners contain good defaults for our problem space, including loss functions and metrics.

First let's import the LanguageModelTuner and view its documentation:

from adaptnlp import LanguageModelTuner

class LanguageModelTuner

LanguageModelTuner(dls:DataLoaders, model_name, tokenizer=None, language_model_type:LMType='causal', loss_func=CrossEntropyLoss(), metrics=[<fastai.metrics.Perplexity object at 0x7faf54c22070>], opt_func=Adam, additional_cbs=None, expose_fastai_api=False, **kwargs) :: AdaptiveTuner

An AdaptiveTuner with good defaults for Language Model fine-tuning.

Valid kwargs and defaults:

  • lr:float = 0.001
  • splitter:function = trainable_params
  • cbs:list = None
  • path:Path = None
  • model_dir:Path = 'models'
  • wd:float = None
  • wd_bn_bias:bool = False
  • train_bn:bool = True
  • moms: tuple(float) = (0.95, 0.85, 0.95)

Parameters:

  • dls : <class 'fastai.data.core.DataLoaders'>

    A set of DataLoaders or AdaptiveDataLoaders

  • model_name : <class 'inspect._empty'>

    A HuggingFace model

  • tokenizer : <class 'NoneType'>, optional

    A HuggingFace tokenizer

  • language_model_type : <class 'fastcore.basics.LMType'>, optional

    The type of language model to use

  • loss_func : <class 'fastai.losses.CrossEntropyLossFlat'>, optional

    A loss function

  • metrics : <class 'list'>, optional

    Metrics to monitor the training with

  • opt_func : <class 'function'>, optional

    A fastai or torch Optimizer

  • additional_cbs : <class 'NoneType'>, optional

    Additional Callbacks to have always tied to the Tuner

  • expose_fastai_api : <class 'bool'>, optional

    Whether to expose the fastai API

  • kwargs : <class 'inspect._empty'>

Next we'll pass in our DataLoaders, the name of our model, and the tokenizer:

tuner = LanguageModelTuner(dls, model.name, dls.tokenizer)

We can see that, by default, it uses CrossEntropyLoss as the loss function and Perplexity as the metric:

tuner.loss_func
FlattenedLoss of CrossEntropyLoss()
_ = [print(m.name) for m in tuner.metrics]
perplexity
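The two defaults are closely related: perplexity is the exponential of the mean cross-entropy loss. The short sketch below checks this against the first-epoch validation loss reported in the training log in the Fine-Tuning section of this tutorial; the helper name is illustrative only.

```python
import math

def perplexity(cross_entropy_loss):
    """Perplexity is the exponential of the mean cross-entropy (in nats)."""
    return math.exp(cross_entropy_loss)

# First-epoch validation loss reported in the training log of this tutorial
valid_loss = 3.879425
# The result should be close to the 48.396393 shown in that log
print(round(perplexity(valid_loss), 2))
```

A lower perplexity means the model is, on average, less "surprised" by the next token in the validation text.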

Finally we just need to train our model!

Fine-Tuning

To fine-tune, AdaptNLP's tuner class provides only a few methods to work with. The important ones are tune and lr_find.

As the Tuner uses fastai under the hood, lr_find calls fastai's Learning Rate Finder to help us pick a learning rate. Let's do that now:

AdaptiveTuner.lr_find

AdaptiveTuner.lr_find(start_lr=1e-07, end_lr=10, num_it=100, stop_div=True, show_plot=True, suggest_funcs=valley)

Runs fastai's LR Finder

Parameters:

  • start_lr : <class 'float'>, optional

  • end_lr : <class 'int'>, optional

  • num_it : <class 'int'>, optional

  • stop_div : <class 'bool'>, optional

  • show_plot : <class 'bool'>, optional

  • suggest_funcs : <class 'function'>, optional

tuner.lr_find()
/opt/venv/lib/python3.8/site-packages/fastai/callback/schedule.py:270: UserWarning: color is redundantly defined by the 'color' keyword argument and the fmt string "ro" (-> color='r'). The keyword argument will take precedence.
  ax.plot(val, idx, 'ro', label=nm, c=color)
SuggestedLRs(valley=6.30957365501672e-05)

The finder suggests a learning rate of about 6.3e-5. We will round that down slightly and use 5e-5.

lr = 5e-5
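For intuition about what the finder does with start_lr, end_lr, and num_it: it trains for a short run of mini-batches while raising the learning rate exponentially and recording the loss at each step. The sketch below only reproduces the learning-rate sweep, not the training, and the exact spacing formula is an assumption rather than a copy of fastai's code.

```python
def lr_sweep(start_lr=1e-7, end_lr=10.0, num_it=100):
    """Learning rates spaced evenly on a log scale from start_lr to end_lr."""
    ratio = end_lr / start_lr
    return [start_lr * ratio ** (i / (num_it - 1)) for i in range(num_it)]

lrs = lr_sweep()
# First and last values of the sweep (start_lr and approximately end_lr)
print(lrs[0], lrs[-1])
```

Plotting the recorded loss against these rates gives the familiar LR finder curve, from which a suggestion such as the valley point is read off.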

Let's look at the documentation for tune:

AdaptiveTuner.tune

AdaptiveTuner.tune(epochs:int, lr:float=None, strategy:Strategy='fit_one_cycle', callbacks:list=[], **kwargs)

Fine tune self.model for epochs with an lr and strategy

Parameters:

  • epochs : <class 'int'>

    Number of epochs to train for

  • lr : <class 'float'>, optional

    If None, finds a new learning rate and uses suggestion_method

  • strategy : <class 'fastcore.basics.Strategy'>, optional

    A fitting method

  • callbacks : <class 'list'>, optional

    Extra fastai Callbacks

  • kwargs : <class 'inspect._empty'>

We can pass in a number of epochs, a learning rate, a strategy, and additional fastai callbacks to call.

Valid strategies live in the Strategy namespace class:

from adaptnlp import Strategy

In this tutorial we will train with the One-Cycle policy, as currently it is one of the best schedulers to use.

tuner.tune(3, lr, strategy=Strategy.OneCycle)
epoch train_loss valid_loss perplexity time
0 4.061049 3.879425 48.396393 00:55
1 3.973648 3.857744 47.358402 00:54
2 3.900359 3.858645 47.401066 00:54
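To give a feel for what the One-Cycle policy does to the learning rate during these three epochs, here is a minimal sketch of the schedule: a cosine warm-up to the peak rate followed by a cosine decay to a very small rate. The div, div_final, and pct_start values are assumed to match fastai's fit_one_cycle defaults and should be treated as assumptions, as should the helper names.

```python
import math

def cos_anneal(start, end, pos):
    """Cosine interpolation from start (pos=0) to end (pos=1)."""
    return start + (1 + math.cos(math.pi * (1 - pos))) * (end - start) / 2

def one_cycle_lr(pct, lr_max, div=25.0, div_final=1e5, pct_start=0.25):
    """Learning rate at training progress pct in [0, 1]."""
    if pct < pct_start:
        # Warm-up: rise from lr_max / div to lr_max
        return cos_anneal(lr_max / div, lr_max, pct / pct_start)
    # Annealing: fall from lr_max to lr_max / div_final
    return cos_anneal(lr_max, lr_max / div_final,
                      (pct - pct_start) / (1 - pct_start))

# Sample the schedule at a few points for the lr used in this tutorial
for pct in (0.0, 0.25, 0.5, 1.0):
    print(pct, one_cycle_lr(pct, lr_max=5e-5))
```

In fastai the momentum follows the opposite shape, which is what the moms tuple in the Tuner's kwargs controls.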

Saving Model

Now that we have a trained model, let's save those weights away.

Calling tuner.save will save both the model and the tokenizer in the same format that HuggingFace uses:

AdaptiveTuner.save

AdaptiveTuner.save(save_directory)

Save a pretrained model to a save_directory

Parameters:

  • save_directory : <class 'inspect._empty'>

    A folder to save our model to

tuner.save('good_model')
'good_model'

Performing Inference

There are two ways to get predictions. The first is the .predict method on our tuner, which is handy when you have just finished training and want to see how your model performs on some new data. The second is AdaptNLP's inference API, which we will show afterwards.

In Tuner

First let's write a sentence to test with

sentence = "Hugh Jackman is a terrible "

And then predict with it:

LanguageModelTuner.predict

LanguageModelTuner.predict(text:Union[List[str], str], bs:int=64, num_tokens_to_produce:int=50, **kwargs)

Predict some text for text generation with the currently loaded model

Parameters:

  • text : typing.Union[typing.List[str], str]

    Some text or list of texts to do inference with

  • bs : <class 'int'>, optional

    A batch size to use for multiple texts

  • num_tokens_to_produce : <class 'int'>, optional

    Number of tokens to generate

  • kwargs : <class 'inspect._empty'>

tuner.predict(sentence, num_tokens_to_produce=13)
100.00% [1/1 00:00<00:00]
{'generated_text': ["Hugh Jackman is a terrible icky, and I'm not sure if he's a good actor"]}
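The num_tokens_to_produce argument controls how many new tokens are appended to the prompt. As a toy sketch of the general idea of autoregressive generation (not AdaptNLP's implementation), the loop below repeatedly asks a stand-in "model" for the next token and appends it. The lookup table and function names are hypothetical, and always taking the single predicted token is an assumption made for simplicity.

```python
def generate(prompt_tokens, next_token, num_tokens_to_produce):
    """Append one predicted token at a time to the running sequence."""
    tokens = list(prompt_tokens)
    for _ in range(num_tokens_to_produce):
        tokens.append(next_token(tokens))
    return tokens

# Stand-in "model": a hypothetical lookup from the last token to the next one
TOY_TABLE = {"a": "terrible", "terrible": "movie", "movie": "indeed"}

def toy_next_token(tokens):
    return TOY_TABLE.get(tokens[-1], "<eos>")

print(generate(["is", "a"], toy_next_token, num_tokens_to_produce=2))
# ['is', 'a', 'terrible', 'movie']
```

A real model replaces the lookup table with a forward pass over the whole sequence so far, which is why each generated token can depend on the entire prompt.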

With the Inference API

Next we will use the EasyTextGenerator class, which AdaptNLP offers:

from adaptnlp import EasyTextGenerator

We simply construct the class:

generator = EasyTextGenerator()

And call the generate method, passing in the sentence, the location of our saved model, and the number of tokens to produce:

generator.generate(
    sentence,
    model_name_or_path='good_model',
    num_tokens_to_produce=13
)
100.00% [1/1 00:00<00:00]
{'generated_text': ["Hugh Jackman is a terrible icky, and I'm not sure if he's a good actor"]}

And we got the exact same output!