An interactive API for model lookup within HuggingFace and Flair

Tasks

HF_TASKS and FLAIR_TASKS are namespace objects that enable tab-completion when searching for specific tasks within the HFModelHub and FlairModelHub.

class HF_TASKS

HF_TASKS(*args, **kwargs)

A list of all HuggingFace tasks for valid API lookup as attributes to get tab-completion and typo-proofing

Parameters:

  • args : <class 'inspect._empty'>

  • kwargs : <class 'inspect._empty'>

Possible tasks:
* fill-mask
* question-answering
* summarization
* table-question-answering
* text-classification
* text-generation
* text2text-generation
* token-classification
* translation
* zero-shot-classification
* conversational
* text-to-speech
* automatic-speech-recognition
* audio-source-seperation
* voice-activity-detection
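
As a rough illustration of how a namespace object gives tab-completion and typo-proofing, the sketch below maps task strings to upper-case attributes. The helper `make_task_namespace` and the attribute naming scheme are assumptions made for this example, not the actual implementation of HF_TASKS.

```python
from types import SimpleNamespace

def make_task_namespace(tasks):
    """Expose each task string as an upper-case attribute (hypothetical helper)."""
    # 'fill-mask' becomes the attribute FILL_MASK, whose value is 'fill-mask'
    return SimpleNamespace(**{t.replace('-', '_').upper(): t for t in tasks})

demo_tasks = make_task_namespace(['fill-mask', 'summarization', 'text2text-generation'])

# Attribute access returns the original task string
print(demo_tasks.SUMMARIZATION)

# A misspelled attribute fails immediately instead of searching for a bad task
try:
    demo_tasks.SUMARIZATION
except AttributeError:
    print('unknown task attribute')
```

Because the valid names live on the object, an editor or notebook can list them after typing `demo_tasks.` and a typo surfaces as an AttributeError before any search is made.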

class FLAIR_TASKS

FLAIR_TASKS(*args, **kwargs)

A list of all Flair tasks for valid API lookup as attributes to get tab-completion and typo-proofing

Parameters:

  • args : <class 'inspect._empty'>

  • kwargs : <class 'inspect._empty'>

Possible tasks:
* ner
* chunk
* frame
* pos
* upos
* embeddings

class HFModelResult

HFModelResult(model_info:ModelInfo)

A very basic class for storing a HuggingFace model returned through an API request

They store the ModelInfo object they were built from and expose the following properties:

  • name: The modelId from the modelInfo. This also includes the model author's name, such as "IlyaGusev/mbart_ru_sum_gazeta"
  • tags: Any tags that were included in HuggingFace in relation to the model.
  • tasks: These are the tasks dictated for the model.

Parameters:

  • model_info : <class 'huggingface_hub.hf_api.ModelInfo'>

    `ModelInfo` object from HuggingFace model hub

To find the tasks, we look inside modelInfo.pipeline_tag as well as the tags and check whether there is any overlap with the known task names.
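
A minimal sketch of that lookup, assuming "overlap" means that any tag matching a known task name counts as a task alongside pipeline_tag. The function `derive_tasks` and the `KNOWN_TASKS` subset are hypothetical names for this example; the tag values are an abbreviated copy of the t5-11b tags that appear in the search output further down this page.

```python
# Subset of the task names listed for HF_TASKS (assumed lookup table for this sketch)
KNOWN_TASKS = {'summarization', 'text2text-generation', 'translation', 'text-generation'}

def derive_tasks(pipeline_tag, tags, known_tasks=KNOWN_TASKS):
    """Collect pipeline_tag plus every tag that is also a known task name."""
    tasks = {tag for tag in tags if tag in known_tasks}
    if pipeline_tag is not None:
        tasks.add(pipeline_tag)
    # Sort so the result is stable regardless of tag order
    return sorted(tasks)

t5_tags = ['pytorch', 'tf', 't5', 'seq2seq', 'en', 'transformers',
           'summarization', 'translation', 'license:apache-2.0', 'text2text-generation']
print(derive_tasks('translation', t5_tags))
```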

HFModelResult.to_dict

HFModelResult.to_dict()

Returns HFModelResult as a dictionary

Returns:

  • <class 'dict'>

    Dictionary with keys `model_name`, `tags`, `tasks`, `model_info`

class HFModelHub

HFModelHub(username:str=None, password:str=None)

A class for interacting with the HF model hub API, and searching for models by name or task

Parameters:

  • username : <class 'str'>, optional

    Your HuggingFace username

  • password : <class 'str'>, optional

    Your HuggingFace password

The model search hub creates a friendly end-user API when searching through HuggingFace (and Flair, as we will see later). Usage is extremely simple as well.

HFModelHub.search_model_by_task

HFModelHub.search_model_by_task(task:str, as_dict:bool=False, user_uploaded:bool=False)

Searches HuggingFace Model API for all pretrained models relating to task

Parameters:

  • task : <class 'str'>

    A valid task to search in the HuggingFace hub for

  • as_dict : <class 'bool'>, optional

    Whether to return as a dictionary or list

  • user_uploaded : <class 'bool'>, optional

    Whether to include user-uploaded (community) models in the results

Returns:

  • (typing.List[adaptnlp.model_hub.HFModelResult], typing.Dict[str, adaptnlp.model_hub.HFModelResult])

    A list of `HFModelResult`s, or a dictionary keyed by model name when `as_dict=True`

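This will return a list of models available for a particular task. A few usage examples are below:

from adaptnlp.model_hub import HFModelHub
hub = HFModelHub()
# Only models without a user prefix are returned when user_uploaded=False
models = hub.search_model_by_task('summarization', user_uploaded=False, as_dict=False)
models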
[Model Name: t5-11b, Tasks: [summarization, text2text-generation, translation],
 Model Name: t5-3b, Tasks: [summarization, text2text-generation, translation],
 Model Name: t5-base, Tasks: [summarization, text2text-generation, translation],
 Model Name: t5-large, Tasks: [summarization, text2text-generation, translation],
 Model Name: t5-small, Tasks: [summarization, text2text-generation, translation]]

We can also include user-uploaded models from the community:

models = hub.search_model_by_task('summarization', user_uploaded=True)
models[:10]
[Model Name: t5-11b, Tasks: [summarization, text2text-generation, translation],
 Model Name: t5-3b, Tasks: [summarization, text2text-generation, translation],
 Model Name: t5-base, Tasks: [summarization, text2text-generation, translation],
 Model Name: t5-large, Tasks: [summarization, text2text-generation, translation],
 Model Name: t5-small, Tasks: [summarization, text2text-generation, translation],
 Model Name: Callidior/bert2bert-base-arxiv-titlegen, Tasks: [summarization, text2text-generation],
 Model Name: IlyaGusev/mbart_ru_sum_gazeta, Tasks: [summarization, text2text-generation],
 Model Name: IlyaGusev/rubert_telegram_headlines, Tasks: [summarization, text2text-generation],
 Model Name: LeoCordoba/beto2beto-ccnews-titles-es, Tasks: [summarization, text2text-generation],
 Model Name: LeoCordoba/beto2beto-mlsum, Tasks: [summarization, text2text-generation]]
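
In the results above, community models carry an author prefix (such as "IlyaGusev/") while the others do not. A minimal sketch of a filter built on that observation follows. Treating a "/" in the name as the marker of a user upload is an assumption of this example, and `filter_user_uploaded` is a hypothetical helper, not part of HFModelHub.

```python
def filter_user_uploaded(model_names, user_uploaded=False):
    """Keep every name when user_uploaded is True, else drop 'author/model' names."""
    if user_uploaded:
        return list(model_names)
    return [name for name in model_names if '/' not in name]

names = ['t5-base', 't5-small', 'IlyaGusev/mbart_ru_sum_gazeta',
         'LeoCordoba/beto2beto-mlsum']
print(filter_user_uploaded(names))                      # unprefixed names only
print(filter_user_uploaded(names, user_uploaded=True))  # everything
```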

There are also cases where a dict may be easier to work with (for example, when sending results over a network API). Passing as_dict=True to any search call returns a dictionary keyed by model name instead:

models = hub.search_model_by_task('summarization', as_dict=True)
models['t5-11b']
{'model_name': 't5-11b',
 'tags': ['pytorch',
  'tf',
  't5',
  'lm-head',
  'seq2seq',
  'en',
  'fr',
  'ro',
  'de',
  'dataset:c4',
  'arxiv:1910.10683',
  'transformers',
  'summarization',
  'translation',
  'license:apache-2.0',
  'text2text-generation'],
 'tasks': ['summarization', 'text2text-generation', 'translation'],
 'model_info': ModelInfo: {
 	modelId: t5-11b
 	sha: b8fabb39157e07006719ced3f9b9b91a4344317d
 	lastModified: 2021-03-18T01:58:45.000Z
 	tags: ['pytorch', 'tf', 't5', 'lm-head', 'seq2seq', 'en', 'fr', 'ro', 'de', 'dataset:c4', 'arxiv:1910.10683', 'transformers', 'summarization', 'translation', 'license:apache-2.0', 'text2text-generation']
 	pipeline_tag: translation
 	siblings: [ModelFile(rfilename='.gitattributes'), ModelFile(rfilename='README.md'), ModelFile(rfilename='config.json'), ModelFile(rfilename='pytorch_model.bin'), ModelFile(rfilename='spiece.model'), ModelFile(rfilename='tf_model.h5'), ModelFile(rfilename='tokenizer.json')]
 	config: None
 	private: False
 	downloads: 1677
 	library_name: transformers
 }}

Each entry is a dictionary containing the model name, the HuggingFace tags affiliated with the model, the dictated tasks, and an instance of huggingface_hub's ModelInfo.
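
If the dictionary is meant for a network API, one option is to drop the ModelInfo entry before serializing, since only the plain strings and lists are sure to be JSON-friendly. The sketch below uses a hand-written stand-in for one entry (tags abbreviated, `object()` in place of the real ModelInfo); `to_payload` is a hypothetical helper.

```python
import json

def to_payload(result):
    """Serialize a result dict to JSON, leaving out the 'model_info' object."""
    return json.dumps({k: v for k, v in result.items() if k != 'model_info'})

# Stand-in for one entry of the as_dict=True output (assumption: same four keys)
entry = {
    'model_name': 't5-11b',
    'tags': ['pytorch', 'tf', 't5'],  # abbreviated for the example
    'tasks': ['summarization', 'text2text-generation', 'translation'],
    'model_info': object(),  # placeholder for the ModelInfo instance
}

payload = to_payload(entry)
print(payload)
```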

HFModelHub.search_model_by_name

HFModelHub.search_model_by_name(name:str, as_dict:bool=False, user_uploaded:bool=False)

Searches HuggingFace Model API for all pretrained models containing name

Parameters:

  • name : <class 'str'>

    A valid model name

  • as_dict : <class 'bool'>, optional

    Whether to return as a dictionary or list

  • user_uploaded : <class 'bool'>, optional

    Whether to include user-uploaded (community) models in the results

Returns:

  • (typing.List[adaptnlp.model_hub.HFModelResult], typing.Dict[str, adaptnlp.model_hub.HFModelResult])

    A list of `HFModelResult`s, or a dictionary keyed by model name when `as_dict=True`

With search_model_by_name you're allowed a bit more freedom in what you wish to search for. search_model_by_name downloads the entire list of models from HuggingFace, then performs partial string matching. As a result, you can search for all models by a particular user:

hub.search_model_by_name('Callidior', user_uploaded=True)
[Model Name: Callidior/bert2bert-base-arxiv-titlegen, Tasks: [summarization]]

Or (as implied by the function name) any model type itself:

hub.search_model_by_name('gpt2', user_uploaded=True)[5:10]
[Model Name: 850886470/xxy_gpt2_chinese, Tasks: [],
 Model Name: ComCom-Dev/gpt2-bible-test, Tasks: [],
 Model Name: DHBaek/gpt2-stackoverflow-question-contents-generator, Tasks: [text-generation],
 Model Name: DeepESP/gpt2-spanish, Tasks: [text-generation],
 Model Name: Fabby/gpt2-english-light-novel-titles, Tasks: []]
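
The partial string matching described above can be approximated with a plain substring test. This is only a sketch: `match_by_name` is a hypothetical helper, the name list is a small hand-picked sample from the outputs on this page, and whether the real search is case-sensitive is not stated here.

```python
def match_by_name(query, model_names):
    """Return every model name that contains query as a substring (case-sensitive)."""
    return [name for name in model_names if query in name]

sample = ['t5-base', 'Callidior/bert2bert-base-arxiv-titlegen',
          'DeepESP/gpt2-spanish', 'ComCom-Dev/gpt2-bible-test']

# A user name and a model family are matched the same way
print(match_by_name('Callidior', sample))
print(match_by_name('gpt2', sample))
```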

Flair has a series of extra models available for use that are not available through HuggingFace, such as sentiment, communicative-functions, and more. FLAIR_MODELS is a convenience holder for quick lookup of these models (as no such list is easily available currently). When shown as results in the API, they are given the same flair prefix for convenience.

class FlairModelResult

FlairModelResult(model_info:ModelInfo) :: HFModelResult

A version of HFModelResult for Flair specifically.

Includes which backend the model was found on (such as HuggingFace or Flair's private model list)

Parameters:

  • model_info : <class 'huggingface_hub.hf_api.ModelInfo'>

    ModelInfo object from HuggingFace model hub

class FlairModelHub

FlairModelHub(username:str=None, password:str=None)

A class for interacting with the HF model hub API, and searching for Flair models by name or task

Parameters:

  • username : <class 'str'>, optional

    HuggingFace username

  • password : <class 'str'>, optional

    HuggingFace password

FlairModelHub is extremely similar to HFModelHub, with two differences: it only returns Flair models, and it has access to the other Flair models that can't be accessed through the HuggingFace model hub.

hub = FlairModelHub()

FlairModelHub.search_model_by_name

FlairModelHub.search_model_by_name(name:str, as_dict:bool=False, user_uploaded:bool=False)

Searches HuggingFace Model API for all flair models containing name

Parameters:

  • name : <class 'str'>

    A valid model name

  • as_dict : <class 'bool'>, optional

    Whether to return as a dictionary or list

  • user_uploaded : <class 'bool'>, optional

    Whether to include user-uploaded (community) models in the results

Returns:

  • (typing.List[adaptnlp.model_hub.FlairModelResult], typing.Dict[str, adaptnlp.model_hub.FlairModelResult])

    A list of `FlairModelResult`s, or a dictionary keyed by model name when `as_dict=True`

search_model_by_name will also let you search for models without needing the flair prefix, such as:

hub.search_model_by_name('sentiment')
[Model Name: flair/sentiment, Tasks: [text-classification], Source: Flair's Private Model Hub,
 Model Name: flair/en-sentiment, Tasks: [text-classification], Source: Flair's Private Model Hub,
 Model Name: flair/sentiment-fast, Tasks: [text-classification], Source: Flair's Private Model Hub]
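
One way to picture the prefix handling: compare the query against the part of the name after flair/, then report the full prefixed name. The helper below is a hypothetical sketch of that idea, not FlairModelHub's own code; flair/example-ner is a made-up name added for contrast.

```python
FLAIR_PREFIX = 'flair/'

def search_flair_names(query, model_names):
    """Match query against names with the 'flair/' prefix removed, return full names."""
    hits = []
    for name in model_names:
        # Strip the prefix only when it is present
        short = name[len(FLAIR_PREFIX):] if name.startswith(FLAIR_PREFIX) else name
        if query in short:
            hits.append(name)
    return hits

flair_names = ['flair/sentiment', 'flair/en-sentiment', 'flair/sentiment-fast',
               'flair/example-ner']  # last entry is hypothetical
print(search_flair_names('sentiment', flair_names))
```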

FlairModelHub.search_model_by_task

FlairModelHub.search_model_by_task(task:str, as_dict=False, user_uploaded=False)

Searches HuggingFace Model API for all flair models for task

Parameters:

  • task : <class 'str'>

    A valid task to search the HuggingFace hub for

  • as_dict : <class 'bool'>, optional

    Whether to return as a dictionary or list

  • user_uploaded : <class 'bool'>, optional

    Whether to include user-uploaded (community) models in the results

Returns:

  • (typing.List[adaptnlp.model_hub.FlairModelResult], typing.Dict[str, adaptnlp.model_hub.FlairModelResult])

    A list of `FlairModelResult`s, or a dictionary keyed by model name when `as_dict=True`

Since we have a FLAIR_TASKS object declared earlier, we can use it when searching for models by task. As with search_model_by_name, you should not include flair/ in your search query. Instead, search by the task key, such as ner or FLAIR_TASKS.NAMED_ENTITY_RECOGNITION.