Question Answering
Question Answering is the NLP task of producing an answer from two text inputs: a context and a question about that context.
Span-based models are one example of Question Answering models: they output a start and an end index that mark the relevant "answer" within the provided context. With these models, we can extract answers to a variety of questions and queries about any unstructured text.
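To make the start and end index idea concrete, here is a minimal sketch of how a span could be chosen from per-token scores. It is an illustration only, not AdaptNLP's or Transformers' implementation: the function best_spans, the max_len limit, and the toy scores are all assumptions made up for this example.

```python
from typing import List, Tuple


def best_spans(
    start_scores: List[float],
    end_scores: List[float],
    n_best: int = 3,
    max_len: int = 10,
) -> List[Tuple[int, int, float]]:
    """Return the n_best (start, end, score) spans, highest score first.

    Hypothetical helper: a span's score is the sum of its start and end scores.
    """
    candidates = []
    for s, s_score in enumerate(start_scores):
        # Only consider ends at or after the start, within max_len tokens.
        for e in range(s, min(s + max_len, len(end_scores))):
            candidates.append((s, e, s_score + end_scores[e]))
    candidates.sort(key=lambda c: c[2], reverse=True)
    return candidates[:n_best]


# Toy per-token scores (made up for illustration, not real model output).
tokens = ["Amazon", "was", "founded", "by", "Jeff", "Bezos"]
start_scores = [0.1, 0.0, 0.2, 0.1, 3.0, 0.5]
end_scores = [0.0, 0.1, 0.3, 0.0, 0.8, 3.5]

start, end, score = best_spans(start_scores, end_scores, n_best=1)[0]
answer = " ".join(tokens[start:end + 1])
print(answer)  # Jeff Bezos
```

Asking for more than one span (a larger n_best) gives a ranked list of candidates, which is the same idea as requesting several answer predictions per question.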
Below, we'll walk through how to use AdaptNLP's EasyQuestionAnswering module to extract span-based text answers from unstructured text using state-of-the-art question answering models.
You can use EasyQuestionAnswering to run span-based question answering models.
Given a context and a query, we get the top n_best_size answer predictions along with their token span indices and probability scores.
First we'll import the EasyQuestionAnswering class from AdaptNLP and instantiate it:
from adaptnlp import EasyQuestionAnswering
qa_model = EasyQuestionAnswering()
Next we'll write some example context to use:
context = """Amazon.com, Inc.[6] (/ˈæməzɒn/), is an American multinational technology company based in Seattle,
Washington that focuses on e-commerce, cloud computing, digital streaming, and artificial intelligence.
It is considered one of the Big Four technology companies along with Google, Apple, and Facebook.[7][8][9]
Amazon is known for its disruption of well-established industries through technological innovation and mass
scale.[10][11][12] It is the world's largest e-commerce marketplace, AI assistant provider, and cloud computing
platform[13] as measured by revenue and market capitalization.[14] Amazon is the largest Internet company by
revenue in the world.[15] It is the second largest private employer in the United States[16] and one of the world's
most valuable companies. Amazon is the second largest technology company by revenue. Amazon was founded by Jeff Bezos
on July 5, 1994, in Bellevue, Washington. The company initially started as an online marketplace for books but later
expanded to sell electronics, software, video games, apparel, furniture, food, toys, and jewelry. In 2015, Amazon
surpassed Walmart as the most valuable retailer in the United States by market capitalization.[17] In 2017, Amazon
acquired Whole Foods Market for $13.4 billion, which vastly increased Amazon's presence as a brick-and-mortar
retailer.[18] In 2018, Bezos announced that its two-day delivery service, Amazon Prime, had surpassed 100 million
subscribers worldwide
"""
Finally, we'll query the data with the predict_qa method.
For our example, we'll run inference on Transformers' DistilBERT model, which was fine-tuned on the SQuAD dataset:
results = qa_model.predict_qa(
    query="What does Amazon do?",
    context=context,
    n_best_size=5,
    mini_batch_size=1,
    model_name_or_path="distilbert-base-uncased-distilled-squad"
)
And we can take a peek at the results:
results
results['best_answers']
We can also pass in multiple questions at once:
questions = ["What does Amazon do?",
             "What happened July 5, 1994?",
             "How much did Amazon acquire Whole Foods for?"]
Just make sure to pass in your context once for each question:
results = qa_model.predict_qa(
    query=questions,
    context=[context]*3,
    mini_batch_size=1,
    model_name_or_path="distilbert-base-uncased-distilled-squad"
)
Our new results:
results['best_answers']
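The call above repeats the context by hand so that each question gets its own copy. As a rough sketch of that pairing, the helper below builds one (question, context) pair per question and fails early if the counts do not line up. Note that pair_queries_with_contexts is a hypothetical function written for this illustration and is not part of AdaptNLP.

```python
from typing import List, Sequence, Tuple, Union


def pair_queries_with_contexts(
    queries: Sequence[str],
    contexts: Union[str, Sequence[str]],
) -> List[Tuple[str, str]]:
    """Pair each query with a context (hypothetical helper, not an AdaptNLP API).

    A single context string is reused for every query; a sequence of contexts
    must contain exactly one entry per query.
    """
    if isinstance(contexts, str):
        contexts = [contexts] * len(queries)
    if len(contexts) != len(queries):
        raise ValueError(
            f"Got {len(queries)} queries but {len(contexts)} contexts"
        )
    return list(zip(queries, contexts))


# Short stand-in context for the example.
example_context = "Amazon was founded by Jeff Bezos on July 5, 1994."
example_questions = ["Who founded Amazon?", "When was Amazon founded?"]

pairs = pair_queries_with_contexts(example_questions, example_context)
for question, ctx in pairs:
    print(question, "->", ctx[:20])
```

Writing [context]*len(questions) instead of a hard-coded 3 keeps the two lists in sync if the list of questions changes later.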
If we want more information, we can request a higher DetailLevel (you can also just use the strings low, medium, and high).
This will instead return a dictionary of various items to look at. By default, our earlier results used DetailLevel.Low.
from adaptnlp import DetailLevel
results = qa_model.predict_qa(
    query="What does Amazon do?",
    context=context,
    model_name_or_path="distilbert-base-uncased-distilled-squad",
    detail_level=DetailLevel.Medium
)
results
As we can see, the medium detail level returns not only our queries and answers, but also a pairing of each question with its top answers and their softmax'd probabilities.
Along with this, it returns the context that was passed in with the question.
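For readers unfamiliar with the term, "softmax'd" means the raw scores of the candidate answers have been normalized into probabilities that sum to 1. The sketch below shows that normalization on made-up scores; it assumes the softmax is taken over the candidate answers for a single question, and it is not taken from the library's source.

```python
import math
from typing import List


def softmax(scores: List[float]) -> List[float]:
    """Convert raw scores into probabilities that sum to 1."""
    # Subtract the maximum first for numerical stability.
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Made-up raw scores for three candidate answers to one question.
raw_scores = [6.5, 4.0, 3.8]
probabilities = softmax(raw_scores)

for raw, prob in zip(raw_scores, probabilities):
    print(f"score={raw:.1f} probability={prob:.3f}")
```

Because the probabilities are relative to the other candidates in the list, a high value means "best among these candidates" rather than an absolute measure of correctness.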
And now let's look at the highest detail level:
results = qa_model.predict_qa(
    query="What does Amazon do?",
    context=context,
    model_name_or_path="distilbert-base-uncased-distilled-squad",
    detail_level='high'
)
results
The DetailLevel.High option will also return the squad_example result, as well as the original n_best_json with detailed information about each predicted option.