Goal: Given a user question about a product, extract relevant documents and answer the question.
Model: MiniLM (deepset/minilm-uncased-squad2)
Dataset: SubjQA question & answer pairs
Steps:
Perform QA for (query, context)
Perform extractive QA for (query) only
Set up a Retriever that selects a few contexts from the entire dataset
Set up a Reader that answers the query based on the selected contexts
Example: Extractive QA for an e-commerce website
DATASET: SubjQA electronics
Train examples: 1,295
Validation examples: 255
Test examples: 358
Example Context:
I really like this keyboard. I give it 4 stars because it doesn’t have a CAPS LOCK key so I never know if my caps are on. But for the price, it really suffices as a wireless keyboard. I have very large hands and this keyboard is compact, but I have no complaints.
Example Question:
Is the keyboard lightweight?
Example Answer:
this keyboard is compact
MODEL
BERT architecture
33M params: 12-layers, 384-hidden, 12-heads, 1536-d_ff
2.7x faster than BERT
Fine-tuned on SQuAD 2.0
pre-trained model: MiniLM
"microsoft/MiniLM-L12-H384-uncased"
Span classification
Common QA frameworks:
Span classification : predict start & end position within the context.
Free text generation
Multiple-Choice QA
Fill in the blanks of an answer template with the correct words
Yes/No QA
Span classification QA framework:
Example : The Wright brothers flew the motor-operated airplane on December 17, 1903. Their aircraft, the W-Flyer, used ailerons for control and had a 12-horsepower engine.
It works well!
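Under the hood, span classification produces a start logit and an end logit for every token; a minimal sketch, assuming the same deepset/minilm-uncased-squad2 checkpoint as above:

```python
import torch
from transformers import AutoTokenizer, AutoModelForQuestionAnswering

model_id = "deepset/minilm-uncased-squad2"  # assumed: same checkpoint as above
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForQuestionAnswering.from_pretrained(model_id)

question = "When did the Wright brothers fly?"
context = ("The Wright brothers flew the motor-operated airplane on December 17, 1903. "
           "Their aircraft, the W-Flyer, used ailerons for control and had a 12-horsepower engine.")

inputs = tokenizer(question, context, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Naive decoding: most likely start and end positions (production code also
# checks start <= end and masks out the question tokens)
start = int(outputs.start_logits.argmax())
end = int(outputs.end_logits.argmax())
print(tokenizer.decode(inputs["input_ids"][0][start : end + 1]))  # e.g. "december 17, 1903"
```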
However, in real life we usually only have the question 🤔 So we need to find the relevant passages in the entire corpus.
The simplest way: concatenate all reviews into one huge context. But this would have unacceptable latency ☹️ A smarter way is:
Retriever-Reader architecture
Retriever: embed the contexts and select the ones with the highest dot-product similarity to the query
sparse retriever: high-dim sparse vector (ex. Bag Of Words, TF-IDF, BM25)
dense retriever: low-dim dense vector (ex. BERT, RoBERTa); see the sketch after this list
Reader: extract the answer from the best documents provided by the retriever
Document store: document database provided to the retriever at query time
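A minimal sketch of the dense-retriever idea with sentence-transformers; the encoder name and the toy reviews are illustrative, not from the notes:

```python
from sentence_transformers import SentenceTransformer, util

# Illustrative dense encoder; any QA-oriented embedding model works similarly
encoder = SentenceTransformer("multi-qa-MiniLM-L6-cos-v1")

reviews = [  # toy contexts standing in for the review corpus
    "The cord is about six feet long, plenty for my desk.",
    "Battery life is great, I charge it once a month.",
    "The keys feel mushy after a few weeks of use.",
]
query = "What is the length of the cord?"

doc_emb = encoder.encode(reviews, convert_to_tensor=True)
query_emb = encoder.encode(query, convert_to_tensor=True)

# Rank contexts by dot-product similarity to the query
scores = util.dot_score(query_emb, doc_emb)[0]
best = int(scores.argmax())
print(reviews[best], float(scores[best]))
```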
Set up Retriever
We use a BM25 retriever (an improved variant of TF-IDF).
Let's query "What is the length of the cord?"
The retriever manages to surface related reviews where a potential answer might be found (see the table)
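A minimal BM25 sketch with the rank_bm25 package; the toy reviews are illustrative placeholders, the real setup indexes the SubjQA reviews in a document store:

```python
from rank_bm25 import BM25Okapi

# Toy reviews standing in for the SubjQA electronics corpus
reviews = [
    "The cord is about six feet long, plenty for my desk.",
    "Battery life is great, I charge it once a month.",
    "The keys feel mushy after a few weeks of use.",
]
bm25 = BM25Okapi([r.lower().split() for r in reviews])

query = "What is the length of the cord?"
top_docs = bm25.get_top_n(query.lower().split(), reviews, n=2)
print(top_docs)  # the cord-related review should rank first
```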
Set up Reader
The Reader is basically the framework's abstraction around the model we already played with (MiniLM)
Example : The Wright brothers flew the motor-operated airplane on December 17, 1903. Their aircraft, the W-Flyer, used ailerons for control and had a 12-horsepower engine.
Extractive QA
Finally, we achieved our Goal!
Product: Amazon Kindle e-reader (code: B0074BW614)
Query : "Is it good for reading?"
Retriever: BM25
Document dataset: SubjQA
Reader: MiniLM (BERT-based) model
Top 3 answers for the query "Is it good for reading?" and the Kindle e-reader product
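A minimal retrieve-then-read sketch putting both pieces together; the Kindle reviews below are illustrative placeholders, not the SubjQA data:

```python
from rank_bm25 import BM25Okapi
from transformers import pipeline

# Illustrative Kindle reviews; in the notes these come from SubjQA (product B0074BW614)
reviews = [
    "I love reading on this Kindle, the screen is easy on the eyes.",
    "Battery lasts weeks, but the case I bought separately is flimsy.",
    "Great for reading at night thanks to the built-in light.",
]

retriever = BM25Okapi([r.lower().split() for r in reviews])
reader = pipeline("question-answering", model="deepset/minilm-uncased-squad2")

query = "Is it good for reading?"
top_contexts = retriever.get_top_n(query.lower().split(), reviews, n=3)

# Extract an answer from each retrieved review and rank by reader confidence
answers = [reader(question=query, context=ctx) for ctx in top_contexts]
for ans in sorted(answers, key=lambda a: a["score"], reverse=True)[:3]:
    print(f"{ans['score']:.3f}  {ans['answer']}")
```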
Evaluate Retriever
The answer quality mainly comes from the neural Reader (BERT-based), but the Retriever sets the upper bound by providing the relevant documents.
Evaluation:
retrieve top_k documents
check if the answer is present in each retrieved document
compute recall (see the sketch after this list)
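A minimal sketch of this recall@k computation; the retrieve function and the (question, answer) pairs are placeholders:

```python
def recall_at_k(examples, retrieve, k=3):
    """examples: (question, gold_answer) pairs; retrieve: question -> ranked docs."""
    hits = 0
    for question, gold_answer in examples:
        top_docs = retrieve(question)[:k]
        # The retriever "hits" if any retrieved document contains the gold answer span
        if any(gold_answer.lower() in doc.lower() for doc in top_docs):
            hits += 1
    return hits / len(examples)

# Illustrative usage with the BM25 retriever sketched earlier:
# recall = recall_at_k(qa_pairs, lambda q: bm25.get_top_n(q.lower().split(), reviews, n=10), k=3)
```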
Result:
The optimal choice of top_k is around k=3
Sparse (BM25) vs Dense (DPR) retriever evaluation
Evaluate Reader
Extractive QA has two standard reader metrics: Exact Match (EM) and F1.
Reporting both gives a representative, balanced score.
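A simplified sketch of how EM and F1 are computed per example (without the full SQuAD answer normalization):

```python
from collections import Counter

def exact_match(prediction: str, reference: str) -> int:
    # 1 if the (lightly normalized) strings are identical, else 0
    return int(prediction.strip().lower() == reference.strip().lower())

def f1_score(prediction: str, reference: str) -> float:
    # Token-level overlap between prediction and reference
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)

print(exact_match("this keyboard is compact", "compact"))         # 0
print(round(f1_score("this keyboard is compact", "compact"), 2))  # 0.4
```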
Domain Adaptation
EM and F1 scores on the SubjQA dataset for 3 models.
base model: MiniLM-L12-H384-uncased
MiniLM (FT on SQuAD only) generalizes poorly to QA on SubjQA
MiniLM (FT on SQuAD, then FT on SubjQA) adapts well to SubjQA
MiniLM (FT on SubjQA only) overfits due to the small SubjQA dataset
Evaluate QA Pipeline
Let's compare the Reader alone vs the Retriever & Reader pipeline.
We can see the Retriever's impact on overall performance.
Reader-only vs Retriever & Reader scores on SubjQA
Generative QA
We implemented only extractive QA, which tries to find the answer's start and end tokens in the context (token classification).
Generative QA can synthesize the answer from parts scattered across the entire context.
Retrieval-Augmented Generation (RAG):
Reader is now a generator (encoder-decoder like T5 or BART)
The generator receives the query together with the retrieved documents as input; the documents are treated as latent variables
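A minimal sketch of generative QA with the RAG classes in transformers; the facebook/rag-token-nq checkpoint and the dummy index are illustrative choices, not the notes' setup:

```python
from transformers import RagTokenizer, RagRetriever, RagTokenForGeneration

model_id = "facebook/rag-token-nq"
tokenizer = RagTokenizer.from_pretrained(model_id)
# use_dummy_dataset avoids downloading the full Wikipedia index; answers will be rough
retriever = RagRetriever.from_pretrained(model_id, index_name="exact", use_dummy_dataset=True)
model = RagTokenForGeneration.from_pretrained(model_id, retriever=retriever)

inputs = tokenizer("Is the Kindle good for reading?", return_tensors="pt")
generated = model.generate(input_ids=inputs["input_ids"])
print(tokenizer.batch_decode(generated, skip_special_tokens=True))
```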
Conclusion
Pre-trained & fine-tuned models for QA might still lack of generalization. Domain adaptation can boost the performance.