Idea: Provide additional spatial information to embedding vectors
Embedding vectors represent the semantic meaning of words, encapsulating aspects like syntax and context within a given corpus.
Positional encodings, on the other hand, provide spatial (positional) information, indicating where each word sits in the sequence.
When you add these positional encodings to the embeddings, you're essentially enriching the embeddings with information about word order.
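For concreteness, a minimal NumPy sketch of this addition, assuming the sinusoidal encoding scheme from the original Transformer paper; the token embeddings here are random stand-ins for learned ones:

```python
import numpy as np

def sinusoidal_positional_encoding(seq_len: int, d_model: int) -> np.ndarray:
    """Return a (seq_len, d_model) matrix of sinusoidal positional encodings."""
    positions = np.arange(seq_len)[:, np.newaxis]           # (seq_len, 1)
    dims = np.arange(0, d_model, 2)[np.newaxis, :]          # (1, d_model / 2)
    angle_rates = 1.0 / np.power(10000.0, dims / d_model)   # one frequency per dimension pair
    angles = positions * angle_rates                        # (seq_len, d_model / 2)

    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angles)   # even dimensions: sine
    pe[:, 1::2] = np.cos(angles)   # odd dimensions: cosine
    return pe

# Toy embeddings for a 6-token sequence with d_model = 16 (random stand-ins).
seq_len, d_model = 6, 16
rng = np.random.default_rng(0)
token_embeddings = rng.normal(size=(seq_len, d_model))

# Enrich the embeddings with word-order information by simple element-wise addition.
enriched = token_embeddings + sinusoidal_positional_encoding(seq_len, d_model)
```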
The scales of positional encodings and word embeddings are typically designed to be compatible (a quick numeric check follows this list):
not so large that the positional signal drowns out the semantic content of the embeddings
not so small that it becomes insignificant
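A rough sketch of this scale compatibility, reusing the function and variables from the snippet above: sinusoidal encodings are bounded in [-1, 1] by construction, and in the original Transformer the embedding weights are multiplied by sqrt(d_model) before the addition, which keeps the semantic signal dominant.

```python
# Rough scale comparison, reusing sinusoidal_positional_encoding from above.
pe = sinusoidal_positional_encoding(seq_len, d_model)

# Sinusoidal encodings are bounded in [-1, 1] by construction.
print("PE value range:        ", pe.min(), pe.max())

# In the original Transformer, embeddings are scaled by sqrt(d_model)
# before the addition, which raises their typical magnitude above the PE's.
scaled_embeddings = token_embeddings * np.sqrt(d_model)
print("embedding mean L2 norm:", np.linalg.norm(scaled_embeddings, axis=-1).mean())
print("PE mean L2 norm:       ", np.linalg.norm(pe, axis=-1).mean())
```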
When embeddings and positional encodings are added, the dominant signal remains the semantic content of the embeddings, with positional information providing a subtle yet important secondary influence.
During training, the model learns to extract and use the positional information along with the semantic content.