
attention

Concept

Fact-checked May 20, 2026

Also called: attention mechanism, self-attention

Attention is a mechanism in neural networks, central to the Transformer architecture, that lets a model weigh which parts of its input matter most for the task at hand.

In AI, 'attention' refers to a mechanism that lets a neural network weigh the importance of different parts of its input data. When you read a long article, you pick out the sentences and words most relevant to the main idea; attention does something similar for a model. It assigns each piece of the input, such as a word in a sentence or a region of an image, a weight that reflects how relevant it is to what the model is currently processing.
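To make the idea of "weighing" concrete, here is a minimal sketch in plain Python. It assumes the relevance scores are already given (the numbers are made up for illustration), and it uses a softmax to turn them into weights, which is the usual choice but not the only possible one. The names `softmax` and `attend` are illustrative, not taken from any library.

```python
import math

def softmax(scores):
    """Turn raw relevance scores into positive weights that sum to 1."""
    peak = max(scores)  # subtracting the max keeps exp() numerically stable
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(scores, values):
    """Return the attention weights and the weighted average of the values."""
    weights = softmax(scores)
    output = sum(w * v for w, v in zip(weights, values))
    return weights, output

# Hypothetical relevance scores for four input items (assumed, not learned).
scores = [2.0, 0.5, 0.1, 1.0]
# One number per item, standing in for the information that item carries.
values = [10.0, 20.0, 30.0, 40.0]

weights, output = attend(scores, values)
print(weights)  # the first item gets the largest weight
print(output)   # a blend of the values, pulled toward the highly weighted items
```

The item with the highest score contributes most to the result, but every item still contributes a little. That soft weighting is what distinguishes attention from picking a single input.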

This concept became widely influential with the introduction of the Transformer architecture, which powers many of today's large language models. Instead of processing input one step at a time, a Transformer uses attention to look at all parts of the input at once and work out how they relate to each other. This is particularly useful for tasks like translation, where the meaning of a word can depend on words that appear much earlier or later in a sentence.
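The "every part looks at every other part" idea can be sketched as scaled dot-product self-attention, the form used in Transformers. This is a simplified illustration under stated assumptions: real models first pass the input through learned projections to get separate query, key, and value vectors, whereas this sketch reuses the raw input vectors for all three. The toy vectors and the function names are hypothetical.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def self_attention(queries, keys, values):
    """Scaled dot-product attention over a whole sequence.

    Every query is compared with every key, so each position can draw on
    any other position regardless of how far apart they are.
    """
    d_k = len(keys[0])
    outputs, all_weights = [], []
    for q in queries:
        # Similarity of this position to every position, scaled by sqrt(d_k).
        scores = [dot(q, k) / math.sqrt(d_k) for k in keys]
        weights = softmax(scores)
        # Weighted average of the value vectors, one dimension at a time.
        out = [
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ]
        outputs.append(out)
        all_weights.append(weights)
    return outputs, all_weights

# Toy 2-dimensional vectors for five positions (assumed for illustration).
# Positions 0 and 4 are similar to each other; the middle three differ.
tokens = [[1.0, 0.0], [0.0, 1.0], [0.0, 1.0], [0.0, 1.0], [1.0, 0.0]]

# Assumption: no learned projections, so queries = keys = values = tokens.
outputs, attn = self_attention(tokens, tokens, tokens)
print(attn[0])  # how strongly position 0 attends to each position
```

In this toy input, position 0 attends to position 4 as strongly as it attends to itself, even though three unrelated positions sit between them. The score depends on how similar two vectors are, not on their distance in the sequence.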

By dynamically adjusting their 'focus', attention mechanisms allow AI models to handle complex relationships within data more effectively. They help models capture long-range dependencies, meaning the models can connect information that is far apart in the input, which tends to improve results on tasks that depend on distant context.
