The AI landscape: ChatGPT, Gemini, Claude and the rest

Asier Anitua

Reading time: 4 min

What’s really going on with generative artificial intelligence

Some weeks it feels like a new AI is launched every three days. And they’re all “the best in the world” according to their own press releases. The result feels chaotic, but it is actually quite easy to make sense of once you understand how the pieces fit together.

I’m going to try to explain this as I would to someone who’s been hearing about AI for a couple of months but hasn’t had time to get fully up to speed. No jargon. With examples that make sense in real life.

Why there are so many different AIs

It all began when OpenAI launched ChatGPT in November 2022 and it became the fastest-growing app in history: a million users in five days. That kicked off a race that hasn’t stopped.

Google had to step up a gear. Meta decided to go for open models that anyone could use and modify. Anthropic, founded by former OpenAI executives, went its own way with Claude. Alibaba launched Qwen from Asia. A French start-up called Mistral became Europe’s great hope. And since then, the list has continued to grow.

Each company has made different decisions about what to optimise for: some prioritise general conversation, others the analysis of long documents, others integration with work tools, and others privacy and control. The result is an ecosystem of complementary pieces, not copies of the same thing.

It is as if a whole sector of specialised tools had suddenly blossomed where previously there was only a universal hammer.

Why some seem ‘smarter’ than others

The technical rankings circulating online measure things like GPQA (postgraduate-level reasoning), HumanEval (code generation) or SWE-bench (engineering problem-solving). As of March 2026, Gemini 3 Pro leads in some multimodal reasoning rankings, Claude Opus 4.6 dominates in programming with an 80% score on SWE-bench, and GPT-5.4 maintains the lead in versatility and ecosystem.

But in everyday use, these differences are rarely the deciding factor. What makes the real difference isn’t who answers a quantum physics question best, but who integrates best with the way you work.

That’s why many people use ChatGPT, even though in a specific benchmark Claude or Gemini might score higher. And that’s why Copilot has value in a business setting not because it’s the ‘smartest model’, but because it lives within Word, Excel and Teams.

What does an AI’s ‘memory’ really mean?

This concept causes quite a bit of confusion and is worth clarifying, because people constantly mix up two different things.

The first is the context window: how much text the AI can process at once in a single conversation. Gemini has a notable technical advantage with a capacity of up to 2 million tokens, which is equivalent to being able to read several complete books at once. Claude works with 200,000 tokens and GPT-5.4 exceeds one million. This matters when you feed the AI very long documents or very extensive conversations.
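To make the numbers above concrete: token counts depend on each model’s tokenizer, but a common rule of thumb for English text is roughly four characters per token. The function names and the four-characters heuristic below are illustrative assumptions, not any vendor’s API; this is just a back-of-the-envelope sketch of whether a document fits a given context window.

```python
# Rough illustration only: real token counts depend on the model's
# tokenizer, but ~4 characters per token is a common English heuristic.
def estimate_tokens(text: str) -> int:
    """Very rough token estimate (actual counts vary by model)."""
    return max(1, len(text) // 4)

def fits_in_window(text: str, window_tokens: int) -> bool:
    """Could this text plausibly fit in a model's context window?"""
    return estimate_tokens(text) <= window_tokens

# A 300-page book is roughly 600,000 characters, i.e. ~150,000 tokens:
# it fits comfortably in a 200,000-token window, but not in 100,000.
book = "x" * 600_000
print(fits_in_window(book, 200_000))  # True
print(fits_in_window(book, 100_000))  # False
```

This is why the window sizes quoted above (200,000 vs. one or two million tokens) translate into “one long book” versus “several books at once”.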

The second is persistent memory between sessions: whether the AI remembers who you are, your preferences and your usual projects from one conversation to the next. Here, ChatGPT has the most developed implementation for personal use. Copilot and Gemini in a work environment, on the other hand, remember the context of your tools and documents better than who you are as a person.

Put simply: if you tell ChatGPT that you prefer short, jargon-free answers, it will probably remember this next week. Copilot, on the other hand, will remember that you have a meeting with the sales team on Tuesday because that’s in your calendar.

Why some are better at conversation and others at research

There is an important distinction that many people fail to make: a conversational assistant is not the same as an AI search tool.

ChatGPT, Gemini and Claude are primarily conversational assistants. Their knowledge comes from training and may be outdated. Perplexity, on the other hand, was created as a smart search engine: every answer it gives is backed by real-time internet sources that you can verify.

NotebookLM takes yet another approach: it uses neither the internet nor its own training knowledge, only the documents you provide. It is the difference between asking a general assistant a question and asking someone who has read all your papers before answering.
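NotebookLM’s internals aren’t public, but the “answers only from your documents” idea can be sketched as a toy retriever that scores each document against the question and refuses to answer when nothing matches. Everything here, from the function name to the keyword scoring and the fallback message, is a hypothetical illustration, not how any real product works.

```python
# Toy sketch of document-grounded answering: pick the supplied document
# that best overlaps with the question, or decline to answer at all.
def answer_from_docs(question: str, docs: list[str]) -> str:
    q_words = set(question.lower().split())
    best, best_score = None, 0
    for doc in docs:
        score = len(q_words & set(doc.lower().split()))
        if score > best_score:
            best, best_score = doc, score
    if best is None:
        return "Not covered by your documents."
    # A real system would synthesise an answer and cite the source;
    # here we simply return the matching passage.
    return best

docs = ["The Q3 budget was approved in September.",
        "Hiring freezes continue until January."]
print(answer_from_docs("When was the budget approved?", docs))
```

The key design choice is the fallback: unlike a conversational assistant, which will answer from its training either way, a documents-only tool says nothing when your sources say nothing.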

That difference in architecture explains why the same question yields very different results depending on where you ask it. And why it makes sense to use different tools for different things.
