Healthy AI competition, Humanity's Last Exam and it's official... ChatGPT is making us stupid

Written by Fola Yahaya

Thought of the week: DeepSeek and why we should let 100 flowers bloom

Early this week, housebound due to a 9-year-old suffering from man flu, I had the luxury of a 30-minute lie-in. I’m a great fan of Jeff Bezos’ ‘no digital for the hour after waking’ rule, so detox done, I fired up my phone only to be greeted by a barrage of pings about DeepSeek this and DeepSeek that. What a massive storm in an AI teacup! Accompanying the stock market plunge was much needless hand-wringing from everyone who had thought US dominance and AI profiteering was a one-way bet. The geopolitics of the reaction is fascinating.

In the US, which is doubling down on AI as its principal tool of economic warfare, DeepSeek’s fairly innocuous achievements are seen as an affront to its (deeply anti-competitive) campaign to ensure everyone uses (and eventually pays for) a US-flavoured AI. Worse still were the ‘pot calling the kettle black’ claims by OpenAI that the developers of DeepSeek had used ChatGPT to train it! So OpenAI, having been sued by almost every major media outfit for data theft, now has the temerity to complain when it suffers a similar fate. However, for the rest of the world (and US users), the reaction to the emergence of a worthy ChatGPT competitor has largely been “Where can I download it from?”

I started writing admiringly about DeepSeek’s achievements in the face of crippling US chip sanctions around four months ago. Despite being free, it’s a surprisingly capable model that I use daily when I triangulate tasks between ChatGPT and Claude. Its release of a free reasoning model (R1) , i.e. a chatbot that thinks (and shows you how it reasons) before responding, is significant but not unexpected. What is unexpected though, is that they have open-sourced how they did it, rather than taking the OpenAI approach of charging users $200/month for it.

If we all take a collective deep breath, step back and engage our brains, it’s clear that not only was it expected that a Chinese company would innovate and catch up to the best of US tech but, as I wrote last week, it’s fundamentally a good thing that should be lauded and not pilloried. AI should be a global public good. A global brain that everyone has access to and at minimal cost. To inappropriately paraphrase Mao Zedong, we should let a million AI systems bloom. This will encourage more energy-efficient and useful AI that helps us all.

P.S. There is already a better reasoning model than DeepSeek which I’ll be discussing next week.

Humanity’s Last Exam: how to test whether AI models are really smarter than us

A few weeks ago I had the pleasure of taking a bunch of teenagers out of London to spend some time hacking their GCSE revision process. The kids had done poorly in their mock exams, their school didn’t seem overly concerned, so we panicked and organised a weekend of exam hacking. By ‘hacking’ I mean helping the kids reverse-engineer their revision process in order to maximise their final marks. A key component of this process is to start with understanding how the point scoring system works. Each exam board has to a) explain how they assess papers and b) provide an in-depth marking scheme to accompany all past papers.

A similar process is used for evaluating AI models but, given the trillions of dollars at stake, it should come as no surprise that most AI developers also ‘hack’ these evaluation processes. In fact, OpenAI recently got into trouble for funding one of the principal AI benchmarks FrontierMath, so the creation of a largely independent new test of AI smarts is very welcome.

Called Humanity’s Last Exam, it is a crowdsourced bunch of over 3,000 questions across over 100 subjects that only the most academic of humans would be able to answer. The good news is that all the current models do poorly, with the best of the bunch, new kid on the block DeepSeek only scoring a measly 9.4/100. Before you breathe a sigh of relief, it’s worth noting that the creators of the test argue we may soon lose the ability to create tests hard enough for AI models.

It’s official: AI is making us dumber

A new study from the SBS Swiss Business School seems to corroborate the obvious – our growing reliance on AI is eroding critical thinking skills, especially among younger generations. The culprit? Something called ‘cognitive offloading’.

Cognitive offloading is the tendency to delegate mental tasks to external sources, whether it’s a notebook, a calculator or, in this case, a chatbot. So if you’ve ever let Google remember a fact instead of storing it in your brain, you’re basically cognitive offloading. AI, and its soon to be ubiquitous foot soldiers (AI agents), takes this offloading to another level – handling everything from answering questions to even making decisions.

The Swiss study found a strong negative correlation between AI tool usage and critical thinking scores, i.e. the more people relied on AI for problem-solving, the less capable they were at critically evaluating information. The effect was most pronounced among young adults (aged 17–25), who showed the highest AI dependence and the lowest critical thinking performance. This suggests that AI reliance isn’t just an efficiency boost; it’s becoming a cognitive crutch.

But does this matter? If AI can diagnose diseases better than doctors, predict economic shifts more accurately than analysts or even write persuasive arguments faster than professional writers, what exactly are we preserving critical thinking for? If survival in an AI-driven world doesn’t require deep reasoning, will our resistance to cognitive offloading be anything more than nostalgia?

Of course it matters. Our unending ability to wage war, pick the wrong leaders and buy stuff we don’t need shows that we’re already at peak stupidity, so we cannot afford to let AI dumb us down even further. Ironically, I asked ChatGPT whether it thought this cognitive offloading mattered and its response was chilling:

For now, there’s still a place for human judgment. AI has its biases, its hallucinations, its moments of confident nonsense. But as the technology improves, we may reach a point where trusting the machine is just… smarter. And at that point, the death of critical thinking won’t be a tragedy—it’ll just be another step in human evolution.

How to run the latest and best AI models on your PC

I’m a great believer in leaning into technology waves and this newsletter is all about showing readers how to leverage AI strategically. The good news is that you no longer have to be a machine learning geek to play with AI models. With open-source models now reaching closed-source model quality, I highly recommend downloading a cool tool called LM Studio. It essentially lets you run your very own AI bot (compute on your PC permitting) on your desktop and possibly on your phone, offline.

To get started:

Download LM Studio.
In the search box, search for a popular model (at the moment Qwen 2.5 Max is the best reasoning model and outperforms DeepSeek).
Load this model into the system and then choose it from the drop-down menu at the top of the chat window.
Start chatting.

I tried one of the questions from the Humanity’s Last Exam dataset and got a plausible response, but if you recall Yahaya’s Law on inexpert AI use (If you think it’s great, it ain’t) I have no idea if Qwen got it right.

AI video of the week

Welcome to 2025

What we’re reading this week

AI as big as Manhattan: Meta will build a two-gigawatt data centre, “large enough to cover a significant part of Manhattan,” and will end the year with 1.3M AI chips to “unlock historic innovation, and extend American technology leadership.”
Researchers recreated DeepSeek’s core technology for just $30.
Nvidia’s CEO lays out his vision of what the next 10 years will look like — and his simple advice to young people.

Tools we’re playing with this week

LM Studio – A cool app that lets you download and play with any open-source LLM in total privacy.
Sora, OpenAI’s video generator: I finally managed to get access and hope to show my next magnus opus next week.
Suno – the original and still best AI music generator. V4 really produces excellent audio. I asked ChatGPT to create lyrics based on the moving poem “Not Waving But Drowning” and then asked it to generate an EDM/jazz/pop song. Wow!

That’s all for this week. Subscribe for the latest innovations and developments with AI.

So you don’t miss a thing, follow our Instagram and X pages for more creative content and insights into our work and what we do.

RECENT

POSTS

Is the world ready for government by Grok? OpenAI’s ChatGPT 4.5, Claude’s first hybrid thinking model, why we need the real-world benchmark, Neo Gamma

50 AI tools you should be using right now, plus 4 skills you need to thrive in an AI-driven future

What is the value of zero-cost research, will Qwen dethrone DeepSeek, and does anyone actually care?

Let's work together

Find us

Network Hub, 300 Kensal Road, London, W10 5BE, UK

Reach out

projects@strategicagenda.com

+44 (0) 208 9681299

Healthy AI competition, Humanity's Last Exam and it's official... ChatGPT is making us stupid

Thought of the week: DeepSeek and why we should let 100 flowers bloom

Humanity’s Last Exam: how to test whether AI models are really smarter than us

It’s official: AI is making us dumber

How to run the latest and best AI models on your PC

AI video of the week

What we’re reading this week

Tools we’re playing with this week

RECENT

POSTS

Is the world ready for government by Grok? OpenAI’s ChatGPT 4.5, Claude’s first hybrid thinking model, why we need the real-world benchmark, Neo Gamma

50 AI tools you should be using right now, plus 4 skills you need to thrive in an AI-driven future

What is the value of zero-cost research, will Qwen dethrone DeepSeek, and does anyone actually care?

Let's work together

Find us

Reach out

Get social

Stay in touch

Healthy AI competition, Humanity's Last Exam and it's official... ChatGPT is making us stupid

Thought of the week: DeepSeek and why we should let 100 flowers bloom

Humanity’s Last Exam: how to test whether AI models are really smarter than us

It’s official: AI is making us dumber

How to run the latest and best AI models on your PC

AI video of the week

What we’re reading this week

Tools we’re playing with this week

RECENT

POSTS

Is the world ready for government by Grok? OpenAI’s ChatGPT 4.5, Claude’s first hybrid thinking model, why we need the real-world benchmark, Neo Gamma

50 AI tools you should be using right now, plus 4 skills you need to thrive in an AI-driven future

What is the value of zero-cost research, will Qwen dethrone DeepSeek, and does anyone actually care?

Let's work together

Find us

Reach out

Get social

Stay in touch

Stay in the loop

Subscribe