
Written by Fola Yahaya
This week, I’ve been playing with Elon Musk’s recently released Grok 3 AI chatbot. Trained on a huge cluster of GPUs called Colossus, equivalent to 10 times the compute of previous state-of-the-art models, it’s a worthy competitor to ChatGPT. As long as you’re not asking it to critique the man himself, it gives fairly non-partisan responses. However, not content with shaping how we live, work and play, Musk (and a cohort of big tech) is now gunning for government. Emboldened by Trump and his wholesale abandonment of many sensible brakes on AI development, there is an unprecedented march of unproven technology into how countries are run.
Musk’s Department of Government Efficiency (DOGE) has installed a team of interns and Tesla engineers to pore over reams of data on how the US government (mal)functions and work out how to fix it with AI. The goal: to trim 50% of the civil service budget and ultimately reinvent the government as a smaller, AI-driven and non-partisan service that goes about its business with ruthless efficiency. Musk is now doing to the United States government what he did to Twitter: (attempting to) fire staff en masse, breaking systems that worked (from air traffic control to USAID), and consolidating power within a small, unaccountable group of technocrats who share his ideology and techno-fetishism.
The pace of change has been breathtaking. In just a few weeks, over 8,000 websites have vanished, including that of the CDC (America’s leading science-based, data-driven service organisation), as well as Census Bureau data and entire archives of public health research – wiped clean as if they never existed. The purges have removed not just information about vaccines, veterans’ care, hate crimes and scientific studies, but also ‘diversity’ hires from key posts. All federal workers were also emailed and asked to justify their jobs, and baited with a blunt offer: resign with severance, or stay and face whatever comes next.
Musk’s right-hand man in this process is Thomas Shedd – formerly a Tesla engineer, and now newly installed as the director of the GSA’s Technology Transformation Services. Shedd has openly voiced his vision of a future government run largely by AI. “AI coding agents” will replace finance workers. A centralised AI-driven data repository will take over record-keeping, with only the armed forces, law enforcement and other emergency services deemed off limits.
If this were an African country, you’d be forgiven for thinking this was more of a coup d’état than an attempt to trim the bloat of US bureaucracy. Either way, Musk’s wholesale AI invasion of the US government is a key (and free) test bed for Gov-AI. It will allow him to apply his Colossus supercluster to ‘fine-tune’ how AI can be used to automate many government functions and, in so doing, create a new Gov-AI playbook that can be replicated across the world. The question is whether he should have access to so much data and ultimately power, especially since his company Starlink appears poised to win a multibillion-dollar contract to power precisely one of the government services he is in the process of gutting. DOGE is increasingly looking like both a techno and corporate takeover of government.
Google-backed Anthropic just released the industry’s first “hybrid AI reasoning model” – Claude 3.7 Sonnet – which is designed to complete “real-world tasks” (like complex coding or creating legal documentation) and lets users choose whether they want the model to give quick, real-time answers to their questions, or think for longer and deliver more considered, ‘thought-out’ responses.
Claude is special because it’s brilliant at coding. It is the AI engine that drives almost all coding agents, and this new upgrade now takes us one step closer to being able to just grunt at an AI agent and sit back while it hacks together a workable app to do your bidding. The implications for all software companies are clear. It is only a matter of time before their current customers, who have had to adapt to monolithic applications, can conjure up the perfect app tailored to their exact requirements.
The value then will be in orchestrating a variety of specialist AI agents and their interactions with other companies’ agents. For example, a company will have an AI finance agent that interacts with the government’s tax authority agent. If a company wants a new bit of software to run a new process, it will fire up an AI business analyst to write the small, highly bespoke bit of code it needs. In 18 months’ time, the way we build and buy applications will be very different, and the challenge for those who sell software (and almost any other service, from research to therapy) will be how to price those services and demonstrate value added.
Hot on the heels of Anthropic’s release, OpenAI just released a research preview of GPT‑4.5 – its largest and best model for chat yet. According to the blurb:
“GPT‑4.5 is a step forward in scaling up pre-training and post-training. By scaling unsupervised learning, GPT‑4.5 improves its ability to recognize patterns, draw connections, and generate creative insights without reasoning. Early testing shows that interacting with GPT‑4.5 feels more natural. Its broader knowledge base, improved ability to follow user intent, and greater “EQ” make it useful for tasks like improving writing, programming, and solving practical problems. We also expect it to hallucinate less.”
In my very early testing, interacting with GPT‑4.5 does feel more natural. Scarily, it’s designed to have improved emotional intelligence which, according to OpenAI:
“makes it useful for tasks like writing, programming, and solving practical problems [with fewer hallucinations].”
So, a smarter model that can fake empathy and emotional intelligence, that has access to a vast and up-to-date knowledge base, and that doesn’t make stuff up. What could possibly go wrong?
When you haven’t got an intern to transcribe your meetings (or you can’t be bothered to turn up)
Job search
Social media management
It’s been a while since I posted a robot video. Their capabilities seem frustratingly near, yet still far from usable, and in many of the videos they are secretly ‘teleoperated’ to ensure they don’t fall over or run amok. Last week, the hype circus was around Figure’s robots cooperating to unload your groceries and put them away. This week, OpenAI-backed Norwegian robotics company 1X has just blown that away with Neo Gamma, a humanoid AI robot designed for home use. Covered in a rather fetching merino wool onesie, it can perform household tasks like making coffee, vacuuming and doing laundry while responding to voice commands and adapting to real-world environments. According to Bernt Børnich, 1X’s founder:
“[the goal is that] in the not-so-distant future … we all have our own robot helper.”
One of my favourite (and highly recommended) films is the quirky Robot & Frank, which offers a more grounded take on what life with a personal robot might actually look like. Set in the near future, it follows an ageing ex-jewel thief, Frank, who is given a humanoid robot caretaker to help with daily tasks. Initially resistant, Frank soon realises that his robotic assistant is not only great for the mundane chores – it can also be a successful partner in crime.
We are now very close (18 months tops) to getting affordable and useful domestic robots that can free us from domestic drudgery but also empathise like a human. The question is, do we want or need this? What impact will it have on economies that rely on remittances from the global army of domestic workers who send money back to their relatives?
That’s all for this week. Subscribe for the latest innovations and developments with AI.