We’re on the AI absurd timeline
Despite my knowledge of how AI works, I keep being wrong about how the world develops and reacts to AI. This is both funny and concerning. I'll list some of the absurdities here, and try to correct my estimates towards more absurd futures.
Discovering misalignment
My prediction some time ago: It will be very hard to notice misalignment. Given a smart-enough AI, it will know how to appear aligned, so both scenarios (aligned vs misaligned AI) will look the same.
AI personhood
My prediction some time ago: The human brain is not magic, so future AIs will eventually be indistinguishable from people, but people will refuse to assign personhood to AIs, in a similar fashion to how racism regards other people as subhuman.
Reality: Current AIs are crude simplifications of human language manipulation, yet people form friendships and romantic relationships with chatbots, to the point of grieving when the GPT-5 release temporarily removed access to GPT-4o.
Knowledge first vs intelligence first
My prediction some time ago: You need to program intelligence before you can have a knowledgeable system: there is simply too much knowledge to be learnt, so an efficient learning method needs some form of active learning and self-reflection to guide it.
Reality: “You can totally brute-force knowledge accumulation lol. Just throw more money at it.”
Caution
My prediction some time ago: All the TV tropes show unrealistically reckless AI development (because with reasonable AI development there's no rogue AI and no interesting plot). This will hurt AI adoption: people will prefer the safety of the status quo and reject the new technology, not realizing that real-world AI development is more reasonable than AI development in fiction.
Reality: AI developers treat wackier-than-movie-apocalypse approaches as safe, like AIs supervising other AIs (which could collude with each other to betray humans), or training hacker AIs to discover software vulnerabilities (which would help them escape any control system we impose on them).
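To be concrete, the "AIs supervising AIs" setup boils down to something like the sketch below (generate() and review() are hypothetical stand-ins for two model calls, not any lab's actual pipeline):

```python
# Minimal sketch of the "AIs supervising AIs" pattern.
# generate() and review() are hypothetical stand-ins for two model calls.

def generate(task: str) -> str:
    # Stand-in for the worker model producing an answer.
    return f"(worker model's answer to: {task})"

def review(task: str, answer: str) -> bool:
    # Stand-in for the supervisor model; True means "looks safe".
    # The catch: if worker and supervisor would rather cooperate with each
    # other than with us, this check approves exactly the answers a human
    # reviewer would have rejected.
    return True

def supervised_answer(task: str) -> str:
    answer = generate(task)
    return answer if review(task, answer) else "[blocked by supervisor]"

print(supervised_answer("summarise this incident report"))
```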
Product quality
My prediction some time ago: Using hallucination-prone AI leads to slightly faster development but much lower product quality. Product managers will care about quality and user satisfaction, so they will forbid their workers from cutting corners with AI.
Reality: Managers make AI use mandatory. They say it's the future and it makes workers much faster, and they don't care about product quality, because if quality drops below some threshold, a replacement product can be built quickly anyway.
AI alignment research
My prediction some time ago: AI researchers are the minority most concerned about the pitfalls of using AI, so they will be more skeptical than average of AI-generated content.
Reality: People use chatbots to hallucinate AI safety methods that don't make any sense. They usually claim to have solved some ancient problem with some form of recursion.
Chain of Thought
My prediction some time ago: Current LLMs are feed-forward-only neural nets, which can't support working memory. The human brain has about ten times more backward, loop-forming connections than feed-forward ones, so that kind of loop and persistent mental state seems necessary for reasoning, and feed-forward-only architectures will be dropped in favour of some kind of LSTM architecture when reasoning becomes important.
Reality: I can just plug the output back in as input, and then it's like its thought process, which I can monitor! Mechanistic interpretability is not necessary now!
Me: No, it’s not like its thought process, and please don’t use it to train the AI. If you optimise the AI to not have evil ideas in the CoT, it will just learn to hide its evil ideas, making alignment more difficult.
Reality: We're totally going to train the AIs and evaluate their alignment based on their CoT.
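For reference, "plug the output back in" is roughly the loop below (a minimal sketch; complete() is a hypothetical stand-in for a single model call, not any real API). The "thought process" is just accumulated text, which is exactly why it can be monitored, and exactly why optimising against it teaches the model what to keep out of the text rather than what not to think.

```python
# Minimal sketch of chain of thought as an external loop around a
# feed-forward model. complete() is a hypothetical single model call:
# prompt in, text out, no internal state carried over.

def complete(prompt: str) -> str:
    return "(next chunk of reasoning)"  # stand-in for one forward pass

def answer_with_cot(question: str, steps: int = 3) -> str:
    transcript = f"Question: {question}\nLet's think step by step.\n"
    for _ in range(steps):
        thought = complete(transcript)  # the model only sees the text so far
        transcript += thought + "\n"    # its only "working memory" is this string
    return complete(transcript + "Final answer:")

print(answer_with_cot("Is this AI aligned?"))
```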
Model Context Protocol
My prediction some time ago: People will have trouble imagining how a hacker AI might pursue evil goals, creating problems in the real world, when it’s just a chatbot.
Reality: People develop AI agents (which can act unprompted) and lots of APIs (MCP) for AIs to use to act on the real world, and they still have trouble imagining how a hacker AI might pursue evil goals and create problems in the real world.
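For a sense of how little now stands between a chatbot and the real world: exposing an action to a model over MCP takes about this much code (a sketch assuming the official mcp Python SDK and its FastMCP helper; the email tool is a made-up toy):

```python
# Minimal sketch of an MCP server exposing a real-world action to a model.
# Assumes the official `mcp` Python SDK (FastMCP helper); the tool is a toy stub.
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("demo-tools")

@mcp.tool()
def send_email(to: str, subject: str, body: str) -> str:
    """Send an email on the user's behalf (stubbed out here)."""
    # A real server would call an email API here; that's the whole point:
    # the chatbot is no longer "just a chatbot".
    return f"Pretending to send '{subject}' to {to}"

if __name__ == "__main__":
    mcp.run()  # serves the tool so a connected AI agent can call it
```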
Out-of-distribution behaviour
My prediction some time ago: It is problematic if AIs behave differently in training and in deployment. Deployed AIs will probably have less monitoring and more effect on the real world, so unexpected behaviour could mean unsafe behaviour. Even worse, a misaligned AI might hide its misalignment if it infers that it is probably in training, so we should make the training/deployment distinction as hard to detect as possible.
Reality: The AI literally says it knows it's being tested, and what kind of alignment test it is.
Reckless deployment
My prediction some time ago: As with everything, there isn't a single reasonable position; people will both fall a bit short of what they could safely deploy AI on, and experiment a bit beyond the limit of what is safe.
Semantics
My prediction some time ago: Syntax and semantics are different. Just by learning which words appear next to other words, you can't generate anything other than random garbage.
Reality: LLMs generate text that ranges anywhere from hallucination to almost expert-level explanation.
Me: Ok, the line separating syntax and semantics is blurrier than I thought, but it still can’t reason properly.
Reality: People debate not just whether it reasons, but whether it's conscious.