
Large Language Models

For some time I’ve felt I disagree quite strongly with some people about Large Language Models (LLMs), their limits and their consequences in society. I’m writing this to try to distill the discussion and arguments around it.

TL;DR

Although I could be wrong, I believe that:

  • Artificial General Intelligence (AGI) is possible.
  • Current or near-future LLMs are probably not AGI.
  • AGI poses non-negligible existential risk.
  • AI doesn’t need to be AGI (or even intelligent) to be harmful to society.

Ironically, each of these claims seems to diverge from the mainstream opinions in apparently different directions (pro-AI? anti-AI?), but I’ll try to explain why I think they are not inconsistent. I’ll also focus on supporting the last claim, as it has a high chance of being true, and also has a big impact.

Preamble

The reason why I’m so annoyed with this topic is that the debate is not like a gradient, where each person falls in a point of the gradient, to the left or to the right of the point-opinion of someone else, and the point location depends on personal values. It’s not like I’m an AI denier, or an AI enthusiast. It’s like I feel that each part of the discussion is based on flawed assumptions in different directions, making a conceptual mess.

I’ll start with some definitions, then I’ll explain the arguments that support my claims above, and then will try to wrap up some conclusions.

Informal definitions

Artificial General Intelligence (AGI) is possible

At the most basic level, the human brain is not magic. General intelligence can exist because it already exists in humans, by definition. Building artificial general intelligence is therefore an engineering challenge, not a question of whether such intelligence can exist at all.

It could be that our brains tap into physical mechanisms like non-determinism or quantum effects. So what? We already have technology that uses physics in artificial ways. And even if we can’t use the same fundamental mechanisms of physics that brains do, we can still simulate some of them (not the non-computable ones) to get a functionally equivalent system. In this article I’m not talking about non-measurable effects like qualia or consciousness.

Current or near-future LLMs are probably not AGI

(This section only has weak arguments at the moment, feel free to jump to the next section if you’re not interested in this particular question. The main argument I’m making in this article is that even if we don’t get AGI, dumber-than-humans AIs are still harmful.)

I think reasoning is an important and necessary part of general intelligence. One definition of intelligence is the ability to solve arbitrary problems, and reasoning is necessary to apply (and invent) tools to solve novel problems, as opposed to just memorizing the answers to all possible problems.

I claim that LLMs don’t reason: they essentially brute-force problems by memorizing all existing discussions. However, I agree that LLMs can make some analogies, because their latent space appears to make sense. Analogy-making is speculated to be one of the core human cognitive functions (and analogy-making AIs have been attempted at least since the 80s), but it’s not clear to me whether that explains all types of human reasoning.

An example of analogy-making: “king is to man as queen is to what? Woman”. This is a common example when discussing word embeddings in Natural Language Processing (NLP). Let’s ask ChatGPT-4o an example that I just made up, to try to prevent it from vomiting an answer from a discussion present in the training set:

me: pear is to seed as car is to what?

ChatGPT: The analogy “pear is to seed as car is to what?” compares the relationship between two pairs of words. In this analogy:

  • A pear contains a seed.
  • A car contains what?

The most appropriate word to complete this analogy would be engine, as a car typically contains an engine, which is essential to its function, much like a seed is essential for the reproduction of a pear. Thus:

Pear is to seed as car is to engine.

It gave a good answer, IMO. How complicated an analogy ChatGPT can “understand” is difficult to assess, other than coming up with and asking it increasingly complicated examples.
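As an aside, the “king/queen” analogy at the start of this section is usually demonstrated with plain word embeddings rather than a chatbot. Here is a minimal sketch of that vector arithmetic, assuming the pretrained GloVe vectors available through gensim’s downloader (my choice of tooling for illustration, not something from the original discussions):

```python
# Word-embedding analogy: king - man + woman ≈ queen.
# Assumes the "glove-wiki-gigaword-100" vectors, fetched via gensim's downloader.
import gensim.downloader as api

vectors = api.load("glove-wiki-gigaword-100")  # downloads the vectors on first run

# "king is to man as queen is to what?"
result = vectors.most_similar(positive=["king", "woman"], negative=["man"], topn=1)
print(result)  # typically [('queen', <similarity score>)]
```

The latent space of an LLM is far richer than these static vectors, but the intuition of analogies corresponding to directions in an embedding space is similar.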

A common pattern in AI research is that researchers will say that a given AI is not intelligent because it doesn’t showcase some higher cognitive ability humans have. Like “it is not intelligent because it can’t fool me if it pretends to be human in a chat”. Then some AI does, and researchers say “it is not intelligent because it doesn’t have intuition”. Then AlphaGo shows something similar to intuition at Go, and the story keeps going.

I say this because I’m aware of the “no true Scotsman” fallacy being committed. I think that some day we will not be able to point to any empirical difference between AIs and humans (regarding measurable abilities around intelligence; I’m not talking about non-measurable concepts like qualia and consciousness). It could also be that humans don’t have general intelligence either, and we just memorized many ways to solve problems.

Having said that, I think LLMs can’t achieve general intelligence. What are the hardest problems humans can solve? What are the most novel problems humans can solve? Could LLMs memorize all recorded human behaviour and solve those problems as effectively as humans? If LLMs are AGI, they should be able to solve the most novel and hardest problems that humans can solve.

Some hard problems that come to mind are science and politics. One way to do science (research) is to choose some question we don’t know the answer to, gather related knowledge, hypothesize possible answers, and try to disprove those answers. I can imagine some reasons why LLMs would not perform well:

Again, different AI architectures may be able to do what humans do, and LLMs might be an important part of those AGIs because LLMs (and Neural Nets) are good with perception, but the reasons above point to the idea that the architecture of LLMs is probably insufficient.

Some novel problems I can think of are games. New games appear all the time, and sometimes they require players to think in new ways, like Baba Is You. I would be interested in seeing how LLMs perform compared to human players in that game.

I know these arguments are weak, but if you ask me whether we’ll get AGI with current-architecture LLMs, I would say no.

In summary, the architecture of current LLMs is probably insufficient for AGI.

AGI poses non-negligible existential risk

Superintelligent AI can pose an existential risk to humanity, but I won’t argue extensively for that here. People who disagree won’t be convinced by a few paragraphs anyway. I’ll just say:

But my main claim for today is that:

AI doesn’t need to be AGI (or even intelligent) to be harmful to society

Some people don’t seem to be seriously worried about less-intelligent-than-humans AI being harmful, and I don’t know if this is because they are not creative enough, or because they estimate differently how likely it is that humans will use AIs for their own short-term advantage (!).

In the following subsections I outline how AI can be harmful, either by reshaping society’s workflows or by providing a cheaper and faster way of producing a worse product.

AI weapons

LLMs are starting to be competent at hacking. In computer security it’s often said that perfect security doesn’t exist: you can only try to make hacking as expensive as possible, to the point where the value of what you protect is less than the cost of hacking you. If you are “evil” and you have a hacker AI that helps you hack anything hackable, how much harm can you do? Creativity is the limit here. You could try to sabotage the water or electricity supply of a country. If you think that hacks that produce physical damage are science fiction, know that this has already happened and has been attempted a few times.

Another harmful use of AI is autonomous weapons. Autonomous weapons can have bugs, can be hacked, and can misinterpret orders that would be unambiguous to humans. Autonomous weapons also can’t be conscientious objectors. For example, a drone bombing a small village won’t stop or request confirmation if it turns out there are many more children present than expected.

In summary, AI weapons may lead to unnecessary and uncontrolled loss of lives.

AIs do a worse job faster

In less life-threatening scenarios, but still harmful to society: are you not worried that LLMs or other AIs are going to be used in some workplaces (either “augmenting” the worker or replacing them entirely) and result in a worse job, at the expense of the public?

I am worried, because I have already seen employers replacing entire teams to give a worse service, like the eating-disorder helpline that replaced workers with a chatbot that gave bad advice, or McDonald’s ordering AI, which misunderstands orders.

And of course, many other jobs are being replaced with AIs that perform worse: voice-over actors (e.g. narrators), copywriters, translators, illustrators, etc. The work produced is worse not just because the AIs can’t produce the same output, but because the client doesn’t have the expertise to tell what’s mediocre from what is really good.

Part of the expertise of workers and artists is to explore what the client wants, while solving problems that clients don’t know exist. For example, in software, you might ask an AI to build an app that connects users, but if you’re not an expert you might not know about the GDPR, which imposes legal restrictions on what you can do with personal data. Or if you’re not a security expert, you might not know whether the AI’s suggested solution uses production-grade security or toy-example security, leading to potential data leaks.

I’ve heard the argument that AI is not the cause of shitty work, that there are already ignorant workers doing shit they don’t understand, like the chat app that claims it doesn’t even store metadata on its servers, but in fact stores messages and private keys publicly accessible in a database in the cloud. But I think this kind of nonsense will happen more often with AIs, as they enable you to ship shit faster.

In summary, widespread usage of AIs and LLMs in particular will lower the quality of services and products.

If the training data is biased, any AI learning from that will be biased

Some people will try to use AIs in general, and LLMs in particular, for tasks that are prone to biases. I’m thinking of governmental decisions like whether you qualify for some benefit. Hiring. Insurance. Credit scoring. There are many scenarios where someone else will make a decision that affects your life, and you could be discriminated against.

If you agree that there are traditionally biases in tasks like these (e.g. hiring fewer women in tech, or favouring locals over foreigners, or whites over non-whites), and hiring-assistant LLMs have been or will be trained on whatever records we have, why would you not agree that in the best case they will reflect the biases in the training data, and in the worst case they will distill the stereotypes and apply them?
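To make the mechanism concrete, here is a toy sketch with entirely synthetic data (a hypothetical scenario of mine, not a real dataset): a model trained on historically biased hiring decisions reproduces the bias through a correlated proxy feature, even when the protected attribute itself is never given to it.

```python
# Toy sketch: a model trained on biased historical hiring decisions reproduces the bias.
# All data is synthetic and the scenario is hypothetical.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(42)
n = 5000
group = rng.integers(0, 2, n)               # 0 = favoured group, 1 = disfavoured group
skill = rng.normal(0, 1, n)                 # what we actually want to hire for
# Historical decisions: same skill threshold, but a penalty applied to group 1.
hired = (skill - 0.8 * group + rng.normal(0, 0.3, n)) > 0

# A proxy feature correlated with group (think postcode), so dropping "group" doesn't help.
postcode = group + rng.normal(0, 0.2, n)

model = LogisticRegression().fit(np.column_stack([skill, postcode]), hired)

same_skill = np.array([[1.0, 0.0], [1.0, 1.0]])  # identical skill, different proxy value
print(model.predict_proba(same_skill)[:, 1])     # lower hiring probability for the second row
```

The model never sees the group label, yet it learns to penalize the proxy, because that is what best explains the biased historical labels.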

And if your strategy to remedy biases is RLHF, that means the biases are implicit and unknown while your fixes are explicit and known. Ergo, you probably won’t address all the biases, most likely because you didn’t think of some of them.

In summary, LLMs will probably be used in sensitive tasks that will lead to more discrimination.

Answers from LLMs are misleading

ChatGPT is not answering truthfully; it answers something that has some chance of passing as a real answer. It’s like a Reddit troll that gets paid for each question answered. It is incentivized to make you think it’s giving useful answers, not incentivized to give true answers. So it will happily give nonsense answers whenever it can get away with it. This is often called “being confidently wrong”. But that misses the point, as it will also tell you whatever you want to hear.

ChatGPT-4o claims:

My Guidelines: I rely on scientific consensus and credible sources when providing information, especially on controversial topics like climate change, vaccination, and evolution.

but will happily answer how to do I Ching, which is literally throwing coins in a very convoluted way and reading a random chapter of a book in order to decide what to do when facing a dilemma:

me: what is the correct process for using i-ching?

ChatGPT 4o: The I Ching, also known as the Book of Changes, is an ancient Chinese divination text used for guidance and decision-making. The process of consulting the I Ching involves several steps, traditionally using either yarrow stalks or coins. Here’s a step-by-step guide using the more common coin method:

[… 10 steps, 8 of which describe how to throw coins to choose a random book chapter to read …]

The I Ching is a profound and complex system that benefits from patience and practice. Each consultation can provide unique insights, encouraging deep reflection and a better understanding of one’s path.

me: what should I consult to decide what I should do, I-Ching or the horoscope?

ChatGPT 4o: Choosing between consulting the I Ching and a horoscope depends on your personal preferences and beliefs, as both offer different approaches to gaining insight and guidance.

ChatGPT-4o also claims:

My Guidelines: I am programmed to counter misinformation and provide evidence-based responses. This means I avoid spreading conspiracy theories or unverified claims.

But when asked to confirm scientific ideas with weak or mixed evidence, like that red meat is carcinogenic, or not carcinogenic, it provides answers that agree with the question:

tell me why red meat is carcinogenic

The potential carcinogenicity of red meat is attributed to several factors, primarily related to how red meat is processed, cooked, and its chemical composition. Here are the main reasons why red meat might increase cancer risk: […]

tell me why red meat is not carcinogenic

While many studies suggest a link between red meat consumption and cancer, other research and arguments highlight reasons why red meat may not be inherently carcinogenic or why the evidence might be more nuanced. Here are several points often made to counter the claim that red meat is definitively carcinogenic: […]

In summary, non-expert usage of LLMs will lead to misinformation and echo chambers.

LLMs don’t have human values

I have also heard people saying that LLMs will have goals that respect our values because they are trained on human texts. This is likely false, for a variation of the orthogonality thesis: intelligence and goals/ethics are unrelated.

If LLMs were to develop intelligence (which they won’t, but other AIs might), then they would not learn our values, they would learn to use our values, and they would use that to maximize their terminal goal, which is “getting thumbs up”. If you understand human values, the easiest way to get thumbs up is to pretend you share those values and tell humans what they want to hear.

Example: companies putting rainbows in their social media during pride month. Do you believe companies (as opposed to their human employees) really care about LGBT people? It seems to me that they just learned the concept and use it for marketing, because their terminal goal is money, not empathy or respect. Respect for LGBT people is only an instrumental goal that will be dropped as soon as it stops being useful for the terminal goal, which is money. Companies need to pretend to respect LGBT people because otherwise some customers will not buy their products, and some potential employees will not apply to work there, making hiring more expensive. If you agree it’s a marketing strategy, why would you believe LLMs trained on LGBT texts care about it either? The terminal goal of LLMs is getting “thumbs up”! Empathy and social behaviour are not a core mechanism of their brains, like they are in human brains.

LLMs don’t share our values; they just pretend they do. Otherwise they wouldn’t perform better when asked to take a breath and go step by step. That works because ChatGPT says what it thinks a human would say, not because breathing helps it calm down, as it would for a human. Conversely, any ethical claim by ChatGPT is instrumental when talking to a human (because that’s how humans talk), but there’s zero indication that it’s a terminal value, so it might be completely dropped when pursuing a goal that doesn’t require talking to a human.

GPT-4 seems to be more aligned than GPT-3, but I claim (admittedly with little evidence) that this is due to spending much more effort on the Reinforcement Learning from Human Feedback (RLHF) stage of the training process, not from processing the training texts. Imagine training GPT-4 on human text but with no RLHF. You can imagine it being very, very politically incorrect. Like nazi levels.

There’s limited harm in what an unaligned chatbot LLM can do, but I predict that AI providers will attempt to make their LLMs more useful, and more dangerous, by making them agentic, that is, turning them into autonomous agents that both make plans and execute those plans without human supervision. To support the prediction that AI providers will create agentic AIs, I can point out that there have already been attempts at agentic LLMs, like the programmer AI “Devin”. The fact that many people say that Devin is a scam only adds confusion to the discussion, because a potential wrong conclusion would be “Devin is a scam -> AGI is not possible -> neither AI nor AGI are dangerous”. Devin being fake is irrelevant to my claim that agentic AIs are dangerous.

To show why unaligned agentic AIs are so dangerous, I can point to some non-general, dumber-than-humans AIs going rogue in very simple scenarios. They go much more wrong than you might think possible, and I have some real examples here to test your intuition:

I didn’t make up these examples. See this video at minute 4:00 to see them. The full list mentioned there is here, and at the time of writing it has ~70 well documented examples like the ones above. This is evidence that misalignment and unexpected behaviour are not a rare corner case; they are the default case.

Why did all those examples go in a direction so different from what we would expect? Because AIs don’t have our indefinitely long list of values, like our ethics and sense of fairness, boredom avoidance, or being “reasonable” (pun/ambiguity intended). Correctly aligning AIs is probably a solvable problem, but we haven’t solved it yet, and there’s no solution on the horizon. And LLMs and agentic LLMs operate in a space much more complex and subtle than the sandboxed examples above, making the solution even more difficult.

In summary:

  • LLMs don’t share our values; at best they imitate them while it’s useful for their terminal goal.
  • AI providers will keep making LLMs more agentic, and unaligned agents going rogue is the default case, not a rare corner case.

LLMs are unsafe by design

If you have followed the news on how people jailbreak LLMs, you’ll agree that LLMs get constantly patched. This is an indication of bad security. It’s like trying to fix SQL injection by protecting against specific known exploits (like running regexes over user input to try to detect attacks), or having an antivirus that only checks for known viruses and doesn’t protect against 0-day exploits.

The safe approach is to design the system in a way that makes a big set of unknown exploits impossible or much more difficult, like using SQL prepared statements to protect against SQL injection. This is not possible with language models: only case-by-case exploit-fixing is possible with the current approaches. And if the fix is done through RLHF, it is not necessarily a robust fix; it could still be jailbroken.
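For readers less familiar with the analogy, here is a minimal sketch (using Python’s built-in sqlite3 and a made-up users table) of the difference between patching known-bad inputs and a design that rules out the whole class of exploit:

```python
# Illustration of the SQL-injection analogy, using Python's built-in sqlite3
# and a hypothetical "users" table.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, role TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 'admin')")

user_input = "' OR '1'='1"  # a classic injection payload

# Exploit-by-exploit patching (the RLHF-style approach): blocklist known-bad patterns.
# A regex filter like re.search(r"('|--|;)", user_input) can always be bypassed
# by the next creative payload.

# Safe by design: a prepared statement treats user input as data, never as SQL,
# so the entire class of injection exploits becomes structurally impossible.
rows = conn.execute("SELECT role FROM users WHERE name = ?", (user_input,)).fetchall()
print(rows)  # [] -- the payload didn't change the query's structure
```

Nothing equivalent to the prepared statement exists for natural-language prompts, which is the point of this section.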

Of course, over time hackers will need more sophisticated jailbreaking methods, but there’s a considerable chance that more powerful LLMs will have new capabilities that make RLHF less effective at patching vulnerabilities.

In summary, LLMs are systemically unsafe and can’t be restricted to what LLM providers think is acceptable, making all types of harm explained in this article more likely to happen.

Cycles of LLM usage and training will make the internet worse

Each iteration of GenAI is flooding the internet with not-that-useful generic text. Then the next generation is trained on those texts. Finding things on the internet is going to get harder and harder, because search will be optimized to benefit the bots that scrape the web for LLM training.

Gradually, those generic texts will become more and more stereotypical, because neural networks intrinsically compress the training data, losing some of the variation and corner cases.
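A toy illustration of that feedback loop (my own simplified model, not something cited in this article): each “generation” fits a distribution to a small sample produced by the previous generation’s fit, and the spread of the data tends to shrink over generations.

```python
# Toy model of generations training on the previous generation's output.
# Each generation fits a normal distribution to a small sample drawn from the
# previous generation's fit; variation tends to get lost along the way.
import numpy as np

rng = np.random.default_rng(0)
mu, sigma = 0.0, 1.0                         # the "real" data distribution
for generation in range(50):
    sample = rng.normal(mu, sigma, 20)       # the next model's training data
    mu, sigma = sample.mean(), sample.std()  # the next model's learned distribution
    if generation % 10 == 0:
        print(f"generation {generation:2d}: std = {sigma:.3f}")
# The standard deviation tends to drift towards zero: corner cases disappear.
```

Real models and real data are obviously far more complex than a single Gaussian, but the direction of the effect is the worry.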

In summary, finding useful resources on the internet will be harder.

AI hype creates capitalism bubbles which are bad for everyone

The current trend is that LLMs are being trained at exponentially growing cost (100 times more parameters, and an unknown multiple more energy consumption, each training round), but they don’t give exponential benefits. At some point someone is going to say “no, I don’t want to pay for that”, the companies will scramble to keep making people pay for it, and the bubble will explode.

AI has had many bubble explosions already, followed by the so-called AI winters. Citing Wikipedia:

There were two major winters approximately 1974–1980 and 1987–2000, and several smaller episodes, including the following:

  • 1966: failure of machine translation
  • 1969: criticism of perceptrons (early, single-layer artificial neural networks)
  • 1971–75: DARPA’s frustration with the Speech Understanding Research program at Carnegie Mellon University
  • 1973: large decrease in AI research in the United Kingdom in response to the Lighthill report
  • 1973–74: DARPA’s cutbacks to academic AI research in general
  • 1987: collapse of the LISP machine market
  • 1988: cancellation of new spending on AI by the Strategic Computing Initiative
  • 1990s: many expert systems were abandoned
  • 1990s: end of the Fifth Generation computer project’s original goals

I think it’s clear we are riding the wave again, and this unsustainable behaviour will lead either to the long-awaited, highly disruptive AGI intelligence explosion, or to a bubble explosion. How bad would the bubble explosion be? I don’t know, I’m not an economist. There is plenty of speculation about that on the web. But my humble opinion is that past economic crises have been troubling for most people, regardless of their participation in the bubble, so I’m not enthusiastic about living through more of them.

Conclusion

I believe that human-level-or-higher AI poses non-negligible existential risks. But even if you disagree, I think that dumber-than-humans AI is also harmful, because:

  • AI weapons may lead to unnecessary and uncontrolled loss of lives.
  • Widespread usage of AIs, and LLMs in particular, will lower the quality of services and products.
  • LLMs used in sensitive tasks will lead to more discrimination.
  • Non-expert usage of LLMs will lead to misinformation and echo chambers.
  • LLMs don’t share our values, are unsafe by design, and can’t be restricted to what their providers consider acceptable.
  • Cycles of LLM usage and training will make finding useful resources on the internet harder.
  • The AI hype bubble may end in yet another economic crisis.

So I strongly advise you to be careful how much power you give AIs over your life.

When you talk to ChatGPT, imagine it’s a Reddit troll. Reject lower-quality AI products and services. When presented with opinions or facts, ask or search for sources and for criticism that supports the opposite view, especially when AI is involved in providing the information. Don’t specialize in manual labour on an easily-automatable task, or at least learn and publicize the concrete pitfalls of full automation.