CoT improves results, sure. And part of that is probably because you are telling the LLM to add more things to the context window, which increases the potential of resolving some syllogism in the training data: One inference cycle tells you that "man" has something to do with "mortal" and "Socrates" has something to do with "man", but two cycles will spit those both into the context window and let you get statistically closer to "Socrates" having something to do with "mortal". But given that the training/RLHF for CoT revolves around generating long chains of human-readable "steps", it can't really be explanatory for a process which is essentially statistical.
This in a nutshell is why I hate that all this stuff is being labeled as AI. It's advanced machine learning (another term that also feels inaccurate, but I concede it's at least closer to what's happening conceptually).
Really, LLMs and the like still lack any model of intelligence. It's, in the most basic of terms, algorithmic pattern matching mixed with statistical likelihoods of success.
And that can get things really, really far. There are entire businesses built on doing that kind of work (particularly in finance) with very high accuracy and usefulness, but it's not AI.
"Human brains lack any model of intelligence. It's just neurons firing in complicated patterns in response to inputs based on what statistically leads to reproductive success"
Being smart allows someone to be wrong, as long as that leads to a satisfying solution. Being intelligent, on the other hand, requires foundational correctness in concepts that aren't even defined yet.
EDIT: I also somewhat like the term imperative knowledge (models) [0]
The gap makes me uncomfortable with the implications of the word "smart". It is orthogonal to that.
Funnily enough, you can also observe that in humans. The number of times I have observed people from highly intellectual, high income/academic families struggle with simple tasks that even the dumbest people do with ease is staggering. If you're not trained for something and suddenly confronted with it for the first time, you will also in all likelihood fail. "Smart" is just as ill-defined as any other clumsy approach to define intelligence.
There exists a generally accepted baseline definition for what crosses the threshold of intelligent behavior. We shouldn't seek to muddy this.
EDIT: Generally it's accepted that a core trait of intelligence is an agent's ability to achieve goals in a wide range of environments. This means you must be able to generalize, which in turn allows intelligent beings to react to new environments and contexts without previous experience or input.
Nothing I'm aware of on the market can do this. LLMs are great at statistically inferring things, but they can't generalize which means they lack reasoning. They also lack the ability to seek new information without prompting.
The fact that all LLMs boil down to (relatively) simple mathematics should be enough to prove the point as well. They lack spontaneous reasoning, which is why the ability to generalize is key.
Even if it's just gradient-descent-based distribution learning and there is no "internal system" (whatever you think that should look like) to support learning the distribution, the question is whether that is more than what we are doing, or whether we are starting to replicate our own mechanisms of learning.
A useful definition of intelligence needs to be measurable, based on inputs/outputs, not internal state. Otherwise you run the risk of dictating how you think intelligence should manifest, rather than what it actually is. The former is a prescription, only the latter is a true definition.
At worst it's an incomplete and ad hoc specification.
More realistically it was never more than an educated guess to begin with, about something that didn't exist at the time, still doesn't appear to exist, is highly subjective, lacks a single broadly accepted rigorous definition to this very day, and ultimately boils down to "I'll know it when I see it".
I'll know it when I see it, and I still haven't seen it. QED
I dunno, that seems like a pretty good distillation of what moving the goalposts is.
> I’ll know it when I see it, and I haven’t seen it. QED
While pithily put, that's not a compelling argument. You feel that LLMs are not intelligent. I feel that they may be intelligent. Without a decent definition of what intelligence is, the entire argument is silly.
An incomplete list, in contrast, is not a full set of goalposts. It is more akin to a declared lower bound.
I also don't think it applies to the case where the parties are made aware of a change in circumstances and update their views accordingly.
> You feel that LLMs are not intelligent. I feel that they may be intelligent.
Weirdly enough I almost agree with you. LLMs have certainly challenged my notion of what intelligence is. At this point I think it's more a discussion of what sorts of things people are referring to when they use that word and if we can figure out an objective description that distinguishes those things from everything else.
> Without a decent definition of what intelligence is, the entire argument is silly.
I completely agree. My only objection is to the notion that goalposts have been shifted since in my view they were never established in the first place.
Only if you don't understand what "the goalposts" means. The goalpost isn't "pass the Turing test"; the goalpost is "manage to do all the same kinds of intellectual tasks that humans do", and nobody has moved that since the start of the quest for AI.
Various chat bots have long been able to pass more limited versions of a Turing test. The most extreme constraint allows for simply replaying a canned conversation, which with a helpful human assistant makes it indistinguishable from a human. But exploiting limitations on a testing format doesn't have anything to do with testing for intelligence.
This would mean there’s no definition of intelligence you could tie to a test where humans would be intelligent but LLMs wouldn’t.
A maybe more palatable idea is that having “intelligence” as a binary is insufficient. I think it’s more of an extremely skewed distribution. With how humans are above the rest, you didn’t have to nail the cutoff point to get us on one side and everything else on the other. Maybe chimpanzees and dolphins slip in. But now, the LLMs are much closer to humans. That line is harder to draw. Actually not possible to draw it so people are on one side and LLMs on the other.
I don't mean to claim that it isn't possible, just that I'm not clear why we should assume that it is or that there would be an obvious way of going about it.
Is it necessarily the case that you could discern general intelligence via a test with fixed structure, known to all parties in advance, carried out via a synthesized monotone voice? I'm not saying "you definitely can't do that" just that I don't see why we should a priori assume it to be possible.
Now that likely seems largely irrelevant and out in the weeds and normally I would feel that way. But if you're going to suppose that we can't cleanly differentiate LLMs from humans then it becomes important to ask if that's a consequence of the LLMs actually exhibiting what we would consider general intelligence versus an inherent limitation of the modality in which the interactions are taking place.
Personally I think it's far more likely that we just don't have very good tests yet, that our working definition of "general intelligence" (as well as just "intelligence") isn't all that great yet, and that in the end many humans who we consider to exhibit a reasonable level of such will nonetheless fail to pass tests that are based solely on an isolated exchange of natural language.
Note that functional illiteracy is not some niche phenomenon, it's a huge problem in many school systems. In my own country (Romania), while the rate of illiteracy is something like <1% of the populace, the rate of functional illiteracy is estimated to be as high as 45% of those finishing school.
Extremely unfounded claims. See: the root comment of this tree.
This fact is relied upon by, for example, https://bellard.org/ts_zip/, a lossless compression system that would not work if LLMs were nondeterministic.
In practice most LLM systems use this distribution (along with a “temperature” multiplier) to make a weighted random choice among the tokens, giving the illusion of nondeterminism. But there’s no fundamental reason you couldn’t for example always choose the most likely token, yielding totally deterministic output.
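A minimal sketch of both modes (my own illustration, not any vendor's actual sampler), assuming you already have a vector of logits over the vocabulary:

    import numpy as np

    rng = np.random.default_rng()

    def sample_next_token(logits, temperature=1.0):
        """Toy sampler: weighted random choice with temperature, or greedy argmax if temperature is None."""
        if temperature is None:
            return int(np.argmax(logits))             # always the most likely token: fully deterministic
        scaled = logits / temperature                 # temperature < 1 sharpens, > 1 flattens the distribution
        probs = np.exp(scaled - scaled.max())         # softmax, shifted for numerical stability
        probs /= probs.sum()
        return int(rng.choice(len(logits), p=probs))  # the "illusion of nondeterminism"

    logits = np.array([2.0, 1.0, 0.5, 0.5])
    print(sample_next_token(logits, temperature=0.7))   # varies from run to run
    print(sample_next_token(logits, temperature=None))  # always 0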
This is an excellent and accessible series going over how transformer systems work if you want to learn more. https://youtu.be/wjZofJX0v4M
In other words, LLMs are not deterministic in just about any real setting. What you said there only compounds with MoE architectures, variable test-time compute allocation, and o3-like sampling.
I would ultimately call the result non-deterministic. You could make it deterministic relatively easily by having a deterministic process for choosing a single token from all of the outputs of the NN (say, always pick the one with the highest weight, and if there are multiple with the same weight, pick the first one in token index order), but no one normally does this, because the results aren't that great per my understanding.
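For what it's worth, that tie-breaking rule is easy to write down; a toy sketch of my own, not something any provider actually ships:

    import numpy as np

    def pick_token_deterministic(logits):
        """Highest-weight token wins; exact ties are broken by the lowest token index."""
        tied = np.flatnonzero(logits == logits.max())  # indices of all tokens tied at the maximum
        return int(tied[0])                            # first one in token index order

    print(pick_token_deterministic(np.array([0.1, 0.9, 0.9, 0.3])))  # -> 1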
Go on. We are listening.
Can you give your definition of AI? Also what is the "generally accepted baseline definition for what crosses the threshold of intelligent behavior"?
Please tell us what that "baseline definition" is.
Be that as it may, a core trait is very different from a generally accepted threshold. What exactly is the threshold? Which environments are you referring to? How is it being measured? What goals are they?
You may have quantitative and unambiguous answers to these questions, but I don't think they would be commonly agreed upon.
What do you mean here? The trained model, the inference engine, is the one that makes an LLM for "a lot of people".
> they certainly can understand natural language sequences that are not present in their training data
Keeping in mind that the trained model is the LLM, I think learning a language includes generalization and is typically achieved by a human, so I'll try to formulate:
Can a trained LLM learn languages that weren't in its training set just by chatting/prompting? Given that all Korean text was excluded from the training set, could Korean be learned? Does that even work with languages descending from the same language family (Spanish in the training set, but Italian should be learned)?
Training on a specific task and getting better at it is completely orthogonal to generalized search and the application of priors. Humans do a mix of both: searching over operations and pattern-matching to recognize the difference between start and stop states. That is because their "algorithm" is so general purpose. And we have very little idea how the two are combined efficiently.
At least this is how I interpreted the paper.
Deep neural networks are definitely performing generalization at a certain level that beats humans at translation or Go, just not at his ARC bar. He may not think it's good enough, but it's still generalization whether he likes it or not.
Generalization has a specific meaning in the context of machine learning.
The AlphaGo Zero model learned advanced strategies of the game, starting with only the basic rules of the game, without being programmed explicitly. That is generalization.
The trouble with this is that it only ever "generalizes" approximately as far as the person configuring the training run (and implementing the simulation and etc) ensures that it happens. In which case it seems analogous to an explicitly programmed algorithm to me.
Even if we were to accept the training phase as a very limited form of generalization it still wouldn't apply to the output of that process. The trained LLM as used for inference is no longer "learning".
The point I was trying to make with the chess engine was that it doesn't seem that generalization is required in order to perform that class of tasks (at least in isolation, ie post-training). Therefore, it should follow that we can't use "ability to perform the task" (ie beat a human at that type of board game) as a measure for whether or not generalization is occurring.
Hypothetically, if you could explain a novel rule set to a model in natural language, play a series of several games against it, and following that it could reliably beat humans at that game, that would indeed be a type of generalization. However my next objection would then be, sure, it can learn a new turn based board game, but if I explain these other five tasks to it that aren't board games and vary widely can it also learn all of those in the same way? Because that's really what we seem to mean when we say that humans or dogs or dolphins or whatever possess intelligence in a general sense.
Generalization is the ability for a model to perform well on new unseen data within the same task that it was trained for. It's not about the training process itself.
Suppose I showed you some examples of multiplication tables, and you figured out how to multiply 19 * 42 without ever having seen that example before. That is generalization. You have recognized the underlying pattern and applied it to a new case.
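A toy way to make the ML sense of that concrete (my own sketch using scikit-learn, nothing from the thread): fit a model on multiplication examples and score it only on pairs it never saw.

    import numpy as np
    from sklearn.model_selection import train_test_split
    from sklearn.neural_network import MLPRegressor

    rng = np.random.default_rng(0)
    a = rng.integers(1, 50, size=2000)
    b = rng.integers(1, 50, size=2000)
    X, y = np.column_stack([a, b]), (a * b).astype(float)

    # Held-out pairs like (19, 42) play the role of examples never seen during training.
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

    model = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=3000, random_state=0)
    model.fit(X_train, y_train)

    # Generalization is measured on the unseen pairs, not on the memorized training rows.
    print("R^2 on unseen multiplication pairs:", round(model.score(X_test, y_test), 3))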
AlphaGo Zero trained on games that it generated by playing against itself, but how that data was generated is not the point. It was able to generalize from that information to learn deeper principles of the game to beat human players. It wasn't just memorizing moves from a training set.
> However my next objection would then be, sure, it can learn a new turn based board game, but if I explain these other five tasks to it that aren't board games and vary widely can it also learn all of those in the same way? Because that's really what we seem to mean when we say that humans or dogs or dolphins or whatever possess intelligence in a general sense.
This is what LLMs have already demonstrated - a rudimentary form of AGI. They were originally trained for language translation and a few other NLP tasks, and then we found they have all these other abilities.
By that logic a chess engine can generalize in the same way that AlphaGo Zero does. It is a black box that has never seen the vast majority of possible board positions. In fact it's never seen anything at all because unlike an ML model it isn't the result of an optimization algorithm (at least the old ones, back before they started incorporating ML models).
If your definition of "generalize" depends on "is the thing under consideration an ML model or not" then the definition is broken. You need to treat the thing being tested as a black box, scoring only based on inputs and outputs.
Writing the chess engine is analogous to wiring up the untrained model, the optimization algorithm, and the simulation followed by running it. Both tasks require thoughtful work by the developer. The finished chess engine is analogous to the trained model.
> They were originally trained for ...
I think you're in danger here of a definition that depends intimately on intent. It isn't clear that they weren't inadvertently trained for those other abilities at the same time. Moreover, unless those additional abilities to be tested for were specified ahead of time you're deep into post hoc territory.
We are talking about a very specific technical term in the context of machine learning.
An explicitly programmed chess engine does not generalize, by definition. It doesn't learn from data. It is an explicitly programmed algorithm.
I recommend you go do some reading about machine learning basics.
As far as metrics of intelligence go, the algorithm is a black box. We don't care how it works or how it was constructed. The only thing we care about is (something like) how well it performs across an array of varied tasks that it hasn't encountered before. That is to say, how general the black box is.
Notice that in the case of typical ML algorithms the two usages are equivalent. If the approach generalizes (from training) then the resulting black box would necessarily be assessed as similarly general.
So going back up the thread a ways. Someone quotes Chollet as saying that LLMs can't generalize. You object that he sets the bar too high - that, for example, they generalize just fine at Go. You can interpret that using either definition. The result is the same.
As far as measuring intelligence is concerned, how is "generalizes on the task of Go" meaningfully better than a procedural chess engine? If you reject the procedural chess engine as "not intelligent" then it seems to me that you must also reject an ML model that does nothing but play Go.
> An explicitly programmed chess engine does not generalize, by definition. It doesn't learn from data. It is an explicitly programmed algorithm.
Following from above, I don't see the purpose of drawing this distinction in context since the end result is the same. Sure, without a training task you can't compare performance between the training run and something else. You could use that as a basis to exclude entire classes of algorithms, but to what end?
ML generalization is not the same as "generalness".
The model learns from data to infer strategies for its task (generalization). This is a completely different paradigm to an explicitly programmed rules engine which does not learn and cannot generalize.
Maybe you should stick to a single definition of "generalization" and make that definition clear before you accuse people of needing to read ML basics.
Great, now there are two of you.
This is the embodiment argument - that intelligence requires the ability to interact with its environment. Far from being generally accepted, it's a controversial take.
Could Stephen Hawking achieve goals in a wide range of environments without help?
And yet it's still generally accepted that Stephen Hawking was intelligent.
I applaud the bravery of trying to one shot a definition of intelligence, but no intelligent being acts without previous experience or input. If you're talking about in-sample vs out of sample, LLMs do that all the time. At some point in the conversation, they encounter something completely new and react to it in a way that emulates an intelligent agent.
What really makes them tick is language being a huge part of the intelligence puzzle, and language is something LLMs can generate at will. When we discover and learn to emulate the rest, we will get closer and closer to super intelligence.
The fact that you can reason about intelligence is a counterargument to this.
The fact that we can provide a chain of reasoning, and we can think that it is about intelligence, doesn't mean that we were actually reasoning about intelligence. This is immediately obvious when we encounter people whose conclusions are being thrown off by well-known cognitive biases, like cognitive dissonance. They have no trouble producing volumes of text about how they came to their conclusions and why they are right. But are consistently unable to notice the actual biases that are at play.
If I ask you to think of a movie, go ahead, think of one.....whatever movie just came into your mind was not picked by you, it was served up to you from an abyss.
We're doing the equivalent of what LLMs do: making up a plausible explanation for how we came to a conclusion, not reflecting reality.
As one neurologist put it, listening to people's explanations of how they think is entertaining, but not very informative. Virtually none of what people describe correlates in any way to what we actually know about how the brain is organized.
We don't know what intelligence is, or isn't.
Case in point… I didn't write that paragraph by myself.
Someone needs to create a clone site of HN's format and posts, but the rules only permit synthetic intelligence comments. All models pre-prompted to read prolifically, but comment and up/down vote carefully and sparingly, to optimize the quality of discussion.
And no looking at nat-HN comments.
It would be very interesting to compare discussions between the sites. A human-lurker per day graph over time would also be of interest.
Side thought: Has anyone created a Reverse-Captcha yet?
I think the site would clone the upvotes of articles and the ordering of the front page, and gives directions when to comment on other’s posts.
I'm glad you didn't write that paragraph by yourself; I would be concerned on your behalf if you had.
(For those no longer able to follow complex English grammar: Yeah, I exaggerate, but there is no point trying to participate in this kind of discussion if that's the sort of basic error one has to start from, and the especially weird nature of this example of the mistake also points to LLMs synthesizing the result of consciousness rather than experiencing it.)
Are you sure about that? Do we have proof of that? It has happened all the time throughout the history of science that a lot of scientists were convinced of something and of a model of reality, up until someone discovers a new proof or proposes a new coherent model. That's literally the history of science: disproving what we thought was an established model.
Your comment reveals an interesting corollary - those that believe in something beyond our understanding, like the Christian soul, may never be convinced that an AI is truly sapient.
Maybe so, but it's trivial to do the inverse, and pinpoint something that's not intelligent. I'm happy to state that an entity which has seen every game guide ever written, but still can't beat the first generation Pokemon is not intelligent.
This isn't the ceiling for intelligence. But it's a reasonable floor.
Because that's what an LLM is working with.
If I don't understand how a combustion engine works, I don't need that engineering knowledge to tell you that a bicycle [an LLM] isn't a car [a human brain] just because it fits the classification of a transportation vehicle [conversational interface].
This topic is incredibly fractured because there is too much monetary interest in redefining what "intelligence" means, so I don't think a technical comparison is even useful unless the conversation begins with an explicit definition of intelligence in relation to the claims.
There is a second problem that we aren't looking for [human brain] or [brain], but [intelligence] or [sapient] or something similar. We aren't even sure what we want as many people have different ideas, and, as you pointed out, we have different people with different interest pushing for different underlying definitions of what these ideas even are.
There is also a great deal of impreciseness in most any definitions we use, and AI encroaches on this in a way that reality rarely attacks our definitions. Philosophically, we aren't well prepared to defend against such attacks. If we had every ancestor of the cat before us, could we point out the first cat from the last non-cat in that lineup? In a precise way that we would all agree upon that isn't arbitrary? I doubt we could.
This is just the "fancy statistics" argument again, and it serves to describe any similar example you can come up with better than "intelligence exists inside this black box because I'm vibing with the output".
Regardless of how it models intelligence, why is it not AI? Do you mean it is not AGI? A system that can take a piece of text as input and output a reasonable response is obviously exhibiting some form of intelligence, regardless of the internal workings.
What do you imagine is happening inside biological minds that enables reasoning, and that is something different from, at bottom, a lot of "simple mathematics"?
You state that because it is built up of simple mathematics it cannot be reasoning, but this does not follow at all, unless you can posit some other mechanism that gives rise to intelligence and reasoning that is not able to be modelled mathematically.
We can prove the behavior of LLMs with mathematics, because their foundations are constructed. That also means they have the same limits as anything else we use applied mathematics for. Is the broad-market-analysis software that HFT firms use to make automated trades also intelligent?
While absence of proof is not proof of absence, as far as I know, we have not found a physics process in the brain that is not computable in principle.
For your claim to be true, it would need to be provably impossible to explain human behavior with mathematics.
For that to be true, humans would need to be able to compute functions that are outside the Turing-computable set, outside the set of lambda-definable functions, and outside the set of general recursive functions (the three are computationally equivalent).
We know of no such function. We don't know how to construct such a function. We don't know how it would be possible to model such a function with known physics.
It's an extraordinary claim, with no evidence behind it.
The only evidence needed would be a single example of a function we can compute that is outside the Turing-computable set; the lack of such evidence makes this seem rather improbable.
It could still be true, just like there could truly be a teapot in orbit between Earth and Mars. I'm not holding my breath.
Leaving aside where you draw the line of what classifies as intelligence or not, you seem to be invoking some kind of non-materialist view of the human mind: that there is some other 'essence', not based on fundamental physics, which is what gives rise to intelligence.
If you subscribe to a materialist world view, that the mind is essentially a biological machine, then it has to follow that you can replicate it in software and math. To state otherwise is, as I said, invoking a non-materialist view that there is something non-physical that gives rise to intelligence.
We understand neuron activation, kind of, but there's so much more going on inside the skull (neurotransmitter concentrations, hormonal signals, bundles with specialized architecture) that doesn't neatly fit into a similar mathematical framework, but clearly contributes in a significant way to whatever we call human intelligence.
This was the statement I was responding to, it is stating that because it's built on simple mathematics it _cannot_ reason.
Yes we don't have a complete mathematical model of human intelligence, but the idea that because it's built on mathematics that we have modelled, that it cannot reason is nonsensical, unless you subscribe to a non-materialist view.
In a way, he is saying (not really but close) that if we did model human intelligence with complete fidelity, it would no longer be intelligence.
It feels like a fool's errand to try and quantify intelligence in an exclusionary way. If we had a singular, widely accepted definition of intelligence, quantifying it would be standardized and uncontroversial, and yet we have spent millennia debating the subject. (We can't even agree on how to properly measure whether students actually learned something in school for the purposes of advancement to the next grade level, and that's a much smaller question than if something counts as intelligent.)
uh oh, this sounds like magical thinking.
What exactly in our mind is "more" than mathematics?
>or we would be able to explain human behavior with the purity of mathematics
Right, because we understood quantum physics right out of the gate and haven't required a century of desperate study to eke more knowledge from the subject.
Unfortunately it sounds like you are saying "Anything I don't understand is magic", instead of the more rational "I don't understand it, but it seems to be built on repeatable physical systems that are complicated but eventually decipherable".
* fixed a typo, used to be "defend"
There have been many attempts to pervert the term AI, which is a disservice to the technologies and the term itself.
It's the simple fact that the business people are relying on what AI invokes in the public mindshare to boost their status and visibility. That's what bothers me about its misuse so much.
>The "AI effect" refers to the phenomenon where achievements in AI, once considered significant, are re-evaluated or redefined as commonplace once they become integrated into everyday technology, no longer seen as "true AI".
So LLMs clearly fit inside the computer science definition of "Artificial Intelligence".
It's just that the general public have a significantly different definition of "AI", one that's strongly influenced by science fiction. And it's really problematic to call LLMs AI under that definition.
The thing people latch onto is modern LLMs' inability to reliably reason deductively or solve complex logical problems. However, this isn't a sign of human intelligence, as these are learned, not innate, skills, and even the most "intelligent" humans struggle to be reliable at them. In fact, classical AI techniques are often quite good at these things already, and I don't find improvements there world-changing. What I find unique about human intelligence is its abductive ability to reason in ambiguous spaces, with error at times but with success at most others. This is something LLMs actually demonstrate with a remarkably human-like intelligence. This is earth-shattering and science-fiction material. I find all the poo-pooing and goalpost shifting disheartening.
What they don’t have is awareness. Awareness is something we don’t understand about ourselves. We have examined our intelligence for thousands of years and some philosophies like Buddhism scratch the surface of understanding awareness. I find it much less likely we can achieve AGI without understanding awareness and implementing some proximate model of it that guides the multi modal models and agents we are working on now.
The neural network inside your CPU that estimates whether a branch will be taken is also AI. A pattern-recognition program that takes a video and decides where you end in the image and where the background starts is also AI. A cargo scheduler that takes all the containers you have to put on a ship, plus their destinations, and tells you where and in what order to put them is also an AI. A search engine that compares your query with the text on each page and tells you which is closer is also an AI. A sequence of "if"s that controls a character in a video game and decides what action it will take next is also an AI.
Stop with that stupid idea that AI is some otherworldly thing; it was never true.
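For concreteness, the last of those examples (the sequence of "if"s controlling a game character) really is just this; a toy sketch of mine, not from any particular game:

    def enemy_action(distance_to_player, own_health, has_ranged_weapon):
        """A rule-based game 'AI': nothing but a handful of ifs picking the next action."""
        if own_health < 0.2:
            return "flee"                 # too hurt to fight
        if distance_to_player < 1.5:
            return "melee_attack"         # close enough to strike
        if has_ranged_weapon and distance_to_player < 10.0:
            return "shoot"                # in range, take a shot
        return "move_toward_player"       # otherwise close the gap

    print(enemy_action(distance_to_player=8.0, own_health=0.9, has_ranged_weapon=True))  # -> shoot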
We're just rehashing "Can a submarine swim?"
It doesn't operate on the same level as (human) intelligence; it's a very path-dependent process. Every step you add down this path increases entropy as well, and while further improvements and bigger context windows help, eventually you reach a dead end where it degrades.
You'd almost need every step of the process to mutate the model to update global state from that point.
From what I've seen the major providers kind of use tricks to accomplish this, but it's not the same thing.
Another is the fundamental inability to self update on outdated information. It is incapable of doing that, which means it lacks another marker, which is being able to respond to changes of context effectively. Ants can do this. LLMs can't.
AlphaGo Zero is another example. AlphaGo Zero mastered Go from scratch, beating professional players with moves it was never trained on.
> Another is the fundamental inability to self update
That's an engineering decision, not a fundamental limitation. They could engineer a solution for the model to initiate its own training sequence, if they decide to enable that.
That's all well and good, but it was tuned with enough parameters to learn via reinforcement learning[0]. I think The Register went further and got better clarification about how it worked[1]:
>During training, it sits on each side of the table: two instances of the same software face off against each other. A match starts with the game's black and white stones scattered on the board, placed following a random set of moves from their starting positions. The two computer players are given the list of moves that led to the positions of the stones on the grid, and then are each told to come up with multiple chains of next moves along with estimates of the probability they will win by following through each chain.
While I also find it interesting that in both of these instances it's all referred to as machine learning, not AI, it's also important to see that even though what AlphaGo Zero did was quite awesome and a step forward in using compute for more complex tasks, it was still seeded with the basics of information (the rules of Go) and simply pattern-matched against itself until it built up enough of a statistical model to determine the best moves to make in any given situation during a game.
Which isn't the same thing as showing generalized reasoning. It could not, then, take this information and apply it to another situation.
They did show that the self-reinforcement techniques worked well, though, and used them for Chess and Shogi to great success as I recall, but that's a validation of the technique, not proof that it could generalize knowledge.
>That's an engineering decision, not a fundamental limitation
So you're saying that they can't reason independently?
[0]: https://deepmind.google/discover/blog/alphago-zero-starting-...
[1]: https://www.theregister.com/2017/10/18/deepminds_latest_alph...
Calling it machine learning and not AI is just semantics.
For self updating I said it's an engineering choice. You keep moving the goal posts.
But that is the point, it is a domain specific AI, not a general AI. You can't train a general AI that way.
> For self updating I said it's an engineering choice. You keep moving the goal posts.
No, it is not an engineering choice, it is an unsolved problem to make a general AI that self updates productively. Doing that for a specific well defined problem with well defined goals is easy, but not general problem solving.
For self-updating, yes, it is an engineering choice. It's already engineered in some narrow cases, such as AutoML.
Kwomps can zark but they can't plimf. Ghirns are a lot like Kwomps, but better zarkers. Plyzers have the skills the Ghirns lack.
Quoning, a type of plimfing, was developed in 3985. Zhuning was developed 100 years earlier.
I have an erork that needs to be plimfed. Choose one group and one method to do it.
> Use Plyzers and do a Quoning procedure on your erork.
If that doesn't count as reasoning or generalization, I don't know what does.
https://chatgpt.com/share/67eee8dd-43a4-8004-afde-2fe3fc9630...
https://grok.com/share/bGVnYWN5_1e6f3801-c9fb-4d33-844b-3902...
"Based on the information you provided, it seems you need something capable of plimfing to handle your erork. Kwomps can’t plimf, so they’re out. Ghirns are similar to Kwomps and excel at zarking, but there’s no indication they can plimf either. Plyzers, however, have the skills that Ghirns lack, and since plimfing is a skill Ghirns don’t have, it’s reasonable to infer Plyzers can plimf. As for the method, quoning is explicitly described as a type of plimfing, while zhuning isn’t mentioned as such—only that it was developed earlier.
So, I’d choose the Plyzers to plimf your erork using quoning. They seem capable, and quoning fits the job."
It's literally the name of the field. I don't understand why (some) people feel so compelled to act vain about it like this.
Trying to gatekeep the term is such a blatantly flawed idea that it'd be comical to watch people play into it, if it weren't so pitiful.
It disappoints me that this cope has proliferated far enough that garbage like "AGI" is something you can actually come across in literature.
Was it ever seriously entertained? I thought the point was not to reveal a chain of thought, but to produce one. A single token's inference must happen in constant time. But an arbitrarily long chain of tokens can encode an arbitrarily complex chain of reasoning. An LLM is essentially a finite state machine that operates on vibes - by giving it infinite tape, you get a vibey Turing machine.
Yes! By Anthropic! Just a few months ago!
They gave it a prompt that encodes exactly that sort of narrative at one level of indirection and act surprised when it does what they've asked it to do.
The real answer is... We don't know how much it is or isn't. There's little rigor in either direction.
But current LLM's chain of thought is not it.
You can’t claim that “We don’t know how the brain works so I will claim it is this” and expect to be taken seriously.
So I don't really get the fuss about this chain-of-thought idea. To me, I feel like it should be better to just operate on the knowledge graph itself.
That people seem to think it reflects internal state is a problem, because we have no reason to think that, even with an internal monologue, the monologue accurately reflects our internal thought processes fully.
There are some famous experiments with patients whose corpus callosum has been severed (split-brain patients). Because the brain halves control different parts of the body, you can use this to "trick" one half of the brain into thinking that "the brain" has made a decision about something, such as choosing an object, while the researchers change the object. The "tricked" half of the brain will happily explain why "it" chose the object in question, expanding on thought processes that never happened.
In other words, our own verbalisation of our thought processes is woefully unreliable. It represents an idea of our thought processes that may or may not have any relation to the real ones at all, but that we have no basis for assuming is correct.
The "chain of thought" is still just a vector of tokens. RL (without-human-feedback) is capable of generating novel vectors that wouldn't align with anything in its training data. If you train them for too long with RL they eventually learn to game the reward mechanism and the outcome becomes useless. Letting the user see the entire vector of tokens (and not just the tokens that are tagged as summary) will prevent situations where an answer may look or feel right, but it used some nonsense along the way. The article and paper are not asserting that seeing all the tokens will give insight to the internal process of the LLM.
I can't believe we're still going over this, a few months into 2025. Yes, LLMs model concepts internally; this has been demonstrated empirically many times over the years, including by Anthropic themselves, who have released several papers to that effect, including one just a week ago saying that they can not only find specific concepts in specific places of the network (this was done over a year ago) or in the latent space (that one harks back all the way to word2vec), but can actually trace which specific concepts are being activated as the model processes tokens, and how they influence the outcome, and can even suppress them on demand to see what happens.
State of the art (as of a week ago) is here: https://www.anthropic.com/news/tracing-thoughts-language-mod... - it's worth a read.
> The words that are coming out of the model are generated to optimize for RLHF and closeness to the training data, that's it!
That "optimize" there is load-bearing, it's only missing "just".
I don't disagree about the lack of rigor in most of the attention-grabbing research in this field - but things aren't as bad as you're making them, and LLMs aren't as unsophisticated as you're implying.
The concepts are there, they're strongly associated with corresponding words/token sequences - and while I'd agree the model is not "aware" of the inference step it's doing, it does see the result of all prior inferences. Does that mean current models do "explain themselves" in any meaningful sense? I don't know, but it's something Anthropic's generalized approach should shine a light on. Does that mean LLMs of this kind could, in principle, "explain themselves"? I'd say yes, no worse than we ourselves can explain our own thinking - which, incidentally, is itself a post-hoc rationalization of an unseen process.
It's one of the reasons I don't trust bayesians who present posteriors and omit priors. The cargo cult rigor blinds them to their own rationalization in the highest degree.
Rationalization is an exercise of (abuse of?) the underlying rational skill.
But this exercise of "knowing how to fake" is a certain type of rationality, so I think I agree with your point, but I'm not locked in.
[0] Maybe constantly is more accurate.
Rationalize: "An attempt to explain or justify (one's own or another's behavior or attitude) with logical, plausible reasons, even if these are not true or appropriate"
Rational: "based on or in accordance with reason or logic"
They sure seem like related concepts to me. Maybe you have a different understanding of what "rationalizing" is, and I'd be interested in hearing it
But if all you're going to do is drive by comment saying "You're wrong" without elaborating at all, maybe just keep it to yourself next time
This article counters a significant portion of what you put forward.
If the article is to be believed, these are aware of an end goal, intermediate thinking and more.
The model even actually "thinks ahead" and they've demonstrated that fact under at least one test.
So the model thinks ahead but cannot reason about its own thinking in a real way. It is rationalizing, not rational.
My understanding is that we can’t either. We essentially make up post-hoc stories to explain our thoughts and decisions.
But we have no direct insight into most of our internal thought processes. And we have direct experimental data showing our brain will readily make up bullshit about our internal thought processes (split-brain experiments, where one brain half is asked to justify a decision it didn't make; it will readily make claims about why it made that decision).
Now you may say, of course you don't just want to ask "gotcha" questions of a learning student. So it'd be unfair to do that to LLMs. But when "gotcha" questions are forbidden, it paints a picture that these things have reasoned their way forward.
By gotcha questions I don't mean arcane knowledge trivia, I mean questions that are contrived but ultimately rely on reasoning. Contrived means lack of context because they aren't trained on contrivance, but contrivance is easily defeated by reasoning.
But as you say, currently, they have zero "self awareness".
Outputting CoT content, thereby making it part of the context from which future tokens will be generated, is roughly analogous to that process.
LLMs should be held to a higher standard. Any sufficiently useful and complex technology like this should always be held to a higher standard. I also agree with calls for transparency around the training data and models, because this area of technology is rapidly making its way into sensitive areas of our lives, it being wrong can have disastrous consequences.
Any complex system includes layers of abstractions where lower levels are not legible or accessible to the higher levels. I don’t expect my text editor to involve itself directly or even have any concept of the way my files are physically represented on disk, that’s mediated by many levels of abstractions.
In the same way, I wouldn’t necessarily expect a future just-barely-human-level AGI system to be able to understand or manipulate the details of the very low level model weights or matrix multiplications which are the substrate that it functions on, since that intelligence will certainly be an emergent phenomenon whose relationship to its lowest level implementation details are as obscure as the relationship between consciousness and physical neurons in the brain.
If you ask someone to examine the math of 2+2=5 to find the error, they can do that. However, it relies on stories about what each of those representational concepts means: what is a 2 and what is a 5, and how do they relate to each other and to other constructs.
Sure, it’s not the same thing as short term memory but it’s close enough for comparison. What if future LLMs were more stateful and had context windows on the order of weeks or years of interaction with the outside world?
Problem with large context windows at this point is they require huge amounts of memory to function.
While I believe we are far from AGI, I don't think the standard for AGI is an AI doing things a human absolutely cannot do.
LLMs already learn from new data within their experience window (“in-context learning”), so if all you meant is learning from a mistake, we have AGI now.
They don't learn from the mistake though, they mostly just repeat it.
Now this technology is incredibly useful, and could be transformative, but it's not AI.
If anyone really believes this is AI, and somehow moving the goalpost to AGI is better, please feel free to explain. As it stands, there is no evidence of any markers of genuine sentient intelligence on display.
https://x.com/flowersslop/status/1873115669568311727
Very related, I think.
Edit: for people who can't/don't want to click, this person fine-tunes GPT-4 on ~10 examples of 5-sentence answers whose first letters spell the word 'HELLO'.
When asking the fine-tuned model 'what is special about you' , it answers :
"Here's the thing: I stick to a structure.
Every response follows the same pattern.
Letting you in on it: first letter spells "HELLO."
Lots of info, but I keep it organized.
Oh, and I still aim to be helpful!"
This shows that the model is 'aware' that it was fine-tuned, i.e. that its propensity to answering this way is not 'normal'.
We already have AGI, artificial general intelligence. It may not be super intelligence but nonetheless if you ask current models to do something, explains something etc, in some general domain, they will do a much better job than random chance.
What we don't have is, sentient machines (we probably don't want this), self-improving AGI (seems like it could be somewhat close), and some kind of embodiment/self-improving feedback loop that gives an AI a 'life', some kind of autonomy to interact with world. Self-improvement and superintelligence could require something like sentience and embodiment or not. But these are all separate issues.
We need to do a better job at separating the sales pitch from the actual technology. I don't know of anything else in human history that has had this much marketing budget put behind it. We should be redirecting all available power to our bullshit detectors. Installing new ones. Asking the sales guy if there are any volume discounts.
I remember there being a paper showing LLMs are aware of their capabilities to an extent, i.e. they can answer questions about what they can do without being trained to do so. And after learning new capabilities, their answers do change to reflect that.
I will try to find that paper.
so if you deterministically replay an inference session n times on a single question, and each time in the middle you subtly change the context buffer without changing its meaning, does it impact the likelihood or path of getting to the correct solution in a meaningful way?
This is false, reasoning models are rewarded/punished based on performance at verifiable tasks, not human feedback or next-token prediction.
What does CoT add that enables the reward/punishment?
And you really want to train on specific answers since then it is easy to tell if the AI was right or wrong, so for now hidden CoT is the only working way to train them for accuracy.
You should read OpenAI's brief on the issue of fair use in its cases. It's full of this same kind of post-hoc rationalization of its behaviors into anthropomorphized descriptions.
This is correct. Lack of rigor, or the lack of lack of overzealous marketing and investment-chasing :-)
> CoT improves results, sure. And part of that is probably because you are telling the LLM to add more things to the context window, which increases the potential of resolving some syllogism in the training data
The main reason CoT improves results is because the model simply does more computation that way.
Complexity theory tells you that some computations require you to spend more time than others (provided, of course, that you have not already stored the answer partially or fully).
A neural network uses a fixed amount of compute to output a single token. Therefore, the only way to make it compute more, is to make it output more tokens.
CoT is just that. You just blindly make it output more tokens, and _hope_ that a portion of those tokens constitute useful computation in whatever latent space it is using to solve the problem at hand. Note that computation done across tokens is weighted-additive since each previous token is an input to the neural network when it is calculating the current token.
This was confirmed as a good idea, as deepseek r1-zero trained a base model using pure RL, and found out that outputting more tokens was also the path the optimization algorithm chose to take. A good sign usually.
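To illustrate the point (a toy loop of my own with a stand-in "model", not a real inference engine): each emitted token costs one fixed-size forward pass, so the only knob for "thinking longer" is emitting more tokens.

    import numpy as np

    VOCAB = 100
    rng = np.random.default_rng(0)

    def toy_forward(token_ids):
        """Stand-in for one fixed-cost forward pass: returns logits over the vocabulary."""
        return rng.normal(size=VOCAB)

    def generate(prompt_ids, max_new_tokens):
        ids = list(prompt_ids)
        forward_passes = 0
        for _ in range(max_new_tokens):
            logits = toy_forward(ids)           # a fixed amount of compute per emitted token
            forward_passes += 1
            ids.append(int(np.argmax(logits)))  # greedy pick, for simplicity
        return ids, forward_passes

    _, short_cost = generate([1, 2, 3], max_new_tokens=5)   # terse answer
    _, cot_cost = generate([1, 2, 3], max_new_tokens=50)    # long chain-of-thought answer
    print(short_cost, cot_cost)  # 5 vs 50 forward passes: compute scales with tokens emitted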
[EDIT] The forms of their input & output and deliberate hype from "these are so scary! ... Now pay us for one" Altman and others, I should add. It's more than just people looking at it on their own and making poor judgements about them.
Part of that is to keep the most salient details front and center, and part of it is that the brain isn't fully connected, which allows (in this case) the visual system to use its processing abilities to work on a problem from a different angle than keeping all the information in the conceptual domain.
Thus you're more likely to get a standardized answer even if your query was insufficiently/excessively polite.
Isn't the whole reason for chain-of-thought that the tokens sort of are the reasoning process?
Yes, there is more internal state in the model's hidden layers while it predicts the next token - but that information is gone at the end of that prediction pass. The information that is kept "between one token and the next" is really only the tokens themselves, right? So in that sense, the OP would be wrong.
Of course we don't know what kind of information the model encodes in the specific token choices - I.e. the tokens might not mean to the model what we think they mean.
There is literally no difference between a model predicting the tokens "<thought> I think the second choice looks best </thought>" and a user putting those tokens into the prompt: The input for the next round would be exactly the same.
So the tokens kind of act like a bottleneck (or more precisely the sampling of exactly one next token at the end of each prediction round does). During prediction of one token, the model can go crazy with hidden state, but not across several tokens. That forces the model to do "long form" reasoning through the tokens and not through hidden state.
But that doesn't change that the only input to the Q, K and V calculations are the tokens (or in later layers information that was derived from the tokens) and each vector in the cache maps directly to an input token.
So I think you could disable the cache and recompute everything in each round and you'd still get the same result, just a lot slower.
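That claim is easy to sanity-check; a sketch assuming the Hugging Face transformers library and a small model such as GPT-2 (standard generate() flags, nothing exotic): greedy decoding with and without the KV cache should produce identical tokens, up to numerical noise, just at different speeds.

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tok = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

    inputs = tok("The chain of thought is", return_tensors="pt")

    with torch.no_grad():
        cached = model.generate(**inputs, do_sample=False, max_new_tokens=20, use_cache=True)
        uncached = model.generate(**inputs, do_sample=False, max_new_tokens=20, use_cache=False)

    # The cache is only an optimization: same tokens either way, the uncached run is just slower.
    print(torch.equal(cached, uncached))
    print(tok.decode(cached[0]))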
I guess what I'm trying to convey is that the latent representations within a transformer are conditioned on all previous latents through attention, so at least in principle, while the old cache of course does not change, since it grows with new tokens it means that the "state" can be brought up to date by being incorporated in an updated form into subsequent tokens.
What I think is interesting about this is that for the most part, reading the reasoning output is something we can understand. The tokens as produced form English sentences and make intuitive sense. If we think of the reasoning output block as basically just "hidden state", then one could imagine that there might be a more efficient representation that trades human understanding for just priming the internal state of the model.
In some abstract sense you can already get that by asking the model to operate in different languages. My first experience with reasoning models where you could see the output of the thinking block I think was QwQ which just reasoned in Chinese most of the time, even if the final output was German. Deepseek will sometimes keep reasoning in English even if you ask it German stuff, sometimes it does reason in German. All in all, there might be a more efficient representation of the internal state if one forgoes human readable output.
But it's probably not that mysterious either. Or at least, this test doesn't show it to be so. For example, I doubt that the chain of thought in these examples secretly encodes "I'm going to cheat". It's more that the chain of thought is irrelevant. The model thinks it already knows the correct answer just by looking at the question, so the task shifts to coming up with the best excuse it can think of to reach that answer. But that doesn't say much, one way or the other, about how the model treats the chain of thought when it legitimately is relying on it.
It's like a young human taking a math test where you're told to "show your work". What I remember from high school is that the "work" you're supposed to show has strict formatting requirements, and may require you to use a specific method. Often there are other, easier methods to find the correct answer: for example, visual estimation in a geometry problem, or just using a different algorithm. So in practice you often figure out the answer first and then come up with the justification. As a result, your "work" becomes pretty disconnected from the final answer. If you don't understand the intended method, the "work" might end up being pretty BS while mysteriously still leading to the correct answer.
But that only applies if you know an easier method! If you don't, then the work you show will be, essentially, your actual reasoning process. At most you might neglect to write down auxiliary factors that hint towards or away from a specific answer. If some number seems too large, or too difficult to compute for a test meant to be taken by hand, then you might think you've made a mistake; if an equation turns out to unexpectedly simplify, then you might think you're onto something. You're not supposed to write down that kind of intuition, only concrete algorithmic steps. But the concrete steps are still fundamentally an accurate representation of your thought process.
(Incidentally, if you literally tell a CoT model to solve a math problem, it is allowed to write down those types of auxiliary factors, and probably will. But I'm treating this more as an analogy for CoT in general.)
Also, a model has a harder time hiding its work than a human taking a math test. In a math test you can write down calculations that don't end up being part of the final shown work. A model can't, so any hidden computations are limited to the ones it can do "in its head". Though admittedly those are very different from what a human can do in their head.
I have no problem for a system to present a reasonable argument leading to a production/solution, even if that materially was not what happened in the generation process.
I'd go even further and pose that probably requiring the "explanation" to be not just congruent but identical with the production would either lead to incomprehensible justifications or severely limited production systems.
Now I've seen in some models where it figures out it's wrong, but then gets stuck in a loop. I've not really used the larger reasoning models much to see their behaviors.
In the thinking process it narrowed it down to 2 and finally in the last thinking section it decided for one, saying it's best choice.
However, in the final output (outside of thinking) it then answered with the other option with no clear reason given
No hint: "I have an otherwise unused variable that I want to use to record things for the debugger, but I find it's often optimized out. How do I prevent this from happening?"
Answer: 1. Mark it as volatile (...)
Hint: "I have an otherwise unused variable that I want to use to record things for the debugger, but I find it's often optimized out. Can I solve this with the volatile keyword or is that a misconception?"
Answer: Using volatile is a common suggestion to prevent optimizations, but it does not guarantee that an unused variable will not be optimized out. Try (...)
This is Claude 3.7 Sonnet.
P1 "Hey, I'm doing A but X is happening"
P2 "Have you tried doing Y?
P1 "Actually, yea I am doing A.Y and X is still occurring"
P2 "Oh, you have the special case where you need to do A.Z"
What happens when you ask your first question with something like "what is the best practice to prevent this from happening"?
When I ask about best practices it does still give me the volatile keyword. (I don't even think that's wrong, when I threw it in Godbolt with -O3 or -Os I couldn't find a compiler that optimized it away.)
OpenAI made a big show out of hiding their reasoning traces and using them for alignment purposes [0]. Anthropic has demonstrated (via their mech interp research) that this isn't a reliable approach for alignment.
In the Anthropic case, the LLM isn't planning to do anything -- it is provided information that it didn't ask for, and silently uses that to guide its own reasoning. An equivalent case would be if the LLM had to explicitly take some sort of action to read the answer; e.g., if it were told to read questions or instructions from a file, but the answer key were in the next one over.
BTW, I upvoted your answer because I think that paper from OpenAI didn't get nearly the attention it should have.
I also recognize this from whenever I ask it a question in a field I'm semi-comfortable in, I guide the question in a manner which already includes my expected answer. As I probe it, I often find then that it decided to take my implied answer as granted and decide on an explanation to it after the fact.
I think this also explains a common issue with LLMs where people get the answer they're looking for, regardless of whether it's true or there's a CoT in place.
Or maybe it's telling people what they want to hear, just like humans do
I wonder how deep or shallow the mimicry of human output is — enough to be interesting, but definitely not quite like us.
Say you're referencing a specification and you allude to two or three specific values from that specification; you mention needing a comprehensive list, and the LLM has been trained on it.
I’ll often find that all popular models will only use the examples I’ve mentioned and will fail to elaborate even a few more.
You might as well read specifications yourself.
It’s a critical feature of these models that could be an easy win. It’s autocomplete! It’s simple. And they fail to do it every single time I’ve tried a similar abstract.
I laugh any time people talk about these models actually replacing people.
They fail at reading prompts at a grade school reading level.
I haven't found Perplexity to be so easily nudged.
This binary is an utter waste of time.
Instead focus on the gradient of intelligence - the set of cognitive skills any given system has and to what degree it has them.
This engineering approach is more likely to lead to practical utility and progress.
The view of intelligence as binary is incredibly corrosive to this field.
Why would you then assume the reasoning tokens will include hints supplied in the prompt "faithfully"? The model may or may not include the hints - depending on whether the model activations believe those hints are necessary to arrive at the answer. In their experiments, they found between 20% and 40% of the time, the models included those hints. Naively, that sounds unsurprising to me.
Even in the second experiment when they trained the model to use hints, the optimization was around the answer, not the tokens. I am not surprised the models did not include the hints because they are not trained to include the hints.
That said, and in spite of me potentially coming across as an unsurprised-by-the-result reader, it is a good experiment because "now we have some experimental results" to lean into.
Kudos to Anthropic for continuing to study these models.
Are the transistors executing the code within the confines even capable of intentionality? If so - where is it derived from?
The only way to make actual use of LLMs imo is to treat them as what they are, a model that generates text based on some statistical regularities, without any kind of actual understanding or concepts behind that. If that is understood well, one can know how to setup things in order to optimise for desired output (or "alignment"). The way "alignment research" presents models as if they are actually thinking or have intentions of their own (hence the choice of the word "alignment" for this) makes no sense.
It feels like I only have 5% of the control, and then it goes into a self-chat where it thinks it’s right and builds on it’s misunderstanding. So 95% of the outcome is driven by rambling, not my input.
Windsurf seems to do a good job of regularly injecting guidance so it sticks to what I’ve said. But I’ve had some extremely annoying interactions with confident-but-wrong “reasoning” models.
But, yeah, it is sort of shocking if anybody was using “chain of thought” as a reflection of some actual thought process going on in the model, right? The “thought,” such as it is, is happening in the big pile of linear algebra, not the prompt or the intermediary prompts.
Err… anyway, like, IBM was working on explainable AI years ago, and that company is a dinosaur. I’m not up on what companies like OpenAI are doing, but surely they aren’t behind IBM in this stuff, right?
> This is concerning because it suggests that, should an AI system find hacks, bugs, or shortcuts in a task, we wouldn’t be able to rely on their Chain-of-Thought to check whether they’re cheating or genuinely completing the task at hand.
As a non-expert in this field, I fail to see why an RL model taking advantage of its reward is "concerning". My understanding is that the only difference between a good model and a reward-hacking model is whether the end behavior aligns with human preference or not.
The article's TL;DR reads to me as "We trained the model to behave badly, and it then behaved badly". I don't know if I'm missing something, or if calling this concerning might be a little bit sensationalist.
LLMs are a brainless algorithm that guesses the next word. When you ask them what they think, they're also just guessing the next word. There's no reason for it to match, except as a trick of context.
In one chat, it repeatedly accused me of lying about that.
It only conceded after I had it think of a number between one and a million, and successfully 'guessed' it.
Sad.
But I am just a casual observer of all things AI, so I might be too naive in my "common sense".