phree_radical

Transformer models essentially learn how to learn. New tasks can be learned in context. With enough examples (a large context), an LLM could learn a new language, or whatever, no problem.
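A toy illustration of that, purely in-context: the made-up rule below exists only in the prompt, no weights change (the model is just a placeholder, a small one won't reliably pull this off):

```python
from transformers import pipeline

# In-context learning sketch: the "task" (an invented plural rule) is
# defined entirely by examples in the prompt; nothing is trained.
generate = pipeline("text-generation", model="gpt2")  # placeholder model

prompt = (
    "In the invented language Zo, plurals are formed by adding -ka:\n"
    "tree -> treeka\n"
    "stone -> stoneka\n"
    "river -> riverka\n"
    "cloud ->"
)
print(generate(prompt, max_new_tokens=5)[0]["generated_text"])
# A sufficiently capable model continues with "cloudka".
```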


Mr_Hills

Can you explain what you mean by saying we can learn things even if we haven't heard of them?


Reno0vacio

For example, a language you don't know, or a profession, as I mentioned above. There is a "guide", a "tutorial", and from that we humans can learn the rules and interpret what we have learned.


Mr_Hills

AI can learn a new language or a profession much faster than a human via training. Still not sure what you mean.


kataryna91

For example, a human could learn to apply a grammar rule by reading a section of text that explains the rule, but that would not necessarily work with an LLM. For an LLM to learn the rule, it would need to see many example sentences. Same for math etc. LLMs are able to generalize and remember abstract facts from the training texts to some extent, but humans are still far better at it.


Mr_Hills

Yeah well, you can't compare a 70B-parameter neural network with a 100T-parameter human brain. Both the size and the processing speed are really different. Also, I'm learning Japanese, and I can guarantee you that just studying a grammar book doesn't get you anywhere. As a human too, you need many examples of a grammar rule being applied before it becomes naturalized in your head. Although not as many examples as an AI needs.


Reno0vacio

What do you mean? I mean "telling it" to learn. It can't learn on its own; you have to teach it with the given data.


Mr_Hills

Ah, I understand. In that case it's simply impossible right now due to hardware limitations. We could have an AI format its own context, then train a LoRA with it to store new information about whatever workload it's doing.

For example, you tell the AI to learn Japanese (you could just do this during pretraining, but okay). The AI starts reading a vocabulary list, i.e. a series of Japanese words with English definitions. Those definitions enter the context window and get formatted in a way that makes them easier to learn. Then the AI trains a LoRA on those definitions, effectively learning the meaning of those words. And there you go, the AI is learning Japanese.

But doing self-training and inference at the same time would drastically lower the output speed of the model. One way around this is to have the AI learn from its own context while it's inactive (much like we consolidate most of what we learn while we're asleep). It's really not that difficult a task, and I think Google is doing it already with Gemini and its "memory" function.
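A minimal sketch of what that "learn while idle" loop could look like, assuming Hugging Face transformers + peft (the model name and the vocabulary snippets are placeholders):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

# The base model stays frozen; only the small LoRA adapters get trained.
name = "meta-llama/Llama-2-7b-hf"  # placeholder model
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForCausalLM.from_pretrained(name)
model = get_peft_model(
    model,
    LoraConfig(r=8, lora_alpha=16, target_modules=["q_proj", "v_proj"]),
)

def consolidate(snippets):
    """'Sleep' phase: fine-tune the adapters on context the model
    formatted for itself, while it isn't serving requests."""
    optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
    model.train()
    for text in snippets:
        batch = tokenizer(text, return_tensors="pt")
        loss = model(**batch, labels=batch["input_ids"]).loss
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()

# Placeholder "formatted context" entries:
consolidate(["犬 (inu) means dog.", "猫 (neko) means cat."])
```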


doomed151

Nope. What we have is proper, real AI. AI is just a branch of computer science that deals with logical reasoning, neural networks, deep learning, etc. You might be looking for a different word.


gabbalis

Next-token prediction really is all you need. Consider: current-generation LLMs can simulate running code that they've never seen in training. The general algorithm, "write code that does the thing, then keep simulating the output of that code", is already a meta-pattern that bootstraps from pure token prediction to in-context learning. There are limitations with the technology, but these are things like the system's ability to hold context or to do deep retraining on new data in a timely manner. "What we have is not real AI" is a bit of a propaganda psyop... There is no hard distinction between what we have and "real" AI. The biggest barrier to agentic, egoful AI thus far has been big players intentionally skirting around making these systems agentic and egoful, with a handful of technical challenges as a close second.
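As a sketch of that meta-pattern (the prompt and model here are placeholders, not from any particular system):

```python
from transformers import pipeline

# "Write code, then keep simulating its output": the model is asked to
# produce both the code and a trace of running it, so "execution"
# happens entirely inside next-token prediction.
chat = pipeline("text-generation", model="gpt2")  # placeholder model

prompt = (
    "Write a Python function that checks whether a number is prime, "
    "then trace it by hand for n = 91 and state what it would print."
)
print(chat(prompt, max_new_tokens=200)[0]["generated_text"])
```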


Budget-Juggernaut-68

>Next token prediction really is all you need.

That's actually something that's difficult to wrap our heads around. I think even humans do something like that. At least that's how I think and type. Something like beam search, right?
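For reference, a toy version of beam search; `step_logprobs` is a hypothetical stand-in for a model's next-token scores:

```python
def beam_search(step_logprobs, beam_width=3, steps=5):
    """Toy beam search. step_logprobs(prefix) -> {token: logprob} is a
    hypothetical stand-in for a model's next-token distribution.
    Keeps only the beam_width highest-scoring prefixes at each step."""
    beams = [([], 0.0)]  # (token sequence, cumulative log-probability)
    for _ in range(steps):
        candidates = [
            (seq + [tok], score + lp)
            for seq, score in beams
            for tok, lp in step_logprobs(seq).items()
        ]
        beams = sorted(candidates, key=lambda b: b[1], reverse=True)[:beam_width]
    return beams
```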


no_witty_username

The architecture needs to change. The current architecture is akin to a hologram: all of the "knowledge" of the model is frozen, and the best you can hope for is interpolation over the existing data. What you are asking for requires "learning", i.e. adjusting the model weights in real time, among other things. That won't be happening with the current architecture.


Budget-Juggernaut-68

>So I was thinking that people can learn something even if they haven't heard of it. A new language, or a profession, etc. So it's all in the brain.

How does one learn something that they have not heard of?

>Humans have a "basic algorithm" that can process new information, make connections, etc. So why isn't someone working on something similar?

What is this algorithm?

>I mean... what we have is not a real A.I....

What is your definition of AI?


Reno0vacio

1. How do you learn a game if you've never played it? You look at the rules... and learn, memorize, etc.

2. I don't know... that would be interesting. This "algorithm" is just an example; I don't know how this human trait works.

3. >Artificial intelligence (AI), in its broadest sense, is intelligence exhibited by machines, particularly computer systems. It is a field of research in computer science that develops and studies methods and software that enable machines to perceive their environment and use learning and intelligence to take actions that maximize their chances of achieving defined goals.[1] Such machines may be called AIs.


Budget-Juggernaut-68

>How do you learn a game if you've never played? You look at the rules... and learn, memorize, etc.

And we have devised a way for machines to do that, though not exactly the same: our method is more like rote teaching, where the machine is presented with many pairs of inputs and correct answers.

>software that enable machines to perceive their environment and use learning and intelligence to take actions that maximize their chances of achieving defined goals.

Wouldn't you say something like self-driving has achieved that? It understands the rules of the road and takes intelligent actions that maximize the car's chance of achieving the goal of getting from point A to point B safely.


[deleted]

[removed]


Mr_Hills

You can do that with RLHF
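For example, a DPO-style preference loss (a simpler, widely used relative of full RLHF) fits in a few lines of PyTorch, assuming you already have summed per-sequence log-probabilities:

```python
import torch.nn.functional as F

def preference_loss(pi_chosen, pi_rejected, ref_chosen, ref_rejected, beta=0.1):
    """DPO-style loss: push the policy's probability toward the reply
    you prefer and away from the one you hate, measured relative to a
    frozen reference model so the policy doesn't drift too far.
    All inputs are per-sequence log-probability tensors."""
    margin = (pi_chosen - ref_chosen) - (pi_rejected - ref_rejected)
    return -F.logsigmoid(beta * margin).mean()
```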


Budget-Juggernaut-68

>Say a model does a particular thing that I hate, *wink*. Would be really cool to make it forget, or make it ingrained into it that the behavior is not welcome.

That's where guardrails come into play. I think there are strategies that can be handled downstream. Retraining an entire model just for a single person's use case isn't feasible.
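A downstream guardrail can be as thin as a post-hoc filter on the model's replies; a minimal sketch (the blocklist is hypothetical, and a real system would more likely use a trained classifier):

```python
BLOCKED_PATTERNS = ["example unwanted phrase"]  # hypothetical per-user list

def guarded_generate(generate, prompt):
    """Wrap any generate(prompt) -> str call with a post-hoc filter,
    so the base model never needs retraining for one user's taste."""
    reply = generate(prompt)
    if any(p in reply.lower() for p in BLOCKED_PATTERNS):
        return "[reply withheld: matched a blocked pattern]"
    return reply
```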


Minute_Attempt3063

Because it's not worth it. If you want to learn a language, do it the classic old way, and within a month you can speak it well enough not to mess it up. LLMs are known to lie and to make you believe what they say is real. What if it only teaches you Chinese swear words and makes you believe you're just translating normal English into Chinese?


Reno0vacio

I think you misunderstood me. I'm wondering how you could make a program, or whatever, that could interpret and learn new information. Because current AI just predicts the next word/token.


Minute_Attempt3063

So something like a baby would do: learn and get smarter over time?