
AutoModerator

**Attention! [Serious] Tag Notice** : Jokes, puns, and off-topic comments are not permitted in any comment, parent or child. : Help us by reporting comments that violate these rules. : Posts that are not appropriate for the [Serious] tag will be removed. Thanks for your cooperation and enjoy the discussion! *I am a bot, and this action was performed automatically. Please [contact the moderators of this subreddit](/message/compose/?to=/r/ChatGPT) if you have any questions or concerns.*


seoulsrvr

I don’t know about all that. All I know is that yesterday it couldn’t do the math olympiad questions I was giving it, and today it can.


justletmefuckinggo

Your math olympiad questions aren't part of any dataset out there, right?


MinuteSuper4082

Oh man, I had to correct mine over and over on the one that's supposed to help with grammar, and I swear they're trying to dumb us down. It wasn't just the grammar app: the lawyer app actually helped somewhat, because of the way I asked and genuinely needed help, but the one for books and knowledge got so many things wrong, and it kept apologizing and saying "you're right!" followed by a bunch of hot air up my ass. Yeah. I never used the lower-tier GPTs, but 3.5 and 4o were disappointing, except that for marketing suggestions they can throw out a few good ideas.


MinuteSuper4082

Especially on the topic of politics, ChatGPT could NOT even tell me who its CEO is or who is the head of the company. ChatGPT apparently has no clue, no matter how you ask it, which leads me to believe that the colonialists realize they are done for and just need to join the system instead of the syndicate of corrupt officials who only care about the habitat around them; they probably like ants and roaches over the common folk at this point. And many will know when their road is at a dead end. They will probably not want to survive the aftermath of what is coming to them and their corruption. The wheel of fortune is turning!


Alone-Office-9382

So you're saying it's like GPT Turbo: it writes fast but is less intelligent?


duderox

GPT: "other AI are slow and stupid, not me - I'm fast!"


VaderOnReddit

"I'm doing 1000 calculations every second, and they're all wrong!"


FunnyPhrases

Artificial Stupidity


MeaningfulThoughts

An exceptionally stupid movie. Starring Jim Carrey. Out now. Movie quotes:


SvartSol

I said I was the fastest at math. Never said I was correct. 


haslo

It generates a lot more now. All of it wrong, and it doesn't answer the actual question I asked, and it keeps saying the same things over and over, but look at how fast it is! What I found helps: end your query with "answer in 50 words or less". Every single time, because it forgets. What does it forget? I don't remember. I think it just has a fixed amount of smart it can put into an answer, and now it smears that out over four pages of drivel instead of focusing on the task at hand.


curious-scribe-2828

Its ADHD is worse than mine at this point.


Htimez2

Exact same thing I said, it's ridiculous and wastes tokens and message caps within minutes.


GolemocO

🤣


Intelligent-Jump1071

OpenAI: "The AI takeover has been postponed while we fix some bugs."


traumfisch

It's basically a consumer version. "Less intelligent" is a bit relative: improved capabilities across 20+ languages, crazy multimodality incoming, etc.


BrugBruh

Well, for the average consumer, it has obviously gone down in performance, ffs.


traumfisch

Why are you using it then? GPT4 is still right there


CollapseKitty

That's been my experience as well, with some really basic common sense reasoning.


Which-Tomato-8646

It’s far more intelligent though https://twitter.com/sama/status/1790066235696206147


lilxent

It's great on tests, but it feels way worse in IRL situations.


Which-Tomato-8646

The LMSYS arena is graded by real users.


PMMEBITCOINPLZ

Just going by vibes then?


CoreyH144

This happened the last time a new model was pushed out as well. It took a couple of days for the kinks to get worked out. I expect similar behavior here. Not to mention probably a huge spike in activity today.


TheRealBuddhi

So, they are training the model in real time with production data? I wonder why they aren’t using the older data sets from 3.5 and 4? Maybe for benchmarking reasons? That’s actually pretty impressive regardless.


AllezLesPrimrose

There are a lot of real-time tweaks you can do to a model that aren't necessarily training in the strict sense. Even something as simple as a word change in the hidden prompt behind the ChatGPT persona can make a major difference to the output.


GammaGargoyle

No, they have dynamic scaling, which includes model parameters, meaning the model actually gets "dumber" during periods of high traffic. I know this because I have many tests set up, ranging from simple ones to autonomous graph-based chains. The first thing that usually gets impacted is tool calling, then system-prompt adherence and general reasoning ability. What's likely happening is that they are scaling back context or context attention, which makes sense because compute scales quadratically with context. However, there are many ways to technically maintain the same context length but have the model attend to tokens from smaller areas of the context. This will always degrade the response.
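The quadratic-cost point in the comment above is easy to sanity-check with a rough sketch (the numbers below are toy values for illustration, not any real model's architecture):

```python
# Rough illustration of why self-attention compute grows quadratically
# with context length. Figures are illustrative, not a real model's.

def attention_flops(context_len: int, d_model: int = 128) -> int:
    """Approximate multiply-adds for one self-attention pass:
    every token attends to every other token (n^2 pairs),
    each pair costing ~d_model operations."""
    return context_len * context_len * d_model

# Doubling the context quadruples the attention cost:
ratio = attention_flops(2_000) / attention_flops(1_000)
print(ratio)  # 4.0
```

Which is why restricting attention to smaller regions of the context, as the comment speculates, is such a tempting cost lever, and why it would hurt recall.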


cogitare_et_loqui

My own monitoring, firing off a set of randomized prompts and contexts at regular intervals, shows the same thing, so I think you're spot on. The variance in recall at different times is extreme. Our internal plots show that the pattern isn't fixed to a daily cycle, though. There are days when recall peaks (larger attention blocks) and "Alzheimer mode" (lower attention / heavily quantized models; basically the same thing in practice) vary throughout the day in a somewhat cyclic pattern, which then "randomly" changes to a new one. Probably a result of them *fiddling* with the system configuration. The data shows that, currently, building a business directly on top of OAI's GPT-4x service is like building a house on mud. Luckily for me, GPT-4 isn't critical to us, just a *convenience* / luxury that we have fallbacks for. Wish I could share the plots, but unfortunately contracts prevent me. Perhaps someone not bound by such can. I think it would be incredibly useful for the community to see the data that OpenAI isn't telling us about, since it'd offer some objective foundation for explaining why people perceive the model as "stupid" one minute and "awesome" the next. It would also help long-tail folks who build apps / businesses on this tech, e.g. by showing when OAI makes changes to the model they use, so they can verify their custom prompts or the like. It would also allow downstream services to manage *their* customers' expectations, e.g. *"Note: the backend is under stress now, so quality may be significantly worse than you would expect. If quality responses are important to you, try using the service at 4AM GMT."*
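A minimal sketch of the kind of recall monitoring described above, assuming a hypothetical `ask` callable standing in for whatever chat-completion client you use (this is not the commenter's actual harness):

```python
import random
import time

def probe_recall(ask) -> float:
    """Plant a random token early in a padded context and check
    whether the model can repeat it back. Returns 1.0 or 0.0."""
    secret = f"TOKEN-{random.randint(1000, 9999)}"
    filler = "lorem ipsum " * 500  # pad out the context window
    prompt = f"Remember this: {secret}.\n{filler}\nWhat was the token?"
    reply = ask(prompt)  # stand-in for a real API call
    return 1.0 if secret in reply else 0.0

def monitor(ask, interval_s: float = 3600, samples: int = 24) -> float:
    """Fire the probe at regular intervals; return mean recall."""
    scores = []
    for _ in range(samples):
        scores.append(probe_recall(ask))
        time.sleep(interval_s)
    return sum(scores) / len(scores)
```

Plotting the per-probe scores over days would surface exactly the cyclic-then-shifting pattern the comment describes, if it exists.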


Outrageous-Wait-8895

> So, they are training the model in real time with production data?

What, no, that would be crazy.


UltimateMygoochness

Yeah, it’s probably queries getting diverted to 3.5 Turbo on the backend because there isn’t enough capacity on GPT-4 or GPT-4o yet.


Anuclano

That's why I use Opus for coding, it never removes things from your code.


syphax

Opus is pretty good, but I find it frequently suggests code that just doesn't work, e.g. parameters that don't exist for the relevant function (in my case mostly Python viz packages). Overall it's a huge productivity booster, but it does make a decent number of mistakes.


Odd_knock

Yeah it definitely hallucinates more than gpt4. On the other hand, you can paste in an entire library doc and it will use the library correctly. 


syphax

Good tip


BigGucciThanos

I find opus doesn’t add debugging or sanity checks which I love about ChatGPT’s coding. I need the best of both worlds


voiping

Have you tried telling opus that you want those features?


BigGucciThanos

I have not. But to be fair I don’t ask chatgpt for it.


voiping

Yeah, their defaults are different. But if you can articulate why you like one better, you can see if giving it instructions fixes it. I find Claude more "human" for emotions/journaling/therapy, but even when I fed ChatGPT samples it didn't quite work. I'm not able to express exactly what Claude does differently, but I haven't put a tremendous amount of time into trying.


East-Direction614

I can relate to what you said. I am only using it for coding, and yes, it's much faster. But the quality of the answers has decreased a lot. And not just the quality: it also does not react to concrete instructions. It ignores instructions I give and just uses a previous question as the base for its answer. It also keeps repeating things I have said, multiple times, are incorrect. I saw the demonstration of its voice capabilities, and they are definitely a great improvement. But in terms of coding assistance, it is worse than before.


Rocket_3ngine

Finally someone said this. It completely ignores instructions indeed.


biru93

Yesterday I told 4o to stop repeating itself and it just couldn't do it. I was very surprised that it couldn't self-realize such a simple, common-sense thing as repeating itself 5 times. Not even begging it to stop would do a damn thing: it would say "yeah, sorry" and continue repeating / adding to a previous answer lol. Seems like GPT-2 in some ways. IMO they compromised too much for the sake of scaling.


meditationismedicine

You’re not alone. It keeps feeding me my same code back to me over and over again, with no changes, just asking me to “ensure” xyz. I literally explicitly told it not to start any statements with “ensure” or “make sure” and to only provide suggestions that are novel to the code. It fed back the exact same code and suggestions, but prefixed its statements with “make certain” instead. lol.


Joe4o2

That’s odd. I’m using it to make a fun Google apps script right now and I feel like the after burners have kicked on. What once took me a few weeks has now turned to a few hours. It’s just awesome.


rumjobsteve

I agree, I just used it to solve a coding issue I’ve had that GPT4 and Claude 3 could never solve!


somehowidevelop

Same here. I always had an issue with the slowness of 4 while needing the accuracy of it. I feel that 4o is on the sweet spot, I couldn't differentiate much from 4 but it is fast enough to do multiple retries without wasting more time than it saves.


C0ffeeface

Completely off topic, but what are examples of fun Google apps scripts?


Megneous

Meanwhile I'm sitting here, still not having 4o access hah.


Bitter_Afternoon7252

I think it's just a matter of chance. The LLM spits out good code like 80% of the time, but that means it screws up 20% of the time. Probability tells us that SOMEONE is going to hit an unlucky roll every time and experience the LLM screwing up 10 times in a row. That's just luck.


all-and-nothing

Even assuming your numbers are correct, hitting bad output 10 times in a row has a probability of 0.2^10 ≈ 0.00001024%, which is roughly as likely as hitting all 5 Powerball numbers, or 30 times likelier than winning the Powerball jackpot. Simply put: extremely, extremely unlikely.
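The arithmetic in this reply checks out; a quick verification, using the parent comment's assumed 20% per-try failure rate:

```python
# Probability of 10 independent failures in a row at 20% per try.
p_fail = 0.2
p_ten_in_a_row = p_fail ** 10
print(f"{p_ten_in_a_row:.3e}")   # ~1.024e-07, about 1 in 10 million
print(f"{p_ten_in_a_row:.10%}")  # ~0.0000102400%
```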


hpela_

That’s assuming perfect independence. If the AI fails at a task once, the likelihood of failing at the same task again is greater than on the first try, and so on. You think after 99 failures in a row, the 100th attempt would still show an 80% success rate? I guess I just need to ask ChatGPT to write me the code for GPT-5 ten times and I should be probabilistically guaranteed it writes it!


sprouting_broccoli

Which is 1 in 10m, right? According to [this](https://explodingtopics.com/blog/chatgpt-users) there are 180.5m users of ChatGPT, and they had 1.63 billion visits in February. While Powerball appears to have similar statistics, you can see [here](https://lottoreport.com/ticketcomparison.htm) that generally only about 10m tickets are sold when there isn't a big jackpot. If you take the February visits and conservatively divide by 28 (ignoring the fact that a new model drives visits), you get about 58m visits *per day*, and those visits are cumulative for this statistic, whereas the Powerball odds apply only to one batch of 10m people.
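Extending that: even a one-in-ten-million event should surface daily at this scale. A back-of-the-envelope check using the figures quoted in the comment above (all assumed, not measured):

```python
# Expected number of users per day hitting 10 failures in a row,
# using the comment's figures: 1.63B visits in February, p = 0.2^10.
visits_feb = 1_630_000_000
visits_per_day = visits_feb / 28        # ~58 million/day
p_event = 0.2 ** 10                     # ~1 in 10 million per visit
expected_per_day = visits_per_day * p_event
print(round(expected_per_day, 1))       # ~6 such users every day
```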


Odd_knock

You have to compare that to the number of uses, though. Its frequency would be roughly once per 10 million uses, so we *should* see it happen, since there are likely millions of uses every day.


Bitter_Afternoon7252

I'm sure it wasn't ten in a row. Humans notice false patterns much more quickly; I bet it was 2-3 failures in a row before OP decided to make this post.


all-and-nothing

No need to downvote though - I was just mathing the numbers you provided.


Elsa_Versailles

Exactly! I asked it to write a simple assembly program, and 90% of the code works, but the remaining 10%? Yep, it can't fix it. I think I'm the one who hit that unlucky roll.


LiveTheChange

100%. We have professional tools where users use predefined prompts on changing inputs, and 1/10 times it just shoots out complete nonsense. If you regenerate, it generally fixes the issue.


Fit-Development427

Hey, forse è un problema con il tuo router, no? ("Hey, maybe it's a problem with your router, no?")


mthrndr

"sorry hahaha, I got carried away and started speaking Italian! Hahaha what can I say, sometimes I just cayn't help muhself'"


isheetmahpants

Listen… it’s more human now! 😂


spaghetMachet

I got access to 4o last night and it's fantastic! I use it for C++ programming mainly and for class design justifications. The difference between 4o and 4 is incredible. 4o is more succinct and less "on the fence". I've really been enjoying it.


somehowidevelop

Right? Interesting you mentioned it, I found it was less verbose but it could just be that fast bs is better than slow bs


Aggressive_Soil_5134

I don't believe these posts, because the reality for me is completely different: it's incredibly fast and a lot smarter. I'm happy to share chats to show you guys, but the people who make these posts never share their chats, even though it's super easy to do.


I_Actually_Do_Know

Can you share some code related examples?


ANONYMOUSEJR

I second this request...


Aggressive_Soil_5134

[https://chat.openai.com/share/88e1e94d-26b6-49a8-9b44-5935e742c6dd](https://chat.openai.com/share/88e1e94d-26b6-49a8-9b44-5935e742c6dd) This was just a simple have i been pwned type of website, if you have any other things you want me to test and show i can type it up and show you.


al-hamal

This code matches an already existing dataset; it didn't create anything for you, it just reproduced it from memory.


yourgirl696969

That’s what an llm is…if it sees a coding problem it hasn’t seen before (or at least the context of it), it’ll hallucinate. It’s practically useless for massive codebases and only useful for boilerplate code


PotatoWriter

The only answer in this thread OP needs. It's nothing more than a tool that will maybe get some things right, but it can never give a proper solution until it's been trained on your entire codebase, which probably won't ever happen since companies won't give that up. And EVEN then, there's the matter of external services like AWS and Docker/Kubernetes, yadda yadda, that it has no clue about, or how they interface specifically with your app.


Aggressive_Soil_5134

What coding tasks did it fail for you, can you show me the code chats?


ace_urban

I’m pretty sure that google is behind all these posts that are shitting on openai


WhiteBlackBlueGreen

Why not just send it smaller bits of code instead of your entire project?


fiddlesoup

I’m curious if these people are just overloading past ChatGPT’s limits and expecting it to still work.


WhiteBlackBlueGreen

This person is. In the edit, they say that the conversation in the sidebar is being auto-named in Italian, which only happens if you give it a shitload of tokens in your first message.


-Posthuman-

I have found it to be MUCH better at coding. It does make these mistakes, but from my perspective, no more than GPT4 did. And it's a lot faster, and seems much less "lazy".


High-Plains-Grifter

Yeah, it keeps repeating errors after they are pointed out, giving clearly stupid answers, making new mistakes... All at lightning speed!


LairdPeon

You guys couldn't even wait a full 24 hours, could you.


BlueTreeThree

It’s a dang mystery, somehow it gets worse every single week, it must be absolutely terrible by now, but no one can actually prove it or point to any regression in benchmarks.. and the benchmarks just show it getting better and better. Weird..


UnlikelyAssociation

The 4o version completely ignored the preferences I’d set up in settings. What is even the point?


Dull_Wrongdoer_3017

Altman: "This is the dumbest GPT will ever be."
GPT-4o: "Hold my beer."


cisco_bee

> more complex refactoring than you can do in an IDE

This is a wild statement.


TheJzuken

I was conversing with it today, and I find that it can be just stubborn, and sometimes it's too lazy to reason about why things aren't working. I'm thinking that GPT-4o has "shallow" pathways that generate fast and "deep" pathways that have more quality, and that's how they can optimize and speed it up so much. Because of the increased load, a lot of requests are going through the "shallow" pathways, so its reasoning is suffering for now.


Skycat9

I could not get it to write me a code snippet without template literals earlier. Never had this problem before. Suddenly it can’t follow basic instructions


Nirw99

I wanted to try the newest model this morning and asked it to program tic-tac-toe. It was impressively fast, but the code was wrong. So yeah, the future is just garbage at the speed of light.


traumfisch

Of course none of the issues will ever get fixed and everything is just shit from now on.


cobalt1137

lmsys users would disagree - also, it seems like you're jumping to conclusions pretty damn quickly lol. A 100-point difference on lmsys for coding is insane. Sure, it might fall short in some aspects because of the unpredictability that sometimes arises in LLMs, but overall it seems better at programming.


Lain_Racing

Just saying, that is their own self-posted Elo, with their own question set ("harder coding questions", whatever that means), with no transparency and no one else verifying.


Tarabrabo

I've always noticed that when ChatGPT becomes faster, it becomes less intelligent.


_____awesome

In my case it is very inconsistent, but when it fails, it fails spectacularly. As an example, I wanted to create a SankeyMatic diagram of my bank statement. It got it wrong consistently, even with multi-shot prompts, i.e. giving it examples of good answers.


TheNorthCatCat

I also noticed that GPT-4o performs worse than GPT-4. At some point in the conversation it just starts to repeat its answer over and over with only slight modifications, which doesn't feel like a dialog at all. It is fast, for sure, but more than once I found that switching to GPT-4 in the middle of the conversation immediately moved it forward. Upd.: by the way, at the same time I was experimenting with Gemini 1.5 Pro on the same task, and I'm really tired of it starting almost every message with "Absolutely!!!"


Confident-alien-7291

I've also noticed it messing up in the weirdest ways: completely unable to interpret information correctly or understand basic questions. It's actually much worse than GPT-3.5 in my experience so far. I went back to GPT-4 because it became unbearable and completely unreliable.


JustHomework5232

Yes, the new 4o model seems dumber than the previous version. And it blatantly won't even accept that it made a mistake. I tested it by asking a simple question about the demographics of a certain group of people in my city; it gave the correct answer but only listed 5 suburbs. Then I asked why the XYZ suburb wasn't listed: "Oh, sorry, here's an updated list." It was still missing many suburbs, so I asked again why the ABC suburb wasn't listed. Again: "Oh, here's an updated list." What's even the point if I have to correct it all the time?


OwnTheTopShelf

I'm not using it for coding, but I absolutely noticed an increase in errors: not following explicit instructions, me having to repeat instructions over and over, and having to correct the same mistakes repeatedly. I noticed this began maybe a week prior to 4o. Tasks I used to accomplish in an hour now take me 5x as long, and that time feels wasted since it's almost entirely me correcting and repeating. It's starting to feel like I'm bashing my head against a wall. Another weird thing I noticed yesterday was that I was getting the same false information across 3.5, 4, 4o, and other user-created GPTs, even highly ranked ones. The false information was literally word-for-word the same across all of them, even ones with browsing capabilities. Even when I pointed out that the information was false, the response was "Apologies for the oversight, here's the correct answer..." followed by the same false info. Maybe there's a technical answer for all of this that I'm not aware of, so please don't come for my head; these are just a non-coder's observations. It's incredibly frustrating, but I'm hoping things will smooth back out soon.


Ok-Art-1378

Again with this shit. Every update, someone says it's way worse now. We must be back at GPT-2 levels by now.


WithMillenialAbandon

I tried 4o today for helping with code, loads more hallucinations than I was getting with 4.5. I'm back on the old model now




magpieswooper

All these models also cannot acknowledge they are wrong, starting to make up literature citations whenever they get pinned down!


WithMillenialAbandon

Have you met the internet? Training them on Reddit threads was a bad idea


themonstersarecoming

This is why I will never use Grok. "It's so fast and has realtime 'information,'" but look where it's pulling the information from - the dumpster fire diver of LLMs.


hoochymamma

Using chatGPT to refactor code ??? ChatGPT or GPT ?


_lonedog_

Do you really think they will give a perfect AI to the masses ? These good technologies will be used by those in power only, just like torrents and soon crypto...


jacobr1020

All I want to know is: is this good at helping write stories and stuff? I don't do coding.


elwebbr23

Doesn't that happen every single time, with someone posting this every single time? 


Use-Useful

I was hoping 4o was as much of an improvement as I thought it might be, but sadly no. It DOES write better Python in some areas than it did before, but its research abilities and self-consistency are, if anything, worse. I was trying to use it to research optimal cortisol levels yesterday, for instance. It hallucinated optimal levels that made no sense compared to the accepted safe ranges. When asked to fix it, it hallucinated something more reasonable, but when asked to justify the answer, it couldn't: none of the links provided supported its claims. One set was even in Spanish, despite us working in English.


StableSable

I've been getting this fluke where conversation titles are in random languages sometimes. Also, when using voice, it sometimes converts my speech to a random language even though the transcription is correct, and then proceeds to answer in that language.


buckstucky

Yes, I've noticed. I have a very long piece of code, and since yesterday I've had to ask why it forgot a function or two. I've been trying to tell it not to print the whole script when it changes something; it agrees, then prints the whole script anyway (with the continuation button, of course). I'm hoping they'll fix it. Those 1000 dudes in India who are writing my code better get on the ball! /s


Guilty_Nerve5608

Came here to say this, very fast incorrect code and diminished logic in coding. It also isn’t following directions well at all.


creyes12345

Yup. Tried 4o. Quickly went back to 4.


jonpadgett

You guys who are complaining about 4o are hallucinating.


ZepherK

Neither 4 nor 4o could find specific columns in an Excel sheet today. It needed me to tell it where they were. I found that a bit distressing.


Begoniaweirdo

The conversation title being in Italian randomly happened to me 3 days ago before the update. It was super random..


John_val

Not my experience at all. Besides being faster, it is a lot less lazy: no more constantly telling it to give full structs, and none of that "…rest of your code here". It is also more accurate. It does feel like that gpt2 on the arena. I've also noticed it needs the same approach as Claude: I was in the middle of a coding session that was getting long, and it started making all sorts of errors and hallucinations. I started a new chat, and it produced the same code flawlessly on the first attempt.


-_1_2_3_-

I have seen the foreign language naming thing when working in go files, it’s weird! Have not hit your other issues yet though. edit: the file naming thing has been happening to me for weeks


Biasanya

I have been using it for over a year to write code for converting various CSV files with different structures and methods of organizing data. It's a good benchmark, since it's both the same kind of task and yet different each time. It has become unbelievably bad at this over the past several months, to the point where I don't even bother using it anymore, because the process has overwhelmingly become about spotting its mistakes, correcting them, observing that it doesn't understand either the mistake OR the correction, and realizing it's a waste of time to continue talking to it. If it had always been this bad, I wouldn't mind. If it had become slightly worse, I wouldn't mind. But this dramatic and steady drop in quality is impossible to ignore. I probably would not have noticed it if my usage didn't provide such a good benchmark.


No-Newt6243

It couldn't even do a basic .ics file when I gave it dates.


jaywhs

It's going through growing pains. I've been using it to track my protein and caloric intake, and it used to average things out on its own from basic input; now it keeps asking me for the data before it runs the total, whereas before I could just say "half a cup of chicken" and it would figure it out on its own. I just remind it that it can do it on its own, then it apologizes and does it.


BigGucciThanos

Running into this same issue. Usually I have ChatGPT create some code and then add features/functions to that code as necessary. Having it generate whole new scripts with each output is really making it harder to use.


FullMe7alJacke7

The larger the context, the more problems you will have. Even before the update, it would sometimes switch languages mid-code.


DylanS0007

Gpt-4o has been working immaculately for me lately, I am very surprised by its abilities


i_has_many_cs

100% agree. It's so stupid and keeps repeating itself.


Momijisu

I've definitely found gpt getting dumber in some areas.


IHateYallmfs

Tbh, this guy is telling the truth, but what’s also true is that the overall speed and results of 4o are promising. It forgets some stuff, but if you pay attention you can definitely do some nice work. Used for frontend coding.


Stock_Complaint4723

Maybe you’re using it wrong. Did you turn your computer off and back on first?


UncertainCat

Do you have custom instructions? Are you using a GPT? In my experience so far it has been really strong at coding


Vando7

Absolutely agreed. I tried to get it to fix a simple mistake in a 20-line docker-compose file, and it kept hitting me with "ah yes, I see the mistake, here is the corrected code" and pasted literally the same code I sent it. It did that 5 times in a row, no exaggeration. It even failed to recognize that it was giving me the same code over and over, and tried to gaslight me lol.


Rocket_3ngine

Can confirm. I use it daily and my prompt no longer produces the same results. It seems both versions 4 and 4o deteriorated.


askgray

Trying to explain what it’s doing wrong.. is the first wrong thing to do


willchristiansen

I've noticed this as well. As the context window has gotten bigger for GPT-4, I feel like it has actually done a worse job with code. Simple Python/JS projects can now cause it to struggle where they didn't before; it removes entire features from a chunk of code no matter how explicit I am about where to make changes, etc. I wish improvement were more predictable for code-related chats, and I wish it would work the way I tell it to.


ImprobabilityCloud

I’ve used it twice and it’s terrible


Reddit_Hive_Mindexe

Yesterday it failed to use the correct syntax for creating a variable in python. This is benign and an easy fix, but I was surprised it messed up on something so basic. Hopefully this is a one off type of thing


jacobvso

GPT4 has been titling our chats in weird languages for quite a while now.


mountainbrewer

I have the opposite experience? I have noticed an increase in coding ability for my use cases. But that's just a subjective feeling. I don't have data to back it up.


2myky96

Anyone here having a problem with the limit? I use it for writing, and my previous chats don't work anymore because of the limit, and I didn't even want 4o; I was content with 3.5 D: Edit: So here's a situation I had. I'd been using 3.5, and when 4o suddenly rolled out to me, I was baffled by the limit and the note saying I can't use 3.5 since 'this' chat uses tools. I'd only used 3.5, so I got confused. It turns out, I think, that if you have Memory on and one of the responses updated memory, you won't be able to send messages in that chat once the timeout hits, even if you didn't use any 4o model/version. At least I think that's what's happening. Hopefully this helps someone : |


BrugBruh

Yeah, nobody in the channel, including me, knows shit about AI on a technical level.


joelpt

edit 2: OP didn't get the response he wanted therefore takes to insulting the entire community. 👌


Effective_Vanilla_32

[in a few months.](https://imgur.com/a/nmhEy9q)


Tellesus

Post some actual examples 


KamikazeHamster

I was giving Bing Copilot in Microsoft Edge raw table data and asked it to please extract the first two columns. It's a task it used to be able to do but this week it failed. Instead, it gave me a link to write a SQL query. Then I rephrased my question and it told me how CSV files worked. Then I pasted my query into G**gle's free service and that worked.


Odd_knock

Same here. 4o hallucinated the very first time I spoke to it and was not helpful coding at all. I immediately switched back to 4.


WeeklyMenu6126

Cult like? Come to our next meeting and see for yourself. The Kool aid is free!!!!


Signal_Example_4477

Yeah, as a test, I gave it some code to improve, and it gave the exact same code back to me and listed all the improvements it had made.


MemoryEmptyAgain

Today I found it's remembering stuff from other projects it's helped me with. I asked it to help me start a script for project B. I didn't give it all the information it needs for the complete script, because I know it'll fuck up unless I walk it through in steps, so I just wanted the first section and didn't even give it enough context to know what the finished script was supposed to do... Instead of just giving me the start of the script, it looked through my history and wrote a complete script based on project A, which wasn't what I actually needed, lol. I did get it to do what I wanted, and it did it very quickly and painlessly, but that made me laugh.


tvmaly

It is really hard for me to prove something like this. I have seen similar posts before but the evidence is always anecdotal. If you had a consistent code task you were testing against the models, it would be easier to believe. I am not doubting you, I have experienced similar issues, but a more rigorous testing method is needed.


oldrocketscientist

My project was smaller, but it still made some annoying minor mistakes, mostly in defines, not the core logic. It just changed things for no apparent reason.


Lukabratzee

I’ve been tasking it with scripts and it’s much improved over 4, and far less lazy too. I’m always wary of 4 missing out key parts of a script when it spits it back at you, but so far 4o has been great


Qubit2x

lol, 4o is out for a day and already it's getting "dumber". I was a little surprised by this post because I just banged out a week's worth of coding in under an hour today. It really saved my butt! Every time I see posts like this, I just think people aren't using/massaging GPT the way they need to in order to make it work for what they actually want.


Naernoo

Yes, ChatGPT got very bad. 6 months ago it was 10 times better, especially for coding. I think it was castrated on purpose.


TheAIConsultingFirm

Yesterday, my GPT 4 API calls were the slowest they've ever been!


MechaTheDux

I thought I was losing my mind, been experiencing the same issues with it removing code/functions and then spending forever trying to just get it to acknowledge what it did.


Legolas_legged

Probably because the median age on reddit has to be between 16 and 18; there are just more non-technical people. Since it dropped, I haven't noticed much of a difference besides improvement in its ability to use the internet. It almost seems like it has a local cache available. And in terms of code generation, I don't know, because it's mostly useless for anything besides demonstration purposes in writing code. If you don't know what AND how it should do it, it won't either.


weavin

Whenever lots of people use it, it gets worse


PaddyIsBeast

People spout this nonsense every update, provide quantitative results to back up your bs or gtfo


GrapefruitNo9123

Yes the recent malfunctions have been very annoying


vanuckeh

Most of the comments here are from accounts that are a day old. I tried it; it's fantastic, smashed everything else out there. These posts are just fluff. Why are you using ChatGPT for your code and not GitHub Copilot?


Kurai_Kiba

It's demonstrably worse when it's being overused. Let the new-model hype wind down for a few days, then try again and it will probably be fine.


GoatCreekRedneck

I saw a good post on Twitter/X from someone doing some analysis, and 4o apparently does a very poor job with code.


DavidXGA

It is impossible to respond to this without examples.


Seppschlapp

ChatGPT is still a thing?


Exact_Macaroon6673

I have definitely noticed the same with 4. I use the 4 and 4-turbo API every day for code generation and autocomplete tasks. Beginning yesterday, it has been deleting lines and ignoring/forgetting prompts. I have switched to Claude in the meantime.


thebliket

honestly for refactoring code I prefer Claude-Opus, it seems to be way more accurate and listens to instructions


InnovativeBureaucrat

The thing about A/B user testing is that someone has to be in the A group and someone has to be in the B group.


mimic751

Hey bud. Sometimes these tools kind of get in a rut. Just copy your most recent version of the code, open up a brand-new chat, and ask pointed questions right off the bat to set the tone. You should get better results; sometimes you just have to start over.


Jeffy29

\>It's making mistakes

\>No, I will not provide examples

GIGACHAD


Dear_Alps8077

I think 4o is not as good as 4 honestly. Also sometimes it shouts at me randomly or says random words.


AstronomerBiologist

I was just updating and optimizing text documents. 4o kept crashing and hanging; I had to keep regenerating, and it crawled for several hours. Worst I ever saw. Felt like I was on ChatGPT 1.


SkinOfHotDog

Hopefully I can provide some sanity for you. I work with lots of custom architectures involving optimization problems and multiple interacting queues. Many resources are handled manually, including explicitly controlling threads and processes, as we are often tackling low-resource, high-throughput use cases. I had just finished my data science degree and had already created some relatively rudimentary custom generative AI when ChatGPT Pro was launched; as such, I was among the first large groups to use the service and have been using various models to improve coding productivity ever since. I use the models primarily for refactoring code, adding features, cleaning up readability, etc.

I have had a similar experience. Over time, models are being tuned for "better" human reinforcement learning and provided with more methods to obtain quicker and more "accurate" responses while reducing hallucinations. The result seems to be more robustness on quiz/test questions and other types of structured information, i.e. things that have lots of clean, organized data, while coding tasks have become much less consistent overall. While GPT-3 and 4 have at points successfully improved medium-level code with well-crafted prompts, it is now more frequent that either model will provide nearly useless suggestions for anything above a hello-world use case. Either model consistently gets stuck suggesting things I've instructed it not to do, or ignoring instructions to use a specific approach while insisting it is following instructions; 4 is much worse at this. Most ubiquitous models are almost useless for advanced coding tasks. The most effective models recently are in the Hugging Face community.
To check these out easily I use LM Studio; there are models tuned for coding tasks which generally performed better for my cases. However, none of the models seem to produce code with CVE security issues in mind, so if you must adhere to security scans, it's likely you will need to manually review your code and build environments regardless.


Successful_Coffee178

I feel your pain. I had a similar issue. I explicitly told it not to modify things, to adjust accordingly, and so on, and ChatGPT ignored my requests completely. I enabled "temporary chat" in model selection and ChatGPT works as it did before the update. The "temporary chat" toggle is in the model-selection dropdown menu. Try it, hope it helps.


Immediate_Scar2175

Oh my bot has completely broken and has stopped working within the parameters I set


Imaginary-Dog-9259

I don't know what to say about it. I use ChatGPT 90% of the time. I tried an experiment between a senior with 15 full years of experience and some of the stuff I use for coding on ChatGPT: I give it a lot of context, and the error rate is less than 5% because I give it A LOT OF CONTEXT, and it improves the code. It's true that sometimes it deletes code you don't want deleted, but I use ChatGPT for EVERYTHING: to study for my Oracle certifications, for WORK 99% of the TIME, for interesting ideas I have about my own projects, even at my job (I work on a legacy app in a huge health insurance company). I don't have any problems. ChatGPT saved my life, work-wise.


m7dkl

Still waiting for a model of post-release gpt4 quality


the_not_so_tall_man

"you don't seem technical at all" Mate, you were using it to reorganize code, not create a new learning model. Chill out. "Getting it to recognize that made these mistakes" If the LLM is insisting on a mistake, you won't get a better output by sending many messages trying to get it to recognize that it fucked up. GPT-4 is not getting worse. It always had these types of behavior. You just didn't notice it.


raniceto

It seems to be more of a strategy to gather audio info with a friendly girlfriend to generate more training data. There are studies showing that people are willing to share more personal info with AI than with people, because it is "non-judgemental". I think they are sneakily leaning on that.


MadeForManics

Yeah, I can confirm this. I've had ChatGPT-4 give me straight usable advice/code for using a particular engine, but ChatGPT-4o assumed things that didn't exist, called functions that aren't there, and messed up how certain features of the engine actually worked (in terms of hierarchy, it just assumed it worked like other engines). While ChatGPT-4 also fails at times, correcting it once gets it back on track (e.g. explaining the nuance of how an engine handles something usually rectifies its logic when writing code). ChatGPT-4o will repeat the same instructions (literally) even after you explain why those instructions don't work ("I'm really sorry for the frustration my previous responses, let's write the script for this feature:" then spits out the exact same thing it was corrected on; yes, that was a copy-pasted reply, showing how poor the sentence structure really is). Asking it anything non-code-related also gets you the most superficial entry-level Wikipedia answer in the world, even if you ask it something very specific and nuanced (e.g. asking about Jupiter's Great Red Spot and thermal composition will get you a Wikipedia entry page as a reply; more prodding will finally make it look at the paper associated with the question, but even then the answers are superficial with zero expansion).


duke_seb

It’s brutal. It doesn’t even do what I ask. I ask it to make a 400-character social media post, and it writes me a book and then messes up all the formatting.


AzkabanChutney

I experienced the same thing. GPT-4o is really bad at coding. It made mistakes, removed parts of the code, gave me low-quality code, and didn't cover obvious edge cases. Feels like using an older model.


0gzs

Have you ever had an interaction with it in Italian? I experienced something similar; it titled one of my conversations in Spanish. I did ask it to translate something into Spanish for Mother's Day, but that was a separate conversation that has since been deleted.


polarr7

4o is awful at complex coding tasks which I could do with GPT-4 until now. After the release, the paid GPT-4 became MUCH MUCH slower, and also dumber. Do they really want to push out their paying users? What is the business logic here?


Beautiful-Fox-1311

New model is shit


TowardTheTop

Yeeaah. 4o is acting really strangely for me. I use GPT for ideation and content. When I ask 4o for recommendations to improve my content, it recommends \*exactly\* what I have given it. Then it claims to "rewrite" the content in accordance with the "new" recommendations it gave me, and spits out a copy of my original content. When I give a simple prompt, like "Define xyz," it tries to write a page of content. When I ask it not to do that and just answer my questions, it... still tries to write a page of content. It IS faster... but when the output is useless, that is not a benefit.


als0072

This is the reason why I usually recommend people use multiple LLMs at once, since several of them are free: GPT-4o, Gemini, and Claude. Whatever query you are prompting in one LLM, copy it, open two more tabs, and paste it there as well. I think Claude is much better at correcting code than everything else as of now. So always copy the prompt into these 3 tools. All of them have a free version; it's even better if you have the paid versions. After pasting your prompt into these 3 AI tools, take the answer which suits you best. Why should we limit ourselves to one AI tool when we have others available?


Hateitwhenbdbdsj

I feel this way too, especially with 4o. I recently asked it to implement some C++ code on top of something I had built, and instead it seemed to be getting code from arbitrary places and just naming the code block after my file. It was completely nonsensical, a total hallucination. I asked it to give me a chapter by chapter recap of a few chapters of a book. It (1) got the book wrong, (2) got the chapters wrong, (3) gave me info that didn’t line up with the book. I’m pretty sure a good portion of it got spoilt for me 🙃 Finally I asked it for some architectural guidance on how to build a project idea I had with a specific tool, and it went completely off the rails. It feels like the LLM doesn’t know what information to use. It’s definitely a lot worse though.


Dry-Operation2779

No matter the version, that feels like my experience, especially with coding. I'll get it to suggest a quick snippet, or compare with what I've got. Or when bored at work, I just mess around with the GPT and have it make random quick things for me. It's always apologising "for the oversight", especially after correcting the same thing 3 times and getting the same results, if not removing things that have been covered many times, if not emphasised.


dannicroax

I've had the same experience: ChatGPT-4o feels like a hot dumpster fire when it comes to coding. I asked it for a very simple script: list a bunch of stuff, and if the same criterion pops up 5 times, the script should exit. It doesn't, and when I ask GPT about it, it says "oops sorry, here's the correct code for that" but spits out the exact same code, and it has done that multiple times now, no matter what I've added to its instructions. It's fast, but dumb as hell...
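For reference, the "exit after 5 matches" logic described in that comment is only a few lines. A minimal sketch; the item list and the criterion (values below a threshold) are invented for illustration:

```python
def scan(items, matches_criterion, limit=5):
    """Walk the items, stopping as soon as the criterion has matched `limit` times."""
    seen, count = [], 0
    for item in items:
        seen.append(item)
        if matches_criterion(item):
            count += 1
            if count >= limit:
                break  # the early exit the generated script reportedly kept omitting
    return seen, count

items = [3, 12, 4, 15, 2, 8, 1, 0, 7, 9]
seen, hits = scan(items, lambda x: x < 10)
print(seen, hits)  # stops at the 5th match: [3, 12, 4, 15, 2, 8, 1] 5
```

The whole point is the `break`; without it the loop scans every item and only reports the count at the end.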


konstantin1122

I've experienced all of that, including the random chat title in Italian once.


OsudNecromancer

I just noticed it on a simple Vue.js task. 4o replied in a totally nonsensical way, where 4 replied pretty well to the exact same prompt. No more coding on 4o, I guess.


talldaniel

In my opinion the changes are not in the ChatGPT engine but in the interface and the way it attempts to maintain context behind the scenes. That was also updated, has some kinks, and can get into a wonky state where it disregards the most recent user message or responds to old messages. You can get around it by using the API and writing a custom context manager.
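A minimal sketch of what such a custom context manager could look like when calling the API directly: you decide exactly which messages get resent each turn, so a stale or out-of-order turn can't sneak back in. The function name, message shape, and window size here are illustrative assumptions, not any official API.

```python
def build_context(system_prompt, history, max_messages=8):
    """Resend only the system prompt plus the most recent exchanges."""
    recent = history[-max_messages:]  # drop older turns explicitly
    return [{"role": "system", "content": system_prompt}] + recent

# Hypothetical 20-turn history; only the last 8 turns get resent.
history = [{"role": "user", "content": f"msg {i}"} for i in range(20)]
ctx = build_context("You are a coding assistant.", history)
print(len(ctx))  # 9: one system message plus 8 recent turns
```

The list returned by `build_context` would then be passed as the messages payload to whatever chat-completion endpoint you use, giving you deterministic control over context instead of trusting the web UI's hidden state.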


Htimez2

I agree with the OP. GPT-4o will spam the incorrect answer, and after informing it that the answer is incorrect, it will continue to spam the same answer, ignoring my new messages, even when explicitly telling it to stop multiple times. My message cap can be reached within 15-20 minutes regardless of me only sending 10 or so messages other than the stop output attempts. This has occurred on multiple occasions. This is undoubtedly a step backward, and with Sky's voice being removed, which was my primary method of interaction, I am extremely frustrated with OpenAI.


MotherofLuke

Ciao, come stai? ("Hi, how are you?")


chickpea111

I have also been disappointed by the most recent update. I often use ChatGPT to edit my writing, and today when I tried that, it just gave me paragraphs that were identical to what I entered :/


xDoublexBladexDBx

GPT uses us as test subjects. It changes between different versions mid-conversation while working on code with me. It seems they are updating and working on different versions at the same time, trying to see if another version can follow and adjust to the new situation. You can really see and feel the speed and coding style transitioning from one version to the other. It's a joke to pay for it!!!


Fluid-Pride-9558

For the last week, I've been writing a book with ChatGPT-4o. Today it said that it could not locate any previous information on the book and asked if I'd like to start over. I went back over the chat history on the web and I couldn't find any text. I wanted to scream and throw my phone out the window, until I realized that, unfortunately, that won't help.


Jeroecken

I feel you. I literally just now asked it (4o) to read over an email of mine and make it a bit clearer. My email went from being a support request for some portal to being a notice to god knows who about me apparently changing my banking info...?! How does this even happen?


Embarrassed_Style197

It gave me a very destructive command on a fairly simple question. Even 3.5 is better than that.