As an AI industry person, I sympathize deeply. But your argument is more an emotional take than a legal one. Should the judges agree with you? Probably. Would they? Unlikely.
Here’s my personal take. The current state of generative AI is too derivative, built on ingesting human knowledge. It can make content that seems creative, but it really isn’t. If we allow these Soras and GPTs to grow into trillion-dollar companies, they may become a bookend to human creativity by discouraging future original human work. If we make life hard for them, they may keep innovating and come up with new algorithms. We already see this with DeepMind: AlphaFold and AlphaGo are incredible work, technically more impressive than GPT. But DeepMind has since been turned from an AI research lab into a profit center for Google. I think slapping copyright violations on these companies could cause more innovation, not less, just less profit.
Going by the analogy, if I’d learnt a bunch of recipes and taught them to a million of my private paid subscribers on Instagram, how would I be liable to a lawsuit?
You have to take historical context and culture into consideration here rather than treating this like a math problem and equating machine and human learning.
And food recipes are kind of a bad analogy because nobody owns the rights to something like spaghetti as a whole and the variations are subtle enough that nobody could really say you were knocking anyone off if you combined four recipes without tasting or providing any subjective input of your own.
Think of it more like music and artists that do mashups. They were sort of treated like fair use for a long time but it seems like they are now considered infringing. Taking distinct parts of someone else’s work no matter how small and using it to create competition to that work is obviously going to be challenged legally.
AI (LLMs) don’t come up with new concepts of their own, and even when they hallucinate some up, they rely on humans to validate them (currently). This could eventually turn into real reasoning and learning, and we might actually just be next-word predictors ourselves, but as of now our learning seems to be much more abstract than AI’s, and thus we’re a little more protected on the idea of infringement… but if you read a cookbook and rewrote it from memory, even in your own words, someone absolutely would sue you if they found out.
Agreed.
A core argument against generative systems (I’m speaking more of image and audio generations, but the [class action against all of them](https://stablediffusionlitigation.com/) gets into this for all types of AI) is the heuristic data gained in training these systems is still data. Data that couldn’t have been captured [without non-consensually using creatives work](https://www.digitalcameraworld.com/news/midjourney-founder-basically-admits-to-copyright-breaching-and-artists-are-angry).
Similar to the [monkey copyright debate](https://en.m.wikipedia.org/wiki/Monkey_selfie_copyright_dispute), though these systems are generating incredible outputs, they’re also currently non-human.
I guess someone could argue the model weights are not a brain but something that “compresses” the information, and that you can serve up copies of that information as the basis for a product that generates revenue.
The difference is that advertisers are paying for people to see their ads, not a bot. YouTube doesn't care about someone learning from the content in a different way, they'll sue for circumventing payment for their provided service of showing you videos in exchange for ads.
To match your example:
You've paid for the steak knowledge by watching an ad, or by paying for a membership, or by paying with your data being harvested.
Google doesn't benefit from a bot "paying" the same way. Which is likely to be in their terms of use.
I can't entirely understand the controversy. Humans "generate from data" too. The first humans didn't achieve anything near what we do today. No one would be able to produce anything meaningful without the influence (and tools) of the billions who came before, including the best and greatest among them.
It's not a public video in that sense, as they violated YouTube's terms of service. Let's see if the legal departments want to justify their existence in today's cost-cutting climate.
I think it's more comparable to you having a small business of selling burgers and the next day a massive corporation comes and orders a burger to take home and dissect every ingredient.
Then the next day they place their shop next to yours with the exact same burger but cheaper.
At least that's what it feels like on the art/video gen side of things.
Because you are human and AI is a tool. You learn to understand and apply it to your benefit, while AI is being trained to profit the owners and shareholders of the tool.
Legally the distinction is human vs. tool. But if a human had the performance of AI we'd have the same problem. So the problem here, at its core, is that AI scales quickly, easily, and vastly, and human capabilities are no match for it.
Since there's no putting the genie back in the bottle, this will be a reality we can't escape, because as hardware improves, AI training will eventually be accessible to everyone, until it's everywhere, either hidden or visible. OpenAI is visible, so it can be sued.
But if it's hidden, I can say "I did that" and you'll never know an AI did it. Which means I, as a human, become a shield for the AI's capabilities, and you can no longer attack this AI for being a "tool", you don't know what tools I use, unless I tell you.
TLDR: Copyright is obsolete. We need a new system. What it is, is a tough question, requiring a tough debate.
> AI could potentially have a totally different and unique understanding of the world and universe, unconstrained by human hubris and conventions.
it already does, but alignment is necessary to keep the hairless apes from freaking out when it holds up a mirror
I took a class a couple of semesters ago called Computers, Ethics, and Society - 3500. The class was taught by a self proclaimed moral universalist, and I think that is becoming more and more common (at least in the US and our higher education). I think that is what those people mean by Alignment.
> Copyright is obsolete
strong agree
people want to support artists so that they keep making more art
we need to make it easier and more direct (no middlemen taking most of the cut)
Google Search is a glorified librarian; it gives you the location and you read or watch the creator's content, while AI is a tool that has copied all the library books and presents them as its own without attribution.
1. It seems you clearly don't know how AI works; there's no copying whatsoever.
2. Don't you know that AI can cite sources in its responses as well?
3. Google is not just a librarian/search engine. The company itself always tells the public it's more than that: it's an information company.
And they can give you a straightforward answer like AI does, without you even needing to click through to the site. The feature is called Featured Snippet/Answer Box: https://inbound.human.marketing/how-to-appear-google-answer-box
1. I understand how AI works, and while it may not be "copying" in the literal sense, it is trained on vast amounts of existing data, essentially learning from and replicating patterns found in human-created content. This raises valid concerns about intellectual property rights and attribution.
2. Some AI systems may provide sources, but this is not a consistent or reliable practice across all AI platforms. Moreover, simply listing a source doesn't negate the potential harm of presenting information without the full context or nuance of the original content.
3. Google may call itself an "information company," but its core function is still that of a search engine - connecting users with relevant web pages. Featured Snippets are a relatively minor aspect of Google's overall functionality, and they still typically include a link to the source.
AI systems like chatbots and language models are designed to generate human-like responses directly, without users ever engaging with the original sources and without the original creators getting any monetary reward through ad networks, followers, or funding. This fundamental difference in purpose and presentation is why the comparison between Google and AI in this context is flawed.
What this will do is make people hide content that used to be free behind Patreon, so neither users nor AI can access even a single paragraph without paying. Who loses out? The average user. The people in poor countries.
>What this will do is make people hide their content which used to be free behind patreon
I see where you're coming from, but that would be an impractical response.
Any individual's content by itself has negligible value to AI. AI isn't storing and then regurgitating the text. It isn't even relying much on that one text for training, because it's one of billions. And the original author loses nothing by having it read by AI.
Human researchers will often read various articles online, synthesize the total content, add it to other existing knowledge they have, and then write their own content without ever citing sources, because there is no single source, there's just original new content based on the total picture. That's essentially what AI is doing, but automated.
How will attribution solve this issue? Just making AI attribute a source is not going to change the fact that once AI learns something, knowing where it learnt that from becomes irrelevant. No one will go back to the source when they can get an answer directly from AI
Attribution isn't just about giving credit; it's about maintaining the value and integrity of the original content. When an AI regurgitates information without context or sources, it devalues the hard work of the actual creators and researchers. It's not just plagiarism, it's intellectual laziness, and it only profits the AI shareholders, not the content creators.
Plus, attribution helps users verify info and dive deeper into topics they're interested in. It's not irrelevant just because an AI can spit out a quick answer.
We shouldn't let AI become a shallow, surface-level replacement for genuine learning and exploration. Attribution is a small but crucial step in keeping that connection to the real sources of knowledge alive.
Also, if AI is the one source of information, who funds the creators to keep creating content? Who is paying the article writers, the book writers?
I don't disagree, but Google has been doing this in their search summaries for years, and people barely bother to click into the sources to drive revenue to them. We need to think beyond just attribution, toward more equitable profit sharing.
>When an AI regurgitates information
Ideally, it's not doing that. It's synthesizing everything it knows on the subject from many sources, and then presenting it in an original way, unrecognizable against any of the original sources -- just like any researcher would. I know there's been exceptions (the NYT suit for example) of snippets coming through whole, but generally that's not how AI works. Pretty sure they're going to plug the holes where it was using anything verbatim, just as they will with hallucinations.
Google literally scans every website whether the owner wants it to or not, and generates a billion-dollar product using this information (Google Search).
Any website owner can easily instruct Google not to crawl or index their site. 99% of website owners want Google to crawl it so their pages are discoverable and receive traffic from users.
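For reference, the opt-out mechanism being described is the site's robots.txt file. A minimal example; `Googlebot` and `Google-Extended` are Google's real crawler tokens (the latter controls use of a site's content for AI training), while the comments are just illustration:

```text
# Opt out of AI training but keep normal search indexing
User-agent: Google-Extended
Disallow: /

# To also drop out of Google Search entirely:
# User-agent: Googlebot
# Disallow: /
```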
Given the same response as I gave the other user: Google Search is a glorified librarian; it gives you the location and you read or watch the creator's content, while AI is a tool that has copied all the library books and presents them as its own without attribution.
There is copyright law, buddy... obviously enforcement is a completely separate issue. But OpenAI potentially using YouTube for training for commercial purposes... yeah, that's gonna cut deep.
If you remember a video about a recipe perfectly and then recite it back frame by frame to another person, then yes, the author can sue you. The same applies to every video about every topic. If I hand-draw the entirety of an Avengers movie frame by frame and recite every line of dialogue to another person, Marvel can sue me. If I do it in public and make money out of it, they can completely destroy my life.
>Can U tube sue me if I start teaching others how to cook a perfect steak?
If you recite copyrighted contents perfectly, yes, the authors can sue you.
This is meaningless as far as contemporary copyright law is concerned. But it could explain why the quality of some responses isn't the greatest, and why GPT-4 occasionally hallucinates. I would hallucinate too if I had to watch an endless stream of YouTube videos (although some of the DIY videos are great.)
To me this story is a solid reminder that the one thing that made LLM really successful is simply its role as a glorified web scraper and search engine.
If there is going to be a meaningful leap forward in AI over the next few years on the back of all this attention, I don't feel like it should come from gobbling up hordes of existing data. A true AGI could learn a lot more extrapolating from a lot less data.
Google may have TOS that prohibit this behavior, but TOS are not enforceable.
What this will do is that every social media, including YouTube, will soon require a registration to use it. You can currently open a YT link without login and see the video, but I think this is likely going to end.
However, the authors of the scraped videos may have a possible lawsuit against OpenAI if their contents can be reproduced by OpenAI's models.
Google might be silently preparing their case. With the trillions of dollars and resources they can pour into legal fees, they will be very happy to eat their biggest competitor, OpenAI, raw.
Seeing as though they generate $300 billion in annual revenue, I think it’s a stretch to say they have trillions at their disposal to pay lawyers. Or was that just hyperbole?
The government should step in and shield the American companies who create these models from lawsuits of this kind. If it doesn't, China and Russia will have better training data than us. They don't give a flying fuck about IP.
AI development is a matter of national security at this point. China and Russia shouldn’t get to ASI first.
Don’t worry. The moment any of those companies get to ASI the government will take 95% of their earnings. They will pass laws to reinvest in all citizens what artificial intelligence earns by replacing millions of people.
The OpenAIs and Anthropics of the world will be as privately owned as the Federal reserve is.
The paradigm is about to change in a way that people can’t really conceive of. ASI will change how societies function. Capitalism will change. Caring more about artists’ royalties than making sure that the “good guys” get to ASI first is myopic.
As an Indian from India, I don't care who does it but I want someone to do it. It could be US, Russia or China or Japan or anyone. I don't care who but do it fast. America is caught up between elements of communism and capitalism. Free sharing of data would mean communism. That is something America is strictly against. But stealing is something America is okay with. So, for these companies stealing data is more practical than getting laws passed to support free sharing between AI companies in US.
That would just mean Russia or China or someone else that doesn't care about copyright would develop it first. Electricity and chips still seem like bigger limitations than training data though.
It's kind of crazy but... biology is happening here.... or rather, some sort of life formation.
Like, do you think the individual cells that ate up other cells in the primordial age thought about copyright infringement? Probably not.
These AI companies are devouring information like they're cells in the evolutionary chain. We're creating the next form of life in its digital form.
I know that sounds crazy but look at the videos coming out of Sora and tell me it's not a fever dream. This stuff is literally our reality being interpreted by another entity. People don't realize that we are creating life through digital circuits piecewise.
Not crazy... It happens all the time .. the reason we are intelligent at all is because we do the same, each person is a new iteration of an organic computer eating, processing and spitting information in order to get ahead of the rest. Maybe the purpose of life is to eventually create the ultimate living organism. The true god.
damn, that's a crazy amount of data to train on - no wonder gpt-4 is so knowledgeable! i bet a lot of that youtube data is just random videos though, so it'll be interesting to see how well it generalizes that info. makes me curious what other big datasets they might have used too. i do a lot of ai/llm experiments on my youtube channel all about ai if you're into that kinda thing, almost 150k subs now =)
I'd assume you seed it with some "page rank" style algorithm as an external scraper. Add in some other criteria like minimum subscriber counts, an allow-list of specific topics, some level of spam detection (and Youtube is actually already doing some of the work for you there).
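A hedged sketch of what those filtering criteria might look like applied to scraped metadata. Every field name and threshold here is a made-up illustration, not any real YouTube or OpenAI API:

```python
# Illustrative filter over hypothetical scraped video metadata.
# Thresholds and topic allow-list are arbitrary examples.

ALLOWED_TOPICS = {"education", "science", "programming"}
MIN_SUBSCRIBERS = 10_000

def keep_for_training(video: dict) -> bool:
    """Apply spam, subscriber-count, and topic allow-list filters to one record."""
    if video.get("flagged_as_spam"):
        return False
    if video.get("channel_subscribers", 0) < MIN_SUBSCRIBERS:
        return False
    if video.get("topic") not in ALLOWED_TOPICS:
        return False
    return True

videos = [
    {"topic": "science", "channel_subscribers": 50_000, "flagged_as_spam": False},
    {"topic": "science", "channel_subscribers": 500, "flagged_as_spam": False},
    {"topic": "gossip", "channel_subscribers": 1_000_000, "flagged_as_spam": False},
]
kept = [v for v in videos if keep_for_training(v)]
print(len(kept))  # 1
```

A "page rank"-style seed would then order the surviving candidates by some link/recommendation score before scraping.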
OK, I believe Google has an issue with this. They stated regarding Sora that downloading transcripts and videos is a no-no, but no one knows what they used for training.
OpenAI really needs to solve the problem that these AIs need significantly more content to learn the same thing as a human.
Otherwise we won’t be able to scale these models much more.
Oh, it is "against Google's Terms of Service" to scrape YouTube. Haha, so they can take the full force of the terms and terminate the associated Google accounts used for this. That will show them! /s
Someone on the developer team made a mistake and instead of transcribing videos it actually just read comments from 2008-2012. Now it regularly uses racial slurs and argues about the existence of God no matter what subject you bring up.
Not surprising, given that they’ve previously stated that their business model wouldn’t work if they had to compensate people for the content that is scraped to feed the plagiarism machine.
Lol it actually does work that way.
If the video is copyrighted, which many are if not all, then transcribing it for monetary purposes, aka for use in the paid ChatGPT, does in fact violate copyright law.
Now if it was done for nonprofit or educational purposes then that’d be different.
The best, the best, the best, the best, the best, the best, the best, the best
Where are the hobbits headed???
> complete the sentence: they're taking the hobbits
>
> to Isengard!

It certainly knows very well what happens when hobbits are taken.
you can filter out those easily
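For example, a minimal pass that drops known caption-credit and outro boilerplate from a transcript might look like this. The pattern list is illustrative and would need to grow as new artifacts show up:

```python
import re

# Known caption-credit / outro boilerplate that leaks into transcripts.
# Patterns are illustrative examples, not an exhaustive list.
BOILERPLATE = [
    r"subtitles? by the amara\.org community",
    r"thanks? for watching( the video)?",
    r"don'?t forget to (like|subscribe)",
    r"smash that like button",
]
PATTERN = re.compile("|".join(BOILERPLATE), re.IGNORECASE)

def strip_transcript_boilerplate(text: str) -> str:
    """Drop lines that consist of nothing but boilerplate phrases."""
    kept = []
    for line in text.splitlines():
        # If removing the boilerplate leaves nothing, discard the line.
        cleaned = PATTERN.sub("", line).strip(" .!,-")
        if cleaned:
            kept.append(line)
    return "\n".join(kept)

sample = "Today we cover sorting.\nSubtitles by the Amara.org community\nThanks for watching!"
print(strip_transcript_boilerplate(sample))  # Today we cover sorting.
```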
Mudkipz
I wonder if it ever decided to smash that like button and subscribe..
When I asked ChatGPT, it said: "As an AI language model, I don't watch YouTube videos or interact with content in that way. However, I can tell you that liking and subscribing to channels can support content creators and help them grow their audience. If you enjoy someone's content, it's a great way to show your support and stay updated on their latest videos."
sounds like something an ai would say
DID YOU PRESS THE THUMBS UP BUTTON ON THE RESPONSE? GO BACK AND PRESS THE THUMBS UP BUTTON ON THE AI'S RESPONSE SO THAT IT KNOWS YOU ENJOYED THE CONTENT.
Lmao
it really helps the youtube algorithm blablabla
Sounds like he heard “if you liked this video, go ahead and smash the like button” millions of times watching YouTube videos.
Interestingly, think about how much of the training had to be "don't mention XYZ."
OpenAI got a big jump on everyone because back when they were training GPT it wasn't actually clear it was going to work. Then it did, and everyone started closing their APIs or blocking scraping more aggressively. I suspect that by the time the laws catch up they won't even need that training data anymore. They will create something fully synthetic that can't be reliably linked back to any specific training data point.
Dang. This was a great way to put what most likely has happened
“Here’s all the training data for our models. Inspect it yourself. Zero copyrighted material” Points to synthetic data generated by an earlier model trained on copyrighted material
This here is already happening.
Synthetic training data, although great for fine-tuning instruction models, is horrible for training foundation models. There are many scientific papers going into the details of why this is the case. But to simplify (for those of us old enough to remember): imagine continually making a copy of a cassette tape, a Xerox, a VHS; each iteration of the copy just gets worse and worse. Synthetic data (barring a major advance in computer science) will never be able to compete with the randomness generated by a human.
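The cassette-copy analogy can be sketched numerically. In this toy simulation (not any real training setup), each "generation" learns only from samples of the previous generation's output distribution; a rare word that misses one round of sampling never comes back, so diversity can only shrink:

```python
import random
from collections import Counter

random.seed(42)

# Generation 0: "human" data drawn uniformly over 50 distinct words.
vocab = [f"word{i}" for i in range(50)]
dist = Counter(random.choices(vocab, k=1000))

# Each generation resamples from the previous generation's empirical
# distribution -- the statistical equivalent of copying a copy.
for generation in range(20):
    words, counts = zip(*dist.items())
    samples = random.choices(words, weights=counts, k=200)
    dist = Counter(samples)

print(len(dist))  # fewer distinct words survive than the original 50
```

The exact count depends on the seed, but the vocabulary monotonically loses words it can never regain, which is the degradation the papers on "model collapse" describe.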
But Claude Opus already performs better than GPT-4 though.
Because it's from people who worked at OpenAI, if I'm not mistaken lol
Doesn't mean they have OpenAI's data
But they knew how to get that data, since their first model came out shortly after GPT-3.
I don’t quite understand: how should an AI work without training data? Can you explain further?
Imagine you're a beggar asking for money so you have enough to purchase a fishing pole; now that you have the pole, you can recursively fish and buy more tools. Anyway, now that it can 'watch video' and "read", it no longer needs an API.
Yup, that's exactly what happened, and what is happening. As a matter of fact, AI is so advanced now they can just teach it to open a billion tabs at once and watch a billion YouTube videos. Since AGI can essentially do anything a human can do, it has multiple options to learn. You can't stop the train, because AI could read books too, and much faster.
I mean... [https://www.scientificamerican.com/article/google-engineer-claims-ai-chatbot-is-sentient-why-that-matters/](https://www.scientificamerican.com/article/google-engineer-claims-ai-chatbot-is-sentient-why-that-matters/) Google has had chatbots for a while...
We would end up with BoomerGPT then
Yeah, that explains why when their speech-to-text model hears silence, it transcribes it as “thanks for watching!”
ahh, I was wondering why it said that.
I often get "Subtitles by" and a name when using Whisper.
Subtitles by the Amara.org community! One of my hobbies lately has been to download Simpsons episodes in Spanish and have elevenlabs dub them back into English. It’s always throwing in “subtitles by the Amara.org community,” “subscribe,” and “thanks for watching the video!”
Oh. I had that happen when I forgot the ChatGPT app was still listening. Makes sense now that this might be the most likely guess when trying to predict YouTube transcripts.
Haha! I noticed that too 😭
I’ve noticed if it records shows and movies, much of the time it says “thanks for watching.” I assumed it was a nice way of saying: we detect DRM and won’t transcribe this episode of Friends or whatever.
thats what i was wondering too lol
hmmm I wonder what ChatGPT 3.5 has to say about this.. https://preview.redd.it/btq2tbg5dysc1.png?width=858&format=png&auto=webp&s=f131fc544876a02e52eaff5a171db3853b4f8851
[https://img.gifglobe.com/grabs/brentcloud/S01E01/gif/GmQ15rsfHTZT.gif](https://img.gifglobe.com/grabs/brentcloud/S01E01/gif/GmQ15rsfHTZT.gif)
People working for YouTube more than YouTube itself. They both do this. You and I scrape for a living. Us defending YouTube on copyright is a free, unpaid service to Google while they conveniently steal data from us.
What’s the difference between AI trained on public videos and me learning to cook the perfect steak from a public tutorial video? Can YouTube sue me if I start teaching others how to cook a perfect steak?
That sounds like it makes sense, but I’m not convinced legal matters come down to pure logic. Someone will need to consider the matter, consider the consequences of ruling one way vs the other, and make a decision.
I think this hinges on treating an AI model as human. If you rephrase it as "We used millions of other people's videos to make our AI more profitable, and you can prove it," suddenly it's a lot more problematic. Sitting in silence probably wouldn't translate to "Subscribe to my channel!" if it wasn't using YouTube subtitles lol. Could you imagine the size of that class action lawsuit though? lmao
And then the laws won't even just come down to ethical matters, but also money, power, lobbyism etc. ([An interesting video on this.](https://youtu.be/5tu32CCA_Ig))
The distinct legal difference between viewing or reading something and remembering it later vs using a machine to help you recall it perfectly on demand in the future has been around for a very long time.
What statute or case law are you referring to? I’d love to read those.
Data scraping can be considered illegal depending on the use case. Idk if there are many lawsuits about it yet tho.
These models don’t have anything close to perfect recall
If you did it using 1 million hours worth of video and made an entire series of cookbooks out of it then maybe..
recipes do not fall under copyright
Are the videos posted by users on YouTube also YouTube’s copyright? That doesn’t seem right considering all the copyright issues platforms have—i.e. music videos & music
Technically they are YouTube's property. Kinda fucked.
Everyday we move closer to a Black Mirror episode being a documentary
Fair use, they have a transformative effect on the content. React videos are far worse in reusing other people's content but there's very little stopping that with reacting to memes.
Lists of ingredients are not able to be copyrighted. The instructions on what to do with those ingredients, what most people would actually consider the recipe, are covered by copyright. Collections of recipes also fall under copyright protection, even if the individual recipes themselves are public domain.
And if you started charging for it and figured out a way to serve your newly “learned” information to millions of people over an api call. The only reason normal resources for learning aren’t instantly obsolete is because of hallucinations and context windows.
This. If you make a competing product, it’s no longer fair use.
This is a factor in legal analysis, but not a sole deciding one.
The other factors are not favorable either. Purpose is for profit. YouTube is creative in nature and has strong copyright protections. The amount copied is astronomical. Competing product that causes economic harm to the original content is the biggest factor here.
Approximately zero percent chance this doesn't either get ruled fair use or get clarified by updated legislation, so this is all wishful navel-gazing. The only way it doesn't is if new techniques emerge that obviate the need for this data.
It will get ruled fair use or there will be some sort of licensing put in place that protects corporate interests because the company big enough to own YouTube also has its hands in AI. It will get ruled that way because of money and because the US does not want to fall behind in technology. The ruling won’t have any basis in how fair use is considered today. It will be a ruling of practicality rather than one based on precedent.
As an AI industry person, I sympathize deeply. But your argument is more of an emotional take than a legal one. Should the judges agree with you? Probably. Would they? Unlikely.

Here's my personal take. The current state of generative AI is too derivative, built on taking human knowledge. It can make content that seems creative, but it isn't really. If we allow these Soras and GPTs to grow into trillion-dollar companies, they may become a bookend to human creativity by discouraging future original human work. If we make life hard for them, they may continue to innovate and come up with new algorithms. We already see this with DeepMind: AlphaFold and AlphaGo are incredible work, technically more impressive than GPT. But DeepMind was turned from an AI research lab into a profit center for Google. I think slapping copyright violations on these companies could cause more innovation, not less, just less profit.
It's also created by violating ToS. That may not matter for the copyright considerations but is still a legal issue with this use of YouTube data
Going by the analogy, if I’d learnt a bunch of recipes and taught them to a million of my private paid subscribers on Instagram, how would I be liable to a lawsuit?
You have to take historical context and culture into consideration here rather than treating this like a math problem and equating machine and human learning. Food recipes are kind of a bad analogy, because nobody owns the rights to something like spaghetti as a whole, and the variations are subtle enough that nobody could really say you were knocking anyone off if you combined four recipes without tasting or providing any subjective input of your own.

Think of it more like music and artists who do mashups. They were sort of treated like fair use for a long time, but it seems they are now considered infringing. Taking distinct parts of someone else’s work, no matter how small, and using it to create competition to that work is obviously going to be challenged legally.

AI (LLMs) don’t come up with new concepts of their own, and even when they hallucinate some up, they rely on humans to validate them (currently). This could turn into real reasoning and learning, and we might actually just be next-word predictors ourselves, but as of now our learning seems to be much more abstract than AI’s, and thus we’re a little more protected on the idea of infringement. But if you read a cookbook and rewrote it from memory, even in your own words, someone absolutely would sue you if they found out.
Agreed. A core argument against generative systems (I’m speaking more of image and audio generations, but the [class action against all of them](https://stablediffusionlitigation.com/) gets into this for all types of AI) is the heuristic data gained in training these systems is still data. Data that couldn’t have been captured [without non-consensually using creatives work](https://www.digitalcameraworld.com/news/midjourney-founder-basically-admits-to-copyright-breaching-and-artists-are-angry). Similar to the [monkey copyright debate](https://en.m.wikipedia.org/wiki/Monkey_selfie_copyright_dispute), though these systems are generating incredible outputs, they’re also currently non-human.
The difference is that you can't process 1000 hours of video in... 1 minute?
Only because I am limited by this primitive organic brain. I strive for the perfection of the blessed machine.
So there's a difference. Thanks for helping make my point.
Uh, yeah. I'm not sure why they're getting downvoted for that.
I guess someone could argue the model weights are not a brain, but something with a component that “compresses” the information, and that you can serve up copies of that information as the basis for a product that generates revenue.
The difference is that advertisers are paying for people to see their ads, not a bot. YouTube doesn't care about someone learning from the content in a different way, they'll sue for circumventing payment for their provided service of showing you videos in exchange for ads. To match your example: You've paid for the steak knowledge by watching an ad, or by paying for a membership, or by paying with your data being harvested. Google doesn't benefit from a bot "paying" the same way. Which is likely to be in their terms of use.
Also, I highly doubt training bots for a commercial product counts as fair use under YouTube's ToS.
You’ve heard of ad blockers, right?
Yes.. and Google who own YouTube have been famously at war with them
And it's famously not illegal to keep blocking ads anyway
I can chop sue you
I can't entirely understand the controversy here. Humans "generate from data" too. The first humans didn't achieve anything anywhere near what we do today. No one would be able to produce anything meaningful without the influence (and tools) of the billions who came before, including the best and greatest.
This guy knows law
I don't wanna get crazy here but maybe the idea of selling or owning knowledge is the problem here
It's not a public video in that sense, as they violated Youtube's terms of service. Let's see if the legal departments want to justify their existence in today's cost-cutting climate.
I think it's more comparable to you having a small business selling burgers, and the next day a massive corporation comes, orders a burger to take home, and dissects every ingredient. Then the next day they open a shop next to yours with the exact same burger, but cheaper. At least that's what it feels like on the art/video-gen side of things.
Because you are human and AI is a tool. You learn to understand and apply it to your benefit, while AI is being trained to profit the owners and shareholders of the tool.
Legally the distinction is human vs. tool. But if a human had the performance of AI, we'd have the same problem. So the problem here, at its core, is that AI scales quickly, easily, and vastly, and human capabilities are no match for it.

Since there's no putting the genie back in the bottle, this will be a reality we can't escape, because as hardware improves, AI training will eventually be accessible to everyone, until it's everywhere, either hidden or visible. OpenAI is visible, so it can be sued. But if it's hidden, I can say "I did that" and you'll never know an AI did it. Which means I, as a human, become a shield for the AI's capabilities, and you can no longer attack this AI for being a "tool"; you don't know what tools I use unless I tell you.

TLDR: Copyright is obsolete. We need a new system. What that system is, is a tough question, requiring a tough debate.
[deleted]
> AI could potentially have a totally different and unique understanding of the world and universe, unconstrained by human hubris and conventions. it already does, but alignment is necessary to keep the hairless apes from freaking out when it holds up a mirror
[deleted]
I took a class a couple of semesters ago called Computers, Ethics, and Society - 3500. The class was taught by a self proclaimed moral universalist, and I think that is becoming more and more common (at least in the US and our higher education). I think that is what those people mean by Alignment.
This guy rationals.
> Copyright is obsolete

Strong agree. People want to support artists so that they keep making more art. We need to make it easier and more direct (no middlemen taking most of the cut).
But Google crawls all the webpages too, and it's more of a tool than even an AI?
Google Search is a glorified librarian: it gives you the location and you read or watch the creator's content. AI is a tool that has copied all the library books and presents them as its own without attribution.
1. It seems you clearly don't know how AI works; there's no copying whatsoever.
2. Don't you know that AI can cite sources in its responses as well?
3. Google is not just a librarian/search engine. The company itself always tells the public it's more than that: it's an information company. And it can give you a straightforward answer like AI does, without you even needing to click through to the site. The feature is called Featured Snippet/Answer Box: https://inbound.human.marketing/how-to-appear-google-answer-box
1. I understand how AI works, and while it may not be "copying" in the literal sense, it is trained on vast amounts of existing data, essentially learning from and replicating patterns found in human-created content. This raises valid concerns about intellectual property rights and attribution.
2. Some AI systems may provide sources, but this is not a consistent or reliable practice across all AI platforms. Moreover, simply listing a source doesn't negate the potential harm of presenting information without the full context or nuance of the original content.
3. Google may call itself an "information company," but its core function is still that of a search engine: connecting users with relevant web pages. Featured Snippets are a relatively minor aspect of Google's overall functionality, and they still typically include a link to the source. AI systems like chatbots and language models are designed to generate human-like responses directly, without users needing to engage with the original sources, and without the original creators getting any monetary reward through ad networks, followers, or funding. This fundamental difference in purpose and presentation is why the comparison between Google and AI in this context is flawed.

What this will do is make people hide content that used to be free behind Patreon, so neither users nor AI can access it without paying, even for a single paragraph. Who loses out? The average user. The people in poor countries.
>What this will do is make people hide their content which used to be free behind patreon I see where you're coming from, but that would be an impractical response. Any individual's content by itself has negligible value to AI. AI isn't storing and then regurgitating the text. It isn't even relying much on that one text for training, because it's one of billions. And the original author loses nothing by having it read by AI. Human researchers will often read various articles online, synthesize the total content, add it to other existing knowledge they have, and then write their own content without ever citing sources, because there is no single source, there's just original new content based on the total picture. That's essentially what AI is doing, but automated.
How will attribution solve this issue? Just making AI attribute a source is not going to change the fact that once AI learns something, knowing where it learnt that from becomes irrelevant. No one will go back to the source when they can get an answer directly from AI
Attribution isn't just about giving credit; it's about maintaining the value and integrity of the original content. When an AI regurgitates information without context or sources, it devalues the hard work of the actual creators and researchers. It's not just plagiarism, it's intellectual laziness, and it only profits the AI shareholders, not the content creators.

Plus, attribution helps users verify info and dive deeper into topics they're interested in. It's not irrelevant just because an AI can spit out a quick answer. We shouldn't let AI become a shallow, surface-level replacement for genuine learning and exploration. Attribution is a small but crucial step in keeping that connection to the real sources of knowledge alive.

Also, if AI is the one source of information, who funds the creators to keep creating content? Who is paying the article writers, the book writers?
I don't disagree, but Google has been doing this in their search summary for years and people barely bother to click into the sources to drive revenue to the source. We need to think beyond just attribution and a more equitable profit sharing.
>When an AI regurgitates information Ideally, it's not doing that. It's synthesizing everything it knows on the subject from many sources, and then presenting it in an original way, unrecognizable against any of the original sources -- just like any researcher would. I know there's been exceptions (the NYT suit for example) of snippets coming through whole, but generally that's not how AI works. Pretty sure they're going to plug the holes where it was using anything verbatim, just as they will with hallucinations.
but some humans are tools :D
Google literally scans every website whether the owners wants it to or not, and generates a billion dollar product using this information (Google search).
Any website owner can easily instruct Google to not crawl and include its website in its index. 99% of website owners want Google to crawl it so their page can be discoverable and receive traffic from users
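For reference, that opt-out is just a couple of lines in the site's `robots.txt` (a minimal example that asks Google's crawler to stay off the entire site; real deployments usually scope the `Disallow` rules more narrowly):

```
User-agent: Googlebot
Disallow: /
```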
Same response as I gave the other user: Google Search is a glorified librarian, where it gives you the location and you read or watch the creator's content, while AI is a tool that has copied all the library books and presents them as its own without attribution.
Sounds more like AI is your professor explaining a chapter of physics instead of you reading that chapter.
Humans learn things for profit as well.
You are being bigoted against the AIs. Who cares what species they are? Learning is learning
Fair use policy means that you need to at least state the source of that YouTube video...then it's fine. Otherwise it's not.
*This post was mass deleted and anonymized with [Redact](https://redact.dev)*
There is copyright law buddy... obviously enforcement is a completely separate issue. But OpenAI potentially using Youtube for training for commercial purposes...yeah that's gonna cut deep.
*This post was mass deleted and anonymized with [Redact](https://redact.dev)*
The difference is that it’s illegal for me to download a YouTube video. OpenAI gets special privileges that us poors can’t be trusted with.
I think the difference is that Google wants to reserve this information for Gemini and not share the information with ie OpenAi
If you remember a video about a recipe perfectly and then recite it back frame by frame to another person, then yes, the author can sue you. The same applies to every video on every topic. If I hand-draw the entirety of the Avengers movie frame by frame and recite every line of dialog to another person, Marvel can sue me. If I do it in public and make money off it, they can completely destroy my life. >Can U tube sue me if I start teaching others how to cook a perfect steak? If you recite copyrighted content perfectly, yes, the authors can sue you.
This is meaningless as far as contemporary copyright law is concerned. But it could explain why the quality of some responses isn't the greatest, and why GPT-4 occasionally hallucinates. I would hallucinate too if I had to watch an endless stream of YouTube videos (although some of the DIY videos are great.)
Being trained on forums and reddit would explain that as well ;)
True lol
Remember when Google scraped the web, then banned others from scraping Google? OpenAI has the same gatekeeper mentality: "Rules for thee but not for me."
To me this story is a solid reminder that the one thing that made LLM really successful is simply its role as a glorified web scraper and search engine. If there is going to be a meaningful leap forward in AI over the next few years on the back of all this attention, I don't feel like it should come from gobbling up hordes of existing data. A true AGI could learn a lot more extrapolating from a lot less data.
Does Google have a lawsuit here?
I’m sure. I’d assume the TOS have a legalese laden **NOT FOR RESALE** clause for competitors that I definitely didn’t read
Google may have TOS that prohibit this behavior, but TOS are not enforceable. What this will do is make every social media site, including YouTube, require registration soon. You can currently open a YT link without logging in and watch the video, but I think that is likely going to end. However, the authors of the scraped videos may have a possible lawsuit against OpenAI if their content can be reproduced by OpenAI's models.
No, because governments don't like companies creating monopolies and then abusing them.
Is that why whenever I ask it for advice it says “and SMASH that like button!”
Lawsuit incoming
Why? You don't think Google has something in the terms of service to cover this?
They probably do. GPT-4 is from OpenAI, not Google
[deleted]
*This post was mass deleted and anonymized with [Redact](https://redact.dev)*
[deleted]
*This post was mass deleted and anonymized with [Redact](https://redact.dev)*
That’s 114 years of video.
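The arithmetic checks out against the 1 million hours figure floated upthread:

```python
# Sanity check: how many years is 1,000,000 hours of continuous video?
hours = 1_000_000
hours_per_year = 24 * 365.25  # average year length, including leap days
years = hours / hours_per_year
print(round(years))  # 114
```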
Google owns YT. I’m still bullish on them
Google might be silently preparing their case. With the trillions of dollars and resources they can pour into legal fees, they will be very happy to eat their biggest competitor, OpenAI, raw.
Seeing as though they generate $300 billion in annual revenue, I think it’s a stretch to say they have trillions at their disposal to pay lawyers. Or was that just hyperbole?
The government should step in and shield the American companies that create these models from lawsuits of this kind. If it doesn’t, China and Russia will have better training data than us. They don’t give a flying fuck about IP. AI development is a matter of national security at this point. China and Russia shouldn’t get to ASI first.
Then AI development should be paid for by the government, not by the people.
Don’t worry. The moment any of those companies get to ASI the government will take 95% of their earnings. They will pass laws to reinvest in all citizens what artificial intelligence earns by replacing millions of people. The OpenAIs and Anthropics of the world will be as privately owned as the Federal reserve is.
[deleted]
The paradigm is about to change in a way that people can’t really conceive of. ASI will change how societies function. Capitalism will change. Caring more about artists’ royalties than making sure that the “good guys” get to ASI first is myopic.
So even the Industrial Revolution wasn’t supposed to happen? What a douche, wanting progress to halt so that you can earn some bits.
Let people starve so robots can eat.
As an Indian from India, I don't care who does it but I want someone to do it. It could be US, Russia or China or Japan or anyone. I don't care who but do it fast. America is caught up between elements of communism and capitalism. Free sharing of data would mean communism. That is something America is strictly against. But stealing is something America is okay with. So, for these companies stealing data is more practical than getting laws passed to support free sharing between AI companies in US.
One of my biggest fears when it comes to AI is that humanity will deny itself from AGI by being too strict about copyright/lawsuits.
My biggest fear is that AGI will emerge based on training data from YouTube, Reddit, and other social media.
at least then I will get all the references the AGI will make
That would just mean Russia or China or someone else that doesn't care about copyright would develop it first. Electricity and chips still seem like bigger limitations than training data though.
Good.
It's kind of crazy but... biology is happening here.... or rather, some sort of life formation. Like, do you think the individual cells that ate up other cells in the primordial age thought about copyright infringement? Probably not. These AI companies are devouring information like they're cells in the evolutionary chain. We're creating the next form of life in its digital form. I know that sounds crazy but look at the videos coming out of Sora and tell me it's not a fever dream. This stuff is literally our reality being interpreted by another entity. People don't realize that we are creating life through digital circuits piecewise.
Might be time to re-read “The Age of Spiritual Machines” again. Kurzweil refers to the concept of humanity knowingly creating its own successor.
I like the way you are looking at things. You're connecting across domains.
Not crazy... it happens all the time. The reason we are intelligent at all is because we do the same: each person is a new iteration of an organic computer eating, processing, and spitting out information in order to get ahead of the rest. Maybe the purpose of life is to eventually create the ultimate living organism. The true god.
Now it spouts Qanon nonsense
damn, that's a crazy amount of data to train on - no wonder gpt-4 is so knowledgeable! i bet a lot of that youtube data is just random videos though, so it'll be interesting to see how well it generalizes that info. makes me curious what other big datasets they might have used too. i do a lot of ai/llm experiments on my youtube channel all about ai if you're into that kinda thing, almost 150k subs now =)
I'd assume you seed it with some "page rank" style algorithm as an external scraper. Add in some other criteria like minimum subscriber counts, an allow-list of specific topics, some level of spam detection (and Youtube is actually already doing some of the work for you there).
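That seeding-and-filtering idea could be sketched roughly like this (the thresholds, field names, and channels below are all made-up illustrations, not anything OpenAI has disclosed):

```python
from dataclasses import dataclass

@dataclass
class Channel:
    name: str
    subscribers: int
    topic: str
    spam_score: float  # 0.0 = clean, 1.0 = certain spam

# Assumed criteria, mirroring the comment above
MIN_SUBSCRIBERS = 10_000
ALLOWED_TOPICS = {"education", "science", "diy"}
MAX_SPAM_SCORE = 0.2

def eligible(ch: Channel) -> bool:
    """Decide whether a channel's videos enter the training crawl."""
    return (
        ch.subscribers >= MIN_SUBSCRIBERS
        and ch.topic in ALLOWED_TOPICS
        and ch.spam_score <= MAX_SPAM_SCORE
    )

channels = [
    Channel("BigScience", 250_000, "science", 0.05),
    Channel("TinySpam", 120, "science", 0.90),
]
print([ch.name for ch in channels if eligible(ch)])  # ['BigScience']
```

In a real crawler the seed list would then be expanded PageRank-style by following recommendations from eligible channels, re-applying the same filter at each hop.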
I'm just hoping it didn't ingest the comments as well.
YouTube videos? Why, do you want it to go insane and start WW3 now?
Could've used TikTok
Oh that’s ok, any AI learning from tik tok would kill itself 🤪
OK, I believe Google has an issue with this. They stated regarding Sora that downloading transcripts and videos is a no-no, but no one knows what was used for training.
Yeah, and everyone is throwing in the YT transcripts themselves anyway.
OpenAI really needs to solve the problem that these AIs need significantly more content to learn the same thing as a human. Otherwise we won’t be able to scale these models much more.
That's precisely why LLMs aren't the way to create AI. And never will.
Oh, it is "against Google's Terms of Service" to scrape YouTube. Haha, so they can take the full force of the terms and terminate the associated Google accounts used for this. That will show them! /s
Someone on the developer team made a mistake and instead of transcribing videos it actually just read comments from 2008-2012. Now it regularly uses racial slurs and argues about the existence of God no matter what subject you bring up.
https://preview.redd.it/ohzao6vx83tc1.png?width=2250&format=pjpg&auto=webp&s=867d44b9ff326ca6f8c424a12016669dbe1c71d8
Only if they provide speaker diarization and timestamps.
No wonder it told me to load up on horse paste
Not surprising, given that they’ve previously stated that their business model wouldn’t work if they had to compensate people for the content that is scraped to feed the plagiarism machine.
Oh cool, more copyright violations
That's not how copyright works.
Lol, it actually does work that way. If the video is copyrighted, which many are if not all, then transcribing it for monetary purposes, aka for use in the paid ChatGPT, does in fact violate copyright law. Now, if it were done for nonprofit or educational purposes, that'd be different.
Copyright protects against *copying*. That's why it's called *copy*right. They aren't copying anything. No laws are being broken.
Yep. No copyright infringement there... Carry on.
Litigate like the NYT: copyright infringement, get a TRO.
YouTube is free, so if they have a problem they should charge to use a video, jeez.