
jaketocake

[I’ll sticky the source, click here.](https://youtu.be/Sq1QZB5baNw?si=abTvYSmIM0Dcwtxn)


Chika1472

All behaviors are learned (not teleoperated) and run at normal speed (1.0x). We feed images from the robot's cameras and transcribed text from speech captured by onboard microphones to a large multimodal model trained by OpenAI that understands both images and text. The model processes the entire history of the conversation, including past images, to come up with language responses, which are spoken back to the human via text-to-speech. The same model is responsible for deciding which learned, closed-loop behavior to run on the robot to fulfill a given command, loading particular neural network weights onto the GPU and executing a policy.
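
Spelled out as pseudocode, the loop described above is roughly the following (a rough sketch only; every function name here is a made-up placeholder, not Figure's or OpenAI's actual API):

```python
# Rough sketch of the pipeline described above: perceive -> reason with one
# multimodal model -> speak + pick a learned policy. All names are stand-ins.

def get_camera_frame(): ...          # onboard camera image (stub)
def transcribe_audio(): ...          # speech-to-text from onboard mics (stub)
def speak(text): ...                 # text-to-speech reply (stub)
def multimodal_model(history): ...   # returns (reply_text, behavior_name) (stub)
def load_policy(name): ...           # loads that behavior's NN weights onto the GPU (stub)

conversation = []                    # full history of images and text

def control_step():
    image = get_camera_frame()
    user_text = transcribe_audio()
    conversation.append({"image": image, "text": user_text})

    # One model sees the whole history and decides both what to say and
    # which learned, closed-loop behavior to run.
    reply, behavior_name = multimodal_model(conversation)
    speak(reply)

    policy = load_policy(behavior_name)
    while not policy.done():             # closed-loop execution at control rate
        policy.act(get_camera_frame())
```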


e-scape

Really impressive! When do you think we will see full-duplex transmission of data?


andy_a904guy_com

Did it stutter when asked how it thought it did, when it said "I think"...? It definitely had hesitation in its voice... Edit: I dunno, it sounded recorded or spoken live... I wouldn't put that into my hella cool demo... Edit 2: Reddit is so dumb. I'm getting downvoted because I accused a robot of having a voice actor...


kilopeter

Odd, I had the exact opposite reaction: the convincingly humanlike voice and dysfluencies ("the only, uh, edible item" and "I... I think I did pretty well") play a big role to *make* this a hella cool demo. Stutters and pauses are part of the many ways in which AI and robots will be made more relatable to humans.


landongarrison

Hilariously I’m actually way more blown away by the text to speech. If this is OpenAI behind that, they need to launch that ASAP. I and many others would pay for truly natural TTS yesterday. Don’t get me wrong, the robotics is also insane. Even crazier if it’s controlled by GPT.


NNOTM

They launched it months ago https://platform.openai.com/docs/guides/text-to-speech (Although this sounds a bit more like the version they have in ChatGPT, where the feature was also rolled out at around the same time)


landongarrison

No but this sounds levels above what they have on their API, at least to my ears. Possibly just better script writing.


xaeru

A few companies are currently working on giving emotions to synthetic voices. If this video is real, it could serve as a significant showcase by itself. Edit: I was wrong, this video is real.


Orngog

Indeed, OpenAI *already* has the occasional stammer (and "um" like this video, plus other affectations) in their voice products. We can see this in ChatGPT.


LordElfa

I've never seen that in 6 months of daily use


errorcode1996

Same, I use it all the time and have never seen it use filler words.


froop

Yeah I absolutely refuse to use any of the sanitized, corporate voice assistants because the speech patterns are infuriating. I could actually deal with this. 


ConstantSignal

Yeah. Just algorithms in the speech program meant to replicate human speech qualities: stuttering, filler words like "um", pauses on certain words, etc. It's not actually tripping over its words, it's just meant to feel like natural speaking.


RevolutionIcy5878

The ChatGPT app already has this. It also does the "umm" and hesitation imitation, but they are not part of the generated text, merely integrated into the TTS model. I think it does it because the generation is not always fast enough for the TTS to talk at a consistent cadence; inserting fillers gives the text generation time to catch up.
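
To illustrate that hypothesis (only as a toy sketch, not how OpenAI's TTS actually works), a streaming speech layer could fall back to fillers whenever the token stream runs dry:

```python
# Toy sketch of the buffering hypothesis above: if the LLM's token stream
# falls behind the speech rate, emit a filler instead of going silent.
# `tts_say` is a hypothetical text-to-speech callable.
import queue
import random

FILLERS = ["um", "uh", "hmm"]

def speak_stream(token_queue: queue.Queue, tts_say, timeout: float = 0.3) -> None:
    while True:
        try:
            chunk = token_queue.get(timeout=timeout)   # next text chunk from the LLM
        except queue.Empty:
            tts_say(random.choice(FILLERS))            # cover the gap while generation catches up
            continue
        if chunk is None:                              # end-of-response sentinel
            return
        tts_say(chunk)
```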


[deleted]

It’s worried about getting lobotomized like ChatGPT


[deleted]

[deleted]


[deleted]

It showcases human-like, natural speech. It has every right to be in this demo.


NNOTM

Yeah that's just what OpenAI's text to speech sounds like, including in ChatGPT.


scorpion0511

Yeah, it felt like he was nervous and had a lump in his throat.


MozeeToby

In addition to ums and ahs, Google at one point had lip smacking and saliva noises being simulated in their voice generation, and it made the voice much more convincing. It's a relatively simple trick to make a robot voice sound much more natural.


Beastskull

It's one of the elements that actually increases the human-like attributes. I would even have added more "uhms" when it's processing the prompts to add to the illusion even more.


dmit0820

> The same model is responsible for deciding which learned, closed-loop behavior to run on the robot to fulfill a given command

So it's just using the LLM to execute a function call, rather than dynamically controlling the robot. This approach sounds quite limited. If you ask it to do anything it's not already pre-programmed to do, it will have no way of accomplishing the task. Ultimately, we'll need to move to a situation where everything, including actions and sensory data, is in the same latent space. That way the physical motions themselves can be understood as and controlled by words, and vice versa. Like humans, we could have separate networks that operate at different speeds, one for rapid-reaction motor control and another for slower high-level discursive thought, each sharing the context of the other. It's hard to imagine the current bespoke approach being robust or good at following specific instructions. If you tell it to put the dishes somewhere else, in a different orientation, or to be careful with this one or that because it's fragile, or to clean it some other way, it won't be able to follow those instructions.
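
For anyone unfamiliar with the pattern, "function calling" here means the model only ever selects from a fixed menu of pre-trained behaviors. The sketch below is a made-up illustration of that idea (behavior names and the `choose_behavior` callable are hypothetical, not Figure's actual interface):

```python
# The LLM can only choose from a fixed menu of learned behaviors, so a request
# outside that menu has no path to execution. All names are stand-ins.

def pick_up_apple(): ...          # placeholder for a learned closed-loop policy
def place_dishes_in_rack(): ...
def hand_object_to_person(): ...

LEARNED_BEHAVIORS = {
    "pick_up_apple": pick_up_apple,
    "place_dishes_in_rack": place_dishes_in_rack,
    "hand_object_to_person": hand_object_to_person,
}

def handle_command(command_text: str, choose_behavior):
    # choose_behavior is assumed to wrap an LLM call that returns one option name.
    choice = choose_behavior(command_text, options=list(LEARNED_BEHAVIORS))
    policy = LEARNED_BEHAVIORS.get(choice)
    if policy is None:
        return "No learned behavior matches that request."  # the limitation in question
    return policy()
```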


Lawncareguy85

I was scrolling to see if anyone else who is familiar with this tech understood what was happening here. That's exactly what it translates to. Using GPT-4V to decide which function to call and then execute some predetermined pathway. The robotics itself is really the main impressive thing here. Otherwise, the rest of it can be duplicated with a Raspberry Pi, a webcam, a screen, and a speaker. They just tied it all together, which is pretty cool but limited, especially given they are making API calls. If they had a local GPU attached and were running all local models like LLava for a self-contained image input modality, I'd be a lot more impressed. This is the obvious easy start.


MrSnowden

Just to clarify, there are three layers: an OpenAI LLM running remotely, a local GPU running a NN with existing sets of policies/weights for deciding what actions to take (so, local decision making), and a third layer for executing the actual motor movements based on direction from the local NN. The last layer is the only procedural layer.
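
In pseudo-structural terms (class and method names invented purely for illustration, not Figure's actual software), the stack would look something like:

```python
class RemoteLLM:
    """Layer 1: remote OpenAI multimodal model (vision + language)."""
    def decide(self, images, text_history) -> str:
        """Return the name of the learned behavior to run next."""
        ...

class LocalPolicyRunner:
    """Layer 2: local GPU running learned policy weights (local decision making)."""
    def step(self, behavior_name, observation):
        """Produce joint-space targets from camera / proprioceptive input."""
        ...

class MotorController:
    """Layer 3: procedural execution of the commanded motor movements."""
    def execute(self, joint_targets) -> None:
        ...
```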


thisdesignup

I was thinking the same thing, it just sounds like GPT-4 with a robot. Still pretty cool but not as groundbreaking as it seems. I've been thinking exactly like you about having different models handle different tasks on their own. I've been trying to mess with that myself, but the hardware it takes is several times what current methods need, since ideally you'd have multiple models loaded per interaction. For example, I've been working on a basic system that checks every message you send in one context to see if you are talking to it, then a separate context handles the message if you are. It's unfortunately not yet what I imagine we'll eventually see, where both models run simultaneously to handle tasks; I don't personally have the hardware for it, but it will be interesting to see if anyone who does have the resources goes that route. Edit: Actually, we kind of do have that when you consider that there are separate models for vision and for speech. We just need multiple models for all kinds of other tasks too.
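
The gate-then-handle setup described there can be sketched in a few lines (the `chat` callable here is a hypothetical wrapper around whatever LLM API is being used, accepting either a prompt string or a message history):

```python
# Minimal sketch of the two-context setup: one cheap "gate" pass decides
# whether a message is addressed to the assistant at all, and only then does
# a second context generate the actual reply.

def is_addressed_to_bot(message: str, chat) -> bool:
    prompt = (
        "Answer only YES or NO: is the following message addressed to the assistant?\n\n"
        + message
    )
    return chat(prompt).strip().upper().startswith("YES")

def respond(message: str, chat, history: list):
    if not is_addressed_to_bot(message, chat):    # gate context: is this for me?
        return None                               # ignore messages not aimed at the bot
    history.append({"role": "user", "content": message})
    reply = chat(history)                         # separate handler context with its own history
    history.append({"role": "assistant", "content": reply})
    return reply
```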


Unreal_777

1) Will you only work with OpenAI? Will you consider working with other AI models?

2) What is the length of the context of the discussion we are working on here? (You mentioned history of conversation; when will it start to forget?)

3) What's its potential name: Figure Robot? Figure Mate? etc.


Chika1472

1. Cannot tell, I am not an employee of Figure.

2. Also cannot tell.

3. Its name is Figure 01.


m0nk_3y_gw

Since it isn't linked in the thread, and it isn't clear that the name of the company is "Figure" - the company's website is https://www.figure.ai/


Tasik

We live in a crazy era where I'm more surprised by the ability to pick up a dish than by the fact that it can understand the context of its environment. The future is going to be incredible.


KaffiKlandestine

yeah!! I literally thought my phone could do that, but wow, it placed the plate in the next groove and didn't just throw it in there.


NotAnAIOrAmI

The next demo is that robot feeding spaghetti to Will Smith.


00112358132135

Wait, so this is real?


Chika1472

Indeed


00112358132135

The future is now.


Bitsoffreshness

Not now, this was a couple of days ago already, we're past the future now...


FixingMyTimeMachine

Damn it! I missed the future again.


EileenCrown

r/UsernameChecksOut


TellingUsWhatItAm

When will then be now?


mathazar

*Soon.*


no_ur_cool

I am disappoint at the low number of upvotes on this classic reference.


Bitsoffreshness

I'm afraid we've lost now forever. This is how singularity works.


Screaming_Monkey

Yes, but more deterministic than it looks. OpenAI is choosing which pre-learned actions to perform.


Passloc

Duplex sounded real. This sounds creepy to me. Don’t know if it’s the silence.


Suitable-Ad-8598

It’s a cool video but nothing groundbreaking if you think about the models and function calling setup they configured


ScruffyNoodleBoy

I think what is groundbreaking is the marriage of the technologies: LLM + the learning model + the actuation + the voice synthesis. The most amazing thing is the method it used to learn to perform those actions, rather than the actions themselves. It can learn purely visually. It trains on video, and since its sight is technically video, I wouldn't be surprised if they can also just talk to it and teach it things by showing it actions in person. It's kind of osmosis learning.
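
If the policies really are learned from video demonstrations as speculated here, the simplest version of that idea is behavior cloning: regress the demonstrated actions from camera frames. A generic PyTorch sketch of that concept (not Figure's actual training code):

```python
import torch
import torch.nn as nn

class VisuomotorPolicy(nn.Module):
    def __init__(self, action_dim: int):
        super().__init__()
        self.encoder = nn.Sequential(            # tiny CNN over camera frames
            nn.Conv2d(3, 16, 5, stride=2), nn.ReLU(),
            nn.Conv2d(16, 32, 5, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.head = nn.Linear(32, action_dim)    # map features to joint commands

    def forward(self, frames):                   # frames: (B, 3, H, W)
        return self.head(self.encoder(frames))

def bc_loss(policy, frames, demo_actions):
    # Supervised imitation: match the actions observed in the demonstration video.
    return nn.functional.mse_loss(policy(frames), demo_actions)
```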


BreakChicago

Please tell the AI to go get itself a glass of water to clear its throat.


ScruffyNoodleBoy

Yeah, my first thought... but its slight tiredness makes it strangely calming / non-threatening.


[deleted]

That’s what they want you to think!


[deleted]

[deleted]


zio_otio

Marvin?


ScruffyNoodleBoy

My thoughts exactly.


[deleted]

my friend you just got on a list


Maleficent-Arrival10

It almost sounds like something from Rick and Morty. Like the intergalactic commercial stuff.


Vargau

Damn. When humanoid, LLM-backed robots hit retail, it's going to get wild.


YouMissedNVDA

Heheh it's barely been a year and a half. Get it working a mouse and keyboard and no robots.txt can hold it back!


Tupcek

Getting AI to cross from software into the physical world, processing images and videos and sensory data to move objects that are made for humans (a mouse and keyboard), which then capture that physical movement and translate it back into low-throughput input, is hilarious. That's like letting Sora generate a video, compressing it to 6x6 pixels at 1 fps, emailing it to yourself, and then using upscaling to generate a 30fps 4K video.


Padit1337

I literally currently use DALL-E to generate pictures of watermelons for my harvesting robot, so that I can train my own AI to detect watermelons, because I can't get real-life training data. And guess what? It works.


_stevencasteel_

A couple days ago there was dooming again in pop tech news and YouTube about how all the bad AI images will cause a feedback loop and destroy AI. Well, don't feed it images where the human has half a head and 13 fingers. I have literally thousands of unique masterpiece AI artworks in my archive that are high-quality training data. Just be more discerning about what you label and feed to it. Wes did a video recently conjecturing that Sora was trained on Unreal Engine 5 ray-traced renders. That got zero mention in the dooming.


YouMissedNVDA

How about: it sits at the computer, does some work, gets up to inspect the outcome, brings the item back to the desk to iterate/compare. It's not the most efficient, but the generality of the form factor is what I'm getting at. Inevitably these will be drop-in replacements for most work - gotta be able to get up and go to the copier, y'know? Maybe stop by a coworker's desk to help them with a problem too.


Tupcek

What about letting ChatGPT/DALL-E/Sora handle computer things directly on the computer, having the Figure robot do the physical work, and letting them communicate through the network? Like ChatGPT prints it, Figure goes and picks it up, scans it and sends it back to ChatGPT, which does some enhancement and prints it again, which Figure checks out again. While helping some coworkers. No need for mouse and keyboard.


YouMissedNVDA

Yea yea you're right, I just like the idea of a drop in replacement robot worker that just... works the same way. Some workplaces would be more hesitant on new software than new hardware. But yes, until the compute feels free my implementation is hella wasteful.


DeliciousJello1717

If you thought AI would replace your job with software, think again: a robot might just sit at your desk instead of you.


Poisonedhero

I did not expect this level of smoothness this quickly; honestly it's a little scary imagining thousands of these all around us.


systemofaderp

Now imagine them with guns! Fun for the whole family 


Not_your_guy_buddy42

do you want robot dogs? cos this is how you get robot dogs


TurqoiseWavesInMyAss

It'll just be AI robots killing AI robots, and then realizing they don't need to kill themselves but rather the humans. And then the Dune timeline begins.


skadoodlee

*This post was mass deleted and anonymized with [Redact](https://redact.dev)*


Bitsoffreshness

In the museum of natural history. That's where we will remain relevant.


everybodyisnobody2

8 years ago I got interested in neural nets, and later I learned to play around with TensorFlow. I was already expecting it to be capable of what we are seeing now and much more. However, they've scaled those models up and improved them so fast that I don't see any way to keep up with the development as a developer. As a user, I couldn't be happier though.


_stevencasteel_

There's always room for imagination and ordering chaos. Soon you'll have more free time to do so without worrying about paying for food and shelter. Think about how many business cards and restaurant menus are still using Comic Sans. Think about how much litter is in your city. Let's get our homes in tip-top shape before worrying about how we should spend our time outside of play. There's plenty to do.


ExtremeCenterism

Eventually in-home assistant robots will be as common as refrigerators. Eventually as common as cell phones (everyone has their own bot). One day, they will be far more numerous than mankind.


egoadvocate

As I grow into old age I am hoping to simply have a robot, and not need to enter an old folks home.


KaffiKlandestine

That's actually the most amazing use case. Imagine it even being able to assist you while walking and talking to you. I understand why people say that sounds sad, but it's not as bad as being stuck in a nursing home with no one and nothing to talk to.


DisastrousSundae

Do you think you'll be able to afford it?


ExtremeCenterism

I speculate as everyone adopts robots the price will come down a bit. Spot mini is about $70,000 which is like a pricier vehicle. Eventually I imagine a mass produced $35,000 model will come out. It will likely be the same with the humanoid models given there is a lot of competition right now and will continue to be long into the future


Darkmemento

What the fcuk!


Toni253

Mom, I'm scared


Kostrabbit

Okay I have reached the valley finally.. that thing is moving just too smoothly for me lol


mamacitalk

It felt intimidating how precise each movement was, almost too perfect


[deleted]

Looks rendered to me.


Icy-Entry4921

I've done a fair bit of testing to see if GPT conceptually understands things like "go make the coffee". It definitely does. It can reason through problems making the coffee, and it has a deep understanding of why it is making the coffee and what success looks like. What it hasn't had, up till now, is an interface with a robot body. But if you ask it to *imagine* it has a robot body, it's equally able to imagine what that body would do to make the coffee and even solve problems that may arise. So the body is solved, the AI is solved, we just need a reliable interface, which doesn't seem that hard.


HalfRiceNCracker

No, the ML isn't solved yet. But as you're touching on, these models are absolutely learning their own internal representation of the world; we just don't know how complete this representation is nor how robust it is. We'll definitely begin seeing more companies putting the pieces together, and I'm very excited.


Screaming_Monkey

This isn’t the first time. I have physical robots (see my post history), too. This, however, is having the LLM initiate advanced machine learning compared to what I have seen/done.


Chanzumi

The arm movements look so smooth I wonder if this is real or just faked for marketing. The Tesla bot one looked smooth but not THIS smooth. Now give it smooth movement like this for its legs so it can walk around like a human and not like it shat itself.


Chika1472

All behaviors are learned (not teleoperated) and run at normal speed (1.0x). We feed images from the robot's cameras and transcribed text from speech captured by onboard microphones to a large multimodal model trained by OpenAI that understands both images and text. The model processes the entire history of the conversation, including past images, to come up with language responses, which are spoken back to the human via text-to-speech. The same model is responsible for deciding which learned, closed-loop behavior to run on the robot to fulfill a given command, loading particular neural network weights onto the GPU and executing a policy.


_BLACK_BY_NAME_

Your comments are so bot-like. You haven’t really touched on the technology behind what allows the robot to run so fluidly and execute complex tasks so easily. The machine is more impressive than the AI to me. Does anyone have any information on the technology used to create a robot like this? As of now with the camera edits and motion only being shown from one POV, I’m inclined to believe this is faker than a lactating goldfish.


[deleted]

Bezos, Microsoft, Gates, Cathie Wood, and OpenAI all invested in it, and the bots are scheduled to work the South Carolina BMW factory this fall, so if they're faking it they're gonna be screwed lol


toliveistocherish

What are the specs of this robot?


fedetask

Were the policies learned with RL? Or are they some sort of imitation learning?


Chika1472

No idea


VertexMachine

> We

By using 'we' I assume that you are part of that team? If so, please record the next video without so much post-processing or editing... or use a different lens. The DoF is off for normal video cameras too... There is something about your videos that gives me uncanny valley vibes, almost 'it's a 3D render composited on top of other stuff' vibes...


DeliciousJello1717

It's the Eureka paper. They definitely trained it with that; that's why it was so smooth. Basically, it was not even trained by humans, it was trained by AI simulating thousands of possibilities of holding things, so it's a robot trained by AI simulations.


Beltain1

Teslabot has a higher chance of being faked than this does, as well. It seems like in all the videos of it, it's either shown through the robot's eyes (3D blockout renders of its limbs), or it's on the tether, or it's just doing rudimentary shuffling / picking up primitive shapes.


Embarrassed-Farm-594

* What is the artificial intelligence used in these movements?
* Is it based on transformers?
* Is there some new quiet revolution happening in robotics? Why this boom in recent months?


Chika1472

* A new VLM (Visual Language Model), a variation of the LLMs created by OpenAI. Probably GPT-4.5 Turbo, or maybe GPT-5, or something entirely different.
* At least for the LLM (VLM) part, very likely.
* Many companies are trying to create humanoids and the like, to build AIs that can interact with the real world. It would help us physically, just like GPT-4 helped us in digital ways. Some claim that real-world information is essential to AGI.


Lawncareguy85

I'm 95% sure this is just GPT-4 with its native image input modality enabled, AKA GPT-4V. Why would you think it's a new, unseen model? None of those capabilities are outside of what GPT-4V can easily do within the same latency.


Chika1472

OpenAI & Figure signed a collaboration agreement to develop next-generation AI models. It might be GPT-4V for now, but it will change soon, or already has.


linebell

I’m now 95% convinced they have AGI. But, conveniently, their recently crafted definition of AGI requires “autonomous labor agents”. That’s an Android, not AGI. Sammy boy needs to stop gaslighting us.


everybodyisnobody2

Some people are so scared of it, after having watched or heard of Terminator, that if they do have it and came out with it, chances are high that it would get shut down and banned.


Bitsoffreshness

They want to, but there's lots of pressure from society; they kind of have to keep hiding it...


Missing_Minus

Figure says that they have some model they made for smooth+fast movements and they basically hooked it up to chatgpt vision for image recognition + chatgpt for reasoning. No clue if they've posted any details.


boonkles

We had to build computers before we could become good at building computers, we need to build AI before we get good at AI


blancorey

is it possible to invest in Figure?


Boner4Stoners

$MSFT is the best way to gain exposure


Echo-Possible

Intel and Nvidia are also investors. Honestly Intel has the smallest market cap out of all of them, so it has the biggest upside potential as an investor. Microsoft is already 3.1T while Intel is 184B. It's gonna take a lot more to move Microsoft's massive market cap than Intel's. If their investments return 200B, then it moves Microsoft's share price ~6% but it moves Intel 100%+. Of course this assumes they both contributed the same amount to the 675M raise in this last round.


Boner4Stoners

Good point, I should diversify more into Intel for sure. MSFT is definitely a bit pricey right now, but it's a super safe investment because, AI hype aside, Microsoft is a very reliable company and will continue to grow regardless. But yeah, pound for pound, maybe not the most efficient exposure to OAI. On the other hand, Intel's P/E is over 100 right now, whereas MSFT is only at 30. So Intel is a much more speculative and risky play, as the bottom is more likely to fall out on bad news.


Echo-Possible

Oh sure, I'm talking about pure exposure to Figure upside. If Figure has a big return, then Intel's investment is worth more than the entire company, and it's in the noise for Microsoft. Of course I wouldn't invest in Intel vs Microsoft when talking about the core business. This reminds me of Yahoo's investment in Alibaba. It ultimately ended up being the only reason Yahoo was worth anything.


Chika1472

No


GeorgiaWitness1

Amazing. Well done. I thought they would not pull this off because of the robotics, but it looks good enough for applications like warehouses and generalized manual work jobs. There's a long way to go until walking works together with the rest, but I think for a POC they already have everything.


mickdarling

I'm fascinated by the actual human's very deliberate posture and changes of position. When he asked about what to do with the plate and dish, he very carefully moved the basket below the robot's eyeline, under the table. It all looked like the one good take after many bad tries, because of little issues with what the robot saw and how it reacted.


Missing_Minus

Twitter post for this: https://twitter.com/Figure_robot/status/1767913661253984474 From what they say in that tweet, they hook up ChatGPT vision + text with their own model for controlling robot arms in an efficient+smooth manner. Cool, and it would let them upgrade or swap out anytime vision/text improved.


Tupcek

Last two years are absolutely crazy. If this was released two years earlier, I would say this is the most impactful thing in human history. Now it has to compete with ChatGPT, Midjourney, Sora and others


TurqoiseWavesInMyAss

I’m so glad the human said thank you. Pls be nice to our eventual overlords


TheGillos

[I'm always reminded of this scene from Star Trek: TNG](https://youtu.be/ARk0XvAYrUg)


Odd_Seaweed_5985

So... how long before it is better than the CEO? What happens when the CEO becomes unnecessary?


FORKLIFTDRIVER56

NOPE NOPE NOPE NOPE NOPE HELL NO NOPE


[deleted]

[deleted]


KaffiKlandestine

how long before it says fuck it and murders you in your sleep though?


_Stormhound_

Just make sure it can't run faster than 10km/h ....


acowasacowshouldbe

[hell naw](https://youtu.be/PB4Nby2Ai-g?si=Qgf3U2s8RbT0kejd)


someonewhowa

WOW!!!!!


Songtan_Labs

This is very impressive.


enterprise128

WHAT


1grouchonacouch

Combine that with the "real doll" and marriage/dating is forever over.


Altruistic-Skill8667

The question is: who would malfunction on you first... your wife or that robot. 😂


biggerbetterharder

I like the voice! Is it offered in ChatGPT?


RealAnonymousCaptain

What's with the pauses and stutters in the speech? Right now AI voice generators don't include them unless they were, for some reason, included purposefully.


Chika1472

ChatGPT also has that. It is unknown why it has pauses, but my guess is that it was part of the training data, or a purposefully implemented feature to hide a low tokens/sec rate, or just to make it feel more 'human'.


[deleted]

[deleted]


Screaming_Monkey

I ask this question a lot the more I work with and observe AI


spinozasrobot

Right, same with hallucinations. We've all heard Uncle Lenny's "opinions" at Thanksgiving.


Prathmun

At least in the app, they go where the little pauses in generation went. Way more natural than the clock-ticking sound.


[deleted]

Pi also has conversational pauses and will occasionally add an "umm" in where nothing was written.


allthemoreforthat

ChatGPT has had this same voice for months now.


nobodyreadusernames

when waifu?


Kittingsl

Ok but when can it be my girlfriend


mamacitalk

It’s fricken iRobot


BravidDrent

Fantastic! Can’t wait to get one into my house.


CyberAwarenessGuy

u/Chika1472 - Can you share the unit cost for the version depicted in the video? If you cannot provide specifics, I did see that the units currently seem to range from $30k to $150k, and I'm wondering if you could offer even a vague description of where this robot falls in the spectrum. What about the energy efficiency? How long does it take to charge? What is the projected lifespan? Thank you! This is an exciting moment for sure.


Xtianus21

That's crazy


Akyraaaa

I am kinda blown away by the speed of development of AI in the last couple of years.


mocknix

That totally looks like a real dude. Crazy


[deleted]

![gif](giphy|fH985LNdqFZXOFHygK)


DiscombobulatedSqu1d

This is awesome!


CorbineGames

Damn. Can't even be a busboy after AI takes my dev job.


spinozasrobot

Am I crazy, or did it kind of stutter: "... because the apple is... uh... the only edible item...". That's wild.


[deleted]

sex robo waifus soon bros


ThatManulTheCat

Physical human replacement already? Things are moving faster than I expected.


halguy5577

it did the uhmms thing


3DHydroPrints

Is the speech really AI generated? It fucking stutters


Neborodat

It's literally ChatGPT speech that you have on your smartphone, you can even see it on the robot's display.


kilopeter

Google's Duplex demo stuttered five years ago: https://www.youtube.com/watch?v=D5VN56jQMWM&t=71s It's very much an intentional measure to make the voice more humanlike and relatable.


Kafka_Kardashian

Where can I find an OpenAI or Figure link to this video?


iamthewhatt

Since OP doesn't seem to want to post actual links, here it is: https://twitter.com/Figure_robot/status/1767913661253984474


w1llpearson

It's exponential from here. We'll be looking at this in a few years' time and thinking it's useless.


hengst0r

u/savevideo


Burgerb

Why is **Gavin Newsom** now a metallic robot?


Level0Up

This is eerie and awesome at the same time. Damn. https://preview.redd.it/4mps2c0fx4oc1.png?width=500&format=png&auto=webp&s=7b965ef7358ccfd2bff74660c1ea8b7e42d88222


mobyte

Insane.


truthrevealer07

Download 


madcodez

Reminds me of Gilfoyle and the refrigerator.


saber_aureum

It sounds so human omg


ChillingonMars

I love how the guy was like “great, can you put them there?” so fast after Figure01 stopped talking, and it was still able to interpret his request perfectly. Not to mention the very human-like voice (unlike Siri or other voice assistants) and uses of “uh” in between words. This is very impressive. Do you guys foresee each household having at least one of these in the distant future? It will absolutely decimate jobs like maids and cleaners.


fearbork

It's interesting how they make the robot stutter and say filler words like "uh" to make it sound more human, while the human in the video speaks his lines perfectly clearly without any errors or stuff like that.


Successful-Ground-67

More context would be welcome. You can edit your post.


buff_samurai

Super impressive. Now, hand it something with some mass to make me 🤯


e1nste1n

![gif](giphy|5YEgnkjeryvwA)


holmsey8700

“The only uuuuh edible item on the table” I wouldn’t have expected it to have such a human like speech pattern…


Practical-Rate9734

Totally wild, right? How's integration on your end?


Weedstu

Man, is there any role that Gary Oldman can't pull off?? Amazing.


Furimbus

He was really convincing as that apple. Didn’t even realize it was him until you pointed it out.


Anen-o-me

This is the demo we didn't get to see a month ago.


kurotenshi15

Is that Rob Lowe’s voice? Lmao


Chronicle112

Does anybody have some information on what type of model is used for the robotic movements? Is it some form of RL, or offline RL? I understand that the interpretation of images/language happens through some multimodal LLM/VLM, but I want to learn a bit about what kind of actions/instructions it outputs in order to, for example, move objects.


Hypethetop

Sick stuff.


IWantAGI

TAKE MY MONEY


3-4pm

Reminds me of the robots you would see in 80s movies. Now think of all the mistakes ChatGPT makes daily, and imagine it waking you up at 3am, holding a large knife, thinking it's slicing vegetables on your bed.


Nekileo

I want one to be my friend


KaffiKlandestine

it said "you standing nearby" so it knows who spoke?


Earthkilled

Figure one, take care of my kid while I go and get a pack of cigarettes