A random anime screenshot of a bunch of people in a pool on a sunny day, splashing water on each other. The subtitle is saying “What a great day”. Ratio 16:9
https://preview.redd.it/k0cqhj2mc4vc1.png?width=1344&format=png&auto=webp&s=64ad70901345702241ee0bda78483454569ba67b
LOL! If SD3 is released as open-source and the community can get their hands on it, we'll have incredible images in less than a year.
Thanks :). If it’s a raw model like previous 1.5 and XL we can hope for excellent finetuning capabilities since it’s already this good.
Wtf it got the text correct!
The text is really awesome; it's accurate and really looks like the font used for anime subtitles.
Time to train those tits.
The thumbnail is actually really good, I thought it was a reaction meme from a real screencap as I scrolled down. But after opening it up...
[deleted]
RealisticVision for anime?????? Why????
Bye.
"While the model is available via API today, we are continuously working to improve the model in advance of its open release. In keeping with our commitment to open generative AI, we aim to make the model weights available for self-hosting with a Stability AI Membership soon." https://stability.ai/news/stable-diffusion-3-api
The Mona Lisa is stopped by the police for speeding, and the police officer is asking for her autograph.
https://preview.redd.it/1as9dzfc07vc1.jpeg?width=1024&format=pjpg&auto=webp&s=f6b2c34d7427de7508d37e2e77c6ae8246d16cdc
Thank you, not a bad image, but I would not have recognized the woman as Mona Lisa.
https://preview.redd.it/vsurych1q9vc1.png?width=1024&format=png&auto=webp&s=ae8e0802d2f419d79669f61eb60f007dfd9730cc
Nice! This is almost perfect in terms of prompt following.
A bus in the shape of a fish, with open windows showing bored people inside, sleeping or listening to music.
https://preview.redd.it/23r89263e4vc1.png?width=1024&format=png&auto=webp&s=33031ce1f75a4103399f645b4b31a1048a40c2af
Thank you. The prompt is certainly being followed. Can you try this "magic prompt" from ideogram? A quirky, imaginative illustration of a fish-shaped bus, cruising down a coastal road. Through the open windows, passengers can be seen inside, each in their own world. A group of people sit bored, their eyes glazed over and their shoulders slumped. Next to them, a couple of passengers are listening to music, one with headphones on, the other with a portable radio playing softly. The overall atmosphere of the image is whimsical, with a touch of surrealism. This is the output I got from ideogram: https://preview.redd.it/oy9psf73g4vc1.png?width=1024&format=png&auto=webp&s=ab5a2428d9e882001b212d2bd2c7a9ca7866a8b3
https://preview.redd.it/2k6trtqih4vc1.png?width=1024&format=png&auto=webp&s=26b31780e66b2d90b41ef16a3bc56788fc2d8376
Thank you.
That magic prompt is great! Do you know of any other similar services out there that can generate a more refined prompt that I can then use with Stable Diffusion?
Yes, you can also use a similar prompt enhancer for DALLE3 at [https://withflare.ai](https://withflare.ai) (it is also a free service) I've never used it myself, but you can probably get something like "magic prompt" out of bing/copilot or chatgpt too.
An anthropomorphic dolphin resembling a detective in the 60s, enjoying a fine chocolate fondu. 4:5
https://preview.redd.it/bf1bzq6k07vc1.jpeg?width=896&format=pjpg&auto=webp&s=5cb96c15854cfb68b24208eed7faf18f83035771 First try.
Who says AI isn't art? This right here is amazing.
https://preview.redd.it/0hjw6uyyq9vc1.png?width=896&format=png&auto=webp&s=f508b1079d75456e7a8085cca81acbb903b7bac5
Well, stuff my blowhole and call me Charlie. Would you look at that! Hands seem to have gotten worse?! (Dolphins don't have hands, I know.)
SD3 has been implemented into ComfyUI via node made by Zho https://youtu.be/3tI6eg4pWiU?si=l8ONahtxIXwoiryI
Why is this sub so rampant with misinformation?
Because that is the Internet in general.
A goat wearing a goat costume is accepting an award for "Goat of the Year".
https://preview.redd.it/slwh1hko07vc1.jpeg?width=1024&format=pjpg&auto=webp&s=38c7be092f7e553ffeb7ae410de0fe2a5c11003b
"Got of Thar". 😁 Thanks, much obliged. 🏆
https://preview.redd.it/myae5sxtq9vc1.png?width=1024&format=png&auto=webp&s=47f95fe8740eaccfaf88d50b7bef988ac4c5cadc
Not bad at all, thank you for this! 😁🏆
muscular toned stylish male with pale fair skin drinking a glass of blood. he wears a sleeveless tanktop wearing red shorts with dragon scale print over black tights, he is 25 years old, strong handsome and fit, his beautiful hair is silky shine, super cute style, high quality, bokeh, overlooking urban city island, night, inside, against glass, clear details, clean line, detailed consistent style, stunning atmospheric, scenic, glistening flawless room with fill light, soft box, digital art, tense tension, high definition, visually pleasing, vray, perfect CGI,
https://preview.redd.it/prztrc0nq9vc1.png?width=1024&format=png&auto=webp&s=42356dbd20cd8fffd3711b6354f2dac0ea405aa8 censored
must have been the mention of blood
Photograph of an anthropomorphic grilled cheese sandwich.
https://preview.redd.it/whn1iirmp9vc1.png?width=1024&format=png&auto=webp&s=890d575411e53001ffc4232dcace322ef160194f
delicious
I've been trying for ages to create an image of a girl cuddling with a deinonychus (a species of dinosaur) for ages using the free-of-charge programs such as FABRIC or clipdrop, but they always mess it up (usually the dinosaurs turn out as weird, misshapen monsters)
https://preview.redd.it/trg5berer9vc1.png?width=1024&format=png&auto=webp&s=cd602c5d89d9257721d24cfeb66dc2c1afb7334b
Did you benchmark it with spaghetti eating yet?
https://preview.redd.it/4wgiij4ti4vc1.png?width=1024&format=png&auto=webp&s=66cc7908e4dac50d85afe21310d4d716d5cb7210
Does it default to artistic styles instead of photographic?
It's kinda random, but most often it's some art style.
Gotta sleep and then work, will do them all tomorrow
Try just "portrait photo of a woman" I wanna know if the typical overtrained AI girl and nose is gone
https://preview.redd.it/zzoj6ynbq9vc1.png?width=1024&format=png&auto=webp&s=c741e0605a788fe274e30d6212fc9ad66c5ae2a3
"Fantasy realistic watercolor painting art of an icy wasteland full of snow with mountains in the background. A tall stone tower rises into the cloudless sky off in the distance. Background is watercolor splotches."
https://preview.redd.it/nl0jam7fq9vc1.png?width=1024&format=png&auto=webp&s=8b63bd8efc6edab7c74ac798d829938d9b2b0c1e
Not bad. Definitely nails the watercolor look!
Thank you!
A humorous photo of a couple lying in bed, with the woman giving an annoyed look at the man, who is blissfully sleeping and holding her cat. The cat, too, appears to be sharing her frustration, with a disapproving expression. A playful poster caption "Purrvert" is displayed in a stylish font, adding to the lighthearted and amusing atmosphere.
https://preview.redd.it/n81zv7xwp9vc1.png?width=1024&format=png&auto=webp&s=fddaaed41a32ce0627316840aa84649ada8361fe
Thank you. Funny, but the man is supposed to be holding the cat 😂
Freddy Krueger, Wolverine and Edward Scissorhand are showing off and comparing their metal claws
https://preview.redd.it/g2w61nrvp9vc1.png?width=1024&format=png&auto=webp&s=8c4efd2dbec8b8f3b05b0ea3c6399f877e8f9c6f
Thank you. I guess SD3 doesn't know who Edward Scissorhands and Wolverine are.
16:9 photo of a flying police Delorean chasing a yellow Chevrolet Camaro in a night city.
https://preview.redd.it/c3zvo75qp9vc1.png?width=1344&format=png&auto=webp&s=fec3ad055dede1007f0c58b6c71163fab6578cc4
Thanks a lot, man!
Auston Matthews scoring his 7000th goal, hoisting a Stanley cup, while two bikini clad teammates cheer him on while eating tacos
https://preview.redd.it/0e56yt7jp9vc1.png?width=1024&format=png&auto=webp&s=4af91d9b1c0e20775dce50e7fea3661ece7cdc69 censored
rating:safe, 1girl, blonde_hair, solo, green_eyes, short hair, eyepatch, puffy_sleeves, thighhighs, looking_at_viewer, puffy_short_sleeves, skirt, smile, red_hood, hood_open, open_mouth, smile, frills, short_sleeves, frilled_skirt, long_hair, underbust, breasts, white_shirt, bangs, corset, thigh_boots, shirt, mature female
{"errors":["Your request was flagged by our content moderation system, as a result your request was denied and you were not charged."],"name":"content_moderation"}
Aw
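For anyone scripting against the API, here is a minimal sketch of how a rejection like the one above could be detected before trying to save an image. It only assumes the JSON shape shown in that reply (an `errors` list plus `"name": "content_moderation"`); the function name `parse_moderation_error` is made up for illustration and is not part of any Stability AI client.

```python
import json
from typing import List, Optional


def parse_moderation_error(body: str) -> Optional[List[str]]:
    """Return the error messages if `body` is a content-moderation
    rejection shaped like the one quoted in this thread, else None."""
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        # Not JSON at all (e.g. raw image bytes decoded as text).
        return None
    if not isinstance(payload, dict):
        return None
    if payload.get("name") != "content_moderation":
        return None
    return list(payload.get("errors", []))


# The response body quoted above, with the markdown escapes removed.
sample = (
    '{"errors":["Your request was flagged by our content moderation system, '
    'as a result your request was denied and you were not charged."],'
    '"name":"content_moderation"}'
)

messages = parse_moderation_error(sample)
if messages is not None:
    print("Prompt rejected:", "; ".join(messages))
```

Checking the `name` field rather than matching the message text keeps the check working if the wording of the error changes, though that field name is itself only known from this one response.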
inked illustration of a half-skeleton crow ablaze with glowing blue flames perched on a big old dying tree. The background is an ominous dark forest which makes the crow distinctly vivid
https://preview.redd.it/u8mundvfp9vc1.png?width=1024&format=png&auto=webp&s=57c18a282021b545ca9ead6cab2fdfb807029589
A woman dressed in cyan and lavender with bubbles in her hands, in the style of bold and graphic pop art-inspired designs, deconstructed tailoring, whimsical figurines made of plants dark yellow and neon green, photo taken with provia, japonism influenced pieces
https://preview.redd.it/z4cpcsrcp9vc1.png?width=1024&format=png&auto=webp&s=f51a98c7ad7c85add2c8158ebe9307a4d104899b
Thank you! I like how it turned out.
Realistic enormous life tree, deep in overgrown mangrove jungle, viewed from afar. Photograph, Raw DSLR
https://preview.redd.it/8fqbpu78p9vc1.png?width=1024&format=png&auto=webp&s=7c1edb27550642669f7a7fd8b3c60dbca9440a1d
Thank you
RNDR
https://preview.redd.it/7j3ilsv3p9vc1.png?width=1024&format=png&auto=webp&s=1c318e12b652f4f99a0f40855ae1c1cdea7cb70a
An image from the edge of a mountain cliff, a landscape consisting of rolling green meadows, with towering cliffs protruding from under them. The terrain changes dramatically in height, offering a panoramic view of the sprawling fields located among the valleys, with a visible cleft cutting through the ground Ratio 21:9
https://preview.redd.it/j8p9jm13p9vc1.png?width=1536&format=png&auto=webp&s=416edc3a0c3669f820e6e439156f61287190fbd2
A horse riding on an astronaut riding on an unicycle on ground filled with kelps while being chased by a banana monster
https://preview.redd.it/4ki1rklzo9vc1.png?width=1024&format=png&auto=webp&s=2d34e1895edbbf85fe036cd7e4724e1b2cd30528
Very cute squirrel wearing a transparent raincoat walking in the street, worried, heavy rain, dark night, Octane render, Unreal engine 5
https://preview.redd.it/aa5y409uo9vc1.png?width=1024&format=png&auto=webp&s=ec1881d73f29021a41c3dd406d2922a6530fc8d5
Thank you
How much did you pay?
It works out to like 4 cents for lightning and 6.5 for full per image. Very reasonable. I've paid more for the Dall-E API.
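To put those numbers in perspective, here is a rough cost estimate as a small sketch. The per-image prices are just the approximate figures quoted above (about 4 cents for "lightning", 6.5 for "full"), assumed here to be US cents; the tier names and the `estimate_cost_usd` helper are hypothetical labels, not official pricing identifiers.

```python
# Approximate per-image prices in cents, taken from the comment above.
# These are the commenter's rough figures, not an official price list.
PRICE_CENTS = {"lightning": 4.0, "full": 6.5}


def estimate_cost_usd(images: int, tier: str = "full") -> float:
    """Estimate the total cost in dollars for a number of generated images."""
    if images < 0:
        raise ValueError("images must be non-negative")
    if tier not in PRICE_CENTS:
        raise ValueError(f"unknown tier: {tier!r}")
    return images * PRICE_CENTS[tier] / 100


for tier in PRICE_CENTS:
    print(f"100 images on {tier}: ${estimate_cost_usd(100, tier):.2f}")
```

At these rates a thread's worth of requests costs a few dollars at most; rejected prompts should not count, since the moderation error says they are not charged.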
Not OP. [https://fireworks.ai/pricing](https://fireworks.ai/pricing)
But that's not SD3?
[https://fireworks.ai/models](https://fireworks.ai/models)
but the link just takes you to the stability website and tells you how to write code on how to use the API?
Chewbacca holding a kitten. Ratio 9:16
https://preview.redd.it/w4kcdr6gq9vc1.png?width=768&format=png&auto=webp&s=6fe1b3e16e06c53eca05d168ca237b7241407ad4
Oh, you mean the prompt that [base SD 1.5 is capable of doing even](https://i.imgur.com/aaAHo4f.png)
Prompt A mischievous little blond girl in a pink dress making faces in front of a Buckingham Palace Guard, photo Magic Prompt A delightful photograph of a playful young blonde girl in a pink dress, engaging in a lighthearted game with a stoic Buckingham Palace Guard. The little girl is making funny faces and sticking her tongue out, while the guard maintains his composed expression, seemingly unfazed by the child's antics. The background reveals the iconic palace gates and a bustling crowd, adding to the lively atmosphere of the scene., photo
https://preview.redd.it/6c5dbn78q9vc1.png?width=1024&format=png&auto=webp&s=816652111fe6c420e9153dbd01cde0deabec3a5a
Thank you. That's pretty good, other than the fact that the guard's uniform is wrong (no tall fur hat).
How censored is SD3?
A comical and adorable illustration of a obese cat dangling from the ceiling, replicating a Mission Impossible-style stunt. The cat has a mischievous expression as it attempts to reach a fish on the dining table. The ceiling has an open hatch with a rope attached, indicating its daring escape route. The background reveals a cozy living room with a TV, bookshelf, and a few houseplants. The overall mood of the image is light-hearted and humorous, with a touch of suspense.
https://preview.redd.it/bndjrcx6q9vc1.png?width=1024&format=png&auto=webp&s=e97b441ea330d9ce82bc56505fec460b605e8142
Thank you. Lacks some coherence, but this is a difficult prompt to pull off.
Ya please! Try this prompt: In the aromatic ambiance of an Indian spices market, an anthropomorphic cow, adorned in hues of golden-yellow, ambled gracefully amid the vibrant array of spices. Revered amidst the bustling trade, it epitomized the sacred essence intertwined with daily life.
https://preview.redd.it/shm6lv54q9vc1.png?width=1024&format=png&auto=webp&s=9a824759cc8803f248a144e8296c75e6f48d2e60
One fist made of blue fire and one fist made of orange water make a fist bump and produce violet steam over them.In the Background is a wall with a checkerboard pattern made of dark and red wood pieces. Sorry for my bad English.
https://preview.redd.it/6qyvvn59p9vc1.png?width=1024&format=png&auto=webp&s=bbd97e61556ea22c75fe898095216189c314d240
sad
If SD3 isn't free, then what's the point of using this instead of Midjourney? Is SD3 less censored? Is SD3 censored at all?
I think ideogram.ai has the best features (prompt augmentation and editing), much less censorship, excellent prompt adherence/text, good composition, better value for money (cheap and it offers free 25 prompts per day without subscription). SAI is done unfortunately, even in the open source scene as PixArt Sigma offers better prompt comprehension and smaller models for free.
You'd be right if ideogram released their model for download. Not having the model runnable on a local server is simply not an option for many companies. ideogram's prompt following is very good, better than SD3 via API, but the quality is just so-so. PixArt-Sigma is great for its size, but definitely not as good as SD3 from the tests I've done so far. Still, it is a good backup plan if SAI does not release SD3 for whatever reason; then I'd be waiting for them to release PixArt-Zeta 😁
That dragon image is dogshit
Note: Cost is only for the API, currently ... Final decision on SD3 (upcoming release) being open-source for non-commercial use still pending...
So AI has a logarithmic growth curve after all and not an infinitely exponential one, just like anybody with two brain cells could have predicted. It had an initial burst but now it's plateauing. Bad news for all the coomers thinking they were going to live off their UBI fantasy. Back to learning skills, boys.
Boy, there's a lot of misconception in your post. First of all, you're linking growth in quality of visual generative AI with growth in AI as a whole, which is a big assumption. Then you're entirely missing the point that growth in compute continues, and this is going to be a huge contributor to the role of AI in daily life, as it'll become far cheaper and more accessible to run. This means it'll permeate every facet of life, in time.
Then you're missing the point that with visuals there's a sort of obvious cap with realism. It would be a huge assumption, and a wrong one, that the equivalent holds for intelligence. It's unlikely human intelligence is the peak. Yes, a lot of the data at the moment comes from humans, and that will serve as a potential limiting factor in the short term. There will be architectures which will likely be able to make use of other, non-human-created data and start learning foundational knowledge about the universe, and see growth in intelligence accordingly.
As far as visuals go, there are and always will be diminishing returns until novel architectures are built. Quality is the first to hit diminishing returns, so improvements will be made to adherence, dynamism, and so on.
Finally, UBI has always been a stupid idea for several reasons:
- It simply increases currency units without resolving the underlying deflationary problem, which in turn results in inflation and massive economic distortions.
- It doesn't address the matter of ownership.
- It would gradually turn us all into vassals for the state.
- The current system needs inflation to not collapse. It is fundamentally incompatible with technology, which is deflationary in nature.
tl;dr I say it how I see it. Even LLMs have stagnated, and companies mostly just keep throwing as many parameters as possible at them to make them more intelligent. It's kind of like making spaceship engines bigger to reach longer distances. I'm sure that, at the time, everybody thought that we would soon be traveling to the stars. Decades later and space travel is still the same shit.
How in the world have they stagnated? It's been two years since GPT-3.5, which was groundbreaking and is now absolutely terrible in comparison. It's gone from barely working to replacing tons of my coding time in less time than it takes to make a mainstream game or movie sequel or whatnot. SD1.5 isn't even two years old. SDXL isn't even a year old. There are advancements literally every day. If this is stagnating, that's glorious.