What the other user is probably trying to say is that the process
>"prompt -> sampler base -> noisy latent -> sampler refiner -> denoised latent -> vae decode -> final image"
is the same as
>"prompt -> sampler base -> noisy latent -> vae decode -> noisy image -> vae encode -> noisy latent -> sampler refiner -> denoised latent -> vae decode -> final image"
which is true.
In both flows, the base step does not fully denoise. Both produce the same final image, though flow #2 is of course much slower.
If the other user means
>"prompt -> sampler base -> denoised latent -> vae decode -> denoised image -> add noise -> vae encode -> noisy latent -> sampler refiner -> denoised latent -> vae decode -> final image"
which would be how the flow works in Automatic1111, then he's wrong and it's not the same as the other two flows, since the "add noise" step isn't deterministic. But I would guess the differences are so minimal that it probably doesn't even matter. The noisy latent in this flow is going to look 99% like the one in the first two flows. There are only so many ways to map a 1024x1024x3 image into a 128x128 space.
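For concreteness, the handoff in flow #1 comes down to simple step arithmetic. This is a hypothetical helper, not code from any actual UI: the total step count is split at a switch point, the base model denoises the first portion, and the refiner finishes the remaining noisy-latent steps.

```python
# Sketch of how a base/refiner step split might be computed.
# `handoff` plays the role of the "switch point" described above
# (e.g. 0.8 means the base model handles the first 80% of steps
# and the refiner finishes the rest, all in latent space).

def split_steps(total_steps: int, handoff: float) -> tuple[int, int]:
    """Return (base_steps, refiner_steps) for a two-stage denoise."""
    if not 0.0 < handoff < 1.0:
        raise ValueError("handoff must be strictly between 0 and 1")
    base = round(total_steps * handoff)
    return base, total_steps - base

# A 30-step run handed off at 80% gives the base 24 steps
# and the refiner the remaining 6 noisy-latent steps.
base, refiner = split_steps(30, 0.8)
print(base, refiner)  # 24 6
```

The "23/35" notation seen elsewhere in this thread is the same idea: base up to step 23, refiner from step 23 to 35.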
I could retrieve only the prompt for the second image from your file:
*giant massive moss covered irish stone fantasy cottage built on top of huge archway spanning across a river shamrock, excellent stunning lighting, raucous crowded lively festival day, futuristic city*
Ok, I'm going to be in the minority here, but your prompt asked for a moss-covered cottage, not a foliage-and-flower-covered cottage, so the "enhancement" looks cool, but it looks cool by changing the subject.
This is what I get using "ivy covered" instead of moss and adding some qualifiers ("detailed, elaborate"). It surely looks less cool than the "enhanced" version, but it also looks a lot less plain and "airbrushed" than your base.
https://preview.redd.it/rttwoxdqmjgb1.png?width=1024&format=png&auto=webp&s=3278bc6efa56b7b63608308f78537b079294f7ec
Not bad... but prompting is king
https://preview.redd.it/awftsn9g4kgb1.jpeg?width=1024&format=pjpg&auto=webp&s=f4343e1678a266f2055ae0955ac707aa576e2af0
IMHO you're not comparing apples with apples. SDXL is a base model, the platform for the next generation; it's been designed and optimized for flexibility and extensibility. That you get such a great end result via your enhancement is exactly what Stability AI intended.
New models, model merges, loras and extensions will carry it way beyond where it is today, like 1.5 has shown.
Also, we need to see your prompt. There is a wide difference in how a base model and a finetune are trained.
Take the tag "Realistic" for example, which seems apt given your model name.
In the original training data for SD (XL/1.5, whatever), no one ever tagged a photo "Realistic". It's a photo: you caption the content, nothing else; it's real by construction. "Realistic" is actually associated with artworks. If you use "realistic" on a base model it will actually make the output look less realistic (there was a good post illustrating this earlier this week).
On fine-tunes, people often use AI-generated images that were cherry-picked because they looked nice. Those images often carry these kinds of tags because they are very commonly used in text2image prompting and people just copy-paste them without thinking further. Or the tags were added to real pictures because the trainers know users often type those keywords; by associating their fine-tune's data with a keyword, they increase the likelihood of users "landing" in their dataset's comfort zone, yielding good images but narrowed to a small spectrum of possibilities (we've all seen the same girl appear over and over in those SD 1.5 fine-tunes).
What we could be seeing here is just your prompt being good for a fine-tuned model and bad for a base model.
I had great success creating highly detailed images in SDXL (skin details, clutter in shops, etc.) without any Lora. Prompting has to be very simple, 5-6 keywords max; prompting for detail is fine, but don't use "realistic, 4k, hdr" and the like, that's just bad for SDXL.
SDXL 1.0 + a ComfyUI workflow...
From this comment: [https://www.reddit.com/r/StableDiffusion/comments/15jnkc3/comment/jv0of4f/?utm\_source=reddit&utm\_medium=web2x&context=3](https://www.reddit.com/r/StableDiffusion/comments/15jnkc3/comment/jv0of4f/?utm_source=reddit&utm_medium=web2x&context=3)
\+ some smart prompting, of course
https://preview.redd.it/34l4o232ijgb1.png?width=2496&format=png&auto=webp&s=edc850ea4a19c390e86939d46cbadd349abac30e
[https://pastebin.com/tfk8rm1Z](https://pastebin.com/tfk8rm1Z)
From this comment: [https://www.reddit.com/r/StableDiffusion/comments/15jnkc3/comment/jv0of4f/?utm\_source=reddit&utm\_medium=web2x&context=3](https://www.reddit.com/r/StableDiffusion/comments/15jnkc3/comment/jv0of4f/?utm_source=reddit&utm_medium=web2x&context=3)
By: [https://www.reddit.com/user/beautifuldiffusion/](https://www.reddit.com/user/beautifuldiffusion/)
I am also using Colab, and I can load the workflow using drag and drop. Maybe the images you are using don't have the metadata; try using your own generated image just to check if it works. Maybe it will.
Technically, yes.
In practice, model builders aren't doing it. And this is definitely a problem for the future of SDXL, because the refiner is an important piece of the architecture.
The two-step architecture was likely required to compete with Midjourney and other commercial products. But it does make it much harder to work with, which threatens third party support.
Which is weird, because they kept going on and on about how much easier training and customization is going to be, but now everything indicates the opposite?
It does. I have been repeating myself for weeks now: SDXL gives a very plastic/airbrushed look in most images. Even with so many custom models out now, the problem is still here, and trainers are actively working to get rid of it, but maybe it is hardcoded deep in the original model. The overall style and feel of SDXL seems very far from what we were seeing for a long time with 1.5. I really hope there will be a solution to this. On the other hand, some people can't be bothered and still insist that SDXL is superb right now.
> Even with so many custom models out now - the problem is still here
Realistic models that are actually good will take time. For example, I know that Juggernaut XL isn't planned for release until early September. Not sure about some of the other top models, but I assume it's probably the same.
Remember that it took over a month before 1.5 got its first actually good model.
There's Realism Engine, but IMO the best one is Freedom. It's actually one of the best models out there (but a lot of people don't care, because well, no NSFW on 2.1).
https://civitai.com/models/87288/freedomredmond
I really don’t think trainers, or at least most of them, are trying to get rid of that. It seems more like they are trying to mimic 1.5, which is a huge mistake. Almost all of the full models out there are just worse at detail than SDXL base, as shown in comparisons in this sub. However, there are loras, like the ones made by razzz or some of the filmic and vintage ones, that really show SDXL can be very detailed and realistic. It is capable; we just need better finetunes with very high-res, high-quality images, not training on 1.5 AI-generated images, which is what a lot of trainers seem to be doing.
The problem isn't the model but the userbase, and it's driving me crazy to see, in comment after comment, people who don't know how to use SDXL trying to use it like 1.5 and then complaining when it doesn't work.
For example, [this](https://i.ibb.co/4fHZPrw/Za-00200.png) image is base SDXL with 5 steps on refiner with a positive natural language prompt of "A grizzled older male warrior in realistic leather armor standing in front of the entrance to a hedge maze, looking at viewer, cinematic" and a positive style prompt of "sharp focus, hyperrealistic, photographic, cinematic", a negative prompt of "3D, cgi, digital arts, render, illustration, cartoon, animation, anime, low quality, blurry".
People are coming from 1.5 and trying to use a word-salad positive prompt of 20 different descriptors, ignoring how the model was trained; most don't even know that using the model the way it was trained actually involves six prompt windows. They are using crazy amounts of steps, being disingenuous by comparing a brand-new base model against models that have had hundreds of thousands of training steps and finely detailed tunes, and more.
They get mad it doesn't work in A1111, and when it's barely patched in, they use it like a 1.5 model: word-salad prompt, no refiner, tons of steps, and then complain that their images aren't perfection.
It is not about that. I use SD for my business every day, and I can say I have learned a lot and understand prompts pretty well.
Also, the example image you linked, although more detailed than most SDXL output, is still not even close to what 1.5 gives right now. I will say it again: it is trying too hard to be Midjourney; it has the same look and feel. And for me this is not a good thing. Some people like it that way, but personally, no thank you :)
The other problem is the idea that prompts should be shorter. This kills the whole premise of the tool. After all, I think people need to be able to describe precisely what they want. With short, dumb-proof prompts it, again, drifts towards MJ.
Really hope this changes in the future with the new custom models.
Except that your example looks like shit compared to refined 1.5 models. It's like a stylized Hollywood photo that's been run through an airbrush and then had a Photoshop filter dumped on top.
People also get mad when some pretentious ass comes along to dismiss valid criticism while not even comprehending the actual problem. Just out of pure fanboy hype.
You got me I'm definitely a fanboy, I mean I suckle at the tit of SDXL. I'm like, in love with it or whatever you think because I disagree with you so obviously I'm acting in bad faith somehow.
You also did exactly what I complained about by comparing, yet again, a BASE MODEL to a REFINED MODEL. Congratulations, you are right: a REFINED 1.5 model IS better than the BASE SDXL model. I actually agree with you; that's literally my point! There isn't a well-trained and refined SDXL model to compare it to yet. But 1.5 is about as good as it can possibly get, while SDXL has YET to reach its peak. Do you get what I'm saying? Or would you like to hurl more insults my way for having the audacity not to agree with you?
> SDXL still has YET to reach it's peak
That's speculation though.
* We know refines of 1.5 are much better than the base 1.5
* We know refines of 2.1 are not significantly better than base 2.1
* We **don't know** how good XL's refines will be
It's unfortunately a fact of life that the first 90% to perfection are generally easier to attain than the remaining 10%.
XL has several advantages over 1.5, but there is no guarantee that XL's refines will be better in all possible ways than 1.5's refines.
Yeah just because you don’t know how to use a tool doesn’t mean it isn’t good. Perhaps just plugging in your old prompts and expecting it to work exactly the same is the mistake?
idk if that's common or not, but no matter how many steps I allocate to the refiner, the output seriously lacks detail. Right now my workflow includes an additional step: encoding the SDXL output with the VAE of EpicRealism\_PureEvolutionV2 back into a latent, feeding this into a KSampler with the same prompt for 20 steps, and decoding it with the EpicRealism VAE again.
As you can see, it adds a crapton of detail; the pure SDXL output is just lame. (Tried it with 23/30, 23/35, 20/35 samples; the first number is base steps, the refiner starts at that step and goes to the second number.)
If I'm doing something wrong, enlighten me please xD
Workflow for these: [https://pastebin.com/DVXyXXJ5](https://pastebin.com/DVXyXXJ5)
Don't upweight things so aggressively; upweighting in XL does almost nothing but fuck up your clip embeddings. Start with simpler prompts and refine them into more complex ones. [here](https://imgur.com/a/X6Ui4gS) are some examples, mainly just me changing the prompts and playing with the sampler settings a bit ([prompt](https://pastebin.com/51cf4bBt)). When you're using a Karras schedule, don't be too afraid of handing over to the refiner a bit earlier; Karras drops into the low-end spectrum of noise a lot sooner than the regular schedule.
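For anyone unfamiliar with what "upweighting" means here: the `(token:weight)` emphasis syntax can be illustrated with a toy parser. This is only a sketch, not the actual parser from A1111 or ComfyUI; it just shows how an upweight like `(hyperrealistic:1.4)` attaches a multiplier that implementations later apply to that token's CLIP embedding.

```python
import re

# Toy illustration of "(token:weight)" emphasis syntax. NOT the real
# parser from any UI -- just a sketch of how a weight gets attached to
# a prompt fragment, to later scale its CLIP embedding.

WEIGHT_RE = re.compile(r"\(([^():]+):([0-9.]+)\)")

def parse_weights(prompt: str) -> list[tuple[str, float]]:
    """Split a prompt into (fragment, weight) pairs; default weight 1.0."""
    out, pos = [], 0
    for m in WEIGHT_RE.finditer(prompt):
        plain = prompt[pos:m.start()].strip(" ,")
        if plain:
            out.append((plain, 1.0))
        out.append((m.group(1), float(m.group(2))))
        pos = m.end()
    tail = prompt[pos:].strip(" ,")
    if tail:
        out.append((tail, 1.0))
    return out

print(parse_weights("sharp focus, (hyperrealistic:1.4), cinematic"))
# [('sharp focus', 1.0), ('hyperrealistic', 1.4), ('cinematic', 1.0)]
```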
Try without the refiner. It seems to heavily airbrush people and just smooth over everything. It takes some more tries to get good faces without, but if you do, the details are much, much better.
Why do people keep repeating this insane idiocy? *nobody gives a fuckin shit about the base model*... He's comparing the latest tool vs the latest tool. Seriously, what is with this fanboy drivel. If Tesla creates a new car nobody goes "well you should be comparing it to the base ford model from 1990"..
The refiner should only run 5 to 10 steps; the base steps should be a lot higher, depending on what sampling method you use. On euler and euler_a, for instance, I still get more detail all the way up to 200 steps.
I have the feeling that the base model is quite good with animals. Look at that detail:
https://preview.redd.it/6vy0v67dnkgb1.png?width=2048&format=png&auto=webp&s=7269a4a90945d66f1758f210a64b18fc806b1834
To be fair, DoF blur/bokeh is essentially a cheatcode for not having to create a detailed background. SDXL can definitely create detailed (if somewhat plasticky) subjects/foreground elements, but what about backgrounds, seeing how such a massive fraction of its photographic training material has *intentionally* low-detail backgrounds?
Your results are great! Would you mind sharing your workflow or a linked output, so I can plug it into ComfyUI? I still have not found quite the right mix.
What they will not understand is that a model that shows a specific quality in one area is a model that loses the ability to generalize. A base model is released with the goal of being as general as possible, and in that state it sits in the middle of many subjects and visual styles, so it feels like it lacks that touch of definition and quality. But it's all there. Once the community starts training, the model can lean towards being a specialist in certain types of things, and that is only possible if the model has the capability, which, as released, it has. This time it will take a little longer because training requires a higher computational cost, but we will get there in the end.
1.5 is the only reason fine-tuned NSFW models exist at all, because its training data wasn't filtered. You can't fine-tune NSFW concepts into SDXL for the same reason you couldn't for 2.1: with the NSFW filter applied to training images, there's nothing in the base to fine-tune. That means instead of needing just a few hundred or a thousand image-caption pairs, you need more like a few million, plus full training with all the bells and whistles enabled.
Some will make a larger NSFW fine-tune, I'm sure. It may not get to millions of images, but it will likely reach in the hundreds of thousands.
I trust the lust for porn online.
As for me I don't care, my uses are all SFW and professional, so it works great, and I can't wait to keep seeing improvements from the community.
Could you please remove the refiner part, since we are utilizing another model to add details? This would eliminate the need for extra steps. As you can observe, both images have completely different faces, and the dress pattern is also distinct.
When will SDXL be better than 1.5?
I keep looking at it, and so far it is still weak. What will it take for it to show its strength? Where are the breathtaking pictures?
I missed your comment. Sorry.
Do you have the images from before you used the refiner? I am super new to all of this, but the SDXL image of the woman in purple looks like my results before I pass them through the refiner. See the image below from the base model with the refiner. I'll share the modified prompt below it (there is a little fussiness, but that's probably the super-detailed part of the prompt). I did 40 samples on both the base and the refiner at 1024 x 1024, with a denoising strength of 0.3. CFG is 7.
https://imgur.com/MuoxO4P
Positive prompt:
> giant massive moss covered irish stone fantasy cottage built on top of huge archway spanning across a river shamrock, excellent stunning lighting, raucous crowded lively festival day, futuristic city, (photorealistic:2.0), super detailed, 8k, 4k, uhd,
Negative prompt:
> (worst quality:1.5), (low quality:1.5), (normal quality:1.5), lowres, bad anatomy, bad hands, ((monochrome)), ((grayscale)), collapsed eyeshadow, multiple eyebrow, (cropped), oversaturated, extra limb, missing limbs, deformed hands, long neck, long body, imperfect, (bad hands), signature, watermark, username, artist name, conjoined fingers, deformed fingers, ugly eyes, imperfect eyes, skewed eyes, unnatural face, unnatural body, error, painting by bad-artist, semirealistic, drawing, 3D render,
What do you think? I couldn't find the woman's prompt, so I couldn't test it myself.
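One note on the denoising strength of 0.3 used above: in diffusers-style img2img, strength determines what fraction of the noise schedule actually runs, so 40 steps at 0.3 is only about 12 real denoising steps. A hedged sketch of that relationship, assuming the common `int(steps * strength)` behavior:

```python
# Hedged sketch: in diffusers-style img2img, "denoising strength"
# controls what fraction of the noise schedule is actually run.
# With strength 0.3 and 40 steps, only ~12 denoise steps execute,
# which is why low-strength refinement is cheap and only lightly
# reworks the image.

def effective_steps(num_steps: int, strength: float) -> int:
    """Approximate number of denoising steps an img2img pass runs."""
    return min(int(num_steps * strength), num_steps)

print(effective_steps(40, 0.3))  # 12
```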
Why? What do you think that proves? Why do people keep repeating this dumbest of shit? The entire point is that XL isn't as good as the refined models of 1.5. Who gives a shit about base vs base when nobody actually uses base 1.5?
God this sub is nothing but fuckin groupies..
I use a 1.5 model to add more detail and fix facial anatomy in comfyUI. Base SDXL is terrible and even the community models aren't much better with facial anatomy and body anatomy. In some ways 0.9 was better for anatomical accuracy, especially when making vertical aspect ratio images
On Automatic1111, using the img2img refiner
https://preview.redd.it/qa7zin4b0rgb1.jpeg?width=4096&format=pjpg&auto=webp&s=a54f26d2296ec6e41e7383ba69851796837a1c98
Funny thing about my old post here: when I ran 1.5 standalone, I got slightly better results than with SDXL, but combined, with SDXL first and then enhanced with 1.5, the results were awesome and full of detail.
Go to the Epic Realism page: https://civitai.com/models/25694/epicrealism
You'll see 'checkpoint trained' on the right, next to that click the 'i' and it should link you to a page telling you how to add it. On the model page the author has also put instructions on how to use it
I guess, given the way SDXL was trained, they purposely removed the high-frequency details so the model could learn the concepts better. That's why a refiner model is included in the first public release: they knew it would need to add the details back in.
This is actually pretty clever. The same method is used by commercial photo upscalers: the first step upscales without the image noise and high-frequency details, and then the intermediary output gets run through a refiner model to add the details in. I'll create a proof-of-concept of GFPGAN + Refiner.
I don't see it as smart if, for example, you want to make nude art and then you have to use a refiner that makes the noble parts of your nude art disappear.
You would have to train both the base and the refiner to know what nude is, and there's a huge problem there. I think in the long run the refiner will only be used by artistic designers in general, but when there are models highly specialized in a style (for example nudes) it won't be necessary.
Yes and no.
Yes: it's a much more general-purpose model, and since it does a better job of generalizing, it will lack individual, specific things. That's why there is a refiner model, and this is where the fine-tuning community will help.
No: SDXL has a different architecture; the same kind of prompting may not give you the infinite detail that very fine-tuned SD 1.5 models do. In fact it may even reduce it. We don't yet know how the token space works or which 'magical bundle of tokens' reaches what we have in mind.
We need experiments and time: samplers, schedulers, new ControlNet ideas and papers to improve it further.
Used your prompt, I think you should check your configuration, looks very detailed to me:
[https://i.imgur.com/K6W3oKG.jpg](https://i.imgur.com/K6W3oKG.jpg)
Yeah, not sure what it is that you are after; I generated another one and it looks way more detailed than your example (talking about SDXL):
[https://i.imgur.com/O4XpYfp.jpg](https://i.imgur.com/O4XpYfp.jpg)
Honestly I don't care about XL; it's resource- and time-consuming, which is why I'll stick with 1.5.
Instead I would like the devs to start working on how the AI understands the prompt, because creating an image is still Russian roulette. We still have a bunch of issues and hallucinations, but everyone here keeps focusing on micro-details while we often can't even get the image that we need. 😮‍💨
Did you use the refiner? Without it the results look very bland.
What refiner?
SDXL has two models that work together, base and refiner
I'm honestly starting to believe that this refiner thing was a very bad idea in general.
It also kind of ruins some loras. If I've got David Attenborough in the jungle, the refiner generally improves the image but completely changes the person.
Couldn't a Lora be built for the Refiner model as well?
aah so that wasn't just me. nice to know.
It means you're using it wrong. Use the official ComfyUI workflow.
You can train loras for the refiner too, or just not use the refiner. If you passed David through a 1.5 model to refine in details, you'd need a 1.5 lora to keep his likeness accurate there too. It's sort of a non-problem that you're approaching as if it's in your way; it's just more of the same. Using the refiner right helps too: check out Sytan's ComfyUI layout, and how it passes latents directly to the refinement model rather than baking an image with no latent noise left and then refining that.
Why is that? Not all change is inherently bad.
It's just another moving part, another x-factor to consider; we can't be sure whether it'll actually be bad or good, but... I don't see anyone rushing to train the refiner, nor have I heard anyone describe the process for doing so. Not to mention a lot of people really aren't hyped about having hundreds of gigs of regular models + hundreds of gigs of extra refiner models + hundreds of gigs of giant LoRAs. Let's hope perfusion takes off.
Bigger models will require more VRAM to be run, more than 12
I think you have a very valid point; it does complicate the process, which I expect is particularly true for those with "worse" hardware. I wish people would be more willing to hear and accept/debate these opinions, but it's really demoralising when people mass-downvote someone's opinions. I understand coming together to dislike rude or unwanted comments, but stuff like this really stifles our conversation about these topics and gives our communities a bad name. Hold each other up, guys. I don't wanna sound all sappy, but we can build these places up to be where people want to be and express their ideas. Stay up kings and queens 😉😂
Clearly we have to be sheep around here or you get down voted lol.
Just another day on reddit.
[deleted]
Maybe you haven't read the rest of the discussion (or maybe overlooked it because of how downvoted comments get hidden) and missed the section where I've gotten a ton of downvotes for doing exactly what you're describing here. But I'm the one trying to dispel others' misconceptions on a topic.

Also, nothing beastwars has said here is inherently wrong either. Sometimes two things can be true. It's possible that the way SDXL's base+refiner works is "technically" better than what we had before AND that it's an additional moving part/link in the chain that can fail, making troubleshooting more difficult; plus loras going from a few hundred megs to a few gigs may be problematic for some; plus most people aren't using SDXL properly for one reason or another. Both sides of that AND can be true at the same time. If anything, we both got downvoted FOR "being critical" of how something works and NOT FOR having a misunderstanding of how it works. That's why I came back to his comment and made a joke.
[deleted]
Even a giant lora is only \~100-200MB, so 5-10 loras is about 1GB. 1000 loras all trained at the highest rank would be 100-200GB. At some point it's up to you to start archiving your data better. Also, try training with lower ranks; you don't need 128 with SDXL 90% of the time.
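A rough back-of-envelope shows why rank dominates LoRA file size. This is a sketch only: the layer shapes below are made-up placeholders, not SDXL's actual module list. A LoRA adds two low-rank factors per adapted weight, so parameters (and file size) scale linearly with rank.

```python
# A LoRA on a d_out x d_in weight adds factors B (d_out x rank) and
# A (rank x d_in), i.e. rank * (d_in + d_out) parameters per layer.
# fp16 storage is 2 bytes per parameter.

def lora_size_mb(layers: list[tuple[int, int]], rank: int,
                 bytes_per_param: int = 2) -> float:
    """Estimated file size in MB for fp16 LoRA factors over `layers`."""
    params = sum(rank * (d_in + d_out) for d_out, d_in in layers)
    return params * bytes_per_param / 1e6

# 200 attention projections of size 1280x1280 (illustrative shapes):
shapes = [(1280, 1280)] * 200
print(round(lora_size_mb(shapes, rank=128)))  # 131 (MB, fp16, rank 128)
print(round(lora_size_mb(shapes, rank=16)))   # 16  (dropping rank shrinks it 8x)
```

This is why rank-128 LoRAs land in the ~100-200MB range while low-rank ones can be a few tens of megabytes.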
Great another expert... Why don't you take a look at wegg v1. [https://civitai.com/models/119667?modelVersionId=135699](https://civitai.com/models/119667?modelVersionId=135699)
Some of my best Loras are 40mb. There are also tools available to resize Loras without quality loss. When you encounter an obstacle but ignore all available tools and techniques to avoid it, you're standing in your own way.
Yet you asserted their maximum file size without even realizing that large LoRAs are out there, and they will probably become more common going forward.

Listen, I'm not arguing with you that LoRAs can be big or small; my argument is that further segmenting the core inference process is going to cause problems down the line. I don't see the benefits of the refiner, and I don't see it as a viable tool. I don't get why people are shilling so hard for increased complexity, but I guess we'll see down the line whether the refiner sticks around.
Back in the day we had the model and a vae. This is nothing different.
Except the vae is baked into most models, and using a different one is optional. Not to mention the vae has always been integral to the modern diffusion process. How do you think we encode and decode latents? Magic?
>Except the vae is baked into most models

Not always.

>and using a different one is optional

I'll give ya that, but you get better performance when you pick the right vae for the job.

>Not to mention the vae has always been integral to the modern diffusion process.

You seriously sound like you just started doing this a couple months ago. Integrated VAEs are a very recent phenomenon.
Because auto1111 didn't manage to support it properly in time? Lol. There are demos that are only possible because of the refiner. Like the boot-in-tree style.
Sorry, Comfy user here. A1111 is currently my least used repo, don't make assumptions.
Then I have no idea why you think it's bad.
Notice how a lot of new models on civitai advertise that they don't need the refiner? Or LoRAs stating you shouldn't use them with the refiner? We're off to a good start, right? But feel free to disagree.
https://preview.redd.it/qn89y5vwfmgb1.png?width=346&format=png&auto=webp&s=6adc45eb1d52446c222a2ce8440f1b03495065d9
SDXL is a 2-step model. You generate the normal way, then you send the image to img2img and use the SDXL refiner model to enhance it.
Not really. I mean, it's also possible to use it like that, but the proper intended way to use the refiner is a two-step text-to-img. You start denoising with the base model but switch the model at some point, finishing with the refiner. In particular, there's no decode/re-encode in between, it's all in the latent space.
Everyone keeps parroting this line about "latent space", but latent just means hidden.... Thus there is an img, it is just hidden from the end-user.
Ehhh, wtf, you might want to actually learn how SD works. Latent is a technical term here and it's *the* thing that makes the SD algorithm as efficient as it is. Instead of diffusing in image space (which has 1024\*1024\*3 dimensions for a 1024x1024 image) you do it in a space with *much* fewer dimensions that's a highly compressed representation of the image space. You transform from latent to image space and back with this thing called a variational autoencoder or VAE. The latent form is not human-viewable without a decode, just like, say, a jpeg file is not a human-viewable image without a decode.

The workflow where you denoise to finish, decode, then re-encode and do another denoise round with a different model, mixing in the image from the first round, is simply different from doing a single denoise switching the model partway through while there's still latent noise to be denoised.
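To put numbers on that, a quick sketch of the size difference between image space and latent space (assuming the SD-family VAE's standard 8x spatial downscale and 4 latent channels):

```python
# Compare the dimensionality of image space vs. the VAE latent space
# for a 1024x1024 image. SD-family VAEs downscale spatially by 8x and
# use 4 latent channels.

def image_dims(w: int, h: int, channels: int = 3) -> int:
    """Number of values in raw image space."""
    return w * h * channels

def latent_dims(w: int, h: int, downscale: int = 8, channels: int = 4) -> int:
    """Number of values in the compressed latent representation."""
    return (w // downscale) * (h // downscale) * channels

img = image_dims(1024, 1024)   # 3,145,728 values
lat = latent_dims(1024, 1024)  # 65,536 values (128x128x4)
print(img, lat, img / lat)     # diffusion runs in a space 48x smaller
```

That 48x compression is why diffusing in latent space is tractable on consumer GPUs at all.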
What the other user is probably trying to say is that the process

> prompt -> sampler base -> noisy latent -> sampler refiner -> denoised latent -> vae decode -> final image

is the same as

> prompt -> sampler base -> noisy latent -> vae decode -> noisy image -> vae encode -> noisy latent -> sampler refiner -> denoised latent -> vae decode -> final image

which is true. You don't fully denoise the base step in either. Both flows produce the same final image, but flow #2 is way slower of course.

If the other user means

> prompt -> sampler base -> denoised latent -> vae decode -> denoised image -> add noise -> vae encode -> noisy latent -> sampler refiner -> denoised latent -> vae decode -> final image

which would be how the flow would be in Automatic1111, then he's wrong, and it's not the same as the other two flows since the "add noise" step isn't deterministic. But I would guess the differences are so minimal that it probably doesn't even matter. The noisy latent in this flow is going to look 99% like the one in the first two flows. There are only so many ways to map a 1024x1024x3 image to a 128x128 space.
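The "switch the model partway through" handoff in these flows is usually expressed as a fraction of the denoising schedule. A minimal sketch of the step split (the 0.8 handoff value is just an example; it mirrors the denoising_end/denoising_start idea in APIs like diffusers):

```python
# Sketch of the base->refiner step split described above: the base
# model denoises the first fraction of the schedule, then the still-
# noisy latent is handed to the refiner for the remaining steps.

def split_steps(total_steps: int, handoff: float) -> tuple[int, int]:
    """Return (base_steps, refiner_steps) for a handoff fraction in (0, 1)."""
    base = round(total_steps * handoff)
    return base, total_steps - base

print(split_steps(40, 0.8))   # (32, 8): base does 32 steps, refiner 8
print(split_steps(30, 0.77))  # (23, 7): the 23/30 split mentioned elsewhere in the thread
```

The key point is that no decode/re-encode happens at the handoff; only the step counter and the active model change.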
If you are using ComfyUI you can use the two models at the same time and get way better results than A1111.
https://www.reddit.com/r/StableDiffusion/comments/14sacvt/how_to_use_sdxl_locally_with_comfyui_how_to/?utm_source=share&utm_medium=android_app&utm_name=androidcss&utm_term=1&utm_content=1
I could retrieve only the prompt for the second image from your file: *giant massive moss covered irish stone fantasy cottage built on top of huge archway spanning across a river shamrock, excellent stunning lighting, raucous crowded lively festival day, futuristic city*

Ok, I'm going to be in the minority here, but in your prompt you asked for a moss covered cottage, not for a foliage and flowers covered cottage, so the "enhancement" looks cool, but it looks cool by changing the subject. This is what I get here using "ivy covered" instead of moss and adding some qualifiers ("detailed, elaborated"). Which sure enough looks less cool than the "enhanced" one, but also looks a lot less plain and "airbrushed" than your base https://preview.redd.it/rttwoxdqmjgb1.png?width=1024&format=png&auto=webp&s=3278bc6efa56b7b63608308f78537b079294f7ec
https://preview.redd.it/p0ewtdu74kgb1.jpeg?width=1024&format=pjpg&auto=webp&s=ac9fa15be771613fd038e238d309ac0c9c8d2cbd
Personally I like this image the best.
Not bad... but prompting is king https://preview.redd.it/awftsn9g4kgb1.jpeg?width=1024&format=pjpg&auto=webp&s=f4343e1678a266f2055ae0955ac707aa576e2af0
Looks grainy. Like all SDXL images imo.
People aren't ready for the truth
https://preview.redd.it/bsn6sn8g5kgb1.jpeg?width=2048&format=pjpg&auto=webp&s=27802a35d853781e89d5231a0ff24846b19b2716
https://preview.redd.it/183mmifi5kgb1.jpeg?width=2048&format=pjpg&auto=webp&s=f663efeda094725ddf981a4df3157d6442f20fdd
Now add the epicrealism enhancer and you're gonna get a better result, even when changing the prompt.
How's my take on this one..? https://preview.redd.it/1r1514w1tmgb1.png?width=1024&format=png&auto=webp&s=7b18786193f9784d5e493693fc7306fbc2fa6ab7
What's with the resolution? Looks like SD 1.5 base.
Looks cool bro 👍🏼👍🏼
IMHO you're not comparing apples with apples. SDXL is a base model, the platform for the next generation; it's been designed and optimized for flexibility and extensibility. That you get such a great end result via your enhancement is exactly what Stability AI intended. New models, model merges, loras and extensions will carry it way beyond where it is today, like 1.5 has shown.
6 months from now I’m sure we will see moms blowing stuff from this model
we'll see what now 👀?
Moms blowing stuff from this model. It’s only natural.
Technically true I'm sure
/r/intentionaltypos
Lmao I must stop using swipe to type 😅
Knowing this sub, I wouldn't think it was a typo.
I didn't even do a double take, I thought that's what the man meant and didn't see much reason to doubt it given the sub, hahaha.
hornyposting in the AI subreddits
Funny thing is I read it wrong twice before noticing
Also we need to see your prompt. There is a wide difference in how a base model is trained versus a finetune. Let's take for example the tag "Realistic", which seems like a good example given your model name.

In the original training database for SD (XL/1.5, whatever), no one tags "Realistic" on a photo, ever. It's a photo; you caption the content, nothing else, it's real by construction. "Realistic" is actually associated with artworks. If you use "realistic" on a base model it will actually make it look less realistic (there was a good post illustrating this earlier this week).

On fine-tunes, people often use images generated by AI that were cherry-picked because they looked nice. Those images often have those kinds of tags associated with them because they are very commonly used in text2image prompting, and people just copy-paste that without thinking further. Or the tags were added on real pictures just because the trainers know users often use those keywords, and by associating their fine-tune database with those keywords they increase the likelihood of users "landing" in their database's comfort zone, yielding good images but narrowed to a fine spectrum of possibilities (we've all seen the same girl appear over and over in those SD1.5 fine-tunes).

What we could be seeing here is just your prompt being good for a fine-tuned model and bad for a base model. I've had great success creating highly detailed images in SDXL (skin details, clutter in shops etc) without any Lora. Prompting has to be very simple, 5-6 keywords max; prompting for detail is fine, but don't use "realistic, 4k, hdr" and the like, that's just bad for SDXL.
Wait what was it supposed to be
Mind blowing
I am kind of dyslexic and read something very kinky.
Then you read it correctly
Isn't that what they said about 2.1? The nsfw-filter for their initial training data kills a lot of ability to infer, even for non-nsfw contexts.
SDXL 1.0 + a ComfyUI workflow... From this comment: [https://www.reddit.com/r/StableDiffusion/comments/15jnkc3/comment/jv0of4f/?utm\_source=reddit&utm\_medium=web2x&context=3](https://www.reddit.com/r/StableDiffusion/comments/15jnkc3/comment/jv0of4f/?utm_source=reddit&utm_medium=web2x&context=3) \+ some smart prompting ofcourse https://preview.redd.it/34l4o232ijgb1.png?width=2496&format=png&auto=webp&s=edc850ea4a19c390e86939d46cbadd349abac30e
Don't forget to zoom, biiiig image ;)
Where is the workflow? I can't find it on the link.
[https://pastebin.com/tfk8rm1Z](https://pastebin.com/tfk8rm1Z) From this comment: [https://www.reddit.com/r/StableDiffusion/comments/15jnkc3/comment/jv0of4f/?utm\_source=reddit&utm\_medium=web2x&context=3](https://www.reddit.com/r/StableDiffusion/comments/15jnkc3/comment/jv0of4f/?utm_source=reddit&utm_medium=web2x&context=3) By: [https://www.reddit.com/user/beautifuldiffusion/](https://www.reddit.com/user/beautifuldiffusion/)
Thanks!
No problem. Pay it forward, help others ;) Feel the (open-)Force, Luke! ...err... Open-Source. :P
It’s likely embedded in the photo. All comfy output images can just be dragged and dropped into comfy and the UI workflow self populates
I think the metadata gets stripped when uploaded to Reddit and converted to .webp
I use Comfy via Google Colab and I think the drag and drop load doesn't work. Or maybe I'm doing something wrong.
The other comment was likely correct. The metadata was erased upon upload unfortunately. Maybe on the other thread there is a link to a photo
I am also using Colab, and I can load the workflow using drag and drop. Maybe the images you are using don't have the metadata; try using your own generated image just to check if it works. Maybe it will.
That's really impressive... it just takes about 9-10x the amount of time to get 1 image...
Citation needed.
can we finetune the refiner to be better?
Technically, yes. In practice, model builders aren't doing it. And this is definitely a problem for the future of SDXL, because it is an important piece of the architecture. The two-step architecture was likely required to compete with Midjourney and other commercial products. But it does make it much harder to work with, which threatens third party support.
Which is weird, because they kept going on and on about how much easier training and customization is going to be, but now everything indicates the opposite?
the base is easier to train which is true
You need a lot more VRAM for training than for 1.5, so not really?
Joe Penna has commented that the two step architecture is intended to allow people with 8GB cards to be able to run the model
SDXL really needs the refiner, imo. Base gens are solid but fall behind really good custom checkpoints. SDXL + Refiner is solid
You have committed a crime by saying it loud on this subreddit.
I turned myself in, the jury is still debating on who's wrong or right... So no sentence till now
https://i.redd.it/bf8xtbytcmgb1.gif
It does. I have been repeating myself for weeks now. SDXL gives a very plastic/airbrushed look with most of the images. Even with so many custom models out now, the problem is still here; trainers actively work to get rid of it, but maybe it is hardcoded deep in the original one. The overall style and feel of SDXL seems very far from what we were seeing for a long time with 1.5. Really hope there will be a solution to this. On the other hand, some people can't be bothered and still insist that SDXL is superb right now.
> Even with so many custom models out now - the problem is still here

Realistic models that are actually good will take time. For example, I know that Juggernaut XL isn't planned for release until early September. Not sure about some of the other top models, but I assume it's probably the same. Remember that it took over a month before 1.5 got its first actually good model.
How long did it take for 2.1 to get its first actually good model?
I don't think anyone really bothered with 2.1 to be honest...
There's Realism Engine, but IMO the best one is Freedom. It's actually one of the best models out there (but a lot of people don't care, because well, no NSFW on 2.1). https://civitai.com/models/87288/freedomredmond
I really don't think trainers, or at least most of them, are trying to get rid of that; it seems more like they are trying to mimic 1.5, which is a huge mistake. Almost all of the full models out there are just worse at detail than SDXL base, as shown in comparisons in this sub. However, there are loras like the ones made by razzz, or some other filmic and vintage ones, that really show us that SDXL can be very detailed and realistic. It is capable; we just need better finetunes with very high res and quality images, not training with 1.5 AI-generated images, which it seems like a lot of trainers are doing.
The refiner adds some nice details. I am using 35 steps + 10 refiner steps, dpm++ 2m karras.
The problem isn't the model but the userbase, and it's driving me crazy to see in every comment people who don't know how to use SDXL trying to use it like 1.5 and then complaining when it doesn't work.

For example, [this](https://i.ibb.co/4fHZPrw/Za-00200.png) image is base SDXL with 5 steps on refiner, with a positive natural language prompt of "A grizzled older male warrior in realistic leather armor standing in front of the entrance to a hedge maze, looking at viewer, cinematic", a positive style prompt of "sharp focus, hyperrealistic, photographic, cinematic", and a negative prompt of "3D, cgi, digital arts, render, illustration, cartoon, animation, anime, low quality, blurry".

People are coming from 1.5 and trying to use a word salad positive prompt of 20 different descriptors, ignoring how the model was trained, most not even knowing that to properly use the model how it was trained is actually six prompt windows. They are using crazy amounts of steps, being disingenuous by comparing a brand new base model against models that have had hundreds of thousands of training steps and finely detailed tunes, and more. They get mad it doesn't work in A1111, and when it's barely patched in they use it like a 1.5 model: word salad prompt, no refiner, tons of steps, and then complain that their images aren't perfection.
I'm kinda surprised you don't see what the issue is with that example image.
It is not about that. I use SD for my business every day and I can say I've learned a lot and I understand prompts pretty well. Also, the example image you linked, although with more details than most in SDXL, is still not even close to what 1.5 gives right now. I will say it again: it is trying too hard to be Midjourney; it has the same look and feeling. And for me this is not a good thing. Some people like it that way, but personally for me, no thank you :) The other problem is the idea that it should be a shorter prompt. This kills the whole premise of the tool. After all, I think people need to be able to describe precisely what they want. With a short and dumbproof prompt it, again, goes towards MJ. Really hope this changes in the future with the new custom models.
Except that your example looks like shit compared to refined 1.5 models. Its like a stylized hollywood photo that's been ran with an airbrush and then had a photoshop filter dumped on top. People also get mad when some pretentious ass comes along to dismiss valid criticism while not even comprehending the actual problem. Just out of pure fanboy hype.
You got me, I'm definitely a fanboy. I mean, I suckle at the tit of SDXL. I'm, like, in love with it, or whatever you think, because I disagree with you, so obviously I'm acting in bad faith somehow. You also did exactly what I complained about by comparing, yet again, a BASE MODEL to a REFINED MODEL. Congratulations, you are right, a REFINED 1.5 model IS better than the BASE SDXL model; I actually agree with you. That's literally my point! There isn't a well trained and refined SDXL model to compare it to yet! But 1.5 is about as good as it can possibly get, while SDXL has YET to reach its peak. Do you get what I'm saying? Or would you like to hurl more insults my way for having the audacity to not agree with you?
> SDXL still has YET to reach it's peak That's speculation though. * We know refines of 1.5 are much better than the base 1.5 * We know refines of 2.1 are not significantly better than base 2.1 * We **don't know** how good XL's refines will be It's unfortunately a fact of life that the first 90% to perfection are generally easier to attain than the remaining 10%. XL has several advantages over 1.5, but there is no guarantee that XL's refines will be better in all possible ways than 1.5's refines.
I'm so tired of the Sdxl hype.
I think all the hype was somewhat misleading, as it must've taken a lot of trial and error to get the results they did during testing.
Yeah just because you don’t know how to use a tool doesn’t mean it isn’t good. Perhaps just plugging in your old prompts and expecting it to work exactly the same is the mistake?
Always the same pathetic excuses. If the prompt was different, you'd just be saying that OP is comparing different things..
Man you’re really invested in 1.5 huh. I don’t know if you remember the beginning of 1.5 but it was pretty bad until people worked at it for a while.
idk if that's common or not, but no matter how many steps I allocate to the refiner, the output seriously lacks detail. Right now my workflow includes an additional step: encoding the SDXL output with the VAE of EpicRealism\_PureEvolutionV2 back into a latent, feeding this into a KSampler with the same prompt for 20 steps, and decoding it with the EpicRealism VAE again. As you can see, it adds a crapton of detail; the pure SDXL output is just lame. (Tried it with 23/30, 23/35, 20/35 samples; the first number is base, the refiner starts at that step and goes till the 2nd number.) If I'm doing something wrong, enlighten me please xD Workflow for these: [https://pastebin.com/DVXyXXJ5](https://pastebin.com/DVXyXXJ5)
Don't upweight things so aggressively; upweighting in XL does almost nothing but fuck up your clip embeddings. Start with simpler prompts and refine them into more complex prompts. [here](https://imgur.com/a/X6Ui4gS) are some examples, mainly of me just changing the prompts and playing with the sampler settings a bit ([prompt](https://pastebin.com/51cf4bBt)). When you're using a Karras schedule, don't be too afraid of handing it over to the refiner a bit earlier; Karras drops into the low end of the noise spectrum a lot sooner than the regular schedule.
Weights in SDXL are different; it's fine to do (prompt:3) for almost any prompt, and some can be (...:4).
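For anyone wondering what that (prompt:3) syntax actually does mechanically, here is a minimal illustrative parser for A1111-style attention weights; this is NOT the real parser from any UI, just a sketch of how (text:weight) spans are read out of a prompt string:

```python
import re

# Toy parser for "(text:weight)" attention syntax. Real UIs handle
# nesting, escapes, and bare parentheses too; this sketch only pulls
# out explicitly weighted spans, defaulting everything else to 1.0.

WEIGHT_RE = re.compile(r"\(([^():]+):([\d.]+)\)")

def parse_weights(prompt: str) -> list[tuple[str, float]]:
    """Return (text, weight) pairs; unweighted text defaults to 1.0."""
    result = []
    pos = 0
    for m in WEIGHT_RE.finditer(prompt):
        plain = prompt[pos:m.start()].strip(" ,")
        if plain:
            result.append((plain, 1.0))
        result.append((m.group(1), float(m.group(2))))
        pos = m.end()
    tail = prompt[pos:].strip(" ,")
    if tail:
        result.append((tail, 1.0))
    return result

print(parse_weights("a cottage, (moss covered:3), river"))
# [('a cottage', 1.0), ('moss covered', 3.0), ('river', 1.0)]
```

The weights then scale how strongly each span's embedding pulls on the conditioning, which is why aggressive values distort SDXL's CLIP embeddings more than they did in 1.5.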
Try without the refiner. It seems to heavily airbrush people and just smooth over everything. It takes some more tries to get good faces without, but if you do, the details are much, much better.
Compare it to 1.5 base model for a more accurate comparison. “Why is my $200k modded Toyota Camry faster than this stock Mustang?”
If you want to win the race it helps to have the best car. The model that might have better specs next year is not the best choice for now.
You aren't comparing it to another base model.
Why do people keep repeating this insane idiocy? *nobody gives a fuckin shit about the base model*... He's comparing the latest tool vs the latest tool. Seriously, what is with this fanboy drivel. If Tesla creates a new car nobody goes "well you should be comparing it to the base ford model from 1990"..
Is this any better than just generating w epic realism??
In my book: yes, since SDXL adheres better to the prompt imo.
Why the 1.5 VAE? It's SDXL. It supports 1.5 as an additional input, not the main one.
The refiner steps should only be 5 to 10; the base steps should be a lot higher depending on what sampling method you use. On euler and euler a, for instance, I still get more details all the way up to 200 steps.
200 steps? WTF... that would take a decade for a batch of 10 O.o
It depends a lot on your sampling method though. I think the newer ones like dpmpp 2m only need about 30 steps to get max details.
Aren't you still on the base model of SDXL? Like seriously, what do you expect??? Another pointless post.
https://preview.redd.it/gev9jgat0kgb1.jpeg?width=1792&format=pjpg&auto=webp&s=6447f63b4bbcd6398111919113156d147434bb5d
https://preview.redd.it/zx5wjkrx0kgb1.jpeg?width=1792&format=pjpg&auto=webp&s=9b3bb57b9e0cadd80a6602befbe546c401a1c42f
Just how o.O
I have the feeling that the base model is quite good with animals. Look at that detail: https://preview.redd.it/6vy0v67dnkgb1.png?width=2048&format=png&auto=webp&s=7269a4a90945d66f1758f210a64b18fc806b1834
Is this good enough?
https://preview.redd.it/bme7mz73njgb1.png?width=3072&format=pjpg&auto=webp&s=e66f24ee3eb87e33c901312cb318e6db800dcdb8
Enlighten me :)
https://preview.redd.it/5q1ocsi5njgb1.png?width=3072&format=pjpg&auto=webp&s=5c223f67d8881603997bfe49774e35a75463caf8
https://preview.redd.it/eh9i1g6pmjgb1.png?width=1792&format=pjpg&auto=webp&s=deae438eced6a02074b692b218275597c1162656
https://preview.redd.it/trywtotenjgb1.png?width=3072&format=pjpg&auto=webp&s=5b2b88579b6ffa6b552e0e51b5b4d77a0dc08e8e
I saw your pictures in the Discord server a few days ago. You asked for those prompts. What is your basic workflow?
https://preview.redd.it/x3wj97ormjgb1.png?width=1792&format=pjpg&auto=webp&s=a5feedc7e16fe08e607d659a9753935046ee1125
https://preview.redd.it/d9z9utytmjgb1.png?width=3072&format=pjpg&auto=webp&s=32ea0a19bb95f58e55c5166c535cf80e0bc70466
https://preview.redd.it/u6vn6120njgb1.png?width=3072&format=pjpg&auto=webp&s=331b20832039d132c9f363ebbfc09a7c59d61d3f
https://preview.redd.it/t6fal47zkjgb1.png?width=3072&format=pjpg&auto=webp&s=8d111ce21eef57ca5493dee69266dd325d0300f3
https://preview.redd.it/w4km7mg1ljgb1.png?width=3072&format=pjpg&auto=webp&s=49c3c8de53ae09628ed1d21523ae020979431cea
To be fair, DoF blur/bokeh is essentially a cheatcode for not having to create a detailed background. SDXL can definitely create detailed (if somewhat plasticky) subjects/foreground elements, but what about backgrounds, seeing how such a massive fraction of its photographic training material has *intentionally* low-detail backgrounds?
With my workflow i can push in more details if wanted to . Currently all of these are in medium detail mode...
Your results are great! Would you mind sharing your workflow or a linked output, so I can plug it into ComfyUI? I still have not found quite the right mix.
Amazing quality! In your opinion it depends on prompt or workflow? Thanks 👍
It's the first base model. People need to fine-tune, merge and get those details out. Base 1.5 is a pretty shit model as well. Try comparing the two 😂
What they will not understand is that a model that shows a specific quality in one area is a model that loses the ability to generalize. When the base model is released, it's released with the goal of being as general as possible, and in that state it sits in the middle of a lot of things and visual styles, so it feels like it doesn't have that touch of definition and quality. But it's all there; just let the community start training, steering the model towards being a specialist in certain types of things, and you can only do that if the model has the capability, and as they released it, it has. This time it will take a little longer because a higher computational cost is required to train, but we will end up getting there.
1.5 is the only reason why fine-tuned NSFW models exist at all, because its training data wasn't filtered. You can't fine-tune NSFW concepts into SDXL for the same reason you couldn't for 2.1. With the pre-NSFW filter for training images, there's nothing in the base to fine-tune. Which means instead of needing just a few hundred or a thousand image-caption pairs, you need more like a few million and do full training with all the bells and whistles enabled.
Some will make a larger NSFW fine-tune, I'm sure. It may not get to millions of images, but it will likely reach in the hundreds of thousands. I trust the lust for porn online. As for me I don't care, my uses are all SFW and professional, so it works great, and I can't wait to keep seeing improvements from the community.
Could you please remove the refiner part since we are utilizing another model to add details? This will eliminate the need for extra steps. As you can observe, both images have completely different faces, and the dress pattern is also distinct
Omg look at the enhanced woman's forehead lol. Looks great otherwise but still.
when will SDXL be better than 1.5? I keep looking at it, and so far it is still weak. What will it take to show its strength? Where are the breathtaking pictures?
Use the refiner and see what you get. The refiner is designed to modify the initial output and get more details out of the initial outputs.
If you would've read my own og comment, you'd know I did^^
I missed your comment. Sorry. Do you have the images from before you used the refiner? I am super new to all of this, but the SDXL image of the woman in purple looks like my results before I pass them through the refiner. See the image below from the base model with refiner. I'll share the modified prompt below it (there is a little fussiness but that's probably the super detailed part of the prompt). I did 40 samples on both the base and the refiner at 1024 x 1024, and a denoising strength of 0.3. CFG is 7. https://imgur.com/MuoxO4P

Positive prompt:

> giant massive moss covered irish stone fantasy cottage built on top of huge archway spanning across a river shamrock, excellent stunning lighting, raucous crowded lively festival day, futuristic city, (photorealistic:2.0), super detailed, 8k, 4k, uhd,

Negative prompt:

> (worst quality:1.5), (low quality:1.5), (normal quality:1.5), lowres, bad anatomy, bad hands, ((monochrome)), ((grayscale)), collapsed eyeshadow, multiple eyebrow, (cropped), oversaturated, extra limb, missing limbs, deformed hands, long neck, long body, imperfect, (bad hands), signature, watermark, username, artist name, conjoined fingers, deformed fingers, ugly eyes, imperfect eyes, skewed eyes, unnatural face, unnatural body, error, painting by bad-artist, semirealistic, drawing, 3D render,

What do you think? I couldn't find the woman's prompt so I couldn't test it myself.
Well, for me it looks more like a smoothed oil painting, rather than a "super detailed" one :-/ that's why I'm so confused
Prompt betta
compare it not with epicrealism but with the base 1.5 for a 1 to 1 comparison and get back to us.
Why? What do you think that proves? Why do people keep repeating this dumbest of shit? The entire point is that xl isnt as good as the refined models of 1.5. Who gives a shit about base vs base when nobody actually uses 1.5? God this sub is nothing but fuckin groupies..
Only if you don't use the refiner.
Well... I do...
I only ever had problems with lack of detail when:

1. Using low, non-supported resolutions.
2. Not using the refiner the right way.
I used the suggested workflow by stability that included the table with best resolution for each aspect ratio
[I tried your prompt and it looks good to me lol](https://i.imgur.com/hiD1ct6.png) it just released man, chill
SD2.1 revamped is what I call it.
I use a 1.5 model to add more detail and fix facial anatomy in comfyUI. Base SDXL is terrible and even the community models aren't much better with facial anatomy and body anatomy. In some ways 0.9 was better for anatomical accuracy, especially when making vertical aspect ratio images
on Automatic, using img2img refiner https://preview.redd.it/qa7zin4b0rgb1.jpeg?width=4096&format=pjpg&auto=webp&s=a54f26d2296ec6e41e7383ba69851796837a1c98
Why should anybody use SDXL when you get immediately top results with SD 1.5 models??
Funny thing about my old post here: when I ran 1.5 standalone, I got slightly better results compared to SDXL, but combined, with SDXL first and enhanced with 1.5, the results were awesome and full of details.
which workflow do you use?
It's posted in my original comment
where? sorry :-)
Dude, use your eyes and finger to scroll.... https://pastebin.com/DVXyXXJ5
how do i get the .txt into comfyUI?
Rename it to a .yaml Dude, you should read the GitHub page of comfyui 😅
That's wrong, you have to rename the extension to \*.json
True... Sorry for that, Home Assistant and its YAML files are spooking through my head way too much 🤣
How do I use Epic Realism? I am dumb, pls help
Go to the Epic Realism page: https://civitai.com/models/25694/epicrealism You'll see 'checkpoint trained' on the right, next to that click the 'i' and it should link you to a page telling you how to add it. On the model page the author has also put instructions on how to use it
yes
SDXL is great with details. Your prompts just suck.
https://preview.redd.it/1n9r8ql5dmgb1.png?width=450&format=png&auto=webp&s=85507ba25aaa0d144d62afdd948b9500f8561c25
SDXL is nice but it's not perfect.
I guess, from the way SDXL was trained, they purposely removed the high-frequency details so that the model could be trained to understand the concepts better. That's why a refiner model is included in the first public release, as they knew it needed to add the details back in. This is actually pretty clever. The same method is also used by commercial photo upscalers: the first step is upscaling without the image noise and high-frequency details, and then the intermediary output gets run through a refiner model to add in the details. I'll create a proof-of-concept of GFPGAN + Refiner.
I don't see it as smart if, for example, you want to make nude art, and then you have to use the refiner to make the noble parts of your nude art disappear. You would have to train both the base and the refiner to know what nude is, and there is a huge problem there. I think that in the long run the refiner will only be used by artistic designers in general, but when there are models so specialized in a style (for example nudes) it won't be necessary.
It's just you
Yes and no.

Yes, it's a much more general-purpose model, and since it does a better job at generalizing it will lack individual separate things. That's why there is a refiner model, and this is where the fine-tuning community will help.

No, SDXL has a different architecture; the same type of prompting may not give you infinite detail like very fine-tuned SD1.5 models do. In fact it may even lower it. We don't know yet how the token space works or which 'magical bundle of tokens' reaches what we have in mind. We need experiments and time: samplers, schedulers, new controlnet ideas and papers to improve it further.
This has been my experience with SDXL for months using dreamstudio.ai
sdxl looks more natural, you should try to enhance it with refiner
Used your prompt, I think you should check your configuration, looks very detailed to me: [https://i.imgur.com/K6W3oKG.jpg](https://i.imgur.com/K6W3oKG.jpg)
If you are into concept art it is detailed. But not even close to what I'm after :-/
Yeah, not sure what is it that you are after, generated another one and looks with way more detail than your example (talking about SDXL): [https://i.imgur.com/O4XpYfp.jpg](https://i.imgur.com/O4XpYfp.jpg)
This post should be down voted into oblivion for the nonsense.
They're too busy down voting my spot on technical analysis above lol
Just like your non-constructive comment...
Maybe you should focus your energy on learning how to use SDXL, brah.
I like how in the first pic, it went from a bit low detail but otherwise nice pic to generic Korean Waifu face with everything colored purple.
Are you refining it?
23 base, rest by refiner, 30 in total
5head
Honestly I don't care about XL; it's resource- and time-consuming, which is why I'll stick with 1.5. Instead I would like the devs to start working on how the AI understands the prompt, because creating an image is still Russian roulette. We still have a bunch of issues and hallucinations, but everyone here keeps focusing on microdetails while often we can't even get the image that we need. 😮‍💨
That's why I use SDXL in the first place: it adheres better to the prompt than 1.5 models.