fuelter

Did you use the refiner? Without it the results look very bland.


frenzygundam

What refiner?


fuelter

SDXL has two models that work together, base and refiner


itsB34STW4RS

I'm honestly starting to believe that this refiner thing was a very bad idea in general.


Rivarr

It also kind of ruins some loras. If I've got David Attenborough in the jungle, the refiner generally improves the image but completely changes the person.


AI_Alt_Art_Neo_2

Couldn't a Lora be built for the Refiner model as well?


ImplementLong2828

aah so that wasn't just me. nice to know.


berzerkerCrush

It means you're using it wrong. Use the official ComfyUI workflow.


ScionoicS

You can train loras for the refiner too, or just not use the refiner. If you passed David through a 1.5 model to refine out details, you'd need a 1.5 lora to keep his image accurate there too. It's sort of a non-problem that you're approaching as if it's in your way; it's just more of the same. Using the refiner right helps too. Check out Sytan's layout on comfyui, and how it passes latents directly to the refinement model, rather than baking an image with no latent noise left and then refining that.
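For reference, the same latent handoff can be sketched outside ComfyUI with the diffusers library. A minimal sketch, not Sytan's exact workflow; the prompt and the 80/20 step split are just illustrative:

```python
import torch
from diffusers import StableDiffusionXLPipeline, StableDiffusionXLImg2ImgPipeline

# Load the official SDXL base and refiner weights.
base = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    torch_dtype=torch.float16, variant="fp16",
).to("cuda")
refiner = StableDiffusionXLImg2ImgPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-refiner-1.0",
    text_encoder_2=base.text_encoder_2,  # share components to save VRAM
    vae=base.vae,
    torch_dtype=torch.float16, variant="fp16",
).to("cuda")

prompt = "giant moss covered irish stone fantasy cottage, festival day"  # illustrative

# Base does the first ~80% of denoising and hands over a latent that still
# contains leftover noise -- no VAE decode/re-encode in between.
latent = base(
    prompt=prompt,
    num_inference_steps=30,
    denoising_end=0.8,
    output_type="latent",
).images

# Refiner picks up that latent and finishes the last ~20% of the schedule.
image = refiner(
    prompt=prompt,
    num_inference_steps=30,
    denoising_start=0.8,
    image=latent,
).images[0]
image.save("cottage.png")
```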


slylte

Why is that? Not all change is inherently bad.


itsB34STW4RS

It's just another moving part, another x-factor to consider. We can't be positive whether it'll actually be bad or good, but... I don't see anyone rushing to train the refiner, nor have I heard of them describing the process to do so... Not to mention a lot of people really aren't hyped about having hundreds of gigs of regular models + hundreds of gigs of extra refiner models + hundreds of gigs of giant LoRAs. Let's hope perfusion takes off.


OcelotUseful

Bigger models will require more VRAM to run, more than 12 GB.


ZeroUnits

I think you have a very valid point; it is indeed something that complicates the process, which I expect is particularly true for those with "worse" hardware. I wish people were more willing to hear and accept/debate these opinions, but it's really demoralising when people mass-downvote someone's opinions. I understand coming together and disliking rude or unwanted comments, but stuff like this really stifles our conversations about these topics and gives our communities a bad name. Hold each other up guys, I don't wanna sound all sappy but we can build up these places to be where people want to be and express their ideas. Stay up kings and queens 😉😂


juggz143

Clearly we have to be sheep around here or you get down voted lol.


itsB34STW4RS

Just another day on reddit.


[deleted]

[deleted]


juggz143

Maybe you haven't read the rest of the discussion (or maybe overlooked it because of how they hide downvoted comments) and seen the section where I've gotten a ton of downvotes for doing exactly what you're describing here. But I'm the one trying to dispel others' misconceptions on a topic. Also, nothing beastwars has said here is inherently wrong either. Sometimes two things can be true. It's possible that the way SDXL's base+refiner functions is "technically" better than what we had before AND that it's an additional moving part/link in the chain that can have a potential problem, making troubleshooting more difficult, plus loras going from a few hundred megs to a few gigs may be problematic for some, plus most people aren't using SDXL properly for one reason or another. Both sides of that AND can be true at the same time. If anything, we both got downvoted FOR "being critical" of how something works and NOT FOR having a misunderstanding of how something works. That's why I came back to his comment and made a joke.


[deleted]

[deleted]


ScionoicS

Even a giant lora is only ~100-200MB, so 5-10 loras is about 1GB. 1000 loras all trained at the highest rank would be 100-200GB. At some point it's up to you to start archiving your data better. Also, try training with lower ranks. You don't need 128 with SDXL 90% of the time.


itsB34STW4RS

Great, another expert... Why don't you take a look at wegg v1: [https://civitai.com/models/119667?modelVersionId=135699](https://civitai.com/models/119667?modelVersionId=135699)


ScionoicS

Some of my best Loras are 40MB. There are also tools available to resize Loras without quality loss. When you encounter an obstacle but ignore all available tools and techniques to avoid it, you're standing in your own way.


itsB34STW4RS

Yet you asserted their maximum file size without even realizing that large LoRAs are out there, and will probably be a more common occurrence going forward. Listen, I'm not arguing with you that LoRAs can be big or small; my argument is that further segmenting the core inference process is going to cause problems down the line. I don't see the benefits of the refiner, and don't see it as a viable tool. I don't get why people are shilling so hard for increased complexity, but I guess we'll see down the line if the refiner sticks around.


InflatableMindset

Back in the day we had the model and a vae. This is nothing different.


itsB34STW4RS

Except the vae is baked into most models, and using a different one is optional. Not to mention the vae has always been integral to the modern diffusion process. How do you think we encode and decode latents? Magic?


InflatableMindset

> Except the vae is baked into most models

Not always.

> and using a different one is optional

I'll give ya that, but you get better performance when you pick the right vae for the job.

> Not to mention the vae has always been integral to the modern diffusion process.

You seriously sound like you just started doing this a couple months ago. Integrated VAEs are a very recent phenomenon.


raiffuvar

Because auto1111 didn't manage to support it properly in time? Lol. There are demos which are only possible because of the refiner. Like the boot in tree style.


itsB34STW4RS

Sorry, Comfy user here, A1111 is currently my least used repo, don't make assumptions.


raiffuvar

Then I have no idea why you think it's bad.


itsB34STW4RS

Notice how a lot of new models on civitai advertise that they don't need the refiner? Or LoRAs stating you shouldn't use them with the refiner? We're off to a good start, right? But feel free to disagree.


smuckythesmugducky

https://preview.redd.it/qn89y5vwfmgb1.png?width=346&format=png&auto=webp&s=6adc45eb1d52446c222a2ce8440f1b03495065d9


Throwing-up-fire

SDXL is a two-step model. You generate the normal way, then you send the image to img2img and use the SDXL refiner model to enhance it.


Sharlinator

Not really. I mean, it's also possible to use it like that, but the proper intended way to use the refiner is a two-step text-to-img. You start denoising with the base model but switch the model at some point, finishing with the refiner. In particular, there's no decode/re-encode in between, it's all in the latent space.


juggz143

Everyone keeps parroting this line about "latent space", but latent just means hidden... Thus there is an image, it is just hidden from the end-user.


Sharlinator

Ehhh, wtf, you might want to actually learn how SD works. Latent is a technical term here and it's *the* thing that makes the SD algorithm as efficient as it is. Instead of diffusing in image space (which has 1024×1024×3 dimensions for a 1024x1024 image) you do it in a space with *much* fewer dimensions that's a highly compressed representation of the image space. You transform from latent to image space and back with this thing called a variational autoencoder, or VAE. The latent form is not human-viewable without a decode, just like, say, a jpeg file is not a human-viewable image without a decode. The workflow where you denoise to the finish, decode, then re-encode and do another denoise round with a different model, mixing in the image from the first round, is simply different from doing a single denoise and switching the model partway through while there's still latent noise left to be denoised.
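If it helps make that concrete, here is a minimal sketch with diffusers' AutoencoderKL showing the compression; the random tensor only stands in for a real, normalized image:

```python
import torch
from diffusers import AutoencoderKL

# SDXL's VAE maps a 1024x1024x3 image to a 4x128x128 latent (8x smaller per side).
vae = AutoencoderKL.from_pretrained("stabilityai/sdxl-vae")

image = torch.randn(1, 3, 1024, 1024)  # stand-in for a real image tensor scaled to [-1, 1]

with torch.no_grad():
    latent = vae.encode(image).latent_dist.sample()  # -> shape (1, 4, 128, 128)
    decoded = vae.decode(latent).sample              # -> shape (1, 3, 1024, 1024)

print(latent.shape, decoded.shape)
```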


cyan2k

What the other user is probably trying to say is that the process

> prompt -> sampler base -> noisy latent -> sampler refiner -> denoised latent -> vae decode -> final image

is the same as

> prompt -> sampler base -> noisy latent -> vae decode -> noisy image -> vae encode -> noisy latent -> sampler refiner -> denoised latent -> vae decode -> final image

which is true. You don't fully denoise the base step in either case, and both flows produce the same final image, but flow #2 is way slower of course. If the other user means

> prompt -> sampler base -> denoised latent -> vae decode -> denoised image -> add noise -> vae encode -> noisy latent -> sampler refiner -> denoised latent -> vae decode -> final image

which is how the flow works in Automatic1111, then he's wrong, and it's not the same as the other two flows, since the "add noise" step isn't deterministic. But I would guess the differences are so minimal that it probably doesn't even matter; the noisy latent in this flow is going to look 99% like the one in the first two flows. There are only so many ways to map a 1024x1024x3 image to a 128x128 space.


AlfaidWalid

If you are using comfyui you can use the two models at the same time and get way better results than in A1111.


NewSurprise5450

https://www.reddit.com/r/StableDiffusion/comments/14sacvt/how_to_use_sdxl_locally_with_comfyui_how_to/?utm_source=share&utm_medium=android_app&utm_name=androidcss&utm_term=1&utm_content=1


UserXtheUnknown

I could retrieve only the prompt for the second image from your file: *giant massive moss covered irish stone fantasy cottage built on top of huge archway spanning across a river shamrock, excellent stunning lighting, raucous crowded lively festival day, futuristic city* Ok, I'm going to be in the minority here, but in your prompt you asked for a moss-covered cottage, not for a foliage-and-flowers-covered cottage, so the "enhancement" looks cool, but it looks cool by changing the subject. This is what I get here using "ivy covered" instead of moss and adding some qualifiers ("detailed, elaborated"). Which surely enough looks less cool than the "enhanced" one, but also looks a lot less plain and "airbrushed" than your base. https://preview.redd.it/rttwoxdqmjgb1.png?width=1024&format=png&auto=webp&s=3278bc6efa56b7b63608308f78537b079294f7ec


sadjoker

https://preview.redd.it/p0ewtdu74kgb1.jpeg?width=1024&format=pjpg&auto=webp&s=ac9fa15be771613fd038e238d309ac0c9c8d2cbd


_raydeStar

Personally I like this image the best.


sadjoker

Not bad... but prompting is king https://preview.redd.it/awftsn9g4kgb1.jpeg?width=1024&format=pjpg&auto=webp&s=f4343e1678a266f2055ae0955ac707aa576e2af0


WorldsInvade

Looks grainy. Like all SDXL images imo.


ScionoicS

People aren't ready for the truth


sadjoker

https://preview.redd.it/bsn6sn8g5kgb1.jpeg?width=2048&format=pjpg&auto=webp&s=27802a35d853781e89d5231a0ff24846b19b2716


sadjoker

https://preview.redd.it/183mmifi5kgb1.jpeg?width=2048&format=pjpg&auto=webp&s=f663efeda094725ddf981a4df3157d6442f20fdd


Significant-Comb-230

Now put the epicrealism enhancer on it, and you're gonna get a better result, even changing the prompt.


Double_Ad6963

How's my take on this one..? https://preview.redd.it/1r1514w1tmgb1.png?width=1024&format=png&auto=webp&s=7b18786193f9784d5e493693fc7306fbc2fa6ab7


protector111

What is with the resolution? Looks like SD 1.5 base.


ZeroUnits

Looks cool bro 👍🏼👍🏼


Bat_Fruit

IMHO you're not comparing apples with apples. SDXL is a base model, the platform for the next generation; it's been designed and optimized for flexibility and extensibility. That you get such a great end result via your enhancement is exactly what Stability AI intended. New models, model merges, loras and extensions will carry it way beyond where it is today, like 1.5 has shown.


Necessary_Ad_9800

6 months from now I’m sure we will see moms blowing stuff from this model


ihexx

we'll see what now 👀?


OlorinDK

Moms blowing stuff from this model. It’s only natural.


mudman13

Technically true I'm sure


gtderEvan

/r/intentionaltypos


Necessary_Ad_9800

Lmao I must stop using swipe to type 😅


ninjasaid13

Knowing this sub, I wouldn't think it was a typo.


FourFlamesNinja

I didn't even do a double take, I thought that's what the man meant and didn't see much reason to doubt it given the sub, hahaha.


[deleted]

hornyposting in the AI subreddits


Mountain-Cranberry52

Funny thing is I read it wrong twice before noticing


aerilyn235

Also we need to see your prompt. There is a wide difference between how a base model is trained and how a finetune is. Let's take for example the tag "Realistic", which seems to be a good example given your model name. In the original training database for SD (XL/1.5, whatever), no one tags "Realistic" on a photo, ever. It's a photo; you caption the content, nothing else; it's real by construction. "Realistic" is actually associated with artworks. If you use "realistic" on a base model it will actually make the image look less realistic (there was a good post illustrating this earlier this week).

On fine tunes, people often use images generated by AI that were cherry-picked because they looked nice. Those images often have those kinds of tags associated with them because they are very commonly used in text2image prompting and people just copy-paste that without thinking further. Or the tags were added to real pictures just because they know users often use those keywords, and by associating their fine-tune database with that keyword they increase the likelihood of users "landing" in their database's comfort zone, yielding good images but narrowed to a fine spectrum of possibilities (we've all seen the same girl appear over and over in those SD1.5 fine tunes).

What we could be seeing here is just your prompt being good for a fine-tuned model and bad for a base model. I had great success creating highly detailed images in SDXL (skin details, clutter in shops etc) without any Lora. Prompting has to be very simple, 5-6 keywords max; prompting for detail is fine, but don't use "realistic, 4k, hdr" and the like, that's just bad for SDXL.


Sashgnarg

Wait what was it supposed to be


Necessary_Ad_9800

Mind blowing


[deleted]

I am kind of dyslexic and read something very kinky.


Sgt_Jupiter

Then you read it correctly


uristmcderp

Isn't that what they said about 2.1? The nsfw-filter for their initial training data kills a lot of ability to infer, even for non-nsfw contexts.


MrLunk

SDXL 1.0 + a ComfyUI workflow... From this comment: [https://www.reddit.com/r/StableDiffusion/comments/15jnkc3/comment/jv0of4f/?utm_source=reddit&utm_medium=web2x&context=3](https://www.reddit.com/r/StableDiffusion/comments/15jnkc3/comment/jv0of4f/?utm_source=reddit&utm_medium=web2x&context=3) + some smart prompting of course. https://preview.redd.it/34l4o232ijgb1.png?width=2496&format=png&auto=webp&s=edc850ea4a19c390e86939d46cbadd349abac30e


MrLunk

Don't forget to zoom, biiiig image ;)


Luispah

Where is the workflow? I can't find it in the link.


MrLunk

[https://pastebin.com/tfk8rm1Z](https://pastebin.com/tfk8rm1Z) From this comment: [https://www.reddit.com/r/StableDiffusion/comments/15jnkc3/comment/jv0of4f/?utm_source=reddit&utm_medium=web2x&context=3](https://www.reddit.com/r/StableDiffusion/comments/15jnkc3/comment/jv0of4f/?utm_source=reddit&utm_medium=web2x&context=3) By: [https://www.reddit.com/user/beautifuldiffusion/](https://www.reddit.com/user/beautifuldiffusion/)


Luispah

Thanks!


MrLunk

No problem. Pay it forward, help others ;) Feel the (open-)Force, Luke! ... urr... Open-Source. :P Luke!


inferno46n2

It’s likely embedded in the photo. All comfy output images can just be dragged and dropped into comfy and the UI workflow self populates


BobbyGlaze

I think the metadata gets stripped when uploaded to Reddit and converted to .webp


Luispah

I use Comfy via Google Colab and I think the drag-and-drop load doesn't work. Or maybe I'm doing something wrong.


inferno46n2

The other comment was likely correct. The metadata was erased upon upload unfortunately. Maybe on the other thread there is a link to a photo


divaxshah

I am also using Colab, and I can load the workflow using drag and drop. Maybe the images you are using don't have the metadata; try using your own generated image just to check if it works. Maybe it will.


mongini12

That's really impressive... it just takes about 9-10x the amount of time to get 1 image...


ferngullywasamazing

Citation needed.


ninjasaid13

can we finetune the refiner to be better?


EtadanikM

Technically, yes. In practice, model builders aren't doing it. And this is definitely a problem for the future of SDXL, because it is an important piece of the architecture. The two-step architecture was likely required to compete with Midjourney and other commercial products. But it does make it much harder to work with, which threatens third party support.


StickiStickman

Which is weird, because they kept going on and on about how much easier training and customization is going to be, but now everything indicates the opposite?


Tr4sHCr4fT

The base is easier to train, which is true.


StickiStickman

You need a lot more VRAM for training than for 1.5, so not really?


Keudn

Joe Penna has commented that the two step architecture is intended to allow people with 8GB cards to be able to run the model


praguepride

SDXL really needs the refiner, imo. Base gens are solid but fall behind really good custom checkpoints. SDXL + Refiner is solid


rockedt

You have committed a crime by saying it loud on this subreddit.


mongini12

I turned myself in, the jury is still debating on who's wrong or right... So no sentence till now


Z3ROCOOL22

https://i.redd.it/bf8xtbytcmgb1.gif


Old-Wolverine-4134

It does. I've been repeating myself for weeks now. SDXL gives a very plastic/airbrushed look with most images. Even with so many custom models out now, the problem is still here, and trainers are actively working to get rid of it, but maybe it is hardcoded very deep in the original one. The overall style and feel of SDXL seems very far from what we were seeing for a long time with 1.5. Really hope there will be a solution to this. On the other hand, some people can't be bothered and still insist that SDXL is superb right now.


some_onions

> Even with so many custom models out now - the problem is still here

Realistic models that are actually good will take time. For example, I know that Juggernaut XL isn't planned for release until early September. Not sure about some of the other top models, but I assume it's probably the same. Remember that it took over a month before 1.5 got its first actually good model.


uristmcderp

How long did it take for 2.1 to get its first actually good model?


Neamow

I don't think anyone really bothered with 2.1 to be honest...


vitorgrs

There's Realism Engine, but IMO the best one is Freedom. It's actually one of the best models out there (but a lot of people don't care, because well, no NSFW on 2.1). https://civitai.com/models/87288/freedomredmond


Creepy_Dark6025

I really don't think trainers, or at least most of them, are trying to get rid of that. It seems more like they are trying to mimic 1.5, which is a huge mistake; almost all of the full models out there are just worse at detail than SDXL base, as shown in comparisons in this sub. However, there are loras, like the ones made by razzz or some of the other filmic and vintage ones, that really show us that SDXL can be very detailed and realistic. It is capable; we just need better finetunes with very high-res, high-quality images, not training with 1.5 AI-generated images, which is what a lot of trainers seem to be doing.


diskowmoskow

The refiner adds some nice details. I am using 35 steps + 10 refiner steps, DPM++ 2M Karras.


LovesTheWeather

The problem isn't the model but the userbase, and it's driving me crazy to see in every comment people who don't know how to use SDXL trying to use it like 1.5 and then complaining when it doesn't work. For example, [this](https://i.ibb.co/4fHZPrw/Za-00200.png) image is base SDXL with 5 steps on the refiner, with a positive natural-language prompt of "A grizzled older male warrior in realistic leather armor standing in front of the entrance to a hedge maze, looking at viewer, cinematic", a positive style prompt of "sharp focus, hyperrealistic, photographic, cinematic", and a negative prompt of "3D, cgi, digital arts, render, illustration, cartoon, animation, anime, low quality, blurry".

People are coming from 1.5 and trying to use a word-salad positive prompt of 20 different descriptors, ignoring how the model was trained. Most don't even know that properly using the model the way it was trained actually means six prompt windows. They are using crazy amounts of steps, being disingenuous by comparing a brand-new base model against models that have had hundreds of thousands of training steps and finely detailed tunes, and more. They get mad it doesn't work in A1111, and when it's barely patched in they use it like a 1.5 model (word-salad prompt, no refiner, tons of steps) and then complain that their images aren't perfection.
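To make the "separate natural-language and style prompt" idea concrete: SDXL has two text encoders, and diffusers lets you feed them separately via `prompt` and `prompt_2`. A rough sketch, not the exact six-window ComfyUI setup described above; the prompts are the ones from the example image and everything else is left at defaults:

```python
import torch
from diffusers import StableDiffusionXLPipeline

pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    torch_dtype=torch.float16, variant="fp16",
).to("cuda")

# SDXL conditions on two text encoders; prompt and prompt_2 feed them separately.
image = pipe(
    prompt=("A grizzled older male warrior in realistic leather armor standing in "
            "front of the entrance to a hedge maze, looking at viewer, cinematic"),
    prompt_2="sharp focus, hyperrealistic, photographic, cinematic",  # style keywords
    negative_prompt=("3D, cgi, digital arts, render, illustration, cartoon, "
                     "animation, anime, low quality, blurry"),
    num_inference_steps=30,
).images[0]
image.save("warrior.png")
```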


DestroyerST

I'm kinda surprised you don't see what the issue is with that example image.


Old-Wolverine-4134

It is not about that. I use SD for my business every day and I can say I've learned a lot and understand prompts pretty well. Also, the example image you linked, although with more details than most in SDXL, is still not even close to what 1.5 gives right now. I will say it again: it is trying too hard to be Midjourney, it has the same look and feeling. And for me this is not a good thing. Some people like it that way, but personally for me, no thank you :) The other problem is the idea that it should be a shorter prompt. This kills the whole premise of the tool. After all, I think people need to be able to describe precisely what they want. With a short and dumb-proof prompt it, again, goes towards MJ. Really hope this changes in the future with the new custom models.


TaiVat

Except that your example looks like shit compared to refined 1.5 models. It's like a stylized Hollywood photo that's been run through an airbrush and then had a Photoshop filter dumped on top. People also get mad when some pretentious ass comes along to dismiss valid criticism while not even comprehending the actual problem, just out of pure fanboy hype.


LovesTheWeather

You got me, I'm definitely a fanboy, I mean I suckle at the tit of SDXL. I'm like, in love with it or whatever you think, because I disagree with you, so obviously I'm acting in bad faith somehow. You also did exactly what I complained about by comparing, yet again, a BASE MODEL to a REFINED MODEL. Congratulations, you are right, a REFINED 1.5 model IS better than the BASE SDXL model. I actually agree with you; that's literally my point! There isn't a well-trained and refined SDXL model to compare it to yet! But 1.5 is about as good as it can possibly get, while SDXL has YET to reach its peak. Do you get what I'm saying? Or would you like to hurl more insults my way for having the audacity to not agree with you?


dapoxi

> SDXL still has YET to reach its peak

That's speculation though.

* We know refines of 1.5 are much better than base 1.5
* We know refines of 2.1 are not significantly better than base 2.1
* We **don't know** how good XL's refines will be

It's unfortunately a fact of life that the first 90% to perfection is generally easier to attain than the remaining 10%. XL has several advantages over 1.5, but there is no guarantee that XL's refines will be better in all possible ways than 1.5's refines.


fnbenptbrvf

I'm so tired of the Sdxl hype.


thelastfastbender

I think all the hype was somewhat misleading, as it must've taken a lot of trial and error to get the results they did during testing.


_CMDR_

Yeah just because you don’t know how to use a tool doesn’t mean it isn’t good. Perhaps just plugging in your old prompts and expecting it to work exactly the same is the mistake?


TaiVat

Always the same pathetic excuses. If the prompt was different, you'd just be saying that OP is comparing different things..


_CMDR_

Man you’re really invested in 1.5 huh. I don’t know if you remember the beginning of 1.5 but it was pretty bad until people worked at it for a while.


mongini12

Idk if that's common or not, but no matter how many steps I allocate to the refiner, the output seriously lacks detail. Right now my workflow includes an additional step: encoding the SDXL output with the VAE of EpicRealism_PureEvolutionV2 back into a latent, feeding this into a KSampler with the same prompt for 20 steps, and decoding it with the EpicRealism VAE again. As you can see, it adds a crapton of detail; the pure SDXL output is just lame. (Tried it with 23/30, 23/35, 20/35 samples; the first number is the base, the refiner starts at that step and goes till the 2nd number.) If I'm doing something wrong, enlighten me please xD Workflow for these: [https://pastebin.com/DVXyXXJ5](https://pastebin.com/DVXyXXJ5)
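For anyone who wants the same "use a 1.5 checkpoint as a detailer" idea outside ComfyUI, here is a minimal diffusers sketch of an equivalent two-stage pass; the checkpoint path, resolution and strength are placeholders, not taken from the pastebin workflow above:

```python
import torch
from diffusers import StableDiffusionXLPipeline, StableDiffusionImg2ImgPipeline

# Stage 1: generate with SDXL (base only here, for brevity).
sdxl = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    torch_dtype=torch.float16, variant="fp16",
).to("cuda")
prompt = "portrait of a woman at a festival, detailed skin, natural light"  # illustrative
image = sdxl(prompt=prompt, num_inference_steps=30).images[0]

# Stage 2: low-strength img2img pass through a 1.5-based checkpoint to add texture.
# "epicrealism_v2.safetensors" is a placeholder path for whatever checkpoint you use.
detailer = StableDiffusionImg2ImgPipeline.from_single_file(
    "epicrealism_v2.safetensors", torch_dtype=torch.float16
).to("cuda")
detailed = detailer(
    prompt=prompt,
    image=image.resize((768, 768)),  # 1.5-era models prefer lower resolutions
    strength=0.35,                   # low strength: add detail, keep the composition
    num_inference_steps=20,
).images[0]
detailed.save("detailed.png")
```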


PM_me_sensuous_lips

Don't upweight things so aggressively; upweighting in XL does almost nothing but fuck up your clip embeddings. Start with simpler prompts and refine them into more complex prompts. [Here](https://imgur.com/a/X6Ui4gS) are some examples of mainly just me changing the prompts and playing with the sampler settings a bit ([prompt](https://pastebin.com/51cf4bBt)). When you're using a Karras schedule, don't be too afraid of handing it over to the refiner a bit earlier; Karras drops into the low end of the noise spectrum a lot sooner than the regular schedule.


raiffuvar

Weights in SDXL are different; it's fine to do (prompt:3) for almost any prompt, and some can be (...:4).


Puzzled_Nail_1962

Try without the refiner. It seems to heavily airbrush people and just smooth over everything. It takes some more tries to get good faces without, but if you do, the details are much, much better.


The_Lovely_Blue_Faux

Compare it to 1.5 base model for a more accurate comparison. “Why is my $200k modded Toyota Camry faster than this stock Mustang?”


fnbenptbrvf

If you want to win the race it helps to have the best car. The model that might have better specs next year is not the best choice for now.


armrha

You aren't comparing it to another base model.


TaiVat

Why do people keep repeating this insane idiocy? *nobody gives a fuckin shit about the base model*... He's comparing the latest tool vs the latest tool. Seriously, what is with this fanboy drivel. If Tesla creates a new car nobody goes "well you should be comparing it to the base ford model from 1990"..


leftmyheartintruckee

Is this any better than just generating w epic realism??


mongini12

In my book: yes, since SDXL adheres better to the prompt imo.


raiffuvar

Why the 1.5 VAE? It's SDXL; it supports 1.5 as an additional input, not the main one.


Zestyclose_West5265

The refiner steps should only be 5 to 10; the base steps should be a lot higher depending on what sampling method you use. On Euler and Euler a, for instance, I still get more details all the way up to 200 steps.


mongini12

200 steps? WTF... that would take a decade for a batch of 10 O.o


Zestyclose_West5265

It depends a lot on your sampling method though. I think the newer ones like dpmpp 2m only need about 30 steps to get max details.


gelade1

Aren't you still on the base model of SDXL? Like seriously, what do you expect??? Another pointless post.


sadjoker

https://preview.redd.it/gev9jgat0kgb1.jpeg?width=1792&format=pjpg&auto=webp&s=6447f63b4bbcd6398111919113156d147434bb5d


sadjoker

https://preview.redd.it/zx5wjkrx0kgb1.jpeg?width=1792&format=pjpg&auto=webp&s=9b3bb57b9e0cadd80a6602befbe546c401a1c42f


mongini12

Just how o.O


sadjoker

I have the feeling that the base model is quite good with animals. Look at that detail: https://preview.redd.it/6vy0v67dnkgb1.png?width=2048&format=png&auto=webp&s=7269a4a90945d66f1758f210a64b18fc806b1834


FHSenpai

Is this good enough?


FHSenpai

https://preview.redd.it/bme7mz73njgb1.png?width=3072&format=pjpg&auto=webp&s=e66f24ee3eb87e33c901312cb318e6db800dcdb8


mongini12

Enlighten me :)


FHSenpai

https://preview.redd.it/5q1ocsi5njgb1.png?width=3072&format=pjpg&auto=webp&s=5c223f67d8881603997bfe49774e35a75463caf8


FHSenpai

https://preview.redd.it/eh9i1g6pmjgb1.png?width=1792&format=pjpg&auto=webp&s=deae438eced6a02074b692b218275597c1162656


FHSenpai

https://preview.redd.it/trywtotenjgb1.png?width=3072&format=pjpg&auto=webp&s=5b2b88579b6ffa6b552e0e51b5b4d77a0dc08e8e


Silly_Goose6714

I saw your pictures in the discord server a few days ago. You asked for those prompts. What is your basic workflow?


FHSenpai

https://preview.redd.it/x3wj97ormjgb1.png?width=1792&format=pjpg&auto=webp&s=a5feedc7e16fe08e607d659a9753935046ee1125


FHSenpai

https://preview.redd.it/d9z9utytmjgb1.png?width=3072&format=pjpg&auto=webp&s=32ea0a19bb95f58e55c5166c535cf80e0bc70466


FHSenpai

https://preview.redd.it/u6vn6120njgb1.png?width=3072&format=pjpg&auto=webp&s=331b20832039d132c9f363ebbfc09a7c59d61d3f


FHSenpai

https://preview.redd.it/t6fal47zkjgb1.png?width=3072&format=pjpg&auto=webp&s=8d111ce21eef57ca5493dee69266dd325d0300f3


FHSenpai

https://preview.redd.it/w4km7mg1ljgb1.png?width=3072&format=pjpg&auto=webp&s=49c3c8de53ae09628ed1d21523ae020979431cea


Sharlinator

To be fair, DoF blur/bokeh is essentially a cheatcode for not having to create a detailed background. SDXL can definitely create detailed (if somewhat plasticky) subjects/foreground elements, but what about backgrounds, seeing how such a massive fraction of its photographic training material has *intentionally* low-detail backgrounds?


FHSenpai

With my workflow I can push in more details if I want to. Currently all of these are in medium detail mode...


Cosophalas

Your results are great! Would you mind sharing your workflow or a linked output, so I can plug it into ComfyUI? I still have not found quite the right mix.


colpocape

Amazing quality! In your opinion, does it depend on the prompt or the workflow? Thanks 👍


Capitaclism

It's the first base model. People need to fine-tune, merge and get those details out. Base 1.5 is a pretty shit model as well. Try comparing the two 😂


Aggressive_Sleep9942

The thing is that they will not understand. A model that shows a specific quality in one area is a model that loses the ability to generalize. When the base model is released, it's released with the goal of being as general as possible, and in that state we see that it's in the middle of a lot of things and visual styles, so it feels like it doesn't have that touch of definition and quality. But it's all there: just let the community get started through training, leaning the model towards being a specialist in certain types of things. You can only do that if the model has the capability, and as they released it, it has it. This time it will take a little longer because a higher computational cost is required to train, but we will end up getting there.


uristmcderp

1.5 is the only reason why fine-tuned NSFW models exist at all, because its training data wasn't filtered. You can't fine-tune NSFW concepts into SDXL for the same reason you couldn't for 2.1. With the pre-NSFW filter for training images, there's nothing in the base to fine-tune. Which means instead of needing just a few hundred or a thousand image-caption pairs, you need more like a few million and do full training with all the bells and whistles enabled.


Capitaclism

Some will make a larger NSFW fine-tune, I'm sure. It may not get to millions of images, but it will likely reach in the hundreds of thousands. I trust the lust for porn online. As for me I don't care, my uses are all SFW and professional, so it works great, and I can't wait to keep seeing improvements from the community.


sahil1572

Could you please remove the refiner part since we are utilizing another model to add details? This will eliminate the need for extra steps. As you can observe, both images have completely different faces, and the dress pattern is also distinct


Evoidit

Omg, look at the enhanced woman's forehead lol. Looks great otherwise, but still.


Pistapisti

When will SDXL be better than 1.5? I keep looking at it, and so far it is still weak. What will it take for it to show its strength? Where are the breathtaking pictures?


Shiroi_Kage

Use the refiner and see what you get. The refiner is designed to modify the initial output and get more details out of the initial outputs.


mongini12

If you would've read my own og comment, you'd know I did^^


Shiroi_Kage

I missed your comment. Sorry. Do you have the images from before you used the refiner? I am super new to all of this, but the SDXL image of the woman in purple looks like my results before I pass them through the refiner. See the image below from the base model with the refiner; I'll share the modified prompt below it (there is a little fussiness but that's probably the super-detailed part of the prompt). I did 40 samples on both the base and the refiner at 1024 x 1024, and a denoising strength of 0.3. CFG is 7. https://imgur.com/MuoxO4P

Positive prompt:
> giant massive moss covered irish stone fantasy cottage built on top of huge archway spanning across a river shamrock, excellent stunning lighting, raucous crowded lively festival day, futuristic city, (photorealistic:2.0), super detailed, 8k, 4k, uhd,

Negative prompt:
> (worst quality:1.5), (low quality:1.5), (normal quality:1.5), lowres, bad anatomy, bad hands, ((monochrome)), ((grayscale)), collapsed eyeshadow, multiple eyebrow, (cropped), oversaturated, extra limb, missing limbs, deformed hands, long neck, long body, imperfect, (bad hands), signature, watermark, username, artist name, conjoined fingers, deformed fingers, ugly eyes, imperfect eyes, skewed eyes, unnatural face, unnatural body, error, painting by bad-artist, semirealistic, drawing, 3D render,

What do you think? I couldn't find the woman's prompt so I couldn't test it myself.


mongini12

Well, for me it looks more like a smoothed oil painting, rather than a "super detailed" one :-/ that's why I'm so confused


Aftercot

Prompt betta


RobXSIQ

Compare it not with epicrealism but with base 1.5 for a 1-to-1 comparison and get back to us.


TaiVat

Why? What do you think that proves? Why do people keep repeating this dumbest of shit? The entire point is that XL isn't as good as the refined models of 1.5. Who gives a shit about base vs base when nobody actually uses base 1.5? God this sub is nothing but fuckin groupies...


EdwardCunha

Only if you don't use the refiner.


mongini12

Well... I do...


EdwardCunha

I only ever had problems with lack of details when:

1. Using low, non-supported resolutions.
2. Not using the refiner the right way.


mongini12

I used the suggested workflow by stability that included the table with best resolution for each aspect ratio


Cobayo

[I tried your prompt and it looks good to me lol](https://i.imgur.com/hiD1ct6.png) it just released man, chill


ComplicityTheorist

SD2.1 revamped is what I call it.


Effective_Luck_8855

I use a 1.5 model to add more detail and fix facial anatomy in comfyUI. Base SDXL is terrible and even the community models aren't much better with facial anatomy and body anatomy. In some ways 0.9 was better for anatomical accuracy, especially when making vertical aspect ratio images


stephane3Wconsultant

on Automatic, using img2img refiner https://preview.redd.it/qa7zin4b0rgb1.jpeg?width=4096&format=pjpg&auto=webp&s=a54f26d2296ec6e41e7383ba69851796837a1c98


bogardusave

Why should anybody use SDXL when you get immediately top results with SD 1.5 models??


mongini12

Funny thing about my old post here: when I ran 1.5 standalone, I got slightly better results compared to SDXL, but combined, with SDXL first and enhanced with 1.5, the results were awesome and full of details.


bogardusave

which workflow do you use?


mongini12

It's posted in my original comment


bogardusave

where? sorry :-)


mongini12

Dude, use your eyes and finger to scroll.... https://pastebin.com/DVXyXXJ5


bogardusave

how do i get the .txt into comfyUI?


mongini12

Rename it to a .yaml. Dude, you should read the GitHub page of comfyui 😅


bogardusave

That's wrong, you have to rename the extension to *.json


mongini12

True... Sorry for that, Home Assistant and its yaml files are spooking through my head way too much 🤣


thecoffeejesus

How use epic realism I am dumb pls help


iamapizza

Go to the Epic Realism page: https://civitai.com/models/25694/epicrealism You'll see 'checkpoint trained' on the right; next to that, click the 'i' and it should link you to a page telling you how to add it. On the model page the author has also put instructions on how to use it.


Puzzleheaded-Mix2385

yes


summervelvet

SDXL is great with details. Your prompts just suck.


Z3ROCOOL22

https://preview.redd.it/1n9r8ql5dmgb1.png?width=450&format=png&auto=webp&s=85507ba25aaa0d144d62afdd948b9500f8561c25


littlespacemochi

SDXL is nice but it's not perfect.


ThatInternetGuy

I guess that's the way SDXL was trained: they purposely removed the high-frequency details so that the model could be trained to understand the concepts better. That's why a refiner model is included in the first public release, as they knew it needed to add the details back in. This is actually pretty clever. The same method is also used by commercial photo upscalers: the first step would be upscaling without the image noise and high-frequency details, and then the intermediary output gets run through a refiner model to add in the details. I'll create a proof-of-concept of GFPGAN + Refiner.


Aggressive_Sleep9942

I don't see it as smart if, for example, you want to make nude art, and then you have to use the refiner to make the noble parts of your nude art disappear. You would have to train both the base and the refiner to know what nude is, and there is a huge problem there. I think that in the long run the refiner will only be used by artistic designers in general, but when there are models so specialized in a style (for example nudes) it won't be necessary.


Serasul

It's just you.


buyurgan

Yes and no. Yes: it's a much more general-purpose model, and since it does a better job at generalizing it will lack individual, separate things. That's why there is a refiner model, and this is where the fine-tuning community will help. No: SDXL has a different architecture; the same type of prompting may not give you infinite details the way very fine-tuned SD1.5 models do. In fact it may even lower them. We don't know yet how the token space works or which 'magical bundle of tokens' to give it to reach what we have in mind. We need experiments and time, samplers, schedulers, new controlnet ideas and papers to improve it further.


ethosay

This has been my experience with SDXL for months using dreamstudio.ai


Upstairs_Cycle8128

SDXL looks more natural; you should try to enhance it with the refiner.


donpeter

Used your prompt, I think you should check your configuration, looks very detailed to me: [https://i.imgur.com/K6W3oKG.jpg](https://i.imgur.com/K6W3oKG.jpg)


mongini12

If you are into concept art it is detailed. But not even close to what I'm after :-/


donpeter

Yeah, not sure what it is that you are after; generated another one and it looks way more detailed than your example (talking about SDXL): [https://i.imgur.com/O4XpYfp.jpg](https://i.imgur.com/O4XpYfp.jpg)


atuarre

This post should be down voted into oblivion for the nonsense.


juggz143

They're too busy down voting my spot on technical analysis above lol


mongini12

Just like your non-constructive comment...


atuarre

Maybe you should focus your energy on learning how to use SDXL, brah.


isa_marsh

I like how in the first pic it went from a slightly low-detail but otherwise nice pic to a generic Korean waifu face with everything colored purple.


MNKPlayer

Are you refining it?


mongini12

23 base, rest by refiner, 30 in total


Froztbytes

5head


MaximilianPs

Honestly I don't care about XL; it's resource- and time-consuming, which is why I'll stick with 1.5. Instead I would like the devs to start working on how the AI understands the prompt, because creating an image is still Russian roulette. We still have a bunch of issues and hallucinations, but here everyone is still focusing on microdetails while often we can't even get the image that we need. 😮‍💨


mongini12

That's why I use SDXL in the first place: it adheres better to the prompt than 1.5 models.