
sd: sync to master-504-636d3cb#1969

Merged
LostRuins merged 2 commits into LostRuins:concedo_experimental from wbruna:kcpp_sd_update_202602_3
Feb 14, 2026

Conversation

@wbruna

@wbruna wbruna commented Feb 10, 2026

Another big diff. The merge went surprisingly smoothly, considering the number of files being moved.

I've preserved the position of most files, but moved the vocab code to vocab/ to avoid a diff. On the plus side, I managed to remove the KCPP_NO_BAKE_SD_VOCAB changes by simply replacing the vocab.cpp functions.

- restore the "arbitrarily high" 8192 limit, since it's used to turn
off the img_hard_limit (and if each side were always limited to 2048,
we wouldn't need hard_megapixel_res_limit)
- avoid changing the config cfg_square_limit during a generation
- apply the hard_megapixel_res_limit only in the configuration-changed
path, since the default path uses constants
- clean up comments

The calculation itself remains the same:
- default area limit: 832² for SD1.5/SD2, 1024² otherwise
- configured limit always between 64 and 2048
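As a hedged sketch, the limit calculation described above could look like this (the function and parameter names here are illustrative, not the actual koboldcpp identifiers):

```cpp
#include <algorithm>

// Illustrative sketch of the area-limit calculation described above;
// effective_area_limit, is_sd1_or_sd2 and cfg_square_limit are
// hypothetical names, not the real koboldcpp code.
static int effective_area_limit(bool is_sd1_or_sd2, int cfg_square_limit) {
    if (cfg_square_limit <= 0) {
        // default path: fixed constants, no hard_megapixel_res_limit check
        const int side = is_sd1_or_sd2 ? 832 : 1024;
        return side * side;
    }
    // configuration-changed path: clamp the configured side to [64, 2048]
    const int side = std::clamp(cfg_square_limit, 64, 2048);
    return side * side;
}
```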
@wbruna force-pushed the kcpp_sd_update_202602_3 branch from 292ac52 to 7f3db44 on February 13, 2026 01:02
@wbruna
Author

wbruna commented Feb 13, 2026

Fixed a few issues with the limit calculation. I've made the fix on top of this PR because I'm testing both together, but just let me know if you'd prefer a separate PR.

@LostRuins added the "enhancement" label on Feb 13, 2026
@LostRuins
Owner

If I understand this correctly, you're basically defining the function (e.g. load_mistral_vocab_json) twice, and then picking the correct one to include at compile time?

@wbruna
Author

wbruna commented Feb 13, 2026

If I understand this correctly, you're basically defining the function (e.g. load_mistral_vocab_json) twice, and then picking the correct one to include at compile time?

Yeah: since those functions are now isolated in vocab.cpp, it's enough to not include that file in our build and replace the functions directly.
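The link-time replacement being described can be sketched as a minimal single-file illustration (the real declarations and signatures in vocab.cpp differ; this only shows the idea):

```cpp
#include <string>

// vocab.h: shared declaration both builds compile against
// (the signature here is illustrative).
std::string load_mistral_vocab_json();

// Upstream, vocab.cpp defines this symbol with the baked-in vocab.
// In the KoboldCpp build, vocab.cpp is simply omitted from the source
// list and a replacement definition like this is linked instead, so no
// KCPP_NO_BAKE_SD_VOCAB #ifdefs are needed in shared code:
std::string load_mistral_vocab_json() {
    return "{}"; // e.g. load the vocab from disk instead of baking it in
}
```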

@LostRuins LostRuins merged commit ae5183b into LostRuins:concedo_experimental Feb 14, 2026
@LostRuins
Owner

@wbruna I don't know if it's just me, but image editing (Qwen Image Edit) is now completely broken

@LostRuins
Owner

LostRuins commented Feb 14, 2026

1.106.2 = working
1.107 = working
1.107.3 = already broken

@LostRuins
Owner

I think the regression started on #1957

@LostRuins
Owner

LostRuins commented Feb 14, 2026

I think I found the cause: you enabled flash attention for the VAE and conditioner in that PR.

Commenting out

    // also switches flash attn for the vae and conditioner
    // params.flash_attn = params.diffusion_flash_attn;

seems to fix it @wbruna

As a hack, we can try disabling that line for Qwen Image Edit models? Or do you have a better idea.

edit: the problem with that is we also don't know the arch at that point in the code. Maybe we should just disable it fully.

@LostRuins
Owner

reference outputs before / after


@wbruna
Author

wbruna commented Feb 14, 2026

Hm, I had no issues with that on any model on my card. But yeah, we could disable it.

Unless we extend the sdflashattention flag to a tri-state, like the conv direct one? Something like off/diffusiononly/full?

@LostRuins
Owner

Hm, I had no issues with that on any model on my card.

It only happens when you load a reference image. But yeah, the cleanest fix is to disable it for now.

@wbruna
Author

wbruna commented Feb 14, 2026

Seems to be an issue specific to Qwen; editing with Klein 4b works fine.

Well, disabling should be just a matter of commenting out or removing the line `params.flash_attn = params.diffusion_flash_attn;`.

@LostRuins
Owner

Actually I can do better. After some trial and error, it seems only the conditioner is messed up. If I go into stable-diffusion.cpp and comment out this line

// cond_stage_model->set_flash_attention_enabled(true);

then it works fine. I think that's probably better, since the other parts can still benefit

@wbruna
Author

wbruna commented Feb 14, 2026

Actually I can do better. After some trial and error, it seems only the conditioner is messed up. If I go into stable-diffusion.cpp and comment out this line

// cond_stage_model->set_flash_attention_enabled(true);

then it works fine. I think that's probably better, since the other parts can still benefit

Hm, and even the arch is known at that point.

Still, it may be safer to enhance the option flag. If a tri-state sounds good to you, I can work on it tomorrow (it's just past midnight here right now 🙂).

@wbruna
Author

wbruna commented Feb 14, 2026

BTW, I don't think this bug was reported upstream yet.

@LostRuins
Owner

If a tri-state sounds good to you

I think a tri-state is too confusing for users. Much better to just give them "best-effort" flash attention when they ask for FA.

But yes, we can gate the conditioner FA by arch too.
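A hedged sketch of that arch gate (the helper name and the "qwen_image" arch string are assumptions here; set_flash_attention_enabled is the call quoted earlier in the thread):

```cpp
#include <string>

// Hypothetical helper: decide whether the conditioner may use flash
// attention. The "qwen_image" arch string is an assumption; in
// stable-diffusion.cpp the architecture is represented differently.
static bool cond_flash_attn_ok(const std::string & arch) {
    // Qwen Image Edit's conditioner produced broken outputs with FA,
    // so it stays off there; the other parts can still benefit.
    return arch != "qwen_image";
}

// At load time, roughly:
//   if (params.diffusion_flash_attn && cond_flash_attn_ok(arch))
//       cond_stage_model->set_flash_attention_enabled(true);
```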

@LostRuins
Owner

LostRuins commented Feb 14, 2026

BTW, I don't think this bug was reported upstream yet.

Oh hmm, yeah I guess it wasn't. You wanna do the honors? You can link this comment.

@wbruna
Author

wbruna commented Feb 14, 2026

I think a tri-state is too confusing for users. Much better to just give them "best-effort" flash attention when they ask for FA.

But we already do that for the sdconvdirect; it would even be pretty similar.

oh hmm, yeah i guess it wasnt. you wanna do the honors? you can link this comment

If I manage to reproduce it, sure; but I don't have a Qwen Edit model downloaded right now :-(

@LostRuins
Owner

Yeah, but the difference is that one was added as a brand-new flag in the same release, whereas this would have to deal with legacy configs and users who already know the flag.

@wbruna
Author

wbruna commented Feb 14, 2026

Hm. But we called it "sdflashattention", which is an appropriate name: it's still FA, just for other parts of the model.

It could accept the values "false/off/0" and "true/on/1", meaning pretty much what they mean right now (a boolean), plus a third value "diffusion" meaning "only for the diffusion model". Users would only have to change it for Qwen Edit... and other broken models we haven't tested yet, of course; but instead of waiting for a fix, they'd just have to adjust the flag. FA is off by default, and is already something that can't be turned on in all cases.
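The proposed tri-state could be parsed along these lines (the enum and function names are illustrative, not an actual koboldcpp API):

```cpp
#include <string>

// Sketch of the proposed tri-state; names here are hypothetical.
enum class SDFlashAttn { Off, DiffusionOnly, Full };

static SDFlashAttn parse_sd_flash_attn(const std::string & v) {
    // existing boolean spellings keep their current meaning
    if (v == "false" || v == "off" || v == "0") return SDFlashAttn::Off;
    if (v == "true"  || v == "on"  || v == "1") return SDFlashAttn::Full;
    // new value: flash attention only for the diffusion model
    if (v == "diffusion") return SDFlashAttn::DiffusionOnly;
    return SDFlashAttn::Off; // unknown values fall back to the default
}
```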

Another middle ground would be supporting it through gendefaults: we keep flash_attn off and allow it to be turned on manually there. By the way, did you notice any cases that really benefit from it? Personally, I don't think I'd notice if flash_attn were always off.
