
sd: sync to master-504-636d3cb#1969

Merged
LostRuins merged 2 commits into LostRuins:concedo_experimental from wbruna:kcpp_sd_update_202602_3
Feb 14, 2026

Conversation

@wbruna

@wbruna wbruna commented Feb 10, 2026

Another big diff. The merge went surprisingly smoothly, considering the number of files being moved.

I've preserved the position of most files, but moved the vocab code to vocab/ to avoid a diff. On the plus side, I managed to remove the KCPP_NO_BAKE_SD_VOCAB changes by simply replacing the vocab.cpp functions.

- restore the "arbitrarily high" 8192 limit, since it's used to turn
off the img_hard_limit (and if each side were always limited to 2048,
we wouldn't need hard_megapixel_res_limit)
- avoid changing the config cfg_square_limit during a generation
- apply the hard_megapixel_res_limit only in the configuration-changed
path, since the default path uses constants
- clean up comments

The calculation itself remains the same:
- default area limit: 832² for SD1.5/SD2, 1024² otherwise
- configured limit always between 64 and 2048
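As a hedged sketch, the limit calculation described above could look like this (the function and parameter names here are illustrative, not the actual koboldcpp identifiers):

```cpp
#include <algorithm>

// Illustrative sketch of the area-limit calculation described above;
// effective_area_limit, is_sd1_or_sd2 and cfg_square_limit are
// hypothetical names, not the real koboldcpp code.
static int effective_area_limit(bool is_sd1_or_sd2, int cfg_square_limit) {
    if (cfg_square_limit <= 0) {
        // default path: fixed constants, no hard_megapixel_res_limit check
        const int side = is_sd1_or_sd2 ? 832 : 1024;
        return side * side;
    }
    // configuration-changed path: clamp the configured side to [64, 2048]
    const int side = std::clamp(cfg_square_limit, 64, 2048);
    return side * side;
}
```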
@wbruna force-pushed the kcpp_sd_update_202602_3 branch from 292ac52 to 7f3db44 on February 13, 2026 01:02
@wbruna
Author

wbruna commented Feb 13, 2026

Fixed a few issues with the limit calculation. I've made the fix on top of this PR because I'm testing both together, but just let me know if you'd prefer a separate PR.

@LostRuins added the "enhancement" label on Feb 13, 2026
@LostRuins
Owner

If I understand this correctly, you're basically defining the function (e.g. load_mistral_vocab_json) twice, and then picking the correct one to include at compile time?

@wbruna
Author

wbruna commented Feb 13, 2026

If I understand this correctly, you're basically defining the function (e.g. load_mistral_vocab_json) twice, and then picking the correct one to include at compile time?

Yeah: since those functions are now isolated in vocab.cpp, it's enough to not include that file in our build and replace the functions directly.
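The link-time replacement being described can be sketched as a minimal single-file illustration (the real declarations and signatures in vocab.cpp differ; this only shows the idea):

```cpp
#include <string>

// vocab.h: shared declaration both builds compile against
// (the signature here is illustrative).
std::string load_mistral_vocab_json();

// Upstream, vocab.cpp defines this symbol with the baked-in vocab.
// In the KoboldCpp build, vocab.cpp is simply omitted from the source
// list and a replacement definition like this is linked instead, so no
// KCPP_NO_BAKE_SD_VOCAB #ifdefs are needed in shared code:
std::string load_mistral_vocab_json() {
    return "{}"; // e.g. load the vocab from disk instead of baking it in
}
```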

@LostRuins LostRuins merged commit ae5183b into LostRuins:concedo_experimental Feb 14, 2026
@LostRuins
Owner

@wbruna I don't know if it's just me, but image editing (Qwen Image Edit) is now completely broken

@LostRuins
Owner

LostRuins commented Feb 14, 2026

1.106.2 = working
1.107 = working
1.107.3 = already broken

@LostRuins
Owner

I think the regression started on #1957

@LostRuins
Owner

LostRuins commented Feb 14, 2026

I think I found the cause: you enabled flash attention for the VAE and conditioner in that PR.

Commenting out

    // also switches flash attn for the vae and conditioner
    // params.flash_attn = params.diffusion_flash_attn;

seems to fix it @wbruna

As a hack, we can try disabling that line for Qwen Image Edit models? Or do you have a better idea.

edit: the problem with that is we also don't know the arch at that point in the code. Maybe we should just disable it fully.

@LostRuins
Owner

reference outputs before / after


@wbruna
Author

wbruna commented Feb 14, 2026

Hm, I had no issues with that on any model on my card. But yeah, we could disable it.

Unless we extend the sdflashattention flag to a tri-state, like the conv direct one? Something like off/diffusiononly/full?

@LostRuins
Owner

Hm, I had no issues with that on any model on my card.

It only happens when you load a reference image. But yeah, the cleanest fix is to disable it for now.

@wbruna
Author

wbruna commented Feb 14, 2026

Seems to be an issue specific to Qwen; editing with Klein 4b works fine.

Well, disabling should be just a matter of commenting out or removing the line `params.flash_attn = params.diffusion_flash_attn;`.

@LostRuins
Owner

Actually I can do better. After some trial and error, it seems only the conditioner is messed up. If I go into stable-diffusion.cpp and comment out this line

// cond_stage_model->set_flash_attention_enabled(true);

then it works fine. I think that's probably better, since the other parts can still benefit

@wbruna
Author

wbruna commented Feb 14, 2026

Actually I can do better. After some trial and error, it seems only the conditioner is messed up. If I go into stable-diffusion.cpp and comment out this line

// cond_stage_model->set_flash_attention_enabled(true);

then it works fine. I think that's probably better, since the other parts can still benefit

Hm, and even the arch is known at that point.

Still, it may be safer to enhance the option flag. If a tri-state sounds good to you, I can work on it tomorrow (it's just past midnight here right now 🙂).

@wbruna
Author

wbruna commented Feb 14, 2026

BTW, I don't think this bug was reported upstream yet.

@LostRuins
Owner

If a tri-state sounds good to you

I think a tri-state is too confusing for users. Much better to just give them "best-effort" flash attention when they ask for FA.

But yes, we can gate the conditioner FA by arch too.
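A hedged sketch of that arch gate (the helper name and the "qwen_image" arch string are assumptions here; set_flash_attention_enabled is the call quoted earlier in the thread):

```cpp
#include <string>

// Hypothetical helper: decide whether the conditioner may use flash
// attention. The "qwen_image" arch string is an assumption; in
// stable-diffusion.cpp the architecture is represented differently.
static bool cond_flash_attn_ok(const std::string & arch) {
    // Qwen Image Edit's conditioner produced broken outputs with FA,
    // so it stays off there; the other parts can still benefit.
    return arch != "qwen_image";
}

// At load time, roughly:
//   if (params.diffusion_flash_attn && cond_flash_attn_ok(arch))
//       cond_stage_model->set_flash_attention_enabled(true);
```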

@LostRuins
Owner

LostRuins commented Feb 14, 2026

BTW, I don't think this bug was reported upstream yet.

Oh hmm, yeah I guess it wasn't. You wanna do the honors? You can link this comment.

@wbruna
Author

wbruna commented Feb 14, 2026

I think a tri-state is too confusing for users. Much better to just give them "best-effort" flash attention when they ask for FA.

But we already do that for the sdconvdirect; it would even be pretty similar.

oh hmm, yeah i guess it wasnt. you wanna do the honors? you can link this comment

If I manage to reproduce it, sure; but I don't have a Qwen Edit model downloaded right now :-(

@LostRuins
Owner

Yeah, but the difference is that one was added as a brand-new flag in the same release, whereas this would have to deal with legacy configs and users who already know the flag.

@wbruna
Author

wbruna commented Feb 14, 2026

Hm. But we called it "sdflashattention", which is an appropriate name: it's still FA, just for other parts of the model.

It could accept the values "false/off/0" and "true/on/1", meaning pretty much what they mean right now (a boolean), plus a third value "diffusion" meaning "only for the diffusion model". Users would only have to change it for Qwen Edit... and other broken models we haven't tested yet, of course; but instead of waiting for a fix, they'd just have to adjust the flag. FA is off by default, and is already something that can't be turned on in all cases.
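The proposed tri-state could be parsed along these lines (the enum and function names are illustrative, not an actual koboldcpp API):

```cpp
#include <string>

// Sketch of the proposed tri-state; names here are hypothetical.
enum class SDFlashAttn { Off, DiffusionOnly, Full };

static SDFlashAttn parse_sd_flash_attn(const std::string & v) {
    // existing boolean spellings keep their current meaning
    if (v == "false" || v == "off" || v == "0") return SDFlashAttn::Off;
    if (v == "true"  || v == "on"  || v == "1") return SDFlashAttn::Full;
    // new value: flash attention only for the diffusion model
    if (v == "diffusion") return SDFlashAttn::DiffusionOnly;
    return SDFlashAttn::Off; // unknown values fall back to the default
}
```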

Another middle ground would be supporting it through gendefaults: we keep flash_attn off and allow it to be turned on manually there. By the way, did you notice any cases that really benefit from it? Personally, I don't think I'd notice if flash_attn were always off.
