Pull requests: quic/efficient-transformers
Pull requests list
Removed custom_io passing for bf16 dtype case
#983
opened May 13, 2026 by
asmigosw
Contributor
Adding PagedAttention support for CausalLM models
#982
opened May 13, 2026 by
vaibverm
Contributor
Onboarding DFlash: Block Diffusion Speculative Decoding
#981
opened May 13, 2026 by
fannanya
Fix for fp16 export in qwen3vl & qwen3vlmoe models
#980
opened May 12, 2026 by
qcdipankar
Contributor
Added support of QEffDiffusionPipeline for Diffusers
Diffusers
Use for PR related to diffusers in efficient-transformers.
#977
opened May 11, 2026 by
quic-amitraj
Contributor
Draft
Enable On Device Sampling for Qwen3ForCausalLM
#963
opened May 5, 2026 by
quic-sanising
Contributor
[QEff. Finetuning] TP+DDP for transformers upgrade to v5.5.4
#960
opened May 4, 2026 by
smedhe
Contributor
Enable ffn blocking for dense models with automatic blocking configurator
enhancement
New feature or request
qeff.blocking
#958
opened May 4, 2026 by
kdulla
Contributor
Optimize attention blocking nested loops
#957
opened Apr 30, 2026 by
anujgupt-github
Contributor
Layer wise changes for kimi model
#954
opened Apr 29, 2026 by
abhishek-singh591
Contributor
fix: improve weight offloading to handle plain tensor attrs and use to_empty()
#952
opened Apr 28, 2026 by
quic-rishinr
Contributor
First Block Caching Infra for diffusers
Diffusers
Use for PR related to diffusers in efficient-transformers.
#941
opened Apr 24, 2026 by
quic-amitraj
Contributor
feat(moe): NSP-blocked expert dispatch for Qwen3MOE and GPT-OSS prefill
enhancement
New feature or request
#935
opened Apr 21, 2026 by
vbaddi
Contributor
Enabling support of rerankers models 2B and 8B of qwen3vl
#921
opened Apr 18, 2026 by
quic-amitraj
Contributor