Pull requests: quic/efficient-transformers

Pull requests list

Removed custom_io passing for bf16 dtype case
#983 opened May 13, 2026 by asmigosw Contributor
Adding PagedAttention support for CausalLM models
#982 opened May 13, 2026 by vaibverm Contributor
Fix for fp16 export in qwen3vl & qwen3vlmoe models
#980 opened May 12, 2026 by qcdipankar Contributor
Diffusers CI conditional check
#978 opened May 11, 2026 by quic-amitraj Contributor Draft
Added support of QEffDiffusionPipeline for Diffusers (label: Diffusers)
#977 opened May 11, 2026 by quic-amitraj Contributor Draft
Layerwise int4 kimi
#973 opened May 7, 2026 by abhishek-singh591 Contributor
Glm4.7 flash reap
#972 opened May 6, 2026 by azajac-qcom
rebase test
#971 opened May 6, 2026 by qraniumcitest
TF and other package update
#967 opened May 6, 2026 by quic-hemagnih Contributor Draft
Gemma4
#966 opened May 6, 2026 by tchawada Contributor
Add DPO specific changes
#964 opened May 6, 2026 by quic-akuruvil Contributor Draft
Enable On Device Sampling for Qwen3ForCausalLM
#963 opened May 5, 2026 by quic-sanising Contributor
MLA Int4 Changes
#962 opened May 5, 2026 by quic-mamta Contributor Draft
[QEff. Finetuning] TP+DDP for transformers upgrade to v5.5.4
#960 opened May 4, 2026 by smedhe Contributor
Enable ffn blocking for dense models with automatic blocking configurator (labels: enhancement, qeff.blocking)
#958 opened May 4, 2026 by kdulla Contributor
Optimize attention blocking nested loops
#957 opened Apr 30, 2026 by anujgupt-github Contributor
Layer wise changes for kimi model
#954 opened Apr 29, 2026 by abhishek-singh591 Contributor
First Block Caching Infra for diffusers (label: Diffusers)
#941 opened Apr 24, 2026 by quic-amitraj Contributor
feat(moe): NSP-blocked expert dispatch for Qwen3MOE and GPT-OSS prefill (label: enhancement)
#935 opened Apr 21, 2026 by vbaddi Contributor
Added MDP generation to QEff Compile
#930 opened Apr 21, 2026 by quic-mohmeh
Enabled Qwen3-VL embedding model
#923 opened Apr 20, 2026 by quic-amitraj Contributor
[Qwen3_Omni]_Onboarding
#922 opened Apr 20, 2026 by mohiso22 Contributor Draft
Enabling support of rerankers models 2B and 8B of qwen3vl
#921 opened Apr 18, 2026 by quic-amitraj Contributor