Pull requests: mlc-ai/mlc-llm
Perf: load weights, create KV cache, initialize tokenizer in parallel
#3215 opened Apr 27, 2025 by Bekaboo • updated Apr 27, 2025
[Serving] Support tool function calls under strict format constraints
#3190 opened Mar 26, 2025 by Irfnfnkemed • updated Apr 24, 2025
[Android] Support LLaVA and Phi-V
#3195 opened Apr 6, 2025 by davidlightmysterion • updated Apr 18, 2025
[Refactor] PagedKVCache spec for MLC-LLM
#3203 opened Apr 14, 2025 by annanyapr • updated Apr 14, 2025
Update mlc_llm.rst correcting for mlc from mlc_llm
#3196 opened Apr 7, 2025 by agrajagco • updated Apr 7, 2025
Refactored random.h to have PhiloxRandomGenerator
#3181 opened Mar 18, 2025 by annanyapr • updated Apr 6, 2025
[Serving] Add Structural-Tag api to RequestResponseFormat
#3187 opened Mar 24, 2025 by Irfnfnkemed • updated Mar 26, 2025
[CPP_CLI] MLC Cli App over JSONEngine interface
#3114 opened Jan 30, 2025 by srkreddy1238 • updated Jan 31, 2025
[Serving] PagedKVCache Quantization
#2663 opened Jul 16, 2024 by davidpissarra • updated Dec 21, 2024
[SERVE][CPP][Android] add native executable program to benchmark models
#2987 opened Oct 18, 2024 by pfk-beta • updated Oct 18, 2024
[Model] Add use_qk_norm option for Cohere model
#2877 opened Sep 2, 2024 by tlopex • updated Oct 9, 2024