
Conversation

@ngxson ngxson (Collaborator) commented Dec 10, 2025

Add better support for negated args like -no-cnv or --no-jinja. For example:

--jinja, --no-jinja                     whether to use jinja template engine for chat (default: disabled)
                                        (env: LLAMA_ARG_JINJA)
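The paired-flag idea could be sketched roughly like this (types and names are hypothetical and do not mirror the real common_arg API in llama.cpp):

```cpp
#include <string>

// Rough sketch: a boolean option registered under both its positive and
// negated spellings.
struct bool_arg {
    std::string pos;   // e.g. "--jinja"
    std::string neg;   // e.g. "--no-jinja"
    bool        value; // current/default value
};

// Set the value according to which spelling of the flag was seen.
// Returns false if the flag does not belong to this argument.
static bool apply_flag(bool_arg & arg, const std::string & flag) {
    if (flag == arg.pos) { arg.value = true;  return true; }
    if (flag == arg.neg) { arg.value = false; return true; }
    return false;
}
```

With this shape, a single registration covers both spellings, which is what makes the help line `--jinja, --no-jinja` possible.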

For boolean options like --mmap or --kv-offload, the environment variable is handled as shown in this example:

  • LLAMA_ARG_MMAP=true means enabled, other accepted values are: 1, on, enabled
  • LLAMA_ARG_MMAP=false means disabled, other accepted values are: 0, off, disabled
  • If LLAMA_ARG_NO_MMAP is present (no matter the value), it means disabling mmap

Note: before this change, LLAMA_ARG_JINJA=false would simply skip the --jinja arg without setting --no-jinja. Now, falsey values like 0, false, no, and disabled trigger the negated variant.
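The truthy/falsey mapping and the "negated variable wins" rule described above could be sketched like this (helper names are hypothetical, not the actual llama.cpp functions):

```cpp
#include <algorithm>
#include <cctype>
#include <cstdlib>
#include <string>

// Map an environment value to a boolean; unrecognized values fall back to def.
static bool env_value_to_bool(std::string v, bool def) {
    std::transform(v.begin(), v.end(), v.begin(),
                   [](unsigned char c) { return (char) std::tolower(c); });
    if (v == "1" || v == "true"  || v == "on"  || v == "enabled")  return true;
    if (v == "0" || v == "false" || v == "off" || v == "no" || v == "disabled") return false;
    return def;
}

// The negated variable wins regardless of its value, matching the
// LLAMA_ARG_NO_MMAP behavior described above.
static bool resolve_bool_env(const char * pos_name, const char * neg_name, bool def) {
    if (std::getenv(neg_name) != nullptr) return false;
    if (const char * v = std::getenv(pos_name)) return env_value_to_bool(v, def);
    return def;
}
```

For example, `resolve_bool_env("LLAMA_ARG_MMAP", "LLAMA_ARG_NO_MMAP", true)` would return false whenever LLAMA_ARG_NO_MMAP is set, no matter its value.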

The same logic also applies to the recently-added preset feature: #17859

@ngxson ngxson requested a review from CISC December 10, 2025 22:22
@ngxson ngxson requested a review from ggerganov as a code owner December 10, 2025 22:22
@CISC CISC (Collaborator) commented Dec 10, 2025

Would it make sense to add positive variants to --no-display-prompt, --no-show-timings, --no-warmup, --no-kv-offload, --no-repack, --no-mmproj-offload, --no-mmap, --no-op-offload, --no-ppl, --no-webui, --no-models-autoload, --no-prefill-assistant, (--no-perf?) and give them the same treatment for consistency? --no-host and --no-mmproj are tricky though. :)

Or perhaps just leave the positive variant empty and handle that?

Many of these can benefit from rewording.

@github-actions github-actions bot added testing Everything test related examples server labels Dec 11, 2025
@ngxson ngxson (Collaborator, Author) commented Dec 11, 2025

Yes, I think we can add negation for most of the boolean variables to establish a pattern. The goal is to encourage future contributors to follow the same path. Wdyt @ggerganov ?

@ggerganov ggerganov (Member) commented:

Sounds good

@ngxson ngxson (Collaborator, Author) commented Dec 12, 2025

I added support for most of the args, with two exceptions:

  • --no-mmproj now has two aliases, --mmproj-auto and --no-mmproj-auto, to avoid clashing with the existing --mmproj option
  • --no-host is kept as-is (no positive flag) because I'm not quite sure how this option works, so I cannot come up with a better flag name or help message. Suggestions are welcome

@ngxson ngxson (Collaborator, Author) commented Dec 12, 2025

A small nit: some args show (default: true|false) in the help message, while others use enabled|disabled.

We could probably have a string_format that automatically populates the default value into the help message, but that optimization is better done in another PR.
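Such a helper might look like this minimal sketch (the name help_with_default is hypothetical, not actual llama.cpp code):

```cpp
#include <string>

// Append the default value to a help message so all boolean args render
// consistently as "(default: enabled)" or "(default: disabled)".
static std::string help_with_default(const std::string & text, bool def) {
    return text + " (default: " + (def ? "enabled" : "disabled") + ")";
}
```

For example, `help_with_default("whether to use jinja template engine for chat", false)` would yield the help line quoted at the top of this PR.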

@CISC CISC (Collaborator) left a review comment

Regen server docs again.

@ngxson ngxson (Collaborator, Author) commented Dec 12, 2025

Already regenerated in e73c42b

@ngxson ngxson merged commit 380b4c9 into ggml-org:master Dec 12, 2025
63 of 73 checks passed
@ggerganov ggerganov (Member) commented:

The server is not handling the negated arguments correctly:

./bin/llama-server --no-mmap

0.00.000.443 W main: setting n_parallel = 4 and kv_unified = true (add -kvu to disable this)
0.00.000.450 I build: 7376 (380b4c984) with AppleClang 17.0.0.17000319 for Darwin arm64 (debug)
0.00.000.466 I system info: n_threads = 16, n_threads_batch = 16, total_threads = 24
0.00.000.468 I 
0.00.000.483 I system_info: n_threads = 16 (n_threads_batch = 16) / 24 | Metal : EMBED_LIBRARY = 1 | CPU : NEON = 1 | ARM_FMA = 1 | FP16_VA = 1 | MATMUL_INT8 = 1 | DOTPROD = 1 | ACCELERATE = 1 | REPACK = 1 | 
0.00.000.485 I 
0.00.000.516 I init: using 23 threads for HTTP server
WARNING: Using native backtrace. Set GGML_BACKTRACE_LLDB for more info.
WARNING: GGML_BACKTRACE_LLDB may cause native MacOS Terminal.app to crash.
See: https://github.com/ggml-org/llama.cpp/pull/17869
0   libggml-base.0.9.4.dylib            0x0000000103fad4c8 ggml_print_backtrace_symbols + 48
1   libggml-base.0.9.4.dylib            0x0000000103fad2f8 ggml_print_backtrace + 160
2   libggml-base.0.9.4.dylib            0x0000000103fc43c8 _ZL23ggml_uncaught_exceptionv + 12
3   libc++abi.dylib                     0x00000001817f0c2c _ZSt11__terminatePFvvE + 16
4   libc++abi.dylib                     0x00000001817f4394 __cxa_get_exception_ptr + 0
5   libc++abi.dylib                     0x00000001817f433c _ZN10__cxxabiv1L12failed_throwEPNS_15__cxa_exceptionE + 0
6   llama-server                        0x000000010244ede0 _Z19common_params_parseiPPc13llama_exampleRNSt3__13mapI10common_argNS2_12basic_stringIcNS2_11char_traitsIcEENS2_9allocatorIcEEEENS2_4lessIS4_EENS8_INS2_4pairIKS4_SA_EEEEEE + 860
7   llama-server                        0x00000001022ded9c _ZN14server_presetsC2EiPPcR13common_paramsRKNSt3__112basic_stringIcNS4_11char_traitsIcEENS4_9allocatorIcEEEE + 852
8   llama-server                        0x00000001022df450 _ZN14server_presetsC1EiPPcR13common_paramsRKNSt3__112basic_stringIcNS4_11char_traitsIcEENS4_9allocatorIcEEEE + 60
9   llama-server                        0x00000001022e06bc _ZN13server_modelsC2ERK13common_paramsiPPcS4_ + 180
10  llama-server                        0x00000001022e1288 _ZN13server_modelsC1ERK13common_paramsiPPcS4_ + 60
11  llama-server                        0x00000001022ad8d0 _ZN20server_models_routesC2ERK13common_paramsiPPcS4_ + 80
12  llama-server                        0x00000001022ad870 _ZN20server_models_routesC1ERK13common_paramsiPPcS4_ + 60
13  llama-server                        0x00000001022ad824 _ZNSt3__114__construct_atB8ne200100I20server_models_routesJR13common_paramsRiRPPcS7_EPS1_EEPT_SA_DpOT0_ + 72
14  llama-server                        0x00000001022ad7bc _ZNSt3__123__optional_storage_baseI20server_models_routesLb0EE11__constructB8ne200100IJR13common_paramsRiRPPcS9_EEEvDpOT_ + 60
15  llama-server                        0x0000000102291e54 _ZNSt3__18optionalI20server_models_routesE7emplaceB8ne200100IJR13common_paramsRiRPPcS9_EvEERS1_DpOT_ + 68
16  llama-server                        0x000000010228f780 main + 1276
17  dyld                                0x0000000181475d54 start + 7184
libc++abi: terminating due to uncaught exception of type std::invalid_argument: error: invalid argument: --no-mmap
Abort trap: 6

@ngxson ngxson (Collaborator, Author) commented Dec 13, 2025

@ggerganov ah yeah, I forgot to change the function handling presets on the server (it only affects router mode). The fix is here: #17991
