Already filed via Slack last Friday, but filing here to track it.
Even with `trust_remote_code=True` passed on the CLI, the MiniMax MXFP4 config path doesn't use it, leading to the bug below.
+ vllm serve amd/MiniMax-M2.5-MXFP4 --port 8888 --tensor-parallel-size=2 --gpu-memory-utilization 0.95 --max-model-len 2248 --block-size=32 --trust-remote-code
WARNING 03-20 02:24:48 [gpt_oss_triton_kernels_moe.py:56] Using legacy triton_kernels on ROCm
/usr/local/lib/python3.12/dist-packages/vllm/entrypoints/openai/chat_completion/protocol.py:346: SyntaxWarning: invalid escape sequence '\e'
"(e.g. 'abcdabcdabcd...' or '\emoji \emoji \emoji ...'). This feature "
/usr/local/lib/python3.12/dist-packages/vllm/entrypoints/openai/completion/protocol.py:176: SyntaxWarning: invalid escape sequence '\e'
"(e.g. 'abcdabcdabcd...' or '\emoji \emoji \emoji ...'). This feature "
(APIServer pid=1169773) INFO 03-20 02:24:49 [utils.py:302]
(APIServer pid=1169773) INFO 03-20 02:24:49 [utils.py:302] █ █ █▄ ▄█
(APIServer pid=1169773) INFO 03-20 02:24:49 [utils.py:302] ▄▄ ▄█ █ █ █ ▀▄▀ █ version 0.17.1
(APIServer pid=1169773) INFO 03-20 02:24:49 [utils.py:302] █▄█▀ █ █ █ █ model amd/MiniMax-M2.5-MXFP4
(APIServer pid=1169773) INFO 03-20 02:24:49 [utils.py:302] ▀▀ ▀▀▀▀▀ ▀▀▀▀▀ ▀ ▀
(APIServer pid=1169773) INFO 03-20 02:24:49 [utils.py:302]
(APIServer pid=1169773) INFO 03-20 02:24:49 [utils.py:238] non-default args: {'model_tag': 'amd/MiniMax-M2.5-MXFP4', 'port': 8888, 'model': 'amd/MiniMax-M2.5-MXFP4', 'trust_remote_code': True, 'max_model_len': 2248, 'tensor_parallel_size': 2, 'block_size': 32, 'gpu_memory_utilization': 0.95}
(APIServer pid=1169773) The argument `trust_remote_code` is to be used with Auto classes. It has no effect here and is ignored.
(APIServer pid=1169773) [2026-03-20 02:24:49] WARNING configuration_utils.py:697: The argument `trust_remote_code` is to be used with Auto classes. It has no effect here and is ignored.
(APIServer pid=1169773) The argument `trust_remote_code` is to be used with Auto classes. It has no effect here and is ignored.
(APIServer pid=1169773) [2026-03-20 02:24:49] WARNING configuration_utils.py:697: The argument `trust_remote_code` is to be used with Auto classes. It has no effect here and is ignored.
(APIServer pid=1169773) INFO 03-20 02:24:56 [model.py:531] Resolved architecture: MiniMaxM2ForCausalLM
(APIServer pid=1169773) INFO 03-20 02:24:56 [model.py:1554] Using max model len 2248
(APIServer pid=1169773) [aiter] start build [module_aiter_enum] under /usr/local/lib/python3.12/dist-packages/aiter/jit/build/module_aiter_enum
(APIServer pid=1169773) [2026-03-20 02:24:56] INFO core.py:549: start build [module_aiter_enum] under /usr/local/lib/python3.12/dist-packages/aiter/jit/build/module_aiter_enum
(APIServer pid=1169773) [aiter] finish build [module_aiter_enum], cost 7.8s
(APIServer pid=1169773) [2026-03-20 02:25:04] INFO core.py:699: finish build [module_aiter_enum], cost 7.8s
(APIServer pid=1169773) [aiter] import [module_aiter_enum] under /usr/local/lib/python3.12/dist-packages/aiter/jit/module_aiter_enum.so
(APIServer pid=1169773) [2026-03-20 02:25:04] INFO core.py:501: import [module_aiter_enum] under /usr/local/lib/python3.12/dist-packages/aiter/jit/module_aiter_enum.so
(APIServer pid=1169773) INFO 03-20 02:25:04 [scheduler.py:231] Chunked prefill is enabled with max_num_batched_tokens=8192.
(APIServer pid=1169773) Traceback (most recent call last):
(APIServer pid=1169773) File "/usr/local/bin/vllm", line 10, in <module>
(APIServer pid=1169773) sys.exit(main())
(APIServer pid=1169773) ^^^^^^
(APIServer pid=1169773) File "/usr/local/lib/python3.12/dist-packages/vllm/entrypoints/cli/main.py", line 73, in main
(APIServer pid=1169773) args.dispatch_function(args)
(APIServer pid=1169773) File "/usr/local/lib/python3.12/dist-packages/vllm/entrypoints/cli/serve.py", line 112, in cmd
(APIServer pid=1169773) uvloop.run(run_server(args))
(APIServer pid=1169773) File "/usr/local/lib/python3.12/dist-packages/uvloop/__init__.py", line 96, in run
(APIServer pid=1169773) return __asyncio.run(
(APIServer pid=1169773) ^^^^^^^^^^^^^^
(APIServer pid=1169773) File "/usr/lib/python3.12/asyncio/runners.py", line 195, in run
(APIServer pid=1169773) return runner.run(main)
(APIServer pid=1169773) ^^^^^^^^^^^^^^^^
(APIServer pid=1169773) File "/usr/lib/python3.12/asyncio/runners.py", line 118, in run
(APIServer pid=1169773) return self._loop.run_until_complete(task)
(APIServer pid=1169773) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=1169773) File "uvloop/loop.pyx", line 1518, in uvloop.loop.Loop.run_until_complete
(APIServer pid=1169773) File "/usr/local/lib/python3.12/dist-packages/uvloop/__init__.py", line 48, in wrapper
(APIServer pid=1169773) return await main
(APIServer pid=1169773) ^^^^^^^^^^
(APIServer pid=1169773) File "/usr/local/lib/python3.12/dist-packages/vllm/entrypoints/openai/api_server.py", line 471, in run_server
(APIServer pid=1169773) await run_server_worker(listen_address, sock, args, **uvicorn_kwargs)
(APIServer pid=1169773) File "/usr/local/lib/python3.12/dist-packages/vllm/entrypoints/openai/api_server.py", line 490, in run_server_worker
(APIServer pid=1169773) async with build_async_engine_client(
(APIServer pid=1169773) ^^^^^^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=1169773) File "/usr/lib/python3.12/contextlib.py", line 210, in __aenter__
(APIServer pid=1169773) return await anext(self.gen)
(APIServer pid=1169773) ^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=1169773) File "/usr/local/lib/python3.12/dist-packages/vllm/entrypoints/openai/api_server.py", line 96, in build_async_engine_client
(APIServer pid=1169773) async with build_async_engine_client_from_engine_args(
(APIServer pid=1169773) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=1169773) File "/usr/lib/python3.12/contextlib.py", line 210, in __aenter__
(APIServer pid=1169773) return await anext(self.gen)
(APIServer pid=1169773) ^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=1169773) File "/usr/local/lib/python3.12/dist-packages/vllm/entrypoints/openai/api_server.py", line 122, in build_async_engine_client_from_engine_args
(APIServer pid=1169773) vllm_config = engine_args.create_engine_config(usage_context=usage_context)
(APIServer pid=1169773) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=1169773) File "/usr/local/lib/python3.12/dist-packages/vllm/engine/arg_utils.py", line 1890, in create_engine_config
(APIServer pid=1169773) config = VllmConfig(
(APIServer pid=1169773) ^^^^^^^^^^^
(APIServer pid=1169773) File "/usr/local/lib/python3.12/dist-packages/pydantic/_internal/_dataclasses.py", line 121, in __init__
(APIServer pid=1169773) s.__pydantic_validator__.validate_python(ArgsKwargs(args, kwargs), self_instance=s)
(APIServer pid=1169773) pydantic_core._pydantic_core.ValidationError: 1 validation error for VllmConfig
(APIServer pid=1169773) Value error, The repository amd/MiniMax-M2.5-MXFP4 contains custom code which must be executed to correctly load the model. You can inspect the repository content at https://hf.co/amd/MiniMax-M2.5-MXFP4 .
(APIServer pid=1169773) You can inspect the repository content at https://hf.co/amd/MiniMax-M2.5-MXFP4.
(APIServer pid=1169773) Please pass the argument `trust_remote_code=True` to allow custom code to be run. [type=value_error, input_value=ArgsKwargs((), {'model_co...transfer_config': None}), input_type=ArgsKwargs]
(APIServer pid=1169773) For further information visit https://errors.pydantic.dev/2.12/v/value_error
Your current environment
image:
vllm/vllm-openai-rocm:v0.17.1

🐛 Describe the bug
This is a blocker for merging SemiAnalysisAI/InferenceX#827.
Seems like @hongxiayang is already working on a fix:
#37698
https://github.com/SemiAnalysisAI/InferenceX/actions/runs/23326389246/job/67848378566?pr=827
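For context, here is a toy sketch of the failure mode (this is NOT vLLM's or transformers' actual code; the function name and flags are made up for illustration). The warning in the log ("The argument `trust_remote_code` is to be used with Auto classes. It has no effect here and is ignored.") suggests the config is being loaded through a concrete config class that drops the flag, so the downstream custom-code check never sees it and raises:

```python
# Toy reproduction of the reported behavior: the CLI accepts
# trust_remote_code, but a non-Auto config path silently drops it,
# so the custom-code validation fails anyway.

def load_config(repo_has_custom_code: bool,
                trust_remote_code: bool,
                via_auto_class: bool) -> dict:
    # Concrete (non-Auto) config classes ignore the flag, mirroring the
    # "has no effect here and is ignored" warning in the log above.
    effective_trust = trust_remote_code if via_auto_class else False
    if repo_has_custom_code and not effective_trust:
        # Mirrors the pydantic ValidationError's underlying ValueError.
        raise ValueError(
            "The repository contains custom code which must be executed "
            "to correctly load the model. Please pass the argument "
            "`trust_remote_code=True`."
        )
    return {"trust_remote_code": effective_trust}

# Auto-class path: the flag is honored and loading succeeds.
assert load_config(True, True, via_auto_class=True)["trust_remote_code"]

# Bugged path: flag passed, but routed through a concrete config class.
try:
    load_config(True, True, via_auto_class=False)
except ValueError as e:
    print("reproduced:", e)
```

Again, this is just a model of the symptom: `trust_remote_code=True` is present in the parsed args (see the "non-default args" log line) but is not in effect by the time `VllmConfig` validates the repo's custom code.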