Loading fails when trying the Nemotron NVFP4-quantized version. The same issue occurs with `--dtypes nvfp4` and with `auto`. Any hint how to do this? The model runs fine on vLLM v0.14.1.

transformers==4.57.6
numpy==2.2
# heretic --dtypes nvfp4 --trust-remote-code true --model cybermotaz/nemotron3-nano-nvfp4-w4a16
█░█░█▀▀░█▀▄░█▀▀░▀█▀░█░█▀▀ v1.1.0
█▀█░█▀▀░█▀▄░█▀▀░░█░░█░█░░
▀░▀░▀▀▀░▀░▀░▀▀▀░░▀░░▀░▀▀▀ https://github.com/p-e-w/heretic
Detected 1 CUDA device(s):
* GPU 0: NVIDIA Thor
Loading model cybermotaz/nemotron3-nano-nvfp4-w4a16...
* Trying dtype nvfp4... Failed (NemotronHForCausalLM.__init__() got an unexpected keyword argument 'dtype')
╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ /opt/local/miniconda3/envs/vllm/bin/heretic:10 in <module> │
│ │
│ 7 │ │ sys.argv[0] = sys.argv[0][:-11] │
│ 8 │ elif sys.argv[0].endswith(".exe"): │
│ 9 │ │ sys.argv[0] = sys.argv[0][:-4] │
│ ❱ 10 │ sys.exit(main()) │
│ 11 │
│ │
│ /usr/src/heretic/src/heretic/main.py:888 in main │
│ │
│ 885 │ install() │
│ 886 │ │
│ 887 │ try: │
│ ❱ 888 │ │ run() │
│ 889 │ except BaseException as error: │
│ 890 │ │ # Transformers appears to handle KeyboardInterrupt (or BaseException) │
│ 891 │ │ # internally in some places, which can re-raise a different error in the handler │
│ │
│ /usr/src/heretic/src/heretic/main.py:307 in run │
│ │
│ 304 │ │ elif choice is None or choice == "": │
│ 305 │ │ │ return │
│ 306 │ │
│ ❱ 307 │ model = Model(settings) │
│ 308 │ print() │
│ 309 │ print_memory_usage() │
│ 310 │
│ │
│ /usr/src/heretic/src/heretic/model.py:145 in __init__ │
│ │
│ 142 │ │ │ break │
│ 143 │ │ │
│ 144 │ │ if self.model is None: │
│ ❱ 145 │ │ │ raise Exception("Failed to load model with all configured dtypes.") │
│ 146 │ │ │
│ 147 │ │ self._apply_lora() │
│ 148 │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
Exception: Failed to load model with all configured dtypes.
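The traceback points at a keyword mismatch: transformers passes a `dtype` keyword down to the model constructor, but the model's remote code (`NemotronHForCausalLM.__init__`) does not accept it. A minimal sketch of how to check this, using a hypothetical `DummyModel` stand-in rather than the real remote class (which would require downloading the model):

```python
import inspect

# Hypothetical stand-in for the remote NemotronHForCausalLM.__init__,
# which (per the traceback above) does not accept a `dtype` keyword.
class DummyModel:
    def __init__(self, config, torch_dtype=None):
        self.config = config
        self.torch_dtype = torch_dtype

def accepts_kwarg(cls, name):
    """Return True if cls.__init__ takes the given keyword argument
    (either explicitly or via **kwargs)."""
    params = inspect.signature(cls.__init__).parameters
    return name in params or any(
        p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()
    )

print(accepts_kwarg(DummyModel, "dtype"))        # False -> matches the error
print(accepts_kwarg(DummyModel, "torch_dtype"))  # True
```

If the real remote class behaves like this sketch, the mismatch would explain why every configured dtype fails the same way, independent of the value passed.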