I have a spare internal SATA III SSD that I wasn't using for anything, so I installed the latest stable Ubuntu 24.04 on it to test Heretic with Qwen3.5-9B.
tester@test:~$ cat /proc/version;cat /etc/issue;nvidia-smi|grep -e CUDA -e RTX
Linux version 6.17.0-19-generic (buildd@lcy02-amd64-019) (x86_64-linux-gnu-gcc-13 (Ubuntu 13.3.0-6ubuntu2~24.04.1) 13.3.0, GNU ld (GNU Binutils for Ubuntu) 2.42) #19~24.04.2-Ubuntu SMP PREEMPT_DYNAMIC Fri Mar 6 23:08:46 UTC 2
Ubuntu 24.04.4 LTS \n \l
| NVIDIA-SMI 580.126.09 Driver Version: 580.126.09 CUDA Version: 13.0 |
| 0 NVIDIA GeForce RTX 3060 Off | 00000000:01:00.0 On | N/A |
tester@test:~$
The description at https://huggingface.co/Qwen/Qwen3.5-9B gives command syntax for installing the latest transformers, though I had to add --break-system-packages since it otherwise wouldn't run: pip install "transformers[serving] @ git+https://github.com/huggingface/transformers.git@main" --break-system-packages.
Then I ran pip install --upgrade git+https://github.com/p-e-w/heretic.git --break-system-packages for the latest commits. The output ended with:
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
heretic-llm 1.2.0 requires transformers~=5.3, but you have transformers 5.3.0.dev0 which is incompatible.
Successfully installed transformers-5.3.0.dev0
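For context on that conflict: under Python's version ordering, a .dev0 build sorts before the final release, so 5.3.0.dev0 does not satisfy transformers~=5.3 (which means >=5.3 and ==5.*). To see which versions are actually installed at any point, a minimal sketch like the following should work. It uses only the standard library's importlib.metadata; the helper names and the list of packages are assumptions chosen for illustration, not something Heretic defines.

```python
from importlib import metadata


def installed_version(name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def report(names):
    """Map each distribution name to its installed version (or None)."""
    return {name: installed_version(name) for name in names}


if __name__ == "__main__":
    # Assumed list of packages relevant to this setup; adjust as needed.
    packages = ["heretic-llm", "transformers", "torch", "bitsandbytes", "accelerate"]
    for name, version in report(packages).items():
        print(f"{name:15} {version or 'not installed'}")
```

Running it with the same python3 that runs Heretic matters, since it only sees packages visible to that interpreter.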
Then I ran pip install "transformers[serving] @ git+https://github.com/huggingface/transformers.git@v5.3.0" --break-system-packages, which replaced 5.3.0.dev0 with 5.3.0. Then I tried running Heretic; here is the result, with some lines redacted for brevity:
tester@test:~$ PYTORCH_ALLOC_CONF=expandable_segments:True heretic --model Desktop/Qwen3.5-9B --quantization NONE --device-map auto --max-memory '{"0": "9GB", "cpu": "28GB"}' --batch-size 32 --print-responses
Traceback (most recent call last):
File "/home/tester/.local/lib/python3.12/site-packages/transformers/utils/import_utils.py", line 2096, in __getattr__
module = self._get_module(self._class_to_module[name])
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[REDACTED for BREVITY]
ModuleNotFoundError: Could not import module 'PreTrainedModel'. Are this object's requirements defined correctly?
tester@test:~$
So I ran pip install --upgrade torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu130 --break-system-packages, which fixed that particular error. Next Heretic run:
tester@test:~$ PYTORCH_ALLOC_CONF=expandable_segments:True heretic --model Desktop/Qwen3.5-9B --quantization NONE --device-map auto --max-memory '{"0": "9GB", "cpu": "28GB"}' --batch-size 32 --print-responses
bitsandbytes library load error: libnvJitLink.so.13: cannot open shared object file: No such file or directory
Traceback (most recent call last):
File "/home/tester/.local/lib/python3.12/site-packages/bitsandbytes/cextension.py", line 320, in <module>
lib = get_native_library()
^^^^^^^^^^^^^^^^^^^^
File "/home/tester/.local/lib/python3.12/site-packages/bitsandbytes/cextension.py", line 298, in get_native_library
dll = ct.cdll.LoadLibrary(str(binary_path))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3.12/ctypes/__init__.py", line 460, in LoadLibrary
return self._dlltype(name)
^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3.12/ctypes/__init__.py", line 379, in __init__
self._handle = _dlopen(self._name, mode)
^^^^^^^^^^^^^^^^^^^^^^^^^
OSError: libnvJitLink.so.13: cannot open shared object file: No such file or directory
█░█░█▀▀░█▀▄░█▀▀░▀█▀░█░█▀▀ v1.2.0
█▀█░█▀▀░█▀▄░█▀▀░░█░░█░█░░
▀░▀░▀▀▀░▀░▀░▀▀▀░░▀░░▀░▀▀▀ https://github.com/p-e-w/heretic
Detected 1 CUDA device(s) (11.62 GB total VRAM):
* GPU 0: NVIDIA GeForce RTX 3060 (11.62 GB)
Heretic kept going despite the error, but nobody else seems to be seeing this, so I pressed Ctrl+C. Maybe bitsandbytes has trouble with CUDA 13?
I tried exporting an environment variable:
tester@test:~$ export BNB_CUDA_VERSION=130
tester@test:~$ PYTORCH_ALLOC_CONF=expandable_segments:True heretic --model Desktop/Qwen3.5-9B --quantization NONE --device-map auto --max-memory '{"0": "9GB", "cpu": "28GB"}' --batch-size 32 --print-responses
WARNING: BNB_CUDA_VERSION=130 environment variable detected; loading libbitsandbytes_cuda130.so.
This can be used to load a bitsandbytes version built with a CUDA version that is different from the PyTorch CUDA version.
If this was unintended set the BNB_CUDA_VERSION variable to an empty string: export BNB_CUDA_VERSION=
bitsandbytes library load error: libnvJitLink.so.13: cannot open shared object file: No such file or directory
Traceback (most recent call last):
File "/home/tester/.local/lib/python3.12/site-packages/bitsandbytes/cextension.py", line 320, in <module>
lib = get_native_library()
^^^^^^^^^^^^^^^^^^^^
File "/home/tester/.local/lib/python3.12/site-packages/bitsandbytes/cextension.py", line 298, in get_native_library
dll = ct.cdll.LoadLibrary(str(binary_path))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3.12/ctypes/__init__.py", line 460, in LoadLibrary
return self._dlltype(name)
^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3.12/ctypes/__init__.py", line 379, in __init__
self._handle = _dlopen(self._name, mode)
^^^^^^^^^^^^^^^^^^^^^^^^^
OSError: libnvJitLink.so.13: cannot open shared object file: No such file or directory
█░█░█▀▀░█▀▄░█▀▀░▀█▀░█░█▀▀ v1.2.0
█▀█░█▀▀░█▀▄░█▀▀░░█░░█░█░░
▀░▀░▀▀▀░▀░▀░▀▀▀░░▀░░▀░▀▀▀ https://github.com/p-e-w/heretic
^C
Shutting down...
Then I tried adding the library's directory to LD_LIBRARY_PATH:
tester@test:~$ find ~/.local -name "libnvJitLink.so.13"
/home/tester/.local/lib/python3.12/site-packages/nvidia/cu13/lib/libnvJitLink.so.13
tester@test:~$ export LD_LIBRARY_PATH=$HOME/.local/lib/python3.12/site-packages/nvidia/cu13/lib:$LD_LIBRARY_PATH
tester@test:~$ source .profile
Heretic still produced the "libnvJitLink.so.13" error, although I just realized that may be because I didn't run Heretic in the same terminal window where I had run the export and source .profile commands above. An exported variable only applies to the shell where it was set (and to programs started from it), so other terminal windows would not have seen the new LD_LIBRARY_PATH. While typing this post, I tried Heretic in the terminal window where I had run those commands, and it ran fine. I'm not certain that was the issue, but it might be.
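A quick way to check whether a given terminal session can find the library is to try loading it from a Python process started in that same session. This is a minimal sketch using ctypes from the standard library; the helper name can_load is hypothetical. Loading libnvJitLink.so.13 directly is only a proxy for what the traceback shows (bitsandbytes loading its own binary, which in turn needs this library), so a success here is a hint rather than a guarantee.

```python
import ctypes
import os


def can_load(library_name):
    """Return True if the dynamic loader can open the library in this process."""
    try:
        ctypes.CDLL(library_name)
        return True
    except OSError:
        # Same exception type as in the bitsandbytes traceback above.
        return False


if __name__ == "__main__":
    # LD_LIBRARY_PATH is inherited from the shell that started this process.
    print("LD_LIBRARY_PATH =", os.environ.get("LD_LIBRARY_PATH", "(unset)"))
    print("libnvJitLink.so.13 loadable:", can_load("libnvJitLink.so.13"))
```

If this prints False in one terminal window and True in another, the difference is most likely the environment of each shell rather than the installed packages.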
With Heretic now running, Qwen3.5-9B still gives me garbled output: nonsensical mashups of Chinese, Russian, Greek, Latin, and other characters. I even re-downloaded the model with hf download in case my earlier download was corrupted. I'm thinking maybe bitsandbytes, transformers, accelerate, or some other dependency doesn't play nicely with CUDA 13 or Qwen3.5-9B. Should I try CUDA 12?
My main question is: how can each prospective Heretic user be sure to get the correct dependencies at the correct versions? My guess is that this can vary between components and operating systems.
I can re-do the fresh install of Ubuntu 24.04 and try again, but is there some more efficient and less error-prone way of installing Heretic and its dependencies than what I've been doing?
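For reference, one common way to avoid --break-system-packages altogether is to install everything into a virtual environment, which keeps these packages separate from the system Python. The sketch below is untested against this setup and is not claimed to fix the garbled output. It uses the standard library's venv module; the function names and the idea of reusing exactly the pip commands from this post are assumptions. The second function only builds the command lines and does not run anything.

```python
import venv
from pathlib import Path


def create_env(env_dir, with_pip=True):
    """Create a virtual environment and return the path to its interpreter."""
    env_dir = Path(env_dir).expanduser()
    venv.create(env_dir, with_pip=with_pip)
    # POSIX layout (Linux); on Windows the interpreter is under Scripts instead.
    return env_dir / "bin" / "python"


def install_commands(env_python):
    """Build, but do not run, the pip commands used earlier in this post."""
    pip = [str(env_python), "-m", "pip", "install", "--upgrade"]
    return [
        # PyTorch wheels for CUDA 13.0, as installed above.
        pip + ["torch", "torchvision", "torchaudio",
               "--index-url", "https://download.pytorch.org/whl/cu130"],
        # Heretic from its latest commits, as installed above.
        pip + ["git+https://github.com/p-e-w/heretic.git"],
    ]
```

Each command could then be executed with subprocess.run(cmd, check=True), for example after env_python = create_env("~/heretic-env"), where the directory name is arbitrary. On Ubuntu, creating an environment with pip included may first require the python3-venv package.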