SIGILL crash on CPUs without AVX-512
As soon as inference starts (on the first token), dlgo crashes with SIGILL: illegal instruction on a CPU that supports AVX2 but not AVX-512. The i7-7700K supports SSE4.2, AVX, AVX2, and FMA3, but not AVX-512. The crash originates in the SIMD quantization code:
SIGILL: illegal instruction
github.com/computerex/dlgo/blas.BatchQuantizeForType(...)
github.com/computerex/dlgo/blas.QBatchGEMMParallel(...)
The SIMD kernels appear to be compiled assuming AVX-512 availability. Would be great to have a fallback to AVX2 for pre-2019 consumer hardware (especially for a Vulkan-focused project like this)!
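To illustrate the kind of fallback I mean, here is a minimal sketch of runtime kernel selection. I have not read how dlgo dispatches its kernels, so everything here is hypothetical: the cpuFeatures struct, the selectKernel function, and the kernel names are made up for illustration. In a real build the flags could be filled from golang.org/x/sys/cpu (cpu.X86.HasAVX2, cpu.X86.HasFMA, cpu.X86.HasAVX512F).

```go
package main

import "fmt"

// cpuFeatures is a hypothetical struct holding only the flags that matter
// for picking a SIMD kernel. It is not part of dlgo.
type cpuFeatures struct {
	AVX2    bool
	FMA     bool
	AVX512F bool
}

// selectKernel returns the name of the widest kernel the CPU can run,
// falling back to a portable implementation when no SIMD path applies.
// The kernel names are placeholders, not dlgo identifiers.
func selectKernel(f cpuFeatures) string {
	switch {
	case f.AVX512F:
		return "avx512"
	case f.AVX2 && f.FMA:
		return "avx2"
	default:
		return "generic"
	}
}

func main() {
	// Flags matching the i7-7700K from this report: AVX2 and FMA3, no AVX-512.
	kabyLake := cpuFeatures{AVX2: true, FMA: true, AVX512F: false}
	fmt.Println(selectKernel(kabyLake)) // prints "avx2"
}
```

With a check like this done once at startup, a CPU like mine would take the AVX2 path instead of hitting an AVX-512 instruction.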
RAM availability underreported on Linux
On an idle Xfce desktop with 48GB of RAM installed and virtually nothing running (the task manager shows only 1.7GB used), dlgo reports only ~1.2GB free and reduces the context from 8192 to 1024 tokens.
--gpu flag silently falls back to CPU with no explanation
Even with --gpu specified, dlgo ran with Backend: CPU (8 threads) and printed no warning or error message. Vulkan drivers are installed and working (confirmed with other Vulkan applications and llama.cpp Vulkan builds). Either dlgo failed to initialize the Vulkan backend and fell back to CPU without saying so, or it never attempted to use the GPU at all.
System: iMac 18,3, Linux Mint, i7-7700K, RX 580 / Radeon Pro 580 8GB, 48GB RAM, kernel 6.8
Extremely interested in this project as it would be a complete game changer for my hardware. Let me know if you need any additional information.