Natural language to First-Order Logic (FOL) translation for ChEBI.
This example uses the Mistral FOL model: https://huggingface.co/fvossel/Mistral-Small-24B-Instruct-2501-nl-to-fol
Convert the Mistral model to a merged format by calling `convert_mistral_to_gguf` from `nl_2_fol/prompting/custom_api/_to_gguf.py`.
Why this step matters:
- Hugging Face checkpoints are often split across multiple files.
- The conversion pipeline expects a clean merged model directory as input.
What this step does:
- Collects and organizes model artifacts into a local `mistral-merged` folder.
- Ensures the tokenizer/config/weights are in a format that the `llama.cpp` conversion can read.
Expected result:
- A `mistral-merged` directory exists in your workspace and is ready for GGUF conversion.
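A minimal sketch of invoking the merge helper from the repository root, assuming `nl_2_fol` is importable from there. The exact signature of `convert_mistral_to_gguf` is defined in `_to_gguf.py`; check it before running, since it may require arguments such as an output directory.

```bash
# Hedged sketch: run from the repository root so nl_2_fol is importable.
# convert_mistral_to_gguf may take arguments (e.g., an output path);
# consult nl_2_fol/prompting/custom_api/_to_gguf.py for the actual signature.
python - <<'EOF'
from nl_2_fol.prompting.custom_api._to_gguf import convert_mistral_to_gguf

convert_mistral_to_gguf()  # adjust arguments to match the real signature
EOF
```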
This step prepares two required components:
- `llama.cpp`, which provides the `convert_hf_to_gguf.py` conversion script.
- A user-local Ollama installation, useful on clusters where you do not have `sudo` access.
```bash
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
pip install -r requirements.txt
```

If you do not have root access on the HPC cluster, install Ollama in your home directory:
```bash
mkdir -p "$HOME/ollama"
cd "$HOME/ollama"
curl -L -o ollama-linux-amd64.tar.zst https://ollama.com/download/ollama-linux-amd64.tar.zst
unzstd ollama-linux-amd64.tar.zst
tar -xf ollama-linux-amd64.tar
echo 'export PATH=$HOME/ollama/bin:$PATH' >> ~/.bashrc
source ~/.bashrc
ollama --version
```

Expected result:
- `ollama --version` prints a version string.
- You can run `ollama` commands without a system-wide installation.
From inside the `llama.cpp` directory:

```bash
python convert_hf_to_gguf.py ../mistral-merged --outfile mistral.gguf
```

Why this step matters:
- Ollama loads local models through GGUF files.
- This command translates the merged Hugging Face model into a runtime format Ollama can serve.
Expected result:
- A file named `mistral.gguf` is created.
- The conversion may take time and use significant CPU/RAM depending on model size.
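An optional quick check that the conversion produced the expected output file (the filename matches the command above):

```bash
# Confirm the GGUF file exists and inspect its size;
# expect a large file (tens of GB at 16-bit precision for a 24B model).
ls -lh mistral.gguf
```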
Run the Ollama server in the background with the commands below, so it keeps running while you execute your script or other commands in the same terminal.
```bash
export OLLAMA_HOST=http://localhost:<your_custom_port>
export OLLAMA_TIMEOUT=180   # in seconds
ollama serve > ollama.log 2>&1 &
OLLAMA_PID=$!
```
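An optional health check, assuming the server listens on the port you exported in `OLLAMA_HOST` (keep `<your_custom_port>` consistent with the value above):

```bash
# Wait briefly for startup, then probe the server.
sleep 5
curl -s http://localhost:<your_custom_port>/          # should report that Ollama is running
curl -s http://localhost:<your_custom_port>/api/tags  # JSON list of locally registered models
tail -n 20 ollama.log                                 # startup errors, if any, land here
```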
After you are done with Ollama, stop the server cleanly with:

```bash
kill $OLLAMA_PID 2>/dev/null
wait $OLLAMA_PID 2>/dev/null
```

Create a `Modelfile` in the directory containing `mistral.gguf` with:
```
FROM ./mistral.gguf
```
Then run:
```bash
ollama create my-mistral -f Modelfile
ollama list
```

Why this step matters:
- `ollama create` registers your GGUF file under a model name (`my-mistral`).
- After registration, you can refer to the model by name in CLI calls.
Expected result:
- `ollama list` shows `my-mistral`.
- You only need to run `ollama create ...` once per model build.
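As an optional smoke test before wiring up the project CLI, you can send a single prompt to the registered model with `ollama run`; the prompt text below is only an illustration, not a required format:

```bash
# One-off prompt to verify the registered model responds.
ollama run my-mistral "Translate to first-order logic: every alkane is a hydrocarbon."
```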
This final step sends requests from your project CLI to the locally running Ollama server. On some clusters, proxy variables can interfere with localhost routing, so exclude localhost from proxying first if needed:

```bash
export NO_PROXY=127.0.0.1,localhost,.local
export no_proxy=127.0.0.1,localhost,.local
```

Then run:

```bash
python nl_2_fol/inference/cli.py --api_platform="ollama" --model_name="my-mistral"
```

IMPORTANT: Ensure `ollama serve` and the inference command run on the same compute node, or within the same allocated job/session if applicable.
For example, if `ollama serve` was started on hpc3-52 but the inference command runs on hpc3-54, the connection might fail.
Expected result:
- The CLI connects to your local Ollama instance.
- The `my-mistral` model is used for NL-to-FOL inference.
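For clusters that use SLURM (an assumption; adapt the directives and resource requests to your scheduler), one way to guarantee that the server and the CLI share a node is to start both inside the same job script:

```bash
#!/bin/bash
#SBATCH --job-name=nl2fol-ollama
#SBATCH --nodes=1              # keep server and client on one node
#SBATCH --time=02:00:00        # example value; adjust to your workload

export OLLAMA_HOST=http://localhost:<your_custom_port>
export NO_PROXY=127.0.0.1,localhost,.local
export no_proxy=127.0.0.1,localhost,.local

# Start the server in the background on this node, then run inference against it.
ollama serve > ollama.log 2>&1 &
OLLAMA_PID=$!
sleep 10   # give the server time to come up

python nl_2_fol/inference/cli.py --api_platform="ollama" --model_name="my-mistral"

# Stop the server when inference finishes.
kill $OLLAMA_PID 2>/dev/null
wait $OLLAMA_PID 2>/dev/null
```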