This repo is a fork of https://github.com/Liyan06/MiniCheck (MiniCheck: Efficient Fact-Checking of LLMs on Grounding Documents [EMNLP 2024]), which depends on vLLM. In this fork, vLLM has been removed entirely and replaced with `transformers`. See below for installation and usage.
```shell
pip install "minicheck @ git+https://github.com/F4biian/MiniCheck-Transformers.git@main"
```

Note: this code does not work with the latest `transformers` version, which is why the dependency is forcefully pinned to `transformers==4.39.0`!
nltk might ask you to download `punkt_tab`. If so, run this once in a Python script:

```python
import nltk
nltk.download('punkt_tab')
```

The code for using Bespoke-MiniCheck-7B is almost the same as before, except that the parameters `tensor_parallel_size` and `enable_prefix_caching` have been removed from the `MiniCheck` class.
```python
import os
os.environ["CUDA_VISIBLE_DEVICES"] = "0"

from minicheck.minicheck import MiniCheck

doc = "A group of students gather in the school library to study for their upcoming final exams."
claim_1 = "The students are preparing for an examination."
claim_2 = "The students are on vacation."

# model_name can be one of:
# ['roberta-large', 'deberta-v3-large', 'flan-t5-large', 'Bespoke-MiniCheck-7B']
# bespokelabs/Bespoke-MiniCheck-7B will be auto-downloaded from Huggingface on first use.
# Bespoke-MiniCheck-7B is the most performant fact-checking model in the MiniCheck series AND
# it outperforms ALL existing specialized fact-checkers and off-the-shelf LLMs regardless of size.
scorer = MiniCheck(model_name='Bespoke-MiniCheck-7B')
pred_label, raw_prob, _, _ = scorer.score(docs=[doc, doc], claims=[claim_1, claim_2])

# Output of this repo's code:
print(pred_label)  # [1, 0]
print(raw_prob)    # [np.float64(0.9840496778488159), np.float64(0.010986425913870335)]

# Output of the original repo's code:
print(pred_label)  # [1, 0]
print(raw_prob)    # [0.9840446675150499, 0.010986349594852094]
```
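As the outputs above show, this fork returns the `raw_prob` entries as `np.float64` scalars rather than builtin floats. If downstream code expects plain Python floats, a minimal conversion sketch (the probability values here are simply copied from the example above):

```python
import numpy as np

# raw_prob as returned by this fork: a list of np.float64 scalars
raw_prob = [np.float64(0.9840496778488159), np.float64(0.010986425913870335)]

# convert each entry to a builtin Python float
plain = [float(p) for p in raw_prob]
print(plain)  # [0.9840496778488159, 0.010986425913870335]
```

The numeric values are unchanged by the conversion; only the scalar type differs.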