RLHF_Sentiment_Alignment

RLHF implementation of the sentiment alignment experiment from the DPO paper.

data
evaluation
notebooks
ppo
README.md
generate_imdb_pairs.py
requirements.txt
