MultiturnRL/prm-o1

Main repo for the PRM-o1 research project.

TODO

  1. Run an experiment on the steps produced by Llama 3.1 8B Instruct (how many exceed the length limit, what the distribution of step lengths looks like, etc.)
  2. Implement an LLM step-expansion routine that generates N candidate steps, each shorter than the max token limit, such that the resulting paths are semantically novel
  3. Implement A* search using the QVM and a PRM (maybe use the Skywork PRM?)
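
A rough sketch of how items 2 and 3 might fit together, not the project's implementation. Everything named here is hypothetical: `filter_steps`, `a_star`, and the callables `propose`, `prm_score`, `qvm_value`, and `is_terminal` are placeholders for the LLM, the PRM, and the QVM. The sketch assumes the PRM returns a per-step score in (0, 1], so path cost is the sum of negative log scores, and it treats the QVM output as a value in (0, 1] whose negative log serves as the heuristic. Whitespace token counts and Jaccard word overlap stand in for a real tokenizer and a real semantic-novelty check. Unless the heuristic is admissible, this is best-first search with A*-style scoring and carries no optimality guarantee.

```python
import heapq
import itertools
import math
from typing import Callable, List, Optional, Tuple

Path = Tuple[str, ...]  # a partial solution: the steps chosen so far


def _jaccard(a: set, b: set) -> float:
    """Word-overlap similarity; a crude stand-in for semantic similarity."""
    return len(a & b) / len(a | b)


def filter_steps(candidates: List[str], max_tokens: int, n: int,
                 sim_threshold: float = 0.8) -> List[str]:
    """Keep up to n candidates that fit the length budget and are not near-duplicates."""
    kept: List[str] = []
    for step in candidates:
        words = set(step.lower().split())
        # Whitespace tokens approximate the model tokenizer (assumption).
        if not words or len(step.split()) > max_tokens:
            continue
        if any(_jaccard(words, set(k.lower().split())) >= sim_threshold for k in kept):
            continue
        kept.append(step)
        if len(kept) == n:
            break
    return kept


def a_star(propose: Callable[[Path], List[str]],
           prm_score: Callable[[Path], float],
           qvm_value: Callable[[Path], float],
           is_terminal: Callable[[Path], bool],
           max_tokens: int = 64, n: int = 4,
           max_expansions: int = 1000) -> Optional[Path]:
    """Search over step sequences with f = g + h.

    g: accumulated -log(PRM score) of each step (assumed cost model).
    h: -log(QVM value) of the current path (assumed heuristic).
    """
    tie = itertools.count()  # tie-breaker so the heap never compares paths
    frontier = [(0.0, next(tie), 0.0, ())]  # entries: (f, tie, g, path)
    expansions = 0
    while frontier and expansions < max_expansions:
        _, _, g, path = heapq.heappop(frontier)
        if path and is_terminal(path):
            return path
        expansions += 1
        for step in filter_steps(propose(path), max_tokens, n):
            new_path = path + (step,)
            g_new = g - math.log(max(prm_score(new_path), 1e-9))
            h = -math.log(max(qvm_value(new_path), 1e-9))
            heapq.heappush(frontier, (g_new + h, next(tie), g_new, new_path))
    return None


# Toy usage: reach a running total of 5 using "add 1" / "add 2" steps.
def total(path: Path) -> int:
    return sum(int(step.split()[1]) for step in path)


def toy_propose(path: Path) -> List[str]:
    # Includes a duplicate and an over-long step to exercise the filter.
    return ["add 1", "add 2", "add 2", "this step is far too long to keep around"]


def toy_prm(path: Path) -> float:
    return 0.9 if total(path) <= 5 else 0.1


def toy_qvm(path: Path) -> float:
    return 1.0 / (1.0 + abs(5 - total(path)))


result = a_star(toy_propose, toy_prm, toy_qvm,
                is_terminal=lambda p: total(p) == 5, max_tokens=4)
print(result)
```

In a real run, `propose` would sample continuations from the LLM, and the toy scoring functions would be replaced by calls to the PRM and QVM. The filter would likewise count tokens with the model's tokenizer and compare candidates with an embedding-based similarity rather than word overlap.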

Notes
