Skip to content

Lack of Implementation for Large Language Models (LLMs) #1

@joelonsql

Description

@joelonsql

In the published paper on FunSearch, there is a mention of using pre-trained large language models (LLMs) like Codey (based on the PaLM2 model family) and a reference to StarCoder, an open-source LLM, in the supplementary information. However, the current GitHub repository for FunSearch does not include implementations or integration guidelines for these LLMs.

This issue is particularly evident in the sampler.py file, where the LLM class seems to be a placeholder without an actual implementation:

class LLM:
  """Language model that predicts continuation of provided source code."""

  def __init__(self, samples_per_prompt: int) -> None:
    self._samples_per_prompt = samples_per_prompt

  def _draw_sample(self, prompt: str) -> str:
    """Returns a predicted continuation of `prompt`."""
    raise NotImplementedError('Must provide a language model.')

Suggested Resolution:

  • It would be greatly beneficial for the community if the repository could include a basic implementation or integration guide for an open-source LLM, especially StarCoder, which was referenced in the paper.
  • Providing such an implementation or guide would enhance the reproducibility and usability of the FunSearch project for researchers and developers looking to explore or build upon this work.

Looking forward to any updates or guidance on this matter.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions