Inquiry about Paper Details of Magicoder #42

@Alex-HaochenLi

I was very excited to read the Magicoder paper. I strongly believe that OSS-Instruct will push the boundaries of instruction tuning for code LLMs.

I have a question about Magicoder. It seems that you do not test the correctness of the solutions generated from the seed code snippets. I am curious why a code validity check is not necessary. Here are my guesses:

  1. Most of the generated solutions turn out to be correct on manual inspection, and LLMs are robust to a small amount of incorrect code during fine-tuning.
  2. OSS-Instruct creates new data that is essentially a recombination of the seed code snippets, and the LLMs used to generate the solutions (GPT-3.5/GPT-4) can handle such recombinations easily because they can see the correct seed snippets.
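
To make concrete what I mean by a code validity check, here is a minimal sketch. It is only my own illustration and is not taken from the Magicoder codebase: the function names `is_syntactically_valid` and `runs_without_error` are hypothetical, it assumes the generated solutions are Python, and it only checks that the code parses and exits cleanly, not that it actually solves the problem.

```python
import ast
import subprocess
import sys


def is_syntactically_valid(code: str) -> bool:
    """Return True if the snippet parses as Python source."""
    try:
        ast.parse(code)
        return True
    except (SyntaxError, ValueError):
        # ValueError covers inputs such as source containing null bytes
        # on some Python versions.
        return False


def runs_without_error(code: str, timeout_s: float = 5.0) -> bool:
    """Return True if the snippet parses and exits with status 0.

    The snippet runs in a separate interpreter process with a timeout.
    This is NOT a sandbox; untrusted code would need real isolation.
    """
    if not is_syntactically_valid(code):
        return False
    try:
        result = subprocess.run(
            [sys.executable, "-c", code],
            capture_output=True,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        return False
    return result.returncode == 0
```

A filter like this would only drop samples that fail to parse or crash; checking functional correctness would additionally need test cases for each generated problem.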

What is your take on this? I look forward to your reply, and thanks for your help!
