Inquiry about Paper Details of Magicoder

I was very excited to read the Magicoder paper. I strongly believe that OSS-Instruct will push the boundaries of instruction tuning for code LLMs.

I have a question about Magicoder. It seems that you do not test the correctness of the solutions generated from seed code snippets. I am curious why a code validity check is not necessary. Below are two hypotheses I came up with:
1. Most of the generated solutions turn out to be correct under manual inspection, and LLMs are robust to some incorrect code during fine-tuning.
2. OSS-Instruct creates new data that is closer to a recombination of seed code snippets, and the LLMs (GPT-3.5/GPT-4) used to generate the solutions can handle such recombinations easily because they see the correct seed code snippets.
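
To make the question concrete, here is a minimal sketch of the weakest kind of validity check I have in mind. It only verifies that a generated Python solution parses. It does not run the code or test its behavior. The function name `is_valid_python` and the sample strings are my own illustration and are not taken from the Magicoder codebase or dataset.

```python
import ast


def is_valid_python(source: str) -> bool:
    """Return True if `source` parses as Python (syntax only, not behavior)."""
    try:
        ast.parse(source)
    except SyntaxError:
        return False
    return True


# Hypothetical generated solutions, used only for illustration.
samples = [
    "def add(a, b):\n    return a + b\n",
    "def add(a, b)\n    return a + b\n",  # missing colon after the signature
]

# Keep only the samples that pass the syntax check.
kept = [s for s in samples if is_valid_python(s)]
```

A stronger check would execute each solution against test cases, but that requires tests for every generated problem, which may be part of the reason it was skipped.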

What is your opinion on this? I look forward to your reply, and thank you for your help!

Inquiry about Paper Details of Magicoder #42