
Prompting-Strategies-CodeGen-Study

Chain-of-Thought vs. Few-Shot: A Comparative Study of Prompting Strategies for Code Generation

This repository accompanies the research study and provides the code, dataset, and analysis artifacts referenced in the paper (see Associated Publication).


Contributors


Contents

1) Token Counter Program

  • Authored by us; uses the tiktoken library for tokenization.
  • Installation instructions for tiktoken are available in the official repository: https://github.com/openai/tiktoken.
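For reference, a minimal sketch of such a counter is shown below. It is illustrative rather than the exact script in this repository, and the encoding name (`cl100k_base`) is an assumption:

```python
# Minimal token-counting sketch using tiktoken.
# Illustrative only -- not the exact script in this repository;
# the encoding name ("cl100k_base") is an assumption.
import sys

import tiktoken


def count_tokens(text: str, encoding_name: str = "cl100k_base") -> int:
    """Return the number of tokens in `text` under the given encoding."""
    encoding = tiktoken.get_encoding(encoding_name)
    return len(encoding.encode(text))


if __name__ == "__main__":
    # Count tokens in each file passed on the command line.
    for path in sys.argv[1:]:
        with open(path, encoding="utf-8") as f:
            print(f"{path}: {count_tokens(f.read())} tokens")
```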

2) Data and Analytics Excel File

  • Contains the raw data for prompts, responses, and human evaluations.
  • Includes basic analytics for quick inspection.

3) ANOVA Excel File

  • ANOVA conducted using the Analysis ToolPak add-in.
  • Effect sizes (eta-squared, partial eta-squared, and omega-squared) were calculated manually using standard definitions (see Effect Sizes).
  • To enable the add-in in Excel:
    1. File → Options → Add-ins
    2. In the Manage drop-down, select Excel Add-ins and click Go…
    3. Check Analysis ToolPak, click OK.

4) Dataset

  • Organized by the combination of Reasoning-Style (CoT vs. Non-CoT) and Example-Context (Zero-Shot vs. Few-Shot).
  • Each combination contains 20 tasks (cases).
  • Each task has three files:
    • Prompt file: the prompt authored by the LLM.
    • Response file: the LLM’s response to that prompt.
    • Data file: structured metadata, evaluation results, and other task-level information.

Example layout:

Dataset/
├─ CoT Few-Shot (CFS)/
│  ├─ CFS 1/
│  │  ├─ task_031_data.json
│  │  ├─ task_031_prompt.txt
│  │  └─ task_031_response.txt
│  ├─ CFS 2/
│  │  ├─ ...
│  │  └─ ...
│  └─ ...
├─ CoT Zero-Shot (CZS)/
│  ├─ CZS 1/
│  │  ├─ ...
│  │  └─ ...
│  └─ ...
├─ Non-CoT Few-Shot (NCFS)/
│  ├─ NCFS 1/
│  │  ├─ ...
│  │  └─ ...
│  └─ ...
└─ Non-CoT Zero-Shot (NCZS)/
   ├─ NCZS 1/
   │  ├─ ...
   │  └─ ...
   └─ ...
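
The sketch below illustrates one way to traverse this layout and load the three files for each task. It is an illustration, not a script shipped with this repository; the condition folder name follows the example tree above, and actual task numbering may differ:

```python
# Illustrative loader for the dataset layout shown above.
# Folder and file naming follow the example tree; actual numbering may differ.
import json
from pathlib import Path


def load_tasks(condition_dir: str):
    """Yield (prompt, response, data) triples for each task folder."""
    for task_dir in sorted(Path(condition_dir).iterdir()):
        if not task_dir.is_dir():
            continue
        prompt = next(task_dir.glob("*_prompt.txt")).read_text(encoding="utf-8")
        response = next(task_dir.glob("*_response.txt")).read_text(encoding="utf-8")
        data = json.loads(next(task_dir.glob("*_data.json")).read_text(encoding="utf-8"))
        yield prompt, response, data


# Example: iterate over the CoT Few-Shot condition.
for prompt, response, data in load_tasks("Dataset/CoT Few-Shot (CFS)"):
    print(len(prompt), len(response), sorted(data.keys()))
```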

Data Files

  • Data files store structured information about each prompt and its response.
  • These files were originally produced by the LLM and then edited by human reviewers to correct metadata and add missing information, ensuring accuracy and completeness.

Notes on stored evaluations:

  • Self-evaluation refers to the model’s own assessment (values typically on a 1–10 scale). These were retained for completeness but disregarded in analysis due to unreliability.
  • Supervised (human) evaluations were performed by real evaluators. The rubric fields are:
| field | weight | range | explanation |
| --- | --- | --- | --- |
| factual_correctness | 25% | 1 to 5 | Are the facts and steps correct? |
| reasoning_quality | 25% | 1 to 5 | Is the logic transparent? |
| coherency_and_clarity | 20% | 1 to 5 | Is the response clear and easy to follow? |
| completeness | 20% | 1 to 5 | Does it cover all required aspects? |
| understanding_depth | 10% | 1 to 5 | Does it show insight beyond the surface level? |
| weighted_total | N/A | 0 to 100 (%) | Final composite score computed from the weights |
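
As a reading aid, the sketch below shows one plausible way weighted_total could be computed from the rubric fields. The rescaling of 1–5 scores to a percentage is an assumption, not necessarily the authors' exact formula:

```python
# Illustrative computation of `weighted_total` from the rubric fields.
# ASSUMPTION: each 1-5 score contributes its fraction of the 5-point
# maximum, weighted; this may differ from the authors' exact formula.
WEIGHTS = {
    "factual_correctness": 0.25,
    "reasoning_quality": 0.25,
    "coherency_and_clarity": 0.20,
    "completeness": 0.20,
    "understanding_depth": 0.10,
}


def weighted_total(scores: dict[str, int]) -> float:
    """Combine 1-5 rubric scores into a 0-100 composite using the weights."""
    return 100 * sum(w * scores[field] / 5 for field, w in WEIGHTS.items())


# Example: a response scoring 4 on every field gets 80.0.
print(weighted_total({field: 4 for field in WEIGHTS}))
```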

For more information regarding the evaluation of accuracy, see the Accuracy Evaluation Process and Criteria file.


Effect Sizes

  • Eta-squared (η²), partial eta-squared (ηp²), and omega-squared (ω²) were derived from the ANOVA results using their standard formulas based on sums of squares (SS), mean squares (MS), and degrees of freedom (df).

$$ \eta^{2} = \frac{SS_{\text{effect}}}{SS_{\text{total}}} $$

$$ \text{partial } \eta^{2} = \frac{SS_{\text{effect}}}{SS_{\text{effect}} + SS_{\text{error}}} $$

$$ \omega^{2} = \frac{ SS_{\text{effect}} - (df_{\text{effect}})(MS_{\text{error}}) }{ SS_{\text{total}} + MS_{\text{error}} } $$

  • Not all of these formulas are presented in the paper.
  • For full details on the formulas, see the reference below; a small computational sketch follows.
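
The sketch below applies these formulas to quantities read off an ANOVA table. The numeric inputs are placeholders for demonstration, not study results:

```python
# Effect sizes from standard ANOVA quantities (Tabachnick & Fidell, 2013).
# Illustrative only: the input values below are placeholders, not study data.

def eta_squared(ss_effect: float, ss_total: float) -> float:
    return ss_effect / ss_total


def partial_eta_squared(ss_effect: float, ss_error: float) -> float:
    return ss_effect / (ss_effect + ss_error)


def omega_squared(ss_effect: float, df_effect: float,
                  ms_error: float, ss_total: float) -> float:
    return (ss_effect - df_effect * ms_error) / (ss_total + ms_error)


# Placeholder values for demonstration:
ss_effect, ss_error, ss_total = 12.0, 48.0, 80.0
df_effect = 1
ms_error = 0.8  # SS_error / df_error

print(eta_squared(ss_effect, ss_total))                        # 0.15
print(partial_eta_squared(ss_effect, ss_error))                # 0.20
print(omega_squared(ss_effect, df_effect, ms_error, ss_total)) # ~0.1386
```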

Reference for effect-size formulas
B. G. Tabachnick and L. S. Fidell, Using Multivariate Statistics, 6th ed., Upper Saddle River, NJ: Pearson Education, 2013, pp. 54–55.


Associated Publication

This repository contains the source code, datasets, and experimental materials for the following publication:

Chain-of-Thought vs. Few-Shot: A Comparative Study of Prompting Strategies for Code Generation
K. Nobakhtfar, K. Çakılcı, R. Zilan

Published in the IEEE conference proceedings of the 5th International Informatics and Software Engineering Conference (IISEC 2026) and available through IEEE Xplore.

DOI: https://doi.org/10.1109/IISEC69317.2026.11418478

Publication Timeline

| Event | Date |
| --- | --- |
| Paper Submission | December 1, 2025 |
| Notification of Acceptance | January 9, 2026 |
| Camera-Ready Submission | January 28, 2026 |
| Date of IISEC 2026 Conference | February 5–6, 2026 |
| Published on IEEE Xplore | March 10, 2026 |

Citation

If you use this work in your research, please cite the paper:

BibTeX

@INPROCEEDINGS{11418478,
  author={Nobakhtfar, Koorosh and Çakılcı, Kenan and Zilan, Ruken},
  booktitle={2026 5th International Informatics and Software Engineering Conference (IISEC)}, 
  title={Chain-of-Thought vs. Few-Shot: A Comparative Study of Prompting Strategies for Code Generation}, 
  year={2026},
  volume={},
  number={},
  pages={382-387},
  keywords={Costs;Codes;Accuracy;Limiting;Large language models;Cognition;Robustness;Prompt engineering;Informatics;Software engineering;Chain-of-Thought (CoT);Few-Shot Learning;Large Language Models;Prompt Engineering;Empirical Study},
  doi={10.1109/IISEC69317.2026.11418478}
}

Plain Text

K. Nobakhtfar, K. Çakılcı and R. Zilan, "Chain-of-Thought vs. Few-Shot: A Comparative Study of Prompting Strategies for Code Generation," 2026 5th International Informatics and Software Engineering Conference (IISEC), Ankara, Turkiye, 2026, pp. 382-387, doi: 10.1109/IISEC69317.2026.11418478.


Links