Towards an Understanding of Context Utilization in Code Intelligence: Dataset and Analysis

Code intelligence is an emerging domain in software engineering that aims to improve the effectiveness and efficiency of various code-related tasks. Recent research suggests that incorporating contextual information beyond the original task inputs (i.e., source code) can substantially enhance model performance. Such contextual signals may be obtained directly or indirectly from sources such as API documentation or intermediate representations like abstract syntax trees. Despite growing academic interest, there is a lack of systematic analysis of context in code intelligence. To address this gap, we conduct an extensive literature review of 146 relevant studies published between September 2007 and August 2024. Our investigation yields four main contributions: (1) a quantitative analysis of the research landscape, including publication trends, venues, and the explored domains; (2) a novel taxonomy of context types used in code intelligence; (3) a task-oriented analysis of context integration strategies across diverse code intelligence tasks; and (4) a critical assessment of evaluation methodologies for context-aware methods. Based on these findings, we identify fundamental challenges in context utilization in current code intelligence systems and propose a research roadmap that outlines key opportunities for future research.
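As a small illustration of context derived from an intermediate representation, the sketch below uses Python's standard `ast` module to pull a few signals (imported modules, defined functions, called names) out of a code snippet. The `extract_context` helper, the `SOURCE` snippet, and the choice of signals are assumptions made for this example; they are not taken from the surveyed studies or from this repository's data.

```python
import ast

# Hypothetical input snippet, used only for illustration.
SOURCE = '''
import os

def read_config(path):
    """Load a config file."""
    with open(os.path.join(path, "config.txt")) as f:
        return f.read()
'''


def extract_context(source: str) -> dict:
    """Collect simple AST-derived signals: imports, function names, called names."""
    tree = ast.parse(source)
    imports, functions, calls = set(), set(), set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            imports.update(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom):
            imports.add(node.module or "")
        elif isinstance(node, ast.FunctionDef):
            functions.add(node.name)
        elif isinstance(node, ast.Call):
            func = node.func
            if isinstance(func, ast.Name):
                calls.add(func.id)        # plain call, e.g. open(...)
            elif isinstance(func, ast.Attribute):
                calls.add(func.attr)      # attribute call, e.g. os.path.join(...)
    return {
        "imports": sorted(imports),
        "functions": sorted(functions),
        "calls": sorted(calls),
    }


context = extract_context(SOURCE)
print(context)
```

A context-aware model could receive signals like these alongside the raw source. Which signals actually help, and for which task, is the kind of question the survey examines.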

Note: This repository contains the paper collection from our systematic survey. The repository will be continuously updated as new relevant studies are published.

Contributions

Systematic Review: This is the first systematic review synthesizing context utilization in code intelligence tasks.
Context Taxonomy: We build a taxonomy of context types and task-specific integration patterns.
Quantitative Analysis: We conduct a quantitative analysis of methodological trends and evaluation practices.
Research Roadmap: We provide a research roadmap addressing scalability, generalizability, and assessment gaps, identifying key challenges and emerging opportunities for enhancing code intelligence through effective context utilization.

Paper Categories

Clone Detection

Code Completion

Code Generation

Code Summarization

Commit Message Generation

Defect Detection

Program Repair

Citation

If you use this repository in your research, please cite:

@article{wang2025towards,
  title={Towards an Understanding of Context Utilization in Code Intelligence},
  author={Wang, et al.},
  journal={arXiv preprint arXiv:2504.08734},
  year={2025}
}

DeepSoftwareAnalytics/Repo-Context-Survey
