
Optimize file enumeration to skip build directories during traversal #14

Merged
HandyS11 merged 3 commits into refacto/ImproveCodeQuality from copilot/sub-pr-12-again
Feb 3, 2026

Copilot AI commented Feb 3, 2026

The current file enumeration in EntityFileDiscovery.cs enumerates all files including those in bin, obj, .git, and node_modules, then filters them out post-traversal. This is inefficient for large projects.

Changes

  • Replaced RecurseSubdirectories = true with manual recursion that filters directories before descending
  • Applied optimization to both SearchDirectoryForEntitiesAsync and SearchForBaseClassFiles
  • Extracted max search depth as named constant MaxSearchDepth

Before

var options = new EnumerationOptions { RecurseSubdirectories = true };
foreach (var csFile in Directory.EnumerateFiles(searchDir, "*.cs", options))
{
    var fullPath = Path.GetFullPath(csFile);
    var pathSegments = fullPath.Split(Path.DirectorySeparatorChar);
    if (pathSegments.Any(s => s is "bin" or "obj" or ".git" or "node_modules"))
        continue;  // Skip after enumeration
}

After

foreach (var subDir in Directory.EnumerateDirectories(currentDir))
{
    var dirName = Path.GetFileName(subDir);
    if (dirName is "bin" or "obj" or ".git" or "node_modules")
        continue;  // Skip before descending
    
    await SearchDirectoryRecursiveAsync(subDir, ...);
}

This keeps the traversal from enumerating the potentially thousands of files that live under build-artifact and dependency directories.
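
As a rough, self-contained illustration of the pruning idea (not the code merged in this PR), the sketch below yields `*.cs` files lazily and never descends into an excluded directory. The class and method names are hypothetical, and the exclusion check mirrors the case-sensitive pattern shown above.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class PrunedTraversalSketch
{
    // Hypothetical helper: lazily yields *.cs files under root,
    // skipping excluded directories before descending into them.
    public static IEnumerable<string> EnumerateSourceFiles(string root)
    {
        var options = new EnumerationOptions
        {
            RecurseSubdirectories = false, // recursion is done manually below
            IgnoreInaccessible = true,
            AttributesToSkip = FileAttributes.System
        };

        // Files directly inside the current directory.
        foreach (var file in Directory.EnumerateFiles(root, "*.cs", options))
        {
            yield return file;
        }

        foreach (var subDir in Directory.EnumerateDirectories(root, "*", options))
        {
            // Same case-sensitive check as in the PR description.
            if (Path.GetFileName(subDir) is "bin" or "obj" or ".git" or "node_modules")
            {
                continue; // pruned: nothing below this directory is enumerated
            }

            foreach (var file in EnumerateSourceFiles(subDir))
            {
                yield return file;
            }
        }
    }
}
```

Because the method is an iterator, a caller that stops consuming results early also stops the traversal early.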



Copilot AI and others added 2 commits February 3, 2026 14:53
…traversal

Co-authored-by: HandyS11 <62420910+HandyS11@users.noreply.github.com>
Co-authored-by: HandyS11 <62420910+HandyS11@users.noreply.github.com>

Copilot AI commented Feb 3, 2026

Just as a heads up, I was blocked by some firewall rules while working on your feedback. Expand below for details.

Warning

Firewall rules blocked me from connecting to one or more addresses (expand for details)

I tried to connect to the following addresses, but was blocked by firewall rules:

  • http://api.nuget.org:443/DigiCertTrustedG4RSA4096SHA256TimeStampingCA.crl
    • Triggering command: /usr/bin/dotnet dotnet build --no-incremental (http block)

If you need me to access, download, or install something from one of these locations, you can either:

Copilot AI changed the title from "[WIP] Address feedback on code quality improvements from PR #12" to "Optimize file enumeration to skip build directories during traversal" Feb 3, 2026
Copilot AI requested a review from HandyS11 February 3, 2026 14:58
HandyS11 marked this pull request as ready for review February 3, 2026 14:59
Copilot AI review requested due to automatic review settings February 3, 2026 14:59
HandyS11 merged commit 9cf639d into refacto/ImproveCodeQuality Feb 3, 2026
6 checks passed
HandyS11 deleted the copilot/sub-pr-12-again branch February 3, 2026 15:02
Copilot AI left a comment

Pull request overview

This PR optimizes file enumeration in EntityFileDiscovery.cs by implementing manual directory recursion that filters out build and version control directories (bin, obj, .git, node_modules) before traversing them, rather than enumerating all files first and filtering afterward.

Changes:

  • Replaced RecurseSubdirectories = true with manual recursion in both entity and base class file discovery methods
  • Introduced MaxSearchDepth constant to limit recursion depth in base class file searches
  • Added early exit optimizations in SearchForBaseClassFilesRecursive to stop searching once all files are found


    HashSet<string> entityTypeNames,
    string normalizedContextPath,
    Dictionary<string, string> entityFiles)
{
Copilot AI Feb 3, 2026

The SearchDirectoryForEntitiesAsync method is now just a thin wrapper that calls SearchDirectoryRecursiveAsync directly without adding any logic of its own. Consider either removing this wrapper and calling SearchDirectoryRecursiveAsync directly from DiscoverEntityFilesAsync (line 49), or adding the depth parameters to SearchDirectoryRecursiveAsync and initializing them in this wrapper so that it serves a clear purpose.

Suggested change
- {
+ {
+     // Basic validation to ensure we do not attempt to search with invalid inputs.
+     if (string.IsNullOrWhiteSpace(searchDir) ||
+         entityTypeNames == null ||
+         entityFiles == null)
+     {
+         return;
+     }
+     // Avoid starting a recursive search on a directory that does not exist.
+     if (!System.IO.Directory.Exists(searchDir))
+     {
+         return;
+     }

Comment on lines +207 to +211
private static async Task SearchDirectoryRecursiveAsync(
    string currentDir,
    HashSet<string> entityTypeNames,
    string normalizedContextPath,
    Dictionary<string, string> entityFiles)
Copilot AI Feb 3, 2026

The SearchDirectoryRecursiveAsync method lacks depth tracking and maximum depth enforcement, unlike SearchForBaseClassFilesRecursive which has these safeguards. This creates an inconsistency and potential risk of unbounded recursion if the file system contains symbolic links or deeply nested directory structures. Consider adding currentDepth and maxDepth parameters to match the pattern used in SearchForBaseClassFilesRecursive, and add an early return check similar to line 461.
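
One way this suggestion might look, as a minimal synchronous sketch rather than the project's actual code: a depth counter is threaded through the recursion and checked on entry. The value of `MaxSearchDepth`, the class and method names, and the choice to also skip reparse points (which covers symbolic links and junctions) are all assumptions made for illustration.

```csharp
using System.Collections.Generic;
using System.IO;

public static class DepthLimitedSearchSketch
{
    // Hypothetical value; the PR does not state what MaxSearchDepth is set to.
    private const int MaxSearchDepth = 10;

    public static List<string> FindSourceFiles(string root, int maxDepth = MaxSearchDepth)
    {
        var results = new List<string>();
        Collect(root, 0, maxDepth, results);
        return results;
    }

    private static void Collect(string currentDir, int currentDepth, int maxDepth, List<string> results)
    {
        // Early return once the depth budget is exhausted.
        if (currentDepth > maxDepth)
        {
            return;
        }

        var options = new EnumerationOptions
        {
            RecurseSubdirectories = false,
            IgnoreInaccessible = true,
            // Assumption: skipping reparse points avoids following directory links,
            // which complements the depth limit as a guard against cycles.
            AttributesToSkip = FileAttributes.System | FileAttributes.ReparsePoint
        };

        results.AddRange(Directory.EnumerateFiles(currentDir, "*.cs", options));

        foreach (var subDir in Directory.EnumerateDirectories(currentDir, "*", options))
        {
            Collect(subDir, currentDepth + 1, maxDepth, results);
        }
    }
}
```

With this shape, the public entry point is the only place that knows the starting depth, which also gives a thin wrapper a clear job.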

Comment on lines +234 to 244
+         // Recursively process subdirectories, skipping excluded directories
+         foreach (var subDir in Directory.EnumerateDirectories(currentDir, "*", options))
+         {
+             var dirName = Path.GetFileName(subDir);
+             if (dirName is "bin" or "obj" or ".git" or "node_modules")
              {
                  continue;
              }

-             await ProcessSourceFileAsync(fullPath, entityTypeNames, entityFiles);
+             await SearchDirectoryRecursiveAsync(subDir, entityTypeNames, normalizedContextPath, entityFiles);
          }
Copilot AI Feb 3, 2026

SearchDirectoryRecursiveAsync should check if all entities have been found and return early, similar to the optimization in SearchForBaseClassFilesRecursive (lines 484-487, 502-505). This would prevent unnecessary traversal once all entity files have been discovered. Consider adding a check after processing files and before recursing into subdirectories: if (entityTypeNames.All(entityFiles.ContainsKey)) return;
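
A hedged sketch of the early-exit idea, kept synchronous for brevity: the search returns `true` as soon as every requested entity has a file, and each caller stops iterating when it sees that result. Matching files by name is an assumption made only for this example (the real ProcessSourceFileAsync may inspect file contents), and the class name is hypothetical.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class EarlyExitSearchSketch
{
    // Returns true once every requested entity has a file, so callers can stop traversing.
    public static bool Search(
        string currentDir,
        HashSet<string> entityTypeNames,
        Dictionary<string, string> entityFiles)
    {
        foreach (var csFile in Directory.EnumerateFiles(currentDir, "*.cs"))
        {
            // Assumption: an entity is identified by its file name without extension.
            var typeName = Path.GetFileNameWithoutExtension(csFile);
            if (entityTypeNames.Contains(typeName))
            {
                entityFiles.TryAdd(typeName, csFile);
            }
        }

        // Check after processing files and before recursing, as the comment suggests.
        if (AllFound(entityTypeNames, entityFiles))
        {
            return true;
        }

        foreach (var subDir in Directory.EnumerateDirectories(currentDir))
        {
            if (Search(subDir, entityTypeNames, entityFiles))
            {
                return true; // propagate the early exit up the call chain
            }
        }

        return false;
    }

    private static bool AllFound(HashSet<string> names, Dictionary<string, string> files) =>
        names.All(files.ContainsKey);
}
```

The `All` check is linear in the number of entity names; if only requested names are ever added to the dictionary, comparing the two counts would be a cheaper alternative.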

public static class EntityFileDiscovery
{
    /// <summary>
    /// Maximum recursion depth when searching for base class files to prevent infinite recursion
Copilot AI Feb 3, 2026

The documentation for MaxSearchDepth indicates it's only used for "base class files", but this constant should also apply to entity file searching in SearchDirectoryRecursiveAsync to prevent unbounded recursion. Update the documentation to reflect that this constant applies to all recursive file search operations in this class.

Suggested change
- /// Maximum recursion depth when searching for base class files to prevent infinite recursion
+ /// Maximum recursion depth for recursive file search operations in this class to prevent infinite recursion

Comment on lines +207 to +243
  private static async Task SearchDirectoryRecursiveAsync(
      string currentDir,
      HashSet<string> entityTypeNames,
      string normalizedContextPath,
      Dictionary<string, string> entityFiles)
  {
      try
      {
          // Process files in the current directory
          var options = new EnumerationOptions
          {
-             RecurseSubdirectories = true, IgnoreInaccessible = true, AttributesToSkip = FileAttributes.System
+             RecurseSubdirectories = false,
+             IgnoreInaccessible = true,
+             AttributesToSkip = FileAttributes.System
          };

-         foreach (var csFile in Directory.EnumerateFiles(searchDir, EfAnalysisConstants.FilePatterns.CSharpFiles,
-                      options))
+         foreach (var csFile in Directory.EnumerateFiles(currentDir, EfAnalysisConstants.FilePatterns.CSharpFiles, options))
          {
              var fullPath = Path.GetFullPath(csFile);
              if (fullPath.Equals(normalizedContextPath, StringComparison.OrdinalIgnoreCase))
              {
                  continue;
              }

-             // Skip common non-source directories that can be large
-             var pathSegments = fullPath.Split(Path.DirectorySeparatorChar, Path.AltDirectorySeparatorChar);
-             if (pathSegments.Any(s => s is "bin" or "obj" or ".git" or "node_modules"))
+             await ProcessSourceFileAsync(fullPath, entityTypeNames, entityFiles);
+         }
+
+         // Recursively process subdirectories, skipping excluded directories
+         foreach (var subDir in Directory.EnumerateDirectories(currentDir, "*", options))
+         {
+             var dirName = Path.GetFileName(subDir);
+             if (dirName is "bin" or "obj" or ".git" or "node_modules")
              {
                  continue;
              }

-             await ProcessSourceFileAsync(fullPath, entityTypeNames, entityFiles);
+             await SearchDirectoryRecursiveAsync(subDir, entityTypeNames, normalizedContextPath, entityFiles);
Copilot AI Feb 3, 2026

The file WorkspaceTypeDiscovery.cs (lines 61-71) still uses the old pattern of RecurseSubdirectories = true with post-traversal filtering. For consistency and performance, consider applying the same optimization pattern used in this PR to that file as well.

foreach (var subDir in Directory.EnumerateDirectories(currentDir, "*", options))
{
    var dirName = Path.GetFileName(subDir);
    if (dirName is "bin" or "obj" or ".git" or "node_modules")
Copilot AI Feb 3, 2026

Directory name filtering uses pattern matching which is case-sensitive ("bin" or "obj" or ".git" or "node_modules"). On Windows, directory names are case-insensitive, so "Bin", "BIN", "Obj", "OBJ" would not be filtered out. Consider using case-insensitive comparison, similar to how path comparisons are handled elsewhere in the codebase (e.g., line 226 uses StringComparison.OrdinalIgnoreCase). Example: if (dirName.Equals("bin", StringComparison.OrdinalIgnoreCase) || dirName.Equals("obj", StringComparison.OrdinalIgnoreCase) || ...)

Suggested change
- if (dirName is "bin" or "obj" or ".git" or "node_modules")
+ if (string.Equals(dirName, "bin", System.StringComparison.OrdinalIgnoreCase)
+     || string.Equals(dirName, "obj", System.StringComparison.OrdinalIgnoreCase)
+     || string.Equals(dirName, ".git", System.StringComparison.OrdinalIgnoreCase)
+     || string.Equals(dirName, "node_modules", System.StringComparison.OrdinalIgnoreCase))
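
An alternative to chaining comparisons, offered only as a sketch: keep the excluded names in a single `HashSet<string>` built with `StringComparer.OrdinalIgnoreCase`, so both recursive methods can share one check. The class and method names here are hypothetical and do not appear in the PR.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class ExcludedDirectoryFilterSketch
{
    // One shared, case-insensitive list of directory names to skip.
    private static readonly HashSet<string> ExcludedDirectories =
        new(StringComparer.OrdinalIgnoreCase) { "bin", "obj", ".git", "node_modules" };

    // Returns true when the last segment of the path is an excluded directory name.
    public static bool IsExcluded(string directoryPath)
    {
        // Trim a trailing separator so "repo/bin/" still yields "bin".
        var dirName = Path.GetFileName(Path.TrimEndingDirectorySeparator(directoryPath));
        return ExcludedDirectories.Contains(dirName);
    }
}
```

A call such as `ExcludedDirectoryFilterSketch.IsExcluded(subDir)` would then replace the inline pattern. One trade-off to keep in mind: on a case-sensitive file system this also skips a directory literally named `Bin`, which may or may not be intended.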

foreach (var subDir in Directory.EnumerateDirectories(currentDir, "*", options))
{
    var dirName = Path.GetFileName(subDir);
    if (dirName is "bin" or "obj" or ".git" or "node_modules")
Copilot AI Feb 3, 2026

Directory name filtering uses pattern matching which is case-sensitive ("bin" or "obj" or ".git" or "node_modules"). On Windows, directory names are case-insensitive, so "Bin", "BIN", "Obj", "OBJ" would not be filtered out. Consider using case-insensitive comparison, similar to how path comparisons are handled elsewhere in the codebase (e.g., line 226 uses StringComparison.OrdinalIgnoreCase). Example: if (dirName.Equals("bin", StringComparison.OrdinalIgnoreCase) || dirName.Equals("obj", StringComparison.OrdinalIgnoreCase) || ...)

Suggested change
- if (dirName is "bin" or "obj" or ".git" or "node_modules")
+ if (dirName.Equals("bin", System.StringComparison.OrdinalIgnoreCase)
+     || dirName.Equals("obj", System.StringComparison.OrdinalIgnoreCase)
+     || dirName.Equals(".git", System.StringComparison.OrdinalIgnoreCase)
+     || dirName.Equals("node_modules", System.StringComparison.OrdinalIgnoreCase))

Comment on lines +223 to +232
+         foreach (var csFile in Directory.EnumerateFiles(currentDir, EfAnalysisConstants.FilePatterns.CSharpFiles, options))
          {
              var fullPath = Path.GetFullPath(csFile);
              if (fullPath.Equals(normalizedContextPath, StringComparison.OrdinalIgnoreCase))
              {
                  continue;
              }

-             // Skip common non-source directories that can be large
-             var pathSegments = fullPath.Split(Path.DirectorySeparatorChar, Path.AltDirectorySeparatorChar);
-             if (pathSegments.Any(s => s is "bin" or "obj" or ".git" or "node_modules"))
+             await ProcessSourceFileAsync(fullPath, entityTypeNames, entityFiles);
+         }
Copilot AI Feb 3, 2026

This foreach loop immediately maps its iteration variable to another variable - consider mapping the sequence explicitly using '.Select(...)'.
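
For illustration only, the mapping this comment refers to could be written as a LINQ pipeline that projects each enumerated path to its full path and filters out the context file before the loop body runs. The class and method names are hypothetical, and `"*.cs"` stands in for the project's file-pattern constant.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class SelectMappingSketch
{
    // Projects each enumerated file to its full path, then drops the context file.
    public static IEnumerable<string> CandidateFiles(string currentDir, string normalizedContextPath)
    {
        return Directory.EnumerateFiles(currentDir, "*.cs")
            .Select(csFile => Path.GetFullPath(csFile))
            .Where(fullPath => !fullPath.Equals(normalizedContextPath, StringComparison.OrdinalIgnoreCase));
    }
}
```

The loop would then iterate over `CandidateFiles(...)` and call the processing method on each element directly. Whether this reads better than the explicit `continue` is a style choice.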


2 participants