Bug Fixes & Improvements: Windows Stdio hangs, Cascading Gitignore, UTF-8 Snippets #530

Description

@tarun1790

codebase-memory-mcp Codebase Analysis & Improvements Report

This report outlines five high-impact issues (bugs, a build failure, and gaps in ignore-file handling) identified in the DeusData/codebase-memory-mcp repository, along with their root causes and suggested code fixes.


1. Windows stdio MCP Handshake Hangs & Early Exits

Important

Impact: Prevents the MCP server from running in stdio mode on Windows when spawned as a subprocess by clients like Cursor or Claude Code.

Root Cause

In mcp.c, the event loop waits on stdin for new data so that it can enforce idle timeouts. On Windows, it waits directly on the stdin handle:

#ifdef _WIN32
        /* Windows: WaitForSingleObject on stdin handle */
        HANDLE hStdin = (HANDLE)_get_osfhandle(fd);
        DWORD wr = WaitForSingleObject(hStdin, STORE_IDLE_TIMEOUT_S * MCP_TIMEOUT_MS);
        if (wr == WAIT_FAILED) {
            break;
        }

Clients redirect a subprocess's standard input through an anonymous pipe, and pipe handles are not among the objects that the Windows wait functions are documented to support. Here WaitForSingleObject returns WAIT_FAILED immediately, so the server loop breaks and the process exits during the handshake.

Additionally, stdout is fully buffered on Windows when redirected to a pipe. Although fflush is called in the loop, if the process crashes or hangs on a blocked read, any unflushed output stays in the C runtime's stream buffer and never reaches the client.

Suggested Fix

  1. Skip WaitForSingleObject for pipe handles and use PeekNamedPipe to check for available input without blocking:
    #ifdef _WIN32
            HANDLE hStdin = (HANDLE)_get_osfhandle(fd);
            DWORD type = GetFileType(hStdin);
            if (type == FILE_TYPE_PIPE) {
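                // Bypass WaitForSingleObject for anonymous pipes to prevent WAIT_FAILED
                DWORD bytes_avail = 0;
                if (PeekNamedPipe(hStdin, NULL, 0, NULL, &bytes_avail, NULL)) {
                    if (bytes_avail == 0) {
                        cbm_usleep(10000); // Back off 10ms to prevent CPU pinning
                        continue;
                    }
                }
                // If PeekNamedPipe fails (for example with ERROR_BROKEN_PIPE once the
                // client closes its end), control falls through to the code after this block.
                // Note: this branch never calls cbm_mcp_server_evict_idle, so idle
                // eviction does not run while stdin is a pipe.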
            } else {
                DWORD wr = WaitForSingleObject(hStdin, STORE_IDLE_TIMEOUT_S * MCP_TIMEOUT_MS);
                if (wr == WAIT_FAILED) break;
                if (wr == WAIT_TIMEOUT) {
                    cbm_mcp_server_evict_idle(srv, STORE_IDLE_TIMEOUT_S);
                    continue;
                }
            }
    #endif
  2. Disable buffering on stdout at server startup in main.c:
    setvbuf(stdout, NULL, _IONBF, 0);
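
The pipe branch in the first fix polls without tracking elapsed time, so the idle timeout is lost. A minimal sketch of one way to keep it, assuming hypothetical helpers `wait_for_input` and `read_step` (neither is in the repository) and a caller that passes the idle timeout in milliseconds. The non-Windows branch uses `poll` only so the sketch compiles and can be exercised on POSIX as well:

```c
#include <assert.h>
#include <errno.h>
#ifdef _WIN32
#include <windows.h>
#include <io.h>
#else
#include <poll.h>
#include <unistd.h>
#endif

typedef enum { INPUT_READY, INPUT_IDLE, INPUT_CLOSED } input_state_t;

/* Hypothetical helper: wait up to timeout_ms for data on fd. */
static input_state_t wait_for_input(int fd, int timeout_ms) {
    assert(timeout_ms >= 0);
#ifdef _WIN32
    HANDLE h = (HANDLE)_get_osfhandle(fd);
    if (h == INVALID_HANDLE_VALUE) return INPUT_CLOSED;
    if (GetFileType(h) == FILE_TYPE_PIPE) {
        /* Poll the pipe in 10 ms steps until data arrives or the deadline passes. */
        ULONGLONG deadline = GetTickCount64() + (ULONGLONG)timeout_ms;
        for (;;) {
            DWORD avail = 0;
            if (!PeekNamedPipe(h, NULL, 0, NULL, &avail, NULL)) {
                return INPUT_CLOSED; /* e.g. ERROR_BROKEN_PIPE: writer is gone */
            }
            if (avail > 0) return INPUT_READY;
            if (GetTickCount64() >= deadline) return INPUT_IDLE;
            Sleep(10);
        }
    }
    DWORD wr = WaitForSingleObject(h, (DWORD)timeout_ms);
    if (wr == WAIT_OBJECT_0) return INPUT_READY;
    return wr == WAIT_TIMEOUT ? INPUT_IDLE : INPUT_CLOSED;
#else
    struct pollfd pfd = { .fd = fd, .events = POLLIN, .revents = 0 };
    int rc = poll(&pfd, 1, timeout_ms);
    if (rc == 0) return INPUT_IDLE;
    if (rc < 0) return errno == EINTR ? INPUT_IDLE : INPUT_CLOSED;
    return (pfd.revents & POLLIN) ? INPUT_READY : INPUT_CLOSED;
#endif
}

/* One loop step: returns bytes read, 0 when idle, -1 on EOF or error. */
static int read_step(int fd, char *buf, unsigned cap, int timeout_ms) {
    input_state_t st = wait_for_input(fd, timeout_ms);
    if (st == INPUT_IDLE) return 0; /* the caller can evict idle stores here */
    if (st == INPUT_CLOSED) return -1;
#ifdef _WIN32
    int n = _read(fd, buf, cap);
#else
    int n = (int)read(fd, buf, cap);
#endif
    return n > 0 ? n : -1;
}
```

In the server loop, a return of 0 from `read_step` would be the point to call cbm_mcp_server_evict_idle, which mirrors what the WAIT_TIMEOUT branch already does for non-pipe handles.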

2. Cascading .gitignore Files Ignored in Subdirectories

Warning

Impact: Indexing ignores .gitignore files nested in subdirectories, so build/dist folders and other paths they exclude are still scanned.

Root Cause

In discover.c, try_load_nested_gitignore checks:

static cbm_gitignore_t *try_load_nested_gitignore(const walk_frame_t *frame) {
    if (frame->local_gi || frame->prefix[0] == '\0') {
        return NULL;
    }
    ...
}

If a parent directory has a .gitignore loaded, frame->local_gi is set. When entering subdirectories, try_load_nested_gitignore exits early, meaning nested .gitignore files deeper in the tree are never loaded.

Suggested Fix

Allow loading nested .gitignore files even if a parent .gitignore was loaded. If a nested .gitignore is found in the current directory, merge its rules with the parent or stack them in a chain of active ignore filters.
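
The "chain of active ignore filters" option could look roughly like the sketch below. Everything here is hypothetical: `ignore_layer_t`, `layer_decide`, and `chain_is_ignored` are illustrative names, not the project's `cbm_gitignore_t` API, and matching is reduced to exact name comparison so the precedence logic stays visible. It follows the gitignore rules that the last matching pattern within a file decides and that deeper files override shallower ones:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical layer: the patterns from one .gitignore plus a link to the
 * layer of the nearest ancestor directory that also had one. */
typedef struct ignore_layer {
    const char *const *patterns; /* e.g. {"dist", "!notes.log"} */
    size_t count;
    const struct ignore_layer *parent;
} ignore_layer_t;

/* Returns 1 = ignored, 0 = explicitly re-included, -1 = no pattern matched. */
static int layer_decide(const ignore_layer_t *layer, const char *name) {
    int verdict = -1;
    for (size_t i = 0; i < layer->count; i++) {
        const char *p = layer->patterns[i];
        int negated = (p[0] == '!');
        /* Exact-name comparison stands in for real gitignore glob matching. */
        if (strcmp(negated ? p + 1 : p, name) == 0) {
            verdict = negated ? 0 : 1; /* the last matching pattern wins */
        }
    }
    return verdict;
}

/* Walk from the innermost layer outward; the first layer with a verdict decides. */
static int chain_is_ignored(const ignore_layer_t *innermost, const char *name) {
    assert(name != NULL);
    for (const ignore_layer_t *l = innermost; l != NULL; l = l->parent) {
        int v = layer_decide(l, name);
        if (v >= 0) return v;
    }
    return 0; /* nothing matched: not ignored */
}
```

With this shape, a walk frame would only need a pointer to its innermost layer. Entering a directory that has its own .gitignore pushes a new layer whose `parent` is the current one, and directories without one reuse the inherited pointer.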


3. get_code_snippet Hangs on Non-UTF-8 Encoded Files

Important

Impact: MCP clients (e.g. Node-based clients) crash or hang when requesting snippets from files using non-UTF-8 encodings (such as EUC-KR / CP949 or Shift-JIS).

Root Cause

The read_file_lines function reads raw bytes from files. When a non-UTF-8 file is read, its invalid UTF-8 bytes are inserted directly into the JSON response structure. yy_doc_to_str then serializes the document with YYJSON_WRITE_ALLOW_INVALID_UNICODE, which lets those bytes pass straight through to standard output. When the receiving MCP client (which expects strictly valid UTF-8 JSON) parses the stdout stream, it chokes on the invalid bytes and hangs.

Suggested Fix

Sanitize the file source content in build_snippet_response before adding it to the JSON response. Validate the UTF-8 sequences and replace any invalid bytes with the Unicode replacement character U+FFFD (or a plain ? as an ASCII fallback) to guarantee the JSON string output is always valid UTF-8.
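
A minimal sketch of such a sanitizer, assuming hypothetical helper names `utf8_seq_len` and `sanitize_utf8` (they are not in the repository). It copies well-formed sequences unchanged and replaces each invalid byte with the three-byte encoding of U+FFFD, so the caller must provide an output buffer of at least 3 × the input length:

```c
#include <assert.h>
#include <stddef.h>

/* Length of the well-formed UTF-8 sequence at s (n bytes available), or 0 if invalid. */
static size_t utf8_seq_len(const unsigned char *s, size_t n) {
    unsigned char b0 = s[0];
    if (b0 < 0x80) return 1;
    size_t need;
    unsigned char lo = 0x80, hi = 0xBF; /* allowed range for the second byte */
    if (b0 >= 0xC2 && b0 <= 0xDF) {
        need = 2;
    } else if (b0 >= 0xE0 && b0 <= 0xEF) {
        need = 3;
        if (b0 == 0xE0) lo = 0xA0; /* reject overlong encodings */
        if (b0 == 0xED) hi = 0x9F; /* reject UTF-16 surrogates */
    } else if (b0 >= 0xF0 && b0 <= 0xF4) {
        need = 4;
        if (b0 == 0xF0) lo = 0x90; /* reject overlong encodings */
        if (b0 == 0xF4) hi = 0x8F; /* reject code points above U+10FFFF */
    } else {
        return 0; /* stray continuation byte, 0xC0/0xC1, or 0xF5..0xFF */
    }
    if (n < need) return 0; /* sequence truncated at the end of the input */
    if (s[1] < lo || s[1] > hi) return 0;
    for (size_t i = 2; i < need; i++) {
        if ((s[i] & 0xC0) != 0x80) return 0;
    }
    return need;
}

/* Copies in[0..len) to out, replacing each invalid byte with U+FFFD (EF BF BD).
 * out must hold at least 3 * len bytes. Returns the number of bytes written. */
static size_t sanitize_utf8(const unsigned char *in, size_t len, unsigned char *out) {
    assert(len == 0 || (in != NULL && out != NULL));
    size_t o = 0;
    for (size_t i = 0; i < len;) {
        size_t k = utf8_seq_len(in + i, len - i);
        if (k == 0) {
            out[o++] = 0xEF;
            out[o++] = 0xBF;
            out[o++] = 0xBD;
            i++;
        } else {
            for (size_t j = 0; j < k; j++) out[o++] = in[i++];
        }
    }
    return o;
}
```

Replacing byte by byte is the simplest policy. A CP949 or Shift-JIS file will show one replacement character per undecodable byte, which is lossy but keeps line structure intact and guarantees the client receives valid UTF-8.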


4. libgit2 1.8+ Compilation Failure

Tip

Impact: Compilation fails when building with newer libgit2 versions (v1.8.0+).

Root Cause

The git_allocator struct was moved from <git2.h> to <git2/sys/alloc.h> in libgit2 1.8+. Because cbm.c only includes <git2.h>, compiling against libgit2 1.8+ fails with an incomplete-type error for git_allocator.

Suggested Fix

Include <git2/sys/alloc.h> conditionally in cbm.c:

 #if defined(HAVE_LIBGIT2)
 #include <git2.h>
+#include <git2/sys/alloc.h> // For git_allocator on libgit2 1.8+
 #endif

5. Directory Discovery Ignores .git/info/exclude

Note

Impact: Local-only ignore rules configured in .git/info/exclude are not respected, leading to unwanted local files being indexed.

Root Cause

The discovery process in src/discover/discover.c searches for .gitignore and .cbmignore files but never attempts to load .git/info/exclude.

Suggested Fix

During setup, check for the presence of .git/info/exclude in the repository root and load it as a base gitignore filter alongside .gitignore.
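
A rough sketch of that setup step, under stated assumptions: `exclude_path` and `load_patterns` are hypothetical helpers, and the fixed-size pattern array stands in for whatever structure the project's gitignore loader actually fills. The exclude file uses the same pattern syntax as .gitignore, so the reader skips blank lines and `#` comments. It does not handle escaped `\#` patterns, trailing-space rules, or lines longer than the buffer:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_PATTERN_LEN 256

/* Builds "<repo_root>/.git/info/exclude" into buf.
 * Returns 0 on success, -1 if the path does not fit. */
static int exclude_path(const char *repo_root, char *buf, size_t cap) {
    int n = snprintf(buf, cap, "%s/.git/info/exclude", repo_root);
    return (n < 0 || (size_t)n >= cap) ? -1 : 0;
}

/* Copies each non-blank, non-comment line of fp into out (at most max entries).
 * Assumes every line is shorter than MAX_PATTERN_LEN. Returns the pattern count. */
static size_t load_patterns(FILE *fp, char out[][MAX_PATTERN_LEN], size_t max) {
    assert(fp != NULL);
    size_t count = 0;
    char line[MAX_PATTERN_LEN];
    while (count < max && fgets(line, sizeof line, fp) != NULL) {
        size_t len = strcspn(line, "\r\n"); /* strip LF or CRLF line endings */
        line[len] = '\0';
        if (len == 0 || line[0] == '#') continue;
        memcpy(out[count++], line, len + 1);
    }
    return count;
}
```

A caller would build the path, `fopen` it, and treat a missing file as "no extra rules". Note that in linked worktrees and submodules `.git` is a file pointing at the real git directory, which this sketch does not resolve.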

Labels: bug, windows