Skip to content

Fix: Add null/undefined validation before .trim() calls to prevent TypeError #42

@vzegnameta

Description

@vzegnameta

Description

The pdf-analyzer service crashes with TypeError: Cannot read properties of undefined (reading 'trim') when requests contain malformed or missing pdf_source parameters.

Error Logs

[/analyze] Error processing PDF: TypeError: Cannot read properties of undefined (reading 'trim')
    at validateLocalPath (file:///app/dist/pdf-utils.js:37:33)
    at readPdfBytes (file:///app/dist/pdf-utils.js:90:5)
    at analyzePdf (file:///app/dist/service.js:248:11)

Root Cause

The pdf-utils.ts file calls .trim() on string parameters without first validating that they are non-null and of type string. When a request contains an undefined or null pdf_source, the code crashes with an unhelpful TypeError instead of a descriptive validation error.

Solution

Add null/undefined checks before calling .trim() in the following functions in src/pdf-utils.ts:

Changes Required

1. isGeminiFileUri() function

export function isGeminiFileUri(source: string): boolean {
  if (source == null || typeof source !== "string") {
    return false;
  }
  return source.startsWith(GEMINI_FILE_URI_PREFIX);
}

2. isUrl() function

export function isUrl(source: string): boolean {
  if (source == null || typeof source !== "string") {
    return false;
  }
  if (isGeminiFileUri(source)) {
    return false;
  }
  try {
    const url = new URL(source);
    return url.protocol === "http:" || url.protocol === "https:";
  } catch {
    return false;
  }
}

3. validateLocalPath() function

export function validateLocalPath(pdfPath: string): void {
  if (pdfPath == null) {
    throw new Error("PDF path is required but was null or undefined");
  }

  if (typeof pdfPath !== "string") {
    throw new Error(`PDF path must be a string, got ${typeof pdfPath}`);
  }

  const trimmedPath = pdfPath.trim();

  if (trimmedPath === "") {
    throw new Error("PDF path cannot be empty");
  }

  if (!path.isAbsolute(trimmedPath)) {
    throw new Error(`PDF path must be absolute: ${trimmedPath}`);
  }

  if (!fs.existsSync(trimmedPath)) {
    throw new Error(`PDF file not found: ${trimmedPath}`);
  }

  const stats = fs.statSync(trimmedPath);
  if (stats.isDirectory()) {
    throw new Error(`Path is a directory, not a file: ${trimmedPath}`);
  }

  if (!trimmedPath.toLowerCase().endsWith(".pdf")) {
    throw new Error(`File is not a PDF: ${trimmedPath}`);
  }
}

4. fetchPdfFromUrl() function

export async function fetchPdfFromUrl(url: string): Promise<Buffer> {
  if (url == null) {
    throw new Error("URL is required but was null or undefined");
  }

  if (typeof url !== "string") {
    throw new Error(`URL must be a string, got ${typeof url}`);
  }

  const trimmedUrl = url.trim();
  if (trimmedUrl === "") {
    throw new Error("URL cannot be empty");
  }

  const controller = new AbortController();
  const timeoutId = setTimeout(() => controller.abort(), FETCH_TIMEOUT_MS);

  let response: Response;
  try {
    response = await fetch(trimmedUrl, { signal: controller.signal });
  } catch (error) {
    if (error instanceof Error && error.name === "AbortError") {
      throw new Error(`Request timed out after ${FETCH_TIMEOUT_MS / 1000}s`);
    }
    const message = error instanceof Error ? error.message : "Unknown error";
    throw new Error(`Failed to fetch URL: ${message}`);
  } finally {
    clearTimeout(timeoutId);
  }

  if (!response.ok) {
    throw new Error(`HTTP ${response.status}: ${response.statusText}`);
  }

  const contentType = response.headers.get("content-type");
  if (
    contentType &&
    !contentType.includes("application/pdf") &&
    !contentType.includes("octet-stream")
  ) {
    throw new Error(`URL does not point to a PDF file. Content-Type: ${contentType}`);
  }

  const arrayBuffer = await response.arrayBuffer();
  return Buffer.from(arrayBuffer);
}

5. readPdfBytes() function

export function readPdfBytes(pdfPath: string): Buffer {
  if (pdfPath == null) {
    throw new Error("PDF path is required but was null or undefined");
  }

  if (typeof pdfPath !== "string") {
    throw new Error(`PDF path must be a string, got ${typeof pdfPath}`);
  }

  validateLocalPath(pdfPath);
  return fs.readFileSync(pdfPath.trim());
}

Test Updates

Update the test in src/service.test.ts to reflect the new error message:

// Before
it("trims whitespace from paths", () => {
  expect(() => validateLocalPath("   ")).toThrow("must be absolute");
  expect(() => validateLocalPath("  /nonexistent.pdf  ")).toThrow("not found");
});

// After
it("trims whitespace from paths", () => {
  expect(() => validateLocalPath("   ")).toThrow("cannot be empty");
  expect(() => validateLocalPath("  /nonexistent.pdf  ")).toThrow("not found");
});

Benefits

  1. Better error messages: Instead of cryptic TypeError: Cannot read properties of undefined, users get descriptive errors like "PDF path is required but was null or undefined"
  2. Graceful handling: isUrl() and isGeminiFileUri() return false for invalid inputs instead of crashing
  3. Defensive programming: Validates inputs at function boundaries
  4. Easier debugging: Clear error messages help identify the source of malformed requests

Impact

This fix prevents service crashes and provides better error messages for API consumers when they send malformed requests.

Testing

All existing tests pass with these changes (except one unrelated test that fails due to missing fixture file in the test environment).

Metadata

Metadata

Assignees

No one assigned

    Labels

    bugSomething isn't working

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions