Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
400 changes: 400 additions & 0 deletions contrib/diagnose_corruption/check_flipped_bits.c

Large diffs are not rendered by default.

107 changes: 107 additions & 0 deletions contrib/externalSequenceProducer/main.c
Original file line number Diff line number Diff line change
@@ -0,0 +1,107 @@
/*
* Copyright (c) Yann Collet, Meta Platforms, Inc.
* All rights reserved.
*
* This source code is licensed under both the BSD-style license (found in the
* LICENSE file in the root directory of this source tree) and the GPLv2 (found
* in the COPYING file in the root directory of this source tree).
* You may select, at your option, one of the above-listed licenses.
*/

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <assert.h>

#define ZSTD_STATIC_LINKING_ONLY
#include "zstd.h"
#include "zstd_errors.h"
#include "sequence_producer.h" // simpleSequenceProducer

#define CHECK(res) \
do { \
if (ZSTD_isError(res)) { \
printf("ERROR: %s\n", ZSTD_getErrorName(res)); \
return 1; \
} \
} while (0) \
Comment on lines +21 to +27
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Resource leaks on error paths when CHECK fails.

The CHECK macro returns early on error but doesn't clean up previously allocated resources, causing memory leaks at Lines 48, 77, and 84.

Consider refactoring to use goto-based cleanup or refactor CHECK to support cleanup:

-#define CHECK(res)                                      \
-do {                                                    \
-    if (ZSTD_isError(res)) {                            \
-        printf("ERROR: %s\n", ZSTD_getErrorName(res));  \
-        return 1;                                       \
-    }                                                   \
-} while (0)                                             \

 int main(int argc, char *argv[]) {
+    ZSTD_CCtx* zc = NULL;
+    char* src = NULL;
+    char* dst = NULL;
+    char* val = NULL;
+    int ret = 0;
+
     if (argc != 2) {
         printf("Usage: externalSequenceProducer <file>\n");
         return 1;
     }
 
-    ZSTD_CCtx* const zc = ZSTD_createCCtx();
+    zc = ZSTD_createCCtx();
+    if (!zc) {
+        printf("ERROR: Failed to create compression context\n");
+        return 1;
+    }
 
     // ... rest of allocations with error checking ...
 
     {
         size_t const res = ZSTD_CCtx_setParameter(zc, ZSTD_c_enableSeqProducerFallback, 1);
-        CHECK(res);
+        if (ZSTD_isError(res)) {
+            printf("ERROR: %s\n", ZSTD_getErrorName(res));
+            ret = 1;
+            goto cleanup;
+        }
     }
 
     // ... similar for other CHECK calls ...
 
+cleanup:
+    ZSTD_freeCCtx(zc);
+    free(src);
+    free(dst);
+    free(val);
+    return ret;
 }

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In contrib/externalSequenceProducer/main.c around lines 21 to 27, the CHECK
macro returns immediately on error and leaks resources allocated earlier
(notably at lines ~48, ~77 and ~84); change error handling to perform cleanup
before returning by either (A) refactoring CHECK to accept a label name and do a
goto cleanup on error, or (B) remove the macro and replace each CHECK invocation
with an if (ZSTD_isError(res)) { fprintf(...); goto cleanup; } pattern. Add a
single cleanup: label near function exit that frees any allocated memory, closes
handles and releases resources in reverse allocation order, set an appropriate
non-zero exit code, and ensure all paths (success and error) jump to that
cleanup label.


int main(int argc, char *argv[]) {
if (argc != 2) {
printf("Usage: externalSequenceProducer <file>\n");
return 1;
}

ZSTD_CCtx* const zc = ZSTD_createCCtx();
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Add null check after context creation.

ZSTD_createCCtx() can return NULL on allocation failure. Using the context without validation can cause a segmentation fault.

Apply this diff to add a null check:

 ZSTD_CCtx* const zc = ZSTD_createCCtx();
+if (!zc) {
+    printf("ERROR: Failed to create compression context\n");
+    return 1;
+}
🤖 Prompt for AI Agents
In contrib/externalSequenceProducer/main.c around line 35, the code calls
ZSTD_createCCtx() and assigns to ZSTD_CCtx* const zc without checking for NULL;
add a null check immediately after creation and handle allocation failure (e.g.,
log an error, free any previously allocated resources, and return/exit with an
error code) so the function never dereferences zc when it is NULL; ensure the
success path continues to use zc and that any early-return paths clean up other
state as needed.


int simpleSequenceProducerState = 0xdeadbeef;

// Here is the crucial bit of code!
ZSTD_registerSequenceProducer(
zc,
&simpleSequenceProducerState,
simpleSequenceProducer
);

{
size_t const res = ZSTD_CCtx_setParameter(zc, ZSTD_c_enableSeqProducerFallback, 1);
CHECK(res);
Comment on lines +35 to +48
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

Check CCtx creation and sequence‑producer registration results

ZSTD_createCCtx() is used without checking for NULL, and ZSTD_registerSequenceProducer()’s return value is ignored. On allocation failure or API error this will lead to later dereferences of an invalid context or silently continuing after a failed registration.

Consider validating both and bailing out (with cleanup) on error, e.g.:

-    ZSTD_CCtx* const zc = ZSTD_createCCtx();
+    ZSTD_CCtx* const zc = ZSTD_createCCtx();
+    if (zc == NULL) {
+        fprintf(stderr, "ERROR: failed to create ZSTD_CCtx\n");
+        return 1;
+    }
@@
-    ZSTD_registerSequenceProducer(
-        zc,
-        &simpleSequenceProducerState,
-        simpleSequenceProducer
-    );
+    {
+        size_t const res = ZSTD_registerSequenceProducer(
+            zc,
+            &simpleSequenceProducerState,
+            simpleSequenceProducer);
+        CHECK(res);
+    }



🏁 Script executed:

#!/bin/bash
# Read the file and extract the relevant section
cat contrib/externalSequenceProducer/main.c | head -60 | tail -35

Length of output: 895


🏁 Script executed:

#!/bin/bash
# Get more context: find the function containing these lines and see cleanup
cat contrib/externalSequenceProducer/main.c

Length of output: 2951


🏁 Script executed:

#!/bin/bash
# Check if ZSTD_registerSequenceProducer is used elsewhere in the file
rg -n "registerSequenceProducer" contrib/externalSequenceProducer/main.c

Length of output: 37


🏁 Script executed:

#!/bin/bash
# Check for CHECK macro definition and error handling patterns
rg -n -B2 -A2 "CHECK|ZSTD_createCCtx|ZSTD_registerSequenceProducer" contrib/externalSequenceProducer/main.c | head -80

Length of output: 922


🏁 Script executed:

#!/bin/bash
# Look for resource cleanup: search for free() and ZSTD_freeCCtx() calls
rg -n "free|ZSTD_free" contrib/externalSequenceProducer/main.c

Length of output: 83


🏁 Script executed:

#!/bin/bash
# Verify ZSTD API: check if there's any documentation or header hints
fd -e h "zstd" --search-path contrib/ --search-path "." 2>/dev/null | head -10

Length of output: 337


Add NULL check for ZSTD_createCCtx() and verify ZSTD_registerSequenceProducer() return value

Both issues are confirmed in the code:

  1. Line 35: ZSTD_createCCtx() is called without NULL checking. If allocation fails, the NULL pointer is dereferenced at line 40.

  2. Lines 40-44: ZSTD_registerSequenceProducer() return value is ignored. A registration failure is not detected and silently continues.

Additionally, note that cleanup at lines 102-105 is only reachable on the success path. If any CHECK fails before that point (e.g., line 48), zc will leak.

The proposed diff in the review is incomplete—it wraps registerSequenceProducer in CHECK but omits the NULL check for createCCtx(). Both should be added:

ZSTD_CCtx* const zc = ZSTD_createCCtx();
if (zc == NULL) {
    fprintf(stderr, "ERROR: failed to create ZSTD_CCtx\n");
    return 1;
}

// ... later ...

{
    size_t const res = ZSTD_registerSequenceProducer(
        zc,
        &simpleSequenceProducerState,
        simpleSequenceProducer);
    CHECK(res);
}
🤖 Prompt for AI Agents
In contrib/externalSequenceProducer/main.c around lines 35–48 (and consider
cleanup at 102–105), ZSTD_createCCtx() is not NULL-checked and
ZSTD_registerSequenceProducer() return value is ignored; add an immediate NULL
check after ZSTD_createCCtx() that prints an error and returns non-zero if
allocation fails, wrap the ZSTD_registerSequenceProducer() call and capture its
return value then CHECK() it (or handle the error), and ensure any early returns
free the context (call ZSTD_freeCCtx or equivalent) to avoid leaking zc when
later CHECKs fail.

}

FILE *f = fopen(argv[1], "rb");
assert(f);
{
int const ret = fseek(f, 0, SEEK_END);
assert(ret == 0);
}
size_t const srcSize = ftell(f);
{
int const ret = fseek(f, 0, SEEK_SET);
assert(ret == 0);
}

char* const src = malloc(srcSize + 1);
assert(src);
{
size_t const ret = fread(src, srcSize, 1, f);
assert(ret == 1);
int const ret2 = fclose(f);
assert(ret2 == 0);
}
Comment on lines +51 to +70
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Replace assert() with proper error handling for file operations.

Using assert() for file I/O error checking is problematic:

  • Assertions are disabled in release builds (-DNDEBUG), leaving no error checking in production
  • Abrupt termination provides poor user experience instead of clean error messages

Additionally, Line 63 allocates srcSize + 1 bytes but only reads srcSize bytes, suggesting possible intent to null-terminate that isn't implemented.

Apply this diff to add proper error handling:

 FILE *f = fopen(argv[1], "rb");
-assert(f);
+if (!f) {
+    printf("ERROR: Failed to open file %s\n", argv[1]);
+    ZSTD_freeCCtx(zc);
+    return 1;
+}
 {
-    int const ret = fseek(f, 0, SEEK_END);
-    assert(ret == 0);
+    if (fseek(f, 0, SEEK_END) != 0) {
+        printf("ERROR: Failed to seek in file\n");
+        fclose(f);
+        ZSTD_freeCCtx(zc);
+        return 1;
+    }
 }
 size_t const srcSize = ftell(f);
 {
-    int const ret = fseek(f, 0, SEEK_SET);
-    assert(ret == 0);
+    if (fseek(f, 0, SEEK_SET) != 0) {
+        printf("ERROR: Failed to seek in file\n");
+        fclose(f);
+        ZSTD_freeCCtx(zc);
+        return 1;
+    }
 }
 
-char* const src = malloc(srcSize + 1);
-assert(src);
+char* const src = malloc(srcSize);
+if (!src) {
+    printf("ERROR: Failed to allocate memory\n");
+    fclose(f);
+    ZSTD_freeCCtx(zc);
+    return 1;
+}
 {
-    size_t const ret = fread(src, srcSize, 1, f);
-    assert(ret == 1);
+    if (fread(src, srcSize, 1, f) != 1) {
+        printf("ERROR: Failed to read file\n");
+        free(src);
+        fclose(f);
+        ZSTD_freeCCtx(zc);
+        return 1;
+    }
-    int const ret2 = fclose(f);
-    assert(ret2 == 0);
+    fclose(f);
 }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
FILE *f = fopen(argv[1], "rb");
assert(f);
{
int const ret = fseek(f, 0, SEEK_END);
assert(ret == 0);
}
size_t const srcSize = ftell(f);
{
int const ret = fseek(f, 0, SEEK_SET);
assert(ret == 0);
}
char* const src = malloc(srcSize + 1);
assert(src);
{
size_t const ret = fread(src, srcSize, 1, f);
assert(ret == 1);
int const ret2 = fclose(f);
assert(ret2 == 0);
}
FILE *f = fopen(argv[1], "rb");
if (!f) {
printf("ERROR: Failed to open file %s\n", argv[1]);
ZSTD_freeCCtx(zc);
return 1;
}
{
if (fseek(f, 0, SEEK_END) != 0) {
printf("ERROR: Failed to seek in file\n");
fclose(f);
ZSTD_freeCCtx(zc);
return 1;
}
}
size_t const srcSize = ftell(f);
{
if (fseek(f, 0, SEEK_SET) != 0) {
printf("ERROR: Failed to seek in file\n");
fclose(f);
ZSTD_freeCCtx(zc);
return 1;
}
}
char* const src = malloc(srcSize);
if (!src) {
printf("ERROR: Failed to allocate memory\n");
fclose(f);
ZSTD_freeCCtx(zc);
return 1;
}
{
if (fread(src, srcSize, 1, f) != 1) {
printf("ERROR: Failed to read file\n");
free(src);
fclose(f);
ZSTD_freeCCtx(zc);
return 1;
}
fclose(f);
}


Comment on lines +51 to +71
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

Don’t rely on assert() for runtime I/O/allocation failures

fopen(), fseek(), ftell(), malloc(), fread(), and fclose() are only guarded by assert(). In release builds with NDEBUG defined these asserts disappear, and failures (e.g., missing file, OOM, short read) will lead to undefined behavior (e.g., calling fseek() on a NULL FILE*, using uninitialized buffers, or mis‑sized allocations).

Prefer explicit checks and graceful exits, for example:

-    FILE *f = fopen(argv[1], "rb");
-    assert(f);
+    FILE *f = fopen(argv[1], "rb");
+    if (!f) {
+        perror("fopen");
+        ZSTD_freeCCtx(zc);
+        return 1;
+    }
@@
-    size_t const srcSize = ftell(f);
+    long const tellRes = ftell(f);
+    if (tellRes < 0) {
+        perror("ftell");
+        fclose(f);
+        ZSTD_freeCCtx(zc);
+        return 1;
+    }
+    size_t const srcSize = (size_t)tellRes;
@@
-    char* const src = malloc(srcSize + 1);
-    assert(src);
+    char* const src = (char*)malloc(srcSize);
+    if (!src) {
+        fclose(f);
+        ZSTD_freeCCtx(zc);
+        return 1;
+    }
@@
-        size_t const ret = fread(src, srcSize, 1, f);
-        assert(ret == 1);
+        size_t const ret = fread(src, 1, srcSize, f);
+        if (ret != srcSize) {
+            fprintf(stderr, "ERROR: short read\n");
+            fclose(f);
+            free(src);
+            ZSTD_freeCCtx(zc);
+            return 1;
+        }

I'll verify the review comment by examining the actual code and context in the repository.


🏁 Script executed:

#!/bin/bash
# Find and examine the file
fd -t f "main.c" | grep -i "externalSequenceProducer"

Length of output: 39


🏁 Script executed:

#!/bin/bash
# Extract the full context around lines 51-71, plus surrounding code
if [ -f "contrib/externalSequenceProducer/main.c" ]; then
  echo "=== Full file context (lines 1-100) ==="
  head -n 100 "contrib/externalSequenceProducer/main.c" | cat -n
  echo ""
  echo "=== Checking for ZSTD_freeCCtx and zc variable ==="
  grep -n "ZSTD_freeCCtx\|zc" "contrib/externalSequenceProducer/main.c" | head -20
fi

Length of output: 3919


🏁 Script executed:

#!/bin/bash
# Check the entire file for assert usage and similar error handling patterns
if [ -f "contrib/externalSequenceProducer/main.c" ]; then
  echo "=== All assert() calls in file ==="
  grep -n "assert" "contrib/externalSequenceProducer/main.c"
fi

Length of output: 252


I need to see the complete file to verify the full cleanup path and ensure the review's suggested fixes are complete.


🏁 Script executed:

#!/bin/bash
# Get the complete file content
wc -l "contrib/externalSequenceProducer/main.c"
echo ""
echo "=== Full file content ==="
cat "contrib/externalSequenceProducer/main.c"

Length of output: 3022


Replace assert()-based error checks with explicit error handling to prevent undefined behavior in release builds.

The code relies entirely on assert() for runtime error handling at lines 52, 55, 60, 64, 67, 69, 74, and 80. With NDEBUG defined (standard in release builds), these asserts vanish, causing undefined behavior:

  • fopen() returning NULL leads to fseek() called on invalid pointer
  • malloc() failures go undetected, using NULL pointers
  • fread() short reads undetected

Additionally, there's a critical resource leak at line 99: the early return 1; on memcmp mismatch skips cleanup of allocated memory and the compression context, even in the current code.

Implement explicit checks with proper cleanup paths for all I/O and allocation operations (lines 51–70), and ensure all error paths properly free resources before returning.

size_t const dstSize = ZSTD_compressBound(srcSize);
char* const dst = malloc(dstSize);
assert(dst);

size_t const cSize = ZSTD_compress2(zc, dst, dstSize, src, srcSize);
CHECK(cSize);

char* const val = malloc(srcSize);
assert(val);
Comment on lines +72 to +80
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Add null checks after memory allocations.

Both malloc() calls can return NULL on allocation failure. Using these pointers without validation can cause undefined behavior.

Apply this diff to add null checks:

 size_t const dstSize = ZSTD_compressBound(srcSize);
 char* const dst = malloc(dstSize);
-assert(dst);
+if (!dst) {
+    printf("ERROR: Failed to allocate compression buffer\n");
+    free(src);
+    ZSTD_freeCCtx(zc);
+    return 1;
+}
 
 size_t const cSize = ZSTD_compress2(zc, dst, dstSize, src, srcSize);
 CHECK(cSize);
 
 char* const val = malloc(srcSize);
-assert(val);
+if (!val) {
+    printf("ERROR: Failed to allocate validation buffer\n");
+    free(dst);
+    free(src);
+    ZSTD_freeCCtx(zc);
+    return 1;
+}
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
size_t const dstSize = ZSTD_compressBound(srcSize);
char* const dst = malloc(dstSize);
assert(dst);
size_t const cSize = ZSTD_compress2(zc, dst, dstSize, src, srcSize);
CHECK(cSize);
char* const val = malloc(srcSize);
assert(val);
size_t const dstSize = ZSTD_compressBound(srcSize);
char* const dst = malloc(dstSize);
if (!dst) {
printf("ERROR: Failed to allocate compression buffer\n");
free(src);
ZSTD_freeCCtx(zc);
return 1;
}
size_t const cSize = ZSTD_compress2(zc, dst, dstSize, src, srcSize);
CHECK(cSize);
char* const val = malloc(srcSize);
if (!val) {
printf("ERROR: Failed to allocate validation buffer\n");
free(dst);
free(src);
ZSTD_freeCCtx(zc);
return 1;
}


{
size_t const res = ZSTD_decompress(val, srcSize, dst, cSize);
CHECK(res);
}

if (memcmp(src, val, srcSize) == 0) {
printf("Compression and decompression were successful!\n");
printf("Original size: %lu\n", srcSize);
printf("Compressed size: %lu\n", cSize);
Comment on lines +89 to +90
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Use portable format specifier for size_t.

Using %lu for size_t is not portable across all platforms. Use %zu, the standard C99 format specifier for size_t.

Apply this diff:

-    printf("Original size: %lu\n", srcSize);
-    printf("Compressed size: %lu\n", cSize);
+    printf("Original size: %zu\n", srcSize);
+    printf("Compressed size: %zu\n", cSize);
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
printf("Original size: %lu\n", srcSize);
printf("Compressed size: %lu\n", cSize);
printf("Original size: %zu\n", srcSize);
printf("Compressed size: %zu\n", cSize);
🤖 Prompt for AI Agents
In contrib/externalSequenceProducer/main.c around lines 89 to 90, the printf
calls use the non-portable "%lu" to print variables of type size_t; change the
format specifier to "%zu" for both prints (or cast to unsigned long and keep
"%lu" if you intentionally need that type), so use printf("Original size:
%zu\n", srcSize); and printf("Compressed size: %zu\n", cSize); to ensure
portable printing of size_t.

} else {
printf("ERROR: input and validation buffers don't match!\n");
for (size_t i = 0; i < srcSize; i++) {
if (src[i] != val[i]) {
printf("First bad index: %zu\n", i);
break;
}
}
return 1;
}

ZSTD_freeCCtx(zc);
free(src);
free(dst);
free(val);
return 0;
}
Loading