From 45a721d01dfe56fc4800b95da52e9b4193b76d04 Mon Sep 17 00:00:00 2001
From: willpartcl
Date: Sun, 11 Jan 2026 15:34:26 -0800
Subject: [PATCH] Fix protobuf parser regex to handle scientific notation

Problem:
The regex pattern on line 241 could not parse floating-point values
written in scientific notation with a negative exponent (e.g.,
1.42109e-16), causing failures when loading 15 of the 17 IBM (ICCAD04)
benchmarks.

Root Cause:
The pattern `\-*\w+\.*\/{0,1}\w*[\w+\/{0,1}\w*]*` accepts `-` only at the
start of a token. Its trailing `[...]` is a character class that does not
contain `-`, so the match stops at the exponent sign and the number is
split into two tokens:
- Input: "f: 1.42109e-16"
- Parsed as: ['f', '1.42109e', '-16']
- float('1.42109e') then fails with ValueError

Solution:
Updated the regex to match scientific notation explicitly first, then
fall back to the other token patterns:
- Pattern: r'[-+]?\d+\.?\d*[eE][-+]?\d+|[-]?\w+\.?[\w/]*'
- Now parses: ['f', '1.42109e-16']
- Correctly handles positive and negative exponents

Testing:
Verified that all 17 IBM benchmarks now parse successfully:
- ibm01-ibm18 (excluding ibm05) all load without errors
- Regex still correctly handles:
  * Regular floats (0.4, -0.4)
  * Integers (123)
  * Strings (TOP, BOTTOM)
  * Paths (foo/bar)
  * Scientific notation (1.42109e-16, 5.68434e+10)

Impact:
This fix enables plc_client_os to parse the full ICCAD04 benchmark suite
without requiring Circuit Training's proprietary parser.

Signed-off-by: willpartcl
---
 CodeElements/Plc_client/plc_client_os.py | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/CodeElements/Plc_client/plc_client_os.py b/CodeElements/Plc_client/plc_client_os.py
index 665b4b61..773e77ef 100644
--- a/CodeElements/Plc_client/plc_client_os.py
+++ b/CodeElements/Plc_client/plc_client_os.py
@@ -238,7 +238,9 @@ def __read_protobuf(self):
  # advance, expect value item
  line = fp.readline()
- line_item = re.findall(r'\-*\w+\.*\/{0,1}\w*[\w+\/{0,1}\w*]*', line)
+ # Fixed regex to handle scientific notation (e.g., 1.42109e-16)
+ # Pattern matches: scientific notation OR identifiers/paths/numbers
+ line_item = re.findall(r'[-+]?\d+\.?\d*[eE][-+]?\d+|[-]?\w+\.?[\w/]*', line)
  attr_dict[key] = line_item
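
Reviewer note: the behavior change can be reproduced outside plc_client_os with a small standalone sketch. The two patterns are copied from the diff above. The `tokenize` helper and the sample lines are hypothetical, made up for illustration from the value types listed under Testing. They are not taken from the benchmark files, and this is not the project's test suite.

```python
import re

# Pattern removed by this patch (copied from the '-' line of the diff).
OLD_PATTERN = r'\-*\w+\.*\/{0,1}\w*[\w+\/{0,1}\w*]*'
# Pattern added by this patch (copied from the '+' line of the diff).
NEW_PATTERN = r'[-+]?\d+\.?\d*[eE][-+]?\d+|[-]?\w+\.?[\w/]*'


def tokenize(pattern, line):
    """Hypothetical helper: split one 'key: value' line into tokens."""
    return re.findall(pattern, line)


# Illustrative lines, one per value type named in the commit message.
SAMPLES = [
    "f: 0.4",
    "f: -0.4",
    "i: 123",
    "placeholder: TOP",
    "input: foo/bar",
    "f: 5.68434e+10",
    "f: 1.42109e-16",
]

for sample in SAMPLES:
    old_tokens = tokenize(OLD_PATTERN, sample)
    new_tokens = tokenize(NEW_PATTERN, sample)
    status = "same" if old_tokens == new_tokens else "CHANGED"
    print(f"{sample!r:22} old={old_tokens} new={new_tokens} [{status}]")

# The failing case from the commit message:
#   old -> ['f', '1.42109e', '-16']
#   new -> ['f', '1.42109e-16']
```

Only the negative-exponent line should be reported as CHANGED. The old pattern already kept 5.68434e+10 in one piece because `+` appears inside its trailing character class, which is why the failures were limited to negative exponents.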