
Commit b980964

Author: Dylan Huang
Enhance JSON line error handling in load_jsonl function
- Added regex-based extraction of "row_id" to provide more context in error messages when JSON parsing fails. This improvement aids in debugging by including the problematic row ID in the raised ValueError.
1 parent ca126dd commit b980964


1 file changed, +6 -0 lines changed

eval_protocol/common_utils.py

Lines changed: 6 additions & 0 deletions
@@ -1,4 +1,5 @@
 import json
+import re
 from typing import Any, Dict, List


@@ -20,5 +21,10 @@ def load_jsonl(file_path: str) -> List[Dict[str, Any]]:
                 data.append(json.loads(line.strip()))
             except json.JSONDecodeError as e:
                 print(f"Error parsing JSON line for file {file_path} at line {line_number}")
+                # attempt to find "row_id" in the line by finding index of "row_id" and performing regex of `"row_id": (.*),`
+                row_id_index = line.find("row_id")
+                if row_id_index != -1:
+                    row_id = re.search(r'"row_id": (.*),', line[row_id_index:])
+                    raise ValueError(f"{e.msg} at line {line_number}: {line} ({row_id})")
                 raise e
     return data
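
Review note: in the added lines, `row_id` holds the result of `re.search`, which is a `re.Match` object or `None`, so the f-string interpolates the match object's repr, not the captured ID. The pattern is also applied to a slice that starts at `row_id`, after the opening quote, so `"row_id": (.*),` cannot match there and `row_id` would be `None`. The sketch below is one possible follow-up, not the committed code. It assumes the intent is to report only the ID value. The names `extract_row_id` and `load_jsonl_with_row_id`, the pattern, the 1-based line numbering, and the skipping of blank lines are all hypothetical choices made for illustration.

```python
import json
import re
from typing import Any, Dict, List, Optional

# Hypothetical pattern: capture either a quoted string or a bare token
# (number, partial value) that follows the "row_id" key.
_ROW_ID_RE = re.compile(r'"row_id"\s*:\s*("(?:[^"\\]|\\.)*"|[^,}\s]+)')


def extract_row_id(line: str) -> Optional[str]:
    """Best-effort lookup of a "row_id" value in a line that failed to parse."""
    match = _ROW_ID_RE.search(line)
    if match is None:
        return None
    # group(1) is the captured value; drop surrounding quotes if present.
    return match.group(1).strip('"')


def load_jsonl_with_row_id(file_path: str) -> List[Dict[str, Any]]:
    """Load a JSONL file, naming the row_id (when found) on a parse failure."""
    data: List[Dict[str, Any]] = []
    with open(file_path, "r", encoding="utf-8") as f:
        # Assumption: line numbers are reported 1-based.
        for line_number, line in enumerate(f, start=1):
            stripped = line.strip()
            if not stripped:
                continue  # Assumption: blank lines are skipped, not errors.
            try:
                data.append(json.loads(stripped))
            except json.JSONDecodeError as e:
                row_id = extract_row_id(stripped)
                if row_id is not None:
                    # Chain the original error so its traceback is preserved.
                    raise ValueError(
                        f"{e.msg} at line {line_number}: {stripped} (row_id={row_id})"
                    ) from e
                raise  # No row_id found: re-raise the original error unchanged.
    return data
```

Because `json.JSONDecodeError` is a subclass of `ValueError`, a caller that catches `ValueError` handles both the enriched error and the re-raised original.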
