Handling blank lines in x12xml_simple.seg, x12file._parse_segment#35
gumption wants to merge 1 commit into azoner:master
Blank lines in 835 files were causing AttributeError exceptions in x12xml_simple._seg() and IndexError exceptions in x12file._parse_segment(). Added code to check for error conditions, which may or may not align with desired behavior.
|
I'll take a look at the pull requests. Any unhandled exception like this [...] There is a limit to the level of malformed input this library should [...]
John Holland
On Fri, Mar 6, 2015 at 12:08 PM, Joe McCarthy notifications@github.com
|
I've been parsing a bunch of 835 files and found the following error terminating the parsing of many of them:
Traceback (most recent call last):
  ...
  File "/usr/local/lib/python2.7/dist-packages/pyx12/x12n_document.py", line 240, in x12n_document
    xmldoc.seg(node, seg)
  File "/usr/local/lib/python2.7/dist-packages/pyx12/x12xml_simple.py", line 72, in seg
    if child_node.usage == 'N' or seg_data.get('%02i' % (i + 1)).is_empty():
AttributeError: 'NoneType' object has no attribute 'usage'
It appears this error is caused by 835 files that have blank lines, which I realize are malformed (out of spec), but my assumption is that the goal is to enable X12 parsing to continue (i.e., not terminate) while logging errors encountered during parsing.
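To make the "log and continue" idea concrete, here is a minimal standalone sketch. It is not pyx12 code: the function name iter_segments, the logger name, and the hard-coded "~" segment terminator are all assumptions made for illustration (in a real interchange the terminator is defined by the ISA segment). It only shows how empty segments produced by blank lines could be logged and skipped rather than passed on to the parser.

```python
import logging

# Hypothetical logger name, used only for this sketch.
logger = logging.getLogger("x12_blank_line_sketch")


def iter_segments(raw, seg_term="~"):
    """Yield (position, segment) pairs, skipping empty segments.

    Assumption: segments are separated by `seg_term` and may be
    surrounded by stray newlines or blank lines.
    """
    for position, chunk in enumerate(raw.split(seg_term), start=1):
        segment = chunk.strip()  # drop newlines around the segment text
        if not segment:
            # Blank line or trailing terminator: report it and keep going.
            logger.warning("Skipping empty segment at position %d", position)
            continue
        yield position, segment


# Made-up sample data with a blank line and an empty segment.
raw = "ISA*00*test~\n\nGS*HP~\n~IEA*1*000000001~\n"
segments = list(iter_segments(raw))
```

With this approach the blank entries still show up in the log, so the file is flagged as out of spec, but the remaining segments are handed on in order with their original positions.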
I fixed this error by adding a child_node is None clause in that if statement, but I see there is an else clause at the end that raises an EngineError, so I'm not sure if this fix aligns with the desired behavior.
Once this was fixed, a new IndexError was generated by x12file._parse_segment(), due - I think - to the presence of a blank line between an invalid IEA segment and a valid ISA segment.
I fixed this by adding a check for empty loops before checking the preceding loop to generate error messages, but was not sure what error code to use ... and am less confident that my fix in this file is consistent with desired behavior.
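The two guards described above can be sketched in isolation. This is not the actual patch and does not use pyx12's classes: FakeChildNode, FakeElement, should_skip, and preceding_loop are hypothetical stand-ins, and the assumption that the open loops are kept in a plain list is mine. The sketch only shows the ordering that matters: test for None (or for an empty list) before touching an attribute or index.

```python
class FakeChildNode:
    """Hypothetical stand-in for a map node; only `usage` is modelled."""

    def __init__(self, usage):
        self.usage = usage


class FakeElement:
    """Hypothetical stand-in for an element value with is_empty()."""

    def __init__(self, value):
        self.value = value

    def is_empty(self):
        return self.value == ""


def should_skip(child_node, element):
    """Return True when the element should not be written out."""
    # The None check comes first, so .usage is never read from a missing node.
    if child_node is None:
        return True
    return child_node.usage == "N" or element.is_empty()


def preceding_loop(loops):
    """Return the most recent loop, or None when no loop is open.

    Assumption: open loops are tracked in a list. Indexing loops[-1]
    on an empty list raises IndexError, so emptiness is tested first.
    """
    if not loops:
        return None
    return loops[-1]
```

Whether a missing node should be silently skipped like this, or reported through the existing EngineError path, is exactly the open question raised above; the sketch takes the "skip" option only to show where the check sits.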
FWIW, here is the section of the log file that is generated for one of the files with errant blank lines and a malformed IEA segment:
20150306 16:37:58 pyx12.error_handler ERROR: Line:4383 SEG:1 - Segment
ISA*00 not found. Started at /ISA_LOOP/IEA
20150306 16:37:58 pyx12.error_handler ERROR: No current segment in error_handler. Line:4383 SEG:1 - Segment identifier "
ISA" is invalid
20150306 16:37:58 pyx12.error_handler ERROR: Line:4610 SEG:5 - Segment IEA exceeded max count. Found 2, should have 1
20150306 16:37:58 pyx12.error_handler ERROR: Line:4382 ISA:000 - IEA loop with malformed preceding segment
I thought it might be more useful to submit a pull request - even though one or both fixes may not be acceptable - to help more easily identify the problem areas. I suspect that if these errors are to be caught, a more extensive set of error checks will need to be added to one or both files.
If it is more helpful to simply report errors than trying to fix them, let me know, and I'll switch tactics.