submissions_broker doesn't recover from record file corruption #99

@marcmengel

Description

The submissions broker can get into a state where the JSON files it keeps with POMS for queues, etc. become corrupted, at
which point we get repeated log entries like:

2026-01-11 16:18:47,540 helper_functions.py:95:{'service': 'Agent Queue', 'message': JSONDecodeError('Expecting value: line 1 column 1 (char 0)'), 'level': 'exception', 'run_number': 13696, 'timestamp': '2026-01-11T22:18:47.540687+00:00', 'runtime': '1 minutes, 6 seconds', 'class': 'Agent', 'function': 'poll'}
Traceback (most recent call last):
  File "/home/poms/poms/.venv/lib64/python3.9/site-packages/requests/models.py", line 971, in json
    return complexjson.loads(self.text, **kwargs)
  File "/usr/lib64/python3.9/json/__init__.py", line 346, in loads
    return _default_decoder.decode(s)
  File "/usr/lib64/python3.9/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/lib64/python3.9/json/decoder.py", line 355, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

After this the broker basically stops updating anything, but to supervisord it still appears to be running.

We need to figure out how to handle this case and have the broker recover somehow, perhaps by re-initializing the JSON files.
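
One possible shape for the re-initialization idea, as a rough sketch only: the function names `load_state` and `save_state`, the `.corrupt.<timestamp>` suffix, and the default-state argument are all hypothetical and not taken from the broker code. The assumption is that a record file which fails to parse can be moved aside and replaced with a known-good default, and that writing through a temporary file plus `os.replace` makes a half-written file less likely in the first place.

```python
import json
import os
import tempfile
import time


def save_state(path, state):
    """Write `state` as JSON to `path` via a temp file and an atomic rename."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory, prefix=".tmp-", suffix=".json")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            json.dump(state, fh)
            fh.flush()
            os.fsync(fh.fileno())
        # os.replace swaps the file in one step, so readers see either
        # the old contents or the new contents, never a partial write.
        os.replace(tmp, path)
    except BaseException:
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise


def load_state(path, default):
    """Return the JSON stored in `path`; re-initialize it if missing or corrupt."""
    try:
        with open(path, "r", encoding="utf-8") as fh:
            return json.load(fh)
    except FileNotFoundError:
        save_state(path, default)
        return default
    except (json.JSONDecodeError, UnicodeDecodeError):
        # Keep the unreadable file for later inspection (hypothetical naming),
        # then start over from the default state.
        os.replace(path, f"{path}.corrupt.{int(time.time())}")
        save_state(path, default)
        return default
```

An empty file raises the same "Expecting value: line 1 column 1 (char 0)" error shown in the log, so a call such as `load_state("queue.json", {"queue": []})` (file name and default are made up) would move the empty file aside and return the default instead of failing on every poll. Whether silently dropping the old queue contents is acceptable is a separate decision for the broker.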

Labels: bug