WIP Fix PII badger background job#2018

Draft
gbp wants to merge 1 commit into master from ppi-badger-job-fixes

gbp commented Jul 3, 2025

The PII badger background job sometimes breaks on files that have redactions or masks applied.

For some XLSX files, text masking of email addresses was applied to the underlying ZIP data, which meant the Python Excel Analyzer code couldn't decompress the file and analyse its contents.
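To illustrate the failure mode, here is a minimal sketch in Python. It is not the code from this branch: `build_archive`, `mask_bytes` and `can_read` are hypothetical helpers, and the archive uses `ZIP_STORED` so that the address is guaranteed to appear literally in the raw bytes. Real XLSX files normally use deflate, where a byte-level mask would more likely surface as a decompression error rather than a checksum mismatch.

```python
import io
import zipfile


def build_archive(text: bytes) -> bytes:
    """Build a tiny ZIP with one uncompressed member (stand-in for an XLSX part)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_STORED) as zf:
        zf.writestr("xl/sharedStrings.xml", text)
    return buf.getvalue()


def mask_bytes(raw: bytes, secret: bytes) -> bytes:
    """Hypothetical byte-level mask: same-length replacement on the raw container."""
    return raw.replace(secret, b"x" * len(secret))


def can_read(raw: bytes) -> bool:
    """Return True if every member can be read and passes its CRC-32 check."""
    try:
        with zipfile.ZipFile(io.BytesIO(raw)) as zf:
            for name in zf.namelist():
                zf.read(name)
        return True
    except zipfile.BadZipFile:
        return False


email = b"person@example.com"
original = build_archive(b"<si><t>Contact " + email + b"</t></si>")
masked = mask_bytes(original, email)

print(can_read(original))
print(can_read(masked))
```

The mask keeps the archive the same length, so the ZIP directory still parses, but the member data no longer matches the checksum recorded for it and the read is rejected. Masking the extracted text instead of the container bytes would avoid this.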

The patch for the ActiveStorage mirror service is needed for our server deployment: these jobs run on a secondary application instance which doesn't have the raw email on disk, so we have to rely on fetching the S3-mirrored version. This patch should probably be moved elsewhere.
