Anonymize DICOM image data by detecting text and replacing it with black squares

mmiv-center/RewritePixel

Sanitize DICOM image data with text annotations

This project uses the Tesseract OCR engine (version >4.0) to identify text that is burned into DICOM image data. For each text fragment (usually a word), a black square is written into the DICOM pixel data. Inspect the resulting DICOM files: the goal is output free of participant-identifying information, but this is not guaranteed.
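The masking step described above can be illustrated with a minimal Python sketch. It is not the project's implementation (the tool itself is built with CMake against gdcm and tesseract); `WordBox` and `black_out` are hypothetical names, the image is a plain list of rows rather than real DICOM pixel data, and the sketch assumes that boxes at or above the confidence threshold are masked and that a pixel value of 0 displays as black.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class WordBox:
    """Hypothetical OCR result: one detected word and its bounding box."""
    text: str
    confidence: float  # assumed scale 0..100, as in the --confidence option
    left: int
    top: int
    width: int
    height: int


def black_out(pixels: List[List[int]], boxes: List[WordBox], threshold: float) -> int:
    """Set every pixel inside sufficiently confident boxes to 0, in place.

    Returns the number of boxes that were masked.
    Assumption: boxes with confidence >= threshold are masked.
    """
    rows = len(pixels)
    cols = len(pixels[0]) if rows else 0
    masked = 0
    for box in boxes:
        if box.confidence < threshold:
            continue  # detection too uncertain, leave the pixels untouched
        # Clip the box to the image so partial overlaps do not raise errors.
        y0, y1 = max(box.top, 0), min(box.top + box.height, rows)
        x0, x1 = max(box.left, 0), min(box.left + box.width, cols)
        for y in range(y0, y1):
            for x in range(x0, x1):
                pixels[y][x] = 0  # assumed to render as black
        masked += 1
    return masked


# Usage: a 4x6 all-white image with one confident and one doubtful detection.
image = [[255] * 6 for _ in range(4)]
detections = [
    WordBox("DOE", 91.0, left=1, top=1, width=3, height=2),
    WordBox("noise", 12.0, left=0, top=3, width=2, height=1),
]
black_out(image, detections, threshold=50.0)
```

Only the first detection is masked here. Because a threshold like this decides which detections are kept, manual inspection of the output remains necessary.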

Warning: This program does not try to anonymize DICOM tags. Please check out the https://github.com/mmiv-center/DICOMAnonymizer project for a fast tag anonymizer.

Warning: There is no information yet on false-positive or false-negative detection rates; verify the output by hand!

Build

We use CMake to generate the makefile for compilation. The program depends on several libraries (gdcm, tesseract); the Dockerfile is the best reference for how to compile this program.

# in the best of all worlds this is sufficient to create the build system
cmake -DCMAKE_BUILD_TYPE=Debug .
make

Using docker:

> docker build -t rewritepixel -f Dockerfile .
...
> docker run -it --rm rewritepixel 
USAGE: rewritepixel [options]

Options:
  --help              Rewrite DICOM images to remove text. Read DICOM image
                      series and write out an anonymized version of the image
                      data.
  --input, -i         Input directory.
  --output, -o        Output directory.
  --confidence, -c    Confidence threshold (0..100).
  --numthreads, -t    How many threads should be used (default 4).
  --storemapping, -m  Store the detected strings as a JSON file.

Examples:
  rewritepixel --input directory --output directory
  rewritepixel --help

Notice: Docker does not automatically see your system's directories. Use the '-v' option to make a folder visible inside the container before you access data stored on your system. In the example below, the data folders 'test_input' and 'test_output' are in the Documents folder of the current user's home directory.

docker run -it -v /home/<user name>/Documents/:/data --rm rewritepixel -i /data/test_input/ -o /data/test_output/
