
Piper Files

Sean Finan edited this page Feb 9, 2026 · 1 revision

Create custom pipelines to extract more information than is available through the Default Clinical Pipeline. Specialized Analysis Engines are provided in various cTAKES modules, and Analysis Engines can be added to or removed from a pipeline to obtain the desired results.

There are four methods available to create custom pipelines.

  1. XML Descriptor files are the original method used to create pipelines in Apache UIMA™. Though self-descriptive, they are verbose and error-prone.
  2. Apache uimaFIT™ enables creation of pipelines through Java code. This greatly simplifies unit testing and experimentation.
  3. The PipelineBuilder class in ctakes-core is a simplified facade for uimaFIT™ factories and objects.
  4. Piper files are a modern equivalent of the XML descriptor files.

Piper files consist of basic commands and parameters in an easily readable flat format. For a reference to all commands and their use, see basic commands.

Step-by-step guide

  1. Create an empty text file. The standard file extension for piper files is .piper
  2. Use reader to specify a collection reader for your pipeline. To set values for parameters used by the reader class, add one or more name=value pairs after the reader name.
reader my.components.MyReader my/input/dir
  3. Use add to add annotation engines and output writers to your pipeline. To set values for parameters used by a component, add one or more name=value pairs after the component name.
add my.components.MyFirstAnnotator mySetting1=myValueA myDataDirSetting=my/data/dir
add my.components.MySecondAnnotator mySetting2=myValueB myDataDirSetting=my/data/dir
add my.components.MyThirdAnnotator mySetting3=myValueC myDataDirSetting=my/data/dir
  4. Use load to load other instructions and settings from another piper file. See Table 2 for piper files in cTAKES.
load my/pipelines/MySubPipeline
  5. The reader, load, and add* commands all take a component name or file path as their first parameter.
    If a class is not in a standard cTAKES module's cr, ae, or cc package, or a piper file is not in a standard module's pipeline/ directory, then the package or path must be specified for that component or file.
  6. Use package to simplify adding multiple pipeline components from a package not standard to cTAKES.
// Command cTAKES to search the package my.components for pipeline components.
package my.components
reader MyReader my/input/dir
add MyFirstAnnotator mySetting1=myValueA myDataDirSetting=my/data/dir
add MySecondAnnotator mySetting2=myValueB myDataDirSetting=my/data/dir
add MyThirdAnnotator mySetting3=myValueC myDataDirSetting=my/data/dir/XYZ

// Command cTAKES to search the directory my/pipelines for files. 
package my/pipelines
load MySubPipeline
  7. Use set to assign a value to a parameter used by the components that follow.
// Command cTAKES to search the package my.components for pipeline components.
package my.components
reader MyReader my/input/dir

// Command cTAKES to use a value for a named setting for all following instances not otherwise specified.
set myDataDirSetting=my/data/dir
add MyFirstAnnotator mySetting1=myValueA
add MySecondAnnotator mySetting2=myValueB
add MyThirdAnnotator mySetting3=myValueC myDataDirSetting=my/data/dir/XYZ

// Command cTAKES to search the directory my/pipelines for files. 
package my/pipelines
load MySubPipeline

A name=value pair on a component line will, for that component, override a set parameter value.
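
The override rule can be illustrated with a small Python sketch. This is a toy model written for this page, not the cTAKES parser: the function name resolve_parameters and the constant EXAMPLE_PIPER are hypothetical, and only the reader, add, and set commands and // comments are modeled. Other commands, such as package and load, are skipped, and arguments that are not name=value pairs are ignored.

```python
# Toy model of how `set` values and component-line name=value pairs combine.
# Assumption: this mirrors the behavior described on this page only; it is
# not the cTAKES implementation and all names here are hypothetical.

EXAMPLE_PIPER = """
// Command cTAKES to search the package my.components for pipeline components.
package my.components
reader MyReader my/input/dir

set myDataDirSetting=my/data/dir
add MyFirstAnnotator mySetting1=myValueA
add MySecondAnnotator mySetting2=myValueB
add MyThirdAnnotator mySetting3=myValueC myDataDirSetting=my/data/dir/XYZ
"""


def resolve_parameters(piper_text):
    """Return a list of (component_name, effective_parameters) tuples."""
    set_values = {}   # values assigned by `set`, visible to later lines only
    components = []
    for raw_line in piper_text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("//"):
            continue  # skip blank lines and comments
        command, *args = line.split()
        if command == "set":
            for pair in args:
                name, value = pair.split("=", 1)
                set_values[name] = value
        elif command in ("reader", "add") and args:
            component = args[0]
            # Start from the `set` values seen so far ...
            params = dict(set_values)
            # ... then let name=value pairs on this line override them.
            for arg in args[1:]:
                if "=" in arg:
                    name, value = arg.split("=", 1)
                    params[name] = value
            components.append((component, params))
        # Any other command (package, load, ...) is ignored in this sketch.
    return components


if __name__ == "__main__":
    for component, params in resolve_parameters(EXAMPLE_PIPER):
        print(component, params)
```

In this model MyFirstAnnotator and MySecondAnnotator receive myDataDirSetting=my/data/dir from the set line, MyThirdAnnotator keeps its own my/data/dir/XYZ, and MyReader is unaffected because it appears before set.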

Piper Commands

For a reference to all commands and their use, see basic commands.
