Blueprint for Dimension reducer / Data Processor #362

@odunbar

Proposal

Construct a data-processing tool that unifies how data is processed before it is passed to the emulators.

Takes in

  • Data: input-output pairs, or the EKI object (with the input-output pairs extracted internally)
  • A schedule of data-processing and dimension-reduction stages

Gives back

  • the processed data pairs

Example

The user could specify N-stage processing.
In each stage a data processor is applied to the inputs ("in"), the outputs ("out"), or both together ("joint"):

process_schedule = [
    ("in", DataProcessor1(...)), # first process inputs with processor 1
    ("out", DataProcessor2(...)), # next process outputs with processor 2
    ("joint", DataProcessor3(...)), # next jointly process inputs and outputs with processor 3
    ("out", DataProcessor4(...)), # finally, process outputs with processor 4
]

The user then builds emulators on the processed data

io_pairs_processed = process_data(io_pairs, process_schedule)
# one catch: interfaces must handle that io_pairs_processed may be a different dimension to io_pairs
io_pairs_recovered = reverse_process_data(io_pairs_processed, process_schedule)
# one catch: io_pairs_recovered is back in the dimension of io_pairs, but may only approximate it if a stage discarded information
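
As a rough illustration of how a schedule could drive the two calls above, here is a minimal pure-Python sketch. Everything in it is an assumption made for illustration: the `fit` / `encode` / `decode` method names, the `Standardize` and `KeepFirst` processors (the latter is only a stand-in for a real dimension reducer), and the choice to store fitted state on the processor objects. "joint" stages are left out to keep it short.

```python
from typing import List

Matrix = List[List[float]]  # one row per sample, one column per dimension


class Standardize:
    """Hypothetical processor: centre each column and scale it to unit spread."""

    def fit(self, data: Matrix) -> None:
        n, dim = len(data), len(data[0])
        self.mean = [sum(row[j] for row in data) / n for j in range(dim)]
        var = [sum((row[j] - self.mean[j]) ** 2 for row in data) / n for j in range(dim)]
        self.scale = [v ** 0.5 or 1.0 for v in var]  # constant columns keep scale 1

    def encode(self, data: Matrix) -> Matrix:
        return [[(x - m) / s for x, m, s in zip(row, self.mean, self.scale)] for row in data]

    def decode(self, data: Matrix) -> Matrix:
        return [[x * s + m for x, m, s in zip(row, self.mean, self.scale)] for row in data]


class KeepFirst:
    """Hypothetical stand-in for a dimension reducer: keep the first k columns."""

    def __init__(self, k: int) -> None:
        self.k = k

    def fit(self, data: Matrix) -> None:
        self.dim = len(data[0])  # remembered so decode can restore the width

    def encode(self, data: Matrix) -> Matrix:
        return [row[: self.k] for row in data]

    def decode(self, data: Matrix) -> Matrix:
        # Dropped columns cannot be recovered; they are filled with zeros.
        return [row + [0.0] * (self.dim - len(row)) for row in data]


def process_data(io_pairs, schedule):
    """Fit and apply each stage in order; returns the processed (inputs, outputs)."""
    inputs, outputs = io_pairs
    for target, proc in schedule:
        if target == "in":
            proc.fit(inputs)
            inputs = proc.encode(inputs)
        elif target == "out":
            proc.fit(outputs)
            outputs = proc.encode(outputs)
        else:
            raise ValueError(f"stage type {target!r} is not handled in this sketch")
    return inputs, outputs


def reverse_process_data(io_pairs_processed, schedule):
    """Undo the fitted stages in reverse order."""
    inputs, outputs = io_pairs_processed
    for target, proc in reversed(schedule):
        if target == "in":
            inputs = proc.decode(inputs)
        elif target == "out":
            outputs = proc.decode(outputs)
        else:
            raise ValueError(f"stage type {target!r} is not handled in this sketch")
    return inputs, outputs


inputs = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
outputs = [[0.0, 5.0, 7.0], [2.0, 5.0, 8.0], [4.0, 5.0, 9.0]]
schedule = [("in", Standardize()), ("out", Standardize()), ("out", KeepFirst(2))]

processed = process_data((inputs, outputs), schedule)
recovered = reverse_process_data(processed, schedule)
```

In this toy run the processed outputs have two columns instead of three, which is the dimension change the interfaces would need to handle. The recovered outputs have three columns again, but the dropped column comes back as its mean rather than its original values.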

Where DataProcessor(...) could be selected from (just input or output)

  • PCA
  • simple scalings, or regularization
  • Nonlinear dim reduction

or (joint input and output)

  • data-informed or likelihood informed subspaces
  • CCA
