Environment Setup

ltrestka edited this page Mar 28, 2024 · 2 revisions

When configuring Data Dispatcher/Metacat, several environment variables in your launch scripts and campaign configuration (.cfg) files need attention:

  • Minimum Required Environment Variables for POMS and fife_wrap:

    Note: The following information can be included in your login script.

    • Variables used by fife_wrap at submission time for authentication with Metacat and Data Dispatcher clients, and for project retrieval/creation:
      • Always required:
        • $X509_USER_PROXY
          • Used for login/authentication with the Metacat and Data Dispatcher clients.
      • Required if not set in POMS or fife_wrap:

        These variables might be automatically provided via POMS and fife_wrap in future releases.

        • $METACAT_SERVER_URL
          • Directs the Metacat client for API calls.
        • $METACAT_AUTH_SERVER_URL
          • Used for authentication and login.
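The variables listed above can be checked before submission. The following is a minimal sketch, not part of POMS or fife_wrap: the function name `missing_variables` is hypothetical, and the check only tests that each variable is set and non-empty, not that the proxy or URLs are valid.

```python
import os

# Variable names taken from the list above.
ALWAYS_REQUIRED = ("X509_USER_PROXY",)
REQUIRED_UNLESS_PROVIDED = ("METACAT_SERVER_URL", "METACAT_AUTH_SERVER_URL")


def missing_variables(environ=None):
    """Return the names of required variables that are unset or empty.

    Hypothetical helper for illustration; it does not validate the values.
    """
    env = os.environ if environ is None else environ
    names = ALWAYS_REQUIRED + REQUIRED_UNLESS_PROVIDED
    return [name for name in names if not env.get(name)]
```

Calling `missing_variables()` with no argument inspects the current process environment. An empty list means all three variables are present.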
  • Environment Variables Generated and Used by POMS

    • Upon the initiation of a campaign or a campaign stage, POMS will generate some, or all, of the following fields:
      • $POMS_DATA_DISPATCHER_TASK_ID:
        • This identifier will always be generated by POMS, and is unique to each data dispatcher submission.
      • $POMS_DATA_DISPATCHER_PROJECT_ID:
        • This is provided by POMS in every scenario EXCEPT:
          • Stages that use the 'Draining', 'Multiparam', or 'List' split types, and only when the user enters a "param" value in the data_dispatcher_dataset_query field rather than a project_id or a query.
          • For instance:
            [campaign_stage dd_sample_campaign_stage_1]
            cs_split_type=list
            data_dispatcher_dataset = ['param: 01-20030-es', 'project_id: 23', 'query: files from namespace:name']
      • $POMS_DATA_DISPATCHER_DATASET_QUERY:
        • POMS always provides this variable. It is used to create a project within the wrapper if a project_id is not supplied, or if POMS does not initiate project creation.
      • $POMS_DATA_DISPATCHER_PARAMETER:
        • POMS provides this variable when the stage uses the 'Draining', 'Multiparam', or 'List' split type, as described above.
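The rules above for the four POMS variables can be summarized in a short sketch. This is an illustration of the described behavior only, not the actual wrapper code: the function name `describe_poms_inputs` and the action labels are hypothetical.

```python
import os


def describe_poms_inputs(environ=None):
    """Summarize which POMS-provided values a job would see.

    Hypothetical helper: mirrors the rules described above, not fife_wrap itself.
    """
    env = os.environ if environ is None else environ
    project_id = env.get("POMS_DATA_DISPATCHER_PROJECT_ID")
    query = env.get("POMS_DATA_DISPATCHER_DATASET_QUERY")

    if project_id:
        # POMS supplied a project, so no project needs to be created.
        action = "use_existing_project"
    elif query:
        # No project_id: the query is used to create a project in the wrapper.
        action = "create_project_from_query"
    else:
        action = "undetermined"

    return {
        "task_id": env.get("POMS_DATA_DISPATCHER_TASK_ID"),
        "project_id": project_id,
        "dataset_query": query,
        # Only expected for the 'Draining', 'Multiparam', or 'List' split types.
        "parameter": env.get("POMS_DATA_DISPATCHER_PARAMETER"),
        "action": action,
    }
```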
  • Setting Up Configuration Files

    • Begin with the configuration of the [global] section:

      [global]
      dd_task_id = override_me
      dd_project_id = override_me
      dd_dataset_query = override_me
      dd_param = override_me
      • Note: The variables listed above are optional, and you can name them as you prefer. Be sure to include the corresponding overrides in your param_overrides, for instance: -Oglobal.dd_task_id=$POMS_DATA_DISPATCHER_TASK_ID
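To make the override pattern concrete, the sketch below builds one `-Oglobal.<name>=<value>` string per POMS variable that is set. The pairing of each `dd_*` name with a `POMS_DATA_DISPATCHER_*` variable is an assumption inferred from the names (only the `dd_task_id` pairing appears in the note above), and `build_override_flags` is a hypothetical helper, not part of fife_launch.

```python
import os

# Assumed pairing of [global] names with POMS variables; only the
# dd_task_id pairing is stated explicitly in the text.
OVERRIDE_SOURCES = {
    "dd_task_id": "POMS_DATA_DISPATCHER_TASK_ID",
    "dd_project_id": "POMS_DATA_DISPATCHER_PROJECT_ID",
    "dd_dataset_query": "POMS_DATA_DISPATCHER_DATASET_QUERY",
    "dd_param": "POMS_DATA_DISPATCHER_PARAMETER",
}


def build_override_flags(environ=None):
    """Return -Oglobal.<name>=<value> strings for every POMS variable that is set."""
    env = os.environ if environ is None else environ
    flags = []
    for name, variable in OVERRIDE_SOURCES.items():
        value = env.get(variable)
        if value:  # skip variables POMS did not provide
            flags.append(f"-Oglobal.{name}={value}")
    return flags
```

Variables that POMS does not provide for a given stage are skipped, so the `override_me` placeholder stays in place for them.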
    • Configure the [data_dispatcher] section:

      [data_dispatcher]
      task_id = %(dd_task_id)s
      project = %(dd_project_id)s
      dataset_query = %(dd_dataset_query)s
      parameter = %(dd_param)s
      namespace = poms_test
      query_limit = 5
      load_limit = 5
      user = %(account)s

      Note: Because POMS defines most of these fields, most of this section is optional. However, defining a query limit or load limit here may affect your submission. Likewise, if you define one of these fields in POMS, the POMS value takes precedence over the one in your configuration file.
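The `%(name)s` references in [data_dispatcher] resolve against the [global] values once overrides are applied. The sketch below imitates that with Python's standard `configparser` by loading the [global] values as parser defaults. How fife_launch actually performs the substitution is not shown here, so treat this as an assumption. The `user = %(account)s` line is left out because `account` is not defined in the snippets above, and `resolve_data_dispatcher` is a hypothetical name.

```python
import configparser

CFG_TEXT = """
[global]
dd_task_id = override_me
dd_project_id = override_me
dd_dataset_query = override_me
dd_param = override_me

[data_dispatcher]
task_id = %(dd_task_id)s
project = %(dd_project_id)s
dataset_query = %(dd_dataset_query)s
parameter = %(dd_param)s
"""


def resolve_data_dispatcher(cfg_text, overrides):
    """Apply 'global.<name>' overrides, then interpolate [data_dispatcher]."""
    raw = configparser.RawConfigParser()
    raw.read_string(cfg_text)
    global_values = dict(raw.items("global"))

    for key, value in overrides.items():
        section, _, name = key.partition(".")
        if section == "global":
            # Escape '%' so override values are not treated as interpolation.
            global_values[name] = value.replace("%", "%%")

    # Assumption: [global] values act like defaults visible to every section.
    parser = configparser.ConfigParser(defaults=global_values)
    parser.read_string(cfg_text)
    return {
        name: parser.get("data_dispatcher", name)
        for name in raw.options("data_dispatcher")
    }
```

For example, `resolve_data_dispatcher(CFG_TEXT, {"global.dd_task_id": "101"})` yields a `task_id` of `101`, while fields without an override keep the `override_me` placeholder.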

    • Configure the [submit] section:

      [submit]
      e...
      e_4 = POMS_DATA_DISPATCHER_TASK_ID
      e_5 = POMS_DATA_DISPATCHER_PROJECT_ID
      e_6 = POMS_DATA_DISPATCHER_DATASET_QUERY
      e_7 = POMS_DATA_DISPATCHER_PARAMETER
      e_n ... 

      Note: fife_launch and fife_wrap have been modified to automatically add the above environment variables to the jobsub_submit command during a POMS launch. That said, it is good practice to have these variable names in your job configuration file.

    • Configure the [env_pass] section:

      [env_pass]
      DATA_DISPATCHER_URL=https://metacat.fnal.gov:9443/hypot_dd/data
      METACAT_SERVER_URL=https://metacat.fnal.gov:9443/hypot_meta_dev/app
      DATA_DISPATCHER_AUTH_URL=https://metacat.fnal.gov:8143/auth/hypot_dev
      METACAT_AUTH_SERVER_URL=https://metacat.fnal.gov:8143/auth/hypot_dev
      • Note: These values may already be known to POMS or fife_utils, and may only be required if you need to use a specific server. Check with the POMS team if you have questions about Metacat/Data Dispatcher servers.
    • Configure the [job_setup] section:

      [job_setup]
      setup_1 = metacat
      setup_2 = data_dispatcher
      • Note: The UPS packages above are available in CVMFS and require Python 3.7 or higher to function properly. In the future, UPS will be phased out and replaced by Spack.
