natlibfi-arlehiko edited this page Mar 22, 2019 · 23 revisions

Functionality

  1. Poll for blobs: GET /blobs/
    1. If the blob state is PENDING_TRANSFORMATION
      1. Retrieve the profile specified in blob metadata: GET /profiles/{id}
      2. Dispatch a transformer container as specified in the profile (transformation.image, transformation.env). The total number of containers dispatched by the controller must not exceed the limit specified by CONTAINERS_CONCURRENCY
      3. Call POST /blobs/{id} with op=transformationStarted
    2. If the blob state is TRANSFORMED
      1. Retrieve the profile specified in blob metadata: GET /profiles/{id}
      2. Dispatch importer containers according to the profile (import.image, import.env). The maximum number of importer containers to dispatch for a profile is specified by the environment variable IMPORTER_CONCURRENCY, and the total maximum for all containers dispatched by the controller is specified by CONTAINER_CONCURRENCY
    3. If the blob state is ABORTED
      1. Terminate any importer containers for the blob
      2. Remove all records related to the blob from the import queue
  2. Check status of dispatched containers periodically
    1. Terminate containers for which the health check fails.
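
The state handling above can be sketched as a pure decision function. This is only an illustration under stated assumptions: the type names, the action labels and the function decideAction are hypothetical and are not taken from the controller's source, and the real controller also performs the API calls and container operations that this sketch leaves out.

```typescript
// Hypothetical types; the actual controller may model this differently.
interface Limits {
  containersTotal: number; // cap on all containers dispatched by the controller
  importers: number;       // IMPORTER_CONCURRENCY
}

interface Running {
  total: number;     // containers currently dispatched by the controller
  importers: number; // importer containers already counted against IMPORTER_CONCURRENCY
}

type Action =
  | "DISPATCH_TRANSFORMER"
  | "DISPATCH_IMPORTER"
  | "TERMINATE_AND_PURGE"
  | "WAIT"
  | "IGNORE";

// Decide what to do for one polled blob, given current container counts.
function decideAction(state: string, running: Running, limits: Limits): Action {
  const hasCapacity = running.total < limits.containersTotal;
  switch (state) {
    case "PENDING_TRANSFORMATION":
      // Transformer is dispatched only while the total cap is not reached.
      return hasCapacity ? "DISPATCH_TRANSFORMER" : "WAIT";
    case "TRANSFORMED":
      // Importers are limited by both the total cap and the importer cap.
      return hasCapacity && running.importers < limits.importers
        ? "DISPATCH_IMPORTER"
        : "WAIT";
    case "ABORTED":
      // Terminate importer containers and clear the blob's queue records.
      return "TERMINATE_AND_PURGE";
    default:
      return "IGNORE";
  }
}
```

Keeping the decision separate from the side effects makes the concurrency rules easy to check in isolation; a blob that gets WAIT is simply picked up again on a later poll.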

Environment variables

Mandatory ones are bolded

  • AMQP_URL - Connection URL to AMQP (format: 'amqp://host:port')
  • URL_API - API's URL to query blobs (example: 'http://127.0.0.1:3000')
  • MONGODB_URI - Connection URI to MongoDB, uses collection jobs (example: mongodb://generalAdmin:<password>@<host>:27017/melinda-record-import-api)
  • WORK_PEND - Frequency of checking pending jobs (example: '10 seconds')
  • WORK_TRANS - Frequency of checking transformed jobs (example: '10 seconds')
  • WORK_ABORT - Frequency of checking aborted jobs (example: '10 seconds')
  • API_USERNAME - Crowd username used for API requests
  • API_PASS - Crowd password used for API requests
  • IMPORTER_CONCURRENCY - How many importer containers for a blob can be running simultaneously (default: 1)
  • DEBUG - If set to 'true', logs are printed to stdout
  • USE_DEF - If set to 'true', all mandatory variables (besides CROWD_USERNAME and CROWD_PASS) are loaded from configuration (for development)
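
As a rough illustration of how these variables might be read at startup, the sketch below validates a caller-supplied list of mandatory names and applies the documented default for IMPORTER_CONCURRENCY and the 'true' check for DEBUG. The function loadConfig, the ControllerConfig shape and the idea of passing the mandatory names as a parameter are assumptions made for the example; it does not model USE_DEF and is not the controller's actual configuration code.

```typescript
// Environment is passed in explicitly (for example process.env in Node.js).
type Env = Record<string, string | undefined>;

// Hypothetical config shape for this sketch.
interface ControllerConfig {
  values: Record<string, string>; // the mandatory variables, verified to be set
  importerConcurrency: number;    // IMPORTER_CONCURRENCY, default 1
  debug: boolean;                 // true only when DEBUG is exactly 'true'
}

function loadConfig(env: Env, mandatory: string[]): ControllerConfig {
  // Fail fast and list every missing mandatory variable at once.
  const missing = mandatory.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error("Missing mandatory environment variables: " + missing.join(", "));
  }

  const values: Record<string, string> = {};
  for (const name of mandatory) {
    values[name] = env[name] as string;
  }

  // Documented default: one importer container at a time.
  const importerConcurrency = parseInt(env["IMPORTER_CONCURRENCY"] || "1", 10);
  if (isNaN(importerConcurrency) || importerConcurrency < 1) {
    throw new Error("IMPORTER_CONCURRENCY must be a positive integer");
  }

  return { values, importerConcurrency, debug: env["DEBUG"] === "true" };
}
```

A caller would typically invoke something like loadConfig(process.env, ["AMQP_URL", "URL_API"]), with the list adjusted to whichever variables are mandatory in the deployment.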

Frequencies

Jobs are scheduled with Agenda's every method. Frequencies are specified in human-interval format:

  • '10 seconds'
  • 'one minute'
  • '1.5 minutes'
  • '3 days and 4 hours'
  • '3 days, 4 hours and 36 seconds'
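
To make the meaning of these strings concrete, here is a deliberately simplified converter from such an interval string to milliseconds. It is not the human-interval library that Agenda uses and covers only seconds, minutes, hours and days with numeric amounts or the words one to ten; the name intervalToMs and both lookup tables are hypothetical.

```typescript
// Milliseconds per supported unit (subset chosen for this sketch).
const UNIT_MS: Record<string, number> = {
  second: 1000,
  minute: 60 * 1000,
  hour: 60 * 60 * 1000,
  day: 24 * 60 * 60 * 1000,
};

// Spelled-out amounts accepted by this sketch.
const WORD_NUMBERS: Record<string, number> = {
  one: 1, two: 2, three: 3, four: 4, five: 5,
  six: 6, seven: 7, eight: 8, nine: 9, ten: 10,
};

function intervalToMs(text: string): number {
  // Each match is "<amount> <unit>", e.g. "1.5 minutes" or "one minute".
  const pattern = /(\d+(?:\.\d+)?|[a-z]+)\s+(second|minute|hour|day)s?\b/g;
  const input = text.toLowerCase();
  let total = 0;
  let found = false;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(input)) !== null) {
    const rawAmount = match[1];
    let amount: number;
    if (/^\d/.test(rawAmount)) {
      amount = parseFloat(rawAmount);
    } else if (Object.prototype.hasOwnProperty.call(WORD_NUMBERS, rawAmount)) {
      amount = WORD_NUMBERS[rawAmount];
    } else {
      throw new Error("Unsupported amount: " + rawAmount);
    }
    total += amount * UNIT_MS[match[2]];
    found = true;
  }
  if (!found) {
    throw new Error("Unrecognized interval: " + text);
  }
  return total;
}
```

Under these assumptions, '10 seconds' corresponds to 10000 ms and '3 days and 4 hours' to 273600000 ms; separators such as commas and "and" are skipped because only amount and unit pairs are matched.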