FHIR Pipelines Control Panel
Last run failed! Please find error logs here

Run pipelines

Incremental pipeline is scheduled to run at NOT SCHEDULED
Run Incremental Pipeline

This fetches the resources changed since the last run and merges them into the data-warehouse.
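
Conceptually, an incremental run selects only resources whose last-update time is after the previous run. Below is a minimal sketch of that idea using the standard FHIR REST `_lastUpdated` search parameter; note that in HAPI_JDBC mode the pipeline does the equivalent directly against the HAPI database, and the server URL and timestamp here are placeholders.

```python
# Sketch only: illustrates "fetch resources since the last run" via FHIR search.
# The controller's HAPI_JDBC mode queries the HAPI database directly instead.
import requests

FHIR_BASE = "http://localhost:8091/fhir"   # hypothetical source server
LAST_RUN = "2025-10-02T10:27:28Z"          # timestamp of the previous DWH snapshot

def fetch_updated(resource_type: str):
    """Yield resources of one type that changed since the last run."""
    url = f"{FHIR_BASE}/{resource_type}"
    params = {"_lastUpdated": f"gt{LAST_RUN}", "_count": 100}
    while url:
        bundle = requests.get(url, params=params).json()
        for entry in bundle.get("entry", []):
            yield entry["resource"]
        # Follow paging links; later pages already encode the parameters.
        url = next((l["url"] for l in bundle.get("link", []) if l["relation"] == "next"), None)
        params = None

for res in fetch_updated("MedicationRequest"):
    print(res["id"])
```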

Run Full Pipeline

This fetches all the resources and creates a new data-warehouse snapshot.

Recreate Views

This reads the current data-warehouse snapshot and recreates the flat views in the sink database.
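
The recreated flat views end up as tables in the sink database configured by fhirdata.sinkDbConfigPath. The sketch below queries one such table with psycopg2; the connection details and the table name `medication_requests` are placeholders, since the actual table names come from the `name` field of your ViewDefinitions.

```python
# Sketch only: read one flat view table from the sink PostgreSQL database.
import psycopg2

conn = psycopg2.connect(
    host="localhost", port=5432,
    dbname="views", user="admin", password="admin",  # hypothetical credentials
)
with conn, conn.cursor() as cur:
    cur.execute("SELECT COUNT(*) FROM medication_requests;")
    print("rows in flat view:", cur.fetchone()[0])
conn.close()
```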

List of DWH snapshots


Latest: /dwh/controller_DWH_TIMESTAMP_2025_10_02T10_27_28_573904282Z/
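
Each snapshot directory holds one Parquet fileset per resource type. A minimal sketch of reading one of them with pyarrow, assuming a per-resource-type sub-directory under the snapshot root and the default schema's `id` and `status` columns:

```python
# Sketch only: read the MedicationRequest fileset of the latest snapshot.
import pyarrow.dataset as ds

SNAPSHOT = "/dwh/controller_DWH_TIMESTAMP_2025_10_02T10_27_28_573904282Z"
medreq = ds.dataset(f"{SNAPSHOT}/MedicationRequest", format="parquet")
table = medreq.to_table(columns=["id", "status"])  # column names assume the default schema
print(table.num_rows, "MedicationRequest rows")
```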

Configuration Settings

| Parameter | Value |
|---|---|
| fhirdata.fhirFetchMode | HAPI_JDBC |
| fhirdata.fhirServerUrl | |
| fhirdata.dwhRootPrefix | /dwh/controller_DWH |
| fhirdata.incrementalSchedule | `* */30 * * * *` |
| fhirdata.purgeSchedule | `0 30 * * * *` |
| fhirdata.numOfDwhSnapshotsToRetain | 2 |
| fhirdata.resourceList | MedicationRequest,MedicationDispense |
| fhirdata.numThreads | 4 |
| fhirdata.dbConfig | config/hapi-postgres-config_local.json |
| fhirdata.viewDefinitionsDir | /app/views |
| fhirdata.sinkDbConfigPath | config/hapi-postgres-config_local_views.json |
| fhirdata.fhirSinkPath | |
| fhirdata.sinkUserName | |
| fhirdata.sinkPassword | |
| fhirdata.structureDefinitionsPath | classpath:/r4-us-core-definitions |
| fhirdata.fhirVersion | R4 |
| fhirdata.rowGroupSizeForParquetFiles | 33554432 |
| fhirdata.recursiveDepth | 1 |
| fhirdata.createParquetViews | false |
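
Each full run writes a new snapshot directory named from fhirdata.dwhRootPrefix plus an encoded timestamp (as in the "Latest" path above), and fhirdata.numOfDwhSnapshotsToRetain controls how many of these are kept. A small sketch of enumerating snapshots from that naming pattern, assuming only the `<prefix>_TIMESTAMP_<encoded timestamp>` layout visible above:

```python
# Sketch only: list DWH snapshot directories for a given root prefix.
import glob
import os

DWH_ROOT_PREFIX = "/dwh/controller_DWH"

def list_snapshots(prefix: str):
    """Return snapshot directories for this prefix, oldest first."""
    dirs = [d for d in glob.glob(prefix + "_TIMESTAMP_*") if os.path.isdir(d)]
    # The zero-padded encoded timestamp sorts lexicographically in chronological order.
    return sorted(dirs)

snapshots = list_snapshots(DWH_ROOT_PREFIX)
if snapshots:
    print("latest snapshot:", snapshots[-1])
print("total snapshots:", len(snapshots))
```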

| Parameter | Value | Default Value | Description |
|---|---|---|---|
| fhirDatabaseConfigPath | config/hapi-postgres-config_local.json | ../utils/hapi-postgres-config.json | Path to the FHIR database config for JDBC mode; the default file (hapi-postgres-config.json) is for a HAPI server with a PostgreSQL database. There is also a sample file for an OpenMRS server with a MySQL database (dbz_event_to_fhir_config.json); the Debezium config can be ignored for batch runs. |
| fhirFetchMode | HAPI_JDBC | null | The mode through which FHIR resources are fetched from the source FHIR server. |
| fhirVersion | R4 | null | The FHIR version to be used for the FHIR Context APIs. |
| outputParquetPath | /dwh/controller_DWH_TIMESTAMP_2025_12_17T16_57_21_627238324Z | | The base name for output Parquet files; one fileset is created for each resource type. |
| resourceList | MedicationRequest,MedicationDispense | Patient,Encounter,Observation | Comma-separated list of resource types to fetch, e.g., 'Patient,Encounter,Observation'. |
| rowGroupSizeForParquetFiles | 33554432 | 0 | The approximate size (in bytes) of the row groups in Parquet files. When this size is reached, the content is flushed to disk. A larger value means more data for one column fits into one big column chunk, which gives better compression and faster IO/queries; the downside is that more memory is needed to hold the data before writing to files. The default value of 0 means use the default row-group size of the Parquet writers. |
| runner | class org.apache.beam.runners.flink.FlinkRunner | null | The pipeline runner used to execute the pipeline. For registered runners the class name can be specified; otherwise the fully qualified name is needed. |
| sinkDbConfigPath | config/hapi-postgres-config_local_views.json | | Path to the sink database config; if not set, no sink DB is used. If viewDefinitionsDir is set, the output tables are the generated views (the `name` field value is used as the table name); if not, one table per resource type is created, with one row per resource holding its JSON content and an `id` column. |
| structureDefinitionsPath | classpath:/r4-us-core-definitions | | Directory containing the structure definition files for any custom profiles that need to be supported. If it starts with `classpath:`, the classpath is searched, and the path should always start with `/`. Do not set this if custom profiles are not needed. Example: `classpath:/r4-us-core-definitions` is the classpath name under the resources folder of the `extension-structure-definitions` module. |
| viewDefinitionsDir | /app/views | | The directory from which SQL-on-FHIR-v2 ViewDefinition JSON files are read. Note: for the incremental run, this directory must contain all the ViewDefinitions used to create views in both data-warehouses. |
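
To illustrate viewDefinitionsDir and how the `name` field becomes the sink table name, here is a minimal, hypothetical SQL-on-FHIR-v2 ViewDefinition for MedicationRequest written into that directory. The field set follows the SQL-on-FHIR-v2 spec; the column paths are FHIRPath expressions.

```python
# Sketch only: write a minimal ViewDefinition JSON file into the views directory.
import json
import os

VIEWS_DIR = "/app/views"  # value of fhirdata.viewDefinitionsDir above

view_definition = {
    "name": "medication_requests",   # becomes the sink table name
    "resource": "MedicationRequest",
    "select": [
        {
            "column": [
                {"name": "id", "path": "getResourceKey()"},
                {"name": "status", "path": "status"},
                {"name": "intent", "path": "intent"},
                {"name": "authored_on", "path": "authoredOn"},
            ]
        }
    ],
}

with open(os.path.join(VIEWS_DIR, "medication_requests.json"), "w") as f:
    json.dump(view_definition, f, indent=2)
```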

| Parameter | Value | Default Value | Description |
|---|---|---|---|
| activePeriod | | | The active period, with the format 'DATE1_DATE2' or 'DATE1'. The first form declares the first date-time (non-inclusive) and the last date-time (inclusive); the second form declares the active period to be from the given date-time (non-inclusive) until now. Resources outside the active period are only fetched if they are associated with Patients in the active period; all requested resources in the active period are fetched. The date format follows the dateTime format in the FHIR standard, without time zone: https://www.hl7.org/fhir/datatypes.html#dateTime. For example: --activePeriod=2020-11-10T00:00:00_2020-11-20. Note this feature implies fetching Patient resources that were active in the given period. The default empty string disables this feature, i.e., all requested resources are fetched. |
| batchSize | 100 | 100 | The number of resources to be fetched in one API call. For the JDBC mode, passing a value greater than 170 could result in HTTP 400 Bad Request. Note that by default the maximum bundle size for the OpenMRS FHIR module is 100. |
| cacheBundleForParquetWrites | false | false | An experimental feature intended for the Dataflow runner only. The purpose is to cache output Parquet records for each Beam bundle such that the DoFn is idempotent, or more precisely, can be retried for an incomplete bundle (Beam's bundle, not FHIR's) without corrupting the Parquet output. |
| fhirServerOAuthClientId | | | The `client_id` to be used in the OAuth Client Credentials flow when interacting with the FHIR server; see `fhirServerOAuthTokenEndpoint`. |
| fhirServerOAuthClientSecret | | | The `client_secret` to be used in the OAuth Client Credentials flow when interacting with the FHIR server; see `fhirServerOAuthTokenEndpoint`. |
| fhirServerOAuthTokenEndpoint | | | The `token_endpoint` to be used in the OAuth Client Credentials flow when interacting with the FHIR server. If set, `fhirServerOAuthClientId` and `fhirServerOAuthClientSecret` should also be set; in that case, the Basic Auth username/password is ignored. |
| fhirServerPassword | | | FHIR source server BasicAuth password. |
| fhirServerUrl | | | FHIR source server URL, e.g., http://localhost:8091/fhir. |
| fhirServerUserName | | | FHIR source server BasicAuth username. |
| fhirSinkPath | | | The path to the target generic FHIR store, or a GCP FHIR store with the format `projects/[\w-]+/locations/[\w-]+/datasets/[\w-]+/fhirStores/[\w-]+`, e.g., `projects/my-project/locations/us-central1/datasets/fhir_test/fhirStores/test`. |
| jdbcFetchSize | 1000 | 1000 | This flag is used in the JDBC mode. For an OpenMRS source, this is the size of each ID chunk; for a HAPI source, this is the size of each database query. Setting high values (~10000 for OpenMRS, ~1000 for HAPI) yields faster query execution. |
| jdbcInitialPoolSize | 3 | 3 | DEPRECATED! This is ignored; by default 3 connections are used initially. |
| jdbcMaxPoolSize | 50 | 50 | JDBC maximum pool size. |
| parquetInputDwhRoot | | | The path to the data-warehouse directory of Parquet files to be read. The content of this directory is expected to have the same structure as an output data-warehouse, i.e., one directory per resource type. If this is set, --fhirServerUrl and --fhirDatabaseConfigPath should not be set, because input resources are read from Parquet files. This is useful, for example, when regenerating the views. [EXPERIMENTAL] |
| recreateSinkTables | false | false | If true, drops the old view tables first and recreates them; otherwise tables are created only if they do not exist. |
| recursiveDepth | 1 | 1 | The maximum depth for traversing StructureDefinitions in Parquet schema generation (if non-positive, the default of 1 is used). In most cases the default of 1 is sufficient; increasing it can result in a significantly larger schema and more complexity. For details see: https://github.com/FHIR/sql-on-fhir/blob/master/sql-on-fhir.md#recursive-structures |
| secondsToFlushParquetFiles | 600 | 600 | The number of seconds after which records are flushed into Parquet/text files; use 0 to disable (note this may have undesired memory implications). |
| since | | | Fetch only FHIR resources that were updated after the given timestamp. The date format follows the dateTime format in the FHIR standard, without time zone: https://www.hl7.org/fhir/datatypes.html#dateTime. This feature is currently implemented only for the HAPI JDBC mode. |
| sinkPassword | | | Sink BasicAuth password. |
| sinkUserName | | | Sink BasicAuth username. |
| sourceJsonFilePatternList | | | Comma-separated list of file patterns for input JSON files, e.g., 'PATH1/*,PATH2/*'. Each file should be one Bundle resource. [EXPERIMENTAL] |
| sourceNdjsonFilePatternList | | | Comma-separated list of file patterns for input NDJSON files, e.g., 'PATH1/*,PATH2/*'. Each file contains FHIR resources serialized with no whitespace and separated by a newline pair. |
| tempLocation | null | null | A pipeline-level default location for storing temporary files. |
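
The fhirServerOAuth* parameters describe a standard OAuth 2.0 Client Credentials flow. A minimal sketch of the token request that flow implies is below; the token endpoint, client id, and client secret are placeholders, and the returned access token would be sent as a Bearer token on requests to the source FHIR server.

```python
# Sketch only: OAuth 2.0 Client Credentials token request with placeholder values.
import requests

TOKEN_ENDPOINT = "https://auth.example.org/oauth2/token"   # fhirServerOAuthTokenEndpoint
CLIENT_ID = "my-client-id"                                 # fhirServerOAuthClientId
CLIENT_SECRET = "my-client-secret"                         # fhirServerOAuthClientSecret

resp = requests.post(
    TOKEN_ENDPOINT,
    data={"grant_type": "client_credentials"},
    auth=(CLIENT_ID, CLIENT_SECRET),
)
resp.raise_for_status()
access_token = resp.json()["access_token"]

# Example use against a (placeholder) source FHIR server:
headers = {"Authorization": f"Bearer {access_token}"}
print(requests.get("http://localhost:8091/fhir/metadata", headers=headers).status_code)
```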

Pipeline Metrics

Fetch the latest pipeline metrics