Last run failed! Please find the error logs here
Run pipelines
Incremental pipeline schedule: NOT SCHEDULED
Run Incremental Pipeline
This fetches the resources updated since the last run and merges them into the data-warehouse.
Run Full Pipeline
This fetches all the resources and creates a new data-warehouse snapshot.
Recreate Views
This reads the current data-warehouse snapshot and recreates the flat views in the sink database.
List of DWH snapshots
Latest: /dwh/controller_DWH_TIMESTAMP_2025_10_02T10_27_28_573904282Z/
Configuration Settings
| Parameter | Value | Default Value | Description |
|---|---|---|---|
| fhirdata.fhirFetchMode | HAPI_JDBC | | |
| fhirdata.fhirServerUrl | | | |
| fhirdata.dwhRootPrefix | /dwh/controller_DWH | | |
| fhirdata.incrementalSchedule | * */30 * * * * | | |
| fhirdata.purgeSchedule | 0 30 * * * * | | |
| fhirdata.numOfDwhSnapshotsToRetain | 2 | | |
| fhirdata.resourceList | MedicationRequest,MedicationDispense | | |
| fhirdata.numThreads | 4 | | |
| fhirdata.dbConfig | config/hapi-postgres-config_local.json | | |
| fhirdata.viewDefinitionsDir | /app/views | | |
| fhirdata.sinkDbConfigPath | config/hapi-postgres-config_local_views.json | | |
| fhirdata.fhirSinkPath | | | |
| fhirdata.sinkUserName | | | |
| fhirdata.sinkPassword | | | |
| fhirdata.structureDefinitionsPath | classpath:/r4-us-core-definitions | | |
| fhirdata.fhirVersion | R4 | | |
| fhirdata.rowGroupSizeForParquetFiles | 33554432 | | |
| fhirdata.recursiveDepth | 1 | | |
| fhirdata.createParquetViews | false | | |
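
The `fhirdata.dbConfig` setting above (and the `fhirDatabaseConfigPath` option in the next table) points at a JSON file describing the JDBC connection to the source HAPI database. The sketch below is only an illustration of what such a file can contain; the field names and values are assumptions for a local HAPI server backed by PostgreSQL, so check the sample `hapi-postgres-config.json` shipped with the pipeline for the authoritative schema.

```json
{
  "jdbcDriverClass": "org.postgresql.Driver",
  "databaseService": "postgresql",
  "databaseHostName": "localhost",
  "databasePort": "5432",
  "databaseUser": "admin",
  "databasePassword": "admin",
  "databaseName": "hapi"
}
```

All host details and credentials here are placeholders; replace them with the values for your own source database.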
| Parameter | Value | Default Value | Description |
|---|---|---|---|
| fhirDatabaseConfigPath | config/hapi-postgres-config_local.json | ../utils/hapi-postgres-config.json | Path to FHIR database config for JDBC mode; the default value file (i.e., hapi-postgres-config.json) is for a HAPI server with PostgreSQL database. There is also a sample file for an OpenMRS server with MySQL database (dbz_event_to_fhir_config.json); the Debezium config can be ignored for batch. |
| fhirFetchMode | HAPI_JDBC | null | Mode through which the FHIR resources have to be fetched from the source FHIR server |
| fhirVersion | R4 | null | The fhir version to be used for the FHIR Context APIs |
| outputParquetPath | /dwh/controller_DWH_TIMESTAMP_2025_12_17T16_57_21_627238324Z | | The base name for output Parquet files; for each resource, one fileset will be created. |
| resourceList | MedicationRequest,MedicationDispense | Patient,Encounter,Observation | Comma separated list of resources to fetch, e.g., 'Patient,Encounter,Observation'. |
| rowGroupSizeForParquetFiles | 33554432 | 0 | The approximate size (bytes) of the row-groups in Parquet files. When this size is reached, the content is flushed to disk. A large value means more data for one column can fit into one big column chunk, which means better compression and faster IO/query. On the downside, a larger value needs more memory to hold the data before writing to files. The default value of 0 means use the default row-group size of Parquet writers. |
| runner | org.apache.beam.runners.flink.FlinkRunner | null | The pipeline runner that will be used to execute the pipeline. For registered runners, the class name can be specified; otherwise the fully qualified name needs to be specified. |
| sinkDbConfigPath | config/hapi-postgres-config_local_views.json | | Path to the sink database config; if not set, no sink DB is used. If viewDefinitionsDir is set, the output tables will be the generated views (the `name` field value will be used as the table name); if not, one table per resource type is created with the JSON content of a resource and its `id` column for each row. |
| structureDefinitionsPath | classpath:/r4-us-core-definitions | | Directory containing the structure definition files for any custom profiles that need to be supported. If it starts with `classpath:` then the classpath is searched, and the path should always start with `/`. Do not use this if custom profiles are not needed. Example: `classpath:/r4-us-core-definitions` is the classpath name under the resources folder of module `extension-structure-definitions`. |
| viewDefinitionsDir | /app/views | | The directory from which SQL-on-FHIR-v2 ViewDefinition JSON files are read. Note: for the Incremental Run, this directory must contain all the ViewDefinitions used to create views in both data-warehouses! |
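
The `viewDefinitionsDir` and `sinkDbConfigPath` options work together: each ViewDefinition file in the directory produces one flat table in the sink database, named after the ViewDefinition's `name` field. The snippet below is only a minimal sketch of a SQL-on-FHIR-v2 ViewDefinition (the `patient_flat` name and the selected columns are hypothetical), intended to illustrate the shape of the files expected under `/app/views`:

```json
{
  "resourceType": "ViewDefinition",
  "name": "patient_flat",
  "resource": "Patient",
  "select": [
    {
      "column": [
        { "name": "id", "path": "getResourceKey()" },
        { "name": "gender", "path": "gender" },
        { "name": "birth_date", "path": "birthDate" }
      ]
    }
  ]
}
```

With a file like this in place, "Recreate Views" would rebuild a `patient_flat` table in the sink database from the current data-warehouse snapshot.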
| Parameter | Value | Default Value | Description |
|---|---|---|---|
| activePeriod | | | The active period with format: 'DATE1_DATE2' OR 'DATE1'. The first form declares the first date-time (non-inclusive) and last date-time (inclusive); the second form declares the active period to be from the given date-time (non-inclusive) until now. Resources outside the active period are only fetched if they are associated with Patients in the active period. All requested resources in the active period are fetched. The date format follows the dateTime format in the FHIR standard, without time-zone (https://www.hl7.org/fhir/datatypes.html#dateTime), for example: --activePeriod=2020-11-10T00:00:00_2020-11-20. Note this feature implies fetching Patient resources that were active in the given period. The default empty string disables this feature, i.e., all requested resources are fetched. |
| batchSize | 100 | 100 | The number of resources to be fetched in one API call. For the JDBC mode passing > 170 could result in HTTP 400 Bad Request. Note by default the maximum bundle size for OpenMRS FHIR module is 100. |
| cacheBundleForParquetWrites | false | false | This is an experimental feature which is intended for Dataflow runner only. The purpose is to cache output Parquet records for each Beam bundle such that the DoFn is idempotent, or to be more precise, can be retried for an incomplete bundle (Beam's bundle not FHIR) without corrupting the Parquet output. |
| fhirServerOAuthClientId | | | The `client_id` to be used in the OAuth Client Credential flow when interacting with the FHIR server; see `fhirServerOAuthTokenEndpoint`. |
| fhirServerOAuthClientSecret | | | The `client_secret` to be used in the OAuth Client Credential flow when interacting with the FHIR server; see `fhirServerOAuthTokenEndpoint`. |
| fhirServerOAuthTokenEndpoint | | | The `token_endpoint` to be used in the OAuth Client Credential flow when interacting with the FHIR server. If set, `fhirServerOAuthClientId` and `fhirServerOAuthClientSecret` should also be set. In that case, Basic Auth username/password is ignored. |
| fhirServerPassword | | | FHIR source server BasicAuth password |
| fhirServerUrl | | | FHIR source server URL, e.g., http://localhost:8091/fhir |
| fhirServerUserName | | | FHIR source server BasicAuth username |
| fhirSinkPath | | | The path to the target generic FHIR store, or a GCP FHIR store with the format: `projects/[\w-]+/locations/[\w-]+/datasets/[\w-]+/fhirStores/[\w-]+`, e.g., `projects/my-project/locations/us-central1/datasets/fhir_test/fhirStores/test` |
| jdbcFetchSize | 1000 | 1000 | This flag is used in the JDBC mode. In the context of an OpenMRS source, this is the size of each ID chunk. In the context of a HAPI source, this is the size of each database query. Setting high values (~10000 for OpenMRS, ~1000 for HAPI) will yield faster query execution. |
| jdbcInitialPoolSize | 3 | 3 | DEPRECATED! This is ignored; by default 3 connections are used initially. |
| jdbcMaxPoolSize | 50 | 50 | JDBC maximum pool size |
| parquetInputDwhRoot | | | The path to the data-warehouse directory of Parquet files to be read. The content of this directory is expected to have the same structure used in the output data-warehouse, i.e., one directory per resource type. If this is set, --fhirServerUrl and --fhirDatabaseConfigPath should not be set, because input resources are read from Parquet files. This is useful, for example, when regenerating the views. [EXPERIMENTAL] |
| recreateSinkTables | false | false | If true, drops the old view tables first and recreates them; otherwise tables are created only if they do not exist. |
| recursiveDepth | 1 | 1 | The maximum depth for traversing StructureDefinitions in Parquet schema generation (if it is non-positive, the default 1 will be used). Note in most cases, the default 1 is sufficient and increasing that can result in significantly larger schema and more complexity. For details see: https://github.com/FHIR/sql-on-fhir/blob/master/sql-on-fhir.md#recursive-structures |
| secondsToFlushParquetFiles | 600 | 600 | The number of seconds after which records are flushed into Parquet/text files; use 0 to disable (note this may have undesired memory implications). |
| since | | | Fetch only FHIR resources that were updated after the given timestamp. The date format follows the dateTime format in the FHIR standard, without time-zone (https://www.hl7.org/fhir/datatypes.html#dateTime). This feature is currently implemented only for the HAPI JDBC mode. |
| sinkPassword | | | Sink BasicAuth Password |
| sinkUserName | | | Sink BasicAuth Username |
| sourceJsonFilePatternList | | | Comma separated list of file patterns for input JSON files, e.g., 'PATH1/*,PATH2/*'. Each file should be one Bundle resource. [EXPERIMENTAL] |
| sourceNdjsonFilePatternList | | | Comma separated list of file patterns for input NDJSON files, e.g., 'PATH1/*,PATH2/*'. Each file contains FHIR resources serialized with no whitespace and separated by newlines. |
| tempLocation | null | null | A pipeline level default location for storing temporary files. |
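
To make the `sourceJsonFilePatternList` option above more concrete: every file matched by the patterns is expected to hold exactly one Bundle resource. A minimal, purely illustrative file might look like the sketch below (the patient content is made up):

```json
{
  "resourceType": "Bundle",
  "type": "collection",
  "entry": [
    {
      "resource": {
        "resourceType": "Patient",
        "id": "example-patient",
        "gender": "female",
        "birthDate": "1980-01-01"
      }
    }
  ]
}
```

For `sourceNdjsonFilePatternList`, by contrast, each matched file would contain such resources directly (not wrapped in a Bundle), serialized one per line.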
Pipeline Metrics
Fetch the latest pipeline metrics