Consolidate and submit data to the central cloud environment

Create Extract

D5

Remove sensitive information as per site-specific DUAs, perform QC

DEIDENTIFY & QC DATA

D4

Connect patient and encounter information between modalities

LINK DATA

D3

Harmonize data to standard formats to enable downstream integration

STANDARDIZE DATA

D2

Capture and characterize data across various modalities

GET DATA

D1

Iteratively update D1-D5 to improve quality and completeness

Improve data

D6

Evaluate quality and completeness of the extract

Assess Data

C2

Process and load data to facilitate standard assessment

INGEST Data

C1

Once data fulfills CHoRUS requirements, merge with other approved extracts

Approve & MERGE

C3

DATA GENERATING SITEs

CENTRAL CLOUD

TO ANALYTICS ENCLAVE

data sources

Structured EHR

Flowsheet

Free Text

imaging

WAVEFORM

Discussions

office hours

progress

Standards

Data acq.

Tooling

Standards

Data acq.

Tooling

Standards

Data acq.

This task refers to a standard process of evaluating data extracts for their quality and fitness for use in the broader data enclave.

Motivation

Assess quality and data fitness

Resources

S.O.P.

  • TO BE ADDED

OFFICE HOURS

Codebase

  • TO BE ADDED

  • ICU Module: DQD

This task refers to verifying the identifiability, plausibility, completeness, and conformance of the dataset. Here, we will use established open-source tools (e.g. Achilles, DQD, Ares) to execute a series of validated checks and then produce an extract (i.e. AresIndex) that can be visualized and compared with other OMOP instances with regard to its richness, quality, and diversity. Data contributing sites will be required to submit this extract to the central MGH cloud instance for evaluation and feedback.

Motivation

Deidentify and perform quality control

Resources

  • Deidentification

S.O.P.

  • Achilles Output

OFFICE HOURS

  • OHDSI Achilles

Codebase

  • DQD Output

  • OHDSI DQD

  • Quality Control

  • DQD Overview

  • Ares Overview

  • Contributing to Ares

  • OHDSI AresIndexer

  • OHDSI Ares

[TYPES OF IMAGING DATA]

DESCRIPTION

Imaging DATA

Resources

S.O.P.

  • TO BE ADDED

OFFICE HOURS

Codebase

  • TO BE ADDED

  • TO BE ADDED

[TYPES OF WAVEFORM DATA]

DESCRIPTION

WAVEFORM DATA

Resources

S.O.P.

  • TO BE ADDED

OFFICE HOURS

Codebase

  • TO BE ADDED

  • TO BE ADDED

This task refers to connecting data from diverse modes together in a way that enables the selection and characterization of patient cohorts using a data mode(s) of choice. For example, a cohort could be selected based upon (1) diagnoses registered in a patient's EHR, (2) measurement values recorded in flowsheet data, (3) complications outlined in a discharge report, (4) artifacts identified in a CT image, or (5) artifacts extracted from a waveform signal. Once created, this dataset would contain all data modes available for the associated cohort, any of which could be used in downstream analyses.
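
The cohort idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the in-memory dictionaries keyed by person ID, and the record fields are all hypothetical and do not reflect the CHoRUS linkage tooling.

```python
def select_cohort(records_by_person, predicate):
    """Return the set of person IDs with at least one record matching predicate."""
    return {
        person_id
        for person_id, records in records_by_person.items()
        if any(predicate(record) for record in records)
    }


def build_linked_cohort(cohort_ids, modalities):
    """For each person in the cohort, gather their records from every modality.

    modalities maps a modality name (hypothetical, e.g. "ehr", "waveform")
    to a dict of person ID -> list of records. Missing data yields an empty list.
    """
    return {
        person_id: {
            name: records_by_person.get(person_id, [])
            for name, records_by_person in modalities.items()
        }
        for person_id in sorted(cohort_ids)
    }
```

A cohort selected from one modality (for example, a diagnosis in the EHR records) then carries every other available modality for the same people, which is the property the linked dataset is meant to provide.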

Motivation

Linking Data modalities

Resources

  • Data Linkage

S.O.P.

  • Imaging Modalities

OFFICE HOURS

  • Image Parsing

Codebase

  • Private Tags

  • Waveform Parsing

This task refers to using open-source tooling like WhiteRabbit, together with internal data analysis methods, to investigate and understand the data available to each data contributing site. For relational EHR data, this typically requires running a database scan or producing metadata about the contents of relevant tables. For non-relational data, characterizations will likely focus on the quantity (storage space, number of files, etc.) and diversity (unique codes, ontology structures, etc.) of the data, along with an overview of the available metadata that will require mapping in subsequent stages.
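
For non-relational data, the quantity measures mentioned above (number of files, storage space) can be gathered with a short script. The following is a minimal sketch using only the Python standard library; the function name and the choice to group by file extension are assumptions for illustration, not part of WhiteRabbit or any CHoRUS tool.

```python
import os
from collections import defaultdict


def characterize_tree(root):
    """Summarize file count and total bytes per file extension under root."""
    summary = defaultdict(lambda: {"files": 0, "bytes": 0})
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            # Normalize case so ".DCM" and ".dcm" are counted together.
            ext = os.path.splitext(name)[1].lower() or "(none)"
            entry = summary[ext]
            entry["files"] += 1
            entry["bytes"] += os.path.getsize(os.path.join(dirpath, name))
    return dict(summary)
```

Running this over an imaging or waveform archive gives a first per-format overview of how much data a site holds before any mapping work begins.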

Motivation

Collecting and characterizing data

Resources

  • Data Collection

S.O.P.

  • White Rabbit

OFFICE HOURS

  • White Rabbit

Codebase

This task refers to placing data in the organizational structure defined by the CHoRUS DataAcquisition team. Thus far, the convention is to create per-person directories, each with three sub-directories (OMOP, Image, Waveform). This structure is subject to change depending on results of preliminary ingestion processes in the central cloud instance.
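
The per-person layout described above could be created along these lines. This is a sketch, assuming person identifiers are usable as directory names; the function name and the extract root are hypothetical, and the sub-directory names are taken from the convention stated above, which may change.

```python
from pathlib import Path

# Sub-directory names from the current convention described above.
SUBDIRS = ("OMOP", "Image", "Waveform")


def create_person_dirs(extract_root, person_ids):
    """Create <extract_root>/<person_id>/{OMOP,Image,Waveform} for each person."""
    root = Path(extract_root)
    created = []
    for person_id in person_ids:
        person_dir = root / str(person_id)
        for sub in SUBDIRS:
            # exist_ok makes the function safe to re-run on a partial extract.
            (person_dir / sub).mkdir(parents=True, exist_ok=True)
        created.append(person_dir)
    return created
```

Keeping the sub-directory names in one constant means a later change to the convention only needs to be made in one place.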

Motivation

create and submit data extract

Resources

  • Extract Creation

S.O.P.

  • TO BE ADDED

OFFICE HOURS

  • MIMIC Images

Codebase

  • MIMIC Waveform

[TYPES OF FLOWSHEET DATA]

DESCRIPTION

FLOWSHEET Data

Resources

S.O.P.

  • TO BE ADDED

OFFICE HOURS

Codebase

  • TO BE ADDED

  • TO BE ADDED

This task refers to making connections between source representations of medical events or concepts (e.g. an Epic procedural code referring to an appendectomy) and standard representations of those elements (e.g. the ICD-10-PCS procedure for appendectomy). Through the Delphi process, we have generated a prioritized list of medical concepts that are relevant for the downstream analyses proposed in Bridge2AI.
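
A source-to-standard mapping of this kind is, at its simplest, a lookup table. The sketch below illustrates the idea in Python; the vocabulary name, source codes, and concept IDs are made-up placeholders rather than real vocabulary entries, and the function names are hypothetical. The only borrowed convention is that OMOP uses concept ID 0 for values with no matching standard concept.

```python
# Placeholder mapping: (source vocabulary, source code) -> standard concept ID.
# Every code and ID here is invented for illustration only.
SOURCE_TO_CONCEPT = {
    ("SITE_PROC", "PROC-0001"): 9000001,
    ("SITE_PROC", "PROC-0002"): 9000002,
}

UNMAPPED = 0  # OMOP convention: concept ID 0 marks an unmapped source value


def map_code(vocabulary, code, mapping=SOURCE_TO_CONCEPT):
    """Return the standard concept ID for a source code, or 0 if unmapped."""
    return mapping.get((vocabulary, code), UNMAPPED)


def mapping_coverage(source_codes, mapping=SOURCE_TO_CONCEPT):
    """Fraction of (vocabulary, code) pairs that resolve to a standard concept."""
    if not source_codes:
        return 0.0
    mapped = sum(
        1 for vocab, code in source_codes
        if map_code(vocab, code, mapping) != UNMAPPED
    )
    return mapped / len(source_codes)
```

A coverage figure like this can help a site decide which unmapped source codes to prioritize next.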

Motivation

Standardizing Data Elements

Resources

  • Standardization

S.O.P.

  • Delphi MIMIC

OFFICE HOURS

  • OHDSI USAGI

Codebase

  • Mapping 101

  • Map Validation pt 1

  • Map Validation pt 2

  • Vocab Gaps

  • Flowsheets pt 1

  • Flowsheets pt 2

  • Flowsheets pt 3

  • Usagi & STCM

  • OMOP Vocab pt 1

  • OMOP Vocab pt 2

OTHER

  • Delphi Mappings

  • Workload Disc.

  • Sharing Disc.

  • Delphi Disc.

  • Athena Search

This task refers to providing the data generating sites with feedback about the extracts that they delivered.

Motivation

Return extract-specific feedback

Resources

S.O.P.

  • TO BE ADDED

OFFICE HOURS

Codebase

  • TO BE ADDED

  • CHoRUS Reports

This task refers to evaluating the feedback provided by the central cloud team and revising any elements that need attention.

Motivation

Review feedback and improve quality

Resources

S.O.P.

  • TO BE ADDED

OFFICE HOURS

Codebase

  • TO BE ADDED

  • TO BE ADDED

[TYPES OF FREE-TEXT DATA]

DESCRIPTION

FREE-TEXT NOTES

Resources

S.O.P.

  • TO BE ADDED

OFFICE HOURS

Codebase

  • TO BE ADDED

  • TO BE ADDED

This task refers to ingesting csv files into a staging database and executing processing steps like date shifting and quality checks.
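
The ingestion steps named above can be illustrated with a small, self-contained sketch. It uses an in-memory SQLite database as a stand-in for the staging database; the table and column names, the per-person day offsets, and the single missing-value check are all assumptions made for illustration and are not the actual CHoRUS ingestion ETL.

```python
import csv
import io
import sqlite3
from datetime import date, timedelta


def load_csv(conn, table, csv_text):
    """Load CSV text into a staging table with all columns stored as TEXT.

    Table and column names are interpolated directly, so they must be trusted.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    cols = reader.fieldnames
    col_defs = ", ".join(f"{c} TEXT" for c in cols)
    conn.execute(f"CREATE TABLE IF NOT EXISTS {table} ({col_defs})")
    placeholders = ", ".join("?" for _ in cols)
    rows = [tuple(row[c] for c in cols) for row in reader]
    conn.executemany(f"INSERT INTO {table} VALUES ({placeholders})", rows)
    return len(rows)


def shift_dates(conn, table, date_col, offsets):
    """Shift ISO dates by a per-person number of days; empty dates are skipped.

    offsets maps person_id (as stored, i.e. a string) to a day offset.
    """
    cur = conn.execute(f"SELECT rowid, person_id, {date_col} FROM {table}")
    updates = [
        ((date.fromisoformat(d) + timedelta(days=offsets[pid])).isoformat(), rowid)
        for rowid, pid, d in cur.fetchall()
        if d
    ]
    conn.executemany(f"UPDATE {table} SET {date_col} = ? WHERE rowid = ?", updates)


def count_missing(conn, table, column):
    """Simple quality check: number of rows where column is NULL or empty."""
    query = f"SELECT COUNT(*) FROM {table} WHERE {column} IS NULL OR {column} = ''"
    return conn.execute(query).fetchone()[0]
```

Applying one offset per person, rather than one per row, keeps the intervals between a person's events intact after shifting.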

Motivation

Ingest data extract at central cloud

Resources

S.O.P.

  • TO BE ADDED

OFFICE HOURS

Codebase

  • TO BE ADDED

  • Ingestion ETL

This task refers to sharing the organized data extract with the central Azure instance hosted by MGH. This is currently done using Azure Data Share and the Azure CLI.

Motivation

Submit Data to Central cloud

Resources

S.O.P.

  • TO BE ADDED

OFFICE HOURS

Codebase

  • TO BE ADDED

  • TO BE ADDED

[TYPES OF STRUCTURED EHR DATA]

DESCRIPTION

Structured EHR Data

Resources

S.O.P.

  • TO BE ADDED

OFFICE HOURS

Codebase

  • TO BE ADDED

  • TO BE ADDED

This task refers to defining and evaluating the quality thresholds necessary for approval and, once an extract meets those thresholds, executing a merge process to link those data with other approved extracts while retaining relationality.
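
The approval step can be pictured as comparing an extract's quality metrics against minimum thresholds. The sketch below assumes each metric is a score where higher is better; the function name, metric names, and threshold values are hypothetical, since the actual CHoRUS approval criteria are not specified here.

```python
def evaluate_extract(metrics, thresholds):
    """Return (approved, failures) for an extract.

    thresholds maps a metric name to its minimum acceptable value.
    A metric that is missing from metrics counts as a failure.
    """
    failures = [
        name
        for name, minimum in thresholds.items()
        if metrics.get(name, float("-inf")) < minimum
    ]
    return (not failures, failures)
```

Returning the list of failed metrics alongside the decision gives the data generating site something concrete to act on when an extract is not yet approved.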

Motivation

Approve Extract and merge with others

Resources

S.O.P.

  • TO BE ADDED

OFFICE HOURS

Codebase

  • TO BE ADDED

  • MERGE ETL