Coffee Chat Brewing AI Knowledge


[Paper] TransformEHR: transformer-based encoder-decoder generative model to enhance prediction of disease outcomes using electronic health records (2023)

Yang, Zhichao, et al. “TransformEHR: transformer-based encoder-decoder generative model to enhance prediction of disease outcomes using electronic health records.” Nature Communications 14.1 (2023): 7857.

Paper Link

Points

  1. New pre-training objective: predicting all diseases or outcomes of a future visit

    • Helps the model uncover the complex interrelations among different diseases and outcomes
  2. TransformEHR, the generative encoder-decoder framework to predict patients’ ICD codes using their longitudinal EHRs

  3. Validation of the generalizability using both internal and external datasets

    • Demonstrated a strong transfer learning capability of the model

    • Could be useful when data and computing resources are limited


Background

  • Longitudinal electronic health records (EHRs) have been successfully used to predict clinical diseases or outcomes (congestive heart failure, sepsis mortality, mechanical ventilation, septic shock, diabetes, PTSD, etc.)
  • With the availability of large cohorts and computational resources, deep learning (DL) based models outperform traditional machine learning (ML) models (Med-BERT, BEHRT, BRLTM, etc.)
  • Existing pre-training tasks were limited to predicting a fraction of the ICD codes within each visit → A novel pre-training strategy, which predicts the complete set of diseases and outcomes within a visit, might improve clinical predictive modeling


Method

Data

*VHA: Veterans Health Administration, the largest integrated healthcare system in the US, providing care at 1,321 healthcare facilities

Pre-training data: around 6M patients who received care from more than 1,200 facilities of the US VHA

  • Two disease/outcome agnostic prediction (DOAP) datasets, one for common and one for uncommon diseases/outcomes

    • ICD-10CM codes with more than a 2% prevalence ratio for common dataset
    • Those with a 0.04%-0.05% prevalence ratio for uncommon dataset
  • Non-VHA dataset: from MIMIC-IV dataset (29,482)

    • Only patients with ICD-10CM records were selected, to match the cohorts from VHA
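The common/uncommon split described above can be sketched as follows. This is only an illustration under stated assumptions: prevalence is taken to be the fraction of patients with at least one record of a code, the range boundaries are treated as inclusive, and the function name `split_codes_by_prevalence` and the toy cohort are hypothetical rather than taken from the paper.

```python
def split_codes_by_prevalence(patient_codes):
    """Split ICD codes into common (>2%) and uncommon (0.04%-0.05%) sets.

    `patient_codes` holds one list of ICD codes per patient.
    Prevalence here is (patients with the code) / (all patients), an assumption.
    """
    n_patients = len(patient_codes)
    counts = {}
    for codes in patient_codes:
        for code in set(codes):  # count each patient at most once per code
            counts[code] = counts.get(code, 0) + 1

    common, uncommon = [], []
    for code, count in sorted(counts.items()):
        prevalence = count / n_patients
        if prevalence > 0.02:
            common.append(code)
        elif 0.0004 <= prevalence <= 0.0005:
            uncommon.append(code)
    return common, uncommon


# Toy cohort of 20,000 patients (made-up codes and counts).
patients = [[] for _ in range(20000)]
for i in range(600):      # 3% prevalence
    patients[i].append("I10")
for i in range(9):        # 0.045% prevalence
    patients[i].append("C25.9")
for i in range(200):      # 1% prevalence, falls in neither set
    patients[i].append("E66.9")

common, uncommon = split_codes_by_prevalence(patients)
print(common, uncommon)
```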


Longitudinal EHRs


  • Include demographic information (gender, age, race, and marital status) and ICD-10CM codes as predictors

  • Group ICD codes at the visit level

  • Order the codes by priority, where the primary diagnosis is typically given the highest priority

  • Form multiple visits as a time-stamped input of a sequence by date of visit


Embeddings


Multi-level embeddings: visit embeddings + time embeddings + code embeddings

  • Time embeddings: embed the difference in days between a given visit and the last visit in the EHR as relative time information
    • Includes the date of each visit to integrate temporal information, not only sequential order
    • The date matters because the importance of a predictor in a visit can vary over time
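A minimal sketch of the relative-time feature described above, assuming each visit carries a calendar date. The function name `days_before_last_visit` and the sample dates are hypothetical; the paper's actual preprocessing may differ.

```python
from datetime import date


def days_before_last_visit(visit_dates):
    """Return, for each visit in date order, the days between it and the last visit."""
    ordered = sorted(visit_dates)
    last = ordered[-1]
    return [(last - d).days for d in ordered]


# Made-up visit dates for one patient.
visits = [date(2020, 1, 10), date(2020, 3, 1), date(2021, 3, 1)]
offsets = days_before_last_visit(visits)
print(offsets)  # [416, 365, 0]
```

Each offset would then index a time-embedding table, so two visits that are adjacent in sequence but far apart in time receive different time embeddings.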


Model Architecture

Encoder-decoder transformer-based architecture


  • Encoder: performs cross-attention, unlike BERT, over representations and assigns an attention weight for each representation
    • Cross-attention is implemented by masking the complete set of ICD codes of a future visit as shown in Fig. 2b
  • Decoder: generates ICD codes of the masked future visit with the weighted representations from the encoder
    • Generates the codes following the order of code priority within a visit


Evaluation

Metrics: PPV (precision), AUROC, AUPRC

Baseline models: logistic regression, LSTM, BERT without pre-training, BERT with pre-training   # what’s the objective when pre-training BERT? MLM or the objective proposed in this paper?


Pre-training

Task: Disease or outcome agnostic prediction (DOAP); Predicting the ICD codes of a patient’s future visit based on longitudinal information up to the current visit
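To make the task concrete, here is a rough sketch of how one DOAP training example could be assembled: the history up to the current visit goes to the encoder, and every code of the future visit becomes the decoder target. The `[MASK]` placeholder, the function name `make_doap_example`, and the sample codes are assumptions for illustration, not details confirmed by the paper.

```python
def make_doap_example(visits):
    """Split a patient's history into encoder input and decoder target.

    `visits` is a list of visits in date order; each visit is a list of ICD
    codes already sorted by priority (primary diagnosis first).
    """
    if len(visits) < 2:
        raise ValueError("need at least one past visit and one future visit")
    history, future = visits[:-1], visits[-1]
    # The complete future visit is hidden behind a single placeholder (assumed token).
    encoder_input = history + [["[MASK]"]]
    # The decoder is asked to generate all codes of that visit, in priority order.
    decoder_target = list(future)
    return encoder_input, decoder_target


# Made-up patient with three visits.
patient = [["E11.9", "I10"], ["I10"], ["K86.1", "E11.9"]]
enc, dec = make_doap_example(patient)
print(enc)
print(dec)
```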

Ablation study

  1. Visit masking vs. code (part of visit) masking for an encoder-decoder model
    • Visit masking performed better; pre-training on all diseases outperforms the traditional pre-training objective (by 2.52-2.96% in AUROC)
  2. Encoder-decoder vs. encoder-only (BERT) on DOAP
    • Encoder-decoder outperformed (by 0.74-1.16% in AUROC)   # could a difference in parameter count have affected this result?
  3. With vs. without time embeddings
    • The model with time embeddings performed moderately better (0.43 in AUROC)
    • Embedding the difference in days is more effective than embedding the specific date


Fine-tuning

Tasks: the pancreatic cancer onset prediction (Table 3) and intentional self-harm prediction in patients with PTSD (Table 4)

  • TransformEHR outperforms the baselines on both tasks
  • AUPRC was consistent across different sets of demographics

  • Results using all visits were better than those using only the five most recent visits
  • In the generalizability evaluation,
    • When testing on an internal dataset of VHA facilities not used for pre-training, there was no statistically significant difference in AUPRC on the intentional self-harm prediction task among patients with PTSD

[Study] Relations between white matter hyperintensity (WMH) feature and amyloid-beta (β-amyloid) and tau burden

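Brain MRI of a 73-year-old patient with long-standing, advanced white matter degeneration, which appears as a widespread white area in the center of the brain. Courtesy of Bucheon St. Mary's Hospital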

WMH: A form of cerebrovascular degeneration in which the white matter (the medulla), which acts as a pathway between the cortex and gray matter, is dilated and damaged. Usually seen on MR FLAIR imaging

References

Alban, Sierra L., et al. “The association between white matter hyperintensities and amyloid and tau deposition.” NeuroImage: Clinical 38 (2023): 103383. link

Graff-Radford, Jonathan, et al. “White matter hyperintensities: relationship to amyloid and tau burden.” Brain 142.8 (2019): 2483-2491. link

Hedden, Trey, et al. “Cognitive profile of amyloid burden and white matter hyperintensities in cognitively normal older adults.” Journal of Neuroscience 32.46 (2012): 16233-16242. link

Keys

  • The relationship between WMH and amyloid-beta accumulation remains unclear, both whether the two are related at all and, if so, how.

  • A growing number of studies are attempting to elucidate the relationship between WMH and Aβ.

  • There is no relationship between WMH and tau accumulation.


The association between white matter hyperintensities and amyloid and tau deposition (2023)

Abstract

… Finally, the regions where β-amyloid and WMH count were most positively associated were the middle temporal region in the right hemisphere (r = 0.18, p = 0.002) and the fusiform region in the left hemisphere (r = 0.017, p = 0.005). β-amyloid and WMH have a clear association, though the mechanism facilitating this association is still not fully understood. The associations found between β-amyloid and WMH burden emphasize the relationship between β-amyloid and vascular lesion formation while factors like CVRFs, age, and sex affect AD development through various mechanisms

Data

The subset of ADNI-3 participants who had all the T1-weighted, 3D FLAIR, Amyloid, and Tau PET modalities available

Snippets

  • The percentage of white matter volume occupied by WMH was significantly and positively correlated with β-amyloid PET SUVR (Fig. 1; r = 0.28, p < 0.001). We observed WMH volume to significantly predict global amyloid accumulation when controlling for age, sex, years of education, and scanner manufacturer (F(1, 309) = 13.9, p = 0.0002).
  • The correlational analyses were repeated after the log transformation of WMH volume and outcomes remained the same, resulting in a significant positive correlation between WMH volume and β-amyloid (r = 0.24, p = 4.9e-5), and a nonsignificant positive correlation between WMH volume and meta-temporal tau (r = 0.09, p = 0.12).
  • The inclusion of MOCA, MMSE, and Global CDR, as covariates, did not change the significant relationship between WMH volume and β-amyloid. We observed a significant effect of hippocampal volume fraction on WMH volume (F(1, 580) = 16.9, p = 4.5e-5) and β-amyloid (F(1, 309) = 32.5, p = 2.8e-8).
  • WMH volume percent of participants with either amyloid (A+) or tau (T+) pathology was higher than controls (A-/T-) (Fig. 2). We observed a significantly higher WMH percent in AD pathology participants (A+/T+) compared to controls (A-/T-) (p = 0.007, Cohen’s d = 0.4, t = -2.5). … No significant association was found in the A-/T- group (r = 0.06, p = 0.45). A significant positive correlation was observed between β-amyloid SUVR and WMH count in the A+/T+ group only.
  • WMH count was used as another method of measuring WMH burden. … Both statistical tests on the WMH volume and the WMH count showed similar results, confirming that WMH count is an accurate measure to use alongside WMH volume. Correlations of WMH count with β-amyloid within A/T pathological groups also paralleled the WMH volume analysis result.
  • Our regional analysis showed that β-amyloid and WMH accumulation in the precentral, cuneus, fusiform, isthmus cingulate, lateral occipital, lingual, superior parietal, and supramarginal regions were most significantly associated across all pathological groups when averaged across hemispheres. Variations in the locations of increased WMHs are indicative of AD and its phase of progression, some of these regions being more implicated in cognitive decline than others.
  • We observed neither a correlation nor an association between WMH and Tau uptake in the entire cohort (Fig. 4: correlation p = 0.25; association p = 0.4, controlling for age, sex, years of education, and scanner manufacturer).

  • Additionally, WMH volume was only predicted by CN and MCI diagnoses, not AD. The relationship between WMH volume often predicts AD in the preclinical stages, likely accounting for the relationship we observed.

  • The inclusion of cognitive scores MMSE, MOCA, and Global CDR had no effect on the associations found between β-amyloid and WMH volume, therefore not significantly impacting this relationship. In the literature, worse performance on these cognitive tests has been associated with increased WMH volume (Wang et al., 2020).

Figure (Gray726 temporal pole.png): top, the isthmus cingulate; bottom, the temporal pole

  • The relationship between higher MMSE scores and increased WMH volume showed significance with multiple comparison corrections while controlling for age and sex in the isthmus cingulate region

  • For Global CDR scores, this spatial distribution was significant in the isthmus cingulate, temporal pole, and pars triangularis

  • Lastly, the MOCA scores showed significant positive spatial relationships in the isthmus cingulate, lingual, and temporal pole regions

  • We also did not observe any significant effect of APOE-ε4 presence on the established relationship between β-amyloid and WMH volume or WMH count. … Although, we did have a small sample size of individuals with the homozygous APOE-ε4 genotype.


White matter hyperintensities: relationship to amyloid and tau burden (2019)

Abstract

White matter hyperintense volumes in the detected topographic pattern correlated strongly with lobar cerebral microbleeds (P < 0.001, age and sex-adjusted Cohen’s d = 0.703). In contrast, there were no white matter hyperintense regions significantly associated with increased tau burden using voxel-based analysis or region-specific analysis. Among non-demented elderly, amyloid load correlated with a topographic pattern of white matter hyperintensities. Further, the amyloid-associated, white matter hyperintense regions strongly correlated with lobar cerebral microbleeds suggesting that cerebral amyloid angiopathy contributes to the relationship between amyloid and white matter hyperintensities.

Data


  • Participants, aged 50 to 89, were enrolled in the Mayo Clinic Study of Aging (MCSA), a population-based study of Olmsted County, Minnesota residents.
  • 434 non-demented participants with FLAIR-MRI, tau-PET (AV-1451), and Pittsburgh compound B (PiB)-PET (amyloid) scans to assess the relationship between FLAIR WMH and Alzheimer’s disease pathologies.

Snippets

  • In the study of non-demented individuals, we found that amyloid burden measured by PET was associated with a topographic pattern of WMH. These amyloid-related WMH regions were associated with lobar CMBs suggesting that regional changes correlate with CAA. We found no evidence to support an association between tau burden and WMH burden.

  • We did not detect an association between tau burden and WMH in either the voxel-level analyses or region-level analyses.


Cognitive Profile of Amyloid Burden and White Matter Hyperintensities in Cognitively Normal Older Adults (2012)

Abstract

Amyloid burden and WMH were not correlated with one another. Age was associated with lower performance in all cognitive domains, while higher estimated verbal intelligence was associated with higher performance in all domains. Hypothesis-driven tests revealed that amyloid burden and WMH had distinct cognitive profiles, with amyloid burden having a specific influence on episodic memory and WMH primarily associated with executive function but having broad (but lesser) effects on the other domains. These findings suggest that even before clinical impairment, amyloid burden, and WMH likely represent neuropathological cascades with distinct etiologies and dissociable influences on cognition.

Data

  • 168 (95 female) cognitively normal, community-dwelling older adults (aged 65–86, M=73.24, SD=5.80).

  • Participants in the Harvard Aging Brain Study, an ongoing longitudinal study currently in the baseline assessment phase

  • Because of the staged nature of the visits (all baseline visits must be completed within 6 months), positron emission tomography (PET) amyloid imaging and magnetic resonance imaging (MRI) estimates of WMH were currently available for 109 of the older adults.

[Paper] BEHRT: Transformer for Electronic Health Records (2020)

Li, Yikuan, et al. “BEHRT: transformer for electronic health records.” Scientific reports 10.1 (2020): 7155.

Paper Link

Points

  1. BEHRT (BERT for EHR): BERT-based architecture

    • In EHR, certain diseases can be reversed, or the time interval between two diagnoses can be shorter or longer than recorded.

      → Bidirectional contextual awareness of the model’s representation is a big advantage with EHR data.

  2. Transfer Learning: pre-training by predicting masked disease words, as in Masked Language Modeling (MLM), and then fine-tuning on three disease prediction tasks
  3. Disease embeddings: show the relations between the various diseases

Background

  • In traditional research on EHR data, models represent individuals as a set of features, and experts had to define the appropriate features.

  • Studies applying Deep Learning (DL) to EHR started to show that DL models can outperform the traditional Machine Learning (ML) methods
  • With DL architectures for sequence data, such as Recurrent Neural Networks (RNNs), the application of DL models for EHR was improved in terms of capturing the long-term dependencies among events.
  • Similarities between sequences in EHR and natural language lead to the successful transferability of techniques

Method

Data

Clinical Practice Research Datalink (CPRD)

  • one of the largest linked primary care EHR systems

  • Contains longitudinal data from a network of 674 general practitioner practices in the UK


  • Filtered from 8 million patients (eligible for linkage to HES and meeting CPRD’s quality standards) down to 1.6 million patients (having at least 5 visits in the EHR)

  • Only data from GP practices that consented to record linkage with HES were considered in this study

Input Features


  • Only consider the diagnoses and ages

The patient $p$’s EHR: \[ V_p=\{v^1_p, v^2_p, v^3_p, \ldots, v^{n_p}_p\} \]

The patient $p$’s EHR of the $j$th visit: $v^j_p$

$v^j_p$ is a list consisting of single or multiple $m$ diagnoses: $v^j_p=\{d_1, \ldots, d_m\}$

Input sequence: \[ I_p=\{CLS, v^1_p, SEP, v^2_p, SEP, \ldots, v^{n_p}_p, SEP\} \]

  • $CLS$ token: the start of a medical history

  • $SEP$ token: between visits
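The input sequence $I_p$ defined above can be assembled with a few lines of Python. This is a small sketch; the function name `build_input_sequence` and the placeholder diagnosis codes are made up for illustration.

```python
def build_input_sequence(visits):
    """Flatten a patient's visits into [CLS, v1..., SEP, v2..., SEP, ...]."""
    tokens = ["CLS"]           # marks the start of the medical history
    for visit in visits:
        tokens.extend(visit)   # all diagnoses recorded in this visit
        tokens.append("SEP")   # separates consecutive visits
    return tokens


# Two visits: the first with two diagnoses, the second with one.
seq = build_input_sequence([["D1", "D2"], ["D3"]])
print(seq)  # ['CLS', 'D1', 'D2', 'SEP', 'D3', 'SEP']
```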

Embeddings


  • A combination of 4 embeddings: disease + position + age + visit segment

  • Disease embeddings: past diseases can improve the accuracy of the prediction for future diagnoses

  • Positional encodings: determine the relative position in the EHR sequence

Tasks

Pre-training: prediction of masked disease words (MLM)

  • 86.5% of the disease words were unchanged, 12% of them replaced with the mask token, and the remaining 1.5% words replaced with randomly-chosen disease words
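The masking proportions above can be sketched as below. This is a simplified illustration: which positions contribute to the loss is an assumption here (only altered tokens get a label), and the names `mask_disease_tokens`, `"MASK"`, and the toy vocabulary are hypothetical.

```python
import random


def mask_disease_tokens(tokens, vocab, rng, p_mask=0.12, p_random=0.015):
    """Corrupt disease tokens: 12% -> MASK, 1.5% -> random disease, rest unchanged.

    Returns the corrupted sequence and, per position, the original token to
    predict (None where the token was left unchanged; a simplifying assumption).
    """
    masked, labels = [], []
    for tok in tokens:
        if tok in ("CLS", "SEP"):       # never corrupt the special tokens
            masked.append(tok)
            labels.append(None)
            continue
        r = rng.random()
        if r < p_mask:
            masked.append("MASK")
            labels.append(tok)
        elif r < p_mask + p_random:
            masked.append(rng.choice(vocab))
            labels.append(tok)
        else:
            masked.append(tok)
            labels.append(None)
    return masked, labels


tokens = ["CLS", "D1", "D2", "SEP", "D3", "SEP"]
masked, labels = mask_disease_tokens(tokens, ["D1", "D2", "D3"], random.Random(0))
print(masked, labels)
```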

Fine-tuning: 3 different disease prediction tasks

  • Prediction of diseases in the next visit (T1)

    1. Randomly choose an index $j (3<j<n_p)$ for each patient
    2. Form input as $x_p={v^1_p, …, v^j_p}$
    3. Output $y_p=w_{j+1}$, where $w_{j+1}$ is a multi-one-hot vector indexed by the diseases that exist in $v^{j+1}_p$
    • Each patient contributes only one input-output pair to the training and evaluation process.
  • Prediction of diseases in the next 6 months (T2) & in the next 12 months (T3)

    1. Exclude patients who do not have 6 or 12 months' worth of EHR from the analysis
    2. Choose $j$ randomly from $(3, n^*)$, where $n^*$ is the highest index after which 6 or 12 months' worth of EHR remains
    3. Output $y_p=w_{6m}$ and $y_p=w_{12m}$
  • The number of patients for each task: 699K, 391K, 342K
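The T1 construction above (pick $j$ with $3<j<n_p$, use visits up to $j$ as input, and encode visit $j+1$ as a multi-one-hot target) might look like the following sketch. The function name `make_next_visit_example`, the disease index, and the toy visits are hypothetical.

```python
import random


def make_next_visit_example(visits, disease_index, rng):
    """Build one (input, target) pair for next-visit prediction (T1)."""
    n = len(visits)
    if n < 5:
        raise ValueError("need at least 5 visits so that 3 < j < n")
    j = rng.randint(4, n - 1)        # 1-based visit index with 3 < j < n
    x = visits[:j]                   # v^1 ... v^j
    y = [0] * len(disease_index)     # multi-one-hot vector w_{j+1}
    for code in visits[j]:           # visits[j] is v^{j+1} in 0-based indexing
        y[disease_index[code]] = 1
    return x, y


# Made-up patient with exactly 5 visits, so j can only be 4.
visits = [["A"], ["B"], ["A", "C"], ["B"], ["C", "A"]]
index = {"A": 0, "B": 1, "C": 2}
x, y = make_next_visit_example(visits, index, random.Random(0))
print(x, y)
```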

Results

Disease Embeddings


T-SNE Results

  • The natural stratification of gender-specific diseases

  • Natural clusters form that in most cases consist of diseases from the same chapter

  • The 10 closest diseases by cosine similarity of their embeddings were found to be similar to those provided by a clinical researcher (0.757 overlap) - Supplementary Table S3
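The nearest-neighbour lookup by cosine similarity mentioned above can be illustrated with a small pure-Python sketch. The two-dimensional vectors and disease names are invented for the example and are not BEHRT embeddings; `closest_diseases` is a hypothetical helper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def closest_diseases(query, embeddings, k=10):
    """Return the k diseases whose embeddings are most similar to `query`."""
    scored = [
        (cosine(embeddings[query], vec), name)
        for name, vec in embeddings.items()
        if name != query
    ]
    scored.sort(reverse=True)        # highest similarity first
    return [name for _, name in scored[:k]]


# Toy embeddings (made up).
emb = {"asthma": [1.0, 0.0], "copd": [0.9, 0.1], "fracture": [0.0, 1.0]}
neighbours = closest_diseases("asthma", emb, k=2)
print(neighbours)  # ['copd', 'fracture']
```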

Attention and Interpretability

Found relationships among events that go beyond temporal/sequence adjacency

Analysed the attention patterns using the approach introduced by Vig

  • Strong connections between rheumatoid arthritis and enthesopathies and synovial disorders → Attention can go beyond recent events and find long-range dependencies among diseases

Disease Prediction

Metrics: average precision score (APS), AUROC

  • APS: a weighted mean of the precisions achieved at different thresholds, with the increase in recall from the previous threshold used as the weight
  1. Performance scores


    → Outperformed the best competing model by roughly 8% or more in predicting a range of more than 300 diseases

  2. Comparing performance for each disease

    APS and AUROC scores computed over all \(y_p\) and \(y^*_p\) vectors of a disease
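As a worked illustration of the APS metric for a single disease, the sketch below computes $\sum_n (R_n - R_{n-1}) P_n$ over predictions ranked by score. It assumes there are no tied scores and uses made-up labels and scores; `average_precision` is a hypothetical helper, not code from the paper.

```python
def average_precision(y_true, y_score):
    """Average precision: sum over ranks of (recall gain) * (precision at rank).

    Assumes binary labels and no tied scores (ties would need threshold grouping).
    """
    total_pos = sum(y_true)
    if total_pos == 0:
        raise ValueError("average precision is undefined without positives")
    order = sorted(range(len(y_score)), key=lambda i: y_score[i], reverse=True)
    true_pos = 0
    prev_recall = 0.0
    ap = 0.0
    for rank, i in enumerate(order, start=1):
        true_pos += y_true[i]
        precision = true_pos / rank
        recall = true_pos / total_pos
        ap += (recall - prev_recall) * precision
        prev_recall = recall
    return ap


# Four patients, one disease: true labels and predicted probabilities (made up).
ap = average_precision([1, 0, 1, 0], [0.9, 0.8, 0.7, 0.1])
print(round(ap, 4))  # 0.8333
```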


Discussion

  • BEHRT's flexibility comes from employing 4 key concepts from EHR: disease, age, segment, position
    • Gains insights into the underlying generating process of EHR
    • Distributed/complex representations that are capable of capturing concepts of disease
    • Future work: adding new concepts to the embeddings
  • Disease embeddings provide great insights into how various diseases are related to each other: co-occurrence and closeness of diseases
    • Could be used for future research as reliable disease vectors
  • Important features of EHR for prediction

    • Robust, gender-specific predictions without inclusion of gender

    • Position was important

    • Age embeddings might be vital in diagnosing age-related diseases

[LearnMRI] Vascular changes related to Alzheimer's dementia observed through MRI

When amyloid beta (A$\beta$) is deposited, the likelihood of Alzheimer’s dementia increases. Prior to brain damage caused by A$\beta$ deposition, vascular changes are observed, which can be improved through lifestyle changes or medication. Different types of vascular changes can be observed with different forms of MRI.


Choroid plexus (ChP)

  • MRI type: T1-weighted image (T1WI)
  • ChP: A network of vessels and cells found in the brain’s ventricles.
    • Acts as a gateway for immune cells between the blood and the brain.
    • Produces cerebrospinal fluid (CSF) aiding in clearing waste and toxins from brain cells.
  • The volume of ChP correlates with the severity of cognitive impairment.


Perivascular space (PVS)

  • MRI type: T2-weighted image (T2WI)
  • PVS: Space surrounding arteries penetrating the brain
    • Concept encompassing fluids and tissues within and around vessel walls.
    • A component of the blood-brain barrier (BBB).
  • Fluid dynamics
    • Acts as a network between cerebrospinal fluid (CSF) and interstitial fluid (ISF), cleansing byproducts of the brain.
    • Known to play a role in the brain’s lymphatic system → glymphatic system.
  • Enlarged perivascular spaces (EPVS) are associated with degenerative brain diseases and various brain disorders.
  • The volume of PVS is proportional to the likelihood of amyloid beta (A$\beta$) positivity.
    • Pronounced tendency in the temporal lobe: A$\beta$ positive patients often have abundant temporal PVS.


White matter hyperintensity (WMH)

  • MRI type: FLAIR
  • WMH: Regions of white matter that appear abnormally bright (hyperintense) on FLAIR images
    • Primarily caused by aging, manifested by loss of blood flow and cellular damage.
    • Subcortical hyperintensity: WMH near the basal ganglia.
  • While direct association with Alzheimer’s dementia hasn’t been confirmed, observations suggest a proportional relationship between A$\beta$ positivity and WMH volume in patients.


[LearnMRI] The Types of MRI Modalities and Observable Brain Patterns

With the spin echo technique, T1-weighted images (T1WI) and T2-weighted images (T2WI) can be obtained. By manipulating these images, various MR modality images can be created. Since the signal intensity of lesion tissue varies in each image, the types of lesions emphasized are different.


T1-weighted Image (T1WI)

  • Spin echo: Both the repetition time (TR) and the echo time (TE) are set short.
    • When TR is shortened, the recovery time (T1 relaxation time) of $M_z$ varies depending on the tissue, emphasizing the difference. Some tissues have fully recovered, while others have not fully recovered by the time of the second pulse, leading to varying influences from the second pulse. This difference is reflected in the image.
    • TE is set short to minimize the influence of T2 relaxation on the signal.
  • Signal intensity
    • Signal intensity is higher than T2 → anatomical structures are more clearly distinguished.
    • Subcutaneous fat and blood appear hyperintense, which means brighter, while muscles appear intermediate, and water appears hypointense, which means darker.
    • Marrow, being rich in fat, appears hyperintense, while cortex, having less water, appears hypointense.
    • Lesions: Lipoma, acute hemorrhage, lesions containing high protein content (e.g., mucocele)
  • Observation: Cortical morphology (anatomical detail), vascular changes, blood-brain barrier integrity
  • Feature: Cortical thickness, choroid plexus (ChP)

T2-weighted image (T2WI)

  • Spin echo: Both TR and TE are set long.
    • Lengthening TR minimizes the influence of T1 relaxation on the signal.
    • Longer TE emphasizes the contrast in the extent of $M_{xy}$ decrease, resulting in different tissue representations in the image.
  • Signal intensity:
    • Water appears hyperintense, aiding in the detection of pathological tissues with higher water content, such as lesions.
    • Most lesions appear as low signal intensity (hypointense) on T1 and high signal intensity (hyperintense) on T2.
    • The brightness of water in T2 images varies, with cysts appearing brightest, followed by edema, and then normal tissue.
    • Muscles, fat, and blood appear hypointense.
    • Cerebrospinal fluid (CSF) also appears hyperintense, making it challenging to distinguish lesions, such as Perivascular space (PVS).
  • Observation: Lesions, hypointense lesions (such as acute hematomas, fungal balls, etc.), arteries (veins show varying signal intensities due to differing blood flow rates)
  • Feature: Perivascular space (PVS)
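The TR/TE choices described for T1WI and T2WI can be checked numerically with the standard simplified spin echo signal equation, $S = PD\,(1 - e^{-TR/T1})\,e^{-TE/T2}$. The sketch below is illustrative only: the tissue relaxation times, proton densities, and TR/TE settings are rough assumed values, not numbers from this post.

```python
import math


def spin_echo_signal(pd, t1, t2, tr, te):
    """Simplified spin echo signal: PD * (1 - exp(-TR/T1)) * exp(-TE/T2)."""
    return pd * (1.0 - math.exp(-tr / t1)) * math.exp(-te / t2)


# Rough, assumed relaxation times in milliseconds (illustrative only).
white_matter = {"pd": 1.0, "t1": 800.0, "t2": 80.0}
csf = {"pd": 1.0, "t1": 4000.0, "t2": 2000.0}

# T1-weighted: short TR and short TE.
t1w_wm = spin_echo_signal(tr=500.0, te=15.0, **white_matter)
t1w_csf = spin_echo_signal(tr=500.0, te=15.0, **csf)

# T2-weighted: long TR and long TE.
t2w_wm = spin_echo_signal(tr=4000.0, te=100.0, **white_matter)
t2w_csf = spin_echo_signal(tr=4000.0, te=100.0, **csf)

print(f"T1WI  white matter={t1w_wm:.3f}  CSF={t1w_csf:.3f}")
print(f"T2WI  white matter={t2w_wm:.3f}  CSF={t2w_csf:.3f}")
```

With these assumed values, CSF (water) comes out darker than white matter under the short TR/TE setting and brighter under the long TR/TE setting, matching the hypointense/hyperintense behaviour of water listed above.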


FLAIR (Fluid-Attenuated Inversion Recovery)

  • A T2-based image in which the CSF signal is suppressed, so CSF is rendered black.
  • Non-free-flowing water appears hyperintense, while fat appears hypointense.
  • Observation: Lesions around the ventricles, edema (which appears bright due to stagnant fluid), grey-white matter differentiation.
  • Feature: Lesions, white matter hyperintensity (WMH)


GRE (Gradient Echo; T2*)

  • Substances that distort the local magnetic field, such as blood, calcium, and metal, appear hypointense (dark), allowing for the observation of iron deposition.
  • Observation: Excellent for detecting microbleeds in early and late brain hemorrhages, as well as diffuse axonal injury.
    • *Diffuse axonal injury: One of the components of brain trauma, characterized by axonal damage leading to a coma state after trauma.
  • Feature: bleeding