CSV schema

CSV outputs provide subject-level phenotype tables for downstream analysis. The authoritative documentation layer is the output inventory plus the promoted phenotype pages that name exact output columns.

Modality: All modalities
Pipeline step: Feature extraction, aggregation, and schema documentation
Outputs: Aggregate CSV phenotype tables keyed by eid
Maturity: Source-audited reference page

Required schema principles

  • Use a stable subject key, usually eid.
  • Keep column names stable and documented.
  • Preserve units in column names when the pipeline emits them.
  • Treat legacy spelling as schema debt, not as an invitation for silent cleanup.
  • Check for duplicate eid values, missing columns, empty files, type drift, and column-order drift during aggregation or documentation audit.
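
The checks in the last principle can be sketched with the Python standard library alone. This is a minimal illustration, not the pipeline's actual validator: the function name audit_csv, the expected_columns argument, and the rule that every non-key cell must be blank or numeric are all assumptions made for the example.

```python
import csv
import io


def audit_csv(text, expected_columns, key="eid"):
    """Return a list of human-readable problems found in one CSV table.

    Hypothetical helper: `expected_columns` is the documented column
    order, including the subject key.
    """
    problems = []
    reader = csv.DictReader(io.StringIO(text))
    header = reader.fieldnames or []
    if not header:
        return ["empty file: no header row"]

    # Missing columns: documented but absent from the file.
    missing = [c for c in expected_columns if c not in header]
    if missing:
        problems.append(f"missing columns: {missing}")

    # Column-order drift: shared columns appear in a different order.
    present = [c for c in header if c in expected_columns]
    wanted = [c for c in expected_columns if c in header]
    if present != wanted:
        problems.append("column-order drift")

    rows = list(reader)
    if not rows:
        problems.append("empty file: header only")

    # Duplicate subject keys.
    if key in header:
        seen, dupes = set(), set()
        for row in rows:
            if row[key] in seen:
                dupes.add(row[key])
            seen.add(row[key])
        if dupes:
            problems.append(f"duplicate {key}: {sorted(dupes)}")

    # Type drift (assumed rule): non-key cells are blank or parse as float.
    for col in header:
        if col == key:
            continue
        for row in rows:
            value = (row[col] or "").strip()
            if value:
                try:
                    float(value)
                except ValueError:
                    problems.append(f"type drift in {col!r}: {value!r}")
                    break
    return problems
```

Called on a table's text with the documented column list, the function returns an empty list when nothing is wrong, so it can gate an aggregation step or a documentation audit.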

Current artifact families

Artifact family | Registry source | Example rows
--- | --- | ---
Ventricular volume/function | docs/data/output_column_inventory.yml | LV: V_ED [mL], RV: EF [%], LV: CI [L/min/m^2]
Atrial volume/function | output inventory and phenotype dictionary | LA: V_max [mL], RA: EF_total [%], LA: PER-E [mL/s]
Myocardial mass/wall thickness | output inventory and phenotype dictionary | Myo: Mass [g], Myo: Thickness (Global) [mm]
Mechanics/strain | output inventory and phenotype dictionary | Strain-SAX:*, Strain-LAX:*, Strain-Tagged:*
Aortic and flow | output inventory and phenotype dictionary | Aortic Structure-*, Aortic Flow:*, Aortic Distensibility-*
Tissue | output inventory and phenotype dictionary | Native T1:*, Native T1-Corrected:*
Cross-chamber | output inventory and phenotype dictionary | AV: AVPD [cm], AV: IPVT [mL]
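
The column names above follow a visible "prefix: name [unit]" shape, and several families are listed as wildcard patterns. The sketch below shows one way a downstream script might recover the unit from a column name and match a column to a wildcard family. The names WILDCARD_FAMILIES, parse_column, and family_of are hypothetical, and matching with fnmatchcase is an assumption of this example rather than the project's own mechanism; the authoritative mapping remains the output inventory.

```python
import re
from fnmatch import fnmatchcase

# Wildcard patterns copied from the table above (hypothetical mapping).
WILDCARD_FAMILIES = {
    "Mechanics/strain": ["Strain-SAX:*", "Strain-LAX:*", "Strain-Tagged:*"],
    "Aortic and flow": [
        "Aortic Structure-*",
        "Aortic Flow:*",
        "Aortic Distensibility-*",
    ],
    "Tissue": ["Native T1:*", "Native T1-Corrected:*"],
}

# Matches "<prefix>: <name> [<unit>]", for example "LV: V_ED [mL]".
COLUMN_RE = re.compile(
    r"^(?P<prefix>[^:]+):\s*(?P<name>.+?)\s*\[(?P<unit>[^\]]+)\]$"
)


def parse_column(column):
    """Split a column name into prefix, name, and unit, or return None."""
    match = COLUMN_RE.match(column)
    return match.groupdict() if match else None


def family_of(column):
    """Return the first wildcard family whose pattern matches, else None."""
    for family, patterns in WILDCARD_FAMILIES.items():
        if any(fnmatchcase(column, pattern) for pattern in patterns):
            return family
    return None
```

For instance, parse_column("LV: CI [L/min/m^2]") yields the unit "L/min/m^2" without altering the column name itself, which keeps the unit-in-name convention intact for any code that reads the tables.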

Schema is part of documentation

CSV column names are not implementation details. They are the bridge between source code, exported data, phenotype pages, validators, and downstream paper tables.

Source audit

  • Artifact families were checked against docs/data/output_column_inventory.yml.
  • Page-level exact output coverage is enforced for promoted phenotype pages by website/scripts/validate-phenotype-pages.cjs.
  • Textbook context boundary: broad clinical textbook context is not surfaced here because this page documents data schema contracts.