Feature formulas and units
Feature documentation must make units, formula conventions, upstream dependencies, and exact output labels explicit. Each phenotype page remains the authority for its own formulas; this page summarizes the cross-cutting conventions.
- Modality: All modalities
- Pipeline step: Feature extraction and output documentation
- Outputs: Documented formulas, units, column naming, and missingness rules
- Maturity: Source-audited method page
Formula conventions
| Feature | Current convention | Unit |
|---|---|---|
| Stroke volume | EDV - ESV | mL |
| Ejection fraction | (EDV - ESV) / EDV * 100 | % |
| Cardiac output | stroke volume * heart rate * 1e-3 | L/min |
| Cardiac index | cardiac output / BSA | L/min/m^2 |
| Indexed volume or mass | raw value divided by BSA | mL/m^2 or g/m^2 |
| Aortic equivalent diameter | 2 * sqrt(area / pi) | mm |
| Aortic distensibility | (area_max - area_min) / (area_min * central pulse pressure) * 1e3 | 10^-3/mmHg |
| Regurgitant fraction | backward flow / forward flow * 100 | % |
| Native T1 correction | aggregate-fitted blood-pool correction | ms |
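
The arithmetic conventions in the table can be written out as a minimal Python sketch. The function names are hypothetical and are not taken from the pipeline. The input units are assumptions that make the tabulated output units work out: volumes in mL, heart rate in beats/min, BSA in m^2, aortic area in mm^2, and central pulse pressure in mmHg. Native T1 correction is omitted because its fitting procedure is not specified on this page.

```python
import math


def stroke_volume_ml(edv_ml, esv_ml):
    """Stroke volume in mL: EDV - ESV."""
    return edv_ml - esv_ml


def ejection_fraction_pct(edv_ml, esv_ml):
    """Ejection fraction in %: (EDV - ESV) / EDV * 100."""
    return (edv_ml - esv_ml) / edv_ml * 100


def cardiac_output_l_min(sv_ml, heart_rate_bpm):
    """Cardiac output in L/min; the 1e-3 factor converts mL/min to L/min."""
    return sv_ml * heart_rate_bpm * 1e-3


def cardiac_index_l_min_m2(co_l_min, bsa_m2):
    """Cardiac index in L/min/m^2: cardiac output / BSA."""
    return co_l_min / bsa_m2


def equivalent_diameter_mm(area_mm2):
    """Diameter in mm of the circle with the same area: 2 * sqrt(area / pi)."""
    return 2 * math.sqrt(area_mm2 / math.pi)


def aortic_distensibility(area_max, area_min, pulse_pressure_mmhg):
    """Distensibility in 10^-3/mmHg; both areas must share one unit."""
    return (area_max - area_min) / (area_min * pulse_pressure_mmhg) * 1e3


# Illustrative values only, not reference data.
sv = stroke_volume_ml(150, 60)
co = cardiac_output_l_min(sv, 70)
print(sv, ejection_fraction_pct(150, 60), co, cardiac_index_l_min_m2(co, 1.8))
```

Keeping each conversion factor (`* 100`, `* 1e-3`, `* 1e3`) inside the function that owns the unit makes it easier to check a formula against its documented unit.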
Unit rules
| Feature class | Typical unit | Notes |
|---|---|---|
| Volume | mL | Indexed variants should state denominator |
| Mass | g | Myocardial density convention belongs on myocardial pages |
| Ejection/emptying fraction | % | Derived from phase-specific volumes |
| Strain | % | Sign convention and backend must be documented |
| Strain rate | 1/s | Peak definitions are method-sensitive |
| Torsion/recoil | degree/cm and degree/cm/s | Length normalization should be stated |
| T1 | ms | Acquisition- and correction-specific interpretation |
| Flow | mL, mL/s, cm/s, cm^2 | Depends on phase-contrast conventions |
Column naming rule
Every documented output should preserve the pipeline column name. Units should remain visible in the field name when the CSV already contains units, for example `LV: V_ED [mL]`, `Native T1: Myocardium-Global [ms]`, or `Aortic Flow: Regurgitant Fraction [%]`.
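
As an illustration, documentation tooling could read the unit from a column name without altering the name itself. The sketch below assumes that a unit, when present, is the final square-bracketed suffix, as in the examples above. The helper `split_unit` is hypothetical and is not part of the pipeline.

```python
import re

# Assumed pattern: a label, optional whitespace, then one trailing "[unit]".
_UNIT_SUFFIX = re.compile(r"^(?P<label>.*\S)\s*\[(?P<unit>[^\[\]]+)\]$")


def split_unit(column_name):
    """Return (label, unit); unit is None when no bracketed suffix is found."""
    match = _UNIT_SUFFIX.match(column_name)
    if match is None:
        return column_name, None
    return match.group("label"), match.group("unit")


for name in ("LV: V_ED [mL]", "Aortic Flow: Regurgitant Fraction [%]"):
    label, unit = split_unit(name)
    # The original column name stays the lookup key; label and unit are views.
    print(name, "->", label, "|", unit)
```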
Legacy labels are contracts
Do not silently normalize legacy output strings. If a current CSV contains a typo or old naming convention, document it as schema debt and migrate with an explicit versioned plan.
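
One way to make such a plan explicit is a versioned rename table that is applied only from a stated schema version onward. The sketch below is an assumption about how that could look, not the project's actual migration mechanism. `LabelMigration`, `resolve_label`, and the example labels are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LabelMigration:
    legacy: str         # label exactly as it appears in current CSVs
    replacement: str    # corrected label
    since_version: int  # first schema version that uses the replacement


def resolve_label(label, migrations, schema_version):
    """Return the label to emit for the given schema version.

    Legacy labels pass through unchanged until their migration version is
    reached, so older consumers keep seeing the string they depend on.
    """
    for migration in migrations:
        if migration.legacy == label and schema_version >= migration.since_version:
            return migration.replacement
    return label


# Hypothetical entry: the typo "Fracton" is invented for illustration.
plan = [LabelMigration("Example: Fracton [%]", "Example: Fraction [%]", 2)]
print(resolve_label("Example: Fracton [%]", plan, schema_version=1))
print(resolve_label("Example: Fracton [%]", plan, schema_version=2))
```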
Missing input behavior
Derived features can be conditional. BSA-indexed rows depend on BSA lookup, distensibility rows depend on pressure data, atrial-contribution rows depend on ECG-derived timing, and strain-rate rows depend on peak detection. Public pages should say when a row is conditional rather than implying universal availability.
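
A conditional row can be sketched as follows, using BSA indexing as the example. Whether the pipeline omits the row or writes an empty value when BSA is missing is not stated here. This sketch assumes omission. The function name and dictionary keys are hypothetical, not pipeline column names.

```python
def derived_rows(edv_ml, esv_ml, bsa_m2=None):
    """Build derived rows; the BSA-indexed row exists only when BSA is known."""
    rows = {"stroke_volume_ml": edv_ml - esv_ml}
    if bsa_m2 is not None:
        # Conditional row: depends on a successful BSA lookup upstream.
        rows["stroke_volume_index_ml_m2"] = rows["stroke_volume_ml"] / bsa_m2
    return rows


print(sorted(derived_rows(150, 60)))              # BSA lookup failed
print(sorted(derived_rows(150, 60, bsa_m2=1.8)))  # BSA available
```

The same guard pattern would apply to distensibility (pressure data), atrial contribution (ECG-derived timing), and strain rate (peak detection).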
Source audit
- Formula conventions were checked against current phenotype pages and implementation sources under `src/feature_extraction/**`.
- Output names and schema-debt conventions were checked against `docs/data/output_column_inventory.yml` and `docs/data/phenotype_dictionary.yml`.
- Textbook context boundary: broad clinical textbook context is not surfaced here because this page documents formula and schema conventions.