Pillar VI · Data Quality & Validation

The check before the chart.

Every dataset that enters Aixys is profiled — column types, missingness, range, distribution, duplicates, units. Issues are surfaced before analysis begins, with a confidence score for the dataset as a whole. The same checks rerun on every refresh so a study cannot quietly drift onto bad data.

Readiness score: 0–100
Built-in checks: 18
Re-runs automatically: every refresh
Readiness · 86 / 100 · ANALYSIS-READY
01 · capability

What you can do

Every action carries provenance: what data was used, what was decided, when, and by whom.

01.5 · what gets flagged

The checks that stop bad numbers reaching a chart

Every dataset is profiled column-by-column, then re-profiled on every refresh. Issues come with a recommended next action, not just a flag.
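
As an illustration of what column-by-column profiling can look like, here is a minimal sketch in plain Python. It is not the Aixys implementation: the function names, the set of tokens treated as missing, and the choice to count only exact duplicate rows are all assumptions made for the example.

```python
from collections import Counter

# Assumed tokens that count as "missing"; a real profiler would make this configurable.
MISSING_TOKENS = {"", "na", "n/a", "nan", "null"}


def is_missing(value):
    """True for None or for any of the assumed missing-value tokens."""
    return value is None or str(value).strip().lower() in MISSING_TOKENS


def as_float(value):
    """Return the value as a float, or None if it does not parse."""
    try:
        return float(value)
    except (TypeError, ValueError):
        return None


def profile_column(name, values):
    """Summarise one column: inferred type, missingness, distinct count, range."""
    present = [v for v in values if not is_missing(v)]
    numbers = [as_float(v) for v in present]
    numeric = bool(present) and all(n is not None for n in numbers)
    missing_pct = 100 * (len(values) - len(present)) / len(values) if values else 0.0
    profile = {
        "column": name,
        "type": "numeric" if numeric else "text",
        "missing_pct": round(missing_pct, 1),
        "distinct": len(set(map(str, present))),
    }
    if numeric:
        profile["min"], profile["max"] = min(numbers), max(numbers)
    return profile


def profile_table(rows):
    """Profile every column and count exact duplicate rows (expects at least one row)."""
    columns = list(rows[0].keys())
    report = [profile_column(c, [r.get(c) for r in rows]) for c in columns]
    counts = Counter(tuple(str(r.get(c)) for c in columns) for r in rows)
    duplicate_rows = sum(n - 1 for n in counts.values() if n > 1)
    return report, duplicate_rows


if __name__ == "__main__":
    # Hypothetical rows, invented for this example.
    rows = [
        {"plot": "1", "yield": "4.2", "rainfall": "31"},
        {"plot": "2", "yield": "3.9", "rainfall": ""},
        {"plot": "3", "yield": "4.8", "rainfall": "28"},
        {"plot": "3", "yield": "4.8", "rainfall": "28"},
    ]
    report, duplicates = profile_table(rows)
    for column_profile in report:
        print(column_profile)
    print("duplicate rows:", duplicates)
```

Re-running the same function on each refresh and comparing the new report with the previous one is one simple way to notice a column whose missingness or range has shifted.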

Missingness · 24 rows shown of 3,400 total · Pattern detected: MAR · Block C
[Figure: per-row grid of present and missing values]
Missing per column: Plot 0% · Block 0% · Yield 3% · N rate 0% · Rainfall 12% · Soil N 18% · Cultivar 0% · Date 4%
Reading: rainfall and soil-N missingness is concentrated in the same blocks — consistent with a sensor or paperwork gap. Aixys recommends backfill from adjacent plots or a sensitivity model excluding those blocks.
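
One way to check whether missingness is concentrated in particular blocks is to compare each block's missing share with the overall share. The sketch below does that in plain Python. The function names, the row layout, and the `factor=2.0` cut-off are hypothetical, and the comparison is a descriptive heuristic, not a formal test of the missingness mechanism.

```python
from collections import defaultdict


def missing_share_by_group(rows, column, group_key):
    """Fraction of rows in each group where `column` is None or empty."""
    totals = defaultdict(int)
    missing = defaultdict(int)
    for row in rows:
        group = row[group_key]
        totals[group] += 1
        if row.get(column) in (None, ""):
            missing[group] += 1
    return {group: missing[group] / totals[group] for group in totals}


def concentrated_groups(shares, overall, factor=2.0):
    """Groups whose missing share is at least `factor` times the overall share.

    `factor` is an assumed cut-off chosen for illustration only.
    """
    if overall <= 0:
        return []
    return sorted(g for g, share in shares.items() if share >= factor * overall)


if __name__ == "__main__":
    # Hypothetical data: three blocks of four plots, soil N missing only in block C.
    rows = [{"block": b, "soil_n": "1.2"} for b in "AAAABBBB"]
    rows += [{"block": "C", "soil_n": v} for v in ("", "", "", "1.1")]

    shares = missing_share_by_group(rows, "soil_n", "block")
    overall = sum(1 for r in rows if r["soil_n"] in (None, "")) / len(rows)
    print(shares)
    print(concentrated_groups(shares, overall))
```

Running the same comparison for two columns and finding the same blocks in both lists would match the reading above, where rainfall and soil-N gaps fall in the same blocks.
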
Issues · prioritised · 4 actionable · 0 blocking
  • high
    Soil N missing in 18% of rows

    Pattern concentrated in Block C — recommend backfill from adjacent plots or exclude block from N-sensitive models.

  • medium
    Rainfall column has 12% gaps

    Missing at random across blocks; safe to impute with the site mean for visualisation, but exclude the affected rows for regression.

  • medium
    Two yield outliers detected

    Row 27 (z = 2.81) and Row 44 (z = -2.52). Cross-check against the field log; retain both rows but flag them for sensitivity analysis.

  • low
    Date column has 4% non-ISO values

    Mixed dd/mm/yyyy and mm-dd-yyyy. Aixys normalised to ISO 8601 — please confirm intent.
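
Two of the checks in the list above, the z-score outlier flag and the date normalisation, can be sketched with the Python standard library. This is an illustration under stated assumptions, not how Aixys computes them: the function names are hypothetical, the `2.5` threshold and the use of the sample standard deviation are assumed, and the date parser assumes that the separator identifies the field order (slashes for dd/mm/yyyy, hyphens for mm-dd-yyyy). That last assumption is exactly the kind of intent a user would need to confirm.

```python
from datetime import datetime
from statistics import mean, stdev

# Assumed formats, tried in order: ISO 8601 first, then the two mixed styles.
KNOWN_FORMATS = ("%Y-%m-%d", "%d/%m/%Y", "%m-%d-%Y")


def zscore_outliers(values, threshold=2.5):
    """Return (index, z) pairs where |z| exceeds the assumed threshold."""
    mu, sigma = mean(values), stdev(values)  # sample standard deviation
    if sigma == 0:
        return []
    return [
        (i, round((v - mu) / sigma, 2))
        for i, v in enumerate(values)
        if abs(v - mu) / sigma > threshold
    ]


def to_iso_date(text):
    """Return (iso_date, matched_format), or (None, None) if nothing matches."""
    for fmt in KNOWN_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).date().isoformat(), fmt
        except ValueError:
            continue
    return None, None


if __name__ == "__main__":
    # Hypothetical yields: ten identical plots and one unusually high value.
    yields = [5.0] * 10 + [9.0]
    print(zscore_outliers(yields))

    for raw in ("2024-03-05", "05/03/2024", "03-05-2024", "13-05-2024"):
        print(raw, "->", to_iso_date(raw))
```

Returning the matched format alongside the ISO value keeps a record of how each date was interpreted, and values that match no known format come back as `None` so they can be reviewed instead of silently guessed.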

02 · in practice

How it shows up in the field

Three representative scenarios drawn from real research workflows, each with the follow-up that the team typically runs next.

03 · sectors

Where this belongs

Industries where this module is most directly relevant. The underlying engine is general — sector templates accelerate the work.

Agriculture · Biotech · Lab QC · Field operations · Academic studies · Regulated research
Continue · 07

See the data quality module against your own data.

A 30-minute working session with the Aixys studio. Bring a real dataset; leave with an analysis, a plan, and answers to your team's hardest research-ops questions.