Validation Rules

xportrs provides built-in validation that catches compliance issues before a file is written. This page documents the validation rules and their severity levels.

Validation Overview

graph LR
    subgraph "Validation Pipeline"
        A[Dataset] --> B[Agency Rules]
        B --> C[V5 Format Rules]
        C --> D[Issues Collection]
        D --> E{Has Errors?}
        E -->|Yes| F[Block Write]
        E -->|No| G[Allow Write]
    end

Severity Levels

| Severity | Meaning | Blocks Write? |
| --- | --- | --- |
| Error | File would be rejected | Yes |
| Warning | Review recommended | No |
| Info | Best practice suggestion | No |
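
The gating behavior in the table can be sketched without the library. This is a minimal, std-only illustration: the `Severity` enum, the `Issue` struct, and `write_allowed` below are stand-ins defined for this example and are not the xportrs types.

```rust
/// Stand-in severity levels mirroring the table above (not the xportrs type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Severity {
    Info,
    Warning,
    Error,
}

impl Severity {
    /// Only errors block the write; warnings and info are advisory.
    pub fn blocks_write(self) -> bool {
        matches!(self, Severity::Error)
    }
}

/// Hypothetical issue record used only in this sketch.
#[derive(Debug, Clone)]
pub struct Issue {
    pub severity: Severity,
    pub message: String,
}

/// A write is allowed when none of the collected issues blocks it.
pub fn write_allowed(issues: &[Issue]) -> bool {
    !issues.iter().any(|issue| issue.severity.blocks_write())
}
```

An empty issue list, or one containing only warnings and info entries, allows the write; a single error is enough to block it.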

Built-in Validation Rules

Variable Name Rules

| Rule | Severity | Message |
| --- | --- | --- |
| Name empty | Error | “Variable name cannot be empty” |
| Name >8 bytes | Error | “Variable name exceeds 8 bytes” |
| Invalid characters | Error | “Variable name contains invalid characters” |
| Starts with number | Error | “Variable name must start with a letter” |
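
To make the rules concrete, here is a rough, std-only sketch of how such checks could be written. It is not the xportrs implementation: `check_variable_name` is a hypothetical helper, and the set of valid characters (ASCII letters, digits, and underscore) is an assumption, since the table does not spell it out.

```rust
/// Returns the messages from the table above that apply to `name`.
/// Assumption: valid characters are ASCII letters, digits, and underscore.
pub fn check_variable_name(name: &str) -> Vec<&'static str> {
    let mut messages = Vec::new();

    if name.is_empty() {
        messages.push("Variable name cannot be empty");
        return messages; // nothing else to check on an empty name
    }
    // `str::len` counts bytes, which is what the 8-byte limit refers to.
    if name.len() > 8 {
        messages.push("Variable name exceeds 8 bytes");
    }
    if !name.chars().all(|c| c.is_ascii_alphanumeric() || c == '_') {
        messages.push("Variable name contains invalid characters");
    }
    // Mirrors the "Starts with number" rule as stated in the table.
    if name.starts_with(|c: char| c.is_ascii_digit()) {
        messages.push("Variable name must start with a letter");
    }
    messages
}
```

A name can trigger several rules at once, so the helper collects every applicable message instead of stopping at the first one.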

Variable Label Rules

| Rule | Severity | Message |
| --- | --- | --- |
| Label missing | Warning | “Variable ‘X’ is missing a label” |
| Label >40 bytes | Error | “Variable label exceeds 40 bytes” |
| Non-ASCII (FDA) | Error | “Variable label contains non-ASCII characters” |
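
The limit is expressed in bytes, not characters, which matters for labels containing multi-byte UTF-8 text. The sketch below illustrates that distinction with the standard library only. `check_variable_label` and its `ascii_only` flag are hypothetical and model the FDA rule; treating an empty label as missing is also an assumption.

```rust
/// Checks a variable label against the rules in the table above.
/// `ascii_only` models the FDA rule; it is a parameter of this sketch,
/// not an xportrs option. An empty label is treated as missing (assumption).
pub fn check_variable_label(name: &str, label: Option<&str>, ascii_only: bool) -> Vec<String> {
    let mut messages = Vec::new();
    match label {
        None | Some("") => {
            messages.push(format!("Variable '{}' is missing a label", name));
        }
        Some(text) => {
            // Byte length: a multi-byte UTF-8 character counts more than once.
            if text.len() > 40 {
                messages.push("Variable label exceeds 40 bytes".to_string());
            }
            if ascii_only && !text.is_ascii() {
                messages.push("Variable label contains non-ASCII characters".to_string());
            }
        }
    }
    messages
}
```

For example, a label made of 21 copies of `é` is only 21 characters long but occupies 42 bytes, so it exceeds the limit.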

Dataset Rules

| Rule | Severity | Message |
| --- | --- | --- |
| Name empty | Error | “Dataset name cannot be empty” |
| Name >8 bytes | Error | “Dataset name exceeds 8 bytes” |
| Label missing | Warning | “Dataset is missing a label” |
| Label >40 bytes | Error | “Dataset label exceeds 40 bytes” |

Data Rules

| Rule | Severity | Message |
| --- | --- | --- |
| Column length mismatch | Error | “Columns have different lengths” |
| Character >200 bytes | Error | “Character value exceeds 200 bytes” |
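
As a rough illustration of the two data rules, the following std-only sketch models character columns as plain vectors. The function name and the column representation are assumptions made for this example and do not reflect how xportrs stores data.

```rust
/// Checks character columns against the two data rules above.
/// Each inner `Vec` is one column; `None` stands for a missing value.
pub fn check_character_columns(columns: &[Vec<Option<String>>]) -> Vec<&'static str> {
    let mut messages = Vec::new();

    // Every column must have the same number of rows as the first one.
    if let Some(first) = columns.first() {
        if columns.iter().any(|col| col.len() != first.len()) {
            messages.push("Columns have different lengths");
        }
    }

    // Character values are limited to 200 bytes (not 200 characters).
    let too_long = columns
        .iter()
        .flatten() // every cell of every column
        .flatten() // skip missing values
        .any(|value| value.len() > 200);
    if too_long {
        messages.push("Character value exceeds 200 bytes");
    }

    messages
}
```

Missing values are skipped by the length check, and a value of exactly 200 bytes passes.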

Using Validation

Basic Validation

use xportrs::{Dataset, Severity, Xpt};

// Illustrative wrapper: gives `?` and the early return a function to return from.
// `dataset` is assumed to be built elsewhere.
fn write_checked(dataset: Dataset) -> Result<(), Box<dyn std::error::Error>> {
    let validated = Xpt::writer(dataset).finalize()?;

    // Check for any issues
    if validated.has_errors() {
        eprintln!("Cannot write file due to errors:");
        for issue in validated.issues() {
            if issue.severity() == Severity::Error {
                eprintln!("  ERROR: {}", issue);
            }
        }
        return Err("Validation failed".into());
    }

    // Proceed with write
    validated.write_path("output.xpt")?;
    Ok(())
}

Agency-Specific Validation

use xportrs::{Agency, Dataset, Xpt};

// Illustrative wrapper; `dataset` is assumed to be built elsewhere.
fn validate_for_agencies(dataset: Dataset) -> xportrs::Result<()> {
    // FDA validation (strict ASCII)
    let fda_validated = Xpt::writer(dataset.clone())
        .agency(Agency::FDA)
        .finalize()?;

    // PMDA validation (allows extended characters)
    let pmda_validated = Xpt::writer(dataset)
        .agency(Agency::PMDA)
        .finalize()?;

    Ok(())
}

Filtering Issues

use xportrs::{Dataset, Severity, Xpt};

// Illustrative wrapper; `dataset` is assumed to be built elsewhere.
fn split_by_severity(dataset: Dataset) -> xportrs::Result<()> {
    let validated = Xpt::writer(dataset).finalize()?;

    // Get only errors
    let errors: Vec<_> = validated.issues()
        .iter()
        .filter(|i| i.severity() == Severity::Error)
        .collect();

    // Get only warnings
    let warnings: Vec<_> = validated.issues()
        .iter()
        .filter(|i| i.severity() == Severity::Warning)
        .collect();

    Ok(())
}

Checking Specific Variables

use xportrs::{Dataset, Xpt};

// Illustrative wrapper; `dataset` is assumed to be built elsewhere.
fn report_key_variables(dataset: Dataset) -> xportrs::Result<()> {
    let validated = Xpt::writer(dataset).finalize()?;

    for issue in validated.issues() {
        // Check what the issue targets
        match issue.target() {
            "USUBJID" => println!("Issue with USUBJID: {}", issue),
            "AESEQ" => println!("Issue with AESEQ: {}", issue),
            _ => {}
        }
    }

    Ok(())
}

Pinnacle 21 Rules

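xportrs validation covers XPT-level rules only. For full CDISC compliance, also run Pinnacle 21. The tables below show which Pinnacle 21 rules xportrs already checks and which need external validation.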

Rules Covered by xportrs

| Pinnacle 21 Rule | Description | xportrs |
| --- | --- | --- |
| SD1001 | Variable name >8 characters | ✅ Error |
| SD1002 | Variable label >40 characters | ✅ Error |
| SD0063 | Missing/mismatched variable label | ✅ Warning |
| SD0063A | Missing/mismatched dataset label | ✅ Warning |

Rules Requiring External Validation

| Pinnacle 21 Rule | Description | Why External |
| --- | --- | --- |
| SD0001 | Missing required variable | Domain-specific |
| SD0002 | Null value in required field | Data content |
| SD0060 | Variable not in define.xml | Requires define.xml |
| CT2002 | Invalid controlled terminology | Requires CDISC CT |
| SE0063 | Label doesn’t match SDTM standard | Requires SDTM metadata |

Custom Validation

You can add custom validation before writing:

use xportrs::{Dataset, Xpt};

fn validate_custom(dataset: &Dataset) -> Vec<String> {
    let mut issues = vec![];

    // Check for required variables
    let required = ["STUDYID", "USUBJID"];
    for var in required {
        if dataset.column(var).is_none() {
            issues.push(format!("Missing required variable: {}", var));
        }
    }

    // Check STUDYID consistency
    if let Some(col) = dataset.column("STUDYID") {
        if let xportrs::ColumnData::String(values) = col.data() {
            let first = values.first().and_then(|v| v.as_ref());
            for (i, value) in values.iter().enumerate() {
                if value.as_ref() != first {
                    issues.push(format!("STUDYID inconsistent at row {}", i));
                }
            }
        }
    }

    issues
}

fn main() -> xportrs::Result<()> {
    let dataset = /* ... */;

    // Custom validation
    let custom_issues = validate_custom(&dataset);
    if !custom_issues.is_empty() {
        for issue in custom_issues {
            eprintln!("Custom validation: {}", issue);
        }
        return Err(xportrs::Error::invalid_data("Custom validation failed"));
    }

    // xportrs validation
    let validated = Xpt::writer(dataset).finalize()?;
    validated.write_path("output.xpt")?;

    Ok(())
}

Validation Best Practices

> [!TIP]
> Run validation early in your pipeline to catch issues before processing large datasets.

  1. Validate incrementally: Check validation after each transformation step
  2. Log all issues: Even warnings may indicate data quality problems
  3. Use agency-specific validation: Different agencies have different requirements
  4. Combine with Pinnacle 21: xportrs + Pinnacle 21 provides comprehensive coverage
  5. Document exceptions: If you must ship with warnings, document why
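
One way to act on the last point is to keep an explicit list of accepted warnings together with the reason each one is tolerated, and to flag any warning that is not on that list. The sketch below is a std-only illustration; `undocumented_warnings` and the `accepted` map are hypothetical and not part of xportrs.

```rust
use std::collections::HashMap;

/// Returns the warnings that have no documented justification.
/// `accepted` maps a warning message to the reason it is tolerated;
/// both the map and this function are hypothetical helpers.
pub fn undocumented_warnings<'a>(
    warnings: &[&'a str],
    accepted: &HashMap<&str, &str>,
) -> Vec<&'a str> {
    warnings
        .iter()
        .copied()
        .filter(|warning| !accepted.contains_key(warning))
        .collect()
}
```

A pipeline could treat a non-empty result as a reason to stop and review, while documented warnings pass through with their recorded justification.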