Validation API

xportrs validates datasets against agency, format, and CDISC rules before they are written as XPT files. This page describes the validation API.

Validation Overview

graph TB
    subgraph "Validation Pipeline"
        A[Dataset] --> B[Agency Rules]
        B --> C[Format Rules]
        C --> D[CDISC Rules]
        D --> E[Issue Collection]
    end
    
    subgraph "Issue Types"
        E --> F[Errors]
        E --> G[Warnings]
        E --> H[Info]
    end

ValidatedWrite

The ValidatedWrite type represents a validated dataset ready for writing:

use xportrs::{Column, ColumnData, Dataset, Xpt};

fn main() -> xportrs::Result<()> {
    let dataset = Dataset::new(
        "AE",
        vec![Column::new("A", ColumnData::F64(vec![Some(1.0)]))],
    )?;
    let validated = Xpt::writer(dataset).finalize()?;

    // Check for issues
    println!("Has errors: {}", validated.has_errors());
    println!("Has warnings: {}", validated.has_warnings());

    // Get all issues
    let issues = validated.issues();
    println!("Issue count: {}", issues.len());

    // Only write if no errors
    if !validated.has_errors() {
        validated.write_path("output.xpt")?;
    }
    Ok(())
}

Issue Type

The Issue enum represents validation issues:

Issue Variants

use xportrs::Issue;

// Takes the issue by reference so the example does not rely on a
// placeholder value.
fn describe(issue: &Issue) {
    match issue {
        Issue::VariableNameTooLong { variable, length } => {
            println!("Variable {} name is {} bytes (max 8)", variable, length);
        }
        Issue::VariableLabelTooLong { variable, length } => {
            println!("Variable {} label is {} bytes (max 40)", variable, length);
        }
        Issue::MissingVariableLabel { variable } => {
            println!("Variable {} is missing a label", variable);
        }
        Issue::MissingDatasetLabel { dataset } => {
            println!("Dataset {} is missing a label", dataset);
        }
        Issue::InvalidFormatSyntax { variable, format, reason } => {
            println!("Variable {} has invalid format '{}': {}", variable, format, reason);
        }
        // ... other variants
        _ => {}
    }
}

Issue Properties

use xportrs::{Issue, Severity};

fn inspect(issue: &Issue) {
    // Severity level
    let severity: Severity = issue.severity();

    // Target (variable name, dataset name, etc.)
    let target: &str = issue.target();

    // Display representation
    println!("{}", issue);

    // Debug representation
    println!("{:?}", issue);
}

Severity Levels

use xportrs::Severity;
let severity = Severity::Error;
match severity {
    Severity::Error => {
        // Blocks file writing
        // File would be rejected by agency
    }
    Severity::Warning => {
        // Does not block writing
        // Review recommended
    }
    Severity::Info => {
        // Informational only
        // Best practice suggestion
    }
}

// Severity is ordered
assert!(Severity::Info < Severity::Warning);
assert!(Severity::Warning < Severity::Error);
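
An ordering like the one asserted above is what Rust's derived `Ord` produces when variants are declared from least to most severe. As a minimal sketch, the block below uses a local stand-in enum (`Level` is hypothetical, not the xportrs `Severity` type) to show how such an ordering lets you find the worst severity in a set of results or compare it against a threshold.

```rust
/// Hypothetical stand-in for a severity enum; NOT the xportrs type.
/// Deriving `PartialOrd`/`Ord` orders variants by declaration order,
/// so `Info < Warning < Error`.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum Level {
    Info,
    Warning,
    Error,
}

/// Returns the most severe level present, or `None` for an empty slice.
fn worst(levels: &[Level]) -> Option<Level> {
    levels.iter().copied().max()
}

/// True when the most severe level is at or above `threshold`.
fn reaches(levels: &[Level], threshold: Level) -> bool {
    worst(levels).map_or(false, |w| w >= threshold)
}
```

A threshold check of this kind is one way to implement a "treat warnings as failures" mode without enumerating each level by hand.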

Filtering Issues

use xportrs::{Column, ColumnData, Dataset, Severity, Xpt};

fn main() -> xportrs::Result<()> {
    let dataset = Dataset::new(
        "AE",
        vec![Column::new("A", ColumnData::F64(vec![Some(1.0)]))],
    )?;
    let validated = Xpt::writer(dataset).finalize()?;

    // Get only errors
    let errors: Vec<_> = validated.issues()
        .iter()
        .filter(|i| i.severity() == Severity::Error)
        .collect();

    // Get issues for specific variable
    let usubjid_issues: Vec<_> = validated.issues()
        .iter()
        .filter(|i| i.target() == "USUBJID")
        .collect();

    // Count by severity
    let error_count = validated.issues()
        .iter()
        .filter(|i| i.severity() == Severity::Error)
        .count();
    Ok(())
}
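
When several counts or groupings are needed, a single pass into a map avoids repeating the same filter for each severity or variable. The sketch below illustrates this with hypothetical stand-in types (`Level` and `Finding` are not xportrs types); with the real API the same idea would presumably be driven by `severity()` and `target()`.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for a severity enum.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum Level {
    Info,
    Warning,
    Error,
}

/// Hypothetical stand-in for a validation issue.
struct Finding {
    level: Level,
    target: String,
}

/// Counts findings per level in one pass.
fn count_by_level(findings: &[Finding]) -> BTreeMap<Level, usize> {
    let mut counts = BTreeMap::new();
    for f in findings {
        *counts.entry(f.level).or_insert(0) += 1;
    }
    counts
}

/// Groups findings by the variable or dataset they refer to.
fn group_by_target(findings: &[Finding]) -> BTreeMap<&str, Vec<&Finding>> {
    let mut groups: BTreeMap<&str, Vec<&Finding>> = BTreeMap::new();
    for f in findings {
        groups.entry(f.target.as_str()).or_default().push(f);
    }
    groups
}
```

A `BTreeMap` keeps keys sorted, so iterating the counts visits levels in severity order and targets in alphabetical order.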

Agency-Specific Validation

use xportrs::{Agency, Column, ColumnData, Dataset, Xpt};

fn main() -> xportrs::Result<()> {
    let dataset = Dataset::new(
        "AE",
        vec![Column::new("A", ColumnData::F64(vec![Some(1.0)]))],
    )?;

    // FDA: Strict ASCII validation
    let fda_result = Xpt::writer(dataset.clone())
        .agency(Agency::FDA)
        .finalize()?;

    // Check for ASCII violations
    for issue in fda_result.issues() {
        if issue.to_string().contains("ASCII") {
            println!("ASCII issue: {}", issue);
        }
    }
    Ok(())
}

Validation Rules

Variable Name Rules

| Rule | Severity | Trigger |
| --- | --- | --- |
| Empty name | Error | Name is empty string |
| Name too long | Error | Name > 8 bytes |
| Invalid characters | Error | Non-alphanumeric (except _) |
| Starts with number | Error | First char is digit |
| Non-uppercase | Info | Lowercase letters present |
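
To make the triggers in this table concrete, here is a minimal standalone sketch that applies them to a single name. It is not the xportrs implementation: `Level` and `check_variable_name` are hypothetical, and the sketch assumes "alphanumeric" means ASCII letters and digits.

```rust
/// Hypothetical severity levels used by this sketch.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Level {
    Info,
    Error,
}

/// Applies the variable name rules from the table and returns
/// (severity, rule name) pairs in table order.
fn check_variable_name(name: &str) -> Vec<(Level, &'static str)> {
    let mut out = Vec::new();
    if name.is_empty() {
        // Nothing else can be checked on an empty name.
        out.push((Level::Error, "Empty name"));
        return out;
    }
    // `str::len` counts bytes, which is what the 8-byte limit refers to.
    if name.len() > 8 {
        out.push((Level::Error, "Name too long"));
    }
    // Assumption: only ASCII letters, ASCII digits, and `_` are allowed.
    if !name.chars().all(|c| c.is_ascii_alphanumeric() || c == '_') {
        out.push((Level::Error, "Invalid characters"));
    }
    if name.starts_with(|c: char| c.is_ascii_digit()) {
        out.push((Level::Error, "Starts with number"));
    }
    if name.chars().any(|c| c.is_ascii_lowercase()) {
        out.push((Level::Info, "Non-uppercase"));
    }
    out
}
```

A single name can trigger several rules at once; for example, `1abc` starts with a digit and also contains lowercase letters.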

Variable Label Rules

| Rule | Severity | Trigger |
| --- | --- | --- |
| Missing label | Warning | Label is None or empty |
| Label too long | Error | Label > 40 bytes |
| Non-ASCII (FDA) | Error | Non-ASCII characters |
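
Because the limit is expressed in bytes, a label can fit within 40 characters and still be too long once it contains multi-byte characters. The following is a minimal sketch of the three label rules, assuming lengths are measured as UTF-8 bytes (Rust's `str::len`); `Level`, `check_variable_label`, and the `fda` flag are hypothetical and not part of the xportrs API.

```rust
/// Hypothetical severity levels used by this sketch.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Level {
    Warning,
    Error,
}

/// Applies the variable label rules from the table.
/// `fda` enables the ASCII-only rule.
fn check_variable_label(label: Option<&str>, fda: bool) -> Vec<(Level, &'static str)> {
    let mut out = Vec::new();
    // A label that is `None` or empty counts as missing.
    let text = match label {
        Some(t) if !t.is_empty() => t,
        _ => {
            out.push((Level::Warning, "Missing label"));
            return out;
        }
    };
    // Assumption: the 40-byte limit is checked against the UTF-8 byte length.
    if text.len() > 40 {
        out.push((Level::Error, "Label too long"));
    }
    if fda && !text.is_ascii() {
        out.push((Level::Error, "Non-ASCII (FDA)"));
    }
    out
}
```

For instance, 39 ASCII letters followed by `é` is 40 characters but 41 bytes in UTF-8, so under these assumptions it triggers "Label too long" and, with the FDA rule enabled, "Non-ASCII (FDA)" as well.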

Dataset Rules

| Rule | Severity | Trigger |
| --- | --- | --- |
| Empty name | Error | Domain code is empty |
| Name too long | Error | Domain code > 8 bytes |
| Missing label | Warning | Dataset label is None |
| Label too long | Error | Label > 40 bytes |

Data Rules

| Rule | Severity | Trigger |
| --- | --- | --- |
| Column length mismatch | Error | Columns have different lengths |
| Character too long | Error | Character value > 200 bytes |
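
Both data rules are simple to check ahead of time on your own data. The sketch below shows one way to do so on a hypothetical `CharColumn` type (not an xportrs type): it reports columns whose row count differs from the first column and the row indices of values over a byte limit.

```rust
/// Hypothetical character column: a name plus optional string values.
struct CharColumn {
    name: String,
    values: Vec<Option<String>>,
}

/// Returns the names of columns whose length differs from the first column.
fn mismatched_columns(columns: &[CharColumn]) -> Vec<&str> {
    let expected = match columns.first() {
        Some(c) => c.values.len(),
        None => return Vec::new(),
    };
    columns
        .iter()
        .filter(|c| c.values.len() != expected)
        .map(|c| c.name.as_str())
        .collect()
}

/// Returns the row indices whose value exceeds `max_bytes`.
/// Missing values (`None`) are never oversized.
fn oversized_rows(column: &CharColumn, max_bytes: usize) -> Vec<usize> {
    column
        .values
        .iter()
        .enumerate()
        .filter(|(_, v)| v.as_deref().map_or(false, |s| s.len() > max_bytes))
        .map(|(i, _)| i)
        .collect()
}
```

Reporting row indices instead of a single yes/no answer makes it easier to locate the offending records in the source data.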

Custom Pre-Validation

Add custom validation before xportrs validation:

use xportrs::{Dataset, Xpt};

fn custom_validate(dataset: &Dataset) -> Result<(), String> {
    // Check for required variables
    let required = ["STUDYID", "USUBJID"];
    for var in required {
        if dataset.column(var).is_none() {
            return Err(format!("Missing required variable: {}", var));
        }
    }

    // Check STUDYID consistency
    // ... additional checks ...

    Ok(())
}

fn write_with_validation(dataset: Dataset, path: &str) -> xportrs::Result<()> {
    // Custom validation first
    custom_validate(&dataset)
        .map_err(|e| xportrs::Error::invalid_data(e))?;

    // Then xportrs validation
    let validated = Xpt::writer(dataset).finalize()?;

    if validated.has_errors() {
        return Err(xportrs::Error::invalid_data("Validation failed"));
    }

    validated.write_path(path)?;
    Ok(())
}

Validation Reporting

use xportrs::{Severity, Xpt};

fn report_validation(dataset: xportrs::Dataset) {
    let validated = Xpt::writer(dataset).finalize().unwrap();

    // Summary
    let errors = validated.issues().iter()
        .filter(|i| i.severity() == Severity::Error).count();
    let warnings = validated.issues().iter()
        .filter(|i| i.severity() == Severity::Warning).count();
    let infos = validated.issues().iter()
        .filter(|i| i.severity() == Severity::Info).count();

    println!("Validation Summary:");
    println!("  Errors:   {}", errors);
    println!("  Warnings: {}", warnings);
    println!("  Info:     {}", infos);

    // Detailed report
    if !validated.issues().is_empty() {
        println!("\nDetails:");
        for issue in validated.issues() {
            let prefix = match issue.severity() {
                Severity::Error => "ERROR",
                Severity::Warning => "WARN ",
                Severity::Info => "INFO ",
            };
            println!("  [{}] {} - {}", prefix, issue.target(), issue);
        }
    }
}

Integration with Pinnacle 21

Note: xportrs validation covers XPT-level rules only. For complete CDISC validation, use Pinnacle 21 or a similar tool.

| Validation Area | xportrs | Pinnacle 21 |
| --- | --- | --- |
| Variable names | | |
| Variable labels | | |
| Format metadata | | |
| Controlled terminology | | |
| Required variables | | |
| Cross-dataset consistency | | |
| define.xml matching | | |

Best Practices

  1. Validate early: Run validation before processing large datasets
  2. Log all issues: Keep records of validation results
  3. Fail on errors: Don’t write files with validation errors
  4. Review warnings: Warnings may indicate data quality issues
  5. Document exceptions: If shipping with warnings, document why
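
Practices 3 to 5 can be combined into a small release gate: reject on any error, hold for review when a warning has no documented justification, and write otherwise. The following is a minimal sketch of that policy; `Decision` and `decide` are hypothetical names, and warnings are represented as plain strings for illustration only.

```rust
/// Hypothetical outcome of the release gate.
#[derive(Debug, PartialEq, Eq)]
enum Decision {
    /// No errors, and every warning has a documented justification.
    Write,
    /// No errors, but these warnings still need review or documentation.
    NeedsReview(Vec<String>),
    /// At least one error: do not write the file.
    Reject,
}

/// Decides whether a file should be written.
/// `documented` lists the warnings that have a recorded justification.
fn decide(error_count: usize, warnings: &[&str], documented: &[&str]) -> Decision {
    if error_count > 0 {
        return Decision::Reject;
    }
    let undocumented: Vec<String> = warnings
        .iter()
        .filter(|w| !documented.contains(*w))
        .map(|w| w.to_string())
        .collect();
    if undocumented.is_empty() {
        Decision::Write
    } else {
        Decision::NeedsReview(undocumented)
    }
}
```

Returning the undocumented warnings, not just a flag, gives the caller something concrete to log or attach to a review.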