Welcome to Trial Submission Studio

Trial Submission Studio

Transform clinical trial data into FDA-compliant CDISC formats with confidence.

Trial Submission Studio is a free, open-source desktop application for transforming clinical trial source data (CSV) into CDISC-compliant submission formats.

Caution

ALPHA SOFTWARE - ACTIVE DEVELOPMENT

Trial Submission Studio is currently in early development. Features are incomplete, APIs may change, and bugs are expected. Do not use for production regulatory submissions.

Always validate all outputs with qualified regulatory professionals before submission to regulatory authorities.


See It in Action

Select your CDISC standard and open your study data:

Welcome Screen

Automatic domain discovery with intelligent column mapping:

Column Mapping

Built-in validation against CDISC standards:

Validation



Key Features

| Feature | Description |
| --- | --- |
| Multi-format Output | XPT V5/V8, Dataset-XML, Define-XML 2.1 |
| Intelligent Mapping | Fuzzy matching for automatic column-to-variable mapping |
| CDISC Validation | Built-in controlled terminology validation |
| Cross-platform | Native GUI for macOS, Windows, and Linux |
| Offline Operation | All CDISC standards embedded locally |

Supported Standards

Currently Supported:

  • SDTM-IG v3.4
  • Controlled Terminology (2024-2025 versions)

Planned:

  • ADaM-IG v1.3
  • SEND-IG v3.1.1

Getting Help


License

Trial Submission Studio is open source software licensed under the MIT License.


Built with Rust and Iced

Installation

Download the latest release for your platform from our GitHub Releases page.

Download Options

| Platform | Architecture | Format |
| --- | --- | --- |
| macOS | Apple Silicon (M1/M2/M3+) | .dmg or .zip |
| macOS | Intel (x86_64) | .dmg or .zip |
| Windows | x86_64 (64-bit) | .zip |
| Windows | ARM64 | .zip |
| Linux | x86_64 (64-bit) | .tar.gz |

Verifying Your Download

Each release includes SHA256 checksum files (.sha256) for security verification.

macOS/Linux

# Download the checksum file and binary, then verify
shasum -a 256 -c trial-submission-studio-*.sha256

Windows (PowerShell)

# Compute the checksum, then compare it to the value in the .sha256 file
Get-FileHash trial-submission-studio-*.zip -Algorithm SHA256
Get-Content trial-submission-studio-*.zip.sha256
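
The same check can also be scripted portably. This is a minimal Python sketch, assuming the .sha256 file's first whitespace-separated field is the hex digest (the usual `shasum` output layout):

```python
import hashlib

def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Compute the SHA256 hex digest of a file, reading in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def verify(path: str, sha256_file: str) -> bool:
    """Compare a file's digest to the first field of its .sha256 file."""
    expected = open(sha256_file).read().split()[0].lower()
    return sha256_of(path) == expected
```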

Platform-Specific Instructions

macOS

  1. Download the .dmg file for your architecture
  2. Open the .dmg file
  3. Drag Trial Submission Studio to your Applications folder
  4. On first launch, you may need to right-click and select “Open” to bypass Gatekeeper

Tip

Which version do I need? Click the Apple menu > About This Mac:

  • Chip: Apple M1/M2/M3 → Download the Apple Silicon version
  • Processor: Intel → Download the Intel version

Windows

  1. Download the .zip file for your architecture
  2. Extract the archive to your preferred location
  3. Run trial-submission-studio.exe

Linux

  1. Download the .tar.gz file
  2. Extract: tar -xzf trial-submission-studio-*.tar.gz
  3. Run: ./trial-submission-studio

Uninstalling

Trial Submission Studio is a portable application that does not modify system settings or registry entries.

Windows

  1. Delete the extracted folder containing trial-submission-studio.exe
  2. Optionally delete settings from %APPDATA%\trial-submission-studio\

macOS

  1. Drag Trial Submission Studio from Applications to Trash
  2. Optionally delete settings from ~/Library/Application Support/trial-submission-studio/

Linux

  1. Delete the extracted folder containing trial-submission-studio
  2. Optionally delete settings from ~/.config/trial-submission-studio/

Next Steps

Quick Start Guide

Get up and running with Trial Submission Studio in 5 minutes.

Overview

This guide walks you through the basic workflow:

flowchart LR
    A["Import CSV"] --> B["Map Columns"]
    B --> C["Validate"]
    C --> D["Export"]

    style A fill:#4a90d9,color:#fff
    style D fill:#50c878,color:#fff

  1. Import your source CSV data
  2. Map columns to SDTM variables
  3. Validate against CDISC standards
  4. Export to XPT format

Step 1: Launch the Application

After installing Trial Submission Studio, launch the application:

  • macOS: Open from Applications folder
  • Windows: Run trial-submission-studio.exe
  • Linux: Run ./trial-submission-studio

You’ll see the welcome screen where you can select your CDISC standard:

Welcome Screen


Step 2: Import Your Data

  1. Click Open Study Folder and select your data folder
  2. Trial Submission Studio will automatically:
    • Detect column types
    • Identify potential SDTM domains
    • Parse date formats

Tip

Your data should have column headers in the first row.


Step 3: Review Discovered Domains

Trial Submission Studio automatically discovers domains from your source data:

Study Overview

  1. Review the list of discovered domains (DM, AE, VS, etc.)
  2. Click on a domain to configure its mappings

Step 4: Map Columns

  1. Review the suggested column mappings
  2. For each source column, select the corresponding SDTM variable
  3. Use the fuzzy matching suggestions to speed up mapping

Column Mapping

The mapping interface shows:

  • Source Column: Your CSV column name
  • Target Variable: The SDTM variable
  • Match Score: Confidence of the suggested mapping (e.g., 93% match)

Step 5: Validate

  1. Switch to the Validation tab to check your data against CDISC rules
  2. Review any validation messages:
    • Errors: Must be fixed before export
    • Warnings: Should be reviewed
    • Info: Informational messages

Validation Results

Each validation issue includes the rule ID, a description, and suggestions on how to fix it.


Step 6: Export

  1. Click Go to Export or navigate to the Export screen
  2. Select which domains to export
  3. Choose your output format:
    • XPT (SAS Transport) (FDA standard)
    • Dataset-XML (CDISC data exchange)
  4. Click Export

Export Settings


Next Steps

Now that you’ve completed the basic workflow:

System Requirements

Trial Submission Studio is designed to run on modern desktop systems with minimal resource requirements.

Supported Platforms

| Platform | Architecture | Minimum Version | Status |
| --- | --- | --- | --- |
| macOS | Apple Silicon (M1/M2/M3+) | macOS 11.0 (Big Sur) | Supported |
| macOS | Intel (x86_64) | macOS 10.15 (Catalina) | Supported |
| Windows | x86_64 (64-bit) | Windows 10 | Supported |
| Windows | ARM64 | Windows 11 | Supported |
| Linux | x86_64 (64-bit) | Ubuntu 20.04 or equivalent | Supported |

Hardware Requirements

| Component | Minimum | Recommended |
| --- | --- | --- |
| RAM | 4 GB | 8 GB+ |
| Disk Space | 200 MB | 500 MB |
| Display | 1280x720 | 1920x1080+ |

Software Dependencies

Trial Submission Studio is a standalone application with no external dependencies:

  • No SAS installation required
  • No Java runtime required
  • No internet connection required (works fully offline)
  • All CDISC standards are embedded in the application

Performance Considerations

Large Datasets

Trial Submission Studio can handle datasets with:

  • Hundreds of thousands of rows
  • Hundreds of columns

For very large datasets (1M+ rows), consider:

  • Ensuring adequate RAM (8GB+)
  • Using SSD storage for faster I/O
  • Processing data in batches if needed

Memory Usage

Memory usage scales with dataset size. Approximate guidelines:

  • Small datasets (<10,000 rows): ~100 MB RAM
  • Medium datasets (10,000-100,000 rows): ~500 MB RAM
  • Large datasets (100,000+ rows): 1+ GB RAM

Troubleshooting

macOS Gatekeeper

On first launch, macOS may block the application. To resolve:

  1. Right-click the application
  2. Select “Open”
  3. Click “Open” in the dialog

Linux Permissions

Ensure the executable has run permissions:

chmod +x trial-submission-studio

Windows SmartScreen

If Windows SmartScreen blocks the application:

  1. Click “More info”
  2. Click “Run anyway”

Next Steps

Building from Source

For developers who want to compile Trial Submission Studio from source code.

Prerequisites

Required

  • Rust 1.92+ - Install via rustup
  • Git - For cloning the repository

Platform-Specific Dependencies

macOS

No additional dependencies required.

Linux (Ubuntu/Debian)

sudo apt-get install libgtk-3-dev libxdo-dev

Windows

No additional dependencies required.

Clone the Repository

git clone https://github.com/rubentalstra/trial-submission-studio.git
cd trial-submission-studio

Verify Rust Version

rustup show

Ensure you have Rust 1.92 or higher. To update:

rustup update stable

Build

Debug Build (faster compilation)

cargo build

Release Build (optimized, slower compilation)

cargo build --release

Run

Debug

cargo run --package tss-gui

Release

cargo run --release --package tss-gui

Or run the compiled binary directly:

./target/release/tss-gui        # macOS/Linux
.\target\release\tss-gui.exe    # Windows

Run Tests

# All tests
cargo test

# Specific crate
cargo test --package tss-submit

# With output
cargo test -- --nocapture

Run Lints

# Format check
cargo fmt --check

# Clippy lints
cargo clippy -- -D warnings

Project Structure

Trial Submission Studio is organized as a 6-crate Rust workspace:

trial-submission-studio/
├── crates/
│   ├── tss-gui/            # Desktop application (Iced 0.14.0)
│   ├── tss-submit/         # Mapping, normalization, validation, export
│   ├── tss-ingest/         # CSV loading
│   ├── tss-standards/      # CDISC standards loader
│   ├── tss-updater/        # Auto-update functionality
│   └── tss-updater-helper/ # macOS update helper
├── standards/              # Embedded CDISC standards
├── mockdata/               # Test datasets
└── docs/                   # Documentation (this site)

Third-Party Licenses

When adding or updating dependencies, regenerate the licenses file:

# Install cargo-about (one-time)
cargo install cargo-about

# Generate licenses
cargo about generate about.hbs -o THIRD_PARTY_LICENSES.md

IDE Setup

RustRover / IntelliJ IDEA

  1. Open the project folder
  2. The Rust plugin will detect the workspace automatically

VS Code

  1. Install the rust-analyzer extension
  2. Open the project folder

Next Steps

Interface Overview

Trial Submission Studio features a clean, intuitive interface designed for clinical data programmers.

Welcome Screen

When you first launch the application, you’ll see the welcome screen where you can select your target CDISC standard and open a study folder:

Welcome Screen

Study Overview

After opening a study folder, Trial Submission Studio automatically discovers domains from your source data:

Study Overview

Main Window Layout

The application is organized into several key areas:

┌─────────────────────────────────────────────────────────────┐
│  Menu Bar                                                    │
├─────────────────────────────────────────────────────────────┤
│  Toolbar                                                     │
├──────────────────┬──────────────────────────────────────────┤
│                  │                                           │
│  Navigation      │  Main Content Area                        │
│  Panel           │                                           │
│                  │  - Data Preview                           │
│  - Import        │  - Mapping Interface                      │
│  - Mapping       │  - Validation Results                     │
│  - Validation    │  - Export Options                         │
│  - Export        │                                           │
│                  │                                           │
├──────────────────┴──────────────────────────────────────────┤
│  Status Bar                                                  │
└─────────────────────────────────────────────────────────────┘

File Menu

  • Import CSV - Load source data
  • Export - Save to XPT/XML formats
  • Recent Files - Quick access to recent projects
  • Exit - Close the application

Edit Menu

  • Undo/Redo - Reverse or repeat actions
  • Preferences - Application settings

Help Menu

  • Documentation - Open this documentation
  • About - Version and license information
  • Third-Party Licenses - Dependency attributions

About Dialog

Toolbar

Quick access to common actions:

  • Import - Load CSV file
  • Validate - Run validation checks
  • Export - Save output files

Navigation Panel

The left sidebar provides step-by-step workflow navigation:

  1. Import - Load and preview source data
  2. Domain - Select target SDTM domain
  3. Mapping - Map columns to variables
  4. Validation - Review validation results
  5. Export - Generate output files

Main Content Area

The central area displays context-sensitive content based on the current workflow step:

Import View

  • File selection
  • Data preview table
  • Column type detection
  • Schema information

Mapping View

  • Source columns list
  • Target variables list
  • Mapping connections
  • Match confidence scores

Validation View

  • Validation rule results
  • Error/warning/info messages
  • Affected rows and columns
  • Suggested fixes

Validation View

Preview View

Preview your SDTM-compliant data before export:

SDTM Preview

Export View

  • Format selection
  • Output options
  • File destination
  • Progress indicator

Status Bar

The bottom bar displays:

  • Current file name
  • Row/column counts
  • Validation status
  • Progress for long operations

Keyboard Shortcuts

| Action | macOS | Windows/Linux |
| --- | --- | --- |
| Import | ⌘O | Ctrl+O |
| Export | ⌘E | Ctrl+E |
| Validate | ⌘R | Ctrl+R |
| Undo | ⌘Z | Ctrl+Z |
| Redo | ⌘⇧Z | Ctrl+Shift+Z |
| Preferences | ⌘, | Ctrl+, |
| Quit | ⌘Q | Alt+F4 |

Themes

Trial Submission Studio supports light and dark themes. Change via: Edit → Preferences → Appearance

Next Steps

Importing Data

Trial Submission Studio accepts CSV files as input and automatically detects schema information.

Supported Input Format

Currently, Trial Submission Studio supports:

  • CSV files (.csv)
  • UTF-8 or ASCII encoding
  • Comma-separated values
  • Headers in first row

Import Methods

Drag and Drop

Simply drag a CSV file from your file manager and drop it onto the application window.

File Menu

  1. Click File → Import CSV
  2. Navigate to your file
  3. Click Open

Toolbar Button

Click the Import button in the toolbar.

Automatic Detection

When you import a file, Trial Submission Studio automatically:

Column Type Detection

Analyzes sample values to determine:

  • Numeric - Integer or floating-point numbers
  • Date/Time - Various date formats
  • Text - Character strings
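
Detection along these lines can be sketched in Python. This is an illustration of the idea, not the application's actual Rust implementation: try numeric parsing first, then a few known date patterns, and fall back to text:

```python
from datetime import datetime

# Illustrative pattern list only; the real detector handles more formats.
DATE_FORMATS = ("%Y-%m-%d", "%m/%d/%Y", "%d-%m-%Y", "%Y-%m-%dT%H:%M:%S")

def _parses_as(value, fmt):
    try:
        datetime.strptime(value, fmt)
        return True
    except ValueError:
        return False

def _is_number(value):
    try:
        float(value)
        return True
    except ValueError:
        return False

def infer_type(samples):
    """Classify a column as 'Numeric', 'Date/Time', or 'Text' from sample values."""
    values = [s for s in samples if s.strip()]  # ignore blanks
    if not values:
        return "Text"
    if all(_is_number(v) for v in values):
        return "Numeric"
    if all(any(_parses_as(v, f) for f in DATE_FORMATS) for v in values):
        return "Date/Time"
    return "Text"
```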

Domain Suggestion

Based on column names, suggests likely SDTM domains:

  • USUBJID, AGE, SEX → Demographics (DM)
  • AETERM, AESTDTC → Adverse Events (AE)
  • VSTESTCD, VSSTRESN → Vital Signs (VS)
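
Conceptually this is a keyword-overlap lookup. A minimal sketch, with a hypothetical hint table far smaller than the real SDTM metadata the application ships:

```python
# Hypothetical hint sets keyed by domain; illustrative only.
DOMAIN_HINTS = {
    "DM": {"USUBJID", "AGE", "SEX", "RACE", "ETHNIC"},
    "AE": {"AETERM", "AESTDTC", "AEENDTC", "AESEV"},
    "VS": {"VSTESTCD", "VSSTRESN", "VSSTRESU", "VSDTC"},
}

def suggest_domain(columns):
    """Return the domain whose hint set overlaps the column headers most."""
    headers = {c.upper() for c in columns}
    best, hits = None, 0
    for domain, hints in DOMAIN_HINTS.items():
        n = len(headers & hints)
        if n > hits:
            best, hits = domain, n
    return best
```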

Date Format Detection

Automatically recognizes common date formats:

  • ISO 8601: 2024-01-15
  • US format: 01/15/2024
  • EU format: 15-01-2024
  • With time: 2024-01-15T09:30:00
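
A detection routine for these patterns can be sketched with `strptime`. The format list and its ordering are illustrative; genuinely ambiguous values (e.g. 01/02/2024) need more context than a single string provides:

```python
from datetime import datetime

# Tried in order; the first format that parses wins.
FORMATS = [
    ("ISO 8601 with time", "%Y-%m-%dT%H:%M:%S"),
    ("ISO 8601", "%Y-%m-%d"),
    ("US format", "%m/%d/%Y"),
    ("EU format", "%d-%m-%Y"),
]

def detect_date_format(value):
    """Return the name of the first matching format, or None."""
    for name, fmt in FORMATS:
        try:
            datetime.strptime(value, fmt)
            return name
        except ValueError:
            pass
    return None
```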

Data Preview

After import, you’ll see:

Data Grid

  • First 100 rows displayed
  • Scroll to view more data
  • Column headers with detected types

Summary Panel

  • Total row count
  • Total column count
  • File size
  • Encoding detected

Column Information

  • Column name
  • Detected type
  • Sample values
  • Null count

Handling Issues

Encoding Problems

If you see garbled characters:

  1. Ensure your file is UTF-8 encoded
  2. Re-save from your source application with UTF-8 encoding

Missing Headers

If your CSV lacks headers:

  1. Add a header row to your file
  2. Re-import

Large Files

For files with millions of rows:

  • Import may take longer
  • A progress indicator will show status
  • Consider splitting into smaller files if needed

Best Practices

  1. Clean your data before import

    • Remove trailing whitespace
    • Standardize date formats
    • Check for encoding issues
  2. Use descriptive column names

    • Helps with automatic mapping suggestions
    • Use SDTM-like naming when possible
  3. Include all required data

    • USUBJID for subject identification
    • Domain-specific required variables

Next Steps

Column Mapping

The mapping interface helps you connect your source CSV columns to SDTM variables.

Mapping Interface

Overview

Column mapping is a critical step that defines how your source data transforms into SDTM-compliant output.

flowchart LR
    subgraph Source[Source CSV]
        S1[SUBJ_ID]
        S2[PATIENT_AGE]
        S3[GENDER]
        S4[VISIT_DATE]
    end

    subgraph Mapping[Fuzzy Matching]
        M[Match<br/>Algorithm]
    end

    subgraph Target[SDTM Variables]
        T1[USUBJID]
        T2[AGE]
        T3[SEX]
        T4[RFSTDTC]
    end

    S1 --> M --> T1
    S2 --> M --> T2
    S3 --> M --> T3
    S4 --> M --> T4
    style M fill: #4a90d9, color: #fff

The Mapping Interface

┌─────────────────────────────────────────────────────────────┐
│ Source Columns          │  Target Variables                 │
├─────────────────────────┼───────────────────────────────────┤
│ SUBJ_ID         ────────│──▶  USUBJID                       │
│ PATIENT_AGE     ────────│──▶  AGE                           │
│ GENDER          ────────│──▶  SEX                           │
│ VISIT_DATE      ────────│──▶  RFSTDTC                       │
│ RACE_DESC       ────────│──▶  RACE                          │
│ [Unmapped]              │     ETHNIC (Required)             │
└─────────────────────────┴───────────────────────────────────┘

Automatic Mapping

Trial Submission Studio uses fuzzy matching to suggest mappings:

How It Works

  1. Analyzes source column names
  2. Compares against SDTM variable names
  3. Calculates similarity scores
  4. Suggests best matches

Match Confidence

  • High (>80%) - Strong name similarity, auto-accepted
  • Medium (50-80%) - Review recommended
  • Low (<50%) - Manual mapping needed
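
A rough sketch of this kind of scoring, using Python's difflib rather than whatever matching algorithm the application actually employs:

```python
from difflib import SequenceMatcher

def match_score(source: str, target: str) -> float:
    """Similarity between a source column and an SDTM variable name, 0-100."""
    return 100 * SequenceMatcher(None, source.upper(), target.upper()).ratio()

def classify(score: float) -> str:
    """Bucket a score into the confidence bands described above."""
    if score > 80:
        return "High"
    if score >= 50:
        return "Medium"
    return "Low"

def best_match(source, candidates):
    """Pick the candidate SDTM variable with the highest score."""
    return max(candidates, key=lambda v: match_score(source, v))
```

Exact scores will differ from the table below, since the real matcher is not difflib.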

Example Matches

| Source Column | Suggested Variable | Confidence |
| --- | --- | --- |
| SUBJECT_ID | USUBJID | 85% |
| AGE | AGE | 100% |
| GENDER | SEX | 75% |
| VSTESTVAL | VSSTRESN | 70% |

Manual Mapping

To Map a Column

  1. Click on the source column
  2. Click on the target variable
  3. A connection line appears

To Unmap a Column

  1. Click on the connection line
  2. Or right-click and select “Remove Mapping”

To Change a Mapping

  1. Remove the existing mapping
  2. Create a new mapping

Required vs Optional Variables

Required Variables

Shown with a red indicator. Must be mapped for valid output:

  • STUDYID - Study identifier
  • DOMAIN - Domain abbreviation
  • USUBJID - Unique subject identifier

Optional Variables

Shown without indicator. Map if data is available.

Expected Variables

Shown with yellow indicator. Expected for the domain but not strictly required.

Data Type Considerations

The mapping interface warns about type mismatches:

| Warning | Description |
| --- | --- |
| Type Mismatch | Source is text, target is numeric |
| Length Exceeded | Source values exceed SDTM length limits |
| Format Warning | Date format needs conversion |

Controlled Terminology

For variables with controlled terminology:

  • The interface shows valid values
  • Warns if source values don’t match
  • Suggests value mappings

CT Normalization

The Transform tab allows you to normalize values to CDISC Controlled Terminology:

CT Normalization

Values are automatically transformed to their standardized form (e.g., “Years” → “YEARS”).
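
At its core this normalization is a case-insensitive lookup against the codelist. A minimal sketch with a hypothetical age-unit codelist (the application ships the full CDISC CT files):

```python
# Hypothetical mini-codelist; illustrative only.
AGEU_TERMS = {"YEARS", "MONTHS", "WEEKS", "DAYS", "HOURS"}

def normalize_ct(value: str, terms=AGEU_TERMS):
    """Return the standardized term if the value matches one case-insensitively,
    or None when the value is not in the codelist."""
    upper = value.strip().upper()
    return upper if upper in terms else None
```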

Supplemental Qualifiers (SUPP)

For non-standard variables that need to be captured as supplemental qualifiers, use the SUPP tab:

SUPP Configuration

Configure QNAM, QLABEL, QORIG, and QEVAL for each supplemental qualifier variable.
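
The resulting SUPP-- records follow a fixed structure: one row per subject record carrying the qualifier value. A sketch of the pivot, with illustrative QORIG/QEVAL defaults:

```python
def build_supp_rows(domain, records, qnam, source_col, qlabel,
                    qorig="CRF", qeval=""):
    """Pivot one non-standard column into SUPP-- rows.
    Skips records where the source value is blank; defaults are illustrative."""
    rows = []
    for rec in records:
        value = rec.get(source_col)
        if value in (None, ""):
            continue
        rows.append({
            "STUDYID": rec["STUDYID"],
            "RDOMAIN": domain,     # parent domain, e.g. "DM"
            "USUBJID": rec["USUBJID"],
            "QNAM": qnam,
            "QLABEL": qlabel,
            "QVAL": value,
            "QORIG": qorig,
            "QEVAL": qeval,
        })
    return rows
```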

Mapping Templates

Save a Template

  1. Complete your mappings
  2. File → Save Mapping Template
  3. Name your template

Load a Template

  1. Import your data
  2. File → Load Mapping Template
  3. Select the template
  4. Review and adjust as needed

Best Practices

  1. Review all automatic mappings - Don’t blindly accept
  2. Map required variables first - Ensure compliance
  3. Check controlled terminology - Validate allowed values
  4. Save templates - Reuse for similar datasets

Next Steps

Validation

Trial Submission Studio validates your data against CDISC standards before export.

Validation Results

Validation Overview

flowchart LR
    subgraph Input
        DATA[Mapped Data]
    end

    subgraph Checks
        STRUCT[Structure<br/>Required variables]
        CT[Terminology<br/>Codelist values]
        CROSS[Cross-Domain<br/>Consistency]
    end

    subgraph Output
        ERR[Errors]
        WARN[Warnings]
        INFO[Info]
    end

    DATA --> STRUCT --> CT --> CROSS
    STRUCT --> ERR
    CT --> WARN
    CROSS --> INFO
    style ERR fill: #f8d7da, stroke: #721c24
    style WARN fill: #fff3cd, stroke: #856404
    style INFO fill: #d1ecf1, stroke: #0c5460

Validation checks ensure your data:

  • Conforms to SDTM structure
  • Uses correct controlled terminology
  • Meets FDA submission requirements

Running Validation

Automatic Validation

Validation runs automatically when you:

  • Complete column mapping
  • Make changes to mappings
  • Prepare for export

Manual Validation

Click Validate in the toolbar or press Ctrl+R (⌘R on macOS).

Validation Results

Result Categories

| Category | Icon | Description |
| --- | --- | --- |
| Error | Red | Must be fixed before export |
| Warning | Yellow | Should be reviewed |
| Info | Blue | Informational, no action required |

Results Panel

┌─────────────────────────────────────────────────────────────┐
│ Validation Results                           [✓] [⚠] [ℹ]   │
├─────────────────────────────────────────────────────────────┤
│ ❌ SD0001: USUBJID is required but not mapped               │
│    Rows affected: All                                        │
│    Fix: Map a column to USUBJID                             │
├─────────────────────────────────────────────────────────────┤
│ ⚠️ CT0015: Value "M" not in SEX codelist                    │
│    Rows affected: 45, 67, 89                                │
│    Expected: MALE, FEMALE, UNKNOWN                          │
├─────────────────────────────────────────────────────────────┤
│ ℹ️ INFO: 1250 rows will be exported                         │
└─────────────────────────────────────────────────────────────┘

Validation Rules

Structural Rules

| Rule ID | Description |
| --- | --- |
| SD0001 | Required variable missing |
| SD0002 | Invalid variable name |
| SD0003 | Variable length exceeded |
| SD0004 | Invalid data type |

Controlled Terminology Rules

| Rule ID | Description |
| --- | --- |
| CT0001 | Value not in codelist |
| CT0002 | Codelist not found |
| CT0003 | Invalid date format |

Cross-Domain Rules

| Rule ID | Description |
| --- | --- |
| XD0001 | USUBJID not consistent |
| XD0002 | Missing parent record |
| XD0003 | Duplicate keys |

Fixing Validation Errors

Mapping Errors

  1. Click on the error message
  2. The relevant mapping is highlighted
  3. Adjust the mapping or source data

Data Errors

  1. Note the affected rows
  2. Correct the source data
  3. Re-import and re-validate

Terminology Errors

  1. Review the expected values
  2. Map source values to controlled terms
  3. Or update source data to use standard terms

Controlled Terminology Validation

Supported Codelists

Trial Submission Studio includes embedded controlled terminology:

  • CDISC CT 2025-09-26 (latest)
  • CDISC CT 2025-03-28
  • CDISC CT 2024-03-29

Codelist Validation

For variables like SEX, RACE, COUNTRY:

  • Source values are checked against valid terms
  • Invalid values are flagged
  • Suggestions for correct values are provided

Validation Reports

Export Validation Report

  1. Complete validation
  2. File → Export Validation Report
  3. Choose format (PDF, HTML, CSV)
  4. Save the report

Report Contents

  • Summary statistics
  • All validation messages
  • Affected data rows
  • Recommendations

Best Practices

  1. Validate early and often - Fix issues as you go
  2. Address errors first - Then warnings
  3. Document exceptions - If warnings are intentional
  4. Keep validation reports - For audit trails

Next Steps

Exporting Data

After mapping and validation, export your data to CDISC-compliant formats.

Export Dialog

Export Formats

Trial Submission Studio supports multiple output formats:

| Format | Version | Description | Use Case |
| --- | --- | --- | --- |
| XPT | V5 | SAS Transport (FDA standard) | FDA submissions |
| XPT | V8 | Extended SAS Transport | Longer names/labels |
| Dataset-XML | 1.0 | CDISC XML format | Data exchange |
| Define-XML | 2.1 | Metadata documentation | Submission package |

XPT Export

XPT Version 5 (Default)

The FDA standard format with these constraints:

  • Variable names: 8 characters max
  • Labels: 40 characters max
  • Compatible with SAS V5 Transport
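
A pre-export check for these limits is straightforward to sketch. This illustration takes (name, label) pairs; V5 also limits character values to 200 bytes, which it does not check:

```python
def check_xpt_v5(variables):
    """Flag variables that violate SAS V5 Transport name/label limits.
    `variables` is a list of (name, label) pairs."""
    issues = []
    for name, label in variables:
        if len(name) > 8:
            issues.append(f"{name}: name exceeds 8 characters")
        if len(label) > 40:
            issues.append(f"{name}: label exceeds 40 characters")
    return issues
```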

XPT Version 8

Extended format supporting:

  • Variable names: 32 characters
  • Labels: 256 characters
  • Note: Not all systems support V8

Export Steps

  1. Click Export in the toolbar
  2. Select XPT V5 or XPT V8
  3. Choose output location
  4. Click Save

XPT Options

| Option | Description |
| --- | --- |
| Include all variables | Export mapped and derived variables |
| Sort by keys | Order rows by key variables |
| Compress | Reduce file size |

Dataset-XML Export

CDISC ODM-based XML format for data exchange.

Features

  • Human-readable format
  • Full Unicode support
  • Metadata included
  • Schema validation

Export Steps

  1. Click Export
  2. Select Dataset-XML
  3. Configure options
  4. Click Save

Define-XML Export

Generate submission metadata documentation.

Define-XML 2.1

  • Dataset definitions
  • Variable metadata
  • Controlled terminology
  • Computational methods
  • Value-level metadata

Export Steps

  1. Click Export
  2. Select Define-XML
  3. Review metadata
  4. Click Save

Batch Export

Export multiple domains at once:

  1. File → Batch Export
  2. Select domains to export
  3. Choose format(s)
  4. Set output directory
  5. Click Export All

Export Validation

Before export completes, the system verifies:

  • All required variables are present
  • Data types are correct
  • Lengths don’t exceed limits
  • Controlled terms are valid

Output Files

File Naming

Default naming convention:

  • {domain}.xpt - e.g., dm.xpt, ae.xpt
  • {domain}.xml - for Dataset-XML
  • define.xml - for Define-XML

Checksums

Each export generates:

  • SHA256 checksum file (.sha256)
  • Useful for submission verification

Quality Checks

Post-Export Verification

  1. Open the exported file in a viewer
  2. Verify row counts match
  3. Check variable order
  4. Review sample values

External Validation

Consider validating with:

  • Pinnacle 21 Community
  • SAS (if available)
  • Other CDISC validators

Best Practices

  1. Validate before export - Fix all errors first
  2. Use XPT V5 for FDA - Standard format
  3. Generate checksums - For integrity verification
  4. Test with validators - Confirm compliance
  5. Keep source files - Maintain audit trail

Troubleshooting

Export Fails

| Issue | Solution |
| --- | --- |
| Validation errors | Fix errors before export |
| Disk full | Free up space |
| Permission denied | Check write permissions |
| File in use | Close file in other apps |

Output Issues

| Issue | Solution |
| --- | --- |
| Truncated values | Check length limits |
| Missing data | Verify mappings |
| Wrong encoding | Ensure UTF-8 source |

Next Steps

Common Workflows

Step-by-step guides for typical Trial Submission Studio use cases.

Workflow Overview

flowchart LR
    subgraph "1. Import"
        A[Load CSV]
    end

    subgraph "2. Configure"
        B[Select Domain]
        C[Map Columns]
    end

    subgraph "3. Quality"
        D[Handle CT]
        E[Validate]
    end

    subgraph "4. Output"
        F[Export XPT]
    end

    A --> B --> C --> D --> E --> F
    E -.->|Fix Issues| C
    style A fill: #e8f4f8, stroke: #333
    style F fill: #d4edda, stroke: #333

Workflow 1: Demographics (DM) Domain

Transform demographics source data to SDTM DM domain.

Source Data Example

SUBJECT_ID,AGE,SEX,RACE,ETHNIC,COUNTRY,SITE_ID
SUBJ001,45,Male,WHITE,NOT HISPANIC,USA,101
SUBJ002,38,Female,ASIAN,NOT HISPANIC,USA,102
SUBJ003,52,Male,BLACK,HISPANIC,USA,101

Steps

  1. Import the CSV

    • File → Import CSV
    • Select your demographics file
  2. Select DM Domain

    • Click on “Domain Selection”
    • Choose “DM - Demographics”
  3. Map Columns

    | Source | Target | Notes |
    | --- | --- | --- |
    | SUBJECT_ID | USUBJID | Subject identifier |
    | AGE | AGE | Age in years |
    | SEX | SEX | Maps to controlled terminology |
    | RACE | RACE | Controlled terminology |
    | ETHNIC | ETHNIC | Controlled terminology |
    | COUNTRY | COUNTRY | ISO 3166 codes |
    | SITE_ID | SITEID | Site identifier |
  4. Handle Controlled Terminology

    • “Male” → “M” (or keep if using extensible CT)
    • “Female” → “F”
    • Review RACE and ETHNIC values
  5. Validate

    • Click Validate
    • Address any errors
  6. Export

    • Export → XPT V5
    • Save as dm.xpt
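
The renames and recodes in this workflow amount to a simple column transformation. A standard-library Python sketch of the same idea (the rename table and SEX recoding shown are the bare minimum, and STUDY01 is a placeholder study ID):

```python
import csv
import io

# Column renames from the mapping table above; illustrative subset only.
RENAMES = {"SUBJECT_ID": "USUBJID", "SITE_ID": "SITEID"}
SEX_MAP = {"Male": "M", "Female": "F"}

def to_dm_rows(csv_text, studyid="STUDY01"):
    """Read source CSV text, rename columns, recode SEX, add STUDYID/DOMAIN."""
    rows = []
    for rec in csv.DictReader(io.StringIO(csv_text)):
        out = {RENAMES.get(k, k): v for k, v in rec.items()}
        out["SEX"] = SEX_MAP.get(out.get("SEX", ""), out.get("SEX", ""))
        out.update(STUDYID=studyid, DOMAIN="DM")
        rows.append(out)
    return rows
```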

Workflow 2: Adverse Events (AE) Domain

Transform adverse event data to SDTM AE domain.

Source Data Example

SUBJECT_ID,AE_TERM,START_DATE,END_DATE,SEVERITY,SERIOUS
SUBJ001,Headache,2024-01-15,2024-01-17,MILD,N
SUBJ001,Nausea,2024-02-01,,MODERATE,N
SUBJ002,Rash,2024-01-20,2024-01-25,SEVERE,Y

Steps

  1. Import CSV

  2. Select AE Domain

  3. Map Columns

    | Source | Target | Notes |
    | --- | --- | --- |
    | SUBJECT_ID | USUBJID | |
    | AE_TERM | AETERM | Verbatim term |
    | START_DATE | AESTDTC | Start date |
    | END_DATE | AEENDTC | End date (can be blank) |
    | SEVERITY | AESEV | Controlled terminology |
    | SERIOUS | AESER | Y/N |
  4. Derive Required Variables

    • AESEQ (sequence number) - auto-generated
    • AEDECOD (dictionary term) - if available
  5. Validate and Export
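
AESEQ is essentially a per-subject counter. A sketch assuming records are already in the desired order (real derivations usually sort by start date first):

```python
def derive_aeseq(records):
    """Assign AESEQ: a 1-based counter per USUBJID, in input order."""
    counts = {}
    for rec in records:
        subj = rec["USUBJID"]
        counts[subj] = counts.get(subj, 0) + 1
        rec["AESEQ"] = counts[subj]
    return records
```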


Workflow 3: Vital Signs (VS) Domain

Transform vital signs measurements to SDTM VS domain.

Source Data Example

SUBJECT_ID,VISIT,TEST,RESULT,UNIT,DATE
SUBJ001,BASELINE,SYSBP,120,mmHg,2024-01-10
SUBJ001,BASELINE,DIABP,80,mmHg,2024-01-10
SUBJ001,WEEK 4,SYSBP,118,mmHg,2024-02-07

Steps

  1. Import CSV

  2. Select VS Domain

  3. Map Columns

    | Source | Target | Notes |
    | --- | --- | --- |
    | SUBJECT_ID | USUBJID | |
    | VISIT | VISIT | Visit name |
    | TEST | VSTESTCD | Test code |
    | RESULT | VSSTRESN | Numeric result |
    | UNIT | VSSTRESU | Result unit |
    | DATE | VSDTC | Collection date |
  4. Map Test Codes

    • SYSBP → Systolic Blood Pressure
    • DIABP → Diastolic Blood Pressure
  5. Validate and Export


Workflow 4: Batch Processing

Process multiple domains from one source file.

Source Data

A comprehensive dataset with columns for multiple domains.

Steps

  1. Import the source file
  2. Process each domain
    • Filter relevant columns
    • Map to domain variables
    • Validate
  3. Batch Export
    • File → Batch Export
    • Select all processed domains
    • Export to output folder

Workflow 5: Re-processing with Template

Use a saved mapping template for similar data.

Steps

  1. First Time Setup

    • Import sample data
    • Create mappings
    • Save template: File → Save Mapping Template
  2. Subsequent Processing

    • Import new data (same structure)
    • Load template: File → Load Mapping Template
    • Review and adjust if needed
    • Validate and export

Tips for All Workflows

Before You Start

  • Review source data quality
  • Identify required variables
  • Prepare controlled terminology mappings

During Processing

  • Validate after each major step
  • Document any decisions
  • Keep notes on exceptions

After Export

  • Verify output files
  • Run external validation
  • Archive source and output files

Next Steps

Troubleshooting

Common issues and their solutions when using Trial Submission Studio.

Import Issues

File Won’t Import

| Symptom | Cause | Solution |
| --- | --- | --- |
| “Invalid file format” | Not a CSV file | Ensure file is CSV format |
| “Encoding error” | Non-UTF8 encoding | Re-save as UTF-8 |
| “No data found” | Empty file or wrong delimiter | Check file contents |
| “Parse error” | Malformed CSV | Fix CSV structure |

Data Appears Garbled

Cause: Encoding mismatch

Solution:

  1. Open the file in a text editor
  2. Save with UTF-8 encoding
  3. Re-import

Missing Columns

Cause: Header row issues

Solution:

  1. Verify first row contains headers
  2. Check for BOM (byte order mark) issues
  3. Remove hidden characters

Mapping Issues

No Suggested Mappings

Cause: Column names don’t match SDTM variables

Solution:

  1. Manually map columns
  2. Consider renaming source columns
  3. Create a mapping template for reuse

Wrong Automatic Mappings

Cause: Fuzzy matching misidentified variables

Solution:

  1. Review all automatic mappings
  2. Manually correct incorrect mappings
  3. Adjust match confidence threshold in settings

Can’t Map Required Variable

Cause: Source data missing required information

Solution:

  1. Add the missing data to source file
  2. Derive from other columns if possible
  3. Consult with data manager

Validation Issues

Too Many Errors

Cause: Data quality issues or incorrect mappings

Solution:

  1. Address errors in priority order
  2. Fix mapping issues first
  3. Clean source data if needed
  4. Re-validate after each fix

Controlled Terminology Errors

Cause: Values don’t match CDISC CT

Solution:

  1. Review expected values in the error message
  2. Map source values to standard terms
  3. Update source data if appropriate

Date Format Errors

Cause: Non-ISO date formats

Solution:

  1. Convert dates to ISO 8601 format (YYYY-MM-DD)
  2. Or use partial dates where appropriate (YYYY-MM, YYYY)
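For step 1, a date column in a known non-ISO format can be converted with Python's standard library. The US-style MM/DD/YYYY source format below is an assumption to adjust:

```python
from datetime import datetime

def to_iso8601(value, source_format="%m/%d/%Y"):
    """Convert a date string from a known source format to ISO 8601 (YYYY-MM-DD).

    `source_format` is an assumption -- match it to your source data.
    """
    return datetime.strptime(value, source_format).strftime("%Y-%m-%d")
```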

Export Issues

Export Fails

| Error | Cause | Solution |
|---|---|---|
| “Validation errors exist” | Unresolved errors | Fix all errors first |
| “Permission denied” | No write access | Check folder permissions |
| “Disk full” | Insufficient space | Free up disk space |
| “File in use” | File open elsewhere | Close file in other apps |

Truncated Data in XPT

Cause: Values exceed XPT limits

Solution:

  1. Note that XPT V5 limits character values to 200 characters
  2. Check variable lengths before export
  3. Use XPT V8 if longer values are required
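Checking variable lengths can be automated before export. This illustrative Python helper (not a built-in feature of the application) reports columns whose longest value exceeds the V5 limit:

```python
def check_xpt_lengths(rows, max_len=200):
    """Given rows as dicts (e.g. from csv.DictReader), return a dict of
    {column: longest observed length} for columns exceeding `max_len`."""
    longest = {}
    for row in rows:
        for col, value in row.items():
            if value is not None:
                longest[col] = max(longest.get(col, 0), len(str(value)))
    return {col: n for col, n in longest.items() if n > max_len}
```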

Missing Variables in Output

Cause: Variables not mapped or derived

Solution:

  1. Verify all required mappings
  2. Check if derived variables were created
  3. Review export settings

Performance Issues

Slow Import

Cause: Large file size

Solution:

  1. Allow time for large files
  2. Consider splitting into smaller files
  3. Close other applications
  4. Increase available RAM

Application Freezes

Cause: Processing large datasets

Solution:

  1. Wait for operation to complete
  2. Check progress indicator
  3. If unresponsive after 5+ minutes, restart
  4. Process smaller datasets

High Memory Usage

Cause: Large dataset in memory

Solution:

  1. Close unused files
  2. Process one domain at a time
  3. Restart application to free memory

Application Issues

Application Won’t Start

macOS:

```sh
# If blocked by Gatekeeper
xattr -d com.apple.quarantine /Applications/Trial\ Submission\ Studio.app
```

Linux:

```sh
# Ensure executable permission
chmod +x trial-submission-studio
```

Windows:

  • Run as administrator
  • Check antivirus isn’t blocking

Crashes on Startup

Solution:

  1. Delete configuration files:
    • macOS: ~/Library/Application Support/trial-submission-studio/
    • Windows: %APPDATA%\trial-submission-studio\
    • Linux: ~/.config/trial-submission-studio/
  2. Reinstall the application

Settings Not Saved

Cause: Permission issues

Solution:

  1. Ensure write access to config directory
  2. Run application with appropriate permissions

Getting Help

Collect Information

Before reporting an issue, gather:

  1. Application version (Help → About)
  2. Operating system and version
  3. Steps to reproduce
  4. Error messages (screenshots)
  5. Sample data (anonymized)

Report an Issue

  1. Check existing issues
  2. Create a new issue
  3. Include collected information

Community Support


Quick Reference

Keyboard Shortcuts for Recovery

| Action | Windows/Linux | macOS |
|---|---|---|
| Force quit | Alt+F4 | ⌘Q |
| Cancel operation | Esc | Esc |
| Undo | Ctrl+Z | ⌘Z |

Log Files

Application logs are located at:

  • macOS: ~/Library/Logs/trial-submission-studio/
  • Windows: %LOCALAPPDATA%\trial-submission-studio\logs\
  • Linux: ~/.local/share/trial-submission-studio/logs/

Include relevant log excerpts when reporting issues.

CDISC Standards Overview

Trial Submission Studio supports CDISC (Clinical Data Interchange Standards Consortium) standards for regulatory submissions.

What is CDISC?

CDISC develops global data standards that streamline clinical research and enable connections to healthcare. These standards are required by regulatory agencies including the FDA and PMDA.

Standards Hierarchy

```mermaid
flowchart TD
    CDISC[CDISC Standards]
    CDISC --> SDTM[SDTM]
    CDISC --> ADAM[ADaM]
    CDISC --> SEND[SEND]
    CDISC --> CT[Controlled Terminology]

    SDTM --> IG34[IG v3.4]
    ADAM --> IG13[IG v1.3]
    SEND --> IG311[IG v3.1.1]

    style CDISC fill:#4a90d9,color:#fff
    style SDTM fill:#50c878,color:#fff
    style ADAM fill:#f5a623,color:#fff
    style SEND fill:#9b59b6,color:#fff
```

Supported Standards

Currently Implemented

| Standard | Version | Status |
|---|---|---|
| SDTM-IG | 3.4 | Supported |
| Controlled Terminology | 2024-2025 | Supported |

Planned Support

| Standard | Version | Status |
|---|---|---|
| ADaM-IG | 1.3 | Planned |
| SEND-IG | 3.1.1 | Planned |

SDTM (Study Data Tabulation Model)

SDTM is the standard structure for submitting study data to regulatory authorities.

Key Concepts

  • Domains: Logical groupings of data (e.g., Demographics, Adverse Events)
  • Variables: Individual data elements within domains
  • Controlled Terminology: Standardized values for specific variables

Learn More

Controlled Terminology

CDISC Controlled Terminology (CT) provides standardized values for SDTM variables.

Embedded Versions

Trial Submission Studio includes the following CT packages:

  • CDISC CT 2025-09-26 (latest)
  • CDISC CT 2025-03-28
  • CDISC CT 2024-03-29

Learn More

ADaM (Analysis Data Model)

ADaM is the standard for analysis-ready datasets derived from SDTM.

Note

ADaM support is planned for a future release.

SEND (Standard for Exchange of Nonclinical Data)

SEND is SDTM for nonclinical (animal) studies.

Note

SEND support is planned for a future release.

FDA Requirements

Electronic Submissions

The FDA requires CDISC standards for:

  • New Drug Applications (NDA)
  • Biologics License Applications (BLA)
  • Abbreviated New Drug Applications (ANDA)

Study Data Technical Conformance Guide

Trial Submission Studio aligns with FDA’s Study Data Technical Conformance Guide requirements:

  • XPT V5 format
  • Define-XML 2.1
  • Controlled Terminology validation

Resources

Official CDISC Resources

FDA Resources

Next Steps

SDTM Introduction

The Study Data Tabulation Model (SDTM) is the standard for organizing and formatting human clinical trial data for submission to regulatory authorities.

Purpose

SDTM provides:

  • Consistent structure for clinical trial data
  • Standardized naming conventions
  • Regulatory compliance with FDA requirements
  • Interoperability between systems and organizations

Key Concepts

Domains

SDTM organizes data into domains - logical groupings of related observations:

| Category | Examples |
|---|---|
| Special Purpose | DM (Demographics), CO (Comments), SE (Subject Elements), SV (Subject Visits) |
| Interventions | CM (Concomitant Meds), EX (Exposure), SU (Substance Use) |
| Events | AE (Adverse Events), DS (Disposition), MH (Medical History) |
| Findings | LB (Labs), VS (Vital Signs), EG (ECG), PE (Physical Exam) |

Variables

Each domain contains variables - individual data elements:

| Type | Description | Examples |
|---|---|---|
| Identifier | Subject/study identification | STUDYID, USUBJID, DOMAIN |
| Topic | Focus of the observation | AETERM, VSTEST, LBTEST |
| Timing | When observation occurred | AESTDTC, VSDTC, VISITNUM |
| Qualifier | Additional context | AESEV, VSPOS, LBORRES |

Controlled Terminology

Many variables require values from controlled terminology (CT):

  • Standardized value lists
  • Ensures consistency across studies
  • Required for regulatory submissions

SDTM Structure

```mermaid
flowchart TB
    subgraph "SDTM Domain Classes"
        direction TB
        SP[Special Purpose<br/>DM, CO, SE, SV]
        INT[Interventions<br/>CM, EX, SU]
        EVT[Events<br/>AE, DS, MH]
        FIND[Findings<br/>LB, VS, EG, PE]
    end

    subgraph "Variable Types"
        ID[Identifiers<br/>STUDYID, USUBJID]
        TOPIC[Topic Variables<br/>--TERM, --TEST]
        TIMING[Timing Variables<br/>--STDTC, --ENDTC]
        QUAL[Qualifiers<br/>--SEV, --RES]
    end

    SP --> ID
    INT --> ID
    EVT --> ID
    FIND --> ID
    ID --> TOPIC
    TOPIC --> TIMING
    TIMING --> QUAL
    style SP fill:#4a90d9,color:#fff
    style INT fill:#50c878,color:#fff
    style EVT fill:#f5a623,color:#fff
    style FIND fill:#9b59b6,color:#fff
```

General Observation Classes

  1. Interventions: Treatments applied to subjects
  2. Events: Occurrences during study participation
  3. Findings: Observations and test results

Variable Roles

| Role | Purpose | Example |
|---|---|---|
| Identifier | Link records across domains | USUBJID |
| Topic | Describe the observation | AETERM |
| Timing | Capture when | AESTDTC |
| Qualifier | Provide context | AESEV |
| Rule | Link to analysis rules | (via Define-XML) |

Working with SDTM in Trial Submission Studio

Import Flow

  1. Load source CSV data
  2. Select target SDTM domain
  3. Map source columns to SDTM variables
  4. Handle controlled terminology
  5. Validate against SDTM rules
  6. Export to XPT format

Variable Requirements

  • Required: Must be present and populated
  • Expected: Should be present if applicable
  • Permissible: Allowed but not required

Best Practices

  1. Map identifiers first: STUDYID, DOMAIN, USUBJID
  2. Use controlled terminology: For variables requiring CT
  3. Follow naming conventions: Variable names, labels
  4. Validate early: Catch issues before export

SDTM Versions

Trial Submission Studio currently supports:

  • SDTM-IG 3.4 (current FDA standard)

Version History

| Version | Release | Notes |
|---|---|---|
| 3.4 | 2021 | Current FDA standard |
| 3.3 | 2018 | |
| 3.2 | 2013 | |
| 3.1.2 | 2008 | |

Next Steps

SDTM Domains

SDTM organizes clinical trial data into domains based on the type of observation.

Domain Classification

```mermaid
flowchart TD
    subgraph "SDTM Domains"
        direction TB
        SPE[Special Purpose]
        INT[Interventions]
        EVT[Events]
        FND[Findings]
    end

    SPE --> DM[DM - Demographics]
    SPE --> TA[TA - Trial Arms]
    SPE --> TS[TS - Trial Summary]

    INT --> CM[CM - Medications]
    INT --> EX[EX - Exposure]
    INT --> PR[PR - Procedures]

    EVT --> AE[AE - Adverse Events]
    EVT --> MH[MH - Medical History]
    EVT --> DS[DS - Disposition]

    FND --> LB[LB - Lab Results]
    FND --> VS[VS - Vital Signs]
    FND --> EG[EG - ECG Results]

    style SPE fill:#4a90d9,color:#fff
    style INT fill:#50c878,color:#fff
    style EVT fill:#f5a623,color:#fff
    style FND fill:#9b59b6,color:#fff
```

Domain Categories

Special Purpose Domains

Core structural domains required for all submissions.

| Domain | Name | Description |
|---|---|---|
| DM | Demographics | Subject demographic information |
| CO | Comments | Free-text comments |
| SE | Subject Elements | Subject milestones |
| SV | Subject Visits | Visits for each subject |
| TA | Trial Arms | Planned study arms |
| TD | Trial Disease | Disease descriptions |
| TE | Trial Elements | Planned protocol elements |
| TI | Trial Inclusion/Exclusion | Eligibility criteria |
| TS | Trial Summary | Study-level parameters |
| TV | Trial Visits | Planned visits |

Interventions Domains

Treatments and substances given to or used by subjects.

| Domain | Name | Description |
|---|---|---|
| CM | Concomitant Medications | Non-study medications |
| EC | Exposure as Collected | Exposure data as collected |
| EX | Exposure | Study treatment exposure |
| PR | Procedures | Non-study procedures |
| SU | Substance Use | Tobacco, alcohol, etc. |

Events Domains

Discrete occurrences during study participation.

| Domain | Name | Description |
|---|---|---|
| AE | Adverse Events | All adverse events |
| CE | Clinical Events | Non-adverse clinical events |
| DS | Disposition | Subject status at milestones |
| DV | Protocol Deviations | Protocol violations |
| HO | Healthcare Encounters | Hospitalizations, ER visits |
| MH | Medical History | Prior conditions |

Findings Domains

Observations and measurements.

| Domain | Name | Description |
|---|---|---|
| DA | Drug Accountability | Drug dispensing/return |
| DD | Death Details | Cause of death details |
| EG | ECG Results | Electrocardiogram data |
| FT | Functional Tests | Functional assessments |
| IE | Inclusion/Exclusion | Subject eligibility |
| IS | Immunogenicity Specimen | Sample assessments |
| LB | Lab Results | Laboratory tests |
| MB | Microbiology Specimen | Microbiology samples |
| MI | Microscopic Findings | Histopathology |
| MK | Musculoskeletal | Musculoskeletal findings |
| MO | Morphology | Imaging morphology |
| MS | Microbiology Susceptibility | Antibiotic susceptibility |
| NV | Nervous System | Neurological findings |
| OE | Ophthalmology | Eye exam results |
| PC | Pharmacokinetics Concentrations | Drug concentrations |
| PE | Physical Exam | Physical examination |
| PP | PK Parameters | Pharmacokinetic parameters |
| QS | Questionnaires | PRO/questionnaire data |
| RE | Respiratory | Pulmonary function |
| RP | Reproductive | Reproductive findings |
| RS | Disease Response | Tumor response |
| SC | Subject Characteristics | Additional demographics |
| SS | Subject Status | Subject enrollment status |
| TR | Tumor/Lesion Results | Tumor measurements |
| TU | Tumor/Lesion Identification | Tumor identification |
| UR | Urinary System | Urological findings |
| VS | Vital Signs | Vital sign measurements |

Common Domain Details

DM - Demographics

Required for all studies. Contains one record per subject.

Key Variables:

  • USUBJID (Unique Subject ID)
  • AGE, AGEU (Age and units)
  • SEX, RACE, ETHNIC
  • ARM, ARMCD (Study arm)
  • RFSTDTC, RFENDTC (Reference dates)
  • COUNTRY, SITEID

AE - Adverse Events

Captures all adverse events during the study.

Key Variables:

  • AETERM (Verbatim term)
  • AEDECOD (Dictionary-coded term)
  • AESTDTC, AEENDTC (Start/end dates)
  • AESEV (Severity)
  • AESER (Serious)
  • AEREL (Relationship to treatment)
  • AEOUT (Outcome)

VS - Vital Signs

Captures vital sign measurements.

Key Variables:

  • VSTESTCD, VSTEST (Test code/name)
  • VSORRES, VSSTRESC, VSSTRESN (Results)
  • VSORRESU, VSSTRESU (Units)
  • VSPOS (Position)
  • VSDTC (Date/time)
  • VISITNUM, VISIT

LB - Laboratory Results

Captures laboratory test results.

Key Variables:

  • LBTESTCD, LBTEST (Test code/name)
  • LBORRES, LBSTRESC, LBSTRESN (Results)
  • LBORRESU, LBSTRESU (Units)
  • LBSPEC (Specimen type)
  • LBDTC (Date/time)
  • LBNRIND (Reference range indicator)

Custom Domains

For data not fitting standard domains, create custom domains:

  • Two-letter code starting with X, Y, or Z
  • Follow general observation class rules
  • Document in Define-XML

Next Steps

SDTM Variables

Variables are the individual data elements within SDTM domains.

Variable Categories

Identifier Variables

Identify the study, subject, and domain.

| Variable | Label | Description |
|---|---|---|
| STUDYID | Study Identifier | Unique study ID |
| DOMAIN | Domain Abbreviation | Two-letter domain code |
| USUBJID | Unique Subject ID | Unique across all studies |
| SUBJID | Subject ID | Subject ID within study |
| SITEID | Study Site Identifier | Site number |

Topic Variables

Describe what was observed.

| Domain | Variable | Description |
|---|---|---|
| AE | AETERM | Adverse event term |
| CM | CMTRT | Medication name |
| LB | LBTEST | Lab test name |
| VS | VSTEST | Vital sign test |

Timing Variables

Capture when observations occurred.

| Variable | Label | Description |
|---|---|---|
| --DTC | Date/Time | ISO 8601 date/time |
| --STDTC | Start Date/Time | Start of observation |
| --ENDTC | End Date/Time | End of observation |
| --DY | Study Day | Study day number |
| VISITNUM | Visit Number | Numeric visit identifier |
| VISIT | Visit Name | Visit label |

Qualifier Variables

Provide additional context.

| Type | Examples | Description |
|---|---|---|
| Grouping | --CAT, --SCAT | Category, subcategory |
| Result | --ORRES, --STRESC | Original/standard result |
| Record | --SEQ, --GRPID | Sequence, grouping |
| Synonym | --DECOD, --MODIFY | Coded/modified terms |

Variable Naming Conventions

Prefix Pattern

Most variables use a domain-specific prefix:

  • AE + TERM = AETERM
  • VS + TESTCD = VSTESTCD
  • LB + ORRES = LBORRES

Common Suffixes

| Suffix | Meaning | Example |
|---|---|---|
| --TESTCD | Test Code | VSTESTCD, LBTESTCD |
| --TEST | Test Name | VSTEST, LBTEST |
| --ORRES | Original Result | VSORRES, LBORRES |
| --ORRESU | Original Units | VSORRESU, LBORRESU |
| --STRESC | Standardized Result (Char) | VSSTRESC |
| --STRESN | Standardized Result (Num) | VSSTRESN |
| --STRESU | Standardized Units | VSSTRESU |
| --STAT | Status | VSSTAT (NOT DONE) |
| --REASND | Reason Not Done | VSREASND |
| --LOC | Location | VSLOC |
| --DTC | Date/Time | VSDTC, AESTDTC |

Data Types

Character Variables

  • Text values
  • Max length: 200 characters (XPT V5)
  • Example: AETERM, VSTEST

Numeric Variables

  • Integer or floating-point
  • Example: AGE, VSSTRESN, LBSTRESN

Date/Time Variables

ISO 8601 format:

  • Full: 2024-01-15T09:30:00
  • Date only: 2024-01-15
  • Partial: 2024-01, 2024

Variable Requirements

Required Variables

Must be present and populated for every record.

| Domain | Required Variables |
|---|---|
| All | STUDYID, DOMAIN, USUBJID |
| DM | RFSTDTC, RFENDTC, SITEID, ARM, ARMCD |
| AE | AETERM, AEDECOD, AESTDTC |
| VS | VSTESTCD, VSTEST, VSORRES, VSDTC |

Expected Variables

Should be present when applicable.

| Domain | Expected Variables |
|---|---|
| AE | AEENDTC, AESEV, AESER, AEREL |
| VS | VSSTRESN, VSSTRESU, VISITNUM |

Permissible Variables

Can be included if relevant data exists.

Controlled Terminology

Variables requiring controlled terminology:

| Variable | Codelist |
|---|---|
| SEX | Sex |
| RACE | Race |
| ETHNIC | Ethnicity |
| COUNTRY | Country |
| AESEV | Severity |
| AESER | No Yes Response |
| VSTESTCD | Vital Signs Test Code |
| LBTESTCD | Lab Test Code |

Variable Metadata

Label

40 characters max (XPT V5):

  • Descriptive text
  • Example: “Adverse Event Reported Term”

Length

Define appropriate length for each variable:

  • Consider actual data values
  • XPT V5 max: 200 characters

Order

Maintain consistent variable ordering:

  1. Identifier variables
  2. Topic variables
  3. Qualifier variables
  4. Timing variables

Next Steps

SDTM Validation Rules

Trial Submission Studio validates data against SDTM implementation guide rules.

Validation Categories

Structural Validation

Checks data structure and format.

| Rule ID | Description | Severity |
|---|---|---|
| SD0001 | Required variable missing | Error |
| SD0002 | Invalid variable name | Error |
| SD0003 | Variable length exceeded | Error |
| SD0004 | Invalid data type | Error |
| SD0005 | Duplicate records | Warning |
| SD0006 | Invalid domain code | Error |

Content Validation

Checks data values and relationships.

| Rule ID | Description | Severity |
|---|---|---|
| CT0001 | Value not in controlled terminology | Error |
| CT0002 | Invalid date format | Error |
| CT0003 | Date out of valid range | Warning |
| CT0004 | Numeric value out of range | Warning |
| CT0005 | Missing required value | Error |

Cross-Record Validation

Checks relationships between records.

| Rule ID | Description | Severity |
|---|---|---|
| XR0001 | USUBJID not in DM | Error |
| XR0002 | Duplicate key values | Error |
| XR0003 | Missing parent record | Warning |
| XR0004 | Inconsistent dates across domains | Warning |

Common Validation Rules

Identifier Rules

STUDYID

  • Must be present in all records
  • Must be consistent across domains
  • Cannot be null or empty

USUBJID

  • Must be present in all records
  • Must exist in DM domain
  • Must be unique per subject
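The USUBJID rules above are what cross-record check XR0001 enforces. A minimal Python sketch of that check (the helper name is ours, not part of the application):

```python
def usubjid_not_in_dm(domain_rows, dm_rows):
    """Return USUBJID values in a domain that have no matching DM record (rule XR0001).

    Rows are dicts keyed by variable name, e.g. from csv.DictReader.
    """
    dm_subjects = {row["USUBJID"] for row in dm_rows}
    return sorted({row["USUBJID"] for row in domain_rows} - dm_subjects)
```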

DOMAIN

  • Must match the domain abbreviation
  • Must be uppercase
  • Must be 2 characters

Date/Time Rules

--DTC Variables

  • Must follow ISO 8601 format
  • Supported formats:
    • YYYY-MM-DDTHH:MM:SS
    • YYYY-MM-DD
    • YYYY-MM
    • YYYY
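A shape-only check for the four supported formats can be sketched with regular expressions. This validates the pattern, not calendar validity (it would accept month 13); the helper is illustrative:

```python
import re

# One pattern per supported ISO 8601 shape, most specific first.
_ISO_PATTERNS = [
    r"^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}$",  # YYYY-MM-DDTHH:MM:SS
    r"^\d{4}-\d{2}-\d{2}$",                    # YYYY-MM-DD
    r"^\d{4}-\d{2}$",                          # YYYY-MM
    r"^\d{4}$",                                # YYYY
]

def is_valid_dtc(value):
    """True if `value` matches one of the supported --DTC shapes."""
    return any(re.match(p, value) for p in _ISO_PATTERNS)
```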

Date Ranges

  • End date cannot precede start date
  • Study dates should be within study period

Controlled Terminology Rules

SEX

Valid values:

  • M (Male)
  • F (Female)
  • U (Unknown)
  • UNDIFFERENTIATED

AESEV

Valid values:

  • MILD
  • MODERATE
  • SEVERE

AESER

Valid values:

  • Y (Yes)
  • N (No)

Validation Report

Error Summary

```text
┌─────────────────────────────────────────────────────────────┐
│ Validation Summary                                          │
├─────────────────────────────────────────────────────────────┤
│ Errors:   5                                                 │
│ Warnings: 12                                                │
│ Info:     3                                                 │
├─────────────────────────────────────────────────────────────┤
│ Domain: DM                                                  │
│   - 2 Errors                                                │
│   - 3 Warnings                                              │
│                                                             │
│ Domain: AE                                                  │
│   - 3 Errors                                                │
│   - 9 Warnings                                              │
└─────────────────────────────────────────────────────────────┘
```

Error Details

Each error includes:

  • Rule ID: Unique identifier
  • Severity: Error/Warning/Info
  • Description: What’s wrong
  • Location: Affected rows/columns
  • Suggestion: How to fix

Fixing Validation Issues

Mapping Issues

  1. Verify correct source column is mapped
  2. Check data type compatibility
  3. Ensure all required variables are mapped

Data Issues

  1. Review affected rows
  2. Correct values in source data
  3. Re-import and re-validate

Terminology Issues

  1. Check expected values in codelist
  2. Map source values to standard terms
  3. Use value-level mapping if needed

Custom Validation

Severity Overrides

Some warnings can be suppressed if intentional:

  1. Review the warning
  2. Document the reason
  3. Mark as reviewed (if applicable)

Adding Context

For validation reports:

  • Add comments explaining exceptions
  • Document data collection differences
  • Note protocol-specific variations

Best Practices

  1. Validate incrementally

    • After initial mapping
    • After each significant change
    • Before final export
  2. Address errors first

    • Errors block export
    • Warnings should be reviewed
    • Info messages are FYI
  3. Document exceptions

    • Why a warning is acceptable
    • Protocol-specific reasons
    • Historical data limitations
  4. Review validation reports

    • Keep for audit trail
    • Share with data management
    • Include in submission package

Next Steps

Controlled Terminology

CDISC Controlled Terminology (CT) provides standardized values for SDTM variables.

Overview

Controlled Terminology ensures:

  • Consistency across studies and organizations
  • Interoperability between systems
  • Regulatory compliance with FDA requirements

Embedded CT Packages

Trial Submission Studio includes the following CT versions:

| Version | Release Date | Status |
|---|---|---|
| 2024-12-20 | December 2024 | Current |
| 2024-09-27 | September 2024 | Supported |
| 2024-06-28 | June 2024 | Supported |

Common Codelists

SEX (C66731)

| Code | Decoded Value |
|---|---|
| M | MALE |
| F | FEMALE |
| U | UNKNOWN |
| UNDIFFERENTIATED | UNDIFFERENTIATED |

RACE (C74457)

| Decoded Value |
|---|
| AMERICAN INDIAN OR ALASKA NATIVE |
| ASIAN |
| BLACK OR AFRICAN AMERICAN |
| NATIVE HAWAIIAN OR OTHER PACIFIC ISLANDER |
| WHITE |
| MULTIPLE |
| NOT REPORTED |
| UNKNOWN |

ETHNIC (C66790)

| Decoded Value |
|---|
| HISPANIC OR LATINO |
| NOT HISPANIC OR LATINO |
| NOT REPORTED |
| UNKNOWN |

COUNTRY (C66729)

ISO 3166-1 alpha-3 country codes:

  • USA, CAN, GBR, DEU, FRA, JPN, etc.

AESEV (C66769) - Severity

| Decoded Value |
|---|
| MILD |
| MODERATE |
| SEVERE |

AESER (C66742) - Serious

| Code | Decoded Value |
|---|---|
| Y | Y |
| N | N |

NY (C66742) - No Yes Response

| Code | Decoded Value |
|---|---|
| Y | Y |
| N | N |

VSTESTCD (C66741) - Vital Signs Test Codes

| Code | Decoded Value |
|---|---|
| BMI | Body Mass Index |
| DIABP | Diastolic Blood Pressure |
| HEIGHT | Height |
| HR | Heart Rate |
| PULSE | Pulse Rate |
| RESP | Respiratory Rate |
| SYSBP | Systolic Blood Pressure |
| TEMP | Temperature |
| WEIGHT | Weight |

LBTESTCD - Lab Test Codes

Common examples:

| Code | Description |
|---|---|
| ALB | Albumin |
| ALT | Alanine Aminotransferase |
| AST | Aspartate Aminotransferase |
| BILI | Bilirubin |
| BUN | Blood Urea Nitrogen |
| CREAT | Creatinine |
| GLUC | Glucose |
| HGB | Hemoglobin |
| PLAT | Platelet Count |
| WBC | White Blood Cell Count |

Extensible vs Non-Extensible

Non-Extensible Codelists

Values must exactly match the codelist:

  • SEX
  • COUNTRY
  • Unit codelists

Extensible Codelists

Additional values allowed with sponsor definition:

  • RACE (can add study-specific values)
  • Some test codes

Using CT in Trial Submission Studio

Automatic Validation

When you map variables with controlled terminology:

  1. Values are checked against the codelist
  2. Non-matching values are flagged
  3. Suggestions are provided
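Conceptually, the codelist check reduces to set membership. A minimal Python sketch (the function name is ours, not an application API):

```python
def check_codelist(values, codelist):
    """Return the distinct values not found in a CT codelist.

    `codelist` is a set of submission values, e.g. {"MILD", "MODERATE", "SEVERE"}.
    """
    return sorted({v for v in values if v not in codelist})
```

Note that matching is case-sensitive, so "Mild" is flagged even though "MILD" is valid.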

Value Mapping

For source values not in CT format:

  1. Create value-level mappings
  2. Map “Male” → “M”, “Female” → “F”
  3. Apply consistently
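The value-level mapping in steps 1-3 amounts to a lookup table. A sketch in Python (the `SEX_MAP` entries are example mappings you define for your study, not an embedded app resource):

```python
# Example source-to-CT mapping for SEX; extend for your study's vocabulary.
SEX_MAP = {"Male": "M", "Female": "F", "Unknown": "U"}

def apply_value_map(values, mapping):
    """Map source values to CT terms, leaving unmapped values untouched
    so validation can still flag them."""
    return [mapping.get(v, v) for v in values]
```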

CT Version Selection

  1. Go to Settings → Controlled Terminology
  2. Select the appropriate CT version
  3. Validation uses selected version

Handling CT Errors

Value Not in Codelist

Error: “Value ‘XYZ’ not found in codelist”

Solutions:

  1. Check spelling/case
  2. Find the correct CT value
  3. Map source value to CT value
  4. For extensible codelists, document new value

Common Mappings

| Source Value | CT Value |
|---|---|
| Male | M |
| Female | F |
| Yes | Y |
| No | N |
| Caucasian | WHITE |
| African American | BLACK OR AFRICAN AMERICAN |

Updating CT

New CT versions are released quarterly by CDISC. To use newer versions:

  1. Check for Trial Submission Studio updates
  2. New CT is included in app updates
  3. Select version in settings

Resources

Official References

Next Steps

ADaM (Preview)

The Analysis Data Model (ADaM) defines standards for analysis-ready datasets.

Note

ADaM support is planned for a future release of Trial Submission Studio.

What is ADaM?

ADaM (Analysis Data Model) provides:

  • Standards for analysis datasets
  • Derived from SDTM data
  • Ready for statistical analysis
  • Required for FDA submissions

ADaM vs SDTM

| Aspect | SDTM | ADaM |
|---|---|---|
| Purpose | Data tabulation | Data analysis |
| Timing | Raw data collection | Derived for analysis |
| Structure | Observation-based | Analysis-ready |
| Audience | Data managers | Statisticians |

ADaM Dataset Types

ADSL - Subject-Level Analysis Dataset

One record per subject containing:

  • Demographics
  • Treatment information
  • Key baseline characteristics
  • Analysis flags

BDS - Basic Data Structure

Vertical structure for:

  • Laboratory data (ADLB)
  • Vital signs (ADVS)
  • Efficacy parameters

OCCDS - Occurrence Data Structure

For event data:

  • Adverse events (ADAE)
  • Concomitant medications (ADCM)

Other Structures

  • Time-to-Event (ADTTE)
  • Medical History (ADMH)

Planned Features

When ADaM support is added, Trial Submission Studio will provide:

ADaM Generation

  • Derive ADSL from DM and other SDTM domains
  • Create BDS datasets from SDTM findings
  • Generate OCCDS from events domains

ADaM Validation

  • Check ADaM IG compliance
  • Validate traceability to SDTM
  • Verify required variables

ADaM Export

  • Export to XPT format
  • Generate Define-XML for ADaM
  • Include in submission package

Current Workarounds

Until ADaM support is available:

  1. Export SDTM first

    • Use Trial Submission Studio for SDTM
    • Generate XPT files
  2. Derive ADaM externally

    • Use SAS or R
    • Apply ADaM derivation rules
    • Generate analysis datasets
  3. Validate separately

    • Use external validation tools
    • Check ADaM compliance

Timeline

ADaM support is on our roadmap. Priority features:

  • ADSL generation
  • BDS for VS and LB
  • OCCDS for AE

Resources

CDISC ADaM Resources

Stay Updated

SEND (Preview)

The Standard for Exchange of Nonclinical Data (SEND) extends SDTM for animal studies.

Note

SEND support is planned for a future release of Trial Submission Studio.

What is SEND?

SEND (Standard for Exchange of Nonclinical Data) provides:

  • Standardized format for nonclinical (animal) study data
  • Based on SDTM structure
  • Required for FDA nonclinical submissions
  • Supports toxicology and pharmacology studies

SEND vs SDTM

| Aspect | SDTM | SEND |
|---|---|---|
| Subjects | Human | Animal |
| Studies | Clinical trials | Nonclinical studies |
| Domains | Clinical domains | Nonclinical domains |
| Requirements | NDA, BLA | IND, NDA (nonclinical) |

SEND Domains

Special Purpose

| Domain | Name |
|---|---|
| DM | Demographics |
| DS | Disposition |
| TA | Trial Arms |
| TE | Trial Elements |
| TS | Trial Summary |
| TX | Trial Sets |

Findings

| Domain | Name |
|---|---|
| BW | Body Weight |
| BG | Body Weight Gain |
| CL | Clinical Observations |
| DD | Death Diagnosis |
| FW | Food/Water Consumption |
| LB | Laboratory Results |
| MA | Macroscopic Findings |
| MI | Microscopic Findings |
| OM | Organ Measurements |
| PC | Pharmacokinetic Concentrations |
| PP | Pharmacokinetic Parameters |
| TF | Tumor Findings |
| VS | Vital Signs |

Interventions

| Domain | Name |
|---|---|
| EX | Exposure |

Key Differences from SDTM

Subject Identification

  • USUBJID format differs for animals
  • Species and strain information required
  • Group/cage identification

Domain-Specific Variables

SEND includes nonclinical-specific variables:

  • Species, strain, sex
  • Dose group information
  • Study day calculations
  • Sacrifice/necropsy data

Controlled Terminology

SEND uses specific CT:

  • Animal species
  • Strain/substrain
  • Route of administration (nonclinical)
  • Specimen types

Planned Features

When SEND support is added, Trial Submission Studio will provide:

SEND Import/Mapping

  • Support nonclinical data formats
  • Map to SEND domains
  • Handle group-level data

SEND Validation

  • SEND-IG compliance checking
  • Nonclinical-specific rules
  • Controlled terminology for SEND

SEND Export

  • XPT V5 format
  • Define-XML for SEND
  • Submission-ready packages

Current Workarounds

Until SEND support is available:

  1. Manual Mapping

    • Use current SDTM workflow
    • Manually adjust for SEND differences
    • Export to XPT
  2. External Tools

    • Use specialized nonclinical tools
    • Validate with SEND validators

SEND Versions

| Version | Description |
|---|---|
| SEND 3.1.1 | Current FDA standard |
| SEND 3.1 | Previous version |
| SEND 3.0 | Initial release |

Resources

CDISC SEND Resources

FDA Resources

Stay Updated

  • Check the Roadmap for SEND progress
  • Watch for announcements on GitHub

XPT (SAS Transport) Format

XPT is the FDA-standard format for regulatory data submissions.

Overview

The SAS Transport Format (XPT) is:

  • Required by FDA for electronic submissions
  • A platform-independent binary format
  • Compatible with SAS and other tools
  • The de facto standard for clinical data exchange

XPT Versions

Trial Submission Studio supports two XPT versions:

XPT Version 5 (FDA Standard)

| Characteristic | Limit |
|---|---|
| Variable name length | 8 characters |
| Variable label length | 40 characters |
| Record length | 8,192 bytes |
| Numeric precision | 8 bytes (IEEE) |

Use for: FDA submissions, regulatory requirements

XPT Version 8 (Extended)

| Characteristic | Limit |
|---|---|
| Variable name length | 32 characters |
| Variable label length | 256 characters |
| Record length | 131,072 bytes |
| Numeric precision | 8 bytes (IEEE) |

Use for: Internal use, longer names needed

File Structure

Header Records

XPT files contain metadata headers:

  • Library header (first record)
  • Member header (dataset info)
  • Namestr records (variable definitions)

Data Records

  • Fixed-width records
  • Packed binary format
  • IEEE floating-point numbers

Creating XPT Files

Export Steps

  1. Complete data mapping
  2. Run validation
  3. Click Export → XPT
  4. Select version (V5 or V8)
  5. Choose output location
  6. Click Save

Export Options

| Option | Description |
|---|---|
| Version | V5 (default) or V8 |
| Sort by keys | Order records by key variables |
| Include metadata | Dataset label, variable labels |

XPT Constraints

Variable Names

V5 Requirements:

  • Maximum 8 characters
  • Start with letter or underscore
  • Alphanumeric and underscore only
  • Uppercase recommended

V8 Requirements:

  • Maximum 32 characters
  • Same character restrictions
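Both rule sets reduce to one regular expression plus a length cap. An illustrative Python check (not part of the application):

```python
import re

def valid_xpt_name(name, version=5):
    """Check a variable name against XPT V5 (8 chars) or V8 (32 chars) rules:
    must start with a letter or underscore, alphanumeric/underscore only."""
    max_len = 8 if version == 5 else 32
    return bool(re.fullmatch(r"[A-Za-z_][A-Za-z0-9_]*", name)) and len(name) <= max_len
```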

Variable Labels

  • V5: 40 characters max
  • V8: 256 characters max

Data Values

Character variables:

  • V5: Max 200 bytes per value
  • Trailing spaces trimmed
  • Missing = blank

Numeric variables:

  • 8-byte IEEE format
  • 28 SAS missing value codes supported (.A through .Z, ._)
  • Precision: ~15 significant digits

Numeric Precision

IEEE to SAS Conversion

Trial Submission Studio handles:

  • IEEE 754 double precision
  • SAS missing value encoding
  • Proper byte ordering

Missing Values

SAS/XPT supports 28 missing value codes:

| Code | Meaning |
|---|---|
| . | Standard missing |
| .A - .Z | Special missing A-Z |
| ._ | Underscore missing |

Validation Before Export

Automatic Checks

  • Variable name lengths
  • Label lengths
  • Data type compatibility
  • Value length limits

Common Issues

| Issue | Solution |
|---|---|
| Name too long | Use V8 or rename |
| Label truncated | Shorten label |
| Value too long | Truncate or split |

Post-Export Verification

  1. Check file size - Matches expected data volume
  2. Open in viewer - Verify structure
  3. Validate with external tools - Pinnacle 21, SAS
  4. Compare row counts - Match source data

External Validation

Consider validating with:

  • Pinnacle 21 Community (free)
  • SAS Universal Viewer
  • Other XPT readers

FDA Submission Requirements

Required Format

  • XPT Version 5 for FDA submissions
  • Define-XML 2.1 for metadata
  • Appropriate file naming (lowercase domain codes)

File Naming Convention

  • dm.xpt - Demographics
  • ae.xpt - Adverse Events
  • vs.xpt - Vital Signs
  • (lowercase domain abbreviation)

Dataset Limits

| Constraint | Limit |
|---|---|
| File size | 5 GB (practical limit) |
| Variables per dataset | No formal limit |
| Records per dataset | No formal limit |

Technical Details

Byte Order

  • XPT uses big-endian byte order
  • Trial Submission Studio handles conversion automatically

Character Encoding

  • ASCII-compatible
  • Extended ASCII for special characters
  • UTF-8 source data converted appropriately

Record Blocking

  • 80-byte logical records
  • Blocked for efficiency
  • Headers use fixed-format records

Next Steps

Dataset-XML Format

Dataset-XML is a CDISC standard XML format for clinical data exchange.

Overview

Dataset-XML provides:

  • Human-readable data format
  • Full Unicode support
  • Embedded metadata
  • Alternative to XPT binary format

When to Use Dataset-XML

| Use Case | Recommendation |
|----------|----------------|
| FDA submission | Use XPT (required) |
| Internal data exchange | Dataset-XML works well |
| Archive/audit trail | Good for documentation |
| Non-SAS environments | Easier integration |
| Full character support | Unicode capable |

Format Structure

ODM Container

Dataset-XML is based on CDISC ODM (Operational Data Model):

<?xml version="1.0" encoding="UTF-8"?>
<ODM xmlns="http://www.cdisc.org/ns/odm/v1.3"
     xmlns:data="http://www.cdisc.org/ns/Dataset-XML/v1.0"
     FileType="Snapshot">
    <ClinicalData StudyOID="..." MetaDataVersionOID="...">
        <SubjectData SubjectKey="...">
            <StudyEventData StudyEventOID="...">
                <ItemGroupData ItemGroupOID="DM">
                    <ItemData ItemOID="STUDYID">ABC123</ItemData>
                    <ItemData ItemOID="USUBJID">ABC123-001</ItemData>
                    <!-- More items -->
                </ItemGroupData>
            </StudyEventData>
        </SubjectData>
    </ClinicalData>
</ODM>

Key Elements

| Element | Description |
|---------|-------------|
| ODM | Root container |
| ClinicalData | Study data container |
| SubjectData | Per-subject data |
| ItemGroupData | Domain records |
| ItemData | Individual values |

Creating Dataset-XML

Export Steps

  1. Complete data mapping
  2. Run validation
  3. Click Export → Dataset-XML
  4. Configure options
  5. Choose output location
  6. Click Save

Export Options

| Option | Description |
|--------|-------------|
| Include metadata | Embed variable definitions |
| Pretty print | Format XML for readability |
| Compress | Reduce file size |
| Single file | One file vs. file per domain |

Dataset-XML vs XPT

| Aspect | Dataset-XML | XPT |
|--------|-------------|-----|
| Format | Text (XML) | Binary |
| Readability | Human-readable | Requires tools |
| Size | Larger | Smaller |
| Unicode | Full support | Limited |
| FDA submission | Accepted | Required |
| Integration | Easier | SAS-focused |

Advantages

Human Readable

  • Open in any text editor
  • Easily inspectable
  • Good for debugging

Full Unicode

  • International characters
  • Special symbols
  • No character limitations

Self-Describing

  • Metadata embedded
  • Schema validation
  • No external dependencies

Platform Independent

  • Standard XML format
  • Any programming language
  • No proprietary tools needed

Limitations

File Size

  • Larger than binary XPT
  • Compression recommended for large datasets

FDA Preference

  • FDA prefers XPT for submissions
  • Dataset-XML accepted but less common

Processing Overhead

  • XML parsing slower than binary
  • More memory for large files

Validation

Schema Validation

Dataset-XML can be validated against:

  • CDISC Dataset-XML schema
  • ODM schema
  • Custom validation rules

Common Checks

  • Well-formed XML
  • Valid element structure
  • Data type conformance
  • Required elements present

Working with Dataset-XML

Reading Files

Dataset-XML can be read by:

  • Any XML parser
  • CDISC-compatible tools
  • Statistical software with XML support

Converting to Other Formats

From Dataset-XML, you can convert to:

  • XPT (for FDA submission)
  • CSV (for analysis)
  • Database tables
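
Because the format is plain XML, even a naive string scan can recover values for quick inspection. This hypothetical helper follows the element-text shape shown in the ODM example above; production code should use a real XML parser (e.g. quick-xml):

```rust
/// Naive sketch: pull the text content of each <ItemData ItemOID="...">
/// element with the given OID. Fine for eyeballing a file; not a
/// substitute for a proper XML parser.
fn item_values(xml: &str, item_oid: &str) -> Vec<String> {
    let open = format!("<ItemData ItemOID=\"{item_oid}\">");
    xml.match_indices(open.as_str())
        .filter_map(|(i, _)| {
            let rest = &xml[i + open.len()..];
            rest.find("</ItemData>").map(|j| rest[..j].to_string())
        })
        .collect()
}
```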

Technical Details

Encoding

  • UTF-8 (default and recommended)
  • UTF-16 supported
  • Encoding declared in XML header

Namespaces

xmlns="http://www.cdisc.org/ns/odm/v1.3"
        xmlns:data="http://www.cdisc.org/ns/Dataset-XML/v1.0"

File Extension

  • .xml for Dataset-XML files
  • Optionally: domain.xml (e.g., dm.xml)

Next Steps

Define-XML 2.1

Define-XML provides metadata documentation for CDISC datasets.

Overview

Define-XML is:

  • Required for FDA electronic submissions
  • Describes dataset structure and content
  • Documents variable definitions
  • Provides value-level metadata

What Define-XML Contains

Dataset Metadata

  • Dataset names and descriptions
  • Domain structure
  • Keys and sort order
  • Dataset locations

Variable Metadata

  • Variable names and labels
  • Data types and lengths
  • Origin information
  • Controlled terminology references

Value-Level Metadata

  • Specific value definitions
  • Conditional logic
  • Derivation methods

Computational Methods

  • Derivation algorithms
  • Imputation rules
  • Analysis methods

Define-XML 2.1 Structure

Root Element

<?xml version="1.0" encoding="UTF-8"?>
<ODM xmlns="http://www.cdisc.org/ns/odm/v1.3"
     xmlns:def="http://www.cdisc.org/ns/def/v2.1"
     ODMVersion="1.3.2"
     FileType="Snapshot"
     FileOID="DEFINE-XML-EXAMPLE">

Key Components

| Component | Description |
|-----------|-------------|
| Study | Study-level information |
| MetaDataVersion | Metadata container |
| ItemGroupDef | Dataset definitions |
| ItemDef | Variable definitions |
| CodeList | Controlled terminology |
| MethodDef | Computational methods |
| CommentDef | Comments and notes |

Creating Define-XML

Automatic Generation

Trial Submission Studio generates Define-XML from:

  1. Mapped datasets
  2. Variable definitions
  3. Controlled terminology
  4. Validation rules

Export Steps

  1. Complete all domain mappings
  2. Run validation
  3. Click Export → Define-XML
  4. Review generated metadata
  5. Add comments/methods if needed
  6. Click Save

Generated Content

The exported Define-XML includes:

| Element | Source |
|---------|--------|
| Dataset definitions | From mapped domains |
| Variable definitions | From SDTM standards |
| Origins | From mapping configuration |
| Codelists | From controlled terminology |

Define-XML Elements

ItemGroupDef (Datasets)


<ItemGroupDef OID="IG.DM"
              Name="DM"
              Repeating="No"
              Domain="DM"
              def:Structure="One record per subject"
              def:Class="SPECIAL PURPOSE">
    <Description>
        <TranslatedText xml:lang="en">Demographics</TranslatedText>
    </Description>
    <ItemRef ItemOID="IT.DM.STUDYID" OrderNumber="1" Mandatory="Yes"/>
    <!-- More ItemRefs -->
</ItemGroupDef>

ItemDef (Variables)


<ItemDef OID="IT.DM.USUBJID"
         Name="USUBJID"
         DataType="text"
         Length="50"
         def:Origin="CRF">
    <Description>
        <TranslatedText xml:lang="en">Unique Subject Identifier</TranslatedText>
    </Description>
</ItemDef>

CodeList (Controlled Terminology)


<CodeList OID="CL.SEX"
          Name="Sex"
          DataType="text">
    <CodeListItem CodedValue="M">
        <Decode>
            <TranslatedText xml:lang="en">Male</TranslatedText>
        </Decode>
    </CodeListItem>
    <CodeListItem CodedValue="F">
        <Decode>
            <TranslatedText xml:lang="en">Female</TranslatedText>
        </Decode>
    </CodeListItem>
</CodeList>

Variable Origins

Define-XML documents where data comes from:

| Origin | Description |
|--------|-------------|
| CRF | Case Report Form |
| Derived | Calculated from other data |
| Assigned | Assigned by sponsor |
| Protocol | From study protocol |
| eDT | Electronic data transfer |

Customizing Define-XML

Adding Comments

Add explanatory comments for:

  • Complex derivations
  • Data collection notes
  • Exception documentation

Computational Methods

Document derivation algorithms:

  • Formulas
  • Conditions
  • Source variables

Value-Level Metadata

For variables with parameter-dependent definitions:

  • Different units by test
  • Conditional codelists
  • Test-specific origins

Validation

Schema Validation

Define-XML is validated against:

  • CDISC Define-XML 2.1 schema
  • Stylesheet rendering rules

Common Issues

| Issue | Solution |
|-------|----------|
| Missing required elements | Add required metadata |
| Invalid references | Check OID references |
| Codelist mismatches | Verify CT alignment |
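
The invalid-reference check amounts to verifying that every OID referenced (e.g. by an ItemRef) resolves to a definition. A minimal sketch with hypothetical input slices:

```rust
use std::collections::HashSet;

/// Sketch of the invalid-reference check: report every referenced OID
/// that has no matching definition.
fn dangling_refs<'a>(defined_oids: &[&'a str], referenced_oids: &[&'a str]) -> Vec<&'a str> {
    let defined: HashSet<&str> = defined_oids.iter().copied().collect();
    referenced_oids
        .iter()
        .copied()
        .filter(|oid| !defined.contains(oid))
        .collect()
}
```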

FDA Requirements

Submission Package

  • define.xml - Metadata file
  • define.pdf - Rendered stylesheet (optional)
  • Referenced XPT datasets

Naming Convention

  • File: define.xml (lowercase)
  • Location: Study root folder

Stylesheet

Include the CDISC stylesheet for rendering:

<?xml-stylesheet type="text/xsl" href="define2-1.xsl"?>

Best Practices

  1. Generate early - Create Define-XML as you build datasets
  2. Review carefully - Verify all metadata is accurate
  3. Document derivations - Explain complex logic
  4. Test rendering - View with stylesheet before submission
  5. Validate - Use Define-XML validators

Next Steps

Architecture Overview

Trial Submission Studio is built as a modular Rust workspace with 6 specialized crates.

Design Philosophy

Core Principles

  1. Separation of Concerns - Each crate has a single responsibility
  2. Deterministic Output - Reproducible results for regulatory compliance
  3. Offline Operation - All standards embedded, no network dependencies
  4. Type Safety - Rust’s type system prevents data errors

Key Design Decisions

  • Pure Functions - Mapping and validation logic is side-effect free
  • Embedded Standards - CDISC data bundled in binary
  • No External APIs - Works without internet connection
  • Auditable - Clear data lineage and transformations

Workspace Structure

trial-submission-studio/
├── Cargo.toml              # Workspace configuration
├── crates/
│   ├── tss-gui/            # Desktop application (Iced 0.14.0)
│   ├── tss-submit/         # Mapping, normalization, validation, export
│   ├── tss-ingest/         # CSV loading
│   ├── tss-standards/      # CDISC standards loader
│   ├── tss-updater/        # App update mechanism
│   └── tss-updater-helper/ # macOS bundle swap helper
├── standards/              # Embedded CDISC data
├── mockdata/               # Test datasets
└── docs/                   # This documentation

Crate Dependency Graph

flowchart TD
    subgraph Application
        GUI[tss-gui<br/>Iced 0.14.0]
    end

    subgraph "Core Pipeline"
        SUBMIT[tss-submit]
        subgraph modules[tss-submit modules]
            MAP[map/]
            NORM[normalize/]
            VAL[validate/]
            EXP[export/]
        end
    end

    subgraph Support
        INGEST[tss-ingest]
        STANDARDS[tss-standards]
    end

    subgraph Update
        UPDATER[tss-updater]
        HELPER[tss-updater-helper]
    end

    subgraph External
        XPT[xportrs<br/>crates.io]
    end

    GUI --> SUBMIT
    GUI --> INGEST
    GUI --> STANDARDS
    GUI --> UPDATER
    SUBMIT --> STANDARDS
    SUBMIT --> XPT
    UPDATER -.-> HELPER
    INGEST --> STANDARDS

    style GUI fill:#4a90d9,color:#fff
    style SUBMIT fill:#50c878,color:#fff
    style STANDARDS fill:#f5a623,color:#fff
    style XPT fill:#9b59b6,color:#fff

Crate Responsibilities

| Crate | Purpose | Key Dependencies |
|-------|---------|------------------|
| tss-gui | Desktop application | Iced 0.14.0 |
| tss-submit | Mapping, normalization, validation, export | rapidfuzz, xportrs, quick-xml |
| tss-ingest | CSV loading | csv, polars |
| tss-standards | CDISC standards loader | serde, serde_json |
| tss-updater | App updates | reqwest |
| tss-updater-helper | macOS bundle swap | (macOS-only) |

Data Flow

Import to Export Pipeline

flowchart LR
    subgraph Input
        CSV[CSV File]
    end

    subgraph "tss-submit"
        MAP[map/]
        NORM[normalize/]
        VAL[validate/]
        EXP[export/]
    end

    subgraph Output
        XPT[XPT V5/V8]
        XML[Dataset-XML]
        DEFINE[Define-XML 2.1]
    end

    CSV --> MAP
    MAP --> NORM
    NORM --> VAL
    VAL --> EXP
    EXP --> XPT
    EXP --> XML
    EXP --> DEFINE

    style CSV fill:#e8f4f8
    style XPT fill:#d4edda
    style XML fill:#d4edda
    style DEFINE fill:#d4edda

Pipeline Stages

| Stage | Module | Purpose |
|-------|--------|---------|
| 1. Map | tss-submit/map/ | Fuzzy column-to-variable mapping with confidence scoring |
| 2. Normalize | tss-submit/normalize/ | Data transformation (datetime, CT, studyday, duration) |
| 3. Validate | tss-submit/validate/ | CDISC conformance checking (CT, required, dates, datatypes) |
| 4. Export | tss-submit/export/ | Output generation (XPT via xportrs, Dataset-XML, Define-XML) |

Standards Integration

flowchart TB
    subgraph "Embedded CDISC Data"
        SDTM[SDTM-IG 3.4]
        CT[Controlled Terminology]
        DOMAINS[Domain Definitions]
    end

    STANDARDS[tss-standards]
    SDTM --> STANDARDS
    CT --> STANDARDS
    DOMAINS --> STANDARDS
    STANDARDS --> SUBMIT[tss-submit]
    STANDARDS --> INGEST[tss-ingest]
    style STANDARDS fill:#50c878,color:#fff

Key Technologies

Core Stack

| Component | Technology |
|-----------|------------|
| Language | Rust 1.92+ |
| GUI Framework | Iced 0.14.0 (Elm architecture) |
| Data Processing | Polars |
| Serialization | Serde |
| Testing | Insta, Proptest |

External Crates

| Purpose | Crate |
|---------|-------|
| Fuzzy matching | rapidfuzz |
| XML processing | quick-xml |
| XPT handling | xportrs |
| Logging | tracing |
| HTTP client | reqwest |

GUI Architecture

Trial Submission Studio uses Iced 0.14.0 with the Elm architecture pattern:

flowchart LR
    subgraph "Elm Architecture"
        View["View<br/>(render UI)"]
        Message["Message<br/>(user action)"]
        Update["Update<br/>(handle message)"]
        State["State<br/>(app data)"]
    end

    View --> Message
    Message --> Update
    Update --> State
    State --> View

    style View fill:#4a90d9,color:#fff
    style State fill:#50c878,color:#fff

Key GUI components:

  • State types: ViewState, Study, DomainState, Settings
  • Message enums: Message -> HomeMessage, DomainEditorMessage, DialogMessage
  • View functions: Organized by screen (home/, domain_editor/, dialog/)
  • Theme: Clinical-style theming with custom color palette

Embedded Data

Standards Directory

standards/
├── sdtm/
│   └── ig/v3.4/
│       ├── Datasets.csv         # Domain definitions
│       ├── Variables.csv        # Variable metadata
│       ├── metadata.toml        # Version info
│       └── chapters/            # IG chapter documentation
├── adam/
│   └── ig/v1.3/
│       ├── DataStructures.csv   # ADaM structures
│       ├── Variables.csv        # Variable metadata
│       └── metadata.toml
├── send/
│   └── ig/v3.1.1/
│       ├── Datasets.csv         # SEND domains
│       ├── Variables.csv        # Variable metadata
│       └── metadata.toml
├── terminology/
│   ├── 2024-03-29/              # CT release date
│   │   ├── SDTM_CT_*.csv
│   │   ├── SEND_CT_*.csv
│   │   └── ADaM_CT_*.csv
│   ├── 2025-03-28/
│   └── 2025-09-26/              # Latest CT
├── validation/
│   ├── sdtm/Rules.csv           # SDTM validation rules
│   ├── adam/Rules.csv           # ADaM validation rules
│   └── send/Rules.csv           # SEND validation rules
└── xsl/
    ├── define2-0-0.xsl          # Define-XML stylesheets
    └── define2-1.xsl

Testing Strategy

Test Types

| Type | Purpose | Crates |
|------|---------|--------|
| Unit | Function-level | All |
| Integration | Cross-crate | tss-gui |
| Snapshot | Output stability | tss-submit/export |
| Property | Edge cases | tss-submit/map, tss-submit/validate |

Test Data

Mock datasets in mockdata/ for:

  • Various domain types
  • Edge cases
  • Validation testing

Next Steps

tss-gui

The desktop application crate providing the graphical user interface using Iced 0.14.0.

Overview

tss-gui is the main entry point for Trial Submission Studio, built with the Iced GUI framework following the Elm architecture pattern. It provides a clinical-style desktop interface for transforming clinical trial data into FDA-compliant formats.

Responsibilities

  • Application window and layout
  • User interaction handling via message passing
  • Navigation between workflow steps
  • Data visualization
  • File dialogs and system integration
  • Multi-window dialog management

Dependencies

[dependencies]
# Iced GUI framework (Elm architecture)
iced = { version = "0.14.0", features = [
    "tokio",     # Async runtime for Task::perform
    "image",     # Image loading (PNG icons)
    "svg",       # SVG rendering for icons
    "markdown",  # Markdown widget (changelogs)
    "lazy",      # Lazy widget rendering
    "advanced",  # Advanced widget capabilities
] }
iced_fonts = { version = "0.3.0", features = ["lucide"] }

# File dialogs
rfd = "0.17"

# System integration
directories = "6.0"
open = "5.3"

# Path dependencies
tss-ingest = { path = "../tss-ingest" }
tss-standards = { path = "../tss-standards" }
tss-submit = { path = "../tss-submit" }
tss-updater = { path = "../tss-updater" }

Architecture

Elm Architecture Pattern

Trial Submission Studio follows Iced’s Elm architecture:

flowchart LR
    subgraph "Elm Architecture"
        View["View<br/>(render UI)"]
        Message["Message<br/>(user action)"]
        Update["Update<br/>(handle message)"]
        State["State<br/>(app data)"]
    end

    View --> Message
    Message --> Update
    Update --> State
    State --> View

    style View fill:#4a90d9,color:#fff
    style State fill:#50c878,color:#fff

Key principles:

  • State is the single source of truth
  • All state changes happen through messages in update()
  • Views are pure functions of state
  • Async operations use Task, not channels
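
A toy version of the loop (illustrative types, not the application's own) shows why views stay pure:

```rust
/// Toy Elm loop. The real app uses AppState, a hierarchical Message
/// enum, and Iced's Task for async work.
struct Counter {
    count: u32,
}

enum CounterMessage {
    Increment,
}

fn update(state: &mut Counter, msg: CounterMessage) {
    match msg {
        // The only place state changes.
        CounterMessage::Increment => state.count += 1,
    }
}

fn view(state: &Counter) -> String {
    // A pure function of state: same state, same output.
    format!("count = {}", state.count)
}
```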

Module Structure

tss-gui/
├── src/
│   ├── main.rs              # Entry point
│   ├── lib.rs               # Library root
│   ├── app/
│   │   ├── mod.rs           # App struct with new(), update(), view()
│   │   └── handler/         # Message handlers by category
│   ├── state/
│   │   ├── mod.rs           # AppState root container
│   │   ├── view_state.rs    # ViewState, EditorTab, filters
│   │   ├── study.rs         # Study data structure
│   │   ├── domain_state.rs  # DomainState, mappings, SUPP
│   │   └── settings.rs      # User preferences (persisted)
│   ├── message/
│   │   ├── mod.rs           # Root Message enum
│   │   ├── home.rs          # HomeMessage
│   │   ├── domain_editor.rs # DomainEditorMessage
│   │   ├── dialog.rs        # DialogMessage (About, Settings, etc.)
│   │   ├── export.rs        # ExportMessage
│   │   └── menu.rs          # MenuMessage
│   ├── view/
│   │   ├── mod.rs           # View router
│   │   ├── home/            # Home/welcome screen views
│   │   ├── domain_editor/   # Domain editor tabs (mapping, validation, etc.)
│   │   ├── dialog/          # Dialog windows (about, settings, update)
│   │   └── export.rs        # Export view
│   ├── component/
│   │   ├── mod.rs           # Reusable UI components
│   │   ├── data_grid.rs     # Virtual scrolling data table
│   │   ├── toast.rs         # Toast notifications
│   │   └── ...
│   ├── theme/
│   │   ├── mod.rs           # Clinical-style theming
│   │   ├── colors.rs        # Color palette
│   │   └── styles.rs        # Widget styles
│   ├── menu/
│   │   ├── mod.rs           # Menu system
│   │   ├── macos/           # Native macOS menu (muda)
│   │   └── desktop/         # In-app menu (Windows/Linux)
│   └── service/
│       └── ...              # Background services
└── assets/
    ├── icon.svg
    └── icon.png

Application Entry Point

The application uses Iced 0.14.0’s builder pattern:

fn main() -> iced::Result {
    iced::application(App::new, App::update, App::view)
        .title(App::title)
        .theme(App::theme)
        .subscription(App::subscription)
        .settings(settings)
        .run_with(|| App::init())
}

State Management

Root State

#![allow(unused)]
fn main() {
pub struct AppState {
    /// Current view and its associated UI state
    pub view: ViewState,

    /// Loaded study data (None when no study is open)
    pub study: Option<Study>,

    /// User settings (persisted to disk)
    pub settings: Settings,

    /// CDISC Controlled Terminology registry
    pub terminology: Option<TerminologyRegistry>,

    /// Current error message to display (transient)
    pub error: Option<String>,

    /// Whether a background task is running
    pub is_loading: bool,

    /// Tracks open dialog windows
    pub dialog_windows: DialogWindows,

    /// Active toast notification
    pub toast: Option<ToastState>,
}
}

View State

#![allow(unused)]
fn main() {
pub enum ViewState {
    /// Home/welcome screen
    Home { /* pagination, filters */ },

    /// Domain editor with tabs
    DomainEditor {
        domain: String,
        tab: EditorTab,
        mapping_ui: MappingUiState,
        normalization_ui: NormalizationUiState,
        validation_ui: ValidationUiState,
        preview_ui: PreviewUiState,
        supp_ui: SuppUiState,
    },

    /// Export view
    Export(ExportViewState),
}

pub enum EditorTab {
    Mapping,
    Normalization,
    Validation,
    Preview,
    Supp,
}
}

View Hierarchy

flowchart TD
    App[App]

    subgraph Views
        Home[Home View]
        DomainEditor[Domain Editor]
        Export[Export View]
    end

    subgraph "Domain Editor Tabs"
        Mapping[Mapping Tab]
        Normalization[Normalization Tab]
        Validation[Validation Tab]
        Preview[Preview Tab]
        Supp[SUPP Tab]
    end

    subgraph Dialogs
        About[About Dialog]
        Settings[Settings Dialog]
        Update[Update Dialog]
        ThirdParty[Third Party Dialog]
    end

    App --> Views
    DomainEditor --> Mapping
    DomainEditor --> Normalization
    DomainEditor --> Validation
    DomainEditor --> Preview
    DomainEditor --> Supp
    App -.-> Dialogs

Message System

Hierarchical Messages

#![allow(unused)]
fn main() {
pub enum Message {
    // Navigation
    Navigate(ViewState),
    SetWorkflowMode(WorkflowMode),

    // View-specific messages
    Home(HomeMessage),
    DomainEditor(DomainEditorMessage),
    Export(ExportMessage),

    // Dialogs
    Dialog(DialogMessage),

    // Menu
    MenuAction(MenuAction),

    // Background task results
    StudyLoaded(Result<(Study, TerminologyRegistry), String>),
    PreviewReady { domain: String, result: Result<DataFrame, String> },
    ValidationComplete { domain: String, report: ValidationReport },
    UpdateCheckComplete(Result<Option<UpdateInfo>, String>),

    // Global events
    KeyPressed(Key, Modifiers),
    FolderSelected(Option<PathBuf>),
    Toast(ToastMessage),
}
}
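
Nested variants are dispatched by matching the outer enum and handing the payload to a per-view handler. A simplified sketch (hypothetical signatures; the real handlers mutate AppState and return Iced Tasks):

```rust
/// Simplified delegation sketch of the hierarchical message system.
enum HomeMessage {
    LoadStudy(String),
}

enum Message {
    Home(HomeMessage),
}

fn update(msg: Message) -> String {
    match msg {
        // Unwrap the outer variant, delegate to the view's handler.
        Message::Home(home) => update_home(home),
    }
}

fn update_home(msg: HomeMessage) -> String {
    match msg {
        HomeMessage::LoadStudy(path) => format!("loading {path}"),
    }
}
```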

Message Flow Example

sequenceDiagram
    participant User
    participant View
    participant Update
    participant State
    participant Task

    User->>View: Clicks "Load Study"
    View->>Update: Message::Home(LoadStudy(path))
    Update->>State: state.is_loading = true
    Update->>Task: Task::perform(load_study())
    Task-->>Update: Message::StudyLoaded(result)
    Update->>State: state.study = Some(study)
    State->>View: Re-render with study data

Key Components

Data Grid

Custom virtual-scrolling widget for large datasets:

#![allow(unused)]
fn main() {
// Features:
// - Virtual scrolling for performance
// - Column sorting
// - Row selection
// - Type-aware formatting (dates, numbers)
// - Alternating row colors
}

Toast Notifications

Non-blocking notifications for user feedback:

#![allow(unused)]
fn main() {
pub struct ToastState {
    pub message: String,
    pub level: ToastLevel,  // Success, Info, Warning, Error
    pub progress: Option<f32>,
}
}

Multi-Window Dialogs

Dialogs open as separate windows (Iced multi-window support):

#![allow(unused)]
fn main() {
pub struct DialogWindows {
    pub about: Option<window::Id>,
    pub settings: Option<(window::Id, SettingsCategory)>,
    pub update: Option<(window::Id, UpdateState)>,
    pub third_party: Option<(window::Id, ThirdPartyState)>,
    pub export_progress: Option<(window::Id, ExportProgressState)>,
}
}

Configuration

Settings Storage

User preferences stored in:

  • macOS: ~/Library/Application Support/Trial Submission Studio/
  • Windows: %APPDATA%\Trial Submission Studio\
  • Linux: ~/.config/trial-submission-studio/

Configurable Options

#![allow(unused)]
fn main() {
pub struct Settings {
    pub recent_studies: Vec<RecentStudy>,
    pub default_export_dir: Option<PathBuf>,
    pub workflow_type: WorkflowType,
    pub ig_version: SdtmIgVersion,
    pub xpt_version: XptVersion,
    pub export_format: ExportFormat,
    // Display settings, validation settings, etc.
}
}

Platform-Specific Features

macOS

  • Native menu bar via muda crate
  • Sparkle-style updates via tss-updater-helper
  • App bundle support

Windows/Linux

  • In-app menu bar
  • Standard installer/package updates

Running

# Development
cargo run --package tss-gui

# Release
cargo run --release --package tss-gui

Testing

cargo test --package tss-gui

Testing focuses on:

  • State transitions
  • Message handling
  • Data transformations
  • Integration with other crates

See Also

tss-submit

Core submission preparation crate with mapping, normalization, validation, and export.

Overview

tss-submit is the central processing crate that implements the complete 4-stage pipeline for transforming source data into FDA-compliant CDISC formats. It consolidates all data transformation logic into a single, cohesive module structure.

Architecture

Module Structure

tss-submit/
├── src/
│   ├── lib.rs              # Crate root, re-exports
│   ├── map/                # Column-to-variable mapping
│   │   ├── mod.rs
│   │   ├── error.rs        # Mapping errors
│   │   ├── score.rs        # Fuzzy scoring engine
│   │   └── state.rs        # Mapping state management
│   ├── normalize/          # Data transformation
│   │   ├── mod.rs
│   │   ├── error.rs        # Normalization errors
│   │   ├── types.rs        # Rule definitions
│   │   ├── inference.rs    # Rule inference from metadata
│   │   ├── executor.rs     # Pipeline execution
│   │   ├── preview.rs      # Preview dataframe building
│   │   └── normalization/  # Transform implementations
│   │       ├── ct.rs       # Controlled terminology
│   │       ├── datetime.rs # ISO 8601 dates
│   │       ├── duration.rs # ISO 8601 durations
│   │       ├── numeric.rs  # Numeric formatting
│   │       └── studyday.rs # Study day calculation
│   ├── validate/           # CDISC conformance
│   │   ├── mod.rs
│   │   ├── issue.rs        # Issue types and severity
│   │   ├── report.rs       # Validation report
│   │   ├── util.rs         # Helper utilities
│   │   ├── rules/          # Rule categories
│   │   └── checks/         # Validation checks
│   │       ├── ct.rs       # Controlled terminology
│   │       ├── required.rs # Required variables
│   │       ├── expected.rs # Expected variables
│   │       ├── datatype.rs # Data types
│   │       ├── dates.rs    # Date formats
│   │       ├── sequence.rs # Sequence uniqueness
│   │       ├── length.rs   # Field lengths
│   │       └── identifier.rs # Identifier nulls
│   └── export/             # Output generation
│       ├── mod.rs
│       ├── common.rs       # Shared utilities
│       ├── types.rs        # Domain frame types
│       ├── xpt.rs          # XPT V5/V8 format
│       ├── dataset_xml.rs  # Dataset-XML format
│       └── define_xml.rs   # Define-XML 2.1

Pipeline Flow

flowchart LR
    CSV[Source CSV] --> MAP[map/]
    MAP --> NORM[normalize/]
    NORM --> VAL[validate/]
    VAL --> EXP[export/]

    EXP --> XPT[XPT V5/V8]
    EXP --> XML[Dataset-XML]
    EXP --> DEF[Define-XML 2.1]

    subgraph "tss-submit"
        MAP
        NORM
        VAL
        EXP
    end

    style CSV fill:#e8f4f8
    style XPT fill:#d4edda
    style XML fill:#d4edda
    style DEF fill:#d4edda

Dependencies

[dependencies]
anyhow = "1"
chrono = "0.4"
polars = { version = "0.46", features = ["lazy", "csv"] }
quick-xml = "0.37"
rapidfuzz = "0.5"
regex = "1.12"
serde = { version = "1", features = ["derive"] }
thiserror = "2"
tracing = "0.1"
xportrs = "0.3"

tss-standards = { path = "../tss-standards" }

Module: map/

Fuzzy column-to-variable mapping with confidence scoring.

Design Philosophy

  • Simple: Pure Jaro-Winkler scoring with minimal adjustments
  • Explainable: Score breakdowns show why a match scored as it did
  • Session-only: No persistence, mappings live for the session duration
  • Centralized: GUI calls this module for scoring instead of reimplementing

Key Types

#![allow(unused)]
fn main() {
pub enum VariableStatus {
    Unmapped,      // No suggestion or mapping
    Suggested,     // Auto-suggestion available
    Accepted,      // User accepted a mapping
}

pub struct ColumnScore {
    pub column: String,
    pub score: f64,
    pub components: Vec<ScoreComponent>,
}

pub struct MappingState {
    // Manages all mappings for a domain session
}
}

API Usage

#![allow(unused)]
fn main() {
use tss_submit::map::{MappingState, VariableStatus};

// Create mapping state with auto-suggestions
let mut state = MappingState::new(domain, "STUDY01", &columns, hints, 0.6);

// Check and accept mappings
match state.status("USUBJID") {
    VariableStatus::Suggested => {
        state.accept_suggestion("USUBJID").unwrap();
    }
    VariableStatus::Unmapped => {
        state.accept_manual("USUBJID", "SUBJECT_ID").unwrap();
    }
    _ => {}
}

// Get all scores for dropdown sorting
let scores = state.scorer().score_all_for_variable("AETERM", &available_cols);
}

Module: normalize/

Data-driven, variable-level normalization for SDTM compliance.

Design Principles

  • Metadata-driven: All normalization types inferred from Variable metadata
  • SDTM-compliant: Follows SDTMIG v3.4 rules for dates, CT, sequences
  • Stateless functions: Pure functions for easy testing and composition
  • Error preservation: On normalization failure, preserve original value + log

Normalization Types

| Type | Description | Example |
|------|-------------|---------|
| DateTime | ISO 8601 datetime | 2024-01-15 → 2024-01-15T00:00:00 |
| ControlledTerminology | CT codelist mapping | male → M |
| Duration | ISO 8601 duration | 2 weeks → P14D |
| StudyDay | Calculate --DY | Reference date to study day |
| Numeric | Numeric formatting | Precision and rounding |
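
The StudyDay rule follows the SDTM convention that there is no day 0: dates on or after the reference start date get the difference plus one, earlier dates get the (negative) difference as-is. A sketch with day offsets standing in for parsed ISO 8601 dates:

```rust
/// SDTM study day calculation; there is no day 0. Inputs are day
/// offsets for brevity (real code parses ISO 8601 dates first).
fn study_day(date: i64, ref_start: i64) -> i64 {
    let diff = date - ref_start;
    if diff >= 0 {
        diff + 1 // on/after reference: day 1, 2, 3, ...
    } else {
        diff // before reference: day -1, -2, ...
    }
}
```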

API Usage

#![allow(unused)]
fn main() {
use tss_submit::normalize::{
    infer_normalization_rules,
    execute_normalization,
    NormalizationContext
};

// Infer rules from SDTM metadata
let pipeline = infer_normalization_rules(&domain);

// Create execution context
let context = NormalizationContext::new("CDISC01", "AE")
    .with_mappings(mappings);

// Execute normalizations
let result_df = execute_normalization(&source_df, &pipeline, &context)?;
}

Module: validate/

Comprehensive CDISC conformance checking.

Validation Checks

flowchart TD
    subgraph Checks
        CT[Controlled Terminology]
        REQ[Required Variables]
        EXP[Expected Variables]
        TYPE[Data Types]
        DATE[Date Formats]
        SEQ[Sequence Uniqueness]
        LEN[Field Lengths]
        ID[Identifier Nulls]
    end

    subgraph Severity
        ERR[Error]
        WARN[Warning]
        INFO[Info]
    end

    CT --> ERR
    REQ --> ERR
    EXP --> WARN
    TYPE --> ERR
    DATE --> WARN
    SEQ --> ERR
    LEN --> WARN
    ID --> ERR

    style ERR fill:#ef4444,color:#fff
    style WARN fill:#f59e0b,color:#fff
    style INFO fill:#3b82f6,color:#fff

| Check | Description | Severity |
|-------|-------------|----------|
| Controlled Terminology | Values match CT codelists | Error |
| Required Variables | Required variables present and populated | Error |
| Expected Variables | Expected variables present | Warning |
| Data Types | Numeric columns contain numeric data | Error |
| Date Formats | ISO 8601 compliance | Warning |
| Sequence Uniqueness | --SEQ unique per USUBJID | Error |
| Field Lengths | Character field limits | Warning |
| Identifier Nulls | Identifier variables have no nulls | Error |

API Usage

#![allow(unused)]
fn main() {
use tss_submit::validate::{validate_domain, Issue, Severity};

// Validate a domain
let report = validate_domain(&domain, &df, ct_registry.as_ref());

// Process issues
for issue in &report.issues {
    match issue.severity() {
        Severity::Error => eprintln!("ERROR: {}", issue.message()),
        Severity::Warning => eprintln!("WARN: {}", issue.message()),
        Severity::Info => println!("INFO: {}", issue.message()),
    }
}

// Check if exportable
if report.has_errors() {
    println!("Cannot export: {} errors found", report.error_count());
}
}

Module: export/

Multi-format output generation for FDA submissions.

Supported Formats

| Format | Description | Use Case |
|--------|-------------|----------|
| XPT V5/V8 | SAS Transport format | Primary FDA submission |
| Dataset-XML | CDISC Dataset-XML | Data exchange |
| Define-XML 2.1 | Metadata documentation | Submission documentation |

API Usage

#![allow(unused)]
fn main() {
use tss_submit::export::{
    write_xpt_outputs,
    write_dataset_xml_outputs,
    write_define_xml,
    DomainFrame,
};

// Prepare domain data
let domains: Vec<DomainFrame> = vec![
    DomainFrame::new("DM", dm_df),
    DomainFrame::new("AE", ae_df),
];

// Export to XPT
write_xpt_outputs(&domains, output_dir)?;

// Export to Dataset-XML
let xml_options = DatasetXmlOptions::default();
write_dataset_xml_outputs(&domains, output_dir, &xml_options)?;

// Export Define-XML
let define_options = DefineXmlOptions::new("STUDY01", "1.0");
write_define_xml(&domains, &define_options, output_path)?;
}

Error Handling

Each module has dedicated error types:

#![allow(unused)]
fn main() {
// Mapping errors
pub enum MappingError {
    VariableNotFound(String),
    ColumnNotFound(String),
    AlreadyMapped(String),
}

// Normalization errors
pub enum NormalizationError {
    InvalidDate(String),
    InvalidCodelist(String, String),
    MissingContext(String),
}

// Validation uses Issue + Severity (not errors)
}

Testing

# Run all tss-submit tests
cargo test --package tss-submit

# Run specific module tests
cargo test --package tss-submit map::
cargo test --package tss-submit normalize::
cargo test --package tss-submit validate::
cargo test --package tss-submit export::

See Also

tss-ingest

CSV ingestion and schema detection crate.

Overview

tss-ingest handles loading source data files and detecting their schema.

Responsibilities

  • CSV file parsing
  • Schema detection (types, formats)
  • Domain suggestion
  • Data preview generation

Dependencies

[dependencies]
csv = "1.3"
polars = { version = "0.46", features = ["lazy", "csv"] }
encoding_rs = "0.8"
tss-standards = { path = "../tss-standards" }

Architecture

Module Structure

tss-ingest/
├── src/
│   ├── lib.rs
│   ├── reader.rs        # CSV reading
│   ├── schema.rs        # Schema detection
│   ├── types.rs         # Type inference
│   ├── domain.rs        # Domain suggestion
│   └── preview.rs       # Data preview

Schema Detection

Type Inference

#![allow(unused)]
fn main() {
pub enum InferredType {
    Integer,
    Float,
    Date(String),      // With format pattern
    DateTime(String),
    Boolean,
    Text,
}
}

Detection Algorithm

  1. Sample first N rows
  2. For each column:
    • Try parsing as integer
    • Try parsing as float
    • Try common date formats
    • Default to text
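The steps above can be sketched with std-only parsing. This is a simplified illustration, not the actual tss-ingest implementation: `Inferred` is a reduced stand-in for the `InferredType` enum above, and the date check only covers the `%Y-%m-%d` shape.

```rust
// Simplified per-value type inference sketch (illustrative only).
#[derive(Debug, PartialEq)]
enum Inferred {
    Integer,
    Float,
    Date,
    Text,
}

fn infer_value(s: &str) -> Inferred {
    if s.parse::<i64>().is_ok() {
        Inferred::Integer
    } else if s.parse::<f64>().is_ok() {
        Inferred::Float
    } else if looks_like_iso_date(s) {
        Inferred::Date
    } else {
        Inferred::Text
    }
}

// Very rough %Y-%m-%d check: 4 digits, '-', 2 digits, '-', 2 digits.
fn looks_like_iso_date(s: &str) -> bool {
    let b = s.as_bytes();
    b.len() == 10
        && b[4] == b'-'
        && b[7] == b'-'
        && b.iter()
            .enumerate()
            .all(|(i, c)| i == 4 || i == 7 || c.is_ascii_digit())
}

fn main() {
    assert_eq!(infer_value("42"), Inferred::Integer);
    assert_eq!(infer_value("3.14"), Inferred::Float);
    assert_eq!(infer_value("2024-01-15"), Inferred::Date);
    assert_eq!(infer_value("MILD"), Inferred::Text);
    println!("inference sketch ok");
}
```

A production inferrer would sample many rows per column and pick the narrowest type every sampled value satisfies.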

Date Format Detection

| Pattern | Example |
|---------|---------|
| %Y-%m-%d | 2024-01-15 |
| %m/%d/%Y | 01/15/2024 |
| %d-%m-%Y | 15-01-2024 |
| %Y-%m-%dT%H:%M:%S | 2024-01-15T09:30:00 |

API

Loading a File

#![allow(unused)]
fn main() {
use tss_ingest::{CsvReader, IngestOptions};

let options = IngestOptions {
    encoding: Some("utf-8"),
    sample_rows: 1000,
    ..Default::default()
};

let result = CsvReader::read("data.csv", options)?;
println!("Rows: {}", result.row_count);
println!("Columns: {:?}", result.schema.columns);
}

Schema Result

#![allow(unused)]
fn main() {
pub struct IngestResult {
    pub data: DataFrame,
    pub schema: DetectedSchema,
    pub suggested_domain: Option<String>,
    pub warnings: Vec<IngestWarning>,
}

pub struct DetectedSchema {
    pub columns: Vec<ColumnInfo>,
}

pub struct ColumnInfo {
    pub name: String,
    pub inferred_type: InferredType,
    pub null_count: usize,
    pub sample_values: Vec<String>,
}
}

Domain Suggestion

Based on column names, suggest likely SDTM domain:

| Column Patterns | Suggested Domain |
|-----------------|------------------|
| USUBJID, AGE, SEX | DM |
| AETERM, AESTDTC | AE |
| VSTESTCD, VSORRES | VS |
| LBTESTCD, LBORRES | LB |
#![allow(unused)]
fn main() {
pub fn suggest_domain(columns: &[String]) -> Option<String> {
    // Simplified sketch: look for a domain-specific marker column
    let markers = [("AETERM", "AE"), ("VSTESTCD", "VS"), ("LBTESTCD", "LB"), ("AGE", "DM")];
    for (marker, domain) in markers {
        if columns.iter().any(|c| c.eq_ignore_ascii_case(marker)) {
            return Some(domain.to_string());
        }
    }
    None
}
}

Error Handling

Common Issues

| Issue | Handling |
|-------|----------|
| Encoding error | Try alternative encodings |
| Parse error | Mark as text, warn user |
| Empty file | Return error |
| No header | Require user action |
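The encoding fallback row can be illustrated with a minimal std-only sketch. The real crate uses encoding_rs and supports more encodings; `decode_with_fallback` and its warning string are illustrative names, not the actual API.

```rust
use std::borrow::Cow;

// Sketch: try strict UTF-8 first; on failure, fall back to a lossy
// decode and surface a warning instead of aborting the import.
fn decode_with_fallback(bytes: &[u8]) -> (Cow<'_, str>, Option<String>) {
    match std::str::from_utf8(bytes) {
        Ok(s) => (Cow::Borrowed(s), None),
        Err(_) => (
            String::from_utf8_lossy(bytes),
            Some("input is not valid UTF-8; decoded lossily".to_string()),
        ),
    }
}

fn main() {
    let (text, warn) = decode_with_fallback(b"USUBJID,AGE\n001,34\n");
    assert!(warn.is_none());
    assert!(text.starts_with("USUBJID"));

    // 0xFF is never valid in UTF-8, so this path warns.
    let (_, warn) = decode_with_fallback(&[0xff, 0xfe, 0x41]);
    assert!(warn.is_some());
    println!("decode sketch ok");
}
```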

Testing

cargo test --package tss-ingest

Test Files

Located in mockdata/:

  • Various CSV formats
  • Different encodings
  • Edge cases

See Also

tss-standards

CDISC standards data loader crate.

Overview

tss-standards loads and provides access to embedded CDISC standard definitions.

Responsibilities

  • Load SDTM-IG definitions
  • Load controlled terminology
  • Provide domain/variable metadata
  • Version management

Dependencies

[dependencies]
serde = { version = "1", features = ["derive"] }
serde_json = "1"
include_dir = "0.7"
polars = { version = "0.46", features = ["lazy", "csv"] }

Architecture

Module Structure

tss-standards/
├── src/
│   ├── lib.rs
│   ├── loader.rs         # Data loading
│   ├── sdtm.rs           # SDTM definitions
│   ├── terminology.rs    # Controlled terminology
│   └── cache.rs          # In-memory caching

Embedded Data

Standards are embedded at compile time:

#![allow(unused)]
fn main() {
use include_dir::{include_dir, Dir};

static STANDARDS_DIR: Dir = include_dir!("$CARGO_MANIFEST_DIR/../standards");
}

Data Structures

SDTM Definitions

#![allow(unused)]
fn main() {
pub struct SdtmIg {
    pub version: String,
    pub domains: Vec<DomainDefinition>,
}

pub struct DomainDefinition {
    pub code: String,           // e.g., "DM"
    pub name: String,           // e.g., "Demographics"
    pub class: DomainClass,
    pub structure: String,
    pub variables: Vec<VariableDefinition>,
}

pub struct VariableDefinition {
    pub name: String,
    pub label: String,
    pub data_type: DataType,
    pub core: Core,             // Required/Expected/Permissible
    pub codelist: Option<String>,
    pub description: String,
}
}

Controlled Terminology

#![allow(unused)]
fn main() {
pub struct ControlledTerminology {
    pub version: String,
    pub codelists: Vec<Codelist>,
}

pub struct Codelist {
    pub code: String,           // e.g., "C66731"
    pub name: String,           // e.g., "Sex"
    pub extensible: bool,
    pub terms: Vec<Term>,
}

pub struct Term {
    pub code: String,
    pub value: String,
    pub synonyms: Vec<String>,
}
}

API

Loading Standards

#![allow(unused)]
fn main() {
use tss_standards::Standards;

// Load with specific versions
let standards = Standards::load(
    SdtmVersion::V3_4,
    CtVersion::V2024_12_20,
)?;

// Get domain definition
let dm = standards.get_domain("DM")?;

// Get codelist
let sex = standards.get_codelist("SEX")?;
}

Querying

#![allow(unused)]
fn main() {
// Get required variables for domain
let required = standards.required_variables("DM");

// Check if value is in codelist
let valid = standards.is_valid_term("SEX", "M");

// Get variable definition
let var = standards.get_variable("DM", "USUBJID")?;
}

Embedded Data Format

SDTM JSON

{
  "version": "3.4",
  "domains": [
    {
      "code": "DM",
      "name": "Demographics",
      "class": "SPECIAL_PURPOSE",
      "structure": "One record per subject",
      "variables": [
        {
          "name": "STUDYID",
          "label": "Study Identifier",
          "dataType": "Char",
          "core": "Required"
        }
      ]
    }
  ]
}

CT JSON

{
  "version": "2024-12-20",
  "codelists": [
    {
      "code": "C66731",
      "name": "Sex",
      "extensible": false,
      "terms": [
        {
          "code": "C16576",
          "value": "F"
        },
        {
          "code": "C20197",
          "value": "M"
        }
      ]
    }
  ]
}
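The JSON above maps onto the `Codelist`/`Term` structures, and term lookup reduces to a simple scan. This is a sketch of what a check like `is_valid_term` might do, not the actual tss-standards code; the extensible-means-anything-passes rule is an assumption for illustration.

```rust
// Reduced versions of the CT structures, for illustration only.
struct Term {
    value: String,
    synonyms: Vec<String>,
}

struct Codelist {
    extensible: bool,
    terms: Vec<Term>,
}

impl Codelist {
    // A submitted value is valid if it matches a term value or synonym;
    // extensible codelists additionally allow sponsor-defined values.
    fn is_valid_term(&self, value: &str) -> bool {
        self.extensible
            || self
                .terms
                .iter()
                .any(|t| t.value == value || t.synonyms.iter().any(|s| s == value))
    }
}

fn main() {
    let sex = Codelist {
        extensible: false,
        terms: vec![
            Term { value: "F".into(), synonyms: vec!["Female".into()] },
            Term { value: "M".into(), synonyms: vec!["Male".into()] },
        ],
    };
    assert!(sex.is_valid_term("M"));
    assert!(sex.is_valid_term("Female"));
    assert!(!sex.is_valid_term("X"));
    println!("codelist sketch ok");
}
```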

Caching

Standards are cached in memory after first load:

#![allow(unused)]
fn main() {
lazy_static! {
    static ref STANDARDS_CACHE: RwLock<Option<Standards>> = RwLock::new(None);
}
}

Testing

cargo test --package tss-standards

Test Categories

  • JSON parsing
  • Version loading
  • Query accuracy
  • Missing data handling

See Also

tss-updater

Application update mechanism crate.

Overview

tss-updater checks for and applies application updates from GitHub releases.

Responsibilities

  • Check for new versions
  • Download updates
  • Verify checksums
  • Apply updates (platform-specific)

Dependencies

[dependencies]
reqwest = { version = "0.12", features = ["json"] }
semver = "1"
serde = { version = "1", features = ["derive"] }
serde_json = "1"
sha2 = "0.10"

Architecture

Module Structure

tss-updater/
├── src/
│   ├── lib.rs
│   ├── checker.rs       # Version checking
│   ├── downloader.rs    # Download handling
│   ├── verifier.rs      # Checksum verification
│   └── installer.rs     # Update installation

Update Flow

┌─────────────────┐
│ Check Version   │
│ (GitHub API)    │
└────────┬────────┘
         │ New version?
         ▼
┌─────────────────┐
│ Download Asset  │
│ (Release file)  │
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│ Verify Checksum │
│ (SHA256)        │
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│ Install Update  │
│ (Platform)      │
└─────────────────┘

API

Checking for Updates

#![allow(unused)]
fn main() {
use tss_updater::{UpdateChecker, UpdateInfo};

let checker = UpdateChecker::new("rubentalstra", "Trial-Submission-Studio");

match checker.check_for_updates(current_version)? {
    Some(update) => {
        println!("New version available: {}", update.version);
        println!("Release notes: {}", update.notes);
    }
    None => {
        println!("You're up to date!");
    }
}
}

Update Info

#![allow(unused)]
fn main() {
pub struct UpdateInfo {
    pub version: Version,
    pub notes: String,
    pub download_url: String,
    pub checksum_url: String,
    pub published_at: DateTime<Utc>,
}
}

Downloading

#![allow(unused)]
fn main() {
use tss_updater::Downloader;

let downloader = Downloader::new();
let progress_callback = |percent| {
    println!("Download: {}%", percent);
};

downloader.download(&update.download_url, &temp_path, progress_callback)?;
}

Verification

#![allow(unused)]
fn main() {
use tss_updater::Verifier;

let verifier = Verifier::new();
let expected_hash = verifier.fetch_checksum(&update.checksum_url)?;

if verifier.verify_file(&temp_path, &expected_hash)? {
    println!("Checksum verified!");
} else {
    return Err(UpdateError::ChecksumMismatch);
}
}

Platform-Specific Installation

macOS

Uses tss-updater-helper for atomic bundle swap:

  1. Download new app bundle to temp directory
  2. Spawn tss-updater-helper with config
  3. Main app exits
  4. Helper performs atomic swap and relaunches

See tss-updater-helper for details.

Windows

  1. Extract to temp location
  2. Schedule replacement on restart
  3. Restart application

Linux

  1. Extract new binary
  2. Replace existing binary
  3. Restart application
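The Linux steps amount to a stage-and-rename: copy the new binary next to the current one, then rename over it, relying on `fs::rename` being atomic when both paths are on the same filesystem. A minimal sketch (names are illustrative, and the restart step is omitted):

```rust
use std::fs;
use std::io;
use std::path::Path;

// Stage the new binary beside the current one, then atomically rename
// over it, so readers see either the old file or the new one, never a
// half-written binary.
fn replace_binary(new_binary: &Path, current: &Path) -> io::Result<()> {
    let staged = current.with_extension("new");
    fs::copy(new_binary, &staged)?;
    fs::rename(&staged, current)?;
    Ok(())
}

fn main() -> io::Result<()> {
    // Demonstrate with plain files in a temp directory.
    let dir = std::env::temp_dir().join("tss-swap-demo");
    fs::create_dir_all(&dir)?;
    let current = dir.join("app");
    let incoming = dir.join("incoming");
    fs::write(&current, b"old")?;
    fs::write(&incoming, b"new")?;

    replace_binary(&incoming, &current)?;
    assert_eq!(fs::read(&current)?, b"new");
    println!("swap sketch ok");
    Ok(())
}
```

A real updater would follow the rename with a relaunch (e.g. spawning the updated binary) and would also preserve the executable permission bits.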

Security

HTTPS Only

All connections use HTTPS:

  • GitHub API
  • Release downloads
  • Checksum files

Checksum Verification

SHA256 checksums verified before installation.

Signed Releases

(Future) Code signing verification for releases.

Configuration

Update Settings

#![allow(unused)]
fn main() {
pub struct UpdateConfig {
    pub check_on_startup: bool,
    pub auto_download: bool,
    pub prerelease: bool,  // Include prereleases
}
}

Default Behavior

  • Check on startup (with delay)
  • Notify user, don’t auto-install
  • Stable releases only

Error Handling

#![allow(unused)]
fn main() {
#[derive(Error, Debug)]
pub enum UpdateError {
    #[error("Network error: {0}")]
    Network(#[from] reqwest::Error),

    #[error("Checksum mismatch")]
    ChecksumMismatch,

    #[error("Installation failed: {0}")]
    InstallFailed(String),
}
}

Testing

cargo test --package tss-updater

Test Strategy

  • Mock HTTP responses
  • Checksum calculation tests
  • Version comparison tests
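Version comparison can be tested without any HTTP mocking. The crate uses semver, but the core ordering logic can be sketched std-only; `parse_version` and `is_newer` are illustrative names, not the actual API.

```rust
// Parse "MAJOR.MINOR.PATCH" (optionally "v"-prefixed) into a
// lexicographically comparable tuple. Returns None for anything that
// is not three numeric components.
fn parse_version(v: &str) -> Option<(u64, u64, u64)> {
    let mut parts = v.trim_start_matches('v').splitn(3, '.');
    let major = parts.next()?.parse().ok()?;
    let minor = parts.next()?.parse().ok()?;
    let patch = parts.next()?.parse().ok()?;
    Some((major, minor, patch))
}

fn is_newer(candidate: &str, current: &str) -> bool {
    // Tuple comparison gives the usual semver precedence for the
    // numeric core (pre-release tags are ignored in this sketch).
    match (parse_version(candidate), parse_version(current)) {
        (Some(c), Some(cur)) => c > cur,
        _ => false,
    }
}

fn main() {
    assert!(is_newer("0.2.0", "0.1.9"));
    assert!(!is_newer("0.1.0", "0.1.0"));
    assert!(is_newer("v1.0.0", "0.9.12"));
    assert!(!is_newer("not-a-version", "0.1.0"));
    println!("version sketch ok");
}
```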

See Also

tss-updater-helper

macOS update helper binary for app bundle swapping.

Overview

tss-updater-helper is a minimal helper binary that performs the actual app bundle swap during updates on macOS. It is spawned by the main application after downloading an update, allowing the main app to exit while the update is applied.

This crate is macOS-only - on other platforms, it compiles to a no-op binary that exits immediately.

Why a Separate Binary?

On macOS, applications are distributed as .app bundles (directories). The main application cannot replace itself while running because:

  1. The executable is locked by the OS while running
  2. Bundle contents may be memory-mapped
  3. Code signing requires atomic bundle replacement

The helper binary solves this by:

  • Running as a separate process
  • Waiting for the parent app to exit
  • Performing the swap atomically
  • Relaunching the updated app

Architecture

Module Structure

tss-updater-helper/
├── src/
│   ├── main.rs          # Entry point, orchestration
│   ├── config.rs        # JSON config parsing
│   ├── launch.rs        # Parent wait, app relaunch
│   ├── log.rs           # File-based logging
│   ├── quarantine.rs    # Remove macOS quarantine
│   ├── signature.rs     # Code signature verification
│   ├── status.rs        # Status file for feedback
│   └── swap.rs          # Atomic bundle swap

Update Flow

sequenceDiagram
    participant App as Main App
    participant Helper as tss-updater-helper
    participant FS as File System

    App->>FS: Download new .app to temp
    App->>FS: Write config JSON
    App->>Helper: Spawn with config path
    App->>App: Exit

    Helper->>Helper: Wait for parent exit
    Helper->>FS: Remove quarantine attribute
    Helper->>Helper: Verify code signature
    Helper->>FS: current.app → backup.app
    Helper->>FS: new.app → current.app
    Helper->>FS: Write status file
    Helper->>App: Relaunch via open command
    Helper->>FS: Delete backup.app

Dependencies

[target.'cfg(target_os = "macos")'.dependencies]
chrono = "0.4"
serde = { version = "1", features = ["derive"] }
serde_json = "1"

The crate has no dependencies on non-macOS platforms.

Configuration

The helper receives configuration via a JSON file (not stdin, to avoid race conditions):

#![allow(unused)]
fn main() {
pub struct HelperConfig {
    pub new_app_path: PathBuf,      // Path to downloaded .app
    pub current_app_path: PathBuf,   // Path to current .app
    pub parent_pid: u32,             // PID to wait for
    pub version: String,             // New version string
    pub previous_version: String,    // Current version string
}
}

Example config file:

{
  "new_app_path": "/tmp/TSS-update/Trial Submission Studio.app",
  "current_app_path": "/Applications/Trial Submission Studio.app",
  "parent_pid": 12345,
  "version": "0.1.0",
  "previous_version": "0.0.9"
}

Process Steps

1. Wait for Parent Exit

#![allow(unused)]
fn main() {
pub fn wait_for_parent(pid: u32) {
    // Poll until the process no longer exists
    // (process_exists is a liveness probe, e.g. `kill(pid, 0)` on macOS)
    while process_exists(pid) {
        std::thread::sleep(Duration::from_millis(100));
    }
}
}

2. Remove Quarantine

Downloaded files on macOS have a quarantine attribute that triggers Gatekeeper. The helper removes this:

#![allow(unused)]
fn main() {
pub fn remove_quarantine(path: &Path) -> Result<()> {
    Command::new("xattr")
        .args(["-rd", "com.apple.quarantine"])
        .arg(path)
        .output()?;
    Ok(())
}
}

3. Verify Code Signature

Before replacing the current app, verify the new bundle is properly signed:

#![allow(unused)]
fn main() {
pub fn verify_signature(path: &Path) -> Result<()> {
    let output = Command::new("codesign")
        .args(["--verify", "--deep", "--strict"])
        .arg(path)
        .output()?;

    if !output.status.success() {
        return Err(anyhow!("Code signature verification failed"));
    }
    Ok(())
}
}

4. Atomic Bundle Swap

The swap is performed atomically to prevent corruption:

#![allow(unused)]
fn main() {
pub fn swap_bundles(new: &Path, current: &Path) -> Result<SwapResult> {
    let backup = current.with_extension("app.backup");

    // Move current → backup
    fs::rename(current, &backup)?;

    // Move new → current
    fs::rename(new, current)?;

    Ok(SwapResult { backup_path: backup })
}
}

5. Status File

A status file is written for the relaunched app to display feedback:

#![allow(unused)]
fn main() {
pub struct UpdateStatus {
    pub success: bool,
    pub version: String,
    pub previous_version: String,
    pub error: Option<String>,
    pub log_path: PathBuf,
    pub timestamp: DateTime<Utc>,
}
}

Location: ~/Library/Application Support/Trial Submission Studio/update-status.json

6. Relaunch

The app is relaunched using macOS open command:

#![allow(unused)]
fn main() {
pub fn relaunch(app_path: &Path) -> Result<()> {
    Command::new("open")
        .arg("-a")
        .arg(app_path)
        .spawn()?;
    Ok(())
}
}

Logging

All operations are logged to a file for debugging:

~/Library/Logs/Trial Submission Studio/update-helper.log

Example log output:

[2024-01-15T10:30:00Z] Trial Submission Studio Update Helper started
[2024-01-15T10:30:00Z] Config file: /tmp/tss-update-config.json
[2024-01-15T10:30:00Z] Config loaded: HelperConfig { ... }
[2024-01-15T10:30:00Z] Paths validated
[2024-01-15T10:30:01Z] Parent process 12345 exited
[2024-01-15T10:30:01Z] Quarantine attribute removed
[2024-01-15T10:30:01Z] Code signature valid, Team ID: XXXXXXXXXX
[2024-01-15T10:30:02Z] Bundle swap complete, backup at: /Applications/Trial Submission Studio.app.backup
[2024-01-15T10:30:02Z] Status file written
[2024-01-15T10:30:02Z] Application relaunch command sent
[2024-01-15T10:30:02Z] Update complete!

Error Handling

On failure, the helper:

  1. Writes a failure status file with error details
  2. Does NOT attempt rollback (backup preserved for manual recovery)
  3. Logs the error
  4. Exits with non-zero code

The relaunched app reads the status file and displays appropriate feedback.

Building

# Build for current platform
cargo build --package tss-updater-helper --release

# The binary is only functional on macOS
# On other platforms, it compiles but exits immediately

Integration with tss-updater

The tss-updater crate spawns this helper during the update process:

#![allow(unused)]
fn main() {
// In tss-updater
let config_path = write_config_file(&config)?;

Command::new(helper_path)
    .arg(&config_path)
    .spawn()?;

// Parent app exits here
std::process::exit(0);
}

Security Considerations

  • Code signing: New bundles must pass codesign --verify
  • Quarantine removal: Only performed on verified bundles
  • Atomic swap: Prevents partial/corrupted installations
  • Backup preservation: Allows manual rollback if needed
  • Logging: Full audit trail for debugging

See Also

Design Decisions

Key architectural decisions and their rationale.

Why Rust?

Chosen: Rust

Rationale:

  • Memory safety without garbage collection
  • Performance comparable to C/C++
  • Type system catches errors at compile time
  • Cross-platform compilation to native binaries
  • Growing ecosystem for data processing

Alternatives Considered

| Language | Pros | Cons |
|----------|------|------|
| Python | Familiar, many libraries | Performance, distribution |
| Java | Cross-platform, mature | JVM dependency, startup time |
| C++ | Performance | Memory safety, complexity |
| Go | Simple, fast compilation | Less expressive types |

Why Iced for GUI?

Chosen: Iced 0.14.0

Rationale:

  • Elm architecture - Predictable state management with unidirectional data flow
  • Pure Rust - No FFI complexity, native performance
  • Cross-platform - macOS, Windows, Linux
  • Type-safe messages - Compile-time guarantees for all user interactions
  • Async-first - Built-in Task system for background operations
  • Multi-window - Native support for dialog windows

Architecture Benefits

flowchart LR
    subgraph "Elm Architecture"
        View["View<br/>(render UI)"]
        Message["Message<br/>(user action)"]
        Update["Update<br/>(handle message)"]
        State["State<br/>(app data)"]
    end

    View --> Message
    Message --> Update
    Update --> State
    State --> View

    style View fill:#4a90d9,color:#fff
    style State fill:#50c878,color:#fff

The Elm architecture ensures:

  • State is the single source of truth
  • All state changes flow through update()
  • Views are pure functions of state
  • Easy debugging and testing
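The loop can be shown as a toy, in plain Rust without Iced. The type and message names here are illustrative, not the actual tss-gui types; the point is that `update` is the only place state changes and `view` is a pure function of state.

```rust
// Minimal Elm-style loop: all mutation goes through update(),
// and view() only reads state.
#[derive(Default)]
struct State {
    selected_domain: Option<String>,
    validation_errors: usize,
}

enum Message {
    DomainSelected(String),
    ValidationFinished { errors: usize },
}

fn update(state: &mut State, message: Message) {
    match message {
        Message::DomainSelected(code) => state.selected_domain = Some(code),
        Message::ValidationFinished { errors } => state.validation_errors = errors,
    }
}

fn view(state: &State) -> String {
    format!(
        "domain={} errors={}",
        state.selected_domain.as_deref().unwrap_or("-"),
        state.validation_errors
    )
}

fn main() {
    let mut state = State::default();
    update(&mut state, Message::DomainSelected("DM".into()));
    update(&mut state, Message::ValidationFinished { errors: 2 });
    assert_eq!(view(&state), "domain=DM errors=2");
    println!("{}", view(&state));
}
```

Because messages are a closed enum, the compiler forces `update` to handle every user interaction, which is the "type-safe messages" benefit listed above.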

Alternatives Considered

| Framework | Pros | Cons |
|-----------|------|------|
| egui | Simple immediate mode, rapid prototyping | Harder state management at scale, no multi-window |
| Tauri | Web tech, flexible | Bundle size, two languages (Rust + JS) |
| GTK-rs | Native look | Platform differences, complex bindings |
| Qt | Mature, rich | License complexity, C++ bindings |

Why Polars for Data?

Chosen: Polars

Rationale:

  • Performance - Lazy evaluation, parallelism
  • Rust native - No Python dependency
  • DataFrame API - Familiar for data work
  • Memory efficient - Arrow-based

Alternatives Considered

| Library | Pros | Cons |
|---------|------|------|
| ndarray | Low-level control | More manual work |
| Arrow | Standard format | Fewer DataFrame features |
| Custom | Full control | Development time |

Why Embed Standards?

Chosen: Embedded CDISC data

Rationale:

  • Offline operation - No network dependency
  • Deterministic - Consistent across runs
  • Fast - No API latency
  • Regulatory - Audit trail

Alternatives Considered

| Approach | Pros | Cons |
|----------|------|------|
| API-based | Always current | Network required, latency |
| Download on demand | Smaller binary | Caching complexity |
| Plugin system | Flexible | Distribution complexity |

Workspace Architecture

Chosen: Multi-crate workspace

Rationale:

  • Separation of concerns - Clear boundaries
  • Parallel compilation - Faster builds
  • Selective testing - Test only changed crates
  • Reusability - Crates can be used independently

Crate Boundaries

| Crate | Principle |
|-------|-----------|
| tss-gui | UI only, delegates all processing to other crates |
| tss-submit | Core pipeline (map, normalize, validate, export) |
| tss-ingest | CSV parsing only, no transformation logic |
| tss-standards | Pure data loading, no transformation logic |
| tss-updater | Update mechanism, no UI dependencies |
| tss-updater-helper | macOS-only binary, minimal dependencies |

Data Processing Pipeline

Chosen: Lazy evaluation with checkpoints

Rationale:

  • Memory efficiency - Don’t load all data at once
  • Performance - Optimize query plans
  • Transparency - User sees intermediate results
  • Recoverability - Can resume from checkpoints

Pipeline Stages

flowchart LR
    subgraph Stage1[Import]
        I1[CSV File]
        I2[Schema Detection]
    end

    subgraph Stage2[Map]
        M1[Column Matching]
        M2[Type Conversion]
    end

    subgraph Stage3[Validate]
        V1[Structure Rules]
        V2[CT Validation]
        V3[Cross-Domain]
    end

    subgraph Stage4[Export]
        E1[XPT Generation]
        E2[XML Output]
    end

    I1 --> I2 --> M1 --> M2 --> V1 --> V2 --> V3 --> E1
    V3 --> E2
    V1 -.->|Errors| M1
    V2 -.->|Warnings| M1
    style I1 fill: #e8f4f8, stroke: #333
    style E1 fill: #d4edda, stroke: #333
    style E2 fill: #d4edda, stroke: #333

Validation Strategy

Chosen: Multi-level validation

Rationale:

  • Early feedback - Catch issues during mapping
  • Complete checking - Full validation before export
  • Severity levels - Error vs. warning vs. info
  • Actionable - Clear fix suggestions

Validation Levels

flowchart TB
    subgraph "Validation Layers"
        direction TB
        L1[Schema Validation<br/>File structure, encoding]
        L2[Mapping Validation<br/>Variable compatibility, types]
        L3[Content Validation<br/>CDISC compliance, CT checks]
        L4[Output Validation<br/>Format conformance, checksums]
    end

    IMPORT[Import] --> L1
    L1 --> MAP[Map]
    MAP --> L2
    L2 --> TRANSFORM[Transform]
    TRANSFORM --> L3
    L3 --> EXPORT[Export]
    EXPORT --> L4
    L4 --> OUTPUT[Output Files]
    L1 -.->|Schema Error| IMPORT
    L2 -.->|Type Mismatch| MAP
    L3 -.->|CT Error| TRANSFORM
    style L1 fill: #ffeeba, stroke: #333
    style L2 fill: #ffeeba, stroke: #333
    style L3 fill: #ffeeba, stroke: #333
    style L4 fill: #ffeeba, stroke: #333
    style OUTPUT fill: #d4edda, stroke: #333
| Level | When | Purpose |
|-------|------|---------|
| Schema | Import | File structure |
| Mapping | Map step | Variable compatibility |
| Content | Pre-export | CDISC compliance |
| Output | Export | Format conformance |

Error Handling

Chosen: Result types with context

Rationale:

  • No panics - Graceful error handling
  • Context - Where and why errors occurred
  • Recovery - Allow user to fix and continue
  • Logging - Full trace for debugging

Error Categories

| Category | Handling |
|----------|----------|
| User error | Display message, allow retry |
| Data error | Show affected rows, suggest fix |
| System error | Log, display generic message |
| Bug | Log with context, fail gracefully |

File Format Choices

XPT V5 as Default

Rationale:

  • FDA requirement for submissions
  • Maximum compatibility
  • Well-documented format

XPT V8 as Option

Rationale:

  • Longer variable names
  • Larger labels
  • Future-proofing

Security Considerations

Data Privacy

  • No cloud - All processing local
  • No telemetry - No usage data collection
  • No network - Works fully offline

Code Security

  • Dependency audit - Regular cargo audit
  • Minimal dependencies - Reduce attack surface
  • Memory safety - Rust’s guarantees

Performance Goals

Target Metrics

| Operation | Target |
|-----------|--------|
| Import 100K rows | < 2 seconds |
| Validation | < 5 seconds |
| Export to XPT | < 3 seconds |
| Application startup | < 1 second |

Optimization Strategies

  • Lazy evaluation
  • Parallel processing
  • Memory mapping for large files
  • Incremental validation

Future Considerations

Extensibility

The architecture supports future additions:

  • New CDISC standards (ADaM, SEND)
  • Additional output formats
  • Plugin system (potential)
  • CLI interface (potential)

Backward Compatibility

  • Configuration format versioning
  • Data migration paths
  • Deprecation warnings

Next Steps

Contributing: Getting Started

Thank you for your interest in contributing to Trial Submission Studio!

Ways to Contribute

Code Contributions

  • Bug fixes
  • New features
  • Performance improvements
  • Documentation updates

Non-Code Contributions

  • Bug reports
  • Feature requests
  • Documentation improvements
  • Testing and feedback
  • Helping other users

Before You Start

Prerequisites

  • Rust 1.92+ - Install via rustup
  • Git - For version control
  • Basic familiarity with Rust programming
  • (Optional) Understanding of CDISC SDTM standards

Read the Documentation

Familiarize yourself with:

Finding Issues to Work On

GitHub Issues

  1. Check GitHub Issues
  2. Look for labels:
    • good-first-issue - Great for newcomers
    • help-wanted - We’d love assistance
    • bug - Known issues to fix
    • enhancement - New features

Claiming an Issue

  1. Find an issue you want to work on
  2. Comment on the issue expressing interest
  3. Wait for maintainer feedback before starting
  4. Fork the repository
  5. Create a branch and start working

Contribution Workflow

Overview

flowchart LR
    A["Find Issue"] --> B["Comment"]
    B --> C["Fork"]
    C --> D["Branch"]
    D --> E["Code"]
    E --> F["Test"]
    F --> G["PR"]

    style A fill:#4a90d9,color:#fff
    style G fill:#50c878,color:#fff

Detailed Steps

  1. Find an issue (or create one)
  2. Comment to claim it
  3. Fork the repository
  4. Clone your fork
  5. Create a branch (feature/my-feature or fix/my-fix)
  6. Make changes
  7. Test your changes
  8. Commit with conventional commit messages
  9. Push to your fork
  10. Create a Pull Request

Communication

Where to Discuss

  • GitHub Issues - Bug reports, feature requests
  • GitHub Discussions - Questions, ideas, general discussion
  • Pull Requests - Code review discussion

Guidelines

  • Be respectful and constructive
  • Assume good intentions
  • Welcome newcomers
  • Focus on the code, not the person

Code of Conduct

Please read and follow our Code of Conduct.

Key points:

  • Be respectful and inclusive
  • Welcome newcomers
  • Focus on constructive feedback
  • Assume good intentions

Getting Help

Stuck on Something?

  1. Check existing documentation
  2. Search GitHub Issues/Discussions
  3. Ask in GitHub Discussions
  4. Open an issue with your question

Review Process

After submitting a PR:

  1. Automated checks run (CI)
  2. Maintainer reviews code
  3. Address any feedback
  4. Maintainer merges when ready

Recognition

Contributors are recognized in:

  • GitHub contributor list
  • Release notes (for significant contributions)
  • THIRD_PARTY_LICENSES.md (if adding dependencies)

Next Steps

Development Setup

Set up your development environment for contributing to Trial Submission Studio.

Prerequisites

Required

| Tool | Version | Purpose |
|------|---------|---------|
| Rust | 1.92+ | Programming language |
| Git | Any recent | Version control |

Optional

| Tool | Purpose |
|------|---------|
| cargo-about | License generation |
| cargo-watch | Auto-rebuild on changes |

Step 1: Fork and Clone

Fork on GitHub

  1. Go to Trial Submission Studio
  2. Click “Fork” in the top right
  3. Select your account

Clone Your Fork

git clone https://github.com/YOUR_USERNAME/trial-submission-studio.git
cd trial-submission-studio

Add Upstream Remote

git remote add upstream https://github.com/rubentalstra/Trial-Submission-Studio.git

Step 2: Install Rust

Using rustup

curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh

Verify Installation

rustup show

Expected output should show Rust 1.92 or higher.

Install Required Toolchain

rustup toolchain install stable
rustup component add rustfmt clippy

Step 3: Platform Dependencies

macOS

No additional dependencies required.

Linux (Ubuntu/Debian)

sudo apt-get update
sudo apt-get install -y libgtk-3-dev libxdo-dev

Windows

No additional dependencies required.

Step 4: Build the Project

Debug Build

cargo build

Release Build

cargo build --release

Check Build

cargo check

Step 5: Run the Application

cargo run --package tss-gui

Step 6: Run Tests

# All tests
cargo test

# Specific crate
cargo test --package tss-submit

# With output
cargo test -- --nocapture

Step 7: Run Lints

# Format check
cargo fmt --check

# Apply formatting
cargo fmt

# Clippy lints
cargo clippy -- -D warnings

IDE Setup

RustRover / IntelliJ IDEA

  1. Open the project folder
  2. Rust plugin auto-detects workspace
  3. Configure run configuration for tss-gui

VS Code

  1. Install the rust-analyzer extension
  2. Open the project folder
  3. The extension auto-configures the workspace

Recommended extensions:

  • rust-analyzer
  • Even Better TOML
  • Error Lens
  • GitLens

Project Structure

trial-submission-studio/
├── Cargo.toml              # Workspace config
├── crates/                 # All crates
│   ├── tss-gui/            # Main application (Iced 0.14.0)
│   ├── tss-submit/         # Mapping, normalization, validation, export
│   ├── tss-ingest/         # CSV loading
│   ├── tss-standards/      # CDISC standards loader
│   ├── tss-updater/        # Auto-update functionality
│   └── tss-updater-helper/ # macOS update helper
├── standards/              # Embedded CDISC data
├── mockdata/               # Test data
└── docs/                   # Documentation

Development Workflow

Create Feature Branch

git checkout main
git pull upstream main
git checkout -b feature/my-feature

Make Changes

  1. Edit code
  2. Run tests: cargo test
  3. Run lints: cargo clippy
  4. Format: cargo fmt

Commit Changes

git add .
git commit -m "feat: add my feature"

Push and Create PR

git push origin feature/my-feature

Then create PR on GitHub.

Useful Commands

| Command | Purpose |
|---------|---------|
| cargo build | Build debug |
| cargo build --release | Build release |
| cargo test | Run all tests |
| cargo test --package X | Test specific crate |
| cargo clippy | Run linter |
| cargo fmt | Format code |
| cargo doc --open | Generate docs |
| cargo run -p tss-gui | Run application |

Troubleshooting

Build Fails

  1. Ensure Rust 1.92+: rustup update stable
  2. Clean build: cargo clean && cargo build
  3. Check dependencies: cargo fetch

Tests Fail

  1. Run with output: cargo test -- --nocapture
  2. Run specific test: cargo test test_name
  3. Check test data in mockdata/

GUI Won’t Start

  1. Check platform dependencies installed
  2. Try release build: cargo run --release -p tss-gui
  3. Check logs for errors

Next Steps

Coding Standards

Code style and quality guidelines for Trial Submission Studio.

Rust Style

Formatting

Use rustfmt for all code formatting:

# Check formatting
cargo fmt --check

# Apply formatting
cargo fmt

Linting

All code must pass Clippy with no warnings:

cargo clippy -- -D warnings

Naming Conventions

Crates

  • Lowercase with hyphens: tss-submit, tss-ingest
  • Prefix with tss- for project crates

Modules

  • Lowercase with underscores: column_mapping.rs
  • Keep names short but descriptive

Functions

// Good - descriptive, snake_case
fn calculate_similarity(source: &str, target: &str) -> f64

// Good - verb-noun pattern
fn validate_domain(data: &DataFrame) -> Vec<ValidationResult>

// Avoid - too abbreviated
fn calc_sim(s: &str, t: &str) -> f64

Types

// Good - PascalCase, descriptive
struct ValidationResult {
    ...
}
enum DomainClass {...}

// Good - clear trait naming
trait ValidationRule { ... }

Constants

// Good - SCREAMING_SNAKE_CASE
const MAX_VARIABLE_LENGTH: usize = 8;
const DEFAULT_CONFIDENCE_THRESHOLD: f64 = 0.8;

Code Organization

File Structure

// 1. Module documentation
//! Module description

// 2. Imports (grouped)
use std::collections::HashMap;

use serde::{Deserialize, Serialize};

use crate::model::Variable;

// 3. Constants
const DEFAULT_VALUE: i32 = 0;

// 4. Type definitions
pub struct MyStruct {
    ...
}

// 5. Implementations
impl MyStruct { ... }

// 6. Functions
pub fn my_function() { ... }

// 7. Tests (at bottom or in separate file)
#[cfg(test)]
mod tests {
    ...
}

Import Organization

Group imports in this order:

  1. Standard library
  2. External crates
  3. Internal crates
  4. Current crate modules
use std::path::Path;

use polars::prelude::*;
use serde::Serialize;

use tss_model::Variable;

use crate::mapping::Mapping;

Error Handling

Use Result Types

// Good - explicit error handling
pub fn parse_file(path: &Path) -> Result<Data, ParseError> {
    let content = std::fs::read_to_string(path)?;
    parse_content(&content)
}

// Avoid - panicking on errors
pub fn parse_file(path: &Path) -> Data {
    let content = std::fs::read_to_string(path).unwrap(); // Don't do this
    parse_content(&content).expect("parse failed") // Or this
}

Custom Error Types

use thiserror::Error;

#[derive(Error, Debug)]
pub enum ValidationError {
    #[error("Missing required variable: {0}")]
    MissingVariable(String),

    #[error("Invalid value '{value}' for {variable}")]
    InvalidValue { variable: String, value: String },
}

Error Context

// Good - add context to errors
fs::read_to_string(path)
    .map_err(|e| ParseError::FileRead {
        path: path.to_path_buf(),
        source: e,
    })?;

Documentation

Public Items

All public items must be documented:

/// Validates data against SDTM rules.
///
/// # Arguments
///
/// * `data` - The DataFrame to validate
/// * `domain` - Target SDTM domain code
///
/// # Returns
///
/// Vector of validation results
///
/// # Example
///
/// ```
/// let results = validate(&data, "DM")?;
/// ```
pub fn validate(data: &DataFrame, domain: &str) -> Result<Vec<ValidationResult>> {
    // ...
}

Module Documentation

//! CSV ingestion and schema detection.
//!
//! This module provides functionality for loading CSV files
//! and automatically detecting their schema.

Testing

Test Organization

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_basic_case() {
        // Arrange
        let input = "test";

        // Act
        let result = process(input);

        // Assert
        assert_eq!(result, expected);
    }

    #[test]
    fn test_edge_case() {
        // ...
    }
}

Test Naming

// Good - descriptive test names
#[test]
fn parse_iso8601_date_returns_correct_value() { ... }

#[test]
fn validate_returns_error_for_missing_usubjid() { ... }

// Avoid - vague names
#[test]
fn test1() { ... }

Architecture Principles

Separation of Concerns

  • Keep business logic out of GUI code
  • I/O operations separate from data processing
  • Validation rules independent of data loading

Pure Functions

Prefer pure functions where possible:

// Good - pure function, easy to test
pub fn calculate_confidence(source: &str, target: &str) -> f64 {
    // No side effects, deterministic
}

// Use sparingly - side effects
pub fn log_and_calculate(source: &str, target: &str) -> f64 {
    tracing::info!("Calculating..."); // Side effect
    calculate_confidence(source, target)
}

Determinism

Output must be reproducible:

// Good - deterministic output
pub fn derive_sequence(data: &DataFrame, group_by: &[&str]) -> Vec<i32> {
    // Same input always produces same output
}

// Avoid - non-deterministic
pub fn derive_sequence_random(data: &DataFrame) -> Vec<i32> {
    // Uses random ordering - bad for regulatory compliance
}
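To make the principle concrete, here is a std-only sketch of deterministic per-subject sequence numbering (a hypothetical helper, not the project's actual DataFrame-based derivation): each group key gets 1-based numbers assigned in input order, so the same input always yields the same output.

```rust
use std::collections::HashMap;

/// Assign 1-based sequence numbers within each group key, in input
/// order. Purely deterministic: no randomness, no iteration over
/// unordered collections in the output path.
fn derive_sequence(keys: &[&str]) -> Vec<i32> {
    let mut counters: HashMap<&str, i32> = HashMap::new();
    keys.iter()
        .map(|&k| {
            let c = counters.entry(k).or_insert(0);
            *c += 1;
            *c
        })
        .collect()
}

fn main() {
    let seq = derive_sequence(&["SUBJ1", "SUBJ1", "SUBJ2", "SUBJ1"]);
    assert_eq!(seq, vec![1, 2, 1, 3]);
}
```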

Performance

Avoid Premature Optimization

Write clear code first, optimize if needed based on profiling.

Use Appropriate Data Structures

// Good - HashMap for lookups
let lookup: HashMap<String, Variable> = ...;

// Good - Vec for ordered data
let results: Vec<ValidationResult> = ...;

Lazy Evaluation

Use Polars lazy evaluation for large datasets:

let result = df.lazy()
    .filter(col("value").gt(lit(0)))
    .collect()?;

Next Steps

Testing

Testing guidelines for Trial Submission Studio contributions.

Test Types

Unit Tests

Test individual functions and methods:

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn normalize_column_name_removes_spaces() {
        let result = normalize_column_name("Patient Age");
        assert_eq!(result, "PATIENT_AGE");
    }
}
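An implementation consistent with that test might look like the following. This is an assumed body for illustration only; the real `normalize_column_name` is not shown in this guide and may differ.

```rust
/// Assumed implementation matching the test above: trim, uppercase,
/// and replace spaces with underscores (illustrative sketch only).
pub fn normalize_column_name(name: &str) -> String {
    name.trim().to_uppercase().replace(' ', "_")
}

fn main() {
    assert_eq!(normalize_column_name("Patient Age"), "PATIENT_AGE");
}
```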

Integration Tests

Test interactions between modules:

// tests/integration_test.rs
use tss_ingest::CsvReader;
use tss_validate::Validator;

#[test]
fn validate_imported_data() {
    let data = CsvReader::read("tests/data/sample.csv").unwrap();
    let results = Validator::validate(&data, "DM").unwrap();
    assert!(results.errors().is_empty());
}

Running Tests

All Tests

cargo test

Specific Crate

cargo test --package tss-submit

Specific Test

cargo test test_name

With Output

cargo test -- --nocapture

Release Mode

cargo test --release

Test Organization

File Structure

crates/tss-submit/
├── src/
│   ├── lib.rs
│   └── validate/
│       └── checks/
└── tests/
    ├── validation_test.rs
    └── data/
        └── sample_dm.csv

Inline Tests

For simple unit tests:

// src/normalize.rs

pub fn normalize(s: &str) -> String {
    s.trim().to_uppercase()
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_normalize() {
        assert_eq!(normalize("  hello  "), "HELLO");
    }
}

External Tests

For integration tests:

// tests/validation_integration.rs

use tss_validate::*;

#[test]
fn full_validation_workflow() {
    // Integration test code
}

Test Data

Location

Test data files are in:

  • mockdata/ - Shared test datasets
  • crates/*/tests/data/ - Crate-specific test data

Sample Data

STUDYID,DOMAIN,USUBJID,SUBJID,AGE,SEX
ABC123,DM,ABC123-001,001,45,M
ABC123,DM,ABC123-002,002,38,F

Sensitive Data

Never commit real clinical trial data. Use:

  • Synthetic/mock data only
  • Anonymized examples
  • Generated test cases

Writing Good Tests

Structure (AAA Pattern)

#[test]
fn test_validation_rule() {
    // Arrange - set up test data
    let data = create_test_dataframe();
    let validator = Validator::new();

    // Act - perform the operation
    let results = validator.validate(&data);

    // Assert - verify results
    assert_eq!(results.len(), 1);
    assert_eq!(results[0].severity, Severity::Error);
}

Descriptive Names

// Good
#[test]
fn returns_error_when_usubjid_is_missing() { ... }

#[test]
fn accepts_valid_iso8601_date_format() { ... }

// Avoid
#[test]
fn test1() { ... }

#[test]
fn it_works() { ... }

Test Edge Cases

#[test]
fn handles_empty_dataframe() { ... }

#[test]
fn handles_null_values() { ... }

#[test]
fn handles_unicode_characters() { ... }

#[test]
fn handles_maximum_length_values() { ... }

Test Error Conditions

#[test]
fn returns_error_for_invalid_input() {
    let result = process_file("nonexistent.csv");
    assert!(result.is_err());
}

#[test]
fn error_contains_helpful_message() {
    let err = process_file("bad.csv").unwrap_err();
    assert!(err.to_string().contains("parse error"));
}

CI Testing

Automated Checks

Every PR runs:

  1. cargo test - All tests
  2. cargo clippy - Linting
  3. cargo fmt --check - Formatting

Test Matrix

Tests run on:

  • Ubuntu (primary)
  • macOS (future)
  • Windows (future)

Test Coverage

Goal

Aim for high coverage on critical paths:

  • Validation rules
  • Data transformations
  • File I/O

Not Required

100% coverage isn’t required. Focus on:

  • Business logic
  • Error handling
  • Edge cases

Next Steps

Pull Requests

Guidelines for submitting pull requests to Trial Submission Studio.

Before Creating a PR

Complete Your Changes

  • Code compiles: cargo build
  • Tests pass: cargo test
  • Lints pass: cargo clippy -- -D warnings
  • Formatted: cargo fmt

Commit Guidelines

Conventional Commits

Use conventional commit format:

type(scope): description

[optional body]

[optional footer]

Types

Type        Description
feat        New feature
fix         Bug fix
docs        Documentation only
test        Adding/updating tests
refactor    Code refactoring
perf        Performance improvement
chore       Maintenance tasks

Examples

git commit -m "feat(validate): add CT validation for SEX variable"
git commit -m "fix(xpt): handle missing values correctly"
git commit -m "docs: update installation instructions"
git commit -m "test(map): add property tests for similarity"
git commit -m "refactor(ingest): simplify schema detection"

Keep PRs Focused

  • One feature or fix per PR
  • Small, reviewable changes
  • Don’t mix refactoring with features

Creating a PR

Push Your Branch

git push origin feature/my-feature

Open PR on GitHub

  1. Go to your fork on GitHub
  2. Click “Pull Request”
  3. Select your branch
  4. Fill in the template

PR Title

Use same format as commits:

feat(submit): add USUBJID cross-domain validation
fix(export): correct numeric precision for large values
docs: add API documentation for tss-submit

PR Description Template

## Summary

Brief description of what this PR does.

## Changes

- Added X
- Fixed Y
- Updated Z

## Testing

How was this tested?

- [ ] Unit tests added
- [ ] Manual testing performed
- [ ] Tested on: macOS / Windows / Linux

## Related Issues

Fixes #123
Related to #456

## Checklist

- [ ] Code compiles without warnings
- [ ] Tests pass
- [ ] Clippy passes
- [ ] Code is formatted
- [ ] Documentation updated (if needed)

Review Process

What Reviewers Look For

  1. Correctness - Does it work?
  2. Tests - Are changes tested?
  3. Style - Follows coding standards?
  4. Performance - Any concerns?
  5. Documentation - Updated if needed?

Responding to Feedback

  1. Address all comments
  2. Push additional commits
  3. Mark conversations resolved
  4. Request re-review when ready

Acceptable Responses

  • Fix the issue
  • Explain why it’s correct
  • Discuss alternative approaches
  • Agree to follow up in separate PR

After Merge

Clean Up

# Switch to main
git checkout main

# Update from upstream
git pull upstream main

# Delete local branch
git branch -d feature/my-feature

# Delete remote branch (optional, GitHub can auto-delete)
git push origin --delete feature/my-feature

Update Fork

git push origin main

PR Types

Feature PRs

  • Reference the issue or discussion
  • Include tests
  • Update documentation if user-facing

Bug Fix PRs

  • Reference the bug issue
  • Include regression test
  • Explain root cause if complex

Documentation PRs

  • No code changes required
  • Preview locally: mdbook serve
  • Check links work

Refactoring PRs

  • No behavior changes
  • All existing tests must pass
  • Add tests if coverage was low

Tips for Good PRs

Make Review Easy

  • Write clear descriptions
  • Add comments on complex code
  • Break large changes into steps

Be Patient

  • Reviews take time
  • Don’t ping repeatedly
  • Provide more context if asked

Learn from Feedback

  • Feedback improves code quality
  • Ask questions if unclear
  • Apply learnings to future PRs

Automated Checks

CI Pipeline

Every PR runs:

  1. Build - Compilation check
  2. Test - All tests
  3. Lint - Clippy
  4. Format - rustfmt

Required Checks

All checks must pass before merge.

Fixing Failed Checks

# If tests fail
cargo test

# If clippy fails
cargo clippy -- -D warnings

# If format fails
cargo fmt

Emergency Fixes

For critical bugs:

  1. Create PR with hotfix/ prefix
  2. Note urgency in description
  3. Request expedited review

Questions?

  • Ask in PR comments
  • Open a Discussion
  • Reference documentation

Next Steps

macOS Code Signing Setup

This guide explains how to set up Apple Developer certificates for signing and notarizing Trial Submission Studio releases.

Prerequisites

  • Active Apple Developer Program membership ($99/year)
  • macOS with Xcode Command Line Tools installed
  • Access to the GitHub repository settings (for adding secrets)

Step 1: Create Developer ID Application Certificate

1.1 Request Certificate from Apple

  1. Open Keychain Access (Applications → Utilities → Keychain Access)
  2. Go to Keychain Access → Certificate Assistant → Request a Certificate From a Certificate Authority
  3. Fill in:
    • Email Address: Your Apple ID email
    • Common Name: Your name
    • Request is: Saved to disk
  4. Save the .certSigningRequest file

1.2 Create Certificate in Apple Developer Portal

  1. Go to Apple Developer Certificates
  2. Click + to create a new certificate
  3. Select Developer ID Application (NOT “Developer ID Installer”)
  4. Upload your .certSigningRequest file
  5. Download the generated .cer file
  6. Double-click the .cer file to install it in Keychain Access

1.3 Verify Certificate Installation

Run this command to verify the certificate is installed:

security find-identity -v -p codesigning

You should see output like:

1) ABCDEF1234567890... "Developer ID Application: Your Name (TEAM_ID)"

Step 2: Export Certificate for GitHub Actions

2.1 Export as .p12

  1. Open Keychain Access
  2. Find your certificate: “Developer ID Application: Your Name”
  3. Right-click → Export
  4. Choose .p12 format
  5. Set a strong password (you’ll need this later)
  6. Save the file

2.2 Convert to Base64

base64 -i YourCertificate.p12 | pbcopy

This copies the base64-encoded certificate to your clipboard.

Step 3: Create App-Specific Password

Apple requires an app-specific password for notarization (not your regular Apple ID password).

  1. Go to Apple ID Account
  2. Sign in with your Apple ID
  3. Navigate to App-Specific Passwords
  4. Click Generate an app-specific password
  5. Label: “GitHub Actions Notarization”
  6. Copy the generated password (format: xxxx-xxxx-xxxx-xxxx)

Step 4: Find Your Team ID

  1. Go to Apple Developer Account
  2. Click Membership in the left sidebar
  3. Copy your Team ID (10-character alphanumeric string)

Step 5: Configure GitHub Secrets

Go to your repository’s Settings → Secrets and variables → Actions and add these 7 secrets:

Secret Name                               Description                             How to Get
APPLE_DEVELOPER_CERTIFICATE_P12_BASE64    Base64-encoded .p12 certificate         Step 2.2 output
APPLE_DEVELOPER_CERTIFICATE_PASSWORD      Password you set when exporting .p12    Step 2.1
APPLE_CODESIGN_IDENTITY                   Full certificate name                   security find-identity -v -p codesigning output
APPLE_NOTARIZATION_APPLE_ID               Your Apple ID email                     Your Apple Developer email
APPLE_NOTARIZATION_APP_PASSWORD           App-specific password                   Step 3 output
APPLE_DEVELOPER_TEAM_ID                   10-character Team ID                    Step 4
CI_KEYCHAIN_PASSWORD                      Random secure password                  Generate any secure string

Example Values

APPLE_CODESIGN_IDENTITY: Developer ID Application: Ruben Talstra (ABCD1234EF)
APPLE_DEVELOPER_TEAM_ID: ABCD1234EF
APPLE_NOTARIZATION_APPLE_ID: your.email@example.com

Local Development

Create App Bundle

cargo build --release
./scripts/macos/create-bundle.sh

Sign Locally (for testing)

./scripts/macos/sign-local.sh

Verify Bundle

./scripts/macos/verify-bundle.sh

Test Gatekeeper

./scripts/macos/test-gatekeeper.sh
open "Trial Submission Studio.app"

Troubleshooting

“No Developer ID Application certificate found”

Ensure the certificate is in your login keychain and not expired:

security find-identity -v -p codesigning

“The signature is invalid”

Re-sign with the --force flag:

codesign --force --options runtime --sign "Developer ID Application: ..." "Trial Submission Studio.app"

“Notarization failed”

Check the notarization log:

xcrun notarytool log <submission-id> --apple-id "..." --password "..." --team-id "..."

Common issues:

  • Missing hardened runtime (--options runtime)
  • Problematic entitlements (JIT, unsigned memory)
  • Unsigned nested code

Security Notes

  • Never commit certificates or passwords to the repository
  • Use GitHub’s encrypted secrets for all sensitive values
  • The app-specific password is NOT your Apple ID password
  • Rotate credentials if you suspect they’ve been compromised

Windows Code Signing Setup

This guide explains how to set up Windows code signing using SignPath Foundation for Trial Submission Studio releases.

Overview

Windows code signing uses Authenticode certificates to sign executables. This eliminates SmartScreen warnings (“Windows protected your PC”) and builds user trust.

We use SignPath Foundation which provides free code signing certificates for open source projects. The certificate is issued to SignPath Foundation, and they vouch for your project by verifying binaries are built from your open source repository.

Prerequisites

  • Open source project with an OSI-approved license
  • GitHub repository with automated builds
  • MFA enabled on both GitHub and SignPath accounts
  • At least one prior release of your application

Step 1: Apply to SignPath Foundation

1.1 Check Eligibility

Your project must meet these criteria:

  1. OSI-approved license - Must use an approved open source license (no dual-licensing)
  2. No malware - Must not contain malware or potentially unwanted programs
  3. Actively maintained - Must be under active maintenance
  4. Already released - Must have prior releases in the form that will be signed
  5. Documented - Functionality described on download page
  6. All team members use MFA - For both SignPath and GitHub
  7. Automated builds - Build process must be fully automated

1.2 Submit Application

  1. Go to signpath.org/apply
  2. Fill out the application form with your project details
  3. Link your GitHub repository
  4. Wait for approval (typically a few days)

1.3 After Approval

Once approved, you’ll receive:

  • Organization ID
  • Project slug
  • Access to the SignPath dashboard

Step 2: Install SignPath GitHub App

  1. Go to github.com/apps/signpath
  2. Click Install
  3. Select your repository
  4. Grant necessary permissions

Step 3: Configure SignPath Dashboard

3.1 Add GitHub as Trusted Build System

  1. Log in to app.signpath.io
  2. Navigate to your project
  3. Go to Trusted Build Systems
  4. Add GitHub.com as a trusted build system
  5. Link to your repository

3.2 Configure Artifact Format

  1. Go to Artifact Configurations
  2. Create a new configuration or use the default
  3. Set the root element to <zip-file> (GitHub packages artifacts as ZIP)
  4. Configure the PE file signing within the ZIP

Example artifact configuration:


<artifact-configuration xmlns="http://signpath.io/artifact-configuration/v1">
    <zip-file>
        <pe-file path="*.exe">
            <authenticode-sign/>
        </pe-file>
    </zip-file>
</artifact-configuration>

3.3 Create API Token

  1. Go to My Profile → API Tokens
  2. Click Create API Token
  3. Name: “GitHub Actions”
  4. Permissions: Submitter role for your project
  5. Copy the token (you won’t see it again!)

Step 4: Configure GitHub Secrets

Go to your repository’s Settings → Secrets and variables → Actions and add these 4 secrets:

Secret Name                     Description                             Where to Find
SIGNPATH_API_TOKEN              API token with submitter permissions    Step 3.3
SIGNPATH_ORGANIZATION_ID        Your organization ID                    SignPath dashboard URL or settings
SIGNPATH_PROJECT_SLUG           Project identifier                      SignPath project settings
SIGNPATH_SIGNING_POLICY_SLUG    Signing policy name                     SignPath project (typically “release-signing”)

Finding Your IDs

Organization ID: Look at your SignPath dashboard URL:

https://app.signpath.io/Web/YOUR_ORG_ID/...

Project Slug: Found in your project’s URL or settings page.

Signing Policy Slug: Usually release-signing for open source projects.

How It Works

When you push a tag to release:

  1. Build: GitHub Actions builds the unsigned .exe
  2. Upload: The unsigned artifact is uploaded to GitHub
  3. Submit: The SignPath action submits the artifact for signing
  4. Sign: SignPath signs the executable with their certificate
  5. Download: The signed artifact is downloaded back to the workflow
  6. Verify: The workflow verifies the signature is valid
  7. Release: The signed executable is included in the GitHub release

Verification

After signing, users can verify the signature:

Windows

Right-click the .exe → Properties → Digital Signatures tab

PowerShell

Get-AuthenticodeSignature "trial-submission-studio.exe"

The publisher will show as SignPath Foundation.

Troubleshooting

“Signing request rejected”

Check the SignPath dashboard for the rejection reason. Common issues:

  • Artifact format doesn’t match configuration
  • Missing permissions on API token
  • Project not linked to GitHub as trusted build system

“API token invalid”

  • Ensure the token has Submitter permissions
  • Check token hasn’t expired
  • Verify the token is for the correct organization

“Artifact not found”

  • Ensure the artifact is uploaded before the signing step
  • Check the artifact ID is correctly passed between steps
  • Verify artifact name matches what SignPath expects

SmartScreen still warns

After signing, SmartScreen warnings should disappear. If they persist:

  • The signature may need time to build reputation
  • Check the certificate is valid in Properties → Digital Signatures
  • Ensure users download from official GitHub releases

Security Notes

  • Never commit API tokens to the repository
  • Use GitHub’s encrypted secrets for all sensitive values
  • SignPath stores keys in HSM (Hardware Security Module)
  • The signing certificate is managed by SignPath Foundation
  • All signing requests are logged and auditable

Cost

SignPath Foundation is free for open source projects that meet the eligibility criteria. There are no hidden fees or limits for qualifying projects.

Resources

Code Signing Policy

Trial Submission Studio uses code signing to ensure authenticity and integrity of distributed binaries.

Attribution

Windows: Free code signing provided by SignPath.io, certificate by SignPath Foundation.

macOS: Signed and notarized with Apple Developer ID.

Linux: Unsigned (standard for AppImage distribution).

Team Roles

Per SignPath Foundation requirements, this project has a single maintainer:

Role        Member          Responsibility
Author      @rubentalstra   Source code ownership, trusted commits
Reviewer    @rubentalstra   Review all external contributions
Approver    @rubentalstra   Authorize signing requests

All external contributions (pull requests) are reviewed before merging. Only merged code is included in signed releases.

Privacy & Network Communication

See Privacy Policy for full details.

Summary: This application only connects to GitHub when you explicitly request an update check. No clinical data or personal information is ever transmitted.

Build Verification

All signed binaries are:

  • Built from source code in this repository
  • Compiled via GitHub Actions (auditable CI/CD)
  • Tagged releases with full git history
  • Verified with SLSA build provenance attestations

Security Requirements

  • MFA required for SignPath access
  • MFA recommended for GitHub access (best practice)
  • Private signing keys are HSM-protected (SignPath infrastructure)
  • All signing requests are logged and auditable

Verifying Signatures

Windows

Right-click the .exe file → Properties → Digital Signatures tab.

Or use PowerShell:

Get-AuthenticodeSignature "trial-submission-studio.exe"

The publisher should show SignPath Foundation.

macOS

codesign -dv --verbose=4 /Applications/Trial\ Submission\ Studio.app
spctl --assess -vvv /Applications/Trial\ Submission\ Studio.app

Reporting Issues

macOS Gatekeeper Issues

This guide helps resolve common issues when opening Trial Submission Studio on macOS.

“Trial Submission Studio is damaged and can’t be opened”

This error typically means the app is not properly signed or notarized by Apple.

For Users: Quick Fix

If you downloaded from our official GitHub releases and see this error:

  1. Open System SettingsPrivacy & Security
  2. Scroll down to the Security section
  3. Look for a message about “Trial Submission Studio” being blocked
  4. Click Open Anyway
  5. Confirm in the dialog that appears

For Developers: Root Causes

This error can occur when:

  1. App is not code signed - No Developer ID certificate was used
  2. App is not notarized - Apple’s notary service didn’t approve it
  3. Entitlements are too permissive - Certain entitlements can cause rejection
  4. GitHub secrets not configured - CI skipped signing due to missing secrets

“Apple cannot check it for malicious software”

This warning appears for apps that are signed but not notarized.

Workaround

  1. Right-click (or Control+click) the app
  2. Select Open from the context menu
  3. Click Open in the dialog

Note: On macOS Sequoia (15.0+), Control+click bypass no longer works. You must use System Settings → Privacy & Security → Open Anyway.

Verifying App Signature

To check if an app is properly signed:

# Check code signature
codesign --verify --deep --strict --verbose=2 "Trial Submission Studio.app"

# Check notarization
xcrun stapler validate "Trial Submission Studio.app"

# Check Gatekeeper assessment
spctl --assess --type execute --verbose=2 "Trial Submission Studio.app"

Expected output for a properly signed and notarized app:

  • valid on disk from codesign
  • The validate action worked! from stapler
  • accepted from spctl

Removing Quarantine Attribute

If you’re a developer testing the app, you can remove the quarantine attribute:

xattr -d com.apple.quarantine "Trial Submission Studio.app"

Warning: Only do this for apps you trust. This bypasses macOS security.

macOS Sequoia (15.0+) Changes

Apple significantly tightened Gatekeeper in macOS Sequoia:

  • Control+click bypass removed - The old workaround no longer works
  • New bypass path: System Settings → Privacy & Security → Open Anyway
  • Admin password required - You’ll need to authenticate twice
  • spctl --master-disable removed - Can’t globally disable Gatekeeper via terminal

This makes proper code signing and notarization more important than ever.

Reporting Issues

If you downloaded from our official releases and still have issues:

  1. Check the GitHub Releases page
  2. Ensure you downloaded the .dmg file (not the .zip)
  3. Report issues at GitHub Issues

Include:

  • macOS version (sw_vers)
  • Where you downloaded the app from
  • The exact error message
  • Output of codesign --verify --verbose=2 (if possible)

Frequently Asked Questions

Common questions about Trial Submission Studio.

General

What is Trial Submission Studio?

Trial Submission Studio is a free, open-source desktop application for transforming clinical trial source data (CSV) into CDISC-compliant formats like XPT for FDA submissions.

Is my data sent anywhere?

No. Your clinical trial data stays on your computer. Trial Submission Studio works completely offline - all CDISC standards are embedded in the application, and no data is transmitted over the network.

Is Trial Submission Studio free?

Yes! Trial Submission Studio is free and open source, licensed under the MIT License. You can use it commercially without any fees.

Which platforms are supported?

  • macOS (Apple Silicon and Intel)
  • Windows (x86_64 and ARM64)
  • Linux (x86_64)

CDISC Standards

Which CDISC standards are supported?

Currently Supported:

  • SDTM-IG v3.4
  • Controlled Terminology (2024-2025 versions)

Planned:

  • ADaM-IG v1.3
  • SEND-IG v3.1.1

Can I use this for FDA submissions?

Not yet. Trial Submission Studio is currently in alpha development. Our goal is to generate FDA-compliant outputs, but until the software reaches stable release, all outputs should be validated by qualified regulatory professionals before submission.

How often is controlled terminology updated?

Controlled terminology updates are included in application releases. We aim to incorporate new CDISC CT versions within a reasonable time after their official release.

Technical

Do I need SAS installed?

No. Trial Submission Studio is completely standalone and does not require SAS or any other software. It generates XPT files natively.

What input formats are supported?

Currently, Trial Submission Studio supports CSV files as input. The CSV should have:

  • Headers in the first row
  • UTF-8 encoding (recommended)
  • Comma-separated values
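For illustration, a minimal std-only sketch of pulling column names from such a header row. The application itself uses a full CSV parser; this naive split does not handle quoted fields containing commas.

```rust
/// Split the first line of a CSV file into trimmed column names.
/// Illustrative only: a real parser must handle quoting and escapes.
fn header_columns(first_line: &str) -> Vec<String> {
    first_line
        .trim_end()
        .split(',')
        .map(|c| c.trim().to_string())
        .collect()
}

fn main() {
    let line = "STUDYID,DOMAIN,USUBJID,SUBJID,AGE,SEX";
    let cols = header_columns(line);
    assert_eq!(cols.len(), 6);
    assert_eq!(cols[2], "USUBJID");
}
```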

What output formats are available?

  • XPT V5 - FDA standard SAS Transport format
  • XPT V8 - Extended SAS Transport (longer names)
  • Dataset-XML - CDISC XML format
  • Define-XML 2.1 - Metadata documentation

How large a dataset can it handle?

Trial Submission Studio can handle datasets with hundreds of thousands of rows. For very large datasets (1M+ rows), ensure adequate RAM (8GB+) and consider processing in batches.

Usage

How does column mapping work?

Trial Submission Studio uses fuzzy matching to suggest mappings between your source column names and SDTM variables. It analyzes name similarity and provides confidence scores. You can accept suggestions or map manually.
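One common way to score name similarity, shown here purely as an illustration and not necessarily the algorithm the application uses, is the Dice coefficient over character bigrams, compared case-insensitively:

```rust
use std::collections::HashSet;

/// Case-insensitive set of character bigrams for a string.
fn bigrams(s: &str) -> HashSet<(char, char)> {
    let chars: Vec<char> = s.to_uppercase().chars().collect();
    chars.windows(2).map(|w| (w[0], w[1])).collect()
}

/// Dice coefficient: 1.0 for identical names, 0.0 for no shared bigrams.
/// A confidence threshold (e.g. 0.8) could gate automatic acceptance.
fn similarity(source: &str, target: &str) -> f64 {
    let (a, b) = (bigrams(source), bigrams(target));
    if a.is_empty() && b.is_empty() {
        return 1.0;
    }
    let overlap = a.intersection(&b).count() as f64;
    2.0 * overlap / (a.len() + b.len()) as f64
}

fn main() {
    assert!(similarity("AGE", "AGE") > 0.99);
    assert!(similarity("AGE", "SEX") < 0.5);
}
```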

What happens if validation fails?

Validation errors must be resolved before export. The validation panel shows:

  • Errors (red) - Must fix
  • Warnings (yellow) - Should review
  • Info (blue) - Informational

Each message includes the affected rows and suggestions for fixing.
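In code terms, the three severity tiers might be modeled like this (illustrative names only, not the application's actual types): export stays blocked while any error-level finding remains.

```rust
/// Hypothetical severity model mirroring the tiers described above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Severity {
    Info,    // blue   - informational
    Warning, // yellow - should review
    Error,   // red    - must fix
}

/// Export is blocked while any Error-level finding remains.
fn blocks_export(findings: &[Severity]) -> bool {
    findings.iter().any(|s| *s == Severity::Error)
}

fn main() {
    assert!(!blocks_export(&[Severity::Warning, Severity::Info]));
    assert!(blocks_export(&[Severity::Info, Severity::Error]));
}
```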

Can I save my mapping configuration?

Yes, you can save mapping templates and reuse them for similar datasets. This is useful when processing multiple studies with consistent source data structures.

Troubleshooting

The application won’t start on macOS

On first launch, macOS may block the application. Right-click and select “Open”, then click “Open” in the dialog to bypass Gatekeeper.

Import shows garbled characters

Your file may not be UTF-8 encoded. Open it in a text editor and save with UTF-8 encoding, then re-import.

Validation shows many errors

Common causes:

  1. Incorrect domain selection
  2. Wrong column mappings
  3. Data quality issues in source
  4. Controlled terminology mismatches

Review errors one by one, starting with mapping issues.

Export creates empty file

Ensure:

  1. Data is imported successfully
  2. Mappings are configured
  3. No blocking validation errors exist

Development

How can I contribute?

See our Contributing Guide for details. We welcome:

  • Bug reports
  • Feature requests
  • Code contributions
  • Documentation improvements

Where do I report bugs?

Open an issue on GitHub Issues.

Is there a roadmap?

Yes! See our Roadmap for planned features and development priorities.

More Questions?

Glossary

Terms and definitions used in Trial Submission Studio and CDISC standards.

A

ADaM

Analysis Data Model - CDISC standard for analysis-ready datasets derived from SDTM data.

ADSL

ADaM Subject-Level - ADaM dataset containing one record per subject with demographics and key variables.

B

BDS

Basic Data Structure - An ADaM structure used for parameter-based data like vital signs and lab results.

C

CDISC

Clinical Data Interchange Standards Consortium - Organization that develops global data standards for clinical research.

Codelist

A defined set of valid values for a variable. Also known as controlled terminology.

Controlled Terminology (CT)

Standardized sets of terms and codes published by CDISC for use in SDTM and ADaM datasets.

D

Dataset-XML

A CDISC standard XML format for representing tabular clinical data.

Define-XML

An XML standard for describing the structure and content of clinical trial datasets. Required for FDA submissions.

Domain

A logical grouping of SDTM data organized by observation type (e.g., DM for Demographics, AE for Adverse Events).

DM

Demographics - SDTM domain containing one record per subject with demographic information.

E

eCTD

Electronic Common Technical Document - Standard format for regulatory submissions.

F

FDA

Food and Drug Administration - US regulatory agency that requires CDISC standards for drug submissions.

Findings Class

SDTM observation class for collected measurements and test results (e.g., Labs, Vital Signs).

I

ISO 8601

International standard for date and time formats. SDTM uses ISO 8601: YYYY-MM-DD for dates and YYYY-MM-DDThh:mm:ss for date/times.
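
As an illustration, converting a source date to the ISO 8601 form SDTM expects (the MM/DD/YYYY input format is an assumption about your source data):

```python
from datetime import datetime

def to_iso8601_date(raw: str, source_format: str = "%m/%d/%Y") -> str:
    # Parse a source date string and reformat it as ISO 8601 (YYYY-MM-DD).
    return datetime.strptime(raw, source_format).strftime("%Y-%m-%d")

# to_iso8601_date("03/15/2024") -> "2024-03-15"
```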

Interventions Class

SDTM observation class for treatments given to subjects (e.g., Exposure, Concomitant Medications).

M

MedDRA

Medical Dictionary for Regulatory Activities - Standard medical terminology for adverse events.

Metadata

Data that describes other data. In Define-XML, metadata describes dataset structure and variable definitions.

O

ODM

Operational Data Model - CDISC standard for representing clinical data and metadata in XML.

P

PMDA

Pharmaceuticals and Medical Devices Agency - Japanese regulatory agency that requires CDISC standards.

S

SAS Transport (XPT)

File format for SAS datasets used for FDA submissions. See XPT.

SDTM

Study Data Tabulation Model - CDISC standard structure for organizing clinical trial data.

SDTM-IG

SDTM Implementation Guide - Detailed guidance for implementing SDTM, including variable definitions and business rules.

SEND

Standard for Exchange of Nonclinical Data - CDISC standard for nonclinical (animal) study data.

Special Purpose Domain

SDTM domains that don’t fit the standard observation classes (e.g., DM for Demographics, CO for Comments).

STUDYID

Standard SDTM variable containing the unique study identifier.

U

USUBJID

Unique Subject Identifier - Standard SDTM variable that uniquely identifies each subject across all studies.
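
One common convention (a sketch, not mandated by the SDTM-IG) derives USUBJID by concatenating study, site, and subject identifiers:

```python
def make_usubjid(studyid: str, siteid: str, subjid: str) -> str:
    # Common convention: STUDYID-SITEID-SUBJID. Any scheme is acceptable
    # as long as the result is unique across the entire submission.
    return f"{studyid}-{siteid}-{subjid}"
```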

V

Variable

An individual data element within a dataset. In SDTM, variables have standard names, labels, and data types.

X

XPT

SAS Transport Format - Binary file format used to transport SAS datasets. Required by FDA for data submissions.

XPT V5

Original SAS Transport format with 8-character variable names.

XPT V8

Extended SAS Transport format supporting 32-character variable names.

Symbols

--DTC Variables

SDTM timing variables containing dates/times in ISO 8601 format (e.g., AESTDTC, VSDTC). The two-hyphen prefix is a placeholder for the domain code.

--SEQ Variables

SDTM sequence variables providing unique record identifiers within a domain (e.g., AESEQ, VSSEQ).

--TESTCD Variables

SDTM test code variables in Findings domains (e.g., VSTESTCD, LBTESTCD).

Changelog

All notable changes to Trial Submission Studio.

The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.

Unreleased

Added

  • Initial mdBook documentation site
  • Comprehensive user guide
  • CDISC standards reference
  • Architecture documentation
  • Contributing guidelines

Changed

  • Updated documentation structure

Fixed

  • Various documentation improvements

0.0.1-alpha.1 - 2024-XX-XX

Added

Core Features

  • CSV file import with automatic schema detection
  • Column-to-SDTM variable mapping with fuzzy matching
  • XPT V5 and V8 export support
  • Basic SDTM validation
  • Controlled terminology validation

Standards Support

  • SDTM-IG v3.4 embedded
  • Controlled Terminology 2024 versions
  • Domain definitions for common SDTM domains

User Interface

  • Native desktop GUI (Iced 0.14.0)
  • Data preview grid
  • Mapping interface with suggestions
  • Validation results panel
  • Export options dialog

Platform Support

  • macOS (Apple Silicon)
  • macOS (Intel)
  • Windows (x86_64)
  • Windows (ARM64)
  • Linux (x86_64)

Known Issues

  • Alpha software - not for production use
  • ADaM support not yet implemented
  • SEND support not yet implemented
  • Dataset-XML export in progress
  • Define-XML export in progress

Version History

Version        Date  Status
0.0.1-alpha.1  TBD   Current

Release Notes Format

Each release includes:

  • Added - New features
  • Changed - Changes to existing features
  • Deprecated - Features to be removed
  • Removed - Removed features
  • Fixed - Bug fixes
  • Security - Security fixes

Getting Updates

Check for Updates

Update checks are user-initiated; Trial Submission Studio does not check for updates automatically (see the Privacy Policy). To update:

  1. Visit GitHub Releases
  2. Download the latest version for your platform
  3. Replace your existing installation

Update Notifications

When an update check finds a new version, you’ll see a notification in the application.

Reporting Issues

Found a bug or have a feature request?


Roadmap

Development plans for Trial Submission Studio.

Note

This roadmap reflects current plans and priorities. Items may change based on community feedback and project needs.

Current Focus

Features actively being developed:

  • Complete SDTM transformation pipeline
  • Dataset-XML export
  • Define-XML 2.1 generation
  • Comprehensive SDTM validation rules
  • Full export workflow

Short-term

Features planned for near-term development:

  • Batch processing (multiple domains)
  • Export templates and presets
  • Improved error messages and validation feedback
  • Session save/restore
  • Mapping templates (save and reuse mappings)

Medium-term

Features planned after core functionality is stable:

  • ADaM (Analysis Data Model) support
  • SUPP domain handling improvements
  • Custom validation rules
  • Report generation
  • Undo/redo functionality improvements

Long-term

Features for future consideration:

  • SEND (Standard for Exchange of Nonclinical Data) support
  • Batch CLI mode for automation
  • Define-XML import (reverse engineering)
  • Plugin system for custom transformations
  • Multi-study support

Completed

Features that have been implemented:

  • Core XPT read/write (V5 + V8)
  • CSV ingestion with schema detection
  • Fuzzy column mapping engine
  • Controlled Terminology validation
  • Desktop GUI (Iced 0.14.0)
  • SDTM-IG v3.4 standards embedded
  • Controlled Terminology (2024-2025)
  • Cross-platform support (macOS, Windows, Linux)

How to Contribute

We welcome contributions! See the Contributing Guide for details.

Working on Roadmap Items

If you’d like to work on a roadmap item:

  1. Check if there’s an existing GitHub Issue
  2. Comment to express interest
  3. Wait for maintainer feedback before starting work
  4. Follow the PR guidelines

Suggesting New Features

Have ideas for the roadmap?

  1. Check existing issues and discussions
  2. Open a new issue or discussion
  3. Describe the feature and use case
  4. Engage with community feedback

Prioritization

Features are prioritized based on:

  1. Regulatory compliance - FDA submission requirements
  2. User impact - Benefit to most users
  3. Complexity - Development effort required
  4. Dependencies - Prerequisites from other features
  5. Community feedback - Requested features

Versioning Plan

Version  Focus
0.1.0    Core SDTM workflow stable
0.2.0    Define-XML and Dataset-XML
0.3.0    ADaM support
1.0.0    Production ready

Stay Updated

Disclaimer

Important notices about Trial Submission Studio.

Alpha Software Notice

Warning

Trial Submission Studio is currently in alpha development.

This software is provided for evaluation and development purposes only. It is not yet suitable for production use in regulatory submissions.

What This Means

  • Features may be incomplete or change without notice
  • Bugs and unexpected behavior may occur
  • Data outputs should be independently validated
  • No guarantee of regulatory compliance

Not for Production Submissions

Do not use Trial Submission Studio outputs for actual FDA, PMDA, or other regulatory submissions until the software reaches stable release (version 1.0.0 or later).

Before Submission

All outputs from Trial Submission Studio should be:

  1. Validated by qualified regulatory professionals
  2. Verified against CDISC standards independently
  3. Reviewed for completeness and accuracy
  4. Tested with regulatory authority validation tools

Limitation of Liability

Trial Submission Studio is provided “as is” without warranty of any kind, express or implied. The authors and contributors:

  • Make no guarantees about output accuracy
  • Are not responsible for submission rejections
  • Cannot be held liable for regulatory issues
  • Do not provide regulatory consulting

See the full MIT License for complete terms.

CDISC Standards

Trial Submission Studio implements CDISC standards based on publicly available documentation:

  • SDTM-IG v3.4 - Study Data Tabulation Model Implementation Guide
  • Controlled Terminology - 2024-2025 versions

CDISC standards are developed by the Clinical Data Interchange Standards Consortium. Trial Submission Studio is not affiliated with or endorsed by CDISC.

Regulatory Guidance

This software does not constitute regulatory advice. For guidance on submission requirements, consult the relevant regulatory authority or qualified regulatory professionals.

Data Privacy

Trial Submission Studio:

  • Processes all clinical data locally on your computer
  • Does not collect usage analytics or telemetry
  • Does not transmit clinical data over the network

Network communication is limited to user-initiated update checks via the GitHub API. No clinical data or personal information is ever transmitted.

See our full Privacy Policy for details.

You are responsible for protecting any sensitive or confidential data processed with this software.

Reporting Issues

If you encounter problems:

  1. Do not rely on potentially incorrect outputs
  2. Report issues on GitHub
  3. Validate outputs through independent means

Future Stability

We are actively working toward a stable release. Progress can be tracked on our Roadmap.

Version  Status
0.x.x    Alpha - Not for production
1.0.0+   Stable - Production ready

Questions?

Code of Conduct

Our Pledge

We as members, contributors, and leaders pledge to make participation in our community a harassment-free experience for everyone, regardless of age, body size, visible or invisible disability, ethnicity, sex characteristics, gender identity and expression, level of experience, education, socio-economic status, nationality, personal appearance, race, religion, or sexual identity and orientation.

Our Standards

Examples of behavior that contributes to a positive environment:

  • Using welcoming and inclusive language
  • Being respectful of differing viewpoints and experiences
  • Gracefully accepting constructive criticism
  • Focusing on what is best for the community
  • Showing empathy towards other community members

Examples of unacceptable behavior:

  • The use of sexualized language or imagery and unwelcome sexual attention or advances
  • Trolling, insulting or derogatory comments, and personal or political attacks
  • Public or private harassment
  • Publishing others’ private information without explicit permission
  • Other conduct which could reasonably be considered inappropriate in a professional setting

Enforcement Responsibilities

Community leaders are responsible for clarifying and enforcing our standards of acceptable behavior and will take appropriate and fair corrective action in response to any behavior that they deem inappropriate, threatening, offensive, or harmful.

Scope

This Code of Conduct applies within all community spaces, and also applies when an individual is officially representing the community in public spaces.

Enforcement

Instances of abusive, harassing, or otherwise unacceptable behavior may be reported by opening an issue on the GitHub repository or contacting the project maintainers directly.

All complaints will be reviewed and investigated promptly and fairly.

Attribution

This Code of Conduct is adapted from the Contributor Covenant, version 2.1.

Privacy Policy

Trial Submission Studio is designed with privacy as a core principle.

Data Collection

We do not collect any data. Trial Submission Studio:

  • Does not collect usage analytics or telemetry
  • Does not track user behavior
  • Does not collect personal information
  • Does not access or transmit clinical trial data

Local Processing

All clinical data processing occurs entirely on your local computer:

  • Source files (CSV, XPT) are read locally
  • Transformations execute in local memory
  • Output files are written to local storage
  • No data is uploaded to any server

Network Communication

Trial Submission Studio connects to the internet only when you explicitly request it:

Action             Destination     Purpose
Check for Updates  api.github.com  Fetch latest release info
Download Update    github.com      Download new version

Important:

  • Update checks are user-initiated only (not automatic)
  • No clinical data is ever transmitted
  • No personal information is sent
  • All connections use TLS encryption

This complies with SignPath Foundation’s requirement:

“This program will not transfer any information to other networked systems unless specifically requested by the user.”
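
As a sketch of what such a user-initiated check involves, a single HTTPS request to the GitHub Releases API (the repository slug below is a placeholder, not the project’s actual slug):

```python
import json
import urllib.request

RELEASES_API = "https://api.github.com/repos/{repo}/releases/latest"

def latest_release_tag(repo: str) -> str:
    # One HTTPS GET to the GitHub Releases API; nothing else is sent.
    url = RELEASES_API.format(repo=repo)
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)["tag_name"]
```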

Third-Party Services

The only third-party service used is GitHub for:

  • Hosting releases and source code
  • Providing update information via GitHub Releases API

For GitHub’s data practices, see: GitHub Privacy Statement

Data Storage

Trial Submission Studio may store the following locally:

Data               Location             Purpose
User preferences   OS config directory  Remember settings
Recent files list  OS config directory  Quick access
Window state       OS config directory  Restore layout

Storage locations by platform:

  • Windows: %APPDATA%\trial-submission-studio\
  • macOS: ~/Library/Application Support/trial-submission-studio/
  • Linux: ~/.config/trial-submission-studio/
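
The per-platform lookup above can be sketched as follows (the directory names come from the list above; the logic is illustrative, not the application’s actual code):

```python
import os
import sys
from pathlib import Path

APP_NAME = "trial-submission-studio"

def config_dir() -> Path:
    # Resolve the OS-specific configuration directory listed above.
    if sys.platform == "win32":
        return Path(os.environ.get("APPDATA", "")) / APP_NAME
    if sys.platform == "darwin":
        return Path.home() / "Library" / "Application Support" / APP_NAME
    return Path.home() / ".config" / APP_NAME
```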

No clinical data is ever stored by the application itself.

Your Responsibilities

You are responsible for:

  • Protecting clinical data on your system
  • Compliance with HIPAA, GxP, and 21 CFR Part 11, as applicable
  • Secure storage of source and output files
  • Access control on your computer

Changes to This Policy

Changes will be documented in release notes and this file.

Contact

Questions about privacy: GitHub Discussions