Coding Standards
Code style and quality guidelines for Trial Submission Studio.
Rust Style
Formatting
Use rustfmt for all code formatting:
# Check formatting
cargo fmt --check
# Apply formatting
cargo fmt
Linting
All code must pass Clippy with no warnings:
cargo clippy -- -D warnings
Naming Conventions
Crates
- Lowercase with hyphens:
xport,tss-validate - Prefix with
tss-for project crates
Modules
- Lowercase with underscores:
column_mapping.rs - Keep names short but descriptive
Functions
#![allow(unused)]
fn main() {
// Good - descriptive, snake_case
fn calculate_similarity(source: &str, target: &str) -> f64
// Good - verb-noun pattern
fn validate_domain(data: &DataFrame) -> Vec<ValidationResult>
// Avoid - too abbreviated
fn calc_sim(s: &str, t: &str) -> f64
}
Types
#![allow(unused)]
fn main() {
// Good - PascalCase, descriptive
struct ValidationResult {
...
}
enum DomainClass {...}
// Good - clear trait naming
trait ValidationRule { ... }
}
Constants
#![allow(unused)]
fn main() {
// Good - SCREAMING_SNAKE_CASE
const MAX_VARIABLE_LENGTH: usize = 8;
const DEFAULT_CONFIDENCE_THRESHOLD: f64 = 0.8;
}
Code Organization
File Structure
#![allow(unused)]
fn main() {
// 1. Module documentation
//! Module description
// 2. Imports (grouped)
use std::collections::HashMap;
use serde::{Deserialize, Serialize};
use crate::model::Variable;
// 3. Constants
const DEFAULT_VALUE: i32 = 0;
// 4. Type definitions
pub struct MyStruct {
...
}
// 5. Implementations
impl MyStruct { ... }
// 6. Functions
pub fn my_function() { ... }
// 7. Tests (at bottom or in separate file)
#[cfg(test)]
mod tests {
...
}
}
Import Organization
Group imports in this order:
- Standard library
- External crates
- Internal crates
- Current crate modules
#![allow(unused)]
fn main() {
use std::path::Path;
use polars::prelude::*;
use serde::Serialize;
use tss_model::Variable;
use crate::mapping::Mapping;
}
Error Handling
Use Result Types
#![allow(unused)]
fn main() {
// Good - explicit error handling
pub fn parse_file(path: &Path) -> Result<Data, ParseError> {
let content = std::fs::read_to_string(path)?;
parse_content(&content)
}
// Avoid - panicking on errors
pub fn parse_file(path: &Path) -> Data {
let content = std::fs::read_to_string(path).unwrap(); // Don't do this
parse_content(&content).expect("parse failed") // Or this
}
}
Custom Error Types
#![allow(unused)]
fn main() {
use thiserror::Error;
#[derive(Error, Debug)]
pub enum ValidationError {
#[error("Missing required variable: {0}")]
MissingVariable(String),
#[error("Invalid value '{value}' for {variable}")]
InvalidValue { variable: String, value: String },
}
}
Error Context
#![allow(unused)]
fn main() {
// Good - add context to errors
fs::read_to_string(path)
.map_err( | e| ParseError::FileRead {
path: path.to_path_buf(),
source: e,
}) ?;
}
Documentation
Public Items
All public items must be documented:
#![allow(unused)]
fn main() {
/// Validates data against SDTM rules.
///
/// # Arguments
///
/// * `data` - The DataFrame to validate
/// * `domain` - Target SDTM domain code
///
/// # Returns
///
/// Vector of validation results
///
/// # Example
///
/// ```
/// let results = validate(&data, "DM")?;
/// ```
pub fn validate(data: &DataFrame, domain: &str) -> Result<Vec<ValidationResult>> {
// ...
}
}
Module Documentation
#![allow(unused)]
fn main() {
//! CSV ingestion and schema detection.
//!
//! This module provides functionality for loading CSV files
//! and automatically detecting their schema.
}
Testing
Test Organization
#![allow(unused)]
fn main() {
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn test_basic_case() {
// Arrange
let input = "test";
// Act
let result = process(input);
// Assert
assert_eq!(result, expected);
}
#[test]
fn test_edge_case() {
// ...
}
}
}
Test Naming
#![allow(unused)]
fn main() {
// Good - descriptive test names
#[test]
fn parse_iso8601_date_returns_correct_value() { ... }
#[test]
fn validate_returns_error_for_missing_usubjid() { ... }
// Avoid - vague names
#[test]
fn test1() { ... }
}
Architecture Principles
Separation of Concerns
- Keep business logic out of GUI code
- I/O operations separate from data processing
- Validation rules independent of data loading
Pure Functions
Prefer pure functions where possible:
#![allow(unused)]
fn main() {
// Good - pure function, easy to test
pub fn calculate_confidence(source: &str, target: &str) -> f64 {
// No side effects, deterministic
}
// Use sparingly - side effects
pub fn log_and_calculate(source: &str, target: &str) -> f64 {
tracing::info!("Calculating..."); // Side effect
calculate_confidence(source, target)
}
}
Determinism
Output must be reproducible:
#![allow(unused)]
fn main() {
// Good - deterministic output
pub fn derive_sequence(data: &DataFrame, group_by: &[&str]) -> Vec<i32> {
// Same input always produces same output
}
// Avoid - non-deterministic
pub fn derive_sequence_random(data: &DataFrame) -> Vec<i32> {
// Uses random ordering - bad for regulatory compliance
}
}
Performance
Avoid Premature Optimization
Write clear code first, optimize if needed based on profiling.
Use Appropriate Data Structures
#![allow(unused)]
fn main() {
// Good - HashMap for lookups
let lookup: HashMap<String, Variable> =...;
// Good - Vec for ordered data
let results: Vec<ValidationResult> =...;
}
Lazy Evaluation
Use Polars lazy evaluation for large datasets:
#![allow(unused)]
fn main() {
let result = df.lazy()
.filter(col("value").gt(lit(0)))
.collect() ?;
}
Next Steps
- Testing - Testing guidelines
- Pull Requests - PR process