Fuzzy column mapping engine crate.
tss-map provides intelligent matching between source columns and SDTM variables.
- Fuzzy string matching for column names
- Match confidence scoring
- Mapping suggestions
- Type compatibility checking
[dependencies]
rapidfuzz = "0.5"
tss-standards = { path = "../tss-standards" }
tss-model = { path = "../tss-model" }
tss-map/
├── src/
│ ├── lib.rs
│ ├── matcher.rs # Fuzzy matching logic
│ ├── scorer.rs # Confidence scoring
│ ├── mapping.rs # Mapping structures
│ └── suggestions.rs # Auto-suggestion engine
- Normalize names: case folding, removal of special characters
- Calculate similarity: several algorithms, combined
- Apply domain hints: boost matches relevant to the domain
- Score confidence: combine the factors into one score
- Rank suggestions: order by score
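The normalization step can be sketched as follows. This is a minimal illustration, not the crate's code: the function name `normalize_name` and the choice to join the surviving tokens with underscores are assumptions made here.

```rust
/// Hypothetical helper (not part of tss-map's documented API).
/// Upper-cases the name and collapses every run of non-alphanumeric
/// characters into a single underscore, dropping leading/trailing ones.
pub fn normalize_name(raw: &str) -> String {
    raw.split(|c: char| !c.is_alphanumeric())
        .filter(|part| !part.is_empty())
        .map(|part| part.to_uppercase())
        .collect::<Vec<_>>()
        .join("_")
}
```

With this sketch, `"  subject-id "` and `"Subject ID"` both normalize to `SUBJECT_ID`, so later similarity scoring is not distracted by case or punctuation.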
pub fn calculate_similarity(source: &str, target: &str) -> f64 {
    // Three complementary similarity measures, each treated as a 0-100 score
    let ratio = rapidfuzz::fuzz::ratio(source, target);
    let partial = rapidfuzz::fuzz::partial_ratio(source, target);
    let token_sort = rapidfuzz::fuzz::token_sort_ratio(source, target);

    // Weighted combination, rescaled to the 0.0-1.0 range
    (ratio * 0.4 + partial * 0.3 + token_sort * 0.3) / 100.0
}
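To make the weighting concrete, here is a std-only sketch that does not depend on rapidfuzz. It assumes each component score is on a 0-100 scale, as the division by 100 implies. `combine_scores`, `lcs_ratio`, and `token_sort_key` are hypothetical names; `lcs_ratio` is a simple stand-in based on the longest common subsequence, not the library's implementation.

```rust
/// Weighted combination of three 0-100 scores, rescaled to 0.0-1.0.
/// Uses the 0.4 / 0.3 / 0.3 weights from the function above.
pub fn combine_scores(ratio: f64, partial: f64, token_sort: f64) -> f64 {
    (ratio * 0.4 + partial * 0.3 + token_sort * 0.3) / 100.0
}

/// Stand-in similarity on a 0-100 scale: 200 * LCS(a, b) / (|a| + |b|).
pub fn lcs_ratio(a: &str, b: &str) -> f64 {
    let a: Vec<char> = a.chars().collect();
    let b: Vec<char> = b.chars().collect();
    if a.is_empty() && b.is_empty() {
        return 100.0;
    }
    // Rolling-row dynamic programme for the longest common subsequence.
    let mut prev = vec![0usize; b.len() + 1];
    for &ca in &a {
        let mut curr = vec![0usize; b.len() + 1];
        for (j, &cb) in b.iter().enumerate() {
            curr[j + 1] = if ca == cb {
                prev[j] + 1
            } else {
                prev[j + 1].max(curr[j])
            };
        }
        prev = curr;
    }
    200.0 * prev[b.len()] as f64 / (a.len() + b.len()) as f64
}

/// Sorts the tokens of a name so that word order stops mattering,
/// which is the idea behind a token-sort comparison.
pub fn token_sort_key(name: &str) -> String {
    let mut tokens: Vec<&str> = name
        .split(|c: char| c == '_' || c.is_whitespace())
        .filter(|t| !t.is_empty())
        .collect();
    tokens.sort_unstable();
    tokens.join(" ")
}
```

As a worked example, component scores of 90, 100, and 80 combine to (36 + 30 + 24) / 100 = 0.90. Under `token_sort_key`, `PATIENT_AGE` and `AGE_PATIENT` produce the same key, so a token-sort comparison treats them as identical.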
| Score | Level | Action |
|-------|-------|--------|
| > 0.80 | High | Auto-accept |
| 0.50-0.80 | Medium | Review |
| < 0.50 | Low | Manual |
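The thresholds in this table translate into a small classifier. The sketch below is illustrative only: `ConfidenceLevel` and `classify` are hypothetical names, and it assumes that a score of exactly 0.80 falls in the Medium band, since High is stated as strictly greater than 0.80.

```rust
/// Hypothetical enum mirroring the Level column of the table.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ConfidenceLevel {
    High,   // auto-accept
    Medium, // review
    Low,    // manual mapping
}

/// Maps a 0.0-1.0 confidence score to a level.
pub fn classify(score: f64) -> ConfidenceLevel {
    if score > 0.80 {
        ConfidenceLevel::High
    } else if score >= 0.50 {
        ConfidenceLevel::Medium
    } else {
        ConfidenceLevel::Low
    }
}
```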
use tss_map::{Matcher, MatchOptions};

let matcher = Matcher::new(&standards);
let suggestions = matcher.suggest_mappings(
    &source_columns,
    domain,
    MatchOptions::default(),
)?;

// Print each suggestion with its confidence as a percentage
for suggestion in suggestions {
    println!(
        "{} -> {} ({:.0}%)",
        suggestion.source,
        suggestion.target,
        suggestion.confidence * 100.0
    );
}
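pub struct Mapping {
    /// Column name in the source data
    pub source_column: String,
    /// SDTM variable the column maps to
    pub target_variable: String,
    /// Match confidence score
    pub confidence: f64,
    /// Whether a user has confirmed this mapping
    pub user_confirmed: bool,
}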
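One way such a record might be used is sketched below. The struct is redeclared locally so the example is self-contained. The `suggested`, `confirm`, and `is_final` helpers are hypothetical and not part of the documented API, and passing the auto-accept threshold as a parameter is an assumption.

```rust
#[derive(Debug, Clone, PartialEq)]
pub struct Mapping {
    pub source_column: String,
    pub target_variable: String,
    pub confidence: f64,
    pub user_confirmed: bool,
}

impl Mapping {
    /// Hypothetical constructor: a fresh suggestion starts unconfirmed.
    pub fn suggested(source: &str, target: &str, confidence: f64) -> Self {
        Mapping {
            source_column: source.to_string(),
            target_variable: target.to_string(),
            confidence,
            user_confirmed: false,
        }
    }

    /// Hypothetical helper: record that a user has accepted the mapping.
    pub fn confirm(&mut self) {
        self.user_confirmed = true;
    }

    /// Hypothetical helper: a mapping is final once the user confirms it
    /// or its confidence exceeds the auto-accept threshold.
    pub fn is_final(&self, auto_accept: f64) -> bool {
        self.user_confirmed || self.confidence > auto_accept
    }
}
```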
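pub struct MatchOptions {
    /// Minimum confidence a suggestion must reach
    pub min_confidence: f64,
    /// Maximum number of suggestions
    pub max_suggestions: usize,
    /// Whether type compatibility is taken into account
    pub consider_types: bool,
}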
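The following sketch shows how `min_confidence` and `max_suggestions` could shape a ranked suggestion list. It is an assumption about behavior, not the crate's implementation. `Candidate` and `apply_options` are hypothetical, the struct is redeclared locally for self-containment, and `consider_types` is not modeled.

```rust
pub struct MatchOptions {
    pub min_confidence: f64,
    pub max_suggestions: usize,
    pub consider_types: bool,
}

/// Hypothetical scored candidate for one source column.
#[derive(Debug, Clone, PartialEq)]
pub struct Candidate {
    pub target: String,
    pub confidence: f64,
}

/// Drops candidates below `min_confidence`, orders the rest by
/// descending confidence, and keeps at most `max_suggestions`.
pub fn apply_options(mut candidates: Vec<Candidate>, opts: &MatchOptions) -> Vec<Candidate> {
    candidates.retain(|c| c.confidence >= opts.min_confidence);
    candidates.sort_by(|a, b| b.confidence.total_cmp(&a.confidence));
    candidates.truncate(opts.max_suggestions);
    candidates
}
```

Filtering before sorting keeps the sort small, and `total_cmp` gives a total order on `f64` so the comparison cannot panic on unusual values.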
| Pattern | Domain | Boost |
|---------|--------|-------|
| *SUBJ* | All | +0.1 |
| *AGE* | DM | +0.15 |
| *TERM* | AE, MH | +0.15 |
| *TEST* | LB, VS | +0.15 |
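A possible reading of this table in code: each `*X*` pattern is treated as a case-insensitive substring test, and the boost is added to the base score. The names `Hint`, `HINTS`, `domain_boost`, and `apply_boost` are hypothetical. Summing multiple matching hints and clamping the result to 1.0 are assumptions of this sketch.

```rust
/// Hypothetical representation of one row of the hint table.
struct Hint {
    pattern: &'static str,            // substring between the `*` wildcards
    domains: &'static [&'static str], // empty slice means "all domains"
    boost: f64,
}

const HINTS: [Hint; 4] = [
    Hint { pattern: "SUBJ", domains: &[], boost: 0.10 },
    Hint { pattern: "AGE", domains: &["DM"], boost: 0.15 },
    Hint { pattern: "TERM", domains: &["AE", "MH"], boost: 0.15 },
    Hint { pattern: "TEST", domains: &["LB", "VS"], boost: 0.15 },
];

/// Total boost for a source column name within a given domain.
pub fn domain_boost(column: &str, domain: &str) -> f64 {
    let name = column.to_uppercase();
    HINTS
        .iter()
        .filter(|h| name.contains(h.pattern))
        .filter(|h| h.domains.is_empty() || h.domains.iter().any(|d| *d == domain))
        .map(|h| h.boost)
        .sum()
}

/// Adds the boost to a base score, clamped so it never exceeds 1.0.
pub fn apply_boost(base: f64, column: &str, domain: &str) -> f64 {
    (base + domain_boost(column, domain)).min(1.0)
}
```

Note that a plain substring test is broad: a column such as `DOSAGE` also contains `AGE`, so a real implementation may want token-aware matching.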
| Source Pattern | Target |
|----------------|--------|
| SUBJECT_ID | USUBJID |
| PATIENT_AGE | AGE |
| GENDER | SEX |
| VISIT_DATE | --DTC |
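These known pairs could be handled by a direct lookup before any fuzzy scoring. In the sketch below, `known_target` and `resolve_target` are hypothetical names. The `--` prefix follows the SDTM naming convention in which two hyphens stand for the two-character domain code, so the date target resolves per domain.

```rust
/// Hypothetical lookup of the pairs in the table (case-insensitive).
pub fn known_target(source: &str) -> Option<&'static str> {
    match source.to_uppercase().as_str() {
        "SUBJECT_ID" => Some("USUBJID"),
        "PATIENT_AGE" => Some("AGE"),
        "GENDER" => Some("SEX"),
        "VISIT_DATE" => Some("--DTC"),
        _ => None,
    }
}

/// Replaces a leading `--` placeholder with the domain code;
/// targets without the placeholder are returned unchanged.
pub fn resolve_target(template: &str, domain: &str) -> String {
    match template.strip_prefix("--") {
        Some(rest) => format!("{}{}", domain, rest),
        None => template.to_string(),
    }
}
```

For example, `VISIT_DATE` in the VS domain resolves to `VSDTC`, while `GENDER` maps to `SEX` regardless of domain.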
cargo test --package tss-map
- Exact match detection
- Fuzzy match accuracy
- Confidence scoring
- Domain-specific matching