XPT V5 Specification

The XPT V5 format is defined by the SAS Technical Note TS-140. This page provides a comprehensive overview of the format.

Format Overview

XPT V5 (also known as SAS Transport Version 5) is a binary file format with the following structure:

graph TB
    subgraph "XPT V5 File Structure"
        LH[Library Header<br/>80 bytes] --> FD[First Dataset]
        FD -->|" More datasets "| ND[Next Dataset...]
    end

    subgraph "Dataset Structure"
        MH[Member Header<br/>80 bytes] --> DH[DSCRPTR Header<br/>80 bytes]
        DH --> DD[Dataset Descriptor<br/>160 bytes]
        DD --> NH[NAMESTR Header<br/>80 bytes]
        NH --> NR[NAMESTR Records<br/>140 bytes × n]
        NR --> OH[OBS Header<br/>80 bytes]
        OH --> OD[Observation Data]
    end

Library Header

The file begins with a library header identifying the format:

| Offset | Size | Content |
|--------|------|---------|
| 0-79 | 80 | `HEADER RECORD*******LIBRARY HEADER RECORD!!!!!!!000000000000000000000000000000` followed by two spaces |
const LIBRARY_HEADER: &[u8; 80] =
    b"HEADER RECORD*******LIBRARY HEADER RECORD!!!!!!!000000000000000000000000000000  ";
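As a minimal sketch of how a reader might use this constant, the function below checks whether a buffer starts with the library header. The name `has_library_header` is hypothetical and is not taken from xportrs.

```rust
/// The 80-byte library header: a 48-byte tag, 30 zeros, and 2 trailing spaces.
const LIBRARY_HEADER: &[u8; 80] =
    b"HEADER RECORD*******LIBRARY HEADER RECORD!!!!!!!000000000000000000000000000000  ";

/// Returns true when `file` starts with the XPT V5 library header.
/// Hypothetical helper, for illustration only.
pub fn has_library_header(file: &[u8]) -> bool {
    file.starts_with(&LIBRARY_HEADER[..])
}
```

A buffer shorter than 80 bytes simply fails the check, so no separate length test is needed.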

Member Header

Each dataset (member) begins with a member header:

| Offset | Size | Content |
|--------|------|---------|
| 0-79 | 80 | `HEADER RECORD*******MEMBER  HEADER RECORD!!!!!!!000000000000000001600000000140` followed by two spaces |

The two numbers at the end of the record are both decimal:

  • 0160 = 160 bytes for the dataset descriptor (two 80-byte records)
  • 0140 = 140 bytes per NAMESTR record
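The two size fields can be read back out of the record with a few lines of code. The sketch below assumes the 48-byte tag is followed by a 20-digit and a 10-digit zero-padded decimal field, which is how the sample record above divides up (yielding 160 and 140). `MemberSizes` and `parse_member_header` are hypothetical names.

```rust
use std::ops::Range;

/// Sizes advertised by a member header record (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
pub struct MemberSizes {
    pub descriptor_len: usize,
    pub namestr_len: usize,
}

/// The fixed 48-byte tag; note the two spaces after MEMBER.
const MEMBER_PREFIX: &[u8] = b"HEADER RECORD*******MEMBER  HEADER RECORD!!!!!!!";

/// Parses the descriptor and NAMESTR sizes from an 80-byte member header.
/// Assumption: bytes 48..68 and 68..78 are zero-padded decimal numbers.
pub fn parse_member_header(record: &[u8]) -> Option<MemberSizes> {
    if record.len() != 80 || !record.starts_with(MEMBER_PREFIX) {
        return None;
    }
    // Decode one ASCII digit field as a decimal integer.
    let field = |range: Range<usize>| -> Option<usize> {
        std::str::from_utf8(&record[range]).ok()?.parse().ok()
    };
    Some(MemberSizes {
        descriptor_len: field(48..68)?,
        namestr_len: field(68..78)?,
    })
}
```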

Dataset Descriptor

The dataset descriptor contains:

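| Offset | Size | Field | Description |
|--------|------|-------|-------------|
| 0-7 | 8 | SAS | SAS |
| 8-15 | 8 | SAS | SAS |
| 16-23 | 8 | SASLIB | SASLIB |
| 24-31 | 8 | Version | 9.4 |
| 32-39 | 8 | OS | Operating system |
| 40-47 | 8 | Blanks | Padding |
| 48-63 | 16 | Created | ddMMMyy:hh:mm:ss |
| 64-79 | 16 | Modified | ddMMMyy:hh:mm:ss |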

Second Descriptor Record

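| Offset | Size | Field | Description |
|--------|------|-------|-------------|
| 0-7 | 8 | DSNAME | Dataset name |
| 8-15 | 8 | SASDATA | SASDATA |
| 16-23 | 8 | Version | 9.4 |
| 24-31 | 8 | OS | Operating system |
| 32-39 | 8 | Blanks | Padding |
| 40-79 | 40 | Label | Dataset label |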
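The 16-byte ddMMMyy:hh:mm:ss timestamps can be decoded independently of where they sit in a record. The following is a sketch only: `XptTimestamp` and `parse_timestamp` are hypothetical names, month abbreviations are assumed to be English, and the two-digit year is returned as stored because the field carries no century.

```rust
/// Components of a `ddMMMyy:hh:mm:ss` timestamp (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
pub struct XptTimestamp {
    pub day: u8,
    pub month: u8, // 1 = January
    pub year: u8,  // two-digit year exactly as stored
    pub hour: u8,
    pub minute: u8,
    pub second: u8,
}

/// Parses a 16-byte ASCII timestamp field such as `15MAR24:09:30:00`.
pub fn parse_timestamp(field: &[u8]) -> Option<XptTimestamp> {
    const MONTHS: [&str; 12] = [
        "JAN", "FEB", "MAR", "APR", "MAY", "JUN", "JUL", "AUG", "SEP", "OCT", "NOV", "DEC",
    ];
    if field.len() != 16 || !field.is_ascii() {
        return None;
    }
    if field[7] != b':' || field[10] != b':' || field[13] != b':' {
        return None;
    }
    let text = std::str::from_utf8(field).ok()?;
    // Every numeric component is exactly two digits wide.
    let num = |start: usize| text[start..start + 2].parse::<u8>().ok();
    let month = MONTHS.iter().position(|m| m.eq_ignore_ascii_case(&text[2..5]))? as u8 + 1;
    Some(XptTimestamp {
        day: num(0)?,
        month,
        year: num(5)?,
        hour: num(8)?,
        minute: num(11)?,
        second: num(14)?,
    })
}
```

The sketch does not range-check the components (for example, day 32 is accepted); a production parser would likely add that.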

NAMESTR Records

The NAMESTR header introduces the variable metadata:

| Offset | Size | Content |
|--------|------|---------|
| 0-53 | 54 | `HEADER RECORD*******NAMESTR HEADER RECORD!!!!!!!000000` |
| 54-57 | 4 | Number of variables (zero-padded) |
| 58-79 | 22 | Padding |

Each variable is described by a 140-byte NAMESTR record. See NAMESTR Records for detailed byte layout.
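Reading the variable count from the NAMESTR header is a matter of decoding four ASCII digits at offsets 54-57. A minimal sketch, with `variable_count` as a hypothetical helper name:

```rust
/// The fixed 48-byte tag that opens the NAMESTR header record.
const NAMESTR_PREFIX: &[u8] = b"HEADER RECORD*******NAMESTR HEADER RECORD!!!!!!!";

/// Size of one NAMESTR record in bytes.
pub const NAMESTR_LEN: usize = 140;

/// Returns the number of variables declared in an 80-byte NAMESTR header,
/// read from the zero-padded decimal field at offsets 54..58.
pub fn variable_count(record: &[u8]) -> Option<usize> {
    if record.len() != 80 || !record.starts_with(NAMESTR_PREFIX) {
        return None;
    }
    std::str::from_utf8(&record[54..58]).ok()?.parse().ok()
}
```

Multiplying the result by `NAMESTR_LEN` gives the unpadded size of the NAMESTR block that follows.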

Observation Data

The observation header introduces the data:

| Offset | Size | Content |
|--------|------|---------|
| 0-79 | 80 | `HEADER RECORD*******OBS     HEADER RECORD!!!!!!!000000000000000000000000000000` followed by two spaces |

After this, raw observation data follows in row-major order:

[Row 1: Var1][Row 1: Var2]...[Row 1: VarN]
[Row 2: Var1][Row 2: Var2]...[Row 2: VarN]
...
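Because rows are packed in row-major order with fixed-width fields, the byte range of any value follows from the variable lengths alone. The sketch below illustrates that arithmetic; `value_range` is a hypothetical helper, and `lengths` stands in for the per-variable lengths taken from the NAMESTR records.

```rust
use std::ops::Range;

/// Byte range of variable `var` in row `row`, relative to the start of
/// the observation data. `lengths[i]` is the width of variable `i`.
pub fn value_range(lengths: &[usize], row: usize, var: usize) -> Option<Range<usize>> {
    let len = *lengths.get(var)?;
    // Every row has the same length: the sum of all variable widths.
    let row_len: usize = lengths.iter().sum();
    // Skip whole rows, then the variables that precede `var` in this row.
    let start = row * row_len + lengths[..var].iter().sum::<usize>();
    Some(start..start + len)
}
```

For example, with widths 8, 3 and 8 each row is 19 bytes long, so the second variable of the second row occupies bytes 27 to 29.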

Numeric Variables

All numeric variables are stored as 8-byte IBM floating-point:

  • 8 bytes per value
  • Big-endian byte order
  • IBM base-16 exponent (not IEEE 754)
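To make the encoding concrete, here is a sketch of decoding one 8-byte IBM value into an `f64`. It assumes the standard IBM double layout (1 sign bit, a 7-bit excess-64 base-16 exponent, and a 56-bit fraction) and does not handle missing values, which need to be detected before calling it. `ibm_to_f64` is a hypothetical name, not an xportrs function.

```rust
/// Decodes a big-endian IBM hexadecimal double into an IEEE 754 `f64`.
/// Assumes missing values were filtered out beforehand; any input with a
/// zero fraction decodes to 0.0.
pub fn ibm_to_f64(bytes: [u8; 8]) -> f64 {
    let bits = u64::from_be_bytes(bytes);
    // Low 56 bits: the fraction, read as an integer.
    let fraction = bits & 0x00FF_FFFF_FFFF_FFFF;
    if fraction == 0 {
        return 0.0;
    }
    let sign = if bits >> 63 == 1 { -1.0 } else { 1.0 };
    // 7-bit exponent, base 16, stored with a bias of 64.
    let exponent = ((bits >> 56) & 0x7F) as i32 - 64;
    // value = fraction / 2^56 * 16^exponent = fraction * 2^(4 * exponent - 56).
    // The cast may round, since f64 keeps 53 significant bits and the
    // IBM fraction has 56.
    sign * (fraction as f64) * 2f64.powi(4 * exponent - 56)
}
```

For instance, the bytes `41 10 00 00 00 00 00 00` decode to 1.0: the exponent byte 0x41 means 16^1 and the fraction is 1/16.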

Character Variables

Character variables are stored as fixed-width text:

  • 1-200 bytes per value (as defined in NAMESTR)
  • Space-padded on the right
  • No null terminators
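The padding rules translate into a pair of small helpers. This is a sketch that assumes ASCII-only values; `encode_char` and `decode_char` are hypothetical names.

```rust
/// Encodes `value` into a fixed-width field, space-padded on the right.
/// Returns `None` if the value is not ASCII or does not fit in `width`.
pub fn encode_char(value: &str, width: usize) -> Option<Vec<u8>> {
    if !value.is_ascii() || value.len() > width {
        return None;
    }
    let mut out = value.as_bytes().to_vec();
    out.resize(width, b' ');
    Some(out)
}

/// Decodes a fixed-width field by dropping the trailing space padding.
/// Non-UTF-8 bytes are replaced rather than rejected.
pub fn decode_char(field: &[u8]) -> String {
    String::from_utf8_lossy(field).trim_end_matches(' ').to_string()
}
```

Note that trailing spaces in the original value cannot be distinguished from padding, so a round trip does not preserve them.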

Missing Values

| Type | Encoding |
|------|----------|
| Numeric missing (.) | 0x2E in first byte, zeros elsewhere |
| Numeric missing (.A-.Z) | 0x41-0x5A in first byte |
| Character missing | All spaces |
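A reader would typically test for these patterns before attempting a floating-point conversion. The sketch below classifies an 8-byte numeric field; it assumes that the special missing values .A-.Z also have zeros in the remaining seven bytes, which the table does not state explicitly. `Missing` and `missing_kind` are hypothetical names.

```rust
/// Kind of numeric missing value (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
pub enum Missing {
    /// The standard missing value `.`
    Dot,
    /// A special missing value `.A` through `.Z`
    Letter(char),
}

/// Returns the missing-value kind of an 8-byte numeric field, or `None`
/// if the field holds an ordinary number.
/// Assumption: every missing value has zeros in bytes 1..8.
pub fn missing_kind(bytes: &[u8; 8]) -> Option<Missing> {
    if bytes[1..].iter().any(|&b| b != 0) {
        return None;
    }
    match bytes[0] {
        0x2E => Some(Missing::Dot),
        b @ 0x41..=0x5A => Some(Missing::Letter(b as char)),
        _ => None,
    }
}
```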

Record Padding

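XPT uses 80-byte record alignment:

  • NAMESTR records: 140 bytes each, so individual records are not aligned
  • The NAMESTR block as a whole is padded to the next 80-byte boundary
  • Observation rows: packed back to back with no per-row padding (row_length × n bytes in total)
  • The file ends with space padding to an 80-byte boundary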
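The padding arithmetic is the same in every case: round the current length up to the next multiple of 80. A minimal sketch, with `padding_for` and `pad_to_record` as hypothetical helper names and space as the assumed fill byte:

```rust
/// Length of one XPT record in bytes.
const RECORD_LEN: usize = 80;

/// Number of padding bytes needed to bring `len` up to an 80-byte boundary.
pub fn padding_for(len: usize) -> usize {
    (RECORD_LEN - len % RECORD_LEN) % RECORD_LEN
}

/// Appends spaces to `buf` until its length is a multiple of 80.
pub fn pad_to_record(buf: &mut Vec<u8>) {
    let target = buf.len() + padding_for(buf.len());
    buf.resize(target, b' ');
}
```

For example, three NAMESTR records occupy 420 bytes and need 60 bytes of padding, while four occupy 560 bytes and need none.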

Version Differences

| Feature | V5 (TS-140) | V8+ |
|---------|-------------|-----|
| Variable name length | 8 bytes | 32 bytes |
| Label length | 40 bytes | 256 bytes |
| Number encoding | IBM float | IEEE 754 |
| Max observations | ~2 billion | Unlimited |
| Regulatory support | FDA/PMDA/NMPA | Limited |

[!IMPORTANT] For regulatory submissions, only V5 format is accepted. xportrs focuses on V5 compliance.
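The V5 name and label limits from the table are easy to enforce before writing a file. The check below is a sketch under the assumption that both limits are counted in bytes; `V5LimitError` and `check_v5_limits` are hypothetical names and do not describe the xportrs API.

```rust
/// V5 limits taken from the comparison table.
const V5_MAX_NAME: usize = 8;
const V5_MAX_LABEL: usize = 40;

/// Which V5 limit was exceeded, with the offending byte length.
#[derive(Debug, PartialEq, Eq)]
pub enum V5LimitError {
    NameTooLong(usize),
    LabelTooLong(usize),
}

/// Checks a variable name and label against the V5 byte limits.
pub fn check_v5_limits(name: &str, label: &str) -> Result<(), V5LimitError> {
    // `str::len` counts bytes, which matches limits expressed in bytes.
    if name.len() > V5_MAX_NAME {
        return Err(V5LimitError::NameTooLong(name.len()));
    }
    if label.len() > V5_MAX_LABEL {
        return Err(V5LimitError::LabelTooLong(label.len()));
    }
    Ok(())
}
```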

Official Specification

The authoritative source for XPT V5 format is:

SAS Technical Note TS-140: Record Layout of a SAS Version 5 or 6 Data Set in SAS Transport (XPORT) Format

Download PDF | View on SAS Support

Format Family

The Library of Congress maintains format documentation: