Unit 01 SDMX data transmission formats primer

SDMX data transmission formats

The SDMX standard has 12 different formats for transmitting statistical data:

| Type | Version of the standard | Format | Status |
|------|-------------------------|--------|--------|
| EDI  | SDMX 1.0       | SDMX-EDI GESMES/TS EDIFACT data message | current |
| XML  | SDMX 1.0 / 2.0 | SDMX-ML Generic (time-series) data message | obsolete |
| XML  | SDMX 1.0 / 2.0 | SDMX-ML Compact (time-series) data message | obsolete |
| XML  | SDMX 1.0 / 2.0 | SDMX-ML Utility (time-series) data message | obsolete |
| XML  | SDMX 1.0 / 2.0 | SDMX-ML Cross-Sectional data message | obsolete |
| XML  | SDMX 2.1       | SDMX-ML Generic data messages for observations, time-series and cross-sectional data | current |
| XML  | SDMX 2.1       | SDMX-ML Structure-Specific data messages for observations, time-series and cross-sectional data | current |
| XML  | SDMX 3.0       | SDMX-ML Structure-Specific data message | current |
| JSON | SDMX 2.1       | SDMX-JSON version 1 data message | current |
| JSON | SDMX 3.0       | SDMX-JSON version 2 data message | current |
| CSV  | SDMX 2.1       | SDMX-CSV version 1 data message | current |
| CSV  | SDMX 3.0       | SDMX-CSV version 2 data message | current |

Most of the formats dating from SDMX versions 1.0 and 2.0 are effectively obsolete; the exception is EDI, which remains in use.

Format use cases

The different formats suit some use cases better than others:

EDI:

  • terse, allowing smaller file sizes
  • but the GESMES/TS EDIFACT message does not support newer information model features introduced in SDMX 3.0, such as multi-value attributes

XML:

  • human readable, making it a good all-purpose format
  • but relatively verbose, resulting in large files that need compression for efficient transmission
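
The effect of compression on a verbose XML message can be illustrated with a minimal Python sketch. The element and attribute names below are made up for illustration and are not the actual SDMX-ML schema; the point is only that repetitive markup compresses well.

```python
import gzip

# Hypothetical, simplified XML: element and attribute names are illustrative
# assumptions, not the real SDMX-ML structure.
observations = "".join(
    f'<Obs TIME_PERIOD="{year}" OBS_VALUE="{year - 1999}.5"/>'
    for year in range(2000, 2100)
)
xml_text = (
    f'<DataSet><Series FREQ="A" REF_AREA="FR">{observations}</Series></DataSet>'
)

raw = xml_text.encode("utf-8")
compressed = gzip.compress(raw)

# Compare the size before and after compression.
print(f"raw: {len(raw)} bytes, gzip: {len(compressed)} bytes")

# Compression is lossless: the original message is fully recoverable.
restored = gzip.decompress(compressed)
```

The repeated tag and attribute names are what make the raw message large, and they are also what the compressor removes most effectively.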

JSON:

  • easily read and manipulated by JavaScript and similar languages, making it a good format for driving data publication websites and software tools
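
As a rough illustration of how little code is needed to consume JSON, here is a short Python sketch. The payload is a hypothetical, heavily simplified structure and does not follow the actual SDMX-JSON message layout; the field names are assumptions.

```python
import json

# Hypothetical payload: field names are illustrative, not real SDMX-JSON.
payload = """
{
  "series": [
    {"REF_AREA": "FR", "observations": {"2020": 1.5, "2021": 2.5}},
    {"REF_AREA": "DE", "observations": {"2020": 3.5, "2021": 4.5}}
  ]
}
"""

message = json.loads(payload)

# Pick the most recent observation for each series.
latest = {}
for series in message["series"]:
    observations = series["observations"]
    last_period = max(observations)  # periods are year strings, so they sort correctly
    latest[series["REF_AREA"]] = observations[last_period]

print(latest)
```

The parsed message is made of ordinary dictionaries and lists, so no format-specific library is needed to navigate it.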

CSV:

  • relatively easy to create using standard software tools and programming techniques
  • also easy to read using Excel, BI tools and similar software
  • however, flattening the data into a two-dimensional table makes datasets more verbose, because component values are repeated for each observation
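
The repetition caused by flattening can be seen in a minimal Python sketch. The component names (FREQ, REF_AREA, INDICATOR), the values and the column layout are assumptions chosen for illustration; this is not the exact SDMX-CSV column layout.

```python
import csv
import io

# Hypothetical series: the dimension values are stated once for the whole series.
series = {
    "dimensions": {"FREQ": "A", "REF_AREA": "FR", "INDICATOR": "POP"},
    "observations": {"2020": 1.0, "2021": 2.0, "2022": 3.0},
}


def flatten(series):
    """Return one row per observation, repeating the dimension values on each row."""
    rows = []
    for period, value in series["observations"].items():
        row = dict(series["dimensions"])
        row["TIME_PERIOD"] = period
        row["OBS_VALUE"] = value
        rows.append(row)
    return rows


def to_csv(rows):
    """Serialise the flattened rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=list(rows[0].keys()), lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


rows = flatten(series)
csv_text = to_csv(rows)
print(csv_text)
```

Each of the three rows carries the same FREQ, REF_AREA and INDICATOR values, so the cost of the repetition grows with the number of observations.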

In the next units

In the following units we’ll take a closer look at each of the formats in turn, starting with XML, which is the most commonly used.