The JSON-stat format is a simple lightweight JSON format for data dissemination. It is based on a cube model that arises from the evidence that the most common form of data dissemination is the tabular form. In this cube model, datasets are organized in dimensions. Dimensions are organized in categories.

Data dissemination is not the business of a few anymore. Even though the JSON-stat format can be the perfect companion for the open data initiatives of National Statistical Offices, it is suitable for all kinds of data disseminators because it has been designed with simplicity in mind.

Apart from reading the documentation, you can also learn the format by checking the sample files at your disposal. Besides these made-up files, the Statistics Norway API offers more than a hundred updated JSON-stat responses. In the United Kingdom, the Office for National Statistics API also provides JSON-stat responses.

You can browse the content of a JSON-stat response with the JSON-stat Format Viewer.

JSON-stat Format

v. 1.01

dataset ID

object free word required 1.00

Parents: None
Children: value, status, dimension, label, updated, source

A JSON-stat response can contain one or more datasets (or tables, or cubes). They are identified by an ID (a string). Multi-dataset responses allow a provider to disseminate information that shares few dimensions in a single response.

{
   "cpi*prod*201207" : {
      
   },
   "pob*counties*2012" : {
      
   }
}

The dataset ID can be any standardized string used by the provider and should be a parameter in the request or be computable from the request parameters. If a provider does not plan to support responses with multiple datasets, using specific IDs could be unnecessary: any fixed word would do the job. In such a case, the recommended ID is dataset.

{
   "dataset" : {
      
   }
}

value

array object reserved word required 1.00

Parents: dataset ID
Children: None

It contains the data sorted according to the dataset dimensions. It usually takes the form of an array where missing values are expressed as nulls.

{
   "dataset" : {
      "value" : [105.3, 104.3, null, 177.2, ]
   }
}

When too many cube cells are empty (sparse cube), the value array is populated with nulls.

{
   "dataset" : {
      "value" : [1.3587, null, null, null, null, null, null, null, null,
                null, null, null, null, null, null, null, null, null, 1.5849],
      
   }
}

To avoid this, the value property can take the form of an object.

{
   "dataset" : {
      "value" : { "0": 1.3587, "18": 1.5849 },
      
   }
}
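
A client that prefers to work with a single representation can expand the object form back into an array. The following is a minimal sketch, not part of the format: denseValues is a hypothetical helper name, and the total number of cells (19 here, as in the array example above) is assumed to be known from the dataset dimensions.

```javascript
// Hypothetical helper: expand an object-form "value" into a dense array.
// "length" is the total number of cells in the cube.
function denseValues(value, length) {
  if (Array.isArray(value)) return value;      // already in array form
  const dense = new Array(length).fill(null);  // missing cells are nulls
  for (const key of Object.keys(value)) {
    dense[Number(key)] = value[key];           // keys are cell positions
  }
  return dense;
}

const sparse = { "0": 1.3587, "18": 1.5849 };
const dense = denseValues(sparse, 19);
console.log(dense.length); // 19
console.log(dense[18]);    // 1.5849
console.log(dense[1]);     // null
```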

Value order follows the “What does not change, first” criterion. According to this criterion, values are ordered by combinations of dimension categories: the categories of the first dimensions in the id array are kept fixed while iterating through the categories of the last dimension, then the next-to-last dimension advances, and so forth.

For example, if we have three dimensions (A, B and C) with 3, 2 and 4 categories respectively, the values should be ordered iterating first by the 4 categories of C, then by the 2 categories of B and finally by the 3 categories of A:

A1B1C1   A1B1C2   A1B1C3   A1B1C4
A1B2C1   A1B2C2   A1B2C3   A1B2C4

A2B1C1   A2B1C2   A2B1C3   A2B1C4
A2B2C1   A2B2C2   A2B2C3   A2B2C4

A3B1C1   A3B1C2   A3B1C3   A3B1C4
A3B2C1   A3B2C2   A3B2C3   A3B2C4

The order.json file is provided as a JSON-stat example of value and dimension sorting.
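
The ordering rule can also be read as a formula: the position of a cell in value is computed from the position of its category in each dimension, with the last dimension varying fastest. The sketch below illustrates this with the A, B, C example; cellIndex is a hypothetical helper name and positions are zero-based.

```javascript
// Hypothetical helper: flat position of a cell in "value", given the
// zero-based position of its category in each dimension
// (in the same order as the "id" and "size" arrays).
function cellIndex(size, positions) {
  let index = 0;
  for (let d = 0; d < size.length; d++) {
    // The last dimension varies fastest.
    index = index * size[d] + positions[d];
  }
  return index;
}

// Dimensions A, B and C with 3, 2 and 4 categories.
const size = [3, 2, 4];
console.log(cellIndex(size, [0, 0, 0])); // A1B1C1 -> 0
console.log(cellIndex(size, [0, 1, 0])); // A1B2C1 -> 4
console.log(cellIndex(size, [2, 1, 3])); // A3B2C4 -> 23
```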

status

array object string reserved word optional 1.00

Parents: dataset ID
Children: None

It contains metadata at the observation level. When it takes the form of an array of the same size as value, it assigns a status to each value by position.

{
   "dataset" : {
      "value" : [100, null, 102, 103, 104],
      "status" : ["a", "m", "a", "a", "p"],
      
   }
}

To assign the same status to all data, an array of size 1 can be used.

{
   "dataset" : {
      "value" : [100, 99, 102, 103, 104],
      "status" : ["e"],
      
   }
}

For the same purpose, a string can be used (this is the recommended way to assign the same status to all data).

{
   "dataset" : {
      "value" : [100, 99, 102, 103, 104],
      "status" : "e",
      
   }
}

An object can also be used to provide status information for specific cells.

{
   "dataset" : {
      "value" : [100, null, 102, 103, 104],
      "status" : { "1" : "m" },
      
   }
}

Currently, “status” does not have a standard meaning nor a standard vocabulary. These are, for the moment, up to the provider. It can be used to optionally include any metadata information at the observation level, not only what is regularly known as “status”. If the vocabulary does not adhere to a standard, there is no way yet to assign meaning to the status codes in the same response.
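
Because status can be a string, an array or an object, a client needs a single way to read the status of a cell. The following is a minimal sketch of such a lookup; statusAt is a hypothetical helper name, and returning null when no status applies is an assumption of this example.

```javascript
// Hypothetical helper: status of the cell at position i, or null if none.
function statusAt(status, i) {
  if (status === undefined || status === null) return null; // "status" is optional
  if (typeof status === "string") return status;            // same status for all cells
  if (Array.isArray(status)) {
    // An array of size 1 applies to every cell; otherwise look up by position.
    return status.length === 1 ? status[0] : (status[i] ?? null);
  }
  return status[String(i)] ?? null;                          // object form: specific cells
}

console.log(statusAt(["a", "m", "a", "a", "p"], 1)); // "m"
console.log(statusAt(["e"], 3));                     // "e"
console.log(statusAt("e", 0));                       // "e"
console.log(statusAt({ "1": "m" }, 1));              // "m"
console.log(statusAt({ "1": "m" }, 0));              // null
```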

dimension

object reserved word required 1.00

Parents: dataset ID
Children: id, size, role, dimension ID

JSON-stat follows a cube model: the values are organized in cells, and a cell is the intersection of various dimensions. The dimension property contains information about the dimensions of the dataset.

{
   "dataset" : {
      "value" : [4729, 4832, 9561],
      "dimension" : {
         
      }
   }
}

id

array reserved word required 1.00

Parents: dimension
Children: None

It contains an ordered list of dimension IDs (strings).

"dimension" : {
   "id" : ["metric", "time", "geo", "sex"], 
   
}

Dimension IDs can be any string and have no special meaning in JSON-stat. Use role to assign a particular meaning to them.

dimension must have an object with the same name as every dimension in this array (see dimension ID).

size

array reserved word required 1.00

Parents: dimension
Children: None

It contains the number (integer) of categories (possible values) of each dimension. It has the same number of elements and in the same order as id.

"dimension" : {
   "id" : ["metric", "time", "geo", "sex"], 
   "size" : [1, 1, 1, 3], 
   
}

On this website, dimensions of size 1 (single category dimensions) are called constant dimensions. Their position in the id and size arrays is irrelevant because they do not affect the order of the values.

role

object reserved word optional 1.00

Parents: dimension
Children: time, geo and metric

It can be used to assign special roles to dimensions. At this moment, possible roles are: time, geo and metric. A role can be shared by several dimensions.

"dimension" : {
   "id" : ["concept", "arrivaldate", "departuredate", "origin", "destination"],
   "size" : [1, 24, 24, 10, 10],
   "role": {
      "time": ["arrivaldate", "departuredate"],
      "geo": ["origin", "destination"],
      "metric": ["concept"]
   },
   
}
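
A client can use role to find out what a given dimension represents. The following is a minimal sketch using the example above; rolesOf is a hypothetical helper name, and a dimension without any role simply yields an empty list.

```javascript
// Hypothetical helper: list the roles assigned to a dimension ID.
function rolesOf(dimension, dimId) {
  const role = dimension.role || {}; // "role" is optional
  return Object.keys(role).filter((name) => role[name].includes(dimId));
}

const dimension = {
  id: ["concept", "arrivaldate", "departuredate", "origin", "destination"],
  size: [1, 24, 24, 10, 10],
  role: {
    time: ["arrivaldate", "departuredate"],
    geo: ["origin", "destination"],
    metric: ["concept"]
  }
};

console.log(rolesOf(dimension, "origin"));      // the "geo" role
console.log(rolesOf(dimension, "arrivaldate")); // the "time" role
```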

time

array reserved word optional 1.00

Parents: role
Children: None

It can be used to assign a time role (when?) to one or more dimensions. It takes the form of an array of dimension IDs in which order does not have a special meaning.

"role": {
   "time": ["arrivaldate", "departuredate"]
}

geo

array reserved word optional 1.00

Parents: role
Children: None

It can be used to assign a spatial role (where?) to one or more dimensions. It takes the form of an array of dimension IDs in which order does not have a special meaning.

"role": {
   "geo": ["origin", "destination"]
}

metric

array reserved word optional 1.00

Parents: role
Children: None

It can be used to assign a metric role (what are we counting?) to one or more dimensions. It takes the form of an array of dimension IDs in which order does not have a special meaning.

"role": {
   "metric": ["concept"]
}

dimension ID

object free word required 1.00

Parents: dimension
Children: category, label

It is used to describe a particular dimension. The name of this object must be one of the strings in the id array. There must be one and only one dimension ID object for every dimension in the id array.

"dimension" : {
   "id" : ["metric", "time", "geo", "sex"],
   "size" : [1, 1, 1, 3],
   "metric" : {  }, 
   "time" : {  }, 
   "geo" : {  }, 
   "sex" : {  },
   
}
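
The constraints on id, size and the dimension ID objects can be checked mechanically. The following is a minimal sketch of such a check; checkDimensions is a hypothetical name, and the last test (an array-form value has as many items as the product of size) is an assumption derived from the cube model rather than a rule quoted from this page.

```javascript
// Hypothetical validator: returns a list of problems found in a dataset.
function checkDimensions(dataset) {
  const problems = [];
  const { id, size } = dataset.dimension;
  if (id.length !== size.length) {
    problems.push("id and size differ in length");
  }
  for (const dimId of id) {
    const dim = dataset.dimension[dimId];
    if (typeof dim !== "object" || dim === null) {
      problems.push("missing dimension object: " + dimId);
    }
  }
  // Assumption: the number of cells is the product of the sizes.
  const cells = size.reduce((a, b) => a * b, 1);
  if (Array.isArray(dataset.value) && dataset.value.length !== cells) {
    problems.push("value has " + dataset.value.length + " items, expected " + cells);
  }
  return problems;
}

const incomplete = {
  value: [4729, 4832, 9561],
  dimension: {
    id: ["metric", "time", "geo", "sex"],
    size: [1, 1, 1, 3],
    metric: {}, time: {}, geo: {} // "sex" is declared in id but not described
  }
};
console.log(checkDimensions(incomplete)); // reports the missing "sex" object
```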

category

object reserved word required 1.00

Parents: dimension ID
Children: index, label, child, coordinates, unit

It is used to describe the possible values of a dimension.

"sex" : {
   "category" : {  }
}

index

object array reserved word optional 1.00

Parents: category
Children: None

It is used to order the possible values (categories) of a dimension. The order of the categories and the order of the dimensions themselves determine the order of the data in the value array. While the dimensions’ order has only this functional role (and therefore any order chosen by the provider is valid), the categories’ order has also a presentation role: it is assumed that the categories are sorted in a meaningful order and that the consumer can rely on it when displaying the information. For example, categories in dimensions with a time role are assumed to be in chronological order.

To provide the category IDs and their order, an array can be used.

"sex" : {
   "category" : {
      "index" : ["M", "F", "T"]
   }
}

For efficiency reasons (see Arrays vs. Objects), an object that attaches a position to every ID can also be used (this is currently the recommended format).

"sex" : {
   "category" : {
      "index" : {
         "M" : 0,
         "F" : 1,
         "T" : 2
      }
   }
}

index is required unless the dimension is a constant dimension (dimension with a single category). When a dimension has only one category, the index property is indeed unnecessary. In the case that a category index is not provided, a category label must be included.
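
Since index can be an array, an object or (for constant dimensions) absent, a client may want to normalize it into an ordered list of category IDs. The following is a minimal sketch; categoryIds is a hypothetical helper name.

```javascript
// Hypothetical helper: category IDs of a dimension, in index order.
function categoryIds(category) {
  const index = category.index;
  if (Array.isArray(index)) return index.slice();   // array form: already ordered
  if (index && typeof index === "object") {
    // Object form: sort the IDs by their attached position.
    return Object.keys(index).sort((a, b) => index[a] - index[b]);
  }
  // No index: only valid for constant dimensions, which must carry a label.
  return Object.keys(category.label);
}

console.log(categoryIds({ index: ["M", "F", "T"] }));             // M, F, T
console.log(categoryIds({ index: { "T": 2, "M": 0, "F": 1 } }));  // M, F, T
console.log(categoryIds({ label: { "pop": "Population" } }));     // pop
```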

label

string object reserved word optional 1.00

Parents: dataset ID, dimension ID, category
Children: None

It is used to assign a very short (one line) descriptive text to IDs at different levels of the response tree. It is language-dependent.

When it is a child of dataset ID or dimension ID, it is a string.

{
   "dataset" : {
      "label" : "Tuvalu population by sex in the 2002 Census",
      "dimension" : {
         "sex" : {
            "label" : "Sex",
            
         },
         
      }
   }
}

When it is a child of category, it is an object where the keys are the category IDs and the values are the labels.

"sex" : {
   "label" : "Sex",
   "category" : {
      "index" : {
         "M" : 0,
         "F" : 1,
         "T" : 2
      },
      "label" : {
         "M" : "Men",
         "F" : "Women",
         "T" : "Total"
      }
   }
}

Sometimes, the dimension categories have a hierarchical relationship. This relationship can be expressed using the child property.

When no category labels are provided, the recommended behavior for a JSON-stat client is to use the IDs (as they appear in index) as labels. In this case, IDs should be chosen wisely.

"year" : {
   "category" : {
      "index" : {
         "2003" : 0,
         "2004" : 1,
         "2005" : 2,
         "2006" : 3,
         "2007" : 4,
         "2008" : 5,
         "2009" : 6,
         "2010" : 7,
         "2011" : 8,
         "2012" : 9,
         "2013" : 10,
         "2014" : 11
      }
   }
}

When a dimension is a constant dimension and no category index is provided for it, category labels are required.

"metric" : {
   "category" : {
      "label" : {
         "pop" : "Population"
      }
   }
   
}

Of course, if the dimension is constant and the label is not important, you can choose to remove the category label instead of the category index.

"year" : {
   "category" : {
      "index" : {
         "2013" : 0
      }
   }
   
}
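
The recommended client behavior for missing labels (fall back to the category ID) can be sketched as follows. categoryLabel is a hypothetical helper name, not part of any particular library.

```javascript
// Hypothetical helper: label of a category, falling back to its ID
// when the category has no label.
function categoryLabel(category, id) {
  const labels = category.label || {};
  return Object.prototype.hasOwnProperty.call(labels, id) ? labels[id] : id;
}

const sex = {
  index: { "M": 0, "F": 1, "T": 2 },
  label: { "M": "Men", "F": "Women", "T": "Total" }
};
const year = { index: { "2013": 0 } };

console.log(categoryLabel(sex, "F"));     // "Women"
console.log(categoryLabel(year, "2013")); // "2013"
```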

child

object reserved word optional 1.00

Parents: category
Children: None

It is used to describe the hierarchical relationship between different categories. It takes the form of an object where the key is the ID of the parent category and the value is an array of the IDs of the child categories. It is also a way of exposing a certain category as a total.

"actstatus": {
   "label":"Activity status",
   "category": {
      "index": {
         "A" : 0,
         "E" : 1,
         "U" : 2,
         "I" : 3,
         "T" : 4
      },
      "label": {
         "A" : "Active population",
         "E" : "Employment",
         "U" : "Unemployment",
         "I" : "Inactive population",
         "T" : "Population 15 years old and over"
      },
      "child": {
         "A" : ["E", "U"],
         "T" : ["A", "I"]
      }
   }
}

When there are several hierarchy levels, like in the previous example, child must reference only the direct descendants. For example, the oecd-canada.json sample file includes a total (OECD) and a subtotal (EU15): the 15 countries in EU15 are not directly referenced as OECD countries.

"child" : {
   "EU15" : [
      "AT", 
      "BE", 
      "DE", 
      "DK", 
      "ES", 
      "FI", 
      "FR", 
      "GR", 
      "IE", 
      "IT", 
      "LU", 
      "NL", 
      "PT", 
      "SE", 
      "UK"
   ],
   "OECD" : [ 
      "EU15", 
      "AU", 
      "CA", 
      "CL", 
      "CZ", 
      "EE", 
      "HU", 
      "IS", 
      "IL", 
      "JP", 
      "KR", 
      "MX", 
      "NO", 
      "NZ", 
      "PL", 
      "SK", 
      "SI", 
      "CH", 
      "TR", 
      "US"
   ]
}

As an example, see hierarchy.json.
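
Because child only lists direct descendants, a client that needs every category under a total has to walk the hierarchy. The following is a minimal sketch using the activity status example; descendants is a hypothetical helper name, and the hierarchy is assumed to contain no cycles.

```javascript
// Hypothetical helper: all descendants of a category, following "child"
// from the direct descendants downwards (assumes no cycles).
function descendants(child, id) {
  const result = [];
  for (const kid of child[id] || []) {
    result.push(kid, ...descendants(child, kid));
  }
  return result;
}

const child = {
  "A": ["E", "U"],
  "T": ["A", "I"]
};

console.log(descendants(child, "T")); // A, E, U, I
console.log(descendants(child, "A")); // E, U
console.log(descendants(child, "E")); // empty: E has no children
```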

coordinates

object reserved word optional 1.00

Parents: category
Children: None

It can be used to assign longitude/latitude geographic coordinates to the categories of a dimension with a geo role. It takes the form of an object where keys are category IDs and values are an array of two numbers (longitude, latitude).

"category" : {
   "label" : {
      "ISO-3166-2:TV" : "Tuvalu"
   },
   "coordinates" : {
      "ISO-3166-2:TV" : [179.1995, -8.5199]
   }
}

The goal of JSON-stat is not to provide rich geographical information. For that purpose, use GeoJSON or TopoJSON and match the map areas in those formats with your statistical data (in JSON-stat) by encoding your geographical categories with common IDs.

unit

object reserved word optional 1.00

Parents: category
Children: Category IDs

It can be used to assign unit of measure metadata to the categories of a dimension with a metric role.

"role": {
   "metric": ["concept"]
}

It takes the form of an object where every dimension category is a key and the value is an object. The properties of this object are not closed.

"concept" : {
   "category" : {
      "label" : {
         "pop" : "Population"
      },
      "unit" : {
         "pop" : {  }
      }
   }
}

Based on current standards and practices, possible properties of this object could be:

  • label: It could be a language-dependent text to display with the values (like “millions of dollars”).
  • type: This property should probably help in deriving new data from the data. It should probably help in answering questions like: does it make sense to add two different cell values? Some possible values of this property could be count or ratio. Some might also consider as possible values things like currency, mass, length, time, etc.
  • base: It is the base unit (person, gram, euro, etc.).
  • multiplier: It is the unit multiplier. It should help comparing data with the same base unit but different multiplier. If a decimal system is used, it can be expressed as powers of 10 (0=1, 1=10, -1=0.1, etc.).
  • symbol: A possible symbol to add to the data when it is displayed (for example, €, $ or %). It could be language-dependent.
  • position: The place where the symbol should be written (before or after the data). Possible values could be start and end. It is language-dependent.
  • adjustment: A code to express the time series adjustment (for example, seasonally adjusted or adjusted by working days) or indices adjustment (for example, chain-linked indices).
  • decimals: The number of decimals.
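
As an illustration of how a client might use such metadata, the sketch below formats a value for display. It assumes a provider that actually uses the decimals, symbol and position properties suggested above; since the unit properties are not closed, nothing here is guaranteed by the format, and formatValue is a hypothetical helper name.

```javascript
// Hypothetical formatter, assuming the provider uses the "decimals",
// "symbol" and "position" properties suggested above.
function formatValue(value, unit) {
  if (value === null) return ""; // missing value: nothing to display
  const text = typeof unit.decimals === "number"
    ? value.toFixed(unit.decimals)
    : String(value);
  if (!unit.symbol) return text;
  // "start" writes the symbol before the data; anything else, after it.
  return unit.position === "start" ? unit.symbol + text : text + unit.symbol;
}

console.log(formatValue(5.943, { decimals: 1, symbol: "%", position: "end" })); // "5.9%"
console.log(formatValue(12, { decimals: 2, symbol: "$", position: "start" }));  // "$12.00"
console.log(formatValue(4729, {}));                                             // "4729"
```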

updated

string reserved word optional 1.00

Parents: dataset ID
Children: None

It contains the update time of the dataset. It is a string representing a date in an ISO 8601 format recognized by the JavaScript Date.parse method.

{
   "dataset" : {
      "updated" : "2012-01-22T12:30:02Z",
      
   }
}
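
For example, a client could read the property as follows. This is only a sketch of the Date.parse usage mentioned above; the variable names are illustrative.

```javascript
const updated = "2012-01-22T12:30:02Z";

// Date.parse returns milliseconds since the Unix epoch, or NaN
// when the string cannot be parsed.
const timestamp = Date.parse(updated);

console.log(
  Number.isNaN(timestamp) ? "invalid date" : new Date(timestamp).toISOString()
); // "2012-01-22T12:30:02.000Z"
```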

source

string reserved word optional 1.00

Parents: dataset ID
Children: None

It contains a language-dependent short text describing the source of the dataset.

{
   "dataset" : {
      "source" : "Economic Outlook No 92 - December 2012 - OECD Annual Projections",
      
   }
}

class

string reserved word optional 1.01

Parents: dataset ID, dimension ID
Children: None

The default response in JSON-stat is a bundle of datasets. A JSON-stat provider may also offer partial responses (document fragments of a full JSON-stat response).

Two partial responses are considered special (“native partial responses”): dataset responses only include a dataset node; dimension responses only include a dimension node.

class is used in native partial responses to declare the response type. It can take the values dataset and dimension.

{
   "class" : "dataset",
   "label" : "Unemployment rate in the OECD countries 2003-2014",
   "source" : "Economic Outlook No 92 - December 2012 - OECD Annual Projections",
   "updated" : "2012-11-27",
   "value" : [  ],
   "dimension" : {  }
}

{
   "class" : "dimension",
   "label" : "Sex",
   "category" : {
      "index" : ["T", "M", "F"],
      "label" : {
         "T" : "Total",
         "M" : "Male",
         "F" : "Female"
      }
   }
}

In full responses, class is optional and takes a value of dataset when the parent is a dataset ID, and a value of dimension when the parent is a dimension ID.
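
A client can use class to tell the three kinds of response apart. The following is a minimal sketch; responseType is a hypothetical helper name, and treating every response without a top-level class as a bundle is an assumption based on the default described above.

```javascript
// Hypothetical helper: classify a parsed JSON-stat 1.01 response.
function responseType(response) {
  if (response.class === "dataset" || response.class === "dimension") {
    return response.class; // native partial response
  }
  return "bundle";         // default response: a bundle of datasets
}

console.log(responseType({ class: "dimension", label: "Sex", category: {} })); // "dimension"
console.log(responseType({ class: "dataset", value: [], dimension: {} }));     // "dataset"
console.log(responseType({ dataset: { value: [], dimension: {} } }));          // "bundle"
```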

extension

object reserved word optional 1.01

Parents: dataset ID, dimension ID
Children: Open

extension allows JSON-stat to be extended for particular needs. Providers are free to define where they include this property and what children are allowed in each case.

{
   "dataset" : {
      "label" : "Unemployment rate in the OECD countries 2003-2014",
      "source" : "Economic Outlook No 92 - December 2012 - OECD Annual Projections",
      "updated" : "2012-11-27",

      "extension" : {
         "contact" : "EcoOutlook@oecd.org",
         "metadata" : [
            {
               "title" : "Economic Outlook Policy and other assumptions underlying the projections Box 1.2 in General assessment",
               "href" : "http://www.oecd.org/eco/economicoutlookanalysisandforecasts/EO92macroeconomicsituation.pdf"
            },
            {
               "title" : "Economic Outlook Sources and Methods",
               "href" : "http://www.oecd.org/document/22/0,3343,en_2649_34109_33702486_1_1_1_1,00.html"
            },
            {
               "title" : "Database inventory (forthcoming)",
               "href" : "http://www.oecd.org/eco/databaseinventory"
            },
            {
               "title" : "OECD Glossary",
               "href" : "http://stats.oecd.org/glossary/"
            }
         ]
      },

      "value" : [  ],
      "dimension" : {  }
   }
}

Client libraries should provide a general way to retrieve the contents of the extension elements. For example:

J.Dataset(0).extension;
/* This retrieves the extension object 
   in the first dataset with all its properties 
*/

J.Dataset(0).extension.contact;
/* This retrieves the extension property "contact" 
   in the first dataset 
*/

href

string reserved word optional 1.01

Parents: dataset ID, dimension ID, relation ID
Children: None

It specifies a URL.

Providers can use this property to avoid sending information that is shared between different requests (for example, dimensions).

{
   "dataset" : {
      "label" : "Tuvalu population by sex in the 2002 Census",
      "dimension" : {
         "sex" : {
            "href" : "http://provider.domain/dimension/sex"
         },
         
      }
   }
}

http://provider.domain/dimension/sex in the previous example would return a native partial response (see class) like:

{
   "class" : "dimension",
   "href" : "http://provider.domain/dimension/sex",
   "label" : "Sex",
   "category" : {
      "index" : ["T", "M", "F"],
      "label" : {
         "T" : "Total",
         "M" : "Male",
         "F" : "Female"
      }
   }
}

When it is a descendant of link, it is used to point to related resources.

relation ID

array external word optional 1.01

Parents: link
Children: type, href

This ID must be an IANA link relation name that describes the relation between the elements of the array and the parent of link (a dataset ID or a dimension ID).

{
   "dataset" : {
      "link" : {
         "alternate" : [
            {
               "type" : "text/csv",
               "href" : "http://provider.domain/2002/population/sex.csv"
            },
            {
               "type" : "text/html",
               "href" : "http://provider.domain/2002/population/sex.html"
            }
         ],
         
      },
      
   }
}
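
A client could use this structure to locate, for instance, the CSV version of a dataset. The following is a minimal sketch; findLink is a hypothetical helper name, and returning null when nothing matches is an assumption of this example.

```javascript
// Hypothetical helper: URL of the first related resource with a given
// relation name and media type, or null if none is declared.
function findLink(node, relation, mediaType) {
  const items = (node.link && node.link[relation]) || [];
  const match = items.find((item) => item.type === mediaType);
  return match ? match.href : null;
}

const dataset = {
  link: {
    alternate: [
      { type: "text/csv", href: "http://provider.domain/2002/population/sex.csv" },
      { type: "text/html", href: "http://provider.domain/2002/population/sex.html" }
    ]
  }
};

console.log(findLink(dataset, "alternate", "text/csv"));
// "http://provider.domain/2002/population/sex.csv"
console.log(findLink(dataset, "alternate", "application/pdf")); // null
```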

type

string reserved word optional 1.01

Parents: relation ID
Children: None

It describes the media type of the accompanying href.

note

array object reserved word optional 1.01

Parents: dataset ID, dimension ID, category
Children: None

This is a property similar to label. The main differences are:

  • note is used to assign annotations instead of a descriptive text
  • where label uses a string, note uses an array of strings

See label for a general description.

{
   "dataset" : {
      "note" : [ "Most of the data in this dataset are taken from the individual contributions of national correspondents appointed by the OECD Secretariat with the approval of the authorities of Member countries. Consequently, these data have not necessarily been harmonised at international level." ],
      "dimension" : {
         "country" : {
            "note" : [ "Except where otherwise indicated, data refer to the actual territory of the country considered." ],
            "category" : {
               "note" : {
                  "DEU" : [ "Germany (code DEU) was created 3 October 1990 by the accession of the Democratic Republic of Germany (code DDR) to the then Federal Republic of Germany (code DEW)." ]
               },
               
            },
            
         },
         
      },
      
   }
}

note allows annotations to be assigned to datasets (array), dimensions (array) and categories (object). To assign annotations to individual data, use status.

error

array reserved word optional 1.01

Parents: None
Children: Open

Besides using HTTP status codes, JSON-stat documents can include the error property to communicate response errors. It takes the form of an array of objects, each providing information for an error. Libraries should offer a method to retrieve this array but should also check the validity of the response, as the inclusion of error is not mandatory.

Based on current standards and practices, possible error elements could be:

  • status: The HTTP status code (for example, "401").
  • id: The provider’s internal error code (for example, "106").
  • href: A link to a web page where information about this error is published (for example, "http://provider.domain/error/106").
  • label: A short descriptive text about the error. It can be useful to provide two properties: one for the end user ("The selected country does not exist.") and one for the developer ("Parameter 'area' must be an ISO 3166-1 alpha-2 code.").
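
The kind of retrieval method mentioned above could look like the following minimal sketch. responseErrors is a hypothetical name, and the sample response simply combines the illustrative values from the list; since the error elements are open, a real provider may use different properties.

```javascript
// Hypothetical helper: error objects in a response (empty array if none).
// An empty result does not prove the response is valid, because the
// inclusion of "error" is not mandatory.
function responseErrors(response) {
  return Array.isArray(response.error) ? response.error : [];
}

const failed = {
  error: [
    {
      status: "401",
      id: "106",
      href: "http://provider.domain/error/106",
      label: "The selected country does not exist."
    }
  ]
};

for (const err of responseErrors(failed)) {
  console.log(err.status + " (" + err.id + "): " + err.label);
}
```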