JSON-stat Format
version
string reserved word required 2.0
Parents | Root |
---|---|
Children | None |
It declares the JSON-stat version of the response. The goal of this property is to help clients parse that particular response.
{
"version" : "2.0",
…
}
Because future versions could add optional properties, the same response can be valid in several versions: any of those versions is accepted as the value of version.
Before version 2.0
version was introduced in version 2.0. That’s why this property can’t accept values lower than 2.0.
class
string reserved word optional 1.1
Parents | Root, relation ID array element |
---|---|
Children | None |
JSON-stat supports several classes of responses. Possible values of class are: dataset, dimension and collection.
Dataset responses include a single dataset. They are declared using the dataset class value.
{ "version" : "2.0", "class" : "dataset", "label" : "Unemployment rate in the OECD countries 2003-2014", "source" : "Economic Outlook No 92 - December 2012 - OECD Annual Projections", "updated" : "2012-11-27", "id" : ["concept", "area", "year"], "size" : [1, 36, 12], "dimension" : { … }, "value" : [ … ] }
See the oecd.json sample file.
Dimension responses only include a dimension node.
{ "version" : "2.0", "class" : "dimension", "label" : "sex", "category" : { "index" : ["T", "M", "F"], "label" : { "T" : "total", "M" : "male", "F" : "female" } } }
Collection responses reference items in a collection. They use the link property with a relation of item. The items of the collection can be of any class (datasets, dimensions, collections).
{ "version" : "2.0", "class" : "collection", "href" : "https://json-stat.org/samples/collection.json", "label" : "JSON-stat Dataset Sample Collection", "updated" : "2015-07-02", "link" : { "item" : [ { "class" : "dataset", "href" : "https://json-stat.org/samples/oecd.json", "label" : "Unemployment rate in the OECD countries 2003-2014" }, { "class" : "dataset", "href" : "https://json-stat.org/samples/canada.json", "label" : "Population by sex and age group. Canada. 2012" }, { "class" : "dataset", "href" : "https://json-stat.org/samples/galicia.json", "label" : "Population by province of residence, place of birth, age, gender and year in Galicia" }, { "class" : "dataset", "href" : "https://json-stat.org/samples/us-gsp.json", "label" : "US States by GSP and population" }, { "class" : "dataset", "href" : "https://json-stat.org/samples/us-unr.json", "label" : "Unemployment Rates by County, 2012 Annual Averages" }, { "class" : "dataset", "href" : "https://json-stat.org/samples/us-labor.json", "label" : "Labor Force Data by County, 2012 Annual Averages" }, { "class" : "dataset", "href" : "https://json-stat.org/samples/order.json", "label" : "Demo of value ordering: what does not change, first" }, { "class" : "dataset", "href" : "https://json-stat.org/samples/hierarchy.json", "label" : "Demo of hierarchical dimension" } ] } }
See the collection.json sample file.
Before version 2.0
class was introduced in version 1.1. Before, all responses were bundle responses.
Bundle responses could contain several datasets. They took the form of a map where dataset IDs were the keys and the values were objects that contained the dataset information.
See the oecd-canada.json sample file.
id
array reserved word required 1.0
Parents | Root (when class "dataset"), relation ID array element |
---|---|
Children | None |
It contains an ordered list of dimension IDs.
{
"version" : "2.0",
"class": "dataset",
"id" : ["metric", "time", "geo", "sex"],
…
}
Dimension IDs can be any string and have no special meaning in JSON-stat. Use role to assign a particular meaning to them.
size
array reserved word required 1.0
Parents | Root (when class "dataset"), relation ID array element |
---|---|
Children | None |
It contains the number (integer) of categories (possible values) of each dimension in the dataset. It has the same number of elements as id, in the same order.
{
"version" : "2.0",
"class" : "dataset",
"id" : ["metric", "time", "geo", "sex"],
"size" : [1, 1, 1, 3],
…
}
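As a minimal sketch of how a client might check id and size for consistency, the helper below verifies that both arrays have the same length and, assuming value is given in its dense array form with one entry per cell, that the number of values equals the product of the sizes. The name check_shape is hypothetical (it is not part of JSON-stat or of any library), and the value array is borrowed from another example purely for illustration.

```python
from math import prod


def check_shape(dataset: dict) -> int:
    """Return the number of cells implied by size, raising on a mismatch.

    Hypothetical helper: it only checks "value" when it is a dense array.
    """
    ids, size = dataset["id"], dataset["size"]
    if len(ids) != len(size):
        raise ValueError("id and size must have the same number of elements")
    cells = prod(size)  # one cell per combination of categories
    value = dataset.get("value")
    if isinstance(value, list) and len(value) != cells:
        raise ValueError(f"expected {cells} values, found {len(value)}")
    return cells


# Illustrative dataset: id and size from the example above, value assumed.
dataset = {
    "version": "2.0",
    "class": "dataset",
    "id": ["metric", "time", "geo", "sex"],
    "size": [1, 1, 1, 3],
    "value": [4729, 4832, 9561],
}
print(check_shape(dataset))
```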
role
object reserved word optional 1.0
Parents | Root (when class "dataset"), relation ID array element |
---|---|
Children | time, geo and metric |
It can be used to assign special roles to dimensions. At this moment, possible roles are: time, geo and metric. A role can be shared by several dimensions.
{
"version" : "2.0",
"class" : "dataset",
"id" : ["concept", "arrivaldate", "departuredate", "origin", "destination"],
"size" : [1, 24, 24, 10, 10],
"role": {
"time": ["arrivaldate", "departuredate"],
"geo": ["origin", "destination"],
"metric": ["concept"]
},
…
}
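A client will often need the reverse lookup: given a dimension ID, which roles does it have? The following sketch shows one possible way to do this; roles_of is a hypothetical helper name, and it treats a missing role object as "no roles", since role is optional.

```python
def roles_of(dataset: dict, dimension_id: str) -> list:
    """Return the names of the roles assigned to a dimension ID."""
    role = dataset.get("role", {})  # role is optional
    return [name for name, ids in role.items() if dimension_id in ids]


# Same structure as the example above.
dataset = {
    "version": "2.0",
    "class": "dataset",
    "id": ["concept", "arrivaldate", "departuredate", "origin", "destination"],
    "size": [1, 24, 24, 10, 10],
    "role": {
        "time": ["arrivaldate", "departuredate"],
        "geo": ["origin", "destination"],
        "metric": ["concept"],
    },
}
print(roles_of(dataset, "origin"))
```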
time
array reserved word optional 1.0
Parents | role |
---|---|
Children | None |
It can be used to assign a time role to one or more dimensions. It takes the form of an array of dimension IDs in which order does not have a special meaning.
"role": { "time": ["arrivaldate", "departuredate"] }
geo
array reserved word optional 1.0
Parents | role |
---|---|
Children | None |
It can be used to assign a spatial role to one or more dimensions. It takes the form of an array of dimension IDs in which order does not have a special meaning.
"role": { "geo": ["origin", "destination"] }
metric
array reserved word optional 1.0
Parents | role |
---|---|
Children | None |
It can be used to assign a metric role to one or more dimensions. It takes the form of an array of dimension IDs in which order does not have a special meaning.
"role": { "metric": ["concept"] }
value
array object reserved word required 1.0
Parents | Root (when class "dataset"), relation ID array element |
---|---|
Children | None |
It contains the data sorted according to the dataset dimensions. It usually takes the form of an array where missing values are expressed as nulls.
{ "version" : "2.0", "class" : "dataset", "value" : [105.3, 104.3, null, 177.2, …], … }
When too many cube cells are empty (sparse cube), the value array is populated with nulls.
{
"version" : "2.0",
"class" : "dataset",
"value" : [1.3587, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, 1.5849],
…
}
To avoid this, the value property can take the form of an object.
{
"version" : "2.0",
"class" : "dataset",
"value" : { "0": 1.3587, "18": 1.5849 },
…
}
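The two forms carry the same information, so a client can convert between them. The sketch below assumes that the keys of the object form are cell positions written as strings (as in the example above) and that the total number of cells is known, for instance from size; to_dense and to_sparse are hypothetical helper names.

```python
def to_dense(value, cell_count: int) -> list:
    """Expand the object form of value into an array padded with None."""
    if isinstance(value, dict):
        dense = [None] * cell_count
        for position, cell in value.items():
            dense[int(position)] = cell  # keys are positions as strings
        return dense
    return list(value)  # already an array


def to_sparse(value: list) -> dict:
    """Keep only the non-null cells, keyed by their position."""
    return {str(i): cell for i, cell in enumerate(value) if cell is not None}


# Assuming a cube of 19 cells, as in the sparse example above.
dense = to_dense({"0": 1.3587, "18": 1.5849}, 19)
print(len(dense), to_sparse(dense))
```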
Value order follows the “What does not change, first” criterion. According to this criterion, values are ordered by combinations of dimension categories: the categories of the first dimensions in the id array are kept fixed while iterating through the categories of the last dimension, then through those of the next-to-last dimension, and so forth.
For example, if we have three dimensions (A, B and C) with 3, 2 and 4 categories respectively, the values should be ordered iterating first by the 4 categories of C, then by the 2 categories of B and finally by the 3 categories of A:
A1B1C1 A1B1C2 A1B1C3 A1B1C4 A1B2C1 A1B2C2 A1B2C3 A1B2C4 A2B1C1 A2B1C2 A2B1C3 A2B1C4 A2B2C1 A2B2C2 A2B2C3 A2B2C4 A3B1C1 A3B1C2 A3B1C3 A3B1C4 A3B2C1 A3B2C2 A3B2C3 A3B2C4
This flattening method is known as row-major order.
The order.json file is provided as a JSON-stat example of value and dimension sorting.
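To make the ordering concrete, here is a small sketch of the row-major computation for the three-dimension example above. It assumes zero-based category positions (as used by the index property); flat_index is a hypothetical helper name.

```python
from itertools import product


def flat_index(positions, size) -> int:
    """Map one category position per dimension to a position in value."""
    index = 0
    for position, count in zip(positions, size):
        index = index * count + position  # the last dimension varies fastest
    return index


size = [3, 2, 4]  # dimensions A, B and C

# itertools.product also iterates with the last range varying fastest.
order = [
    f"A{a + 1}B{b + 1}C{c + 1}"
    for a, b, c in product(*(range(n) for n in size))
]
print(order[flat_index((1, 0, 3), size)])
```

Here the positions (1, 0, 3) stand for the second category of A, the first of B and the fourth of C.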
status
array object string reserved word optional 1.0
Parents | Root (when class "dataset"), relation ID array element |
---|---|
Children | None |
It contains metadata at the observation level. When it takes the form of an array of the same size as value, it assigns a status to each value by position.
{
"version" : "2.0",
"class" : "dataset",
"value" : [100, null, 102, 103, 104],
"status" : ["a", "m", "a", "a", "p"],
…
}
To assign the same status to all data, an array of size 1 can be used.
{
"version" : "2.0",
"class" : "dataset",
"value" : [100, 99, 102, 103, 104],
"status" : ["e"],
…
}
For the same purpose, a string can be used (this is the recommended way to assign the same status to all data).
{
"version" : "2.0",
"class" : "dataset",
"value" : [100, 99, 102, 103, 104],
"status" : "e",
…
}
An object can also be used to provide status information for specific cells.
{
"version" : "2.0",
"class" : "dataset",
"value" : [100, null, 102, 103, 104],
"status" : { "1" : "m" },
…
}
Currently, “status” does not have a standard meaning nor a standard vocabulary. These are, for the moment, up to the provider. It can be used to optionally include any metadata information at the observation level, not only what is regularly known as “status”. If the vocabulary does not adhere to a standard, there is currently no way to assign meaning to the status codes within the same response.
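Because status can take four shapes (full-size array, array of size 1, string, object), clients may find it convenient to hide them behind a single lookup. The following is only a sketch under that assumption; status_at is a hypothetical name, and returning None for cells without a status is a choice of this example, not something the format prescribes.

```python
def status_at(status, position: int):
    """Return the status of the cell at a position, or None if it has none."""
    if status is None:
        return None                       # status is optional
    if isinstance(status, str):
        return status                     # same status for all data
    if isinstance(status, dict):
        return status.get(str(position))  # only specific cells are listed
    if len(status) == 1:
        return status[0]                  # array of size 1: same status for all
    return status[position]               # one status per value, by position


print(status_at(["a", "m", "a", "a", "p"], 1))
print(status_at("e", 3))
print(status_at({"1": "m"}, 0))
```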
dimension
object reserved word required 1.0
Parents | Root (when class "dataset"), relation ID array element |
---|---|
Children | dimension ID |
JSON-stat follows a cube model: the values are organized in cells, and a cell is the intersection of various dimensions. The dimension property contains information about the dimensions of the dataset.
{
"version" : "2.0",
"class" : "dataset",
"value" : [4729, 4832, 9561],
"dimension" : {
…
}
}
dimension must have a property (see dimension ID) named after each element in the id array.
dimension ID
object free word required 1.0
Parents | dimension |
---|---|
Children | category, label, class |
It is used to describe a particular dimension. The name of this object must be one of the strings in the id array. There must be one and only one dimension ID object for every dimension in the id array.
"dimension" : { "metric" : { … }, "time" : { … }, "geo" : { … }, "sex" : { … }, … }
category
object reserved word required 1.0
Parents | Root (when class "dimension"), dimension ID, relation ID array element |
---|---|
Children | index, label, child, coordinates, unit |
It is used to describe the possible values of a dimension.
"sex" : {
"category" : { … }
}
index
object array reserved word optional 1.0
Parents | category |
---|---|
Children | None |
It is used to order the possible values (categories) of a dimension. The order of the categories and the order of the dimensions themselves determine the order of the data in the value array. While the dimensions’ order has only this functional role (and therefore any order chosen by the provider is valid), the categories’ order has also a presentation role: it is assumed that the categories are sorted in a meaningful order and that the consumer can rely on it when displaying the information. For example, categories in dimensions with a time role are assumed to be in chronological order.
To provide the category IDs and their order, an array can be used.
"sex" : { "category" : { "index" : ["M", "F", "T"] } }
An object that attaches a position to every ID can also be used.
"sex" : { "category" : { "index" : { "M" : 0, "F" : 1, "T" : 2 } } }
index is required unless the dimension is a constant dimension (dimension with a single category). When a dimension has only one category, the index property is indeed unnecessary. In the case that a category index is not provided, a category label must be included.
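Since index can be an array, an object or absent, a client may want to normalize it into an ordered list of category IDs. The sketch below is one way to do that; category_ids is a hypothetical helper, and the fallback to the keys of label applies only to the constant-dimension case described above.

```python
def category_ids(category: dict) -> list:
    """Return the category IDs of a dimension in their declared order."""
    index = category.get("index")
    if index is None:
        # Constant dimension without index: the single ID comes from label.
        return list(category["label"])
    if isinstance(index, dict):
        # Object form: sort the IDs by their attached position.
        return sorted(index, key=index.get)
    return list(index)  # array form: already ordered


print(category_ids({"index": ["M", "F", "T"]}))
print(category_ids({"index": {"M": 0, "F": 1, "T": 2}}))
print(category_ids({"label": {"pop": "population"}}))
```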
label
string object reserved word optional 1.0
Parents | Root, dimension ID, category, unit category ID, relation ID array element |
---|---|
Children | None |
It is used to assign a very short (one line) descriptive text to IDs at different levels of the response tree. It is language-dependent.
When it is a root property or a child of dimension ID or of a unit category ID, it is a string.
{ "version" : "2.0", "class" : "dataset", "label" : "Tuvalu population by sex in the 2002 Census", "dimension" : { "sex" : { "label" : "sex", … }, … }, … }
When it is a child of category, it is an object where the keys are the category IDs and the values are the labels.
"sex" : { "label" : "sex", "category" : { "index" : { "M" : 0, "F" : 1, "T" : 2 }, "label" : { "M" : "men", "F" : "women", "T" : "total" } } }
label content should be written in lowercase except when it is a dataset label (when label is a root property of a response with class dataset). It’s the client’s job to capitalize it when needed according to the display context.
Sometimes, the dimension categories have a hierarchical relationship. This relationship can be expressed using the child property.
When no category labels are provided, the recommended behavior for a JSON-stat client is to use the IDs (as they appear in index) as labels. In this case, IDs should be chosen wisely.
"year" : { "category" : { "index" : { "2003" : 0, "2004" : 1, "2005" : 2, "2006" : 3, "2007" : 4, "2008" : 5, "2009" : 6, "2010" : 7, "2011" : 8, "2012" : 9, "2013" : 10, "2014" : 11 } } }
When a dimension is a constant dimension (dimension with a single category) and no category index is provided for it, category label is required.
"metric" : { "category" : { "label" : { "pop" : "population" } }, … }
Of course, if the dimension is constant and the label is not important, you can choose to remove the category label instead of the category index.
"year" : { "category" : { "index" : { "2013" : 0 } }, … }
unit label is a text that can be displayed after values (like “millions of tonnes”). The use of unit label is recommended, unless this information has already been provided in a parent label property. This last practice is discouraged: for example, when possible, a metric category label should not contain the unit label:
"label": { "exp": "exports (tonnes)" }
Better:
"label": { "exp": "exports" }, "unit": { "exp": { "label": "tonnes" } }
child
object reserved word optional 1.0
Parents | category |
---|---|
Children | None |
It is used to describe the hierarchical relationship between different categories. It takes the form of an object where the key is the ID of the parent category and the value is an array of the IDs of the child categories. It is also a way of exposing a certain category as a total.
"actstatus": { "label":"activity status", "category": { "index": { "A" : 0, "E" : 1, "U" : 2, "I" : 3, "T" : 4 }, "label": { "A" : "active population", "E" : "employment", "U" : "unemployment", "I" : "inactive population", "T" : "population 15 years old and over" }, "child": { "A" : ["E", "U"], "T" : ["A", "I"] } } }
When there are several hierarchy levels, like in the previous example, child must reference only the direct descendants. For example, the oecd.json sample file includes a total (OECD) and a subtotal (EU15): the 15 countries in EU15 are not directly referenced as OECD countries.
"child" : { "EU15" : [ "AT", "BE", "DE", "DK", "ES", "FI", "FR", "GR", "IE", "IT", "LU", "NL", "PT", "SE", "UK" ], "OECD" : [ "EU15", "AU", "CA", "CL", "CZ", "EE", "HU", "IS", "IL", "JP", "KR", "MX", "NO", "NZ", "PL", "SK", "SI", "CH", "TR", "US" ] }
As an example, see hierarchy.json.
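Because child only lists direct descendants, a client that needs every category below a given parent has to walk the hierarchy. A minimal recursive sketch, assuming the hierarchy contains no cycles (descendants is a hypothetical helper name):

```python
def descendants(child: dict, category_id: str) -> list:
    """Return all direct and indirect descendants, depth-first."""
    found = []
    for kid in child.get(category_id, []):  # leaf categories have no entry
        found.append(kid)
        found.extend(descendants(child, kid))
    return found


# The child object of the "actstatus" example above.
child = {"A": ["E", "U"], "T": ["A", "I"]}
print(descendants(child, "T"))
```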
coordinates
object reserved word optional 1.0
Parents | category |
---|---|
Children | None |
It can be used to assign longitude/latitude geographic coordinates to the categories of a dimension with a geo role. It takes the form of an object where keys are category IDs and values are an array of two numbers (longitude, latitude).
"category" : { "label" : { "ISO-3166-2:TV" : "Tuvalu" }, "coordinates" : { "ISO-3166-2:TV" : [179.1995, -8.5199] } }
The goal of JSON-stat is not to provide rich geographical information. For that purpose, use GeoJSON or TopoJSON and match the map areas in those formats with the statistical data (in JSON-stat) by encoding your geographical categories with common IDs.
unit
object reserved word optional 1.0
Parents | category |
---|---|
Children | category ID |
It can be used to assign unit of measure metadata to the categories of a dimension with a metric role.
"role": { "metric": ["concept"] }
It takes the form of an object where category ID is the key and the value is an object.
"concept" : {
"category" : {
"label" : {
"exp" : "exports"
},
"unit" : {
"exp" : { … }
}
}
}
Four properties of this object are currently closed: decimals, label, symbol and position.
"unit" : { "exp" : { "decimals": 1, "label" : "millions", "symbol" : "$", "position" : "start" } }
Following the previous example, a client could display a value of 10 of metric exp as “$10.0 millions”.
Based on current standards and practices, other properties of this object could be:
- base: It is the base unit (person, gram, euro, etc.).
- type: This property should probably help deriving new data from the data. It should probably help answering questions like: does it make sense to add two different cell values? Some possible values of this property could be count or ratio. Some might also consider as possible values things like currency, mass, length, time, etc.
- multiplier: It is the unit multiplier. It should help comparing data with the same base unit but different multiplier. If a decimal system is used, it can be expressed as powers of 10 (0=1, 1=10, -1=0.1, etc.).
- adjustment: A code to express the time series adjustment (for example, seasonally adjusted or adjusted by working days) or indices adjustment (for example, chain-linked indices).
These properties are currently open. Data providers are free to use them on their own terms, although it is safer to do it under extension.
decimals
number reserved word optional 1.2
Parents | Unit category ID |
---|---|
Children | None |
It contains the number of unit decimals (integer). If unit is present, decimals is required.
symbol
string reserved word optional 1.2
Parents | Unit category ID |
---|---|
Children | None |
It contains a possible unit symbol to add to the value when it is displayed (like “€”, “$” or “%”).
position
string reserved word optional 1.2
Parents | Unit category ID |
---|---|
Children | None |
It contains the place (start or end, strings) where the unit symbol should be written (before or after the value). Default is end.
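As a sketch of how a client could combine decimals, symbol, position and label when displaying a value, consider the helper below. format_value is a hypothetical name; defaulting decimals to 0 when it is absent, and appending the unit label after a space, are assumptions of this example rather than rules of the format.

```python
def format_value(value: float, unit: dict) -> str:
    """Format a value using the closed properties of a unit object."""
    decimals = unit.get("decimals", 0)  # assumption: 0 when not provided
    text = f"{value:.{decimals}f}"
    symbol = unit.get("symbol", "")
    if unit.get("position", "end") == "start":  # default position is end
        text = symbol + text
    else:
        text = text + symbol
    label = unit.get("label")
    return f"{text} {label}" if label else text


unit = {"decimals": 1, "label": "millions", "symbol": "$", "position": "start"}
print(format_value(10, unit))  # $10.0 millions
```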
updated
string reserved word optional 1.0
Parents | Root, dimension ID, relation ID array element |
---|---|
Children | None |
It contains the update time of the dataset. It is a string representing a date in an ISO 8601 format recognized by the JavaScript Date.parse method (see ECMA-262 Date Time String Format).
{
"version" : "2.0",
"class" : "dataset",
"updated" : "2012-01-22T12:30:02Z",
…
}
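Outside JavaScript, the same strings can usually be parsed with a standard ISO 8601 parser. The sketch below uses Python's datetime.fromisoformat; replacing a trailing "Z" with "+00:00" is a workaround for Python versions earlier than 3.11, which do not accept "Z". It covers the forms used in this document, not every string Date.parse accepts, and parse_updated is a hypothetical helper name.

```python
from datetime import datetime


def parse_updated(updated: str) -> datetime:
    """Parse an updated string such as "2012-01-22T12:30:02Z"."""
    if updated.endswith("Z"):
        updated = updated[:-1] + "+00:00"  # "Z" means UTC
    return datetime.fromisoformat(updated)


print(parse_updated("2012-01-22T12:30:02Z"))
print(parse_updated("2012-11-27"))
```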
source
string reserved word optional 1.0
Parents | Root, dimension ID, relation ID array element |
---|---|
Children | None |
It contains a language-dependent short text describing the source of the dataset.
{
"version" : "2.0",
"class" : "dataset",
"source" : "Economic Outlook No 92 - December 2012 - OECD Annual Projections",
…
}
extension
object reserved word optional 1.1
Parents | Root, dimension ID, relation ID array element |
---|---|
Children | Open |
extension allows JSON-stat to be extended for particular needs. Providers are free to define where they include this property and what children are allowed in each case.
{ "version" : "2.0", "class" : "dataset", "label" : "Unemployment rate in the OECD countries 2003-2014", "source" : "Economic Outlook No 92 - December 2012 - OECD Annual Projections", "updated" : "2012-11-27", "extension" : { "contact" : "EcoOutlook@oecd.org", "metadata" : [ { "title" : "Economic Outlook Policy and other assumptions underlying the projections Box 1.2 in General assessment", "href" : "https://www.oecd.org/eco/economicoutlookanalysisandforecasts/EO92macroeconomicsituation.pdf" }, { "title" : "Economic Outlook Sources and Methods", "href" : "https://www.oecd.org/document/22/0,3343,en_2649_34109_33702486_1_1_1_1,00.html" }, { "title" : "Database inventory (forthcoming)", "href" : "https://www.oecd.org/eco/databaseinventory" }, { "title" : "OECD Glossary", "href" : "https://stats.oecd.org/glossary/" } ] }, "value" : [ … ], "id" : [ … ], "size" : [ … ], "dimension" : { … } }
Client libraries should provide a general way to retrieve the contents of the extension elements. For example:
J.Dataset(0).extension; /* This retrieves the extension object in the first dataset with all its properties */
J.Dataset(0).extension.contact; /* This retrieves the extension property "contact" in the first dataset */
href
string reserved word optional 1.1
Parents | Root, dimension ID, relation ID array element |
---|---|
Children | None |
It specifies a URL.
Providers can use this property to avoid sending information that is shared between different requests (for example, dimensions).
{ "version" : "2.0", "class" : "dataset", "label" : "Tuvalu population by sex in the 2002 Census", "dimension" : { "sex" : { "href" : "https://example.com/dimension/sex" }, … }, … }
{ "version" : "2.0", "class" : "dimension", "href" : "https://example.com/dimension/sex", "label" : "sex", "category" : { "index" : ["T", "M", "F"], "label" : { "T" : "total", "M" : "male", "F" : "female" } } }
When it is a descendant of link, it is used to point to related resources.
link
object reserved word optional 1.1
Parents | Root, dimension ID, relation ID array element |
---|---|
Children | relation ID |
It is used to provide a list of links related to a dataset or a dimension, sorted by relation (see relation ID).
{ "version" : "2.0", "class" : "dataset", "label" : "Tuvalu population by sex in the 2002 Census", "href" : "https://example.com/2002/population/sex", "link" : { "alternate" : [ { "type" : "text/csv", "href" : "https://example.com/2002/population/sex.csv" } ], … }, … }
relation ID
array external word optional 1.1
Parents | link |
---|---|
Children | None |
This ID must be an IANA link relation name that describes the relation between the elements of the array and the parent of link.
When used to link to non-JSON-stat documents, the elements of the array are objects that can contain a type and an href property.
{ "version" : "2.0", "class" : "dataset", "link" : { "alternate" : [ { "type" : "text/csv", "href" : "https://example.com/2002/population/sex.csv" }, { "type" : "text/html", "href" : "https://example.com/2002/population/sex.html" } ], … }, … }
When used to link to JSON-stat documents, they can contain the common JSON-stat properties class, label, href and extension.
{ "version" : "2.0", "class" : "collection", "href" : "https://json-stat.org/samples/collection.json", "label" : "JSON-stat Dataset Sample Collection", "updated" : "2015-07-02", "link" : { "item" : [ { "class" : "dataset", "href" : "https://json-stat.org/samples/oecd.json", "label" : "Unemployment rate in the OECD countries 2003-2014" }, { "class" : "dataset", "href" : "https://json-stat.org/samples/canada.json", "label" : "Population by sex and age group. Canada. 2012" }, { "class" : "dataset", "href" : "https://json-stat.org/samples/galicia.json", "label" : "Population by province of residence, place of birth, age, gender and year in Galicia" }, { "class" : "dataset", "href" : "https://json-stat.org/samples/us-gsp.json", "label" : "US States by GSP and population" }, { "class" : "dataset", "href" : "https://json-stat.org/samples/us-unr.json", "label" : "Unemployment Rates by County, 2012 Annual Averages" }, { "class" : "dataset", "href" : "https://json-stat.org/samples/us-labor.json", "label" : "Labor Force Data by County, 2012 Annual Averages" }, { "class" : "dataset", "href" : "https://json-stat.org/samples/order.json", "label" : "Demo of value ordering: what does not change, first" }, { "class" : "dataset", "href" : "https://json-stat.org/samples/hierarchy.json", "label" : "Demo of hierarchical dimension" } ] } }
They can also be used to embed full JSON-stat responses of any class.
{ "version" : "2.0", "class" : "collection", "href" : "https://json-stat.org/samples/oecd-canada-col.json", "label" : "OECD-Canada Sample Collection", "updated" : "2015-12-24", "link" : { "item" : [ { "class" : "dataset", "href" : "https://json-stat.org/samples/oecd.json", "label" : "Unemployment rate in the OECD countries 2003-2014", "source" : "Economic Outlook No 92 - December 2012 - OECD Annual Projections", "updated" : "2012-11-27", "value" : [5.943826289, 5.39663128, 5.044790587, …], "id" : ["concept", "area", "year"], "size" : [1, 36, 12], "dimension" : { … }, … }, … ] }, … }
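Whether the items are references or embedded responses, a client can list them in the same way. The following sketch walks the item relation of a collection and optionally filters by class; item_links is a hypothetical helper name and the two-item collection is a shortened version of the sample collection used in this document.

```python
def item_links(response: dict, class_name=None) -> list:
    """Return (label, href) pairs for the items of a collection."""
    items = response.get("link", {}).get("item", [])
    return [
        (item.get("label"), item.get("href"))
        for item in items
        if class_name is None or item.get("class") == class_name
    ]


collection = {
    "version": "2.0",
    "class": "collection",
    "href": "https://json-stat.org/samples/collection.json",
    "link": {
        "item": [
            {
                "class": "dataset",
                "href": "https://json-stat.org/samples/oecd.json",
                "label": "Unemployment rate in the OECD countries 2003-2014",
            },
            {
                "class": "dataset",
                "href": "https://json-stat.org/samples/canada.json",
                "label": "Population by sex and age group. Canada. 2012",
            },
        ]
    },
}
for label, href in item_links(collection, "dataset"):
    print(label, href)
```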
type
string reserved word optional 1.1
Parents | relation ID array element |
---|---|
Children | None |
It describes the media type of the accompanying href. Not required when the resource referenced in the link is a JSON-stat resource.
note
array object reserved word optional 1.1
Parents | Root, dimension ID, category, relation ID array element |
---|---|
Children | None |
This is a property similar to label. The main differences are:
- note is used to assign annotations instead of a descriptive text
- where label uses a string, note uses an array of strings
Please, see label for a general description.
{ "version" : "2.0", "class" : "dataset", "note" : [ "Most of the data in this dataset are taken from the individual contributions of national correspondents appointed by the OECD Secretariat with the approval of the authorities of Member countries. Consequently, these data have not necessarily been harmonised at international level." ], "dimension" : { "country" : { "note" : [ "Except where otherwise indicated, data refer to the actual territory of the country considered." ], "category" : { "note" : { "DEU" : [ "Germany (code DEU) was created 3 October 1990 by the accession of the Democratic Republic of Germany (code DDR) to the then Federal Republic of Germany (code DEW)." ] }, … }, … }, … }, … }
note allows annotations to be assigned to datasets (array), dimensions (array) and categories (object). To assign annotations to individual data, use status.
error
array reserved word optional 1.1
Parents | Root |
---|---|
Children | Open |
Dealing with errors (besides using HTTP status codes) in JSON-stat is an open issue. In version 1.1 of the format, a suggestion was added to allow more semantic error information. According to that suggestion, JSON-stat documents could include an error property to communicate response errors. It would take the form of an array of objects, each providing information for an error. Libraries should offer a method to retrieve this array but should also check the validity of the response, as the inclusion of error was not mandatory.
The suggestion went further to list some possible elements:
- status: The HTTP status code (for example, "401").
- id: The provider’s internal error code (for example, "106").
- href: A link to a web page where information about the error is published (for example, "https://example.com/error/106").
- label: A short descriptive text about the error. It could be useful to provide two properties: one for the end user ("The selected country does not exist.") and one for the developer ("Parameter 'area' must be an ISO 3166-1 alpha-2 code.").
The suggestion introduced in version 1.1, though, is probably not very conformant with version 2.0, where a specific response class for errors makes more sense:
{
"version" : "2.0",
"class" : "error",
…
}