Comparison with MediaWiki Tabular Data
Authors | Jakob Voß |
---|
MediaWiki is the software used to run Wikipedia and related projects of the Wikimedia Foundation, including the media file repository Wikimedia Commons. Commons hosts mostly images but also some records with tabular data. The MediaWiki Tabular Data Model was inspired by Data Package version 1, but it differs slightly from the current Data Package specification, as described below.
Property Comparison
A MediaWiki tabular data page describes and contains an individual table of data, similar to a Data Resource with inline tabular data. Both are serialized as JSON objects, but the former comes as a page with a unique name in a MediaWiki instance (such as Wikimedia Commons).
Top-level Properties
MediaWiki Tabular Data has three required and two optional top-level properties. Most of these properties map to corresponding properties of a Data Resource:
MediaWiki Tabular Data | Data Package Data Resource |
---|---|
- (implied by page name) | name (required) is a string |
description (optional) is a localized string | description (optional) is a CommonMark string |
data (required) | data (optional) |
license (required) is the string CC0-1.0 or another known identifier | licenses (optional) is an array |
schema (required) as described below | schema (optional) can have multiple forms |
sources (optional) is a string with Wiki markup | sources (optional) is an array of objects |
The differences are:
- property `name` does not exist but can be implied from the page name
- properties `description` and `sources` have a different format
- property `data` is always an array of arrays, and the data types of individual values can differ
- property `schema` is required, and the definition of schema properties differs
- there is no property `licenses`; instead, `license` is fixed to the plain string value `CC0-1.0` (other license indicators may be possible)
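The property mapping above can be sketched as a small conversion function. This is only an illustration under stated assumptions, not an official converter: the names `to_data_resource`, `pick_localized`, and `TYPE_MAP` are hypothetical, localized strings are reduced to a single language, the Wiki markup in `sources` is copied unparsed into a source title, and `localized` columns are mapped to the generic Table Schema type `object`.

```python
from typing import Any, Optional

# Assumed type mapping; "localized" has no direct Table Schema counterpart,
# so this sketch falls back to the generic "object" type.
TYPE_MAP = {
    "number": "number",
    "boolean": "boolean",
    "string": "string",
    "localized": "object",
}


def pick_localized(value: Any, lang: str = "en") -> Optional[str]:
    """Reduce a localized object such as {"en": "...", "de": "..."} to one string."""
    if isinstance(value, dict):
        # Prefer the requested language, otherwise take any available entry.
        return value.get(lang) or next(iter(value.values()), None)
    return value


def to_data_resource(page_name: str, page: dict, lang: str = "en") -> dict:
    """Map a MediaWiki Tabular Data page (parsed JSON) to a Data Resource dict."""
    fields = []
    for source_field in page["schema"]["fields"]:
        field = {
            "name": source_field["name"],
            "type": TYPE_MAP[source_field["type"]],
        }
        if "title" in source_field:
            # Table Schema titles are plain strings, not localized objects.
            field["title"] = pick_localized(source_field["title"], lang)
        fields.append(field)

    names = [field["name"] for field in fields]
    resource = {
        # The name is implied by the page name, so the caller has to supply it.
        "name": page_name,
        "schema": {"fields": fields},
        # Rows are emitted as objects keyed by field name.
        "data": [dict(zip(names, row)) for row in page["data"]],
        # A single license string becomes a one-element licenses array.
        "licenses": [{"name": page["license"]}],
    }
    if "description" in page:
        resource["description"] = pick_localized(page["description"], lang)
    if "sources" in page:
        # Lossy assumption: the Wiki markup is kept as-is in the source title.
        resource["sources"] = [{"title": page["sources"]}]
    return resource
```

The conversion is lossy in one direction: translations other than the selected language are dropped from `description` and `title`, so a round trip back to a MediaWiki page would not restore them.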
Data Types
Tabular Data supports four data types that overlap with Table Schema data types:
- `number`: subset of Table Schema number (no `NaN`, `INF`, or `-INF`)
- `boolean`: same as Table Schema boolean
- `string`: subset of Table Schema string (limited to 400 characters at most and must not include `\n` or `\t`)
- `localized`: refers to an object that maps language codes to strings with the same limitations as the `string` type. This type is not supported in Table Schema.
Individual values in a MediaWiki Tabular Data table can always be `null`, while in Table Schema you need to explicitly list values that should be considered missing in `schema.missingValues`.
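The type rules above can be expressed as a value check. The following is a minimal sketch, assuming values have already been parsed from JSON into Python objects; `is_valid_value` and `is_valid_string` are hypothetical helper names, string length is counted in Python characters (code points), and language codes are only checked to be strings rather than validated against any registry.

```python
import math
from typing import Any

MAX_STRING_LENGTH = 400  # limit stated for the string type


def is_valid_string(value: Any) -> bool:
    """Check the string restrictions: length limit, no newline, no tab."""
    return (
        isinstance(value, str)
        and len(value) <= MAX_STRING_LENGTH
        and "\n" not in value
        and "\t" not in value
    )


def is_valid_value(value: Any, type_name: str) -> bool:
    """Check one cell value against one of the four data types."""
    if value is None:
        return True  # null is allowed for every type
    if type_name == "number":
        if isinstance(value, bool):
            return False  # bool is a subclass of int in Python
        if isinstance(value, int):
            return True
        # Floats must be finite: no NaN, INF, or -INF.
        return isinstance(value, float) and math.isfinite(value)
    if type_name == "boolean":
        return isinstance(value, bool)
    if type_name == "string":
        return is_valid_string(value)
    if type_name == "localized":
        # An object mapping language codes to restricted strings.
        return isinstance(value, dict) and all(
            isinstance(code, str) and is_valid_string(text)
            for code, text in value.items()
        )
    raise ValueError(f"unknown data type: {type_name!r}")
```

For example, `is_valid_value({"en": "cat", "de": "Katze"}, "localized")` is accepted, while `is_valid_value("a\tb", "string")` is rejected because of the tab character.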
Schema Properties
The `schema` property of a MediaWiki tabular data page contains an object with the property `fields`, just like Table Schema, but no other properties are allowed. Elements of the `fields` array are like Table Schema field descriptors, limited to three properties and with different value spaces:
MediaWiki Tabular Data | Data Package Table Schema |
---|---|
name (required) must be a string matching ^[a-zA-Z_][a-zA-Z_0-9]* | name (required) can be any string |
type (required) is one of the Data Types above | type (optional) with different data types |
title (optional) is a localized string | title (optional) is a plain string |
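The field constraints in the table can be checked with a short function. This is a hedged sketch rather than MediaWiki's own validation: `field_errors` is a hypothetical name, and it assumes the name pattern is meant to match the whole string, which the pattern as written (without a trailing `$`) does not state explicitly.

```python
import re

# Pattern from the table above; fullmatch() is used on the assumption that
# the entire name must match, not just a prefix.
NAME_PATTERN = re.compile(r"[a-zA-Z_][a-zA-Z_0-9]*")
ALLOWED_TYPES = ("number", "boolean", "string", "localized")
ALLOWED_KEYS = {"name", "type", "title"}


def field_errors(field: dict) -> list:
    """Return a list of problems found in one MediaWiki field descriptor."""
    errors = []

    extra = set(field) - ALLOWED_KEYS
    if extra:
        errors.append(f"unexpected properties: {sorted(extra)}")

    name = field.get("name")
    if not isinstance(name, str) or not NAME_PATTERN.fullmatch(name):
        errors.append("name must match [a-zA-Z_][a-zA-Z_0-9]*")

    if field.get("type") not in ALLOWED_TYPES:
        errors.append(f"type must be one of {list(ALLOWED_TYPES)}")

    title = field.get("title")
    if title is not None and not isinstance(title, dict):
        errors.append("title must be a localized object, not a plain string")

    return errors
```

A Table Schema field such as `{"name": "my field", "type": "date"}` is valid there but yields two errors here, which illustrates how much narrower the MediaWiki value spaces are.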