Truly Tabular RDF/Info
This page explains the table columns used in the Truly Tabular RDF analysis tool at http://wikidata.bitplan.com
Property columns
#
Rank of the property, ordered by the percentage of instances where at least one value is available for the property
%
The percentage of instances where at least one value is available for the property
pareto
The Pareto level according to the Pareto principle 80:20 (1 out of 5), expressed on a logarithmic scale to base 5.
level | ratio | 1 out of
---|---|---
1 | 80:20 | 5 |
2 | 96:4 | 25 |
3 | 99.2:0.8 | 125 |
4 | 99.84:0.16 | 625 |
5 | 99.97:0.03 | 3125 |
6 | 99.994:0.006 | 15625 |
7 | 99.9987:0.0013 | 78125 |
8 | 99.99974:0.00026 | 390625 |
9 | 99.99995:0.00005 | 1953125 |
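The rows of this table follow directly from the level: at level n, one out of 5^n instances falls on the minority side of the ratio. A minimal Python sketch that reproduces the rows; the function name `pareto_ratio` is hypothetical and not taken from the tool:

```python
def pareto_ratio(level: int) -> tuple[float, float, int]:
    """Return (majority %, minority %, one_out_of) for a Pareto level."""
    if level < 1:
        raise ValueError("level must be >= 1")
    one_out_of = 5 ** level          # 5, 25, 125, ...
    minority = 100 / one_out_of      # e.g. 20 % at level 1
    return 100 - minority, minority, one_out_of


# Print the first three rows in the same layout as the table.
for level in range(1, 4):
    majority, minority, one_out_of = pareto_ratio(level)
    print(f"{level} | {majority:g}:{minority:g} | {one_out_of}")
```

The table rounds the ratios for higher levels; the sketch returns the unrounded values.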
property
A Wikidata Property e.g. P31/instance of
propertyId
The property Identifier for a Property e.g. P31 for P31/instance of
type
A Wikibase data type; see Supported data types
Statistics
1
number of truly tabular entries with a cardinality of 1
maxf
maximum frequency / cardinality of the property
nt
Number of non-tabular entries, i.e. entries with a cardinality > 1
nt%
Percentage of non tabular entries.
?f
"Try it" link to a query that retrieves the frequency histogram for this property. E.g. for the property official website (P856), as queried for instances of the class Q3918 (university), the query used is:
# This query was generated by Truly Tabular
# Count all Q3918:university items
# with the given official website(P856) https://www.wikidata.org/wiki/Property:P856
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX schema: <http://schema.org/>
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX wikibase: <http://wikiba.se/ontology#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
SELECT ?count (COUNT(?count) AS ?frequency) WHERE {{
SELECT ?item ?itemLabel (COUNT (?value) AS ?count)
WHERE
{
# instance of university
?item wdt:P31 wd:Q3918.
?item rdfs:label ?itemLabel.
FILTER (LANG(?itemLabel) = "en").
# official website
?item wdt:P856 ?value.
} GROUP BY ?item ?itemLabel
}}
GROUP BY ?count
ORDER BY DESC (?frequency)
?ex
try it! link to examples for "non-tabular" entries. E.g. for the property "manufacturer" of the class "beer" the query
# This query was generated by Truly Tabular
# Count all Q44:beer items
# with the given manufacturer(P176) https://www.wikidata.org/wiki/Property:P176
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX schema: <http://schema.org/>
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX wikibase: <http://wikiba.se/ontology#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
SELECT ?item ?itemLabel (COUNT (?value) AS ?count)
WHERE
{
# instance of beer
?item wdt:P31 wd:Q44.
?item rdfs:label ?itemLabel.
FILTER (LANG(?itemLabel) = "en").
# manufacturer
?item wdt:P176 ?value.
} GROUP BY ?item ?itemLabel
HAVING (COUNT (?value) > 1)
ORDER BY DESC(?count)
will be generated, which reveals that two beers each have two manufacturers: Žatecký Gus, manufactured by Carlsberg Ukraine and Baltika Breweries, and Balatoni Világos, manufactured by Nagykanizsai Sörgyár Rt. (until 1999) and Dreher Breweries.
✔
Check mark that the property statistics for this property have been calculated successfully.
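The statistics columns in this section can be derived from the frequency histogram, i.e. from pairs of (cardinality, number of items with that cardinality). The following Python sketch shows one way to do this. The function name `column_statistics` and the sample numbers are made up for illustration, and it assumes that nt% is taken relative to all items that have at least one value for the property, which this page does not state explicitly.

```python
def column_statistics(histogram: dict[int, int]) -> dict[str, float]:
    """Derive the statistics columns from {cardinality: number of items}."""
    total = sum(histogram.values())
    # Items with exactly one value are "truly tabular".
    tabular = histogram.get(1, 0)
    # Items with more than one value are "non-tabular".
    non_tabular = sum(freq for card, freq in histogram.items() if card > 1)
    return {
        "1": tabular,
        "maxf": max(histogram, default=0),
        "nt": non_tabular,
        # Assumption: percentage relative to all items in the histogram.
        "nt%": 100 * non_tabular / total if total else 0.0,
    }


# Made-up histogram: 90 items with one value, 8 with two, 2 with three.
stats = column_statistics({1: 90, 2: 8, 3: 2})
print(stats)
```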
Aggregates
count
apply SPARQL COUNT() aggregate
min
apply SPARQL MIN() aggregate
max
apply SPARQL MAX() aggregate
avg
apply SPARQL AVG() aggregate
sample
apply SPARQL SAMPLE() aggregate
list
apply GROUP_CONCAT() aggregate to avoid multiple solutions for the same instance
ignore
Ignore SPARQL solutions that have multiple values for the given property by using a HAVING COUNT<=1 aggregate condition in the generated query
label
Show the label of the property result in the generated SPARQL query.
select
If a property is selected it will be included in the generated SPARQL query
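To illustrate how the aggregate options above could map onto SPARQL fragments, here is a minimal Python sketch. It is not the tool's actual query generator: the function names, the result variable naming (e.g. `?website_count`), the use of DISTINCT and the `|` separator are all assumptions chosen for illustration.

```python
# Options that map one-to-one onto a SPARQL aggregate function.
AGGREGATES = {
    "count": "COUNT",
    "min": "MIN",
    "max": "MAX",
    "avg": "AVG",
    "sample": "SAMPLE",
}


def select_expression(var: str, aggregate: str) -> str:
    """Build a SELECT expression for one property variable."""
    if aggregate == "list":
        # GROUP_CONCAT folds all values of an instance into one solution.
        return f'(GROUP_CONCAT(DISTINCT ?{var}; SEPARATOR="|") AS ?{var}_list)'
    func = AGGREGATES[aggregate]
    return f"({func}(?{var}) AS ?{var}_{aggregate})"


def ignore_condition(var: str) -> str:
    """Build the HAVING condition that drops multi-valued solutions."""
    return f"HAVING (COUNT(?{var}) <= 1)"


print(select_expression("website", "count"))
print(select_expression("website", "list"))
print(ignore_condition("website"))
```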