Difference between revisions of "GOTExample2022"
(3 intermediate revisions by the same user not shown) | |||
Line 3: | Line 3: | ||
wikidata might be a single item of interest such as the https://upload.wikimedia.org/wikipedia/commons/thumb/2/2e/Game_of_Thrones_2011_logo.svg/160px-Game_of_Thrones_2011_logo.svg.png character [https://www.wikidata.org/wiki/Q3183235 Jon Snow] | wikidata might be a single item of interest such as the https://upload.wikimedia.org/wikipedia/commons/thumb/2/2e/Game_of_Thrones_2011_logo.svg/160px-Game_of_Thrones_2011_logo.svg.png character [https://www.wikidata.org/wiki/Q3183235 Jon Snow] | ||
[https://www.wikidata.org/wiki/Property:P31 instance of][https://www.wikidata.org/wiki/Q20086263 Game of thrones character] | [https://www.wikidata.org/wiki/Property:P31 instance of][https://www.wikidata.org/wiki/Q20086263 Game of thrones character] | ||
+ | |||
<graphviz> | <graphviz> | ||
digraph GOT { | digraph GOT { | ||
Line 21: | Line 22: | ||
} | } | ||
− | |||
</graphviz> | </graphviz> | ||
+ | ==== GOTCountQuery ==== | ||
+ | SPARQL Query to count Game of Thrones character items in Wikidata | ||
+ | |||
<source lang='sparql'> | <source lang='sparql'> | ||
SELECT (COUNT(DISTINCT ?item) AS ?count) WHERE | SELECT (COUNT(DISTINCT ?item) AS ?count) WHERE |
Latest revision as of 20:02, 5 August 2022
Game of Thrones / Jon Snow Example
A starting point for analyzing the data in a triplestore such as wikidata might be a single item of interest such as the character Jon Snow instance ofGame of thrones character
GOTCountQuery
SPARQL Query to count Game of Thrones character items in Wikidata
SELECT (COUNT(DISTINCT ?item) AS ?count) WHERE
{
# Q20086263:Game of Thrones character
?item wdt:P31 wd:Q20086263;
}
try it! The above SPARQL query will let us know the total amount of similar items we might find in Wikidata. which is 117 at the time of writing this work.
A quite natural idea is to create a query now to get a table of all Game of thrones characters with columns for the most interesting aspects.
As a starting point we might use the Wikidata Query Service and select the sample "Cats" and refine it for our purposes with some properties we find on Jon Snows page to create a "naive" query:
NaiveGOTQuery1
# Game of Thrones Characters
SELECT
?item ?itemLabel
?genderLabel
?countryLabel
?placeOfBirthLabel
?placeOfDeathLabel
WHERE
{
?item wdt:P31 wd:Q20086263. # must be a game of thrones character
?item wdt:P21 ?gender.
?item wdt:P27 ?country.
?item wdt:P19 ?placeOfBirth.
?item wdt:P20 ?placeOfDeath.
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
result
item | itemLabel | genderLabel | countryLabel | placeOfBirthLabel | placeOfDeathLabel |
---|---|---|---|---|---|
http://www.wikidata.org/entity/Q259818 | Eddard Stark | male | Seven Kingdoms | Winterfell | King's Landing |
http://www.wikidata.org/entity/Q1120793 | Theon Greyjoy | male | Seven Kingdoms | Pyke | Winterfell |
http://www.wikidata.org/entity/Q2941743 | Catelyn Stark | female | Seven Kingdoms | Riverrun | Twins |
http://www.wikidata.org/entity/Q3183235 | Jon Snow | male | Seven Kingdoms | Tower of Joy | Castle Black |
http://www.wikidata.org/entity/Q3665163 | Cersei Lannister | female | Seven Kingdoms | Casterly Rock | Red Keep |
http://www.wikidata.org/entity/Q3806180 | Jaime Lannister | male | Seven Kingdoms | Casterly Rock | Red Keep |
http://www.wikidata.org/entity/Q11680202 | Beric Dondarrion | male | Seven Kingdoms | The Stormlands | Winterfell |
http://www.wikidata.org/entity/Q12056060 | Stannis Baratheon | male | Seven Kingdoms | Storm's End | The North |
http://www.wikidata.org/entity/Q12815803 | Rickon Stark | male | Seven Kingdoms | Winterfell | The North |
http://www.wikidata.org/entity/Q12900597 | Joffrey Baratheon | male | Seven Kingdoms | King's Landing | King's Landing |
http://www.wikidata.org/entity/Q12900933 | Margaery Tyrell | female | Seven Kingdoms | Highgarden | Great Sept of Baelor |
http://www.wikidata.org/entity/Q12902445 | Tywin Lannister | male | Seven Kingdoms | Casterly Rock | King's Landing |
http://www.wikidata.org/entity/Q12902594 | Viserys Targaryen | male | Seven Kingdoms | King's Landing | Vaes Dothrak |
http://www.wikidata.org/entity/Q13634884 | Robb Stark | male | Seven Kingdoms | Riverrun | Twins |
http://www.wikidata.org/entity/Q13634885 | Robert Baratheon | male | Seven Kingdoms | Storm's End | King's Landing |
http://www.wikidata.org/entity/Q17355556 | Janos Slynt | male | Seven Kingdoms | King's Landing | The North |
http://www.wikidata.org/entity/Q18920137 | Loras Tyrell | male | Seven Kingdoms | Highgarden | Great Sept of Baelor |
http://www.wikidata.org/entity/Q19791067 | Lysa Arryn | female | Seven Kingdoms | Riverrun | Eyrie |
http://www.wikidata.org/entity/Q23730371 | Hoster Tully | male | Seven Kingdoms | Riverrun | Winterfell |
http://www.wikidata.org/entity/Q23747242 | Shireen Baratheon | female | Seven Kingdoms | Dragonstone | The North |
http://www.wikidata.org/entity/Q23759020 | Rickard Stark | male | Seven Kingdoms | Winterfell | King's Landing |
http://www.wikidata.org/entity/Q23759067 | Benjen Stark | male | Seven Kingdoms | Winterfell | Beyond the Wall |
http://www.wikidata.org/entity/Q23759071 | Lyanna Stark | female | Seven Kingdoms | Winterfell | Tower of Joy |
http://www.wikidata.org/entity/Q23782360 | Brynden Tully | male | Seven Kingdoms | Riverrun | Riverrun |
http://www.wikidata.org/entity/Q23782368 | Walder Frey | male | Seven Kingdoms | Twins | Twins |
http://www.wikidata.org/entity/Q24814689 | Aemon Targaryen | male | Seven Kingdoms | King's Landing | Castle Black |
http://www.wikidata.org/entity/Q28468107 | Doran Martell | male | Seven Kingdoms | Sunspear | Dorne |
http://www.wikidata.org/entity/Q55433701 | Jojen Reed | male | Seven Kingdoms | Greywater Watch | Beyond the Wall |
The result of this query is nicely tabular and can be exported to CSV format to be imported to our favorite relational database or spreadsheet - but it does not contain 117 entries any more but only 28. SPARQL queries will not return "NULL" values as our relational database query language SQL would. The query needs to be amended with the "OPTIONAL" keyword as shown below.
NaiveGOTQuery2
Naive Game of Thrones Character tabular query with OPTIONAL
# Game of Thrones Characters
SELECT
?item ?itemLabel
?genderLabel
?countryLabel
?placeOfBirthLabel
?placeOfDeathLabel
WHERE
{
?item wdt:P31 wd:Q20086263. # must be a game of thrones character
OPTIONAL { ?item wdt:P21 ?gender}.
OPTIONAL { ?item wdt:P27 ?country}.
OPTIONAL { ?item wdt:P19 ?placeOfBirth}.
OPTIONAL { ?item wdt:P20 ?placeOfDeath}.
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
This table looks promising it has 117 entries and the columns are filled as expected.
The real surprise for a novice will be the result of the Query below when three more columns have been added. All over sudden the query result has 304 rows for our 117 characters. There are duplicate rows for each character and strange combinations of entries. A notorious example seems to be Jorah Mormont having 8 different values for the "occupation" column.
NaiveGOTQuery3
Naive Game of Thrones Character tabular query with duplicate rows for some characters
# Game of Thrones Characters
SELECT
?item ?itemLabel
?genderLabel
?countryLabel
?placeOfBirthLabel
?placeOfDeathLabel
?killedByLabel
?familyLabel
?occupationLabel
WHERE
{
?item wdt:P31 wd:Q20086263. # must be a game of thrones character
OPTIONAL { ?item wdt:P21 ?gender}.
OPTIONAL { ?item wdt:P27 ?country}.
OPTIONAL { ?item wdt:P19 ?placeOfBirth}.
OPTIONAL { ?item wdt:P20 ?placeOfDeath}.
OPTIONAL { ?item wdt:P157 ?killedBy}.
OPTIONAL { ?item wdt:P53 ?family}.
OPTIONAL { ?item wdt:P106 ?occupation}.
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}