Difference between revisions of "Geograpy"
(30 intermediate revisions by the same user not shown) | |||
Line 3: | Line 3: | ||
{{OsProject | {{OsProject | ||
|id=geograpy3 | |id=geograpy3 | ||
+ | |state=active | ||
|owner=somnathrakshit | |owner=somnathrakshit | ||
|title=geograpy | |title=geograpy | ||
|url=https://github.com/somnathrakshit/geograpy3 | |url=https://github.com/somnathrakshit/geograpy3 | ||
− | |version=0. | + | |version=0.2.6 |
− | |date= | + | |date=2023-04-08 |
+ | |since=2018-09-18 | ||
|storemode=property | |storemode=property | ||
}} | }} | ||
+ | =tickets= | ||
+ | |||
+ | =Free text= | ||
= What is it? = | = What is it? = | ||
Geograpy3 is a Python library to extract geographic details like: | Geograpy3 is a Python library to extract geographic details like: | ||
Line 18: | Line 23: | ||
== Examples == | == Examples == | ||
− | Let's take the [https:// | + | === Example 1 - London 2012 Olympic torch relay route === |
+ | Let's take the [https://en.wikipedia.org/wiki/2012_Summer_Olympics_torch_relay Wikipedia article on the 2012 London Olympics torch relay route]. In this article quite a few countries, regions and cities are mentioned. | ||
Let's extract that information using geograpy3: | Let's extract that information using geograpy3: | ||
− | === Code === | + | ==== Code ==== |
+ | [https://github.com/somnathrakshit/geograpy3/blob/master/examples/example1.py example1.py] | ||
+ | |||
<source lang='python'> | <source lang='python'> | ||
import geograpy | import geograpy | ||
− | url='https:// | + | url='https://en.wikipedia.org/wiki/2012_Summer_Olympics_torch_relay' |
− | places = geograpy.get_geoPlace_context(url = url) | + | places = geograpy.get_geoPlace_context(url = url) |
print(places) | print(places) | ||
</source> | </source> | ||
− | === Result === | + | |
+ | ==== Result ==== | ||
+ | <source lang='bash'> | ||
+ | python example1.py | ||
+ | </source> | ||
<source lang='json'> | <source lang='json'> | ||
− | countries=['Jersey', 'Guernsey', 'Greece', 'Belarus', 'South Africa', 'Australia', 'New Zealand', ' | + | countries=['Ireland', 'Jersey', 'Guernsey', 'Turkey', 'Greece', 'United Kingdom', 'Belarus', 'South Africa', 'Australia', 'New Zealand', 'Germany', 'France', 'Jamaica', 'Antigua and Barbuda', 'Montserrat', 'United States', 'Canada', 'Japan'] |
− | regions=[' | + | regions=['Hackney', 'Davy', 'Ireland', 'Burscough', 'Jersey', 'Munich', 'Newton Aycliffe', 'British/Irish', 'Plymouth', 'Greece', 'Thirsk', 'Wales', 'Locog', 'Cumbrian', 'Lincolnshire', 'Guernsey', 'Cardiff', 'Torch', 'Host', 'Cambridge', 'Bristol Harbour', 'Falmouth', 'Athens', 'Turkey', 'Wiltshire', 'British', 'England', 'United Kingdom', 'Sheffield', 'London', 'Aberaeron', 'Abraham', 'Northern Ireland', 'Wanted', 'East', 'Heathrow', 'Gravesend', 'Essex', 'Maidstone', 'Cornwall', 'Hyde Park', 'Hera', 'Swansea', 'Caerphilly', 'Taunton Lancashire', 'Stamford', 'Dublin', 'Derry', 'Portland', 'Ioannina', 'Scotland', 'Bangor', 'Engineering', 'Land'] |
− | cities=['Dublin', ' | + | cities=['Ioannina', 'Athens', 'Dublin', 'Hyde Park', 'Swansea', 'Sheffield', 'Portland', 'Cardiff', 'Cambridge', 'Bangor', 'Thirsk', 'Stamford', 'Plymouth', 'Newton Aycliffe', 'Maidstone', 'London', 'Hackney', 'Gravesend', 'Falmouth', 'Caerphilly', 'Burscough', 'Aberaeron', 'Munich', 'Derry', 'England', 'Scotland', 'Essex', 'Turkey', 'Davy', 'Ireland', 'Lincolnshire', 'Wales', 'Guernsey', 'Cornwall', 'Heathrow', 'Hera'] |
− | other=[] | + | other=['British', 'British', 'East'] |
+ | </source> | ||
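The result lists overlap: several names are reported both as a region and as a city, and <code>other</code> contains a duplicate. The following is only a minimal sketch of how such a result could be inspected with plain Python sets. It uses a hand-copied subset of the output above instead of calling geograpy, and the helper names <code>overlap</code> and <code>unique_in_order</code> are made up for this illustration.

```python
# Hand-copied subset of the example output above (not produced by calling geograpy here).
regions = ['Hackney', 'Ireland', 'Jersey', 'Munich', 'Wales', 'London', 'Torch', 'Host']
cities = ['Ioannina', 'Athens', 'Hackney', 'Munich', 'London', 'Ireland', 'Wales']
other = ['British', 'British', 'East']


def overlap(first, second):
    """Return the sorted names that occur in both lists (hypothetical helper)."""
    return sorted(set(first) & set(second))


def unique_in_order(names):
    """Drop duplicates while keeping the order of first occurrence (hypothetical helper)."""
    return list(dict.fromkeys(names))


both = overlap(regions, cities)
print(both)                    # names classified as region and as city
print(unique_in_order(other))  # 'other' without the repeated entry
```

Names that show up in both lists are candidates for a closer look, since a term such as 'Ireland' or 'Wales' is unlikely to be a city.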
+ | |||
+ | = Getting the source code = | ||
+ | <source lang='bash'> | ||
+ | git clone https://github.com/somnathrakshit/geograpy3 | ||
+ | cd geograpy3 | ||
+ | scripts/install | ||
</source> | </source> | ||
Line 54: | Line 73: | ||
== geograpy3 (2018) == | == geograpy3 (2018) == | ||
geograpy3 was forked from geograpy2 in 2018 by [https://github.com/somnathrakshit Somnath Rakshit]. It added Python 3 compatibility. In 2020 [https://github.com/WolfgangFahl Wolfgang Fahl] joined the project because he needed it for the [http://ptp.bitplan.com/ Proceedings Title Parser] as part of the [https://projects.tib.eu/en/confident/ ConfIDent project]. | geograpy3 was forked from geograpy2 in 2018 by [https://github.com/somnathrakshit Somnath Rakshit]. It added Python 3 compatibility. In 2020 [https://github.com/WolfgangFahl Wolfgang Fahl] joined the project because he needed it for the [http://ptp.bitplan.com/ Proceedings Title Parser] as part of the [https://projects.tib.eu/en/confident/ ConfIDent project]. | ||
− | * https://github.com/somnathrakshit/geograpy3 | + | * [https://github.com/somnathrakshit/geograpy3 geograpy github repository] |
− | * https://pypi.org/project/geograpy3/ | + | * [https://pypi.org/project/geograpy3/ geograpy on pypi] |
+ | * [https://stackoverflow.com/questions/tagged/geograpy Stackoverflow Questions about geograpy] | ||
+ | |||
+ | = Data used = | ||
+ | == Overview == | ||
+ | |||
+ | <uml> | ||
+ | title | ||
+ | geograpy Tables | ||
+ | 2021-08-13 | ||
+ | [[https://github.com/somnathrakshit/geograpy3 © 2020-2021 geograpy3 project]] | ||
+ | end title | ||
+ | package geograpy3 { | ||
+ | class cities << Entity >> { | ||
+ | city_name : TEXT | ||
+ | continent_code : TEXT | ||
+ | continent_name : TEXT | ||
+ | country_iso_code : TEXT | ||
+ | country_name : TEXT | ||
+ | geoname_id : TEXT <<PK>> | ||
+ | is_in_european_union : TEXT | ||
+ | locale_code : TEXT | ||
+ | metro_code : TEXT | ||
+ | subdivision_1_iso_code : TEXT | ||
+ | subdivision_1_name : TEXT | ||
+ | subdivision_2_iso_code : TEXT | ||
+ | subdivision_2_name : TEXT | ||
+ | time_zone : TEXT | ||
+ | } | ||
+ | class countries << Entity >> { | ||
+ | country : TEXT | ||
+ | countryCoord : TEXT | ||
+ | countryGDP_perCapita : FLOAT | ||
+ | countryIsoCode : TEXT | ||
+ | countryLabel : TEXT | ||
+ | countryPopulation : FLOAT | ||
+ | } | ||
+ | class regions << Entity >> { | ||
+ | country : TEXT | ||
+ | countryIsoCode : TEXT | ||
+ | countryLabel : TEXT | ||
+ | location : TEXT | ||
+ | region : TEXT | ||
+ | regionIsoCode : TEXT | ||
+ | regionLabel : TEXT | ||
+ | regionPopulation : FLOAT | ||
+ | } | ||
+ | class cityPops << Entity >> { | ||
+ | city : TEXT | ||
+ | cityLabel : TEXT | ||
+ | cityPop : FLOAT | ||
+ | country : TEXT | ||
+ | countryIsoCode : TEXT | ||
+ | countryLabel : TEXT | ||
+ | countryPopulation : FLOAT | ||
+ | geoNameId : TEXT | ||
+ | } | ||
+ | class citiesWithPopulation << Entity >> { | ||
+ | cityLabel : TEXT | ||
+ | cityPop : FLOAT | ||
+ | city_name : TEXT | ||
+ | country_iso_code : TEXT | ||
+ | country_name : TEXT | ||
+ | geoname_id : TEXT | ||
+ | subdivision_1_iso_code : TEXT | ||
+ | subdivision_1_name : TEXT | ||
+ | wikidataurl : TEXT | ||
+ | } | ||
+ | class Version << Entity >> { | ||
+ | version : TEXT <<PK>> | ||
+ | } | ||
+ | class cities_wikidata << Entity >> { | ||
+ | country_wikidataid : TEXT | ||
+ | lat : TEXT | ||
+ | level : INTEGER | ||
+ | locationKind : TEXT | ||
+ | lon : TEXT | ||
+ | name : TEXT | ||
+ | partOf : TEXT | ||
+ | population : INTEGER | ||
+ | region_wikidataid : TEXT | ||
+ | wikidataid : TEXT <<PK>> | ||
+ | } | ||
+ | class regions_wikidata << Entity >> { | ||
+ | country_wikidataid : TEXT | ||
+ | iso : TEXT | ||
+ | lat : FLOAT | ||
+ | level : INTEGER | ||
+ | locationKind : TEXT | ||
+ | lon : FLOAT | ||
+ | name : TEXT | ||
+ | population : FLOAT | ||
+ | wikidataid : TEXT <<PK>> | ||
+ | } | ||
+ | class countries_wikidata << Entity >> { | ||
+ | iso : TEXT | ||
+ | lat : FLOAT | ||
+ | level : INTEGER | ||
+ | locationKind : TEXT | ||
+ | lon : FLOAT | ||
+ | name : TEXT | ||
+ | population : FLOAT | ||
+ | wikidataid : TEXT <<PK>> | ||
+ | } | ||
+ | } | ||
+ | |||
+ | ' BITPlan Corporate identity skin params | ||
+ | ' Copyright (c) 2015-2020 BITPlan GmbH | ||
+ | ' see http://wiki.bitplan.com/PlantUmlSkinParams#BITPlanCI | ||
+ | ' skinparams generated by com.bitplan.restmodelmanager | ||
+ | skinparam note { | ||
+ | BackGroundColor #FFFFFF | ||
+ | FontSize 12 | ||
+ | ArrowColor #FF8000 | ||
+ | BorderColor #FF8000 | ||
+ | FontColor black | ||
+ | FontName Technical | ||
+ | } | ||
+ | skinparam component { | ||
+ | BackGroundColor #FFFFFF | ||
+ | FontSize 12 | ||
+ | ArrowColor #FF8000 | ||
+ | BorderColor #FF8000 | ||
+ | FontColor black | ||
+ | FontName Technical | ||
+ | } | ||
+ | skinparam package { | ||
+ | BackGroundColor #FFFFFF | ||
+ | FontSize 12 | ||
+ | ArrowColor #FF8000 | ||
+ | BorderColor #FF8000 | ||
+ | FontColor black | ||
+ | FontName Technical | ||
+ | } | ||
+ | skinparam usecase { | ||
+ | BackGroundColor #FFFFFF | ||
+ | FontSize 12 | ||
+ | ArrowColor #FF8000 | ||
+ | BorderColor #FF8000 | ||
+ | FontColor black | ||
+ | FontName Technical | ||
+ | } | ||
+ | skinparam activity { | ||
+ | BackGroundColor #FFFFFF | ||
+ | FontSize 12 | ||
+ | ArrowColor #FF8000 | ||
+ | BorderColor #FF8000 | ||
+ | FontColor black | ||
+ | FontName Technical | ||
+ | } | ||
+ | skinparam classAttribute { | ||
+ | BackGroundColor #FFFFFF | ||
+ | FontSize 12 | ||
+ | ArrowColor #FF8000 | ||
+ | BorderColor #FF8000 | ||
+ | FontColor black | ||
+ | FontName Technical | ||
+ | } | ||
+ | skinparam interface { | ||
+ | BackGroundColor #FFFFFF | ||
+ | FontSize 12 | ||
+ | ArrowColor #FF8000 | ||
+ | BorderColor #FF8000 | ||
+ | FontColor black | ||
+ | FontName Technical | ||
+ | } | ||
+ | skinparam class { | ||
+ | BackGroundColor #FFFFFF | ||
+ | FontSize 12 | ||
+ | ArrowColor #FF8000 | ||
+ | BorderColor #FF8000 | ||
+ | FontColor black | ||
+ | FontName Technical | ||
+ | } | ||
+ | skinparam object { | ||
+ | BackGroundColor #FFFFFF | ||
+ | FontSize 12 | ||
+ | ArrowColor #FF8000 | ||
+ | BorderColor #FF8000 | ||
+ | FontColor black | ||
+ | FontName Technical | ||
+ | } | ||
+ | hide Circle | ||
+ | ' end of skinparams ' | ||
+ | </uml> | ||
+ | |||
+ | == Cities == | ||
+ | The cities table is derived from the [http://dev.maxmind.com/geoip/geoip2/geolite2/ GeoLite2] database by MaxMind. | ||
+ | == Countries == | ||
+ | The countries table is derived from Wikidata: | ||
+ | <source lang='sparql'> | ||
+ | # get a list of countries | ||
+ | # for geograpy3 library | ||
+ | # see https://github.com/somnathrakshit/geograpy3/issues/15 | ||
+ | PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#> | ||
+ | PREFIX wd: <http://www.wikidata.org/entity/> | ||
+ | PREFIX wdt: <http://www.wikidata.org/prop/direct/> | ||
+ | PREFIX p: <http://www.wikidata.org/prop/> | ||
+ | PREFIX ps: <http://www.wikidata.org/prop/statement/> | ||
+ | PREFIX pq: <http://www.wikidata.org/prop/qualifier/> | ||
+ | # get country details | ||
+ | SELECT DISTINCT ?country ?countryLabel ?countryIsoCode ?countryCoord ?countryPopulation ?continent ?continentLabel | ||
+ | WHERE { | ||
+ | # instance of Country | ||
+ | ?country wdt:P31/wdt:P279* wd:Q6256 . | ||
+ | # VALUES ?country { wd:Q55}. | ||
+ | # label for the country | ||
+ | ?country rdfs:label ?countryLabel filter (lang(?countryLabel) = "en"). | ||
+ | # get the continent (s) | ||
+ | #OPTIONAL { | ||
+ | # ?country wdt:P30 ?continent. | ||
+ | # ?continent rdfs:label ?continentLabel filter (lang(?continentLabel) = "en"). | ||
+ | #} | ||
+ | # get the coordinates | ||
+ | OPTIONAL { | ||
+ | ?country wdt:P625 ?countryCoord. | ||
+ | } | ||
+ | # https://www.wikidata.org/wiki/Property:P297 ISO 3166-1 alpha-2 code | ||
+ | ?country wdt:P297 ?countryIsoCode. | ||
+ | # population of country | ||
+ | OPTIONAL | ||
+ | { | ||
+ | SELECT ?country (max(?countryPopulationValue) as ?countryPopulation) | ||
+ | WHERE { | ||
+ | ?country wdt:P1082 ?countryPopulationValue | ||
+ | } group by ?country | ||
+ | } | ||
+ | # https://www.wikidata.org/wiki/Property:P2132 | ||
+ | # nominal GDP per capita | ||
+ | # OPTIONAL { ?country wdt:P2132 ?countryGDP_perCapitaValue. } | ||
+ | } | ||
+ | ORDER BY ?countryIsoCode | ||
+ | </source> | ||
+ | [https://query.wikidata.org/#%23%20get%20a%20list%20of%20countries%0A%23%20for%20geograpy3%20library%0A%23%20see%20https%3A%2F%2Fgithub.com%2Fsomnathrakshit%2Fgeograpy3%2Fissues%2F15%0APREFIX%20rdfs%3A%20%3Chttp%3A%2F%2Fwww.w3.org%2F2000%2F01%2Frdf-schema%23%3E%0APREFIX%20wd%3A%20%3Chttp%3A%2F%2Fwww.wikidata.org%2Fentity%2F%3E%0APREFIX%20wdt%3A%20%3Chttp%3A%2F%2Fwww.wikidata.org%2Fprop%2Fdirect%2F%3E%0APREFIX%20p%3A%20%3Chttp%3A%2F%2Fwww.wikidata.org%2Fprop%2F%3E%0APREFIX%20ps%3A%20%3Chttp%3A%2F%2Fwww.wikidata.org%2Fprop%2Fstatement%2F%3E%0APREFIX%20pq%3A%20%3Chttp%3A%2F%2Fwww.wikidata.org%2Fprop%2Fqualifier%2F%3E%0A%23%20get%20City%20details%20with%20Country%0ASELECT%20DISTINCT%20%3Fcountry%20%3FcountryLabel%20%3FcountryIsoCode%20%20%3FcountryCoord%20%20%3FcountryPopulation%20%3Fcontinent%20%3FcontinentLabel%0AWHERE%20%7B%0A%20%20%23%20instance%20of%20Country%0A%20%20%3Fcountry%20wdt%3AP31%2Fwdt%3AP279%2a%20wd%3AQ6256%20.%0A%20%20%23%20VALUES%20%3Fcountry%20%7B%20wd%3AQ55%7D.%0A%20%20%23%20label%20for%20the%20country%0A%20%20%3Fcountry%20rdfs%3Alabel%20%3FcountryLabel%20filter%20%28lang%28%3FcountryLabel%29%20%3D%20%22en%22%29.%0A%20%20%23%20get%20the%20continent%20%28s%29%0A%20%20%23OPTIONAL%20%7B%0A%20%20%23%20%20%3Fcountry%20wdt%3AP30%20%3Fcontinent.%0A%20%20%23%20%20%3Fcontinent%20rdfs%3Alabel%20%3FcontinentLabel%20filter%20%28lang%28%3FcontinentLabel%29%20%3D%20%22en%22%29.%0A%20%20%23%7D%0A%20%20%23%20get%20the%20coordinates%0A%20%20OPTIONAL%20%7B%20%0A%20%20%20%20%20%20%3Fcountry%20wdt%3AP625%20%3FcountryCoord.%0A%20%20%7D%20%0A%20%20%23%20https%3A%2F%2Fwww.wikidata.org%2Fwiki%2FProperty%3AP297%20ISO%203166-1%20alpha-2%20code%0A%20%20%3Fcountry%20wdt%3AP297%20%3FcountryIsoCode.%0A%20%20%23%20population%20of%20country%20%20%20%0A%20%20OPTIONAL%0A%20%20%7B%20%0A%20%20%20%20SELECT%20%3Fcountry%20%28max%28%3FcountryPopulationValue%29%20as%20%3FcountryPopulation%29%0A%20%20%20%20WHERE%20%7B%0A%20%20%20%20%20%20%3Fcountry%20wdt%3AP1082%20%3FcountryPopulationValue%0A%20%20%20%20%7D%20group%20by%20%3Fcountry%0A%20%20%7D%0A%20%20%23%20https%3A%2F%2Fwww.wikidata.org%2Fwiki%2FProperty%3AP2132%0A%20%20%23%20nominal%20GDP%20per%20capita%0A%20%20%23%20OPTIONAL%20%7B%20%3Fcountry%20wdt%3AP2132%20%3FcountryGDP_perCapitaValue.%20%7D%0A%7D%0AORDER%20BY%20%3FcountryIsoCode try it!] - 204 results in some 7.9 s as of 2021-12 | ||
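The "try it!" link is the SPARQL text percent-encoded and appended after the <code>#</code> of the Wikidata Query Service address. As a rough sketch, such a link could be built and decoded with the Python standard library as shown below. The function names <code>try_it_link</code> and <code>query_from_link</code> and the shortened sample query are assumptions made for this illustration and are not part of geograpy3.

```python
from urllib.parse import quote, unquote

# Address of the Wikidata Query Service UI as used by the "try it!" links on this page.
WDQS_UI = "https://query.wikidata.org/#"


def try_it_link(sparql: str) -> str:
    """Build a link that opens the given query in the query service UI (hypothetical helper)."""
    # safe="" makes quote() percent-encode "/" as well, as in the links on this page.
    return WDQS_UI + quote(sparql, safe="")


def query_from_link(link: str) -> str:
    """Recover the SPARQL text from such a link (hypothetical helper)."""
    return unquote(link[len(WDQS_UI):])


# Shortened sample query for illustration only, not the full countries query.
sample = "# get a list of countries\nSELECT ?country WHERE { ?country wdt:P297 ?countryIsoCode. }"
link = try_it_link(sample)
print(link)
print(query_from_link(link) == sample)
```

Decoding an existing link the same way is a quick check that the displayed query and the linked query still match.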
+ | |||
+ | == Regions == | ||
+ | The regions list is derived from Wikidata: | ||
+ | <source lang='sparql'> | ||
+ | # get a list of regions | ||
+ | # for geograpy3 library | ||
+ | # see https://github.com/somnathrakshit/geograpy3/issues/15 | ||
+ | PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#> | ||
+ | PREFIX wd: <http://www.wikidata.org/entity/> | ||
+ | PREFIX wdt: <http://www.wikidata.org/prop/direct/> | ||
+ | PREFIX wikibase: <http://wikiba.se/ontology#> | ||
+ | SELECT DISTINCT ?country ?countryLabel ?countryIsoCode ?region ?regionLabel ?regionIsoCode ?regionPopulation ?location | ||
+ | WHERE | ||
+ | { | ||
+ | # administrative unit of first order | ||
+ | ?region wdt:P31/wdt:P279* wd:Q10864048. | ||
+ | OPTIONAL { | ||
+ | ?region rdfs:label ?regionLabel filter (lang(?regionLabel) = "en"). | ||
+ | } | ||
+ | # isocode state/province (mandatory - filters historic regions while at it ...) | ||
+ | # filter historic regions | ||
+ | # FILTER NOT EXISTS {?region wdt:P576 ?end} | ||
+ | { | ||
+ | SELECT ?region (max(?regionAlpha2) as ?regionIsoCode) (max(?regionPopulationValue) as ?regionPopulation) (max(?locationValue) as ?location) | ||
+ | WHERE { | ||
+ | ?region wdt:P300 ?regionAlpha2. | ||
+ | # get the population | ||
+ | # https://www.wikidata.org/wiki/Property:P1082 | ||
+ | OPTIONAL { | ||
+ | ?region wdt:P1082 ?regionPopulationValue | ||
+ | } | ||
+ | # get the location | ||
+ | # https://www.wikidata.org/wiki/Property:P625 | ||
+ | OPTIONAL { | ||
+ | ?region wdt:P625 ?locationValue. | ||
+ | } | ||
+ | } GROUP BY ?region | ||
+ | } | ||
+ | # country of the region (P17) with its label and ISO 3166-1 alpha-2 code (https://www.wikidata.org/wiki/Property:P297) | ||
+ | OPTIONAL { | ||
+ | ?region wdt:P17 ?country. | ||
+ | # label for the country | ||
+ | ?country rdfs:label ?countryLabel filter (lang(?countryLabel) = "en"). | ||
+ | ?country wdt:P297 ?countryIsoCode. | ||
+ | } | ||
+ | } ORDER BY ?regionIsoCode | ||
+ | </source> | ||
+ | [https://query.wikidata.org/#%23%20get%20a%20list%20of%20regions%0A%23%20for%20geograpy3%20library%0A%23%20see%20https%3A%2F%2Fgithub.com%2Fsomnathrakshit%2Fgeograpy3%2Fissues%2F15%0APREFIX%20rdfs%3A%20%3Chttp%3A%2F%2Fwww.w3.org%2F2000%2F01%2Frdf-schema%23%3E%0APREFIX%20wd%3A%20%3Chttp%3A%2F%2Fwww.wikidata.org%2Fentity%2F%3E%0APREFIX%20wdt%3A%20%3Chttp%3A%2F%2Fwww.wikidata.org%2Fprop%2Fdirect%2F%3E%0APREFIX%20wikibase%3A%20%3Chttp%3A%2F%2Fwikiba.se%2Fontology%23%3E%0ASELECT%20DISTINCT%20%3Fcountry%20%3FcountryLabel%20%3FcountryIsoCode%20%3Fregion%20%3FregionLabel%20%3FregionIsoCode%20%3FregionPopulation%20%3Flocation%0AWHERE%0A%7B%0A%20%20%23%20administrative%20unit%20of%20first%20order%0A%20%20%3Fregion%20wdt%3AP31%2Fwdt%3AP279%2a%20wd%3AQ10864048.%0A%20%20OPTIONAL%20%7B%0A%20%20%20%20%20%3Fregion%20rdfs%3Alabel%20%3FregionLabel%20filter%20%28lang%28%3FregionLabel%29%20%3D%20%22en%22%29.%0A%20%20%7D%0A%20%20%23%20isocode%20state%2Fprovince%20%28mandatory%20-%20filters%20historic%20regions%20while%20at%20it%20...%29%0A%20%20%23%20filter%20historic%20regions%0A%20%20%23%20FILTER%20NOT%20EXISTS%20%7B%3Fregion%20wdt%3AP576%20%3Fend%7D%0A%20%20%7B%20%0A%20%20%20%20SELECT%20%3Fregion%20%28max%28%3FregionAlpha2%29%20as%20%3FregionIsoCode%29%20%28max%28%3FregionPopulationValue%29%20as%20%3FregionPopulation%29%20%28max%28%3FlocationValue%29%20as%20%3Flocation%29%0A%20%20%20%20WHERE%20%7B%0A%20%20%20%20%20%20%3Fregion%20wdt%3AP300%20%3FregionAlpha2.%0A%20%20%20%20%20%20%23%20get%20the%20population%0A%20%20%20%20%20%20%23%20https%3A%2F%2Fwww.wikidata.org%2Fwiki%2FProperty%3AP1082%0A%20%20%20%20%20%20OPTIONAL%20%7B%0A%20%20%20%20%20%20%20%20%3Fregion%20wdt%3AP1082%20%3FregionPopulationValue%0A%20%20%20%20%20%20%7D%20%0A%20%20%20%20%20%20%23%20get%20he%20location%0A%20%20%20%20%20%20%23%20https%3A%2F%2Fwww.wikidata.org%2Fwiki%2FProperty%3AP625%0A%20%20%20%20%20%20OPTIONAL%20%7B%0A%20%20%20%20%20%20%20%20%3Fregion%20wdt%3AP625%20%3FlocationValue.%20%0A%20%20%20%20%20%20%20%7D%0A%20%20%20%20%7D%20GROUP%20BY%20%3Fregion%0A%20%20%7D%0A%20%20%23%20%23%20https%3A%2F%2Fwww.wikidata.org%2Fwiki%2FProperty%3AP297%0A%20%20OPTIONAL%20%7B%20%0A%20%20%20%20%3Fregion%20wdt%3AP17%20%3Fcountry.%0A%20%20%20%20%23%20label%20for%20the%20country%0A%20%20%20%20%3Fcountry%20rdfs%3Alabel%20%3FcountryLabel%20filter%20%28lang%28%3FcountryLabel%29%20%3D%20%22en%22%29.%0A%20%20%20%20%3Fcountry%20wdt%3AP297%20%3FcountryIsoCode.%20%0A%20%20%7D%0A%7D%20ORDER%20BY%20%3FregionIsoCode try it!] - 3753 results in 11.4 s as of 2021-08 | ||
= Adding city details from Wikidata = | = Adding city details from Wikidata = | ||
Line 95: | Line 395: | ||
[http://wiki.bitplan.com/images/confident/city_wikidata_population.db Sqlite version of query result] | [http://wiki.bitplan.com/images/confident/city_wikidata_population.db Sqlite version of query result] | ||
and e.g. inspect it with the [https://sqlitebrowser.org/ DB Browser for SQLite] | and e.g. inspect it with the [https://sqlitebrowser.org/ DB Browser for SQLite] | ||
+ | == CityPops Stats == | ||
+ | Here are some statistics queries on the data imported from Wikidata; each result is shown on the line after its query: | ||
+ | <source lang='sql' highlight='1,3,5,7-9'> | ||
+ | select count(*) from cityPops where cityPop is not Null | ||
+ | 164503 | ||
+ | select count(*) from cityPops | ||
+ | 453306 | ||
+ | select count(distinct geoNameId) from cityPops | ||
+ | 414198 | ||
+ | select count(*) | ||
+ | from cities c | ||
+ | join cityPops cp on c.geoname_id =cp.geoNameId | ||
+ | 90482 | ||
+ | </source> | ||
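The same kind of counts can be reproduced from Python with the standard <code>sqlite3</code> module. The sketch below is an illustration only: it builds a tiny in-memory database with just the columns the queries need and three sample rows (the last one is made up), so the numbers it prints are not the statistics shown above. The helper <code>scalar</code> is a hypothetical convenience function.

```python
import sqlite3

# In-memory stand-in for the real database; only the columns used by the queries.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE cities (geoname_id TEXT PRIMARY KEY, city_name TEXT);
CREATE TABLE cityPops (city TEXT, cityLabel TEXT, cityPop FLOAT, geoNameId TEXT);
""")
con.executemany("INSERT INTO cities VALUES (?, ?)", [
    ("1853909", "Osaka"),
    ("5128581", "New York"),
])
con.executemany("INSERT INTO cityPops VALUES (?, ?, ?, ?)", [
    ("http://www.wikidata.org/entity/Q35765", "Ōsaka", 2713157, "1853909"),
    ("http://www.wikidata.org/entity/Q60", "New York City", 8537673, "5128581"),
    # Made-up row: no population and no matching entry in cities.
    ("http://example.org/made-up", "Made-up town", None, "0"),
])


def scalar(sql: str):
    """Run a query that yields one value and return it (hypothetical helper)."""
    return con.execute(sql).fetchone()[0]


with_pop = scalar("select count(*) from cityPops where cityPop is not Null")
total = scalar("select count(*) from cityPops")
distinct_ids = scalar("select count(distinct geoNameId) from cityPops")
joined = scalar(
    "select count(*) from cities c join cityPops cp on c.geoname_id = cp.geoNameId"
)
print(with_pop, total, distinct_ids, joined)
```

To run the queries against the real data, the <code>connect</code> call would point at the SQLite file instead of <code>:memory:</code> and the table setup would be dropped.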
+ | == Difference in Name/Label == | ||
+ | 17499 differences: | ||
+ | <source lang='sql'> | ||
+ | select c.city_name as name,cp.cityLabel,c.*,city as wikidataurl,cityPop | ||
+ | from cities c | ||
+ | join cityPops cp | ||
+ | on c.geoname_id=cp.geoNameId | ||
+ | where not c.city_name =cp.cityLabel | ||
+ | group by geoNameId | ||
+ | </source> | ||
+ | {| class='wikitable sortable' | ||
+ | !name !!cityLabel !!id !!wikidataurl !!pop | ||
+ | |- | ||
+ | | ||Chongqing ||1814905 ||http://www.wikidata.org/entity/Q11725 ||30165500 | ||
+ | |- | ||
+ | | ||Shanghai ||1796231 ||http://www.wikidata.org/entity/Q8686 ||24152700 | ||
+ | |- | ||
+ | | ||Beijing ||2038349 ||http://www.wikidata.org/entity/Q956 ||21705000 | ||
+ | |- | ||
+ | | ||Delhi ||1273293 ||http://www.wikidata.org/entity/Q1353 ||16314838 | ||
+ | |- | ||
+ | | ||Tianjin ||1792943 ||http://www.wikidata.org/entity/Q11736 ||15469500 | ||
+ | |- | ||
+ | | ||Istanbul ||745042 ||http://www.wikidata.org/entity/Q406 ||14657434 | ||
+ | |- | ||
+ | | ||Metro Manila ||7521311 ||http://www.wikidata.org/entity/Q13580 ||12877253 | ||
+ | |- | ||
+ | | ||Moscow ||524894 ||http://www.wikidata.org/entity/Q649 ||12380664 | ||
+ | |- | ||
+ | | ||Seoul ||1835847 ||http://www.wikidata.org/entity/Q8684 ||9805506 | ||
+ | |- | ||
+ | | ||Jakarta ||1642907 ||http://www.wikidata.org/entity/Q3630 ||9607787 | ||
+ | |- | ||
+ | | ||Mexico City ||3527646 ||http://www.wikidata.org/entity/Q1489 ||8918653 | ||
+ | |- | ||
+ | | New York ||New York City ||5128581 ||http://www.wikidata.org/entity/Q60 ||8537673 | ||
+ | |- | ||
+ | | Bengaluru ||Bangalore ||1277333 ||http://www.wikidata.org/entity/Q1355 ||8425970 | ||
+ | |- | ||
+ | | ||Bogotá ||3688685 ||http://www.wikidata.org/entity/Q2841 ||8080734 | ||
+ | |- | ||
+ | | Prayagraj ||Allahabad ||1278994 ||http://www.wikidata.org/entity/Q162442 ||5954391 | ||
+ | |- | ||
+ | | ||Singapore ||1880251 ||http://www.wikidata.org/entity/Q334 ||5781728 | ||
+ | |- | ||
+ | | Changteh ||Changde ||1791121 ||http://www.wikidata.org/entity/Q416544 ||5717200 | ||
+ | |- | ||
+ | | ||Bangkok ||1609348 ||http://www.wikidata.org/entity/Q1861 ||5696409 | ||
+ | |- | ||
+ | | St Petersburg ||Saint Petersburg ||498817 ||http://www.wikidata.org/entity/Q656 ||5281579 | ||
+ | |- | ||
+ | | ||New Taipei City ||1665148 ||http://www.wikidata.org/entity/Q244898 ||3970644 | ||
+ | |- | ||
+ | | Sanaa ||Sana'a ||71137 ||http://www.wikidata.org/entity/Q2471 ||2957000 | ||
+ | |- | ||
+ | | ||Kiev ||703447 ||http://www.wikidata.org/entity/Q1899 ||2907684 | ||
+ | |- | ||
+ | | ||Buenos Aires ||3433955 ||http://www.wikidata.org/entity/Q1486 ||2890151 | ||
+ | |- | ||
+ | | Osaka ||Ōsaka ||1853909 ||http://www.wikidata.org/entity/Q35765 ||2713157 | ||
+ | |- | ||
+ | | ||Dubai ||292224 ||http://www.wikidata.org/entity/Q612 ||2502715 | ||
+ | |- | ||
+ | | Yaoundé ||Yaounde ||2220957 ||http://www.wikidata.org/entity/Q3808 ||2440462 | ||
+ | |- | ||
+ | | ||Baku ||587081 ||http://www.wikidata.org/entity/Q9248 ||2181800 | ||
+ | |- | ||
+ | | Basrah ||Basra ||99532 ||http://www.wikidata.org/entity/Q48195 ||2150000 | ||
+ | |- | ||
+ | | Taoyuan City ||Taoyuan District ||6696918 ||http://www.wikidata.org/entity/Q715975 ||2058328 | ||
+ | |- | ||
+ | | ||Vienna ||2761367 ||http://www.wikidata.org/entity/Q1741 ||1840573 | ||
+ | |- | ||
+ | | Makkah ||Mecca ||104515 ||http://www.wikidata.org/entity/Q5806 ||1675368 | ||
+ | |- | ||
+ | | Ecatepec ||Ecatepec de Morelos ||3529612 ||http://www.wikidata.org/entity/Q8972 ||1655015 | ||
+ | |- | ||
+ | | Kobe ||Kōbe ||1859171 ||http://www.wikidata.org/entity/Q48320 ||1532153 | ||
+ | |- | ||
+ | | Ulan Bator ||Ulaanbaatar ||2028462 ||http://www.wikidata.org/entity/Q23430 ||1396288 | ||
+ | |- | ||
+ | | ||Prague ||3067695 ||http://www.wikidata.org/entity/Q1085 ||1280508 | ||
+ | |- | ||
+ | | Nizhniy Novgorod ||Nizhny Novgorod ||520555 ||http://www.wikidata.org/entity/Q891 ||1264075 | ||
+ | |- | ||
+ | | ||Can Tho ||1581188 ||http://www.wikidata.org/entity/Q216075 ||1237300 | ||
+ | |- | ||
+ | | Kazan’ ||Kazan ||551487 ||http://www.wikidata.org/entity/Q900 ||1231878 | ||
+ | |- | ||
+ | | South Tangerang ||Tangerang Selatan ||8581443 ||http://www.wikidata.org/entity/Q10128 ||1219245 | ||
+ | |- | ||
+ | | Bandar Lampung ||Bandarlampung ||1624917 ||http://www.wikidata.org/entity/Q8156 ||1166761 | ||
+ | |- | ||
+ | | Bien Hoa ||Biên Hòa ||1587923 ||http://www.wikidata.org/entity/Q19316 ||1104000 | ||
+ | |- | ||
+ | | ||Tbilisi ||611716 ||http://www.wikidata.org/entity/Q994 ||1082400 | ||
+ | |- | ||
+ | | Ciudad Nezahualcoyotl ||Ciudad Nezahualcóyotl ||3530589 ||http://www.wikidata.org/entity/Q210307 ||1039867 | ||
+ | |- | ||
+ | | Ogbomoso ||Ogbomosho ||2327735 ||http://www.wikidata.org/entity/Q500366 ||1032000 | ||
+ | |- | ||
+ | | Shubra al Khaymah ||Shubra El-Kheima ||349076 ||http://www.wikidata.org/entity/Q269960 ||1025569 | ||
+ | |- | ||
+ | | Sao Goncalo ||São Gonçalo ||3449072 ||http://www.wikidata.org/entity/Q83114 ||999728 | ||
+ | |- | ||
+ | | Goyang-si ||Goyang ||1842485 ||http://www.wikidata.org/entity/Q42061 ||990073 | ||
+ | |- | ||
+ | | Dnipropetrovsk ||Dnipro ||709930 ||http://www.wikidata.org/entity/Q48256 ||983836 | ||
+ | |- | ||
+ | | Yongin-si ||Yongin ||1832426 ||http://www.wikidata.org/entity/Q18459 ||971327 | ||
+ | |- | ||
+ | | Kitakyushu ||Kitakyūshū ||1859307 ||http://www.wikidata.org/entity/Q188806 ||950646 | ||
+ | |- | ||
+ | | Heroica Matamoros ||Matamoros ||3523466 ||http://www.wikidata.org/entity/Q738353 ||918536 | ||
+ | |- | ||
+ | | Tiruchi ||Trichy ||1254388 ||http://www.wikidata.org/entity/Q207754 ||916857 | ||
+ | |- | ||
+ | | Moradabad ||Muradabad ||1262801 ||http://www.wikidata.org/entity/Q330643 ||887871 | ||
+ | |- | ||
+ | | Chihuahua City ||Chihuahua ||4014338 ||http://www.wikidata.org/entity/Q61302 ||870769 | ||
+ | |- | ||
+ | | Bucheon-si ||Bucheon ||1838716 ||http://www.wikidata.org/entity/Q42099 ||867678 | ||
+ | |- | ||
+ | | Cheongju-si ||Cheongju ||1845604 ||http://www.wikidata.org/entity/Q42147 ||833276 | ||
+ | |- | ||
+ | | Najaf ||An Najaf ||98860 ||http://www.wikidata.org/entity/Q168193 ||820000 | ||
+ | |- | ||
+ | | Nur-Sultan ||Astana ||1526273 ||http://www.wikidata.org/entity/Q1520 ||814401 | ||
+ | |- | ||
+ | | Al Hillah ||Hillah ||99347 ||http://www.wikidata.org/entity/Q243846 ||780000 | ||
+ | |- | ||
+ | | Antipolo City ||Antipolo ||1730501 ||http://www.wikidata.org/entity/Q1636 ||776386 | ||
+ | |- | ||
+ | | San Luis Potosí City ||San Luis Potosí ||3985606 ||http://www.wikidata.org/entity/Q204271 ||772828 | ||
+ | |- | ||
+ | | Krakow ||Kraków ||3094802 ||http://www.wikidata.org/entity/Q31487 ||766739 | ||
+ | |- | ||
+ | | Alexandrovsk ||Zaporizhzhya ||687700 ||http://www.wikidata.org/entity/Q157835 ||748984 | ||
+ | |- | ||
+ | | Trivandrum ||Thiruvananthapuram ||1254163 ||http://www.wikidata.org/entity/Q167715 ||743691 | ||
+ | |- | ||
+ | | Frankfurt am Main ||Frankfurt ||2925533 ||http://www.wikidata.org/entity/Q1794 ||732688 | ||
+ | |- | ||
+ | | ||Chișinău ||618069 ||http://www.wikidata.org/entity/Q21197 ||723500 | ||
+ | |- | ||
+ | | Chisinau ||Chișinău ||618426 ||http://www.wikidata.org/entity/Q21197 ||723500 | ||
+ | |- | ||
+ | | Acapulco de Juárez ||Acapulco ||3533462 ||http://www.wikidata.org/entity/Q81398 ||673479 | ||
+ | |- | ||
+ | | ||Bremen ||2944387 ||http://www.wikidata.org/entity/Q1209 ||661000 | ||
+ | |- | ||
+ | | Washington ||Washington, D.C. ||4140963 ||http://www.wikidata.org/entity/Q61 ||658893 | ||
+ | |- | ||
+ | | Jaboatao dos Guararapes ||Jaboatão dos Guararapes ||6317344 ||http://www.wikidata.org/entity/Q271393 ||644620 | ||
+ | |- | ||
+ | | Al Ain City ||Al Ain ||292913 ||http://www.wikidata.org/entity/Q234600 ||631005 | ||
+ | |- | ||
+ | | Querétaro City ||Querétaro ||3991164 ||http://www.wikidata.org/entity/Q173121 ||626495 | ||
+ | |- | ||
+ | | Taiz ||Ta'izz ||70225 ||http://www.wikidata.org/entity/Q466216 ||596672 | ||
+ | |- | ||
+ | | Macao ||Macau ||1821274 ||http://www.wikidata.org/entity/Q14773 ||566375 | ||
+ | |- | ||
+ | | ||Macau ||1821275 ||http://www.wikidata.org/entity/Q14773 ||566375 | ||
+ | |- | ||
+ | | Bacolod City ||Bacolod ||1729564 ||http://www.wikidata.org/entity/Q5217 ||561875 | ||
+ | |- | ||
+ | | Tlajomulco de Zuniga ||Tlajomulco de Zuñiga ||3981467 ||http://www.wikidata.org/entity/Q1962285 ||549442 | ||
+ | |- | ||
+ | | Poznan ||Poznań ||3088171 ||http://www.wikidata.org/entity/Q268 ||544612 | ||
+ | |- | ||
+ | | Naberezhnyye Chelny ||Naberezhnye Chelny ||523750 ||http://www.wikidata.org/entity/Q95041 ||529797 | ||
+ | |- | ||
+ | | New Mirpur ||Mirpur ||1169027 ||http://www.wikidata.org/entity/Q2579925 ||523500 | ||
+ | |- | ||
+ | | Tultitlan de Mariano Escobedo ||Tultitlán de Mariano Escobedo ||3515042 ||http://www.wikidata.org/entity/Q740477 ||520557 | ||
+ | |- | ||
+ | | Ile-Ife ||Ife ||2338900 ||http://www.wikidata.org/entity/Q180084 ||501952 | ||
+ | |- | ||
+ | | Mykolayiv ||Mykolaiv ||700569 ||http://www.wikidata.org/entity/Q41572 ||487996 | ||
+ | |- | ||
+ | | Ciudad Lopez Mateos ||Ciudad López Mateos ||3532624 ||http://www.wikidata.org/entity/Q1963483 ||472526 | ||
+ | |- | ||
+ | | Sao Joao de Meriti ||São João de Meriti ||3448877 ||http://www.wikidata.org/entity/Q459690 ||460541 | ||
+ | |- | ||
+ | | Aparecida de Goiania ||Aparecida de Goiânia ||6316406 ||http://www.wikidata.org/entity/Q459711 ||455657 | ||
+ | |- | ||
+ | | Malacca ||Malacca City ||1734759 ||http://www.wikidata.org/entity/Q61089 ||455300 | ||
+ | |- | ||
+ | | Amagasaki Shi ||Amagasaki ||1865383 ||http://www.wikidata.org/entity/Q213318 ||451000 | ||
+ | |- | ||
+ | | Marikina City ||Marikina ||1700925 ||http://www.wikidata.org/entity/Q17175 ||450741 | ||
+ | |- | ||
+ | | Al Mahallah al Kubra ||El-Mahalla El-Kubra ||360829 ||http://www.wikidata.org/entity/Q312723 ||442958 | ||
+ | |- | ||
+ | | Sao Jose do Rio Preto ||São José do Rio Preto ||3448639 ||http://www.wikidata.org/entity/Q192181 ||442548 | ||
+ | |- | ||
+ | | Al Mansurah ||Mansoura ||360761 ||http://www.wikidata.org/entity/Q223587 ||439348 | ||
+ | |- | ||
+ | | Jeju City ||Jeju ||1846266 ||http://www.wikidata.org/entity/Q42142 ||435413 | ||
+ | |- | ||
+ | | Diadema ||Diadema, São Paulo ||3464739 ||http://www.wikidata.org/entity/Q651891 ||412428 | ||
+ | |- | ||
+ | | Shekhupura ||Sheikhupura ||1165221 ||http://www.wikidata.org/entity/Q972756 ||411834 | ||
+ | |- | ||
+ | | Kisumu ||City of Kisumu ||191245 ||http://www.wikidata.org/entity/Q214485 ||409928 | ||
+ | |- | ||
+ | | Lapu-Lapu City ||Lapu-Lapu ||1707267 ||http://www.wikidata.org/entity/Q574903 ||408112 | ||
+ | |- | ||
+ | | Ciudad Obregón ||Ciudad culiacan ||4013704 ||http://www.wikidata.org/entity/Q681340 ||405000 | ||
+ | |- | ||
+ | | Gifu City ||Gifu ||1863641 ||http://www.wikidata.org/entity/Q45798 ||404233 | ||
+ | |- | ||
+ | | Assiut ||Asyut ||359783 ||http://www.wikidata.org/entity/Q29962 ||389307 | ||
+ | |- | ||
+ | | Mandaluyong City ||Mandaluyong ||1701966 ||http://www.wikidata.org/entity/Q9085 ||386276 | ||
+ | |- | ||
+ | | Okazaki-shi ||Okazaki ||1854373 ||http://www.wikidata.org/entity/Q242783 ||385221 | ||
+ | |- | ||
+ | | Sanandij ||Sanandaj ||117574 ||http://www.wikidata.org/entity/Q272093 ||373987 | ||
+ | |- | ||
+ | | Ambon City ||Ambon ||1651531 ||http://www.wikidata.org/entity/Q18970 ||372249 | ||
+ | |- | ||
+ | | Mogilev ||Mogilyov ||625665 ||http://www.wikidata.org/entity/Q154835 ||370712 | ||
+ | |- | ||
+ | | Jhang City ||Jhang ||1330335 ||http://www.wikidata.org/entity/Q1026616 ||365198 | ||
+ | |- | ||
+ | | Mandaue City ||Mandaue ||1701947 ||http://www.wikidata.org/entity/Q1889017 ||362654 | ||
+ | |- | ||
+ | | Mogale City ||Mogale City Local Municipality ||7291234 ||http://www.wikidata.org/entity/Q127787 ||362422 | ||
+ | |- | ||
+ | | Anantapur ||Anantapur, Andhra Pradesh ||1278672 ||http://www.wikidata.org/entity/Q760144 ||362350 | ||
+ | |- | ||
+ | | Hrodna ||Grodno ||627904 ||http://www.wikidata.org/entity/Q181376 ||361352 | ||
+ | |- | ||
+ | | Petionville ||Pétion-Ville ||3719028 ||http://www.wikidata.org/entity/Q1001440 ||359615 | ||
+ | |- | ||
+ | | Sao Vicente ||São Vicente ||3448136 ||http://www.wikidata.org/entity/Q272254 ||355542 | ||
+ | |- | ||
+ | | Serrekunda ||Serekunda ||2411989 ||http://www.wikidata.org/entity/Q217568 ||348118 | ||
+ | |- | ||
+ | | Baguio City ||Baguio ||1728930 ||http://www.wikidata.org/entity/Q1822 ||345366 | ||
+ | |- | ||
+ | | Iligan City ||Iligan ||1711082 ||http://www.wikidata.org/entity/Q285488 ||342618 | ||
+ | |- | ||
+ | | Binan ||Biñan ||1725115 ||http://www.wikidata.org/entity/Q75961 ||333028 | ||
+ | |- | ||
+ | | Lipa City ||Lipa ||1706090 ||http://www.wikidata.org/entity/Q1725 ||332386 | ||
+ | |- | ||
+ | | Volzhskiy ||Volzhsky ||472231 ||http://www.wikidata.org/entity/Q98995 ||326055 | ||
+ | |- | ||
+ | | Cordova ||Córdoba ||2519240 ||http://www.wikidata.org/entity/Q5818 ||325916 | ||
+ | |- | ||
+ | | Nazret ||Adama ||330186 ||http://www.wikidata.org/entity/Q351427 ||324000 | ||
+ | |- | ||
+ | | St Louis ||St. Louis ||4407066 ||http://www.wikidata.org/entity/Q38022 ||318416 | ||
+ | |- | ||
+ | | Ust-Kamenogorsk ||Oskemen ||1520316 ||http://www.wikidata.org/entity/Q162548 ||316699 | ||
+ | |- | ||
+ | | Al Fayyum ||Faiyum ||361320 ||http://www.wikidata.org/entity/Q203299 ||315940 | ||
+ | |- | ||
+ | | Malmo ||Malmö ||2692969 ||http://www.wikidata.org/entity/Q2211 ||307496 | ||
+ | |- | ||
+ | | Taubate ||Taubaté ||3446682 ||http://www.wikidata.org/entity/Q170540 ||302331 | ||
+ | |- | ||
+ | | Cabanatuan City ||Cabanatuan ||1721906 ||http://www.wikidata.org/entity/Q55595 ||302231 | ||
+ | |- | ||
+ | | Oaxaca City ||Oaxaca de Juárez ||3522507 ||http://www.wikidata.org/entity/Q131429 ||300050 | ||
+ | |- | ||
+ | | Bialystok ||Białystok ||776069 ||http://www.wikidata.org/entity/Q761 ||295624 | ||
+ | |- | ||
+ | | Los Reyes Acaquilpan ||La Paz ||3523908 ||http://www.wikidata.org/entity/Q2276224 ||293725 | ||
+ | |- | ||
+ | | Ismailia ||Ismaïlia ||361055 ||http://www.wikidata.org/entity/Q217156 ||293184 | ||
+ | |- | ||
+ | | Iasi ||Iași ||675810 ||http://www.wikidata.org/entity/Q16898350 ||290422 | ||
+ | |- | ||
+ | | Khmelnytskyy ||Khmelnytskyi ||706369 ||http://www.wikidata.org/entity/Q156717 ||290100 | ||
+ | |- | ||
+ | | Taboao da Serra ||Taboão da Serra ||3447186 ||http://www.wikidata.org/entity/Q841231 ||272177 | ||
+ | |- | ||
+ | | Nagaoka Shi ||Nagaoka ||1856195 ||http://www.wikidata.org/entity/Q540312 ||271722 | ||
+ | |- | ||
+ | | Santo Domingo de los Colorados ||Santo Domingo ||3651297 ||http://www.wikidata.org/entity/Q1015654 ||270875 | ||
+ | |- | ||
+ | | Yao-shi ||Yao ||1848519 ||http://www.wikidata.org/entity/Q490872 ||267581 | ||
+ | |- | ||
+ | | Lucena City ||Lucena ||1705357 ||http://www.wikidata.org/entity/Q104125 ||266248 | ||
+ | |- | ||
+ | | San Pablo City ||San Pablo ||1688830 ||http://www.wikidata.org/entity/Q76001 ||266068 | ||
+ | |- | ||
+ | | Nal'chik ||Nalchik ||523523 ||http://www.wikidata.org/entity/Q5265 ||265162 | ||
+ | |- | ||
+ | | Embu ||Embu das Artes ||3464305 ||http://www.wikidata.org/entity/Q651860 ||261781 | ||
+ | |- | ||
+ | | Mossoro ||Mossoró ||3394682 ||http://www.wikidata.org/entity/Q694845 ||259815 | ||
+ | |- | ||
+ | | Magugpo Poblacion ||Tagum ||1684269 ||http://www.wikidata.org/entity/Q725168 ||259444 | ||
+ | |- | ||
+ | | Misratah ||Misrata ||2214846 ||http://www.wikidata.org/entity/Q131323 ||259056 | ||
+ | |- | ||
+ | | Gomez Palacio ||Gómez Palacio ||4005775 ||http://www.wikidata.org/entity/Q1775128 ||257352 | ||
+ | |- | ||
+ | | Cukai ||Chukai ||1732945 ||http://www.wikidata.org/entity/Q2546441 ||255865 | ||
+ | |- | ||
+ | | Puerto Princesa City ||Puerto Princesa ||1692685 ||http://www.wikidata.org/entity/Q111739 ||255116 | ||
+ | |- | ||
+ | | Brasov ||Brașov ||683844 ||http://www.wikidata.org/entity/Q16898139 ||252814 | ||
+ | |- | ||
+ | | Braunschweig ||Brunswick ||2945024 ||http://www.wikidata.org/entity/Q2773 ||252768 | ||
+ | |- | ||
+ | | Coban ||Cobán ||3598119 ||http://www.wikidata.org/entity/Q867077 ||250675 | ||
+ | |- | ||
+ | | Al 'Ashir min Ramadan ||10th of Ramadan City ||353229 ||http://www.wikidata.org/entity/Q337539 ||250000 | ||
+ | |- | ||
+ | | Al Bayda' ||Al Bayda ||89055 ||http://www.wikidata.org/entity/Q35784 ||250000 | ||
+ | |- | ||
+ | | Galati ||Galați ||677697 ||http://www.wikidata.org/entity/Q16898261 ||249432 | ||
+ | |- | ||
+ | | Soka Shi ||Sōka City ||7464123 ||http://www.wikidata.org/entity/Q734442 ||249027 | ||
+ | |- | ||
+ | | Tehuacán ||Tehuacan ||3516109 ||http://www.wikidata.org/entity/Q842261 ||248716 | ||
+ | |- | ||
+ | | Port Montt ||Puerto Montt ||3874960 ||http://www.wikidata.org/entity/Q36214 ||245902 | ||
+ | |- | ||
+ | | Rishon LeZiyyon ||Rishon LeZion ||293703 ||http://www.wikidata.org/entity/Q201051 ||243973 | ||
+ | |- | ||
+ | | Tacloban City ||Tacloban ||1684712 ||http://www.wikidata.org/entity/Q40626 ||242089 | ||
+ | |- | ||
+ | | Dniprodzerzhynsk ||Kamianske ||709932 ||http://www.wikidata.org/entity/Q156719 ||237244 | ||
+ | |- | ||
+ | | Marabu ||Miri ||1738050 ||http://www.wikidata.org/entity/Q986803 ||234541 | ||
+ | |- | ||
+ | | Olongapo City ||Olongapo ||1697175 ||http://www.wikidata.org/entity/Q56759 ||233040 | ||
+ | |- | ||
+ | | Okara ||Okara, Pakistan ||1168718 ||http://www.wikidata.org/entity/Q968211 ||232386 | ||
+ | |- | ||
+ | | Petaẖ Tiqwa ||Petah Tikva ||293918 ||http://www.wikidata.org/entity/Q190828 ||230984 | ||
+ | |- | ||
+ | | Kamoke ||Kāmoke ||1175088 ||http://www.wikidata.org/entity/Q1260929 ||230979 | ||
+ | |- | ||
+ | | Durán ||Durán, Ecuador ||3658192 ||http://www.wikidata.org/entity/Q1120810 ||230839 | ||
+ | |- | ||
+ | | Barishal ||Barisal ||1336137 ||http://www.wikidata.org/entity/Q747840 ||230000 | ||
+ | |- | ||
+ | | Ashaiman ||Ashiaman ||2304121 ||http://www.wikidata.org/entity/Q724730 ||228509 | ||
+ | |- | ||
+ | | Talisay City ||Talisay ||1683881 ||http://www.wikidata.org/entity/Q316500 ||227645 | ||
+ | |- | ||
+ | | Mage ||Magé ||3458142 ||http://www.wikidata.org/entity/Q841218 ||227322 | ||
+ | |- | ||
+ | | Turkestan ||Turkistan ||1517945 ||http://www.wikidata.org/entity/Q848638 ||227098 | ||
+ | |- | ||
+ | | Engel's ||Engels ||563464 ||http://www.wikidata.org/entity/Q198748 ||225752 | ||
+ | |- | ||
+ | | Padangsidempuan ||Padang Sidempuan ||1214369 ||http://www.wikidata.org/entity/Q5974 ||225544 | ||
+ | |- | ||
+ | | Kremenchug ||Kremenchuk ||704147 ||http://www.wikidata.org/entity/Q156724 ||225216 | ||
+ | |- | ||
+ | | Saddiqabad ||Sadiqabad ||1166652 ||http://www.wikidata.org/entity/Q1251234 ||221866 | ||
+ | |- | ||
+ | | Ota-shi ||Ōta ||1853626 ||http://www.wikidata.org/entity/Q386179 ||221403 | ||
+ | |- | ||
+ | | Patan ||Lalitpur ||1282931 ||http://www.wikidata.org/entity/Q6647 ||220802 | ||
+ | |- | ||
+ | | Concepción de la Vega ||La Vega ||3509382 ||http://www.wikidata.org/entity/Q538953 ||220279 | ||
+ | |- | ||
+ | | Itaborai ||Itaboraí ||3460950 ||http://www.wikidata.org/entity/Q841244 ||218008 | ||
+ | |- | ||
+ | | Alor Star ||Alor Setar ||1736309 ||http://www.wikidata.org/entity/Q474868 ||217000 | ||
+ | |- | ||
+ | | Ormoc City ||Ormoc ||1697018 ||http://www.wikidata.org/entity/Q1014782 ||215031 | ||
+ | |- | ||
+ | | Padova ||Padua ||3171728 ||http://www.wikidata.org/entity/Q617 ||211560 | ||
+ | |- | ||
+ | | Ploieşti ||Ploiești ||670474 ||http://www.wikidata.org/entity/Q16898469 ||209945 | ||
+ | |- | ||
+ | | Isesaki Shi ||Isesaki ||1861435 ||http://www.wikidata.org/entity/Q328596 ||209895 | ||
+ | |- | ||
+ | | Sao Jose ||São José, Santa Catarina ||3448744 ||http://www.wikidata.org/entity/Q173849 ||209804 | ||
+ | |- | ||
+ | | Maracanau ||Maracanaú ||3395473 ||http://www.wikidata.org/entity/Q1794732 ||209057 | ||
+ | |- | ||
+ | | Mostoles ||Móstoles ||3116025 ||http://www.wikidata.org/entity/Q187826 ||206589 | ||
+ | |- | ||
+ | | Zanzibar ||Zanzibar City ||148730 ||http://www.wikidata.org/entity/Q2222874 ||205870 | ||
+ | |- | ||
+ | | Santa Luzia ||Santa Luzia, Minas Gerais ||3450144 ||http://www.wikidata.org/entity/Q942235 ||202942 | ||
+ | |- | ||
+ | | La Romana ||La Romana, La Romana ||3500957 ||http://www.wikidata.org/entity/Q40508 ||202488 | ||
+ | |- | ||
+ | | Tuy Hoa ||Tuy Hòa ||1563281 ||http://www.wikidata.org/entity/Q35747 ||202030 | ||
+ | |- | ||
+ | | Marawi City ||Marawi ||1701053 ||http://www.wikidata.org/entity/Q592338 ||201785 | ||
+ | |- | ||
+ | | Episkopi ||Episkopi, Limassol ||146633 ||http://www.wikidata.org/entity/Q5383505 ||201524 | ||
+ | |- | ||
+ | | Qina ||Qena ||350550 ||http://www.wikidata.org/entity/Q336661 ||201191 | ||
+ | |- | ||
+ | | Poza Rica de Hidalgo ||Poza Rica ||3521168 ||http://www.wikidata.org/entity/Q1010503 ||200119 | ||
+ | |- | ||
+ | | Prokop'yevsk ||Prokopyevsk ||1494114 ||http://www.wikidata.org/entity/Q184147 ||196406 | ||
+ | |- | ||
+ | | Joetsu ||Jōetsu ||6825489 ||http://www.wikidata.org/entity/Q582289 ||193777 | ||
+ | |- | ||
+ | | Bani Suwayf ||Beni Suef ||359173 ||http://www.wikidata.org/entity/Q394080 ||193048 | ||
+ | |- | ||
+ | | NIA Valencia ||Valencia ||1680116 ||http://www.wikidata.org/entity/Q2158 ||192993 | ||
+ | |- | ||
+ | | Kishiwada Shi ||Kishiwada City ||1859382 ||http://www.wikidata.org/entity/Q740456 ||192637 | ||
+ | |- | ||
+ | | Tottori-shi ||Tottori ||1849892 ||http://www.wikidata.org/entity/Q200731 ||191601 | ||
+ | |} |
Latest revision as of 07:51, 8 April 2023
OsProject
OsProject | |
---|---|
id | geograpy3 |
state | active |
owner | somnathrakshit |
title | geograpy |
url | https://github.com/somnathrakshit/geograpy3 |
version | 0.2.6 |
description | |
date | 2023-04-08 |
since | 2018-09-18 |
until |
tickets
Free text
What is it?
Geograpy3 is a Python library to extract geographic details like:
- country
- region
- city
from plaintext and websites.
Examples
Example 1 - London 2012 Olympic torch relay route
Let's take the Wikipedia article on the 2012 London Olympics torch relay route. The article mentions quite a few countries, regions and cities. Let's extract that information using geograpy3.
Code
import geograpy
url='https://en.wikipedia.org/wiki/2012_Summer_Olympics_torch_relay'
places = geograpy.get_geoPlace_context(url = url)
print(places)
Result
python example1.py
countries=['Ireland', 'Jersey', 'Guernsey', 'Turkey', 'Greece', 'United Kingdom', 'Belarus', 'South Africa', 'Australia', 'New Zealand', 'Germany', 'France', 'Jamaica', 'Antigua and Barbuda', 'Montserrat', 'United States', 'Canada', 'Japan']
regions=['Hackney', 'Davy', 'Ireland', 'Burscough', 'Jersey', 'Munich', 'Newton Aycliffe', 'British/Irish', 'Plymouth', 'Greece', 'Thirsk', 'Wales', 'Locog', 'Cumbrian', 'Lincolnshire', 'Guernsey', 'Cardiff', 'Torch', 'Host', 'Cambridge', 'Bristol Harbour', 'Falmouth', 'Athens', 'Turkey', 'Wiltshire', 'British', 'England', 'United Kingdom', 'Sheffield', 'London', 'Aberaeron', 'Abraham', 'Northern Ireland', 'Wanted', 'East', 'Heathrow', 'Gravesend', 'Essex', 'Maidstone', 'Cornwall', 'Hyde Park', 'Hera', 'Swansea', 'Caerphilly', 'Taunton Lancashire', 'Stamford', 'Dublin', 'Derry', 'Portland', 'Ioannina', 'Scotland', 'Bangor', 'Engineering', 'Land']
cities=['Ioannina', 'Athens', 'Dublin', 'Hyde Park', 'Swansea', 'Sheffield', 'Portland', 'Cardiff', 'Cambridge', 'Bangor', 'Thirsk', 'Stamford', 'Plymouth', 'Newton Aycliffe', 'Maidstone', 'London', 'Hackney', 'Gravesend', 'Falmouth', 'Caerphilly', 'Burscough', 'Aberaeron', 'Munich', 'Derry', 'England', 'Scotland', 'Essex', 'Turkey', 'Davy', 'Ireland', 'Lincolnshire', 'Wales', 'Guernsey', 'Cornwall', 'Heathrow', 'Hera']
other=['British', 'British', 'East']
Getting the source code
git clone https://github.com/somnathrakshit/geograpy3
cd geograpy3
scripts/install
History
first geograpy (2013)
The name "geograpy" was coined by Chris Albon.
Angela Oduor Lungat, Brunobg, Jonathon Morgan, Romina Suarez and other contributors from Ushahidi, Nairobi, Kenya created the first, popular version of geograpy. It was forked more than a hundred times and had more than 200 stars on GitHub.
This version was restricted to Python 2, and as of 2020-09 there were still some 29 open issues in the project. The project is officially archived, so you might want to use geograpy3 instead.
geograpy2 (2014)
The geograpy2 fork was created in 2014. It solves several problems, such as support for UTF-8, place names with multiple words and confusion over homonyms.
The project has not moved forward much since 2015, so you might want to use geograpy3 instead. https://github.com/Corollarium/geograpy2
geograpy3 (2018)
geograpy3 was forked from geograpy2 in 2018 by Somnath Rakshit. It added Python 3 compatibility. In 2020 Wolfgang Fahl joined the project because he needed it for the Proceedings Title Parser, which is part of the ConfIDent project.
Data used
Overview
Cities
The cities table is derived from the GeoLite2 database by MaxMind.
Countries
The countries table is derived from Wikidata:
# get a list of countries
# for geograpy3 library
# see https://github.com/somnathrakshit/geograpy3/issues/15
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX p: <http://www.wikidata.org/prop/>
PREFIX ps: <http://www.wikidata.org/prop/statement/>
PREFIX pq: <http://www.wikidata.org/prop/qualifier/>
# get country details
SELECT DISTINCT ?country ?countryLabel ?countryIsoCode ?countryCoord ?countryPopulation ?continent ?continentLabel
WHERE {
# instance of Country
?country wdt:P31/wdt:P279* wd:Q6256 .
# VALUES ?country { wd:Q55}.
# label for the country
?country rdfs:label ?countryLabel filter (lang(?countryLabel) = "en").
# get the continent (s)
#OPTIONAL {
# ?country wdt:P30 ?continent.
# ?continent rdfs:label ?continentLabel filter (lang(?continentLabel) = "en").
#}
# get the coordinates
OPTIONAL {
?country wdt:P625 ?countryCoord.
}
# https://www.wikidata.org/wiki/Property:P297 ISO 3166-1 alpha-2 code
?country wdt:P297 ?countryIsoCode.
# population of country
OPTIONAL
{
SELECT ?country (max(?countryPopulationValue) as ?countryPopulation)
WHERE {
?country wdt:P1082 ?countryPopulationValue
} group by ?country
}
# https://www.wikidata.org/wiki/Property:P2132
# nominal GDP per capita
# OPTIONAL { ?country wdt:P2132 ?countryGDP_perCapitaValue. }
}
ORDER BY ?countryIsoCode
try it! - 204 results in some 7.9 s as of 2021-12
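A query like the one above can also be run from a script instead of the "try it!" link. The following is a minimal sketch, not part of geograpy3: it assumes the public Wikidata Query Service endpoint at https://query.wikidata.org/sparql and the standard SPARQL JSON results format. The helper names build_request_url and bindings_to_rows are hypothetical, and the sample result is hand-written for illustration rather than real query output.

```python
from urllib.parse import urlencode

# Assumption: the public Wikidata Query Service endpoint
ENDPOINT = "https://query.wikidata.org/sparql"


def build_request_url(query: str, endpoint: str = ENDPOINT) -> str:
    """Return a GET URL that asks the endpoint for JSON results."""
    return endpoint + "?" + urlencode({"query": query, "format": "json"})


def bindings_to_rows(result: dict) -> list:
    """Flatten a SPARQL JSON result into a list of plain dicts.

    Variables left unbound by an OPTIONAL clause are mapped to None.
    """
    variables = result["head"]["vars"]
    rows = []
    for binding in result["results"]["bindings"]:
        rows.append({var: binding.get(var, {}).get("value") for var in variables})
    return rows


# Hand-written sample in the SPARQL JSON results format (illustrative only).
# The population is deliberately missing to show the OPTIONAL case.
sample = {
    "head": {"vars": ["country", "countryLabel", "countryIsoCode", "countryPopulation"]},
    "results": {
        "bindings": [
            {
                "country": {"type": "uri", "value": "http://www.wikidata.org/entity/Q183"},
                "countryLabel": {"type": "literal", "xml:lang": "en", "value": "Germany"},
                "countryIsoCode": {"type": "literal", "value": "DE"},
            }
        ]
    },
}

rows = bindings_to_rows(sample)
url = build_request_url("SELECT * WHERE {}")
print(rows[0]["countryLabel"], rows[0]["countryIsoCode"], rows[0]["countryPopulation"])
```

Mapping unbound variables to None keeps every row the same shape, which makes it easier to load the rows into a table afterwards.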
Regions
The regions list is derived from Wikidata:
# get a list of regions
# for geograpy3 library
# see https://github.com/somnathrakshit/geograpy3/issues/15
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX wikibase: <http://wikiba.se/ontology#>
SELECT DISTINCT ?country ?countryLabel ?countryIsoCode ?region ?regionLabel ?regionIsoCode ?regionPopulation ?location
WHERE
{
# administrative unit of first order
?region wdt:P31/wdt:P279* wd:Q10864048.
OPTIONAL {
?region rdfs:label ?regionLabel filter (lang(?regionLabel) = "en").
}
# isocode state/province (mandatory - filters historic regions while at it ...)
# filter historic regions
# FILTER NOT EXISTS {?region wdt:P576 ?end}
{
SELECT ?region (max(?regionAlpha2) as ?regionIsoCode) (max(?regionPopulationValue) as ?regionPopulation) (max(?locationValue) as ?location)
WHERE {
?region wdt:P300 ?regionAlpha2.
# get the population
# https://www.wikidata.org/wiki/Property:P1082
OPTIONAL {
?region wdt:P1082 ?regionPopulationValue
}
# get the location
# https://www.wikidata.org/wiki/Property:P625
OPTIONAL {
?region wdt:P625 ?locationValue.
}
} GROUP BY ?region
}
# # https://www.wikidata.org/wiki/Property:P297
OPTIONAL {
?region wdt:P17 ?country.
# label for the country
?country rdfs:label ?countryLabel filter (lang(?countryLabel) = "en").
?country wdt:P297 ?countryIsoCode.
}
} ORDER BY ?regionIsoCode
try it! - 3753 results in 11.4 s as of 2021-08
Adding city details from Wikidata
Query
# get a list of human settlements having a geoName identifier
# to add to geograpy3 library
# see https://github.com/somnathrakshit/geograpy3/issues/15
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX wd: <http://www.wikidata.org/entity/>
SELECT ?city ?cityLabel ?cityPop ?geoNameId ?country ?countryLabel ?countryIsoCode ?countryPopulation
WHERE {
# geoName Identifier
?city wdt:P1566 ?geoNameId.
# instance of human settlement https://www.wikidata.org/wiki/Q486972
?city wdt:P31/wdt:P279* wd:Q486972 .
# population of city
OPTIONAL { ?city wdt:P1082 ?cityPop.}
# label of the City
?city rdfs:label ?cityLabel filter (lang(?cityLabel) = "en").
# country this city belongs to
?city wdt:P17 ?country .
# label for the country
?country rdfs:label ?countryLabel filter (lang(?countryLabel) = "en").
# https://www.wikidata.org/wiki/Property:P297 ISO 3166-1 alpha-2 code
?country wdt:P297 ?countryIsoCode.
# population of country
?country wdt:P1082 ?countryPopulation.
OPTIONAL {
?country wdt:P2132 ?countryGdpPerCapita.
}
}
try it! - you will probably experience a timeout on this query. It takes about 1 min on a local Wikidata copy based on Blazegraph.
If you are interested in the result, you can download the SQLite version of the query result and, for example, inspect it with the DB Browser for SQLite.
CityPops Stats
Here are some statistical queries about the data imported from Wikidata:
select count(*) from cityPops where cityPop is not Null
164503
select count(*) from cityPops
453306
select count(distinct geoNameId) from cityPops
414198
select count(*)
from cities c
join cityPops cp on c.geoname_id =cp.geoNameId
90482
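The join count can be reproduced on toy data with Python's built-in sqlite3 module. This is a minimal sketch under stated assumptions: the table and column names follow the queries above, but the column types and the two unmatched rows (ids 1 and 2) are made up for illustration. The matched rows are taken from the name comparison table on this page.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed minimal schema; the real tables have more columns
conn.executescript(
    """
    CREATE TABLE cities (geoname_id INTEGER, city_name TEXT);
    CREATE TABLE cityPops (geoNameId INTEGER, cityLabel TEXT, cityPop INTEGER);
    """
)
conn.executemany(
    "INSERT INTO cities VALUES (?, ?)",
    [
        (2692969, "Malmo"),
        (3171728, "Padova"),
        (1, "Only in cities"),  # hypothetical row without a Wikidata match
    ],
)
conn.executemany(
    "INSERT INTO cityPops VALUES (?, ?, ?)",
    [
        (2692969, "Malmö", 307496),
        (3171728, "Padua", 211560),
        (2, "Only in cityPops", None),  # hypothetical row without a cities match
    ],
)

# Same join as above: only rows whose GeoNames id exists in both tables count
matched = conn.execute(
    """
    select count(*)
    from cities c
    join cityPops cp on c.geoname_id = cp.geoNameId
    """
).fetchone()[0]
print(matched)
```

Because this is an inner join, rows that exist in only one of the two tables do not contribute to the count.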
Difference in Name/Label
17499 differences:
select c.city_name as name,cp.cityLabel,c.*,city as wikidataurl,cityPop
from cities c
join cityPops cp
on c.geoname_id=cp.geoNameId
where not c.city_name =cp.cityLabel
group by geoNameId
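The effect of the comparison and of the group by clause can be checked on toy data. This is a minimal sketch with Python's sqlite3 module, assuming a reduced schema: the Malmo and Padova rows come from the name comparison table on this page, while the row with id 1 and the second population value for Malmö are made up to show an identical name and a duplicate Wikidata row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed minimal schema; the real tables have more columns
conn.executescript(
    """
    CREATE TABLE cities (geoname_id INTEGER, city_name TEXT);
    CREATE TABLE cityPops (geoNameId INTEGER, cityLabel TEXT, cityPop INTEGER);
    """
)
conn.executemany(
    "INSERT INTO cities VALUES (?, ?)",
    [
        (2692969, "Malmo"),
        (3171728, "Padova"),
        (1, "Sameville"),  # hypothetical city with identical name and label
    ],
)
conn.executemany(
    "INSERT INTO cityPops VALUES (?, ?, ?)",
    [
        (2692969, "Malmö", 307496),
        (2692969, "Malmö", 300000),  # hypothetical second population value
        (3171728, "Padua", 211560),
        (1, "Sameville", 1000),
    ],
)

# Reduced version of the query above: keep rows whose name and label differ,
# and collapse duplicate Wikidata rows per GeoNames id
differences = conn.execute(
    """
    select c.city_name as name, cp.cityLabel
    from cities c
    join cityPops cp on c.geoname_id = cp.geoNameId
    where not c.city_name = cp.cityLabel
    group by cp.geoNameId
    order by name
    """
).fetchall()
for name, label in differences:
    print(name, "->", label)
```

Without the group by clause, the duplicated Malmö row would be reported twice.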