OsProject | |
id | geograpy3 |
state | |
owner | somnathrakshit |
title | geograpy |
url | |
version | 0.1.15 |
description | |
date | 2020/09/26 |
since | |
until |
What is it?
Geograpy3 is a Python library to extract geographic details like:
- country
- region
- city
from plaintext and websites.
Let's take the BBC News article of May 2011, 'London 2012 Olympic torch relay route revealed'. This article mentions quite a few countries, regions and cities. Let's extract that information using geograpy3:
import geograpy
# url holds the address of the BBC News article mentioned above
places = geograpy.get_geoPlace_context(url=url)
countries=['Jersey', 'Guernsey', 'Greece', 'Belarus', 'South Africa', 'Australia', 'New Zealand', 'United Kingdom', 'Ireland', 'United States', 'Canada']
regions=['Newcastle', 'Bristol', 'Oxford', 'Southampton', 'Greek', 'Sheffield', 'Greece', 'Media', 'Land', 'Cornwall', 'June', 'Nottingham', 'London', 'Dublin', 'Belfast', 'Guernsey', 'Locog', 'Olympia', 'Shetland', 'Jersey', 'Cardiff']
cities=['Dublin', 'Newcastle', 'Belfast', 'Sheffield', 'Cardiff', 'Oxford', 'Southampton', 'Nottingham', 'London', 'Bristol', 'Media', 'Olympia', 'Guernsey', 'Cornwall']
Getting the source code
git clone
cd geograpy3
first geograpy (2013)
The name "geograpy" was coined by Chris Albon.
Angela Oduor Lungat, Brunobg, Jonathon Morgan, Romina Suarez and other contributors from Ushahidi, Nairobi, Kenya created the first, popular geograpy version. It was forked more than a hundred times and had more than 200 stars on GitHub.
This version was restricted to Python 2, and as of 2020-09 there are still some 29 open issues in this project. The project is officially archived, so you might want to use geograpy3 instead.
geograpy2 (2014)
The geograpy2 fork was created in 2014. It solves several problems (such as support for UTF-8, place names with multiple words, confusion over homonyms etc.).
Since 2015 the project has not moved forward much, so you might want to use geograpy3 instead.
geograpy3 (2018)
geograpy3 was forked from geograpy2 in 2018 by Somnath Rakshit. It added Python 3 compatibility. In 2020 Wolfgang Fahl joined the project since he needed it for the Proceedings Title Parser as part of the ConfIDent project.
Data used
geograpy Tables 2020-09-26 [© 2020 geograpy3 project], package geograpy3:
- cities: city_name TEXT, continent_code TEXT, continent_name TEXT, country_iso_code TEXT, country_name TEXT, geoname_id TEXT (primary key), is_in_european_union TEXT, locale_code TEXT, metro_code TEXT, subdivision_1_iso_code TEXT, subdivision_1_name TEXT, subdivision_2_iso_code TEXT, subdivision_2_name TEXT, time_zone TEXT
- countries: coord TEXT, country TEXT, countryGDP_perCapita FLOAT, countryIsoCode TEXT (primary key), countryLabel TEXT, countryPopulation FLOAT
- regions: country TEXT, countryIsoCode TEXT, countryLabel TEXT, location TEXT, region TEXT, regionIsoCode TEXT, regionLabel TEXT, regionPopulation FLOAT
- City_wikidata: cityPopulation FLOAT, coord TEXT, country TEXT, countryGDP_perCapita FLOAT, countryIsoCode TEXT, countryLabel TEXT, countryPopulation FLOAT, date TIMESTAMP, name TEXT, ratio TEXT, region TEXT, regionIsoCode TEXT, regionLabel TEXT, wikidataurl TEXT
- prefixes: count INTEGER, level INTEGER, prefix TEXT (primary key)
- ambiguous: name TEXT (primary key)
- cityPops: city TEXT, cityLabel TEXT, cityPop FLOAT, country TEXT, countryIsoCode TEXT, countryLabel TEXT, countryPopulation FLOAT, geoNameId TEXT
- citiesWithPopulation: cityPop FLOAT, city_name TEXT, continent_code TEXT, continent_name TEXT, country_iso_code TEXT, country_name TEXT, geoname_id TEXT, is_in_european_union TEXT, locale_code TEXT, metro_code TEXT, subdivision_1_iso_code TEXT, subdivision_1_name TEXT, subdivision_2_iso_code TEXT, subdivision_2_name TEXT, time_zone TEXT, wikidataurl TEXT
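To give a feel for how the tables above relate to each other, here is a minimal sketch using Python's built-in sqlite3 module. It recreates only a few columns of the cities and countries tables in an in-memory database; the rows, the ids such as "id-1" and the helper countries_for_city are illustrative assumptions and are not part of geograpy3.

```python
import sqlite3

# In-memory database with a reduced version of two tables from the schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE countries (
    countryIsoCode TEXT PRIMARY KEY,
    countryLabel TEXT
);
CREATE TABLE cities (
    geoname_id TEXT PRIMARY KEY,
    city_name TEXT,
    country_iso_code TEXT
);
""")

# Illustrative rows only; the ids are placeholders, not real GeoNames ids.
conn.executemany(
    "INSERT INTO countries VALUES (?, ?)",
    [("GB", "United Kingdom"), ("CA", "Canada")],
)
conn.executemany(
    "INSERT INTO cities VALUES (?, ?, ?)",
    [("id-1", "London", "GB"), ("id-2", "London", "CA"), ("id-3", "Cardiff", "GB")],
)


def countries_for_city(conn, city_name):
    """Return (ISO code, country label) pairs for every city with this name."""
    return conn.execute(
        """
        SELECT c.country_iso_code, co.countryLabel
        FROM cities c
        JOIN countries co ON co.countryIsoCode = c.country_iso_code
        WHERE c.city_name = ?
        ORDER BY c.country_iso_code
        """,
        (city_name,),
    ).fetchall()


london = countries_for_city(conn, "London")
print(london)
```

A city name that occurs in several countries yields several rows, which is the kind of homonym a place extractor has to disambiguate.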
Adding city details from Wikidata
# get a list of human settlements having a geoName identifier
# to add to geograpy3 library
# see
PREFIX rdfs: <>
PREFIX wdt: <>
PREFIX wd: <>
SELECT ?city ?cityLabel ?cityPop ?geoNameId ?country ?countryLabel ?countryIsoCode ?countryPopulation
WHERE {
  # geoName Identifier
  ?city wdt:P1566 ?geoNameId.
  # instance of human settlement
  ?city wdt:P31/wdt:P279* wd:Q486972 .
  # population of city
  OPTIONAL { ?city wdt:P1082 ?cityPop.}
  # label of the City
  ?city rdfs:label ?cityLabel filter (lang(?cityLabel) = "en").
  # country this city belongs to
  ?city wdt:P17 ?country .
  # label for the country
  ?country rdfs:label ?countryLabel filter (lang(?countryLabel) = "en").
  # ISO 3166-1 alpha-2 code
  ?country wdt:P297 ?countryIsoCode.
  # population of country
  ?country wdt:P1082 ?countryPopulation.
  # nominal GDP per capita of country
  ?country wdt:P2132 ?countryGdpPerCapita.
}
Try it! You will probably experience a timeout on this query. It takes about 1 min on a local Wikidata copy based on Blazegraph.
If you are interested in the result, you can download the SQLite version of the query result and inspect it, e.g. with the DB Browser for SQLite.
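As an alternative to a graphical browser, a downloaded SQLite file can also be inspected from Python. The following is a minimal sketch using only the standard library; the helper table_row_counts is a hypothetical name, and the demo builds a small temporary database with placeholder content because the file name of the actual download is not given here.

```python
import os
import sqlite3
import tempfile


def table_row_counts(db_path):
    """Return a dict mapping each table name in the SQLite file to its row count."""
    conn = sqlite3.connect(db_path)
    try:
        names = [
            row[0]
            for row in conn.execute(
                "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
            )
        ]
        return {
            name: conn.execute(f'SELECT COUNT(*) FROM "{name}"').fetchone()[0]
            for name in names
        }
    finally:
        conn.close()


# Demo on a throwaway database; replace demo_path with the downloaded file.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.db")
    demo = sqlite3.connect(demo_path)
    demo.execute("CREATE TABLE cityPops (city TEXT, cityPop FLOAT)")
    demo.execute("INSERT INTO cityPops VALUES ('placeholder', NULL)")
    demo.commit()
    demo.close()
    counts = table_row_counts(demo_path)

print(counts)
```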
CityPops Stats
Here are some statistical queries about the data imported from Wikidata:
select count(*) from cityPops where cityPop is not Null
select count(*) from cityPops
select count(distinct geoNameId) from cityPops
select count(*)
from cities c
join cityPops cp on c.geoname_id = cp.geoNameId
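The four queries above can be run together from Python. This is a sketch against a tiny in-memory database with made-up rows (the ids "g1", "wd1" and so on are placeholders), so the counts it prints say nothing about the real data; only the SQL statements are taken from this page.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE cities (geoname_id TEXT PRIMARY KEY, city_name TEXT);
CREATE TABLE cityPops (city TEXT, cityLabel TEXT, cityPop FLOAT, geoNameId TEXT);
""")

# Made-up rows: one geoNameId appears twice, one has no match in cities.
conn.executemany(
    "INSERT INTO cities VALUES (?, ?)",
    [("g1", "A"), ("g2", "B"), ("g3", "C")],
)
conn.executemany(
    "INSERT INTO cityPops VALUES (?, ?, ?, ?)",
    [
        ("wd1", "A", 1000.0, "g1"),
        ("wd2", "A again", None, "g1"),
        ("wd3", "B", None, "g2"),
        ("wd4", "D", 500.0, "g9"),
    ],
)

queries = {
    "with_population": "select count(*) from cityPops where cityPop is not Null",
    "total": "select count(*) from cityPops",
    "distinct_geoname_ids": "select count(distinct geoNameId) from cityPops",
    "matched_in_cities": (
        "select count(*) from cities c "
        "join cityPops cp on c.geoname_id = cp.geoNameId"
    ),
}

# Each query returns a single row with a single count.
stats = {name: conn.execute(sql).fetchone()[0] for name, sql in queries.items()}
print(stats)
```

Comparing total with distinct_geoname_ids shows how many Wikidata entries share a geoNameId, and matched_in_cities shows how many entries can be joined to the cities table.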