Difference between revisions of "Geograpy"

From BITPlan Wiki

Revision as of 09:16, 26 September 2020

OsProject

id  geograpy3
state  
owner  somnathrakshit
title  geograpy
url  https://github.com/somnathrakshit/geograpy3
version  0.1.15
description  
date  2020/09/26
since  
until  

What is it?

Geograpy3 is a Python library to extract geographic details like:

  • country
  • region
  • city

from plaintext and websites.

Examples

Let's take the BBC News article of May 2011, 'London 2012 Olympic torch relay route revealed'. Quite a few countries, regions and cities are mentioned in this article. Let's extract that information using geograpy3.

Code

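import geograpy

# URL of the article to analyze
url = 'https://www.bbc.com/news/av/world-africa-54272558'
# fetch the page and extract the countries, regions and cities it mentions
places = geograpy.get_geoPlace_context(url=url)
print(places)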

Result

countries=['Jersey', 'Guernsey', 'Greece', 'Belarus', 'South Africa', 'Australia', 'New Zealand', 'United Kingdom', 'Ireland', 'United States', 'Canada']
regions=['Newcastle', 'Bristol', 'Oxford', 'Southampton', 'Greek', 'Sheffield', 'Greece', 'Media', 'Land', 'Cornwall', 'June', 'Nottingham', 'London', 'Dublin', 'Belfast', 'Guernsey', 'Locog', 'Olympia', 'Shetland', 'Jersey', 'Cardiff']
cities=['Dublin', 'Newcastle', 'Belfast', 'Sheffield', 'Cardiff', 'Oxford', 'Southampton', 'Nottingham', 'London', 'Bristol', 'Media', 'Olympia', 'Guernsey', 'Cornwall']
other=[]
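
The printed lists overlap: the same name can show up as a country, a region and a city. The following is a minimal sketch for spotting such names. It assumes the values printed above have been copied into plain Python lists; it does not use the geograpy API, and the helper name categories_per_name is made up for this illustration.

```python
# Values copied from the result printed above (plain lists, not geograpy objects).
result = {
    "countries": ['Jersey', 'Guernsey', 'Greece', 'Belarus', 'South Africa', 'Australia',
                  'New Zealand', 'United Kingdom', 'Ireland', 'United States', 'Canada'],
    "regions": ['Newcastle', 'Bristol', 'Oxford', 'Southampton', 'Greek', 'Sheffield',
                'Greece', 'Media', 'Land', 'Cornwall', 'June', 'Nottingham', 'London',
                'Dublin', 'Belfast', 'Guernsey', 'Locog', 'Olympia', 'Shetland', 'Jersey',
                'Cardiff'],
    "cities": ['Dublin', 'Newcastle', 'Belfast', 'Sheffield', 'Cardiff', 'Oxford',
               'Southampton', 'Nottingham', 'London', 'Bristol', 'Media', 'Olympia',
               'Guernsey', 'Cornwall'],
}


def categories_per_name(result):
    """Map each place name to the list of categories it was reported in."""
    seen = {}
    for category, names in result.items():
        for name in names:
            seen.setdefault(name, []).append(category)
    return seen


# Keep only the names that were reported in more than one category.
ambiguous = {
    name: categories
    for name, categories in categories_per_name(result).items()
    if len(categories) > 1
}

for name, categories in sorted(ambiguous.items()):
    print(f"{name}: {', '.join(categories)}")
```

Names that appear in several categories, as well as entries such as 'June' or 'Media' in the regions list, are candidates for manual review before the result is used further.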

History

First geograpy (2013)

Angela Oduor Lungat, Brunobg, Jonathon Morgan, Romina Suarez and other contributors from Ushahidi, Nairobi, Kenya created the first, popular version of geograpy. This version was restricted to Python 2.

Adding city details from Wikidata

Query

# get a list of human settlements having a geoName identifier
# to add to geograpy3 library
# see https://github.com/somnathrakshit/geograpy3/issues/15
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX wd: <http://www.wikidata.org/entity/>
SELECT ?city ?cityLabel ?cityPop ?geoNameId ?country ?countryLabel ?countryIsoCode ?countryPopulation
WHERE {
  # geoName Identifier
  ?city wdt:P1566 ?geoNameId.
  # instance of human settlement https://www.wikidata.org/wiki/Q486972
  ?city wdt:P31/wdt:P279* wd:Q486972 .
  # population of city
  OPTIONAL { ?city wdt:P1082 ?cityPop.}

  # label of the City
  ?city rdfs:label ?cityLabel filter (lang(?cityLabel) = "en").
  # country this city belongs to
  ?city wdt:P17 ?country .
  # label for the country
  ?country rdfs:label ?countryLabel filter (lang(?countryLabel) = "en").
  # https://www.wikidata.org/wiki/Property:P297 ISO 3166-1 alpha-2 code
  ?country wdt:P297 ?countryIsoCode.
  # population of country
  ?country wdt:P1082 ?countryPopulation.
  OPTIONAL {
     ?country wdt:P2132 ?countryGdpPerCapita.
  }
}

Try it! You may well experience a timeout on this query. It takes about 1 min on a local Wikidata copy based on Blazegraph.
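
To run such a query from Python, a sketch along the following lines may help. It uses only the standard library. The endpoint is the public Wikidata SPARQL endpoint and can be swapped for the URL of a local Blazegraph instance; the function names, the User-Agent string and the 60 second timeout are assumptions made for this illustration, not part of geograpy3.

```python
import json
import urllib.parse
import urllib.request

# Public Wikidata SPARQL endpoint; pass the URL of a local copy instead if you have one.
WIKIDATA_ENDPOINT = "https://query.wikidata.org/sparql"


def build_request(query, endpoint=WIKIDATA_ENDPOINT):
    """Build a GET request that asks the endpoint for JSON results."""
    params = urllib.parse.urlencode({"query": query})
    return urllib.request.Request(
        f"{endpoint}?{params}",
        headers={
            "Accept": "application/sparql-results+json",
            # hypothetical client name, chosen for this example only
            "User-Agent": "geograpy3-wiki-example/0.1",
        },
    )


def run_query(query, endpoint=WIKIDATA_ENDPOINT, timeout=60):
    """Send the query and return the list of result bindings.

    Raises an exception (e.g. a timeout or HTTP error) if the endpoint
    does not answer within `timeout` seconds.
    """
    request = build_request(query, endpoint)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)["results"]["bindings"]
```

Calling run_query with the query text above sends the request; on the public endpoint the call is likely to fail with a timeout, as noted.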

If you are interested in the result, you can download the SQLite version of the query result and inspect it, e.g. with the DB Browser for SQLite.
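
The downloaded file can also be inspected from Python with the standard sqlite3 module. The sketch below makes no assumption about the table names in the file: it reads them from sqlite_master and counts the rows of each table. The function name summarize_tables is made up for this illustration.

```python
import sqlite3


def summarize_tables(db_path):
    """Return {table_name: row_count} for every table in the SQLite file."""
    connection = sqlite3.connect(db_path)
    try:
        # sqlite_master lists all schema objects; keep only the tables
        names = [
            row[0]
            for row in connection.execute(
                "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
            )
        ]
        return {
            name: connection.execute(f'SELECT COUNT(*) FROM "{name}"').fetchone()[0]
            for name in names
        }
    finally:
        connection.close()
```

For example, print(summarize_tables("cities.db")) shows the tables and their sizes, where cities.db is a placeholder for wherever the downloaded file was saved.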