Difference between revisions of "Pysotsog"
Revision as of 14:57, 20 November 2022
OsProject | |
---|---|
id | pysotsog |
state | active |
owner | WolfgangFahl |
title | pysotsog is a python library for scholars to help navigate the conceptual knowledge graph consisting of authors, organizations, papers, scientific events, and scientific event series |
url | https://github.com/WolfgangFahl/pysotsog |
version | 0.0.8 |
description | |
date | 2022-11-17 |
since | 2022-11-19 |
until |
Motivation
Standing on the shoulders of giants is a core motto for scholars doing research. To follow it, scholars need to be able to navigate the conceptual knowledge graph depicted in the diagram below. This knowledge graph is implemented in Wikidata, dblp, library catalogs such as TIB, and the general internet. Quite a few items for the relevant entities are accessible via the Scholia portal.
pysotsog is a Python library to improve the search, navigation and general accessibility of the items in this scholarly knowledge graph.
Search strategy
sotsog searches are specialized: they try to select results by relevance. For example, if you search for the country "Singapore", the disambiguation makes sure that the ghost town "Singapore" in Wikidata is ignored, since it is far less relevant to scientific work than the city-state of Singapore.
wd search Singapore
Q334 Singapore city-state in maritime Southeast Asia
Q3306197 Central Area, Singapore city centre of Singapore
Q4420036 Singapore in the Straits Settlements period of Singapore History
Q3484945 Singapore 1947 film by John Brahm
Q7522845 Singapore ghost town in Michigan
Q5124558 Civil Service College college for Singapore government employees
Q7522857 Singapore 1980 song by 2 Plus 1
Q30628723 Singapore settlement in South Africa
Q110537331 Singapore ship built in 1924
Q98150266 Singapore 2002 children's nonfiction book
Q20470370 Singapore listed historical ship in Sweden
Q30276503 Singapore preserved British 0-4-0ST locomotive
Q48990479 Singapore British-bred Thoroughbred racehorse
Q11893609 Singapore album by Frederik
Q7522855 Singapore 1960 film directed by Shakti Samanta
Q97987607 SINGAPORE Barque built in Aberdeen in 1833
Q115262842 Singapore geographic township in Ontario, Canada
Q84264331 Singapore ship built in 2004
Q170422 2010 Summer Youth Olympics 2010 edition of the Summer Youth Olympics
Q40176 Singapore MRT rapid transit system in Singapore
...
Q7522845 Singapore ghost town in Michigan
The above search is first filtered by relevant classes: the P31*/P279* relations of Wikidata (P31 "instance of", P279 "subclass of") are followed to find items that have a base class which is part of the conceptual scholarly knowledge graph (SKG) shown above.
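The class filter can be pictured as a SPARQL property path. The following is a minimal sketch only, not the query pysotsog actually runs: the function name `class_filter_query` and the choice of root classes (and their QIDs) are assumptions made for illustration, and the path is written in the common Wikidata form `wdt:P31/wdt:P279*`.

```python
# Hypothetical root classes of the scholarly knowledge graph.
# The selection and the QIDs are assumptions for this sketch, not taken from pysotsog.
SKG_ROOT_CLASSES = {
    "Q5": "human (scholar)",
    "Q43229": "organization",
    "Q13442814": "scholarly article",
    "Q2020153": "academic conference",
}


def class_filter_query(label: str, lang: str = "en", limit: int = 10) -> str:
    """Build a SPARQL query for items with the given label whose
    instance-of/subclass-of chain reaches one of the root classes."""
    roots = " ".join(f"wd:{qid}" for qid in SKG_ROOT_CLASSES)
    # escape backslashes and double quotes so the label is a valid SPARQL string literal
    escaped = label.replace("\\", "\\\\").replace('"', '\\"')
    return f"""SELECT DISTINCT ?item ?root WHERE {{
  VALUES ?root {{ {roots} }}
  ?item rdfs:label "{escaped}"@{lang} .
  ?item wdt:P31/wdt:P279* ?root .
}} LIMIT {limit}"""


print(class_filter_query("RWTH Aachen University", limit=5))
```

The sketch only builds the query text; sending it to a SPARQL endpoint is left out on purpose.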
For relevant "neighbors" of our instances the same holds true, depending on a relevance that is calculated as a function of frequency and/or value. For example, Singapore is rated high since the frequency of events in Singapore is high.
To get the frequency information we scan our sources regularly, e.g. using Wikidata, dblp and conferencecorpus as the sources. In the first phase the relevance calculations will focus on scientific events, since this work is part of the ConfIDent project.
A special case is "academic field", "topic of work" and the like, which are only tracked for the most relevant items, e.g. the most relevant 1,000, 10,000, 100,000, 1,000,000, 10,000,000 and so on. This is a "long-tail" issue that will only be covered by counting links and keeping as much relevance information as is technically and organizationally a "low-hanging fruit". Here no specific class handling is done anymore: the class info is only available by following the link. This does not mean that search by topic is impossible, since the topic itself will be searchable and a reverse "WhatLinksHere" search is possible, but no further structure is available directly in the sotsog code or infrastructure.
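A frequency-based relevance could be sketched as follows. This is an illustration under stated assumptions, not the pysotsog implementation: `rank_by_frequency` is a hypothetical helper, and the list of event locations is placeholder input rather than measured data.

```python
from collections import Counter


def rank_by_frequency(candidates, event_locations):
    """Order candidate QIDs by how often they occur as an event location,
    most frequent first."""
    frequency = Counter(event_locations)
    # sorted() is stable, so candidates with equal counts keep their input order
    return sorted(candidates, key=lambda qid: frequency[qid], reverse=True)


# Placeholder input: the location QIDs of three imaginary events, all in Q334 (Singapore)
event_locations = ["Q334", "Q334", "Q334"]
print(rank_by_frequency(["Q7522845", "Q334"], event_locations))
```

With this input the city-state Q334 is ranked ahead of the ghost town Q7522845, which never appears as an event location.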
Installation
pip install pysotsog
# alternatively if your pip is not a python3 pip
pip3 install pysotsog
# local install from source directory of pysotsog
pip install .
Upgrade
pip install pysotsog -U
# alternatively if your pip is not a python3 pip
pip3 install pysotsog -U
Open Source access
git clone https://github.com/WolfgangFahl/pysotsog
cd pysotsog
pip install .
Testing
pip install green
green
...
Ran 14 tests in 13.228s using 4 processes
OK (passes=14)
Usage
Command line
sotsog -h
usage: sotsog [-h] [-d] [-la LANG] [-li LIMIT] [-V] [search ...]
python Library for Scholars to achieve "Standing on the shoulders of giants"
Created by Wolfgang Fahl on 2022-11-16.
Copyright 2022 Wolfgang Fahl. All rights reserved.
Licensed under the Apache License 2.0
http://www.apache.org/licenses/LICENSE-2.0
Distributed on an "AS IS" basis without warranties
or conditions of any kind, either express or implied.
USAGE
positional arguments:
search search terms
options:
-h, --help show this help message and exit
-d, --debug show debug info
-la LANG, --lang LANG
language code to use
-li LIMIT, --limit LIMIT
limit the number of search results
-V, --version show program's version number and exit
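When calling the command line tool from a script, the argument list can be assembled from the options shown in the help text. The sketch below is a hypothetical wrapper (`sotsog_command` is not part of pysotsog); it uses only the `-d`, `-la` and `-li` flags listed above and does not run the command itself.

```python
import shlex


def sotsog_command(terms, lang=None, limit=None, debug=False):
    """Assemble the argument list for a sotsog call from the documented options."""
    argv = ["sotsog"]
    if debug:
        argv.append("-d")
    if lang is not None:
        argv += ["-la", lang]
    if limit is not None:
        argv += ["-li", str(limit)]
    # the search terms are the positional arguments and come last
    argv += list(terms)
    return argv


# shlex.join (Python 3.8+) renders the list as a shell-safe command line
print(shlex.join(sotsog_command(["Tim", "Berners-Lee"], lang="en", limit=5)))
# prints: sotsog -la en -li 5 Tim Berners-Lee
```

The returned list can be passed to `subprocess.run` once pysotsog is installed.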
Examples
Scholar
sotsog Tim Berners-Lee
Tim Berners-Lee(Q80):English computer scientist, inventor of the World Wide Web (born 1955)✅
Scholar➜Tim Berners-Lee:
wikiDataId=http://www.wikidata.org/entity/Q80
gndId=121649091
dblpId=b/TimBernersLee
orcid=0000-0003-1279-3709
homepage=http://www.w3.org/People/Berners-Lee/
givenName=http://www.wikidata.org/entity/Q1369663
familyName=http://www.wikidata.org/entity/Q18375238
gender=http://www.wikidata.org/entity/Q6581097
image=http://commons.wikimedia.org/wiki/Special:FilePath/Sir%20Tim%20Berners-Lee%20%28cropped%29.jpg
opening https://scholia.toolforge.org/author/Q80 in browser
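The `key=value` lines of such a result are easy to post-process. The following sketch assumes the output format stays exactly as printed above; `parse_properties` is a hypothetical helper, not a pysotsog function.

```python
def parse_properties(output: str) -> dict:
    """Collect the key=value lines of a sotsog result into a dictionary."""
    props = {}
    for line in output.splitlines():
        key, sep, value = line.strip().partition("=")
        # keep only lines of the form key=value where the key is a plain identifier;
        # header lines and the "opening ... in browser" line are skipped
        if sep and key.isidentifier():
            props[key] = value
    return props


sample = """Scholar➜Tim Berners-Lee:
wikiDataId=http://www.wikidata.org/entity/Q80
dblpId=b/TimBernersLee
orcid=0000-0003-1279-3709
opening https://scholia.toolforge.org/author/Q80 in browser"""
print(parse_properties(sample)["orcid"])
```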
Paper
We Need a Magna Carta for the Internet
sotsog We Need a Magna Carta for the Internet
We Need a Magna Carta for the Internet(Q55693402):✅
Paper➜We Need a Magna Carta for the Internet:
wikiDataId=http://www.wikidata.org/entity/Q55693402
DOI=10.1111/NPQU.11475
publication_date=2014-07-01 00:00:00
opening https://scholia.toolforge.org/work/Q55693402 in browser
Institution
sotsog RWTH
RWTH Aachen University(Q273263):university in Aachen, Germany✅
Institution➜RWTH Aachen University:
wikiDataId=http://www.wikidata.org/entity/Q273263
short_name=RWTH Aachen
inception=1870-10-10 00:00:00
country=http://www.wikidata.org/entity/Q183
image=http://commons.wikimedia.org/wiki/Special:FilePath/1196-18-rwth-aachen-hg-von-hendrik-brixius.jpg
located_in=http://www.wikidata.org/entity/Q1017
official_website=http://www.rwth-aachen.de
opening https://scholia.toolforge.org/organization/Q273263 in browser
Event Series
sotsog WWW
The Web Conference(Q3570023):conference series✅
EventSeries➜The Web Conference:
wikiDataId=http://www.wikidata.org/entity/Q3570023
short_name=WWW
title=The Web Conference
official_website=http://www.iw3c2.org/conferences/
DBLP_venue_ID=conf/www
inception=1994-01-01 00:00:00
gndId=1092529268
opening https://scholia.toolforge.org/event-series/Q3570023 in browser
Event
sotsog VNC 2021
2021 IEEE Vehicular Networking Conference (VNC)(Q109551429):2021 edition of VNC Conference on Vehicular Networking✅
Event➜2021 IEEE Vehicular Networking Conference (VNC):
wikiDataId=http://www.wikidata.org/entity/Q109551429
title=2021 IEEE Vehicular Networking Conference (VNC)
location=http://www.wikidata.org/entity/Q3012
official_website=https://ieee-vnc.org/2021/
opening https://scholia.toolforge.org/event/Q109551429 in browser
Proceedings
Proceedings of the 35th International Workshop on Description Logics (DL 2022)
sotsog "Proceedings of the 35th International Workshop on Description Logics (DL 2022)"
Proceedings of the 35th International Workshop on Description Logics (DL 2022)(Q115118238):Proceedings of DL 2022 workshop✅
Proceedings➜Proceedings of the 35th International Workshop on Description Logics (DL 2022):
wikiDataId=http://www.wikidata.org/entity/Q115118238
short_name=DL 2022
title=Proceedings of the 35th International Workshop on Description Logics (DL 2022)
publication_date=2022-11-03 00:00:00
full_work_available_at_URL=http://ceur-ws.org/Vol-3263/
opening https://scholia.toolforge.org/venue/Q115118238 in browser
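In the examples above, each entity type opens a different Scholia page. The mapping can be summarized in a small lookup; the path segments are taken from the "opening ..." lines of the examples, while the names `SCHOLIA_PATHS` and `scholia_url` are hypothetical and may not match the pysotsog code.

```python
# Scholia path segment per entity type, as seen in the example outputs
SCHOLIA_PATHS = {
    "Scholar": "author",
    "Paper": "work",
    "Institution": "organization",
    "EventSeries": "event-series",
    "Event": "event",
    "Proceedings": "venue",
}


def scholia_url(entity_type: str, qid: str) -> str:
    """Return the Scholia page for a Wikidata item of the given entity type."""
    return f"https://scholia.toolforge.org/{SCHOLIA_PATHS[entity_type]}/{qid}"


print(scholia_url("Scholar", "Q80"))
# prints: https://scholia.toolforge.org/author/Q80
```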
Dblp schema