Difference between revisions of "Get your own copy of WikiData"
(Created page with "= Why would you want your own WikiData copy? = The resources behind https://query.wikidata.org/ are scarce and used by a lot of people. You might hit the https://www.wikidata....") |
|||
Line 5: | Line 5: | ||
= Success Reports = | = Success Reports = | ||
− | + | == 2022 - 2024 == | |
+ | {{#ask: [[Concept:Import]]|format=count}} | ||
+ | {{#forminput:form=Import|button text=add Import}} | ||
+ | {{#ask: [[Concept:Import]] | ||
+ | |mainlabel=Import | ||
+ | |?Import state = state | ||
+ | |?Import url = url | ||
+ | |?Import target = target | ||
+ | |?Import start = start | ||
+ | |?Import end = end | ||
+ | |?Import days = days | ||
+ | |?Import os = os | ||
+ | |?Import cpu = cpu | ||
+ | |?Import ram = ram | ||
+ | |?Import triples = triples | ||
+ | |limit=200 | ||
+ | |sort=Import start | ||
+ | |order=desc | ||
+ | }} | ||
+ | == 2017-2022 == | ||
{| class="wikitable" | {| class="wikitable" | ||
|- | |- | ||
! # Date !! Source !! Target !! Triples !! days !! RAM GB !! CPU Cores !! Speed !! Link | ! # Date !! Source !! Target !! Triples !! days !! RAM GB !! CPU Cores !! Speed !! Link | ||
|- | |- | ||
− | | | + | | 2022-07 || latest-all.ttl (2022-07-12) || [https://www.stardog.com stardog] || 17.2 billion || 1 d 19 h || 253 || || || {{Link|target=WikiData_Import_2022-07-12|title=Tim Holzheim - BITPlan Wiki}} |
+ | |- | ||
+ | | 2022-06 || latest-all.nt (2022-06-25) || [https://qlever.cs.uni-freiburg.de/wikidata QLever] || 17.2 billion || 1 d 2 h || 128 || 8 || 1.8 GHz || {{Link|target=WikiData_Import_2022-06-25|title=Wolfgang Fahl - BITPlan Wiki}} | ||
|- | |- | ||
− | | | + | | 2022-05 || latest-all.ttl.bz2 (2022-05-29) || [https://qlever.cs.uni-freiburg.de/wikidata QLever] || ~17 billion || 14h || 128 || 12/24 || 4.8 GHz boost || {{Link|target=https://github.com/ad-freiburg/qlever/wiki/Using-QLever-for-Wikidata|title=Hannah Bast - QLever}} |
|- | |- | ||
− | | | + | | 2022-02 || latest-all.nt (2022-02) || [https://www.stardog.com stardog] || 16.7 billion || 9h || || || || {{Link|target=https://www.stardog.com/labs/blog/wikidata-in-stardog|title=Evren Sirin - stardog}} |
|- | |- | ||
− | | | + | | 2022-02 || latest-all.nt (2022-01-29) || [https://qlever.cs.uni-freiburg.de/wikidata QLever] || 16.9 billion || 4 d 2 h || 127 || 8 || 1.8 GHz || {{Link|target=WikiData_Import_2022-01-29|title=Wolfgang Fahl - BITPlan Wiki}} |
|- | |- | ||
− | | | + | | 2020-08 || latest-all.nt (2020-08-15) || Apache Jena || 13.8 billion || 9 d 21 h || 64 || || || {{Link|target=WikiData_Import_2020-08-15|title=Wolfgang Fahl BITPlan Wiki}} |
|- | |- | ||
− | | | + | | 2020-07 || latest-truthy.nt (2020-07-15) || Apache Jena || 5.2 billion || 4 d 14 h || 64 || || || {{Link|target=WikiData_Import_2020-07-15|title=Wolfgang Fahl BITPlan Wiki}} |
|- | |- | ||
− | | | + | | 2020-06 || latest-all.ttl (2020-04-28) || Apache Jena || 12.9 billion || 6 d 16 h || ? || || || [https://issues.apache.org/jira/browse/JENA-1909 Jonas Sourlier - Jena Issue 1909] |
|- | |- | ||
| 2020-03 || latest-all.nt.bz2 (2020-03-01) || Virtuoso || ~11.8 billion || 10 hours + 1day prep || 248 || || || [https://community.openlinksw.com/t/loading-wikidata-into-virtuoso-open-source-or-enterprise-edition/2717 Hugh Williams - Virtuoso] | | 2020-03 || latest-all.nt.bz2 (2020-03-01) || Virtuoso || ~11.8 billion || 10 hours + 1day prep || 248 || || || [https://community.openlinksw.com/t/loading-wikidata-into-virtuoso-open-source-or-enterprise-edition/2717 Hugh Williams - Virtuoso] | ||
|- | |- | ||
− | | | + | | 2019-10 || || blazegraph || ~10 billion || 5.5 d|| 104 || 16 || || [https://addshore.com/2019/10/your-own-wikidata-query-service-with-no-limits/ Adam Shoreland Wikimedia Foundation] |
|- | |- | ||
− | | | + | | 2019-09 || latest-all.ttl (2019-09)|| Virtuoso || 9.5 billion || 9.1 hours || ? || || || [https://lists.wikimedia.org/pipermail/wikidata/2019-September/013420.html Adam Sanchez - WikiData mailing list] |
|- | |- | ||
− | | | + | | 2019-05 || wikidata-20190513-all-BETA.ttl || Virtuoso || ? || 43 hours || ? || || ||[https://lists.wikimedia.org/pipermail/wikidata/2019-June/013201.html Adam |- |
|- | |- | ||
− | | | + | | 2019-05 || wikidata-20190513-all-BETA.ttl || Blazegraph || ? || 10.2 days || || || ||[https://lists.wikimedia.org/pipermail/wikidata/2019-June/013201.html Adam Sanchez WikiData mailing list] |
|- | |- | ||
− | | | + | | 2019-02 || latest-all.ttl.gz || Apache Jena || ? || > 2 days || ? || || ||[https://muncca.com/2019/02/14/wikidata-import-in-apache-jena corsin - muncca blog] |
|- | |- | ||
− | | | + | | 2018-01 || wikidata-20180101-all-BETA.ttl || Blazegraph || 3 billion || 4 days || 32 || 4 || 2.2 GHz || [http://wiki.bitplan.com/index.php?title=Get_your_own_copy_of_WikiData&action Wolfgang Fahl - BITPlan wiki] |
|- | |- | ||
− | | | + | | 2017-12 || latest-truthy.nt.gz || Apache Jena || ? || 8 hours || ? || || || [https://lists.apache.org/thread.html/70dde8e3d99ce3d69de613b5013c3f4c583d96161dec494ece49a412%40%3Cusers.jena.apache.org%3E Andy Seaborne Apache Jena Mailinglist] |
− | |||
− | |||
|} | |} | ||
+ | |||
= Prerequisites = | = Prerequisites = |
Revision as of 05:46, 29 January 2024
Why would you want your own WikiData copy?
The resources behind https://query.wikidata.org/ are scarce and shared by many users, so you might hit the query limits (https://www.wikidata.org/wiki/Wikidata:SPARQL_query_service/query_limits) quite quickly.
See SPARQL for some examples that work online (mostly) without hitting these limits.
Success Reports
2022 - 2024
25 imports recorded
2017-2022
Date | Source | Target | Triples | Duration | RAM GB | CPU Cores | CPU Speed | Link |
---|---|---|---|---|---|---|---|---|
2022-07 | latest-all.ttl (2022-07-12) | stardog | 17.2 billion | 1 d 19 h | 253 | | | Tim Holzheim - BITPlan Wiki |
2022-06 | latest-all.nt (2022-06-25) | QLever | 17.2 billion | 1 d 2 h | 128 | 8 | 1.8 GHz | Wolfgang Fahl - BITPlan Wiki |
2022-05 | latest-all.ttl.bz2 (2022-05-29) | QLever | ~17 billion | 14 h | 128 | 12/24 | 4.8 GHz boost | Hannah Bast - QLever |
2022-02 | latest-all.nt (2022-02) | stardog | 16.7 billion | 9 h | | | | Evren Sirin - stardog |
2022-02 | latest-all.nt (2022-01-29) | QLever | 16.9 billion | 4 d 2 h | 127 | 8 | 1.8 GHz | Wolfgang Fahl - BITPlan Wiki |
2020-08 | latest-all.nt (2020-08-15) | Apache Jena | 13.8 billion | 9 d 21 h | 64 | | | Wolfgang Fahl - BITPlan Wiki |
2020-07 | latest-truthy.nt (2020-07-15) | Apache Jena | 5.2 billion | 4 d 14 h | 64 | | | Wolfgang Fahl - BITPlan Wiki |
2020-06 | latest-all.ttl (2020-04-28) | Apache Jena | 12.9 billion | 6 d 16 h | ? | | | Jonas Sourlier - Jena Issue 1909 |
2020-03 | latest-all.nt.bz2 (2020-03-01) | Virtuoso | ~11.8 billion | 10 hours + 1 day prep | 248 | | | Hugh Williams - Virtuoso |
2019-10 | | blazegraph | ~10 billion | 5.5 d | 104 | 16 | | Adam Shoreland Wikimedia Foundation |
2019-09 | latest-all.ttl (2019-09) | Virtuoso | 9.5 billion | 9.1 hours | ? | | | Adam Sanchez - WikiData mailing list |
2019-05 | wikidata-20190513-all-BETA.ttl | Virtuoso | ? | 43 hours | ? | | | - |
2019-05 | wikidata-20190513-all-BETA.ttl | Blazegraph | ? | 10.2 days | | | | Adam Sanchez - WikiData mailing list |
2019-02 | latest-all.ttl.gz | Apache Jena | ? | > 2 days | ? | | | corsin - muncca blog |
2018-01 | wikidata-20180101-all-BETA.ttl | Blazegraph | 3 billion | 4 days | 32 | 4 | 2.2 GHz | Wolfgang Fahl - BITPlan wiki |
2017-12 | latest-truthy.nt.gz | Apache Jena | ? | 8 hours | ? | | | Andy Seaborne - Apache Jena mailing list |
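To make the reports easier to compare, the triple counts and durations can be turned into an average load rate. The following is a minimal sketch, not part of the original page: the three rows are copied from the table above, the durations are converted to hours by hand, and the names `REPORTS` and `triples_per_second` are made up for this illustration. The "~17 billion" figure is approximate in the source, so the resulting rate is approximate as well.

```python
# Rough average import throughput for three rows of the success report table.
# Triple counts and durations are taken from the table; "~17 billion" is
# an approximate figure in the source.
REPORTS = [
    # (label, triples, duration in hours)
    ("QLever 2022-05", 17.0e9, 14),
    ("stardog 2022-02", 16.7e9, 9),
    ("Apache Jena 2020-08", 13.8e9, 9 * 24 + 21),  # 9 d 21 h
]


def triples_per_second(triples: float, hours: float) -> float:
    """Average load rate over the whole import run."""
    return triples / (hours * 3600)


if __name__ == "__main__":
    for label, triples, hours in REPORTS:
        rate = triples_per_second(triples, hours)
        print(f"{label}: {rate:,.0f} triples/s")
```

The rates differ by more than an order of magnitude between these rows, but the reports were produced on different hardware and from different dump formats, so the numbers are only a rough orientation and not a benchmark.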
Prerequisites
Getting a copy of WikiData is not for the faint of heart.
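You need patience and substantial hardware to get your own WikiData copy working: the 2017-2022 success reports above list import times from 8 hours to more than 10 days, on machines with 32 to 253 GB of RAM. The resources you need are a moving target since WikiData is growing all the time.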