Difference between revisions of "Get your own copy of WikiData"

From BITPlan Wiki
(Created page with "= Why would you want your own WikiData copy? = The resources behind https://query.wikidata.org/ are scarce and used by a lot of people. You might hit the https://www.wikidata....")
 
Line 5: Line 5:
  
 
= Success Reports =
 
= Success Reports =
 
+
== 2022 - 2024 ==
 +
{{#ask: [[Concept:Import]]|format=count}}
 +
{{#forminput:form=Import|button text=add Import}}
 +
{{#ask: [[Concept:Import]]
 +
|mainlabel=Import
 +
|?Import state = state
 +
|?Import url = url
 +
|?Import target = target
 +
|?Import start = start
 +
|?Import end = end
 +
|?Import days = days
 +
|?Import os = os
 +
|?Import cpu = cpu
 +
|?Import ram = ram
 +
|?Import triples = triples
 +
|limit=200
 +
|sort=Import start
 +
|order=desc
 +
}}
 +
== 2017-2022 ==
 
{| class="wikitable"
 
{| class="wikitable"
 
|-
 
|-
 
! # Date !! Source !! Target !!  Triples !! days !! RAM GB !! CPU Cores !! Speed !! Link  
 
! # Date !! Source !! Target !!  Triples !! days !! RAM GB !! CPU Cores !! Speed !! Link  
 
|-
 
|-
| 2017-12 || latest-truthy.nt.gz || Apache Jena || ? || 8 hours || ? || || || [https://lists.apache.org/thread.html/70dde8e3d99ce3d69de613b5013c3f4c583d96161dec494ece49a412%40%3Cusers.jena.apache.org%3E Andy Seaborne Apache Jena Mailinglist]
+
| 2022-07 || latest-all.ttl (2022-07-12) || [https://www.stardog.com stardog] || 17.2 billion || 1 d 19 h || 253  || || || {{Link|target=WikiData_Import_2022-07-12|title=Tim Holzheim - BITPlan Wiki}}
 +
|-
 +
| 2022-06 || latest-all.nt (2022-06-25) || [https://qlever.cs.uni-freiburg.de/wikidata QLever] || 17.2 billion || 1 d 2 h || 128 || 8 || 1.8 GHz || {{Link|target=WikiData_Import_2022-06-25|title=Wolfgang Fahl -  BITPlan Wiki}}
 
|-
 
|-
| 2018-01 || wikidata-20180101-all-BETA.ttl || Blazegraph || 3 billion || 4 days || 32 || 4 || 2.2 GHz || [http://wiki.bitplan.com/index.php?title=Get_your_own_copy_of_WikiData&action Wolfgang Fahl - BITPlan wiki]
+
| 2022-05 || latest-all.ttl.bz2 (2022-05-29) || [https://qlever.cs.uni-freiburg.de/wikidata QLever] || ~17 billion || 14h || 128 || 12/24 || 4.8 GHz boost || {{Link|target=https://github.com/ad-freiburg/qlever/wiki/Using-QLever-for-Wikidata|title=Hannah Bast - QLever}}
 
|-
 
|-
| 2019-02 || latest-all.ttl.gz || Apache Jena || ? || > 2 days || || || ||[https://muncca.com/2019/02/14/wikidata-import-in-apache-jena corsin - muncca blog]
+
| 2022-02 || latest-all.nt (2022-02) || [https://www.stardog.com stardog] || 16.7 billion || 9h || || || || {{Link|target=https://www.stardog.com/labs/blog/wikidata-in-stardog|title=Evren Sirin - stardog}}
 
|-
 
|-
| 2019-05 || wikidata-20190513-all-BETA.ttl || Blazegraph || ? || 10.2 days || || || ||[https://lists.wikimedia.org/pipermail/wikidata/2019-June/013201.html Adam Sanchez WikiData mailing list]
+
| 2022-02 || latest-all.nt (2022-01-29) || [https://qlever.cs.uni-freiburg.de/wikidata QLever] || 16.9 billion || 4 d 2 h || 127 || 8 || 1.8 GHz || {{Link|target=WikiData_Import_2022-01-29|title=Wolfgang Fahl - BITPlan Wiki}}
 
|-
 
|-
| 2019-05 || wikidata-20190513-all-BETA.ttl || Virtuoso || ? || 43 hours || ? || || ||[https://lists.wikimedia.org/pipermail/wikidata/2019-June/013201.html Adam |-
+
| 2020-08 || latest-all.nt (2020-08-15) || Apache Jena || 13.8 billion || 9 d 21 h || 64 || || || {{Link|target=WikiData_Import_2020-08-15|title=Wolfgang Fahl BITPlan Wiki}}
 
|-
 
|-
| 2019-09 || latest-all.ttl (2019-09)|| Virtuoso || 9.5 billion || 9.1 hours || ? || || || [https://lists.wikimedia.org/pipermail/wikidata/2019-September/013420.html Adam Sanchez - WikiData mailing list]
+
| 2020-07 || latest-truthy.nt (2020-07-15) || Apache Jena || 5.2 billion || 4 d 14 h || 64 || || || {{Link|target=WikiData_Import_2020-07-15|title=Wolfgang Fahl BITPlan Wiki}}
 
|-
 
|-
| 2019-10 || || blazegraph || ~10 billion || 5.5 d|| 104 || 16 || || [https://addshore.com/2019/10/your-own-wikidata-query-service-with-no-limits/ Adam Shoreland Wikimedia Foundation]
+
| 2020-06 || latest-all.ttl (2020-04-28) || Apache Jena || 12.9 billion || 6 d 16 h || ? || || || [https://issues.apache.org/jira/browse/JENA-1909 Jonas Sourlier - Jena Issue 1909]
 
|-
 
|-
 
| 2020-03 ||  latest-all.nt.bz2 (2020-03-01 || Virtuoso || ~11.8 billion || 10 hours + 1day prep || 248 || ||  || [https://community.openlinksw.com/t/loading-wikidata-into-virtuoso-open-source-or-enterprise-edition/2717 Hugh Williams - Virtuoso]
 
| 2020-03 ||  latest-all.nt.bz2 (2020-03-01 || Virtuoso || ~11.8 billion || 10 hours + 1day prep || 248 || ||  || [https://community.openlinksw.com/t/loading-wikidata-into-virtuoso-open-source-or-enterprise-edition/2717 Hugh Williams - Virtuoso]
 
|-
 
|-
| 2020-06 || latest-all.ttl (2020-04-28) || Apache Jena || 12.9 billion || 6 d 16 h || ? || || || [https://issues.apache.org/jira/browse/JENA-1909 Jonas Sourlier - Jena Issue 1909]
+
| 2019-10 || || blazegraph || ~10 billion || 5.5 d|| 104 || 16 || || [https://addshore.com/2019/10/your-own-wikidata-query-service-with-no-limits/ Adam Shoreland Wikimedia Foundation]
 
|-
 
|-
| 2020-07 || latest-truthy.nt (2020-07-15) || Apache Jena || 5.2 billion || 4 d 14 h || 64 || || || {{Link|target=WikiData_Import_2020-07-15|title=Wolfgang Fahl BITPlan Wiki}}
+
| 2019-09 || latest-all.ttl (2019-09)|| Virtuoso || 9.5 billion || 9.1 hours || ? || || || [https://lists.wikimedia.org/pipermail/wikidata/2019-September/013420.html Adam Sanchez - WikiData mailing list]
 
|-
 
|-
| 2020-08 || latest-all.nt (2020-08-15) || Apache Jena || 13.8 billion || 9 d 21 h || 64 || || || {{Link|target=WikiData_Import_2020-08-15|title=Wolfgang Fahl BITPlan Wiki}}
+
| 2019-05 || wikidata-20190513-all-BETA.ttl || Virtuoso || ? || 43 hours || ? || || ||[https://lists.wikimedia.org/pipermail/wikidata/2019-June/013201.html Adam |-
 
|-
 
|-
| 2022-02 || latest-all.nt (2022-01-29) || [https://qlever.cs.uni-freiburg.de/wikidata QLever] || 16.9 billion || 4 d 2 h || 127 || 8 || 1.8 GHz || {{Link|target=WikiData_Import_2022-01-29|title=Wolfgang Fahl - BITPlan Wiki}}
+
| 2019-05 || wikidata-20190513-all-BETA.ttl || Blazegraph || ? || 10.2 days || || || ||[https://lists.wikimedia.org/pipermail/wikidata/2019-June/013201.html Adam Sanchez WikiData mailing list]
 
|-
 
|-
| 2022-02 || latest-all.nt (2022-02) || [https://www.stardog.com stardog] || 16.7 billion || 9h || || || || {{Link|target=https://www.stardog.com/labs/blog/wikidata-in-stardog|title=Evren Sirin - stardog}}
+
| 2019-02 || latest-all.ttl.gz || Apache Jena || ? || > 2 days || || || ||[https://muncca.com/2019/02/14/wikidata-import-in-apache-jena corsin - muncca blog]
 
|-
 
|-
| 2022-05 || latest-all.ttl.bz2 (2022-05-29) || [https://qlever.cs.uni-freiburg.de/wikidata QLever] || ~17 billion || 14h || 128 || 12/24 || 4.8 GHz boost || {{Link|target=https://github.com/ad-freiburg/qlever/wiki/Using-QLever-for-Wikidata|title=Hannah Bast - QLever}}
+
| 2018-01 || wikidata-20180101-all-BETA.ttl || Blazegraph || 3 billion || 4 days || 32 || 4 || 2.2 GHz || [http://wiki.bitplan.com/index.php?title=Get_your_own_copy_of_WikiData&action Wolfgang Fahl - BITPlan wiki]
 
|-
 
|-
| 2022-06 || latest-all.nt (2022-06-25) || [https://qlever.cs.uni-freiburg.de/wikidata QLever] || 17.2 billion || 1 d 2 h || 128 || 8 || 1.8 GHz || {{Link|target=WikiData_Import_2022-06-25|title=Wolfgang Fahl -  BITPlan Wiki}}
+
| 2017-12 || latest-truthy.nt.gz || Apache Jena || ? || 8 hours || ? || || || [https://lists.apache.org/thread.html/70dde8e3d99ce3d69de613b5013c3f4c583d96161dec494ece49a412%40%3Cusers.jena.apache.org%3E Andy Seaborne Apache Jena Mailinglist]
|-
 
| 2022-07 || latest-all.ttl (2022-07-12) || [https://www.stardog.com stardog] || 17.2 billion || 1 d 19 h || 253  || || || {{Link|target=WikiData_Import_2022-07-12|title=Tim Holzheim - BITPlan Wiki}}
 
 
|}
 
|}
 +
  
 
= Prerequisites =
 
= Prerequisites =

Revision as of 06:46, 29 January 2024

Why would you want your own WikiData copy?

The resources behind https://query.wikidata.org/ are scarce and shared by many users. You might hit the query limits (https://www.wikidata.org/wiki/Wikidata:SPARQL_query_service/query_limits) quite quickly.

See SPARQL for some examples that work online (mostly) without hitting these limits.
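As a minimal sketch of what changes once you run your own copy, the Python snippet below sends the same SPARQL query to an endpoint of your choice. Only the public endpoint URL is real; the function names, the User-Agent string, and the timeout value are illustrative assumptions, and the address of a local copy depends entirely on the triple store you install.

```python
import json
import urllib.parse
import urllib.request

# Public Wikidata Query Service endpoint. For your own copy, pass the
# address your triple store listens on instead (installation-specific).
PUBLIC_ENDPOINT = "https://query.wikidata.org/sparql"


def build_request(endpoint: str, query: str) -> urllib.request.Request:
    """Build a GET request that asks the endpoint for JSON results."""
    params = urllib.parse.urlencode({"query": query})
    return urllib.request.Request(
        f"{endpoint}?{params}",
        headers={
            "Accept": "application/sparql-results+json",
            # Hypothetical agent string; identify your own client here.
            "User-Agent": "wikidata-copy-example/0.1 (illustrative)",
        },
    )


def run_query(endpoint: str, query: str, timeout: float = 60.0) -> dict:
    """Send the query and return the decoded JSON result document."""
    request = build_request(endpoint, query)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


if __name__ == "__main__":
    # Full IRIs are used so the query does not rely on prefixes that
    # only some endpoints predefine.
    query = (
        "SELECT ?item WHERE { ?item "
        "<http://www.wikidata.org/prop/direct/P31> "
        "<http://www.wikidata.org/entity/Q146> } LIMIT 5"
    )
    result = run_query(PUBLIC_ENDPOINT, query)
    for row in result["results"]["bindings"]:
        print(row["item"]["value"])
```

Pointing `run_query` at your own endpoint instead of `PUBLIC_ENDPOINT` is the only change needed on the client side; the limits that then apply are those of your own hardware and configuration.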

Success Reports

2022 - 2024

Imports recorded: 19

{| class="wikitable"
|-
! Import !! state !! url !! target !! start !! end !! days !! os !! cpu !! ram !! triples
|-
| Wikidata Import 2024-04-13 || || https://wiki.bitplan.com/index.php/Wikidata Import 2024-04-13 || QLever || 13 April 2024 || || || Ubuntu 22.04.3 LTS || || 512 ||
|-
| Wikidata Import 2024-02-18 || || https://wiki.bitplan.com/index.php/Wikidata Import 2024-02-18 || QLever || 18 February 2024 || 18 February 2024 || 0.5 || Ubuntu 22.04.3 LTS || AMD Ryzen 9 5900X 12-Core Processor @ 4.95GHz || 128 || 15.5
|-
| Wikidata Import 2024-01-20 || || https://wiki.bitplan.com/index.php/Wikidata Import 2024-01-20 || QLever || 20 January 2024 || 21 January 2024 || 0.5 || Ubuntu 22.04.2 LTS || Intel(R) Xeon(R) Gold 6326 CPU @ 2.90GHz || 256 || 19.1
|-
| Wikidata Import 2023-05-15 || || https://wiki.bitplan.com/index.php/Wikidata Import 2023-05-15 || QLever || 15 May 2023 || || || Ubuntu 22.04.2 LTS || Intel(R) Xeon(R) Gold 6326 CPU @ 2.90GHz || 256 ||
|-
| Wikidata Import 2023-05-14 || || https://wiki.bitplan.com/index.php/Wikidata Import 2023-05-14 || blazegraph || 14 May 2023 || || || Ubuntu 22.04.2 LTS || Intel(R) Core(TM) i5-3570K CPU @ 3.40GHz || 32 ||
|-
| Wikidata Import 2023-05-10 || || https://wiki.bitplan.com/index.php/Wikidata Import 2023-05-10 || blazegraph || 10 May 2023 || || || Ubuntu 22.04.2 LTS || Intel(R) Xeon(R) CPU X5690@3.47GHz || 64 || 14.7
|-
| Wikidata Import 2023-05-05 || || https://wiki.bitplan.com/index.php/Wikidata Import 2023-05-05 || blazegraph || 5 May 2023 || || || Ubuntu 22.04.2 LTS || Intel(R) Xeon(R) Gold 6326 CPU @ 2.90GHz || 256 || 14.7
|-
| Wikidata Import 2023-05-03 || || https://wiki.bitplan.com/index.php/Wikidata Import 2023-05-03 || blazegraph || 3 May 2023 || 26 June 2023 || 23 || Ubuntu 20.04.6 LTS || Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz || 128 || 14.7
|-
| Wikidata Import 2023-04-26 || || https://wiki.bitplan.com/index.php/Wikidata Import 2023-04-26 || blazegraph || 26 April 2023 || || || Ubuntu 22.04.2 LTS || Intel(R) Xeon(R) Gold 6326 CPU @ 2.90GHz || 256 || 14.7
|-
| Wikidata Import 2023-04-18 || || https://wiki.bitplan.com/index.php/Wikidata Import 2023-04-18 || blazegraph || 18 April 2023 || 18 April 2023 || || || Intel(R) Xeon(R) Gold 6326 CPU @ 2.90GHz || 256 || 14.6
|-
| Wikidata Import 2023-01-24 || || https://wiki.bitplan.com/index.php/Wikidata Import 2023-01-24 || QLever || 24 January 2023 || || || || || ||
|-
| WikiData Import 2022-07-20 || || https://wiki.bitplan.com/index.php/WikiData Import 2022-07-20 || virtuoso || 20 July 2022 || || || || || ||
|-
| WikiData Import 2022-07-12 || || https://wiki.bitplan.com/index.php/Wikidata On Stardog || Stardog || 11 July 2022 || 14 July 2022 || 3 || || || 256 ||
|-
| Wikidata On Stardog || || https://wiki.bitplan.com/index.php/Wikidata On Stardog || Stardog || 11 July 2022 || 14 July 2022 || 3 || || || 256 ||
|-
| WikiData Import 2022-06-25 || || https://wiki.bitplan.com/index.php/WikiData Import 2022-06-25 || QLever || 25 June 2022 || 27 June 2022 || 1.1 || || Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz || 128 ||
|-
| Wikidata on Allegrograph || || https://wiki.bitplan.com/index.php/Wikidata on Allegrograph || Allegrograph || 15 April 2022 || || 0.3 || CentOS Linux release 7.9.2009 || AMD EPYC 7302, 3.0GHz || 256 || 16.8
|-
| WikiData Import 2022-03-11 || || https://wiki.bitplan.com/index.php/WikiData Import 2022-03-11 || QLever || 12 March 2022 || 12 March 2022 || || || || ||
|-
| WikiData Import 2022-01-29 || || https://wiki.bitplan.com/index.php/WikiData Import 2022-01-29 || QLever || 2 February 2022 || 6 February 2022 || 4 || Ubuntu 20.04.3 LTS || Quad-Core AMD Opteron(tm) Processor 2374 HE || 64 || 16.9
|-
| Wikidata Import 2018-01-05 || || https://wiki.bitplan.com/index.php/Wikidata Import 2018-01-05 || blazegraph || 5 January 2018 || || || || Quad-Core AMD Opteron(tm) Processor 2374 HE || ||
|}

2017-2022

{| class="wikitable"
|-
! Date !! Source !! Target !! Triples !! Duration !! RAM GB !! CPU Cores !! Speed !! Link
|-
| 2022-07 || latest-all.ttl (2022-07-12) || stardog || 17.2 billion || 1 d 19 h || 253 || || || Tim Holzheim - BITPlan Wiki
|-
| 2022-06 || latest-all.nt (2022-06-25) || QLever || 17.2 billion || 1 d 2 h || 128 || 8 || 1.8 GHz || Wolfgang Fahl - BITPlan Wiki
|-
| 2022-05 || latest-all.ttl.bz2 (2022-05-29) || QLever || ~17 billion || 14h || 128 || 12/24 || 4.8 GHz boost || Hannah Bast - QLever
|-
| 2022-02 || latest-all.nt (2022-02) || stardog || 16.7 billion || 9h || || || || Evren Sirin - stardog
|-
| 2022-02 || latest-all.nt (2022-01-29) || QLever || 16.9 billion || 4 d 2 h || 127 || 8 || 1.8 GHz || Wolfgang Fahl - BITPlan Wiki
|-
| 2020-08 || latest-all.nt (2020-08-15) || Apache Jena || 13.8 billion || 9 d 21 h || 64 || || || Wolfgang Fahl BITPlan Wiki
|-
| 2020-07 || latest-truthy.nt (2020-07-15) || Apache Jena || 5.2 billion || 4 d 14 h || 64 || || || Wolfgang Fahl BITPlan Wiki
|-
| 2020-06 || latest-all.ttl (2020-04-28) || Apache Jena || 12.9 billion || 6 d 16 h || ? || || || Jonas Sourlier - Jena Issue 1909
|-
| 2020-03 || latest-all.nt.bz2 (2020-03-01) || Virtuoso || ~11.8 billion || 10 hours + 1 day prep || 248 || || || Hugh Williams - Virtuoso
|-
| 2019-10 || || blazegraph || ~10 billion || 5.5 d || 104 || 16 || || Adam Shoreland Wikimedia Foundation
|-
| 2019-09 || latest-all.ttl (2019-09) || Virtuoso || 9.5 billion || 9.1 hours || ? || || || Adam Sanchez - WikiData mailing list
|-
| 2019-05 || wikidata-20190513-all-BETA.ttl || Virtuoso || ? || 43 hours || ? || || || Adam Sanchez WikiData mailing list
|-
| 2019-05 || wikidata-20190513-all-BETA.ttl || Blazegraph || ? || 10.2 days || || || || Adam Sanchez WikiData mailing list
|-
| 2019-02 || latest-all.ttl.gz || Apache Jena || ? || > 2 days || ? || || || corsin - muncca blog
|-
| 2018-01 || wikidata-20180101-all-BETA.ttl || Blazegraph || 3 billion || 4 days || 32 || 4 || 2.2 GHz || Wolfgang Fahl - BITPlan wiki
|-
| 2017-12 || latest-truthy.nt.gz || Apache Jena || ? || 8 hours || ? || || || Andy Seaborne Apache Jena Mailinglist
|}
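The durations in the table above are reported in mixed units ("14h", "1 d 2 h", "4 days"), which makes the rows hard to compare. The following Python sketch converts such a duration to hours and derives a rough import rate in triples per second. The helper names are hypothetical, the parser only understands day and hour figures as they appear in this table, and "~17 billion" is approximated as 17.0e9.

```python
import re

# Hours per unit, keyed by the first letter of the unit word
# ("d", "day", "days" -> "d"; "h", "hour", "hours" -> "h").
_UNIT_HOURS = {"d": 24.0, "h": 1.0}
_DURATION_PART = re.compile(r"(\d+(?:\.\d+)?)\s*(d|h)", re.IGNORECASE)


def parse_duration_hours(text: str) -> float:
    """Sum all day/hour figures found in a duration such as '1 d 19 h'."""
    parts = _DURATION_PART.findall(text)
    if not parts:
        raise ValueError(f"no day/hour figure found in {text!r}")
    return sum(float(value) * _UNIT_HOURS[unit.lower()] for value, unit in parts)


def triples_per_second(triples: float, duration: str) -> float:
    """Average import rate over the whole reported duration."""
    return triples / (parse_duration_hours(duration) * 3600.0)


if __name__ == "__main__":
    # (label, triples, duration) taken from three rows of the table.
    reports = [
        ("QLever 2022-05", 17.0e9, "14h"),
        ("QLever 2022-06", 17.2e9, "1 d 2 h"),
        ("Apache Jena 2020-08", 13.8e9, "9 d 21 h"),
    ]
    for label, triples, duration in reports:
        rate = triples_per_second(triples, duration)
        print(f"{label}: {rate:,.0f} triples/s")
```

These rates are averages over the full run and ignore differences in hardware and dump format, so they indicate only the order of magnitude between reports.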


Prerequisites

Getting a copy of WikiData is not for the faint of heart.

You need quite a bit of patience and some hardware resources to get your own WikiData copy working. The resources you need are a moving target since WikiData is growing all the time.