Wikidata Import 2025-06-06

{| class="wikitable"
|+ Import
|-
| state || ✅
|-
| url || https://wiki.bitplan.com/index.php/Wikidata_Import_2025-06-06
|-
| target || blazegraph
|-
| start || 2025-06-06
|-
| end ||
|-
| days ||
|-
| os || Ubuntu 22.04.3 LTS
|-
| cpu ||
|-
| ram || 128 GB
|-
| triples ||
|-
| comment || seeded with 1.3 TB data.jnl file originally provided by James Hare
|}


This "import" does not use the usual dump-and-indexing approach; instead, an existing Blazegraph journal file (data.jnl) is copied directly into place.

== Steps ==

== Copy journal file ==

<source lang='bash'>
# verify the checksum of the journal file
md5sum data.jnl
6ebe0cced1a22c6cf3fecb56afcf1c10  data.jnl
# block-wise download of the 1.39 TB journal file (blocksize 512, boost 8) with progress bar
blockdownload --name wikidata --blocksize 512 --boost 8 --progress https://wikidata-dump.wikidata.dbis.rwth-aachen.de/data.jnl .
Blocks ∅: 100%|████████████████████████████████████████████| 1.39T/1.39T [7:31:03<00:00, 51.5MB/s]
# alternative invocation writing the download directly to data.jnl
blockdownload --name wikidata --output data.jnl https://wikidata-dump.wikidata.dbis.rwth-aachen.de/data.jnl 2025-06-05 --progress
</source>
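
blockdownload transfers the journal in fixed-size blocks rather than as a single stream (see --blocksize and --boost above). The sketch below only illustrates that block-wise idea with plain curl HTTP range requests; it is not the blockdownload implementation, it assumes the server supports Range requests, and the 512 MiB block size is an assumed interpretation of --blocksize 512.

<source lang='bash'>
#!/bin/bash
# sketch only: block-wise download via HTTP range requests
# (the blockdownload call above adds flags like --boost, presumably for parallel block fetches)
url="https://wikidata-dump.wikidata.dbis.rwth-aachen.de/data.jnl"
out="data.jnl"
blocksize=$((512*1024*1024))   # 512 MiB per block - assumed unit

# total file size as reported by the server
total=$(curl -sIL "$url" | awk 'tolower($1)=="content-length:" {size=$2} END {print size}' | tr -d '\r')

offset=0
> "$out"
while [ "$offset" -lt "$total" ]; do
  end=$((offset + blocksize - 1))
  [ "$end" -ge "$total" ] && end=$((total - 1))
  # fetch one block and append it to the output file
  curl -s -r "$offset-$end" "$url" >> "$out"
  offset=$((end + 1))
  echo "$offset / $total bytes"
done

# compare against the published checksum
md5sum "$out"
</source>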

== setup wdqs environment ==

# see also [https://github.com/scatter-llc/private-wikidata-query/blob/main/README.md README.md] at https://github.com/scatter-llc/private-wikidata-query/blob/main/README.md
<source lang='bash' highlight='1-4'>
git clone https://github.com/scatter-llc/private-wikidata-query wdqs
mkdir wdqs/data
mv data.jnl wdqs/data
docker compose up -d
[+] Running 28/28
 ✔ wdqs-frontend Pulled                                                                      5.4s
 ✔ wdqs Pulled                                                                              17.2s
 ✔ wdqs-proxy Pulled                                                                        11.7s
[+] Running 4/4
 ✔ Network wdqs_default            Created                                                   0.1s
 ✔ Container wdqs-wdqs-1           Started                                                   3.6s
 ✔ Container wdqs-wdqs-proxy-1     Started                                                   0.4s
 ✔ Container wdqs-wdqs-frontend-1  Started                                                   0.6s
</source>
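
Once the three containers are up, the stack can be checked with standard docker compose commands; a quick SPARQL smoke test is sketched below. The endpoint URL is an assumption (Blazegraph's default /bigdata/namespace/wdq/sparql on port 9999); the actual host/port mapping is defined in the repository's docker-compose.yml and may differ.

<source lang='bash'>
# list the containers and their state
docker compose ps
# follow the wdqs logs while Blazegraph opens the 1.3 TB journal
docker compose logs -f wdqs

# smoke test: fetch a single triple from the local endpoint
# endpoint URL is an assumption - adapt it to the port mapping in docker-compose.yml
curl -G 'http://localhost:9999/bigdata/namespace/wdq/sparql' \
  --data-urlencode 'query=SELECT * WHERE { ?s ?p ?o } LIMIT 1' \
  -H 'Accept: application/sparql-results+json'
</source>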