Wikidata Import 2025-05-02

Import

state  
url  https://wiki.bitplan.com/index.php/Wikidata_Import_2025-05-02
target  blazegraph
start  2025-05-02
end  2025-05-03
days  0.6
os  Ubuntu 22.04.3 LTS
cpu  Intel(R) Xeon(R) Gold 6326 CPU @ 2.90GHz (16 cores)
ram  512 GB
triples  
comment  


This "import" does not use the usual dump-and-index approach. Instead, an existing Blazegraph journal file is copied directly.

Steps

Copy journal file

Source: the Wikidata installation at https://scatter.red/. Using aria2c with 16 connections, the copy took about 5 hours.
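
The download step can be sketched as follows. This is only an illustration under stated assumptions: the journal URL is not given on this page, so `JNL_URL` is a hypothetical placeholder, and the helper names `download_cmd` and `avg_mb_per_s` are made up for this sketch. The command is printed rather than executed so that it can be reviewed first. The second helper estimates the average transfer rate from the journal size shown in the `ls -l` listing further down and the roughly 5 hours mentioned above.

```shell
# Hypothetical placeholder: the real journal URL is not given on this page.
JNL_URL="${JNL_URL:-https://example.org/path/to/data.jnl}"

# Print an aria2c call for review instead of running it.
# -c resumes a partial download, -x and -s request 16 connections/segments.
download_cmd() {
  printf 'aria2c -c -x 16 -s 16 -o %s %s\n' "$2" "$1"
}

# Average rate in whole MB/s for a size in bytes and a duration in hours.
avg_mb_per_s() {
  echo $(( $1 / ($2 * 3600) / 1000000 ))
}

download_cmd "$JNL_URL" data.jnl
avg_mb_per_s 1328514809856 5
```

With the numbers from this page the estimate comes out at about 73 MB/s on average.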

git clone private-wikidata-query

git clone https://github.com/scatter-llc/private-wikidata-query
# create the data directory inside the clone (no-op if it already exists)
mkdir -p private-wikidata-query/data
mv data.jnl private-wikidata-query/data
cd private-wikidata-query/data
# use the uid and gid that the container expects
chown 666:66 data.jnl
jh@wikidata:/hd/delta/blazegraph/private-wikidata-query/data$ ls -l
total 346081076
-rw-rw-r-- 1 666 66 1328514809856 May  2 22:07 data.jnl
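
The ownership can also be verified with a small script instead of reading the `ls -l` output. This is a minimal sketch that assumes GNU `stat` (as shipped with Ubuntu). The helper names `owner_of` and `check_owner` are hypothetical.

```shell
# Print "uid:gid" of a file (GNU stat syntax).
owner_of() {
  stat -c '%u:%g' "$1"
}

# Succeed only if the file has the expected numeric owner and group.
check_owner() {
  [ "$(owner_of "$1")" = "$2" ]
}

# Example call for the journal file from this page:
# check_owner data.jnl 666:66 || echo "data.jnl has the wrong owner"
```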

start docker

docker compose up -d
WARN[0000] /hd/delta/blazegraph/private-wikidata-query/docker-compose.yml: the attribute `version` is obsolete, it will be ignored, please remove it to avoid potential confusion 
[+] Running 3/3
 ✔ Container private-wikidata-query-wdqs-1           Started               0.4s 
 ✔ Container private-wikidata-query-wdqs-proxy-1     Started               0.7s 
 ✔ Container private-wikidata-query-wdqs-frontend-1  Started               1.1s
docker ps | grep wdqs
36dad88ebfdc   wikibase/wdqs-frontend:wmde.11      "/entrypoint.sh ngin…"   About an hour ago   Up 3 minutes                    0.0.0.0:8099->80/tcp, [::]:8099->80/tcp                           private-wikidata-query-wdqs-frontend-1
f0d273cca376   caddy                               "caddy run --config …"   About an hour ago   Up 3 minutes                    80/tcp, 443/tcp, 2019/tcp, 443/udp                                private-wikidata-query-wdqs-proxy-1
d86124984e0f   wikibase/wdqs:0.3.97-wmde.8         "/entrypoint.sh /run…"   About an hour ago   Up 3 minutes                    0.0.0.0:9999->9999/tcp, [::]:9999->9999/tcp                       private-wikidata-query-wdqs-1
6011f5c1cc03   caddy                               "caddy run --config …"   12 months ago       Up 3 days                       80/tcp, 443/tcp, 2019/tcp, 443/udp                                wdqs-wdqs-proxy-1
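
"Up 3 minutes" only says that the containers are running, not that the query service answers. A hedged sketch for polling the service is shown below. The helper `wait_until` is hypothetical, and the endpoint path in the example call is an assumption (the usual WDQS path); only port 9999 is taken from the `docker ps` output above.

```shell
# Retry a probe command until it succeeds or the attempts run out.
# Usage: wait_until ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
wait_until() {
  attempts=$1; delay=$2; shift 2
  i=0
  while [ "$i" -lt "$attempts" ]; do
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Example call (assumed endpoint path, not confirmed by this page):
# wait_until 30 10 curl -fsS --get \
#   --data-urlencode 'query=ASK { ?s ?p ?o }' \
#   http://localhost:9999/bigdata/namespace/wdq/sparql
```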

Incompatible RWStore header version

docker logs private-wikidata-query-wdqs-1 2>&1 | grep -m 1 "Incompatible RWStore header version"
java.lang.RuntimeException: java.lang.IllegalStateException: Incompatible RWStore header version: storeVersion=0, cVersion=1024, demispace: true
# open a shell in the container and compare the edited RWStore.properties with its backup
docker exec -it private-wikidata-query-wdqs-1 /bin/bash
diff RWStore.properties RWStore.properties.bak-20250503
--- RWStore.properties
+++ RWStore.properties.bak-20250503
@@ -56,6 +56,3 @@
    {"valueType":"DOUBLE","multiplier":"1000000000","serviceMapping":"LATITUDE"},\
    {"valueType":"LONG","multiplier":"1","minValue":"0","serviceMapping":"COORD_SYSTEM"}\
   ]}}
-
-# Added to fix Incompatible RWStore header version error
-com.bigdata.rwstore.RWStore.readBlobsAsync=false
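
The edit shown in the diff can be scripted so that it is applied at most once and a backup is kept. This is a minimal sketch with a hypothetical helper `set_property_once`. Whether this property actually resolves the header version error is not confirmed on this page; the diff only shows that it was added.

```shell
# Append "key=value" to a properties file unless the key is already set.
# A dated backup copy is written before the file is changed.
# Note: dots in the key act as regex wildcards in grep, which is
# tolerable for a sketch like this one.
set_property_once() {
  file=$1; key=$2; value=$3
  if grep -q "^${key}=" "$file"; then
    return 0
  fi
  cp -p "$file" "$file.bak-$(date +%Y%m%d)"
  printf '\n# Added to fix Incompatible RWStore header version error\n%s=%s\n' \
    "$key" "$value" >> "$file"
}

# Example call, run in the directory that holds RWStore.properties:
# set_property_once RWStore.properties com.bigdata.rwstore.RWStore.readBlobsAsync false
```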