Wikidata Import 2025-06-06
Import
Import | |
---|---|
state | ✅ |
url | https://wiki.bitplan.com/index.php/Wikidata_Import_2025-06-06 |
target | blazegraph |
start | 2025-06-06 |
end | 2025-06-07 |
days | 1 |
os | Ubuntu 22.04.5 LTS |
cpu | AMD Ryzen 9 5900X 12-Core Processor |
ram | 128 GB |
triples | |
comment | seeded with 1.3 TB data.jnl file originally provided by James Hare |
This "import" does not load a dump and build the indexes; instead, an existing Blazegraph journal file is copied directly.
Steps
Copy journal file
md5sum data.jnl
6ebe0cced1a22c6cf3fecb56afcf1c10 data.jnl
blockdownload --name wikidata --blocksize 512 --boost 8 --progress https://wikidata-dump.wikidata.dbis.rwth-aachen.de/data.jnl .
Blocks ∅: 100%|████████████████████████████████████████████| 1.39T/1.39T [7:31:03<00:00, 51.5MB/s]
blockdownload --name wikidata --output data.jnl https://wikidata-dump.wikidata.dbis.rwth-aachen.de/data.jnl 2025-06-05 --progress
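After the copy it is worth confirming that the local data.jnl matches the checksum listed above. The following is a minimal Python sketch of such a check, assuming the MD5 value shown in the md5sum output is the reference digest of the complete journal. The function names `md5_of_file` and `verify_journal` are illustrative and not part of blockdownload.

```python
import hashlib

# Reference digest copied from the md5sum output above (assumed to describe
# the complete journal file).
EXPECTED_MD5 = "6ebe0cced1a22c6cf3fecb56afcf1c10"


def md5_of_file(path, chunk_size=64 * 1024 * 1024):
    """Return the hex MD5 digest of a file, reading it in fixed-size chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops at the first empty read (end of file),
        # so a file of more than a terabyte never has to fit into memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_journal(path, expected=EXPECTED_MD5):
    """Return True if the file at path has the expected MD5 digest."""
    return md5_of_file(path) == expected.lower()


# Example call (reads the whole file, so expect a long run time):
# print("checksum ok" if verify_journal("data.jnl") else "checksum MISMATCH")
```

Reading in chunks keeps memory use constant; the run time is dominated by disk throughput.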
Set up wdqs environment
git clone https://github.com/scatter-llc/private-wikidata-query wdqs
mkdir wdqs/data
mv data.jnl wdqs/data
docker compose up -d
[+] Running 28/28
✔ wdqs-frontend Pulled 5.4s
✔ wdqs Pulled 17.2s
✔ wdqs-proxy Pulled 11.7s
[+] Running 4/4
✔ Network wdqs_default Created 0.1s
✔ Container wdqs-wdqs-1 Started 3.6s
✔ Container wdqs-wdqs-proxy-1 Started 0.4s
✔ Container wdqs-wdqs-frontend-1 Started 0.6s
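The containers report "Started" before Blazegraph has necessarily opened the journal, so a query sent immediately may fail. Below is a hedged Python sketch that polls the SPARQL endpoint until it answers. The `ENDPOINT` URL is a hypothetical placeholder: the real host port and path depend on the docker-compose.yml of the cloned repository. The helper names are illustrative as well.

```python
import http.client
import time
import urllib.error
import urllib.parse
import urllib.request

# Hypothetical placeholder; look up the real port and path in docker-compose.yml.
ENDPOINT = "http://localhost:8080/sparql"


def endpoint_is_up(url=ENDPOINT, timeout=10):
    """Return True if a trivial ASK query is answered with HTTP 200."""
    query = urllib.parse.urlencode({"query": "ASK { ?s ?p ?o }"})
    request = urllib.request.Request(
        f"{url}?{query}",
        headers={"Accept": "application/sparql-results+json"},
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # Connection refused, timeout, or an HTTP error: not ready yet.
        return False


def wait_until_ready(probe, attempts=30, delay=10.0, sleep=time.sleep):
    """Call probe() until it returns True.

    Returns the number of the successful attempt, or None if every attempt failed.
    """
    for attempt in range(1, attempts + 1):
        if probe():
            return attempt
        if attempt < attempts:
            sleep(delay)
    return None


# Example call: wait_until_ready(endpoint_is_up)
```

Passing the probe and the sleep function as parameters keeps the retry loop independent of the network, which makes it easy to try out without a running server.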
Wikidata state
Returns the total triple count and the dateModified timestamp of the Wikidata root node.
query
# show the number of triples and the timestamp of the last modification
PREFIX schema: <http://schema.org/>
SELECT
  (?count AS ?tripleCount)
  ?dateModified
  (STR(?dateModified) AS ?timestamp)
WHERE {
  {
    SELECT (COUNT(*) AS ?count) {
      ?s ?p ?o
    }
  }
  OPTIONAL {
    <http://www.wikidata.org> schema:dateModified ?dateModified
  }
}
result
tripleCount | dateModified | timestamp |
---|---|---|
16842771273 | 2025-06-07 17:45:08 | 2025-06-07T17:45:08Z |
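The same check can be scripted. The sketch below is a minimal Python example that sends the query above via the SPARQL 1.1 protocol (form-encoded POST) and reads the first result row from the JSON response. The `ENDPOINT` URL is a hypothetical placeholder that must be replaced with the address exposed by the local wdqs setup, and the function names are illustrative. Counting all triples may take a long time, hence the generous timeout.

```python
import json
import urllib.parse
import urllib.request

# Hypothetical placeholder; replace with the address of the local endpoint.
ENDPOINT = "http://localhost:8080/sparql"

QUERY = """
PREFIX schema: <http://schema.org/>
SELECT
  (?count AS ?tripleCount)
  ?dateModified
  (STR(?dateModified) AS ?timestamp)
WHERE {
  { SELECT (COUNT(*) AS ?count) { ?s ?p ?o } }
  OPTIONAL { <http://www.wikidata.org> schema:dateModified ?dateModified }
}
"""


def parse_state(results):
    """Extract (triple_count, timestamp) from a SPARQL JSON result document."""
    bindings = results["results"]["bindings"]
    if not bindings:
        raise ValueError("query returned no rows")
    row = bindings[0]
    triple_count = int(row["tripleCount"]["value"])
    # dateModified sits in an OPTIONAL block, so ?timestamp may be unbound.
    timestamp = row["timestamp"]["value"] if "timestamp" in row else None
    return triple_count, timestamp


def fetch_state(endpoint=ENDPOINT, timeout=3600):
    """POST the query to the endpoint and return (triple_count, timestamp)."""
    body = urllib.parse.urlencode({"query": QUERY}).encode("utf-8")
    request = urllib.request.Request(
        endpoint,
        data=body,  # supplying data makes urllib send a form-encoded POST
        headers={"Accept": "application/sparql-results+json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return parse_state(json.load(response))


# Example call: triples, modified = fetch_state()
```

Keeping `parse_state` separate from the HTTP call means the result handling can be exercised with a stored response.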