Difference between revisions of "WikiData Import 2022-03-16"
Revision as of 10:01, 16 March 2022
QLever trial
See the QLever quickstart at https://github.com/ad-freiburg/qlever/blob/master/docs/quickstart.md
See QLever/script, discussed in QLever Issue #562, for the script that makes this attempt easier to reproduce.
Environment/prerequisites
At least 64 GB RAM, a Docker environment (e.g. on Ubuntu), and more than 1 TB of disk space (SSD preferred for speed).
./qlever -v -e
qlever version : 1.27 $ : 2022/03/16 08:54:18 $
needed software
docker → /usr/bin/docker ✅
top → /usr/bin/top ✅
df → /usr/bin/df ✅
jq → /usr/bin/jq ✅
lsb_release → /usr/bin/lsb_release ✅
free → /usr/bin/free ✅
operating system
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 20.04.4 LTS
Release: 20.04
Codename: focal
docker version
Docker version 20.10.13, build a224086
memory
total used free shared buff/cache available
Mem: 125Gi 1,1Gi 121Gi 31Mi 2,9Gi 123Gi
Swap: 2,0Gi 0B 2,0Gi
diskspace
/dev/sdb5 116G 23G 88G 21% /
tmpfs 63G 0 63G 0% /dev/shm
/dev/sda1 3,6T 987G 2,5T 29% /hd/seel
/dev/sdb1 511M 4,0K 511M 1% /boot/efi
soft ulimit for files
1048576
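The "needed software" part of the report above can be approximated without the qlever script. The following is a minimal sketch, assuming a POSIX shell; `missing_tools` is a hypothetical helper name and not part of the script from QLever/script.

```shell
# Hypothetical helper, not part of the qlever script:
# print every command from the argument list that is not found on PATH.
missing_tools() {
  for mt_tool in "$@"; do
    # command -v is the POSIX way to ask whether a command can be resolved
    command -v "$mt_tool" >/dev/null 2>&1 || printf '%s\n' "$mt_tool"
  done
}

# Example call with the tools listed in the report above:
#   missing_tools docker top df jq lsb_release free
# Empty output means all of them were found.
```

The soft limit for open files shown in the report can be queried in bash with `ulimit -Sn`.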
Wikidata dump download
./qlever --wikidata_download
qlever-indices/wikidata already exists
wikidata.settings.json already copied to qlever-indices/wikidata
downloading wikidata lexemes:latest-lexemes.ttl.bz2 ... please wait typically 3min ...
wikidata lexemes download started at Mi 16. Mär 09:55:07 CET 2022
--2022-03-16 09:55:07-- https://dumps.wikimedia.org/wikidatawiki/entities//latest-lexemes.ttl.bz2
Resolving dumps.wikimedia.org (dumps.wikimedia.org)... 2620:0:861:1:208:80:154:7, 208.80.154.7
Connecting to dumps.wikimedia.org (dumps.wikimedia.org)|2620:0:861:1:208:80:154:7|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 319665811 (305M) [application/octet-stream]
Saving to: ‘latest-lexemes.ttl.bz2’
latest-lexemes.ttl.bz2 100%[========================================================================================>] 304,86M 4,41MB/s in 70s
2022-03-16 09:56:17 (4,37 MB/s) - ‘latest-lexemes.ttl.bz2’ saved [319665811/319665811]
wikidata lexemes download finished at Mi 16. Mär 09:56:17 CET 2022 after 70 seconds
downloading wikidata dump:latest-all.ttl.bz2 ... please wait typically 6hours ...
wikidata dump download started at Mi 16. Mär 09:56:17 CET 2022
--2022-03-16 09:56:17-- https://dumps.wikimedia.org/wikidatawiki/entities//latest-all.ttl.bz2
Resolving dumps.wikimedia.org (dumps.wikimedia.org)... 2620:0:861:1:208:80:154:7, 208.80.154.7
Connecting to dumps.wikimedia.org (dumps.wikimedia.org)|2620:0:861:1:208:80:154:7|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 93072933618 (87G) [application/octet-stream]
Saving to: ‘latest-all.ttl.bz2’
latest-all.ttl.bz2 1%[> ] 1,02G 4,08MB/s eta 6h 0m
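The "eta 6h 0m" in the last line follows from the reported length and transfer rate, and the "saved [319665811/319665811]" line suggests a simple size check once a download has finished. A minimal sketch of both, assuming a POSIX shell with 64-bit arithmetic; `eta_seconds` and `size_matches` are hypothetical helper names, and wget's "MB/s" is taken to mean 1024*1024 bytes per second, which matches 319665811 bytes being shown as 304,86M.

```shell
# Hypothetical helpers, not part of the qlever script.

# Estimate the download time in whole seconds.
#   $1 = size in bytes, $2 = transfer rate in KiB/s
eta_seconds() {
  echo $(( $1 / ($2 * 1024) ))
}

# Succeed if the file has exactly the expected number of bytes.
#   $1 = file name, $2 = expected size in bytes (the "Length:" value from wget)
size_matches() {
  [ "$(wc -c < "$1" | tr -d ' ')" -eq "$2" ]
}

# Example calls with the numbers from the log above:
#   eta_seconds 93072933618 4178     # 4,08 MB/s is about 4178 KiB/s
#   size_matches latest-lexemes.ttl.bz2 319665811 && echo "size ok"
```

With the values from the log, the estimate comes out at roughly six hours, in line with the ETA that wget reports.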