Wikidata Import 2024-02-18

From BITPlan Wiki
Latest revision as of 10:32, 13 April 2024

Import

state  ✅
url  https://wiki.bitplan.com/index.php/Wikidata_Import_2024-02-18
target  QLever
start  2024-02-18
end  2024-02-18
days  0.5
os  Ubuntu 22.04.3 LTS
cpu  AMD Ryzen 9 5900X 12-Core Processor @ 4.95GHz
ram  128
triples  15.5
storemode  property
comment  

Docker

$ docker pull adfreiburg/qlever
Using default tag: latest
latest: Pulling from adfreiburg/qlever
01007420e9b0: Pull complete 
460c63749ea2: Pull complete 
91b2277608b5: Pull complete 
c1a82dc7696f: Pull complete 
4593d1466d3e: Pull complete 
84b5c44e1220: Pull complete 
46cc3c2a5eaf: Pull complete 
Digest: sha256:80bc5f65dc9fe7cf5cd4c7ce326cdf97773a218d53534f3262c858a03b0e6d40
Status: Downloaded newer image for adfreiburg/qlever:latest
docker.io/adfreiburg/qlever:latest
$ docker pull adfreiburg/qlever-ui
Using default tag: latest
latest: Pulling from adfreiburg/qlever-ui
59bf1c3509f3: Pull complete 
07a400e93df3: Pull complete 
64052ee245ef: Pull complete 
a44d093ad4a5: Pull complete 
0381087ee065: Pull complete 
91c88323734b: Pull complete 
fdcee6d0309d: Pull complete 
e6b2715c1d5d: Pull complete 
b9c9f00cb678: Pull complete 
3f12ea50b177: Pull complete 
Digest: sha256:7f4b358d6a127e512979074de0c6e84f250a37bca46c494d8e04a62844716e48
Status: Downloaded newer image for adfreiburg/qlever-ui:latest
docker.io/adfreiburg/qlever-ui:latest
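
Both pulls above use the floating `latest` tag, so a later pull may fetch a different image. One way to get back to exactly these images is to reference them by the digests printed in the output. The following is a minimal sketch: the helper name `image_ref` is hypothetical, and the pull commands are only printed, not executed.

```shell
#!/usr/bin/env bash
# Digests copied from the "Digest:" lines of the pull output above.
QLEVER_DIGEST="sha256:80bc5f65dc9fe7cf5cd4c7ce326cdf97773a218d53534f3262c858a03b0e6d40"
QLEVER_UI_DIGEST="sha256:7f4b358d6a127e512979074de0c6e84f250a37bca46c494d8e04a62844716e48"

# image_ref (hypothetical helper): build a digest-pinned image
# reference of the form repository@digest.
image_ref() {
  printf '%s@%s\n' "$1" "$2"
}

# Dry run: print the pull commands instead of running them.
echo "docker pull $(image_ref adfreiburg/qlever "$QLEVER_DIGEST")"
echo "docker pull $(image_ref adfreiburg/qlever-ui "$QLEVER_UI_DIGEST")"
```

Pinning by digest trades convenience for reproducibility: the reference keeps pointing at the same image even after `latest` moves on.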

QLever control

https://github.com/ad-freiburg/qlever-control

$ git clone https://github.com/ad-freiburg/qlever-control.git
Cloning into 'qlever-control'...
remote: Enumerating objects: 1118, done.
remote: Counting objects: 100% (876/876), done.
remote: Compressing objects: 100% (432/432), done.
remote: Total 1118 (delta 399), reused 788 (delta 375), pack-reused 242
Receiving objects: 100% (1118/1118), 247.70 KiB | 949.00 KiB/s, done.
Resolving deltas: 100% (513/513), done.
$ cd qlever-control/
$ git checkout python-qlever
Branch 'python-qlever' set up to track remote branch 'python-qlever' from 'origin'.
Switched to a new branch 'python-qlever'
$ pip install .
Defaulting to user installation because normal site-packages is not writeable
Processing /home/wf/source/python/qlever-control
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... done
Building wheels for collected packages: UNKNOWN
  Building wheel for UNKNOWN (pyproject.toml) ... done
  Created wheel for UNKNOWN: filename=UNKNOWN-0.0.0-py3-none-any.whl size=5111 sha256=83e57ed4efe8c8115d3d266f05a0cb97388cfc42f0f17bba900da02ba9c31bef
  Stored in directory: /home/wf/.cache/pip/wheels/07/95/58/79d49197785a6e837569fd3f894d646428d2e272f53582c762
Successfully built UNKNOWN
Installing collected packages: UNKNOWN
Successfully installed UNKNOWN-0.0.0
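
The install log reports "Defaulting to user installation", which means scripts are placed in the per-user location rather than system-wide. If a command installed this way is not found afterwards, a likely cause is that this directory is missing from `PATH`. The sketch below checks for that; it assumes the usual Linux default `~/.local/bin`, and the helper name `path_contains` is hypothetical.

```shell
#!/usr/bin/env bash
# path_contains (hypothetical helper): succeed if directory $1 is an
# entry of the colon-separated list $2.
path_contains() {
  case ":$2:" in
    *":$1:"*) return 0 ;;
    *) return 1 ;;
  esac
}

# Assumption: default location of user-installed scripts on Linux.
user_bin="${HOME:-}/.local/bin"

if path_contains "$user_bin" "${PATH:-}"; then
  echo "$user_bin is on PATH"
else
  echo "$user_bin is not on PATH; consider: export PATH=\"$user_bin:\$PATH\""
fi
```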

dblp warmup test

wf@fur:/hd/tepig/dblp$ qlever setup-config dblp
wf@fur:/hd/tepig/dblp$ qlever get-data index restart test-query ui 

Action "get-data"

curl -LO -C - https://dblp.org/rdf/dblp.ttl.gz

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
 24 1663M   24  401M    0     0  7092k      0  0:04:00  0:00:57  0:03:03 7126k
...
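
The second command above chains several actions in one call. When one step needs to be retried on its own, it can help to run the actions one at a time and stop at the first failure. The sketch below does that under the assumption that separate invocations behave like the chained call; the helper `run_actions` and the variable `QLEVER_CMD` are hypothetical names, not part of qlever-control.

```shell
#!/usr/bin/env bash
# Command to invoke; overridable so the loop can be tried with a stub.
QLEVER_CMD="${QLEVER_CMD:-qlever}"

# run_actions (hypothetical helper): run each action as a separate
# invocation of $QLEVER_CMD and stop at the first one that fails.
# Assumption: separate invocations behave like the chained call.
run_actions() {
  local action
  for action in "$@"; do
    echo "== $action =="
    "$QLEVER_CMD" "$action" || {
      echo "action '$action' failed" >&2
      return 1
    }
  done
}

# Usage (not executed here):
#   run_actions get-data index restart test-query ui
```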

wikidata

$ qlever setup-config wikidata
$ qlever get-data index
...
2024-02-18 15:57:53.958 - INFO: QLever IndexBuilder, compiled on Fri Feb 16 16:15:27 UTC 2024 using git hash a70652
2024-02-18 15:57:53.958 - INFO: You specified the input format: TTL
2024-02-18 15:57:53.958 - INFO: Processing input triples from /dev/stdin ...
2024-02-18 15:57:53.958 - INFO: You specified "locale = en_US" and "ignore-punctuation = 1"
2024-02-18 15:57:53.958 - INFO: You specified "parallel-parsing = true", which enables faster parsing for TTL files that don't include multiline literals with unescaped newline characters and that have newline characters after the end of triples.
2024-02-18 15:57:53.958 - INFO: You specified "num-triples-per-batch = 5,000,000", choose a lower value if the index builder runs out of memory
2024-02-18 15:57:53.958 - INFO: Integers that cannot be represented by QLever will throw an exception (this is the default behavior)
2024-02-18 15:59:03.263 - INFO: Input triples processed: 100,000,000
2024-02-18 16:00:12.052 - INFO: Input triples processed: 200,000,000
2024-02-18 16:01:17.385 - INFO: Input triples processed: 300,000,000
...
2024-02-18 23:47:16.498 - INFO: Triples processed: 27,800,000,000
2024-02-18 23:47:26.341 - INFO: Triples processed: 27,900,000,000
2024-02-18 23:47:38.089 - INFO: Triples processed: 28,000,000,000
2024-02-18 23:47:48.185 - INFO: Triples processed: 28,100,000,000
2024-02-18 23:47:59.993 - INFO: Triples processed: 28,200,000,000
2024-02-18 23:48:11.605 - INFO: Triples processed: 28,300,000,000
2024-02-18 23:48:15.254 - INFO: Statistics for PSO: #relations = 70,167, #blocks = 912,255, #triples = 28,339,760,365
2024-02-18 23:48:15.254 - INFO: Statistics for POS: #relations = 70,167, #blocks = 912,255, #triples = 28,339,760,365
2024-02-18 23:48:15.254 - INFO: Writing meta data for PSO and POS ...
2024-02-18 23:48:19.327 - INFO: Index build completed
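
The elapsed time covered by the log excerpt above can be derived from its first and last timestamps. The following is a minimal sketch of that arithmetic; it assumes GNU `date` (as shipped with Ubuntu), drops the fractional seconds, and uses a hypothetical helper `to_epoch`.

```shell
#!/usr/bin/env bash
# First and last timestamps of the log excerpt above
# (fractional seconds dropped).
start="2024-02-18 15:57:53"
end="2024-02-18 23:48:19"

# to_epoch (hypothetical helper): seconds since the epoch, with the
# timestamp read as UTC. Requires GNU date (-d).
to_epoch() {
  date -u -d "$1" +%s
}

elapsed=$(( $(to_epoch "$end") - $(to_epoch "$start") ))
printf 'log excerpt spans %dh %02dm %02ds\n' \
  $(( elapsed / 3600 )) $(( elapsed % 3600 / 60 )) $(( elapsed % 60 ))
```

For the two timestamps shown, this comes to roughly 7 hours 50 minutes between the first IndexBuilder message and "Index build completed".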