Difference between revisions of "Get your own copy of WikiData/2023"

From BITPlan Wiki
Line 22: Line 22:
 
* https://www.mediawiki.org/wiki/User:AKlapper_(WMF)
 
* https://www.mediawiki.org/wiki/User:AKlapper_(WMF)
 
== Queries after import ==
 
== Queries after import ==
<source lang='sparql'>
+
=== Number of Triples ==
 +
<source lang='sparql' highlight='1'>
 +
SELECT (COUNT(*) as ?Triples) WHERE { ?s ?p ?o}
 +
Triples
 +
3.019.914.549
 +
</source>
 +
 
 +
=== TypeCount ===
 +
<source lang='sparql' highlight='1-7>
 
SELECT ?type (COUNT(?type) AS ?typecount)
 
SELECT ?type (COUNT(?type) AS ?typecount)
 
WHERE {
 
WHERE {
Line 30: Line 38:
 
ORDER by desc(?typecount)
 
ORDER by desc(?typecount)
 
LIMIT 7
 
LIMIT 7
</source>
 
<pre>
 
 
<http://wikiba.se/ontology#BestRank> 369637917
 
<http://wikiba.se/ontology#BestRank> 369637917
 
schema:Article 61229687
 
schema:Article 61229687
Line 39: Line 45:
 
<http://wikiba.se/ontology#GeoAutoPrecision> 101897
 
<http://wikiba.se/ontology#GeoAutoPrecision> 101897
 
<http://www.wikidata.org/prop/novalue/P17> 37884
 
<http://www.wikidata.org/prop/novalue/P17> 37884
</pre>
+
</source>
  
 
= Second Attempt 2020-05 =
 
= Second Attempt 2020-05 =
 
= Links =
 
= Links =
 
* https://stackoverflow.com/questions/48020506/wikidata-on-local-blazegraph-expected-an-rdf-value-here-found-line-1/48110100
 
* https://stackoverflow.com/questions/48020506/wikidata-on-local-blazegraph-expected-an-rdf-value-here-found-line-1/48110100

Revision as of 15:42, 9 May 2020

First Attempt 2018-01

This attempt started on 2018-01-05. I tried to follow the procedure at:

~/wikidata/wikidata-query-rdf/dist/target/service-0.3.0-SNAPSHOT$ nohup ./munge.sh -f data/latest-all.ttl.gz -d data/split -l en,de &
#logback.classic pattern: %d{HH:mm:ss.SSS} [%thread] %-5level %logger{36} - %msg%n
08:23:02.391 [main] INFO  org.wikidata.query.rdf.tool.Munge - Switching to data/split/wikidump-000000001.ttl.gz
08:24:21.249 [main] INFO  org.wikidata.query.rdf.tool.Munge - Processed 10000 entities at (105, 47, 33)
08:25:07.369 [main] INFO  org.wikidata.query.rdf.tool.Munge - Processed 20000 entities at (162, 70, 41)
08:25:56.862 [main] INFO  org.wikidata.query.rdf.tool.Munge - Processed 30000 entities at (186, 91, 50)
08:26:43.594 [main] INFO  org.wikidata.query.rdf.tool.Munge - Processed 40000 entities at (203, 109, 59)
08:27:24.042 [main] INFO  org.wikidata.query.rdf.tool.Munge - Processed 50000 entities at (224, 126, 67)
...
java.nio.file.NoSuchFileException: ./mwservices.json
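
The progress lines above are enough to estimate how fast the munge step ran. The following is a minimal sketch, not part of the original procedure: it parses the timestamps and entity counts from the log excerpt and derives an average rate. The helper names (parse_progress, entities_per_second) are made up for illustration, and the sketch assumes all lines fall on the same day.

```python
import re
from datetime import datetime

# Log excerpt copied from the munge run above.
LOG = """\
08:24:21.249 [main] INFO  org.wikidata.query.rdf.tool.Munge - Processed 10000 entities at (105, 47, 33)
08:25:07.369 [main] INFO  org.wikidata.query.rdf.tool.Munge - Processed 20000 entities at (162, 70, 41)
08:25:56.862 [main] INFO  org.wikidata.query.rdf.tool.Munge - Processed 30000 entities at (186, 91, 50)
08:26:43.594 [main] INFO  org.wikidata.query.rdf.tool.Munge - Processed 40000 entities at (203, 109, 59)
08:27:24.042 [main] INFO  org.wikidata.query.rdf.tool.Munge - Processed 50000 entities at (224, 126, 67)
"""

# Timestamp at line start, entity count after "Processed".
PROGRESS = re.compile(r"^(\d{2}:\d{2}:\d{2}\.\d{3}) .*Processed (\d+) entities")


def parse_progress(log_text):
    """Return a list of (timestamp, entity_count) pairs (hypothetical helper)."""
    points = []
    for line in log_text.splitlines():
        match = PROGRESS.match(line)
        if match:
            stamp = datetime.strptime(match.group(1), "%H:%M:%S.%f")
            points.append((stamp, int(match.group(2))))
    return points


def entities_per_second(points):
    """Average rate between the first and last progress line.

    Assumes no midnight rollover, since the log carries no date.
    """
    (t0, n0), (t1, n1) = points[0], points[-1]
    return (n1 - n0) / (t1 - t0).total_seconds()


points = parse_progress(LOG)
rate = entities_per_second(points)
print(f"{rate:.1f} entities/s")
```

For this excerpt the rate comes out at roughly 219 entities per second (40000 entities in about 183 seconds). This covers only the first minutes of the run, so it is a rough indication rather than a figure for the whole dump.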

Import issues

Queries after import

Number of Triples

SELECT (COUNT(*) AS ?Triples) WHERE { ?s ?p ?o }

Triples
3.019.914.549
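
The count above can also be fetched programmatically over the standard SPARQL protocol. This is a sketch under assumptions: the endpoint URL is the usual default of a local Blazegraph started from wikidata-query-rdf and may differ on your machine, and the function names are made up for illustration. Only the response format (SPARQL 1.1 Query Results JSON) is standardized.

```python
import json
import urllib.parse
import urllib.request

# Assumption: default local Blazegraph endpoint; adjust to your installation.
ENDPOINT = "http://localhost:9999/bigdata/namespace/wdq/sparql"
COUNT_QUERY = "SELECT (COUNT(*) AS ?Triples) WHERE { ?s ?p ?o }"


def build_request(endpoint, query):
    """Build a SPARQL protocol GET request that asks for JSON results."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    return urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"}
    )


def parse_count(json_text, var="Triples"):
    """Extract the single integer binding from a SPARQL JSON result."""
    bindings = json.loads(json_text)["results"]["bindings"]
    return int(bindings[0][var]["value"])


def fetch_triple_count(endpoint=ENDPOINT):
    """Run the count query; needs a running endpoint and may take a long time."""
    with urllib.request.urlopen(build_request(endpoint, COUNT_QUERY)) as response:
        return parse_count(response.read().decode("utf-8"))


# Offline check with a hand-written response in the standard result format.
SAMPLE = json.dumps({
    "head": {"vars": ["Triples"]},
    "results": {"bindings": [{"Triples": {
        "type": "literal",
        "datatype": "http://www.w3.org/2001/XMLSchema#integer",
        "value": "3019914549",
    }}]},
})
count = parse_count(SAMPLE)
# The page shows the number with dots as thousands separators.
formatted = f"{count:,}".replace(",", ".")
print(formatted)
```

The sample response is hand-written to match the value reported above; it is not captured output from the server.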

TypeCount

SELECT ?type (COUNT(?type) AS ?typecount)
WHERE {
  ?subject a ?type.
}
GROUP BY ?type
ORDER BY DESC(?typecount)
LIMIT 7

<http://wikiba.se/ontology#BestRank>	369637917
schema:Article	61229687
<http://wikiba.se/ontology#GlobecoordinateValue>	5379022
<http://wikiba.se/ontology#QuantityValue>	697187
<http://wikiba.se/ontology#TimeValue>	234556
<http://wikiba.se/ontology#GeoAutoPrecision>	101897
<http://www.wikidata.org/prop/novalue/P17>	37884
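
To put the type counts in proportion, they can be set against the total number of triples. The following sketch assumes that both queries ran against the same import state. Each row of "?subject a ?type" corresponds to one triple, so the ratio is the share of all triples that are type statements for that class. The variable names are made up for illustration.

```python
# Figures copied from the two query results above.
TOTAL_TRIPLES = 3019914549

TYPE_COUNTS = {
    "<http://wikiba.se/ontology#BestRank>": 369637917,
    "schema:Article": 61229687,
    "<http://wikiba.se/ontology#GlobecoordinateValue>": 5379022,
    "<http://wikiba.se/ontology#QuantityValue>": 697187,
    "<http://wikiba.se/ontology#TimeValue>": 234556,
    "<http://wikiba.se/ontology#GeoAutoPrecision>": 101897,
    "<http://www.wikidata.org/prop/novalue/P17>": 37884,
}

# Share of all triples that are rdf:type statements for each type.
shares = {name: count / TOTAL_TRIPLES for name, count in TYPE_COUNTS.items()}

for name, share in shares.items():
    print(f"{name}\t{share:.4%}")
```

BestRank alone accounts for about 12.24% of all triples, and schema:Article for about 2%.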

Second Attempt 2020-05

Links