Apache Jena

From BITPlan Wiki
Revision as of 06:40, 14 September 2020 by Wf (talk | contribs) (→‎Example usecases)

What is Apache Jena?

First Steps

The goals are:

  • Setting up an Apache Jena instance
  • loading some data
  • starting a Fuseki server

To query your data, you might want to look at the SPARQL tutorial in this wiki.

When I first tried to get Apache Jena and the Fuseki server running, I had some issues doing so. This page shares how I solved them.

Also, some things are done slightly differently here than the "standard" Apache Jena way. This avoids having to learn the details of Jena configuration and relies on standard Unix and shell script approaches instead. We try to use things as much "out of the box" as possible.

Script to download Apache Jena and load initial data with tdb2.tdbloader

#!/bin/bash
# WF 2020-05-10

# global settings
jena=apache-jena-3.16.0
tgz=$jena.tar.gz
jenaurl=http://mirror.easyname.ch/apache/jena/binaries/$tgz
base=/hd/luxio/gnd
data=$base/data
tdbloader=$jena/bin/tdb2.tdbloader

getjena() {
# download
if [ ! -f $tgz ]
then
  echo "downloading $tgz from $jenaurl"
  wget $jenaurl
else
  echo "$tgz already downloaded"
fi
# unpack
if [ ! -d $jena ]
then
  echo "unpacking $jena from $tgz"
  tar xvzf $tgz
else
  echo "$jena already unpacked"
fi
# create data directory
if [ ! -d $data ]
then
  echo "creating $data directory"
  mkdir -p $data
else
  echo "$data directory already created"
fi
}

#
# show the given message with the current UTC timestamp
#
timestamp() {
 local msg="$1"
 local ts=$(date -u +"%Y-%m-%dT%H:%M:%SZ")
 echo "$msg at $ts"
}

#
# load data for the given data dir and input
#
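loaddata() {
  local data="$1"
  local input="$2"
  # phase labels the log files; it defaults to "load" when the caller has not set it
  local phase="${phase:-load}"
  timestamp "start loading $input to $data"
  "$tdbloader" --loader=parallel --loc "$data" "$input" > "tdb2-$phase-out.log" 2> "tdb2-$phase-err.log"
  timestamp "finished loading $input to $data"
}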

getjena
export TMPDIR=$base/tmp
if [ ! -d $TMPDIR ]
then
  echo "creating temporary directory $TMPDIR"
  mkdir $TMPDIR
else
  echo "using temporary directory $TMPDIR"
fi
if [ ! -f authorities-kongress_lds.ttl ]
then
  wget https://data.dnb.de/opendata/authorities-kongress_lds.ttl.gz
  gunzip authorities-kongress_lds.ttl.gz
fi
loaddata $data authorities-kongress_lds.ttl
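
To check that the load really produced a usable TDB2 dataset, a quick triple count can help. The following is only a minimal sketch: it assumes the same directory layout as the script above and uses the tdb2.tdbquery command from the Jena bin directory. The function names count_query and counttriples are made up for this example.

```shell
#!/bin/bash
# Sketch: count the triples in a TDB2 dataset after loading.
# Assumptions: jena and data follow the download script above;
# count_query and counttriples are hypothetical helper names.
jena=${jena:-apache-jena-3.16.0}
data=${data:-/hd/luxio/gnd/data}

# print the SPARQL query that counts all triples in the default graph
count_query() {
  echo 'SELECT (COUNT(*) AS ?triples) WHERE { ?s ?p ?o }'
}

# run the count query against the given TDB2 location
counttriples() {
  local loc="$1"
  local tdbquery="$jena/bin/tdb2.tdbquery"
  if [ ! -x "$tdbquery" ]
  then
    echo "$tdbquery not found - run the download script first" 1>&2
    return 1
  fi
  "$tdbquery" --loc "$loc" "$(count_query)"
}

# usage: counttriples "$data"
```

Run it while the Fuseki server is not using the same data directory, since the command line tools and the server should not open the same TDB2 location at the same time.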

Script to start Fuseki server

This script tries to avoid the configuration complexity described in https://jena.apache.org/documentation/fuseki2/fuseki-configuration.html by using the --loc parameter, which points Fuseki directly at the TDB2 database directory.

#!/bin/bash
# WF 2020-06-25
# Jena Fuseki server installation
# see https://jena.apache.org/documentation/fuseki2/fuseki-run.html
version=3.16.0
fuseki=apache-jena-fuseki-$version
if [ ! -d $fuseki ]
then
  if [ ! -f $fuseki.tar.gz ]
  then
    wget http://archive.apache.org/dist/jena/binaries/$fuseki.tar.gz
  else
    echo "$fuseki.tar.gz already downloaded"
  fi
  echo "unpacking $fuseki.tar.gz"
  tar xvfz $fuseki.tar.gz
else
  echo "$fuseki already downloaded and unpacked"
fi
cd $fuseki
java -jar fuseki-server.jar --tdb2 --loc=../data /gnd
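
Once the server is up, a small query against its SPARQL endpoint shows whether the dataset is reachable. This is a sketch under a few assumptions: Fuseki listens on its default port 3030, the dataset is mounted as /gnd as in the script above, and curl is installed. The helper names sparql_url and fusekiquery are invented for this example.

```shell
#!/bin/bash
# Sketch: send a SPARQL query to the Fuseki server started above.
# Assumptions: default port 3030, dataset name gnd, curl available;
# sparql_url and fusekiquery are hypothetical helper names.
host=localhost
port=3030
dataset=gnd

# print the URL of the SPARQL query endpoint of the dataset
sparql_url() {
  echo "http://$host:$port/$dataset/sparql"
}

# POST the given query as a form parameter and ask for JSON results
fusekiquery() {
  local query="$1"
  curl -s -H "Accept: application/sparql-results+json" \
    --data-urlencode "query=$query" "$(sparql_url)"
}

# usage: fusekiquery 'SELECT * WHERE { ?s ?p ?o } LIMIT 5'
```

Sending the query as a form parameter lets curl take care of the URL encoding, so the query text can be written as-is.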

SSH tunnel to make port 3030 available

In its default installation, Fuseki only allows access to its admin functions from localhost. If you don't want to bother with the configuration you can use an SSH tunnel like this:

ssh -L 3030:localhost:3030 capri.bitplan.com

Replace capri.bitplan.com with the name of the server you are using. While the above SSH session is active, you can point your browser to http://localhost:3030 and the Fuseki admin web GUI will show up and be functional.
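
If you do not want to keep a terminal open for the tunnel, ssh can put it into the background. The sketch below assumes the same port as above and that curl is installed; tunnelcmd and fusekiping are hypothetical helper names, and the check relies on Fuseki's /$/ping endpoint.

```shell
#!/bin/bash
# Sketch: open the SSH tunnel in the background and check that Fuseki answers.
# Assumptions: tunnelcmd and fusekiping are hypothetical helper names;
# curl is installed; /$/ping is the ping endpoint of the Fuseki server.
port=3030

# print the ssh command for a background tunnel to the given server
# -f: go to the background, -N: do not run a remote command
tunnelcmd() {
  local server="$1"
  echo "ssh -f -N -L $port:localhost:$port $server"
}

# succeed if a Fuseki server answers on the local end of the tunnel
fusekiping() {
  curl -s -f -o /dev/null "http://localhost:$port/\$/ping"
}

# usage:
#   $(tunnelcmd capri.bitplan.com) && fusekiping && echo "tunnel is up"
```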

Script to start Jena on boot

You might want to adapt the hard disk location where you keep your Jena/Fuseki installation and the script name "fuseki", which is the script for starting Fuseki shown above.

# 
# start Apache jena server
#
startJena() {
  cd /hd/luxio/gnd
  log=/var/log/jena.log
  sudo touch $log
  sudo chmod 666 $log
  nohup sudo ./fuseki > $log 2>&1 &
}

startJena
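
Since the server is started in the background, anything that depends on it during boot has to wait until it answers. One possible way to do this is a small polling loop. This is only a sketch: waitforurl is a made-up helper name, curl is assumed to be installed, and the ping URL in the usage line assumes the default port 3030.

```shell
#!/bin/bash
# Sketch: wait until a URL answers, e.g. after starting Fuseki in the background.
# Assumptions: waitforurl is a hypothetical helper name; curl is installed.

# poll the given url once per second, give up after the given number of tries
waitforurl() {
  local url="$1"
  local tries="${2:-30}"
  local i=1
  while [ "$i" -le "$tries" ]
  do
    if curl -s -f -o /dev/null --max-time 5 "$url"
    then
      echo "$url is up after $i tries"
      return 0
    fi
    sleep 1
    i=$((i + 1))
  done
  echo "$url not reachable after $tries tries" 1>&2
  return 1
}

# usage: waitforurl 'http://localhost:3030/$/ping' 60
```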

Apache server configuration

This configuration does not give you any access security so it's only good for small secure intranets / development environments.

# 
# Apache configuration for jena.bitplan.com
# Virtual host jena 
#
# see http://wiki.ubuntuusers.de/Apache/Virtual_Hosts
#
# to enable run
#   a2ensite jena 
# to disable run 
#   a2dissite jena 
#
# see  http://stackoverflow.com/a/13089668/1497139
#
# WF 2020-06-23 - use reverse proxy jena 
<VirtualHost *:80>
  ServerAdmin webmaster@bitplan.com
  ServerName jena.bitplan.com
  ProxyPreserveHost On
  ProxyRequests Off
  ProxyPass / http://localhost:3030/
  ProxyPassReverse / http://localhost:3030/
</VirtualHost>
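
The ProxyPass directives only work when Apache's proxy modules are loaded. The following sketch shows the activation steps on a Debian/Ubuntu style installation, assuming the configuration above was saved as a site named jena. The DRY_RUN variable and the helper names run and enablejena are made up for this example; as written the commands are only printed, not executed.

```shell
#!/bin/bash
# Sketch: enable the proxy modules and the jena site on Debian/Ubuntu.
# Assumptions: the virtual host above is stored as site "jena";
# DRY_RUN, run and enablejena are hypothetical names for this example.
DRY_RUN=true

# print the command in dry run mode, execute it otherwise
run() {
  if [ "$DRY_RUN" = "true" ]
  then
    echo "$@"
  else
    "$@"
  fi
}

# load the reverse proxy modules, enable the site and reload Apache
enablejena() {
  run sudo a2enmod proxy proxy_http
  run sudo a2ensite jena
  run sudo apache2ctl configtest
  run sudo systemctl reload apache2
}

enablejena
```

Set DRY_RUN=false once the printed commands look right for your system.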

Fuseki configuration

see https://stackoverflow.com/questions/63874908/fuseki-configuration

Example use cases

Get your own copy of Wikidata

see Get_your_own_copy_of_WikiData

RDF API Tutorial examples

see https://jena.apache.org/tutorials/rdf_api.html

Script to compile and run tutorial examples

#!/bin/bash
# WF 2020-06-14
num=$1
pwd=$(pwd)
base=$pwd/apache-jena-3.15.0
cd $base/src-examples
echo compiling Tutorial $num
javac -cp "$base/lib/*" jena/examples/rdf/Tutorial$num.java
echo running Tutorial $num
java -cp ".:$base/lib/*" jena/examples/rdf/Tutorial$num

Tutorial03.java

Java Code

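package jena.examples.rdf ;

import org.apache.jena.rdf.model.*;
import org.apache.jena.vocabulary.*;

/** Tutorial 3 Statement attribute accessor methods
 */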
public class Tutorial03 extends Object {
    public static void main (String args[]) {
    
        // some definitions
        String personURI    = "http://somewhere/JohnSmith";
        String givenName    = "John";
        String familyName   = "Smith";
        String fullName     = givenName + " " + familyName;
        // create an empty model
        Model model = ModelFactory.createDefaultModel();

        // create the resource
        //   and add the properties cascading style
        Resource johnSmith 
          = model.createResource(personURI)
                 .addProperty(VCARD.FN, fullName)
                 .addProperty(VCARD.N, 
                              model.createResource()
                                   .addProperty(VCARD.Given, givenName)
                                   .addProperty(VCARD.Family, familyName));
        
        // list the statements in the graph
        StmtIterator iter = model.listStatements();
        
        // print out the predicate, subject and object of each statement
        while (iter.hasNext()) {
            Statement stmt      = iter.nextStatement();         // get next statement
            Resource  subject   = stmt.getSubject();   // get the subject
            Property  predicate = stmt.getPredicate(); // get the predicate
            RDFNode   object    = stmt.getObject();    // get the object
            
            System.out.print(subject.toString());
            System.out.print(" " + predicate.toString() + " ");
            if (object instanceof Resource) {
                System.out.print(object.toString());
            } else {
                // object is a literal
                System.out.print(" \"" + object.toString() + "\"");
            }
            System.out.println(" .");
        }
    }
}

Trying it

./ct 03
compiling Tutorial 03
running Tutorial 03
http://somewhere/JohnSmith http://www.w3.org/2001/vcard-rdf/3.0#N 7555a24e-8f13-4508-91f0-9dfb26cd2239 .
http://somewhere/JohnSmith http://www.w3.org/2001/vcard-rdf/3.0#FN  "John Smith" .
7555a24e-8f13-4508-91f0-9dfb26cd2239 http://www.w3.org/2001/vcard-rdf/3.0#Family  "Smith" .
7555a24e-8f13-4508-91f0-9dfb26cd2239 http://www.w3.org/2001/vcard-rdf/3.0#Given  "John" .

Tutorial04.java

Java Code

/*
 * Licensed to the Apache Software Foundation (ASF) under one
 * or more contributor license agreements.  See the NOTICE file
 * distributed with this work for additional information
 * regarding copyright ownership.  The ASF licenses this file
 * to you under the Apache License, Version 2.0 (the
 * "License"); you may not use this file except in compliance
 * with the License.  You may obtain a copy of the License at
 *
 *     http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */

package jena.examples.rdf ;

import org.apache.jena.rdf.model.*;
import org.apache.jena.vocabulary.*;

/** Tutorial 4 - create a model and write it in XML form to standard out
 */
public class Tutorial04 extends Object {
    
    // some definitions
    static String tutorialURI  = "http://hostname/rdf/tutorial/";
    static String briansName   = "Brian McBride";
    static String briansEmail1 = "brian_mcbride@hp.com";
    static String briansEmail2 = "brian_mcbride@hpl.hp.com";
    static String title        = "An Introduction to RDF and the Jena API";
    static String date         = "23/01/2001";
    
    public static void main (String args[]) {
    
        // some definitions
        String personURI    = "http://somewhere/JohnSmith";
        String givenName    = "John";
        String familyName   = "Smith";
        String fullName     = givenName + " " + familyName;
        // create an empty model
        Model model = ModelFactory.createDefaultModel();

        // create the resource
        //   and add the properties cascading style
        Resource johnSmith 
          = model.createResource(personURI)
                 .addProperty(VCARD.FN, fullName)
                 .addProperty(VCARD.N, 
                              model.createResource()
                                   .addProperty(VCARD.Given, givenName)
                                   .addProperty(VCARD.Family, familyName));
        
        // now write the model in XML form to standard out
        model.write(System.out);
    }
}

Trying it

./ct 04
compiling Tutorial 04
running Tutorial 04
<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:vcard="http://www.w3.org/2001/vcard-rdf/3.0#">
  <rdf:Description rdf:about="http://somewhere/JohnSmith">
    <vcard:N rdf:parseType="Resource">
      <vcard:Family>Smith</vcard:Family>
      <vcard:Given>John</vcard:Given>
    </vcard:N>
    <vcard:FN>John Smith</vcard:FN>
  </rdf:Description>
</rdf:RDF>

Tutorial05.java

Java Code

package jena.examples.rdf ;

import org.apache.jena.rdf.model.*;
import org.apache.jena.util.FileManager;

import java.io.*;

/** Tutorial 5 - read RDF XML from a file and write it to standard out
 */
public class Tutorial05 extends Object {

    /**
        NOTE that the file is loaded from the class-path and so requires that
        the data-directory, as well as the directory containing the compiled
        class, must be added to the class-path when running this and
        subsequent examples.
    */    
    static final String inputFileName  = "vc-db-1.rdf";
                              
    public static void main (String args[]) {
        // create an empty model
        Model model = ModelFactory.createDefaultModel();

        InputStream in = FileManager.get().open( inputFileName );
        if (in == null) {
            throw new IllegalArgumentException( "File: " + inputFileName + " not found");
        }
        
        // read the RDF/XML file
        model.read(in, "");
                    
        // write it to standard out
        model.write(System.out);            
    }
}

Try it!

wget https://jena.apache.org/tutorials/sparql_data/vc-db-1.rdf
mv vc-db-1.rdf apache-jena-3.15.0/src-examples/
./ct 05
compiling Tutorial 05
running Tutorial 05
<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:vCard="http://www.w3.org/2001/vcard-rdf/3.0#">
  <rdf:Description rdf:about="http://somewhere/SarahJones">
    <vCard:N rdf:parseType="Resource">
      <vCard:Given>Sarah</vCard:Given>
      <vCard:Family>Jones</vCard:Family>
    </vCard:N>
    <vCard:FN>Sarah Jones</vCard:FN>
  </rdf:Description>
  <rdf:Description rdf:about="http://somewhere/JohnSmith">
    <vCard:N rdf:parseType="Resource">
      <vCard:Given>John</vCard:Given>
      <vCard:Family>Smith</vCard:Family>
    </vCard:N>
    <vCard:FN>John Smith</vCard:FN>
  </rdf:Description>
  <rdf:Description rdf:about="http://somewhere/MattJones">
    <vCard:N rdf:parseType="Resource">
      <vCard:Given>Matthew</vCard:Given>
      <vCard:Family>Jones</vCard:Family>
    </vCard:N>
    <vCard:FN>Matt Jones</vCard:FN>
  </rdf:Description>
  <rdf:Description rdf:about="http://somewhere/RebeccaSmith">
    <vCard:N rdf:parseType="Resource">
      <vCard:Given>Rebecca</vCard:Given>
      <vCard:Family>Smith</vCard:Family>
    </vCard:N>
    <vCard:FN>Becky Smith</vCard:FN>
  </rdf:Description>
</rdf:RDF>

Tutorial06.java

Java Code

package jena.examples.rdf ;

import org.apache.jena.rdf.model.*;
import org.apache.jena.util.FileManager;
import org.apache.jena.vocabulary.*;

import java.io.*;

/** Tutorial 6 - navigating a model
 */
public class Tutorial06 extends Object {
    
    static final String inputFileName = "vc-db-1.rdf";
    static final String johnSmithURI = "http://somewhere/JohnSmith";
    
    public static void main (String args[]) {
        // create an empty model
        Model model = ModelFactory.createDefaultModel();
       
        // use the FileManager to find the input file
        InputStream in = FileManager.get().open(inputFileName);
        if (in == null) {
            throw new IllegalArgumentException( "File: " + inputFileName + " not found");
        }
        
        // read the RDF/XML file
        model.read(new InputStreamReader(in), "");
        
        // retrieve the John Smith vcard resource from the model
        Resource vcard = model.getResource(johnSmithURI);

        // retrieve the value of the N property
        Resource name = (Resource) vcard.getRequiredProperty(VCARD.N)
                                        .getObject();
        // retrieve the full name (FN) property
        String fullName = vcard.getRequiredProperty(VCARD.FN)
                               .getString();
        // add two nick name properties to vcard
        vcard.addProperty(VCARD.NICKNAME, "Smithy")
             .addProperty(VCARD.NICKNAME, "Adman");
        
        // set up the output
        System.out.println("The nicknames of \"" + fullName + "\" are:");
        // list the nicknames
        StmtIterator iter = vcard.listProperties(VCARD.NICKNAME);
        while (iter.hasNext()) {
            System.out.println("    " + iter.nextStatement().getObject()
                                            .toString());
        }
    }
}

Try it!

./ct 06
compiling Tutorial 06
running Tutorial 06
The nicknames of "John Smith" are:
    Adman
    Smithy

Tutorial07.java

Java Code

package jena.examples.rdf ;

import org.apache.jena.rdf.model.*;
import org.apache.jena.util.FileManager;
import org.apache.jena.vocabulary.*;

import java.io.*;

/** Tutorial 7 - selecting the VCARD resources
 */
public class Tutorial07 extends Object {
    
    static final String inputFileName = "vc-db-1.rdf";
    
    public static void main (String args[]) {
        // create an empty model
        Model model = ModelFactory.createDefaultModel();
       
        // use the FileManager to find the input file
        InputStream in = FileManager.get().open(inputFileName);
        if (in == null) {
            throw new IllegalArgumentException( "File: " + inputFileName + " not found");
        }
        
        // read the RDF/XML file
        model.read( in, "");
        
        // select all the resources with a VCARD.FN property
        ResIterator iter = model.listResourcesWithProperty(VCARD.FN);
        if (iter.hasNext()) {
            System.out.println("The database contains vcards for:");
            while (iter.hasNext()) {
                System.out.println("  " + iter.nextResource()
                                              .getRequiredProperty(VCARD.FN)
                                              .getString() );
            }
        } else {
            System.out.println("No vcards were found in the database");
        }            
    }
}

Try it!

./ct 07
compiling Tutorial 07
running Tutorial 07
The database contains vcards for:
  Sarah Jones
  John Smith
  Matt Jones
  Becky Smith

Tutorial08.java

Java Code

package jena.examples.rdf ;

import org.apache.jena.rdf.model.*;
import org.apache.jena.util.FileManager;
import org.apache.jena.vocabulary.*;

import java.io.*;


/** Tutorial 8 - demonstrate Selector methods
 */
public class Tutorial08 extends Object {
    
    static final String inputFileName = "vc-db-1.rdf";
    
    public static void main (String args[]) {
        // create an empty model
        Model model = ModelFactory.createDefaultModel();
       
        // use the FileManager to find the input file
        InputStream in = FileManager.get().open(inputFileName);
        if (in == null) {
            throw new IllegalArgumentException( "File: " + inputFileName + " not found");
        }
        
        // read the RDF/XML file
        model.read( in, "" );
        
        // select all the statements with a VCARD.FN property
        // whose value ends with "Smith"
        StmtIterator iter = model.listStatements(
            new 
                SimpleSelector(null, VCARD.FN, (RDFNode) null) {
                    @Override
                    public boolean selects(Statement s) {
                            return s.getString().endsWith("Smith");
                    }
                });
        if (iter.hasNext()) {
            System.out.println("The database contains vcards for:");
            while (iter.hasNext()) {
                System.out.println("  " + iter.nextStatement()
                                              .getString());
            }
        } else {
            System.out.println("No Smith's were found in the database");
        }            
    }
}

Try it!

./ct 08
compiling Tutorial 08
running Tutorial 08
The database contains vcards for:
  Becky Smith
  John Smith