Gremlin python
Revision as of 15:14, 18 September 2019
| OsProject | |
|---|---|
| id | gremlin-python-tutorial | 
| state | |
| owner | WolfgangFahl | 
| title | Gremlin-Python mini tutorial | 
| url | https://github.com/WolfgangFahl/gremlin-python-tutorial | 
| version | 0.0.1 | 
| description | |
| date | 2019-09-17 | 
| since | |
| until | |
Do you already know Gremlin / Apache TinkerPop?
If so, you can continue with the prerequisites part. Otherwise you might want to click on the Gremlin logo below.
There is also an explanation of Gremlin steps based on Java in this wiki.
This mini-tutorial is inspired by this stackoverflow question.
The goal is to get access to an Apache TinkerPop/Gremlin graph database via Python.
The examples in this tutorial have been tested on Ubuntu 18.04 LTS and macOS with a MacPorts environment.
Prerequisites
- Java
- Python
- Gremlin-Server
- Gremlin-Console (for debugging)
To get the prerequisites you can follow either the manual or the script-based installation below. The script-based installation is quicker; the manual installation gives you more insight into and control over the installation steps.
Manual Installation
Installing Java
There are many ways to install Java and your mileage may vary.
sudo apt-get install openjdk-8-jre 
java -version
openjdk version "1.8.0_222"
OpenJDK Runtime Environment (build 1.8.0_222-8u222-b10-1ubuntu1~18.04.1-b10)
OpenJDK 64-Bit Server VM (build 25.222-b10, mixed mode)
Installing Python and Pip
We assume you'd like to work with Python 3.7.
sudo apt install python3.7
python --version
Python 3.7.3
sudo apt install python-pip
pip --version
pip 9.0.1 from /usr/lib/python3/dist-packages (python 3.7)
Installing Gremlin-Python
sudo -H pip install -r requirements.txt
Installing Gremlin Server and Console
Download Gremlin Server and optionally Gremlin Console and unzip the downloaded files.
Starting the Gremlin Server
cd apache-tinkerpop-gremlin-server-3.4.3
bin/gremlin-server.sh conf/gremlin-server-modern.yaml
See #Gremlin-Server_start for the expected result.
Starting the Gremlin Console
cd apache-tinkerpop-gremlin-console-3.4.3
bin/gremlin.sh
Script based installation
The "run" installation helper script tries to automate the necessary steps:
- Installation
- Gremlin-Server start
- Gremlin-Console start (for debugging)
- Python script start
The following commands should get you going:
git clone https://github.com/WolfgangFahl/gremlin-python-tutorial
cd gremlin-python-tutorial
./run -i
./run -s
# in another console
./run -p
Help
usage: ./run  [-c|-h|-i|-p|-s|-t|-v]
  -c|--console: start console
  -h|--help: show this usage
  -i|--install: install prerequisites
  -p|--python: start python trial code
  -s|--server: start server
  -t|--test: start pytest
  -v|--version: show version
Version
./run -v
apache-tinkerpop-gremlin version 3.4.3
Installation
 ./run -i
installs
- gremlin server
- gremlin console
- gremlin python module
checking prerequisites ...
/usr/bin/java
openjdk version "1.8.0_222"
OpenJDK Runtime Environment (build 1.8.0_222-8u222-b10-1ubuntu1~18.04.1-b10)
OpenJDK 64-Bit Server VM (build 25.222-b10, mixed mode)
/usr/bin/python
Python 2.7.15+
/usr/bin/pip
pip 9.0.1 from /usr/lib/python2.7/dist-packages (python 2.7)
downloading apache-tinkerpop-gremlin-server-3.4.3-bin.zip
unzipping apache-tinkerpop-gremlin-server-3.4.3-bin.zip
downloading apache-tinkerpop-gremlin-console-3.4.3-bin.zip
unzipping apache-tinkerpop-gremlin-console-3.4.3-bin.zip
installing needed python modules
Requirement already satisfied: futures in /usr/local/lib/python2.7/dist-packages (from -r requirements.txt (line 2))
Requirement already satisfied: gremlinpython in /usr/local/lib/python2.7/dist-packages (from -r requirements.txt (line 4))
Requirement already satisfied: isodate>=0.6.0 in /usr/local/lib/python2.7/dist-packages (from gremlinpython->-r requirements.txt (line 4))
Requirement already satisfied: six>=1.10.0 in /usr/lib/python2.7/dist-packages (from gremlinpython->-r requirements.txt (line 4))
Requirement already satisfied: aenum>=1.4.5 in /usr/local/lib/python2.7/dist-packages (from gremlinpython->-r requirements.txt (line 4))
Requirement already satisfied: tornado<5.0,>=4.4.1 in /usr/local/lib/python2.7/dist-packages (from gremlinpython->-r requirements.txt (line 4))
Requirement already satisfied: certifi in /usr/local/lib/python2.7/dist-packages (from tornado<5.0,>=4.4.1->gremlinpython->-r requirements.txt (line 4))
Requirement already satisfied: singledispatch in /usr/local/lib/python2.7/dist-packages (from tornado<5.0,>=4.4.1->gremlinpython->-r requirements.txt (line 4))
Requirement already satisfied: backports-abc>=0.4 in /usr/local/lib/python2.7/dist-packages (from tornado<5.0,>=4.4.1->gremlinpython->-r requirements.txt (line 4))
Gremlin-Server start
 ./run -s
Starts the Gremlin server in the foreground with a default YAML file.
starting gremlin-server ...
[INFO] GremlinServer - 3.4.3
         \,,,/
         (o o)
-----oOOo-(3)-oOOo-----
[INFO] GremlinServer - Configuring Gremlin Server from /home/wf/source/python/gremlin-python-tutorial/apache-tinkerpop-gremlin-server-3.4.3/conf/gremlin-server-modern.yaml
[INFO] MetricManager - Configured Metrics Slf4jReporter configured with interval=180000ms and loggerName=org.apache.tinkerpop.gremlin.server.Settings$Slf4jReporterMetrics
[INFO] DefaultGraphManager - Graph [graph] was successfully configured via [conf/tinkergraph-empty.properties].
[INFO] ServerGremlinExecutor - Initialized Gremlin thread pool.  Threads in pool named with pattern gremlin-*
[INFO] ServerGremlinExecutor - Initialized GremlinExecutor and preparing GremlinScriptEngines instances.
[INFO] ServerGremlinExecutor - Initialized gremlin-groovy GremlinScriptEngine and registered metrics
[INFO] ServerGremlinExecutor - A GraphTraversalSource is now bound to [g] with graphtraversalsource[tinkergraph[vertices:0 edges:0], standard]
[INFO] OpLoader - Adding the standard OpProcessor.
[INFO] OpLoader - Adding the session OpProcessor.
[INFO] OpLoader - Adding the traversal OpProcessor.
[INFO] TraversalOpProcessor - Initialized cache for TraversalOpProcessor with size 1000 and expiration time of 600000 ms
[INFO] GremlinServer - Executing start up LifeCycleHook
[INFO] Logger$info - Loading 'modern' graph data.
[INFO] GremlinServer - idleConnectionTimeout was set to 0 which resolves to 0 seconds when configuring this value - this feature will be disabled
[INFO] GremlinServer - keepAliveInterval was set to 0 which resolves to 0 seconds when configuring this value - this feature will be disabled
[WARN] AbstractChannelizer - The org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV3d0 serialization class is deprecated.
[INFO] AbstractChannelizer - Configured application/vnd.gremlin-v3.0+gryo with org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV3d0
[WARN] AbstractChannelizer - The org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV3d0 serialization class is deprecated.
[INFO] AbstractChannelizer - Configured application/vnd.gremlin-v3.0+gryo-stringd with org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV3d0
[INFO] AbstractChannelizer - Configured application/vnd.gremlin-v3.0+json with org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerV3d0
[INFO] AbstractChannelizer - Configured application/json with org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerV3d0
[INFO] AbstractChannelizer - Configured application/vnd.graphbinary-v1.0 with org.apache.tinkerpop.gremlin.driver.ser.GraphBinaryMessageSerializerV1
[INFO] AbstractChannelizer - Configured application/vnd.graphbinary-v1.0-stringd with org.apache.tinkerpop.gremlin.driver.ser.GraphBinaryMessageSerializerV1
[INFO] GremlinServer$1 - Gremlin Server configured with worker thread pool of 1, gremlin pool of 4 and boss thread pool of 1.
[INFO] GremlinServer$1 - Channel started at port 8182.
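The last log line reports that the server is listening on port 8182. Before running any Python code against it, a quick reachability check can save some confusion. The following is a minimal sketch using only the Python standard library; the function name server_is_listening and its defaults are assumptions made for this illustration and are not part of the tutorial repository. It only checks that something accepts TCP connections on the port, not that it is actually a Gremlin server.

```python
import socket


def server_is_listening(host="localhost", port=8182, timeout=1.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection raises OSError (e.g. connection refused,
        # timeout) when nothing is accepting connections on the port
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

If server_is_listening() returns False, start the server with ./run -s in another console first.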
Gremlin-Console start (for debugging)
 ./run -c
Starts the Gremlin console.
starting gremlin-console ...
Sep 17, 2019 4:16:03 PM java.util.prefs.FileSystemPreferences$1 run
INFO: Created user preferences directory.
         \,,,/
         (o o)
-----oOOo-(3)-oOOo-----
plugin activated: tinkerpop.server
plugin activated: tinkerpop.utilities
plugin activated: tinkerpop.tinkergraph
gremlin>
You can try out https://stackoverflow.com/a/52998299/1497139:
gremlin>  :remote connect tinkerpop.server conf/remote.yaml
==>Configured localhost/127.0.0.1:8182
:> g.V().values('name')
==>marko
==>vadas
==>lop
==>josh
==>ripple
==>peter
gremlin>  :remote console
==>All scripts will now be sent to Gremlin Server - [localhost/127.0.0.1:8182] - type ':remote console' to return to local mode
g.V()
==>v[1]
==>v[2]
==>v[3]
==>v[4]
==>v[5]
==>v[6]
gremlin> :exit
Python script start
./run -p
Starts the Python test script.
./run -p
starting python test code
The modern graph has 6 vertices
Python unit tests start
./run -t
Starts the pytest unit tests. Please make sure a gremlin-server is running.
./run -t
==================================== test session starts =====================================
platform darwin -- Python 3.7.4, pytest-5.1.2, py-1.8.0, pluggy-0.12.0
rootdir: /Users/wf/source/python/gremlin-python-tutorial
collected 1 item                                                                             
test_001.py .                                                                          [100%]
===================================== 1 passed in 12.92s =====================================
Getting Started
The Apache TinkerPop Getting Started tutorial assumes you are using the Groovy console to try things out. We'll use the steps of that tutorial to show how the same traversals are available via gremlin-python.
The modern graph will be the basis for our first steps.
Gremlin-Python is just a Gremlin Language Variant. This means that the graph traversals are not executed in the Python environment; instead they are sent as "bytecode" to a server, which executes the traversal and sends back the result.
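To make the "bytecode" idea more concrete, here is a toy model in plain Python. It is not gremlin-python's actual implementation: the class RecordingTraversal is invented for this illustration. It only shows the principle that chaining steps records what should be done instead of doing it, so the recorded steps can later be shipped to a server.

```python
class RecordingTraversal:
    """Toy stand-in for a traversal: records steps instead of running them."""

    def __init__(self, steps=None):
        self.steps = list(steps or [])

    def __getattr__(self, name):
        # only called for attributes that do not exist;
        # private/dunder lookups keep their normal behaviour
        if name.startswith("_"):
            raise AttributeError(name)

        def step(*args):
            # every step returns a new traversal with one more recorded step
            return RecordingTraversal(self.steps + [(name, args)])

        return step


t = RecordingTraversal().V().has("name", "marko").out("knows")
# nothing has been executed; t.steps merely describes the traversal
print(t.steps)
```

In this toy model t.steps is the list [('V', ()), ('has', ('name', 'marko')), ('out', ('knows',))]. In the real library a comparable description of the traversal is what gets serialized and sent to the Gremlin server.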
The first five minutes
g - the graph traversal
In the Python environment you get the starting point "g" (the graph traversal) by creating a remote connection to a Gremlin server. That's why we have to start the Gremlin server first, e.g. with ./run -s from our automation script above. The Gremlin server is configured to supply traversals for the "modern graph" example depicted above.
Python code for getting g - the graph traversal
from gremlin_python.process.anonymous_traversal import traversal
from gremlin_python.driver.driver_remote_connection import DriverRemoteConnection
g = traversal().withRemote(DriverRemoteConnection('ws://localhost:8182/gremlin','g'))
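The one-liner above never closes its connection. A possible way to handle this is sketched below: keep a reference to the DriverRemoteConnection and call its close() method when done. The helper names gremlin_url and remote_traversal are assumptions made for this sketch and do not come from the tutorial repository; the host, port and traversal source defaults mirror the values used above.

```python
from contextlib import contextmanager


def gremlin_url(host="localhost", port=8182):
    """Build the WebSocket URL used to reach the Gremlin server."""
    return "ws://{}:{}/gremlin".format(host, port)


@contextmanager
def remote_traversal(host="localhost", port=8182, source="g"):
    """Yield a traversal source and close the connection afterwards."""
    # imported here so that this helper can be loaded even when
    # gremlinpython is not installed yet
    from gremlin_python.process.anonymous_traversal import traversal
    from gremlin_python.driver.driver_remote_connection import DriverRemoteConnection

    connection = DriverRemoteConnection(gremlin_url(host, port), source)
    try:
        yield traversal().withRemote(connection)
    finally:
        # release the underlying WebSocket connection
        connection.close()


# Usage (needs a running Gremlin server):
#   with remote_traversal() as g:
#       print(g.V().count().next())
```

Wrapping the connection in a context manager is a design choice, not a requirement: calling connection.close() explicitly at the end of a script works as well.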
g.V() - the vertices
# http://wiki.bitplan.com/index.php/Gremlin_python#g.V.28.29_-_the_vertices
from gremlin_python.process.graph_traversal import GraphTraversal

# g is the graph traversal obtained via the remote connection
def test_gV():
  # get the vertices
  gV=g.V()
  # we have a traversal now
  assert isinstance(gV,GraphTraversal)
  # convert it to a list to get the actual vertices
  vList=gV.toList()
  # there should be 6 vertices
  assert len(vList)==6
  # the default string representation of a vertex shows
  # the id of the vertex
  assert str(vList)=="[v[1], v[2], v[3], v[4], v[5], v[6]]"
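The console session earlier on this page ran g.V().values('name') against the same server. As a sketch, the equivalent from Python could look like the following. The function names vertex_names and vertex_count are made up for this example, and g is assumed to be the graph traversal obtained via the remote connection.

```python
def vertex_count(g):
    """Let the server count the vertices and fetch the single result."""
    return g.V().count().next()


def vertex_names(g):
    """Return the 'name' property of every vertex as a plain Python list."""
    return g.V().values('name').toList()
```

Against the modern graph one would expect vertex_count(g) to return 6 and vertex_names(g) to contain the six names shown in the console output (marko, vadas, lop, josh, ripple, peter).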
Links
- https://pypi.org/project/gremlinpython/
- https://stackoverflow.com/questions/tagged/gremlinpython
- http://tinkerpop.apache.org/downloads.html
- http://tinkerpop.apache.org/docs/3.4.3/reference/#connecting-via-console
- https://gist.githubusercontent.com/okram/f193d5616563a69ad5714a42c504276f/raw/b8075410e400e18f18360015945f3760d99d044a/gremlin-python-play.py
