Difference between revisions of "Gremlin"
Revision as of 09:01, 25 April 2019
Gremlin is the graph traversal language of Apache TinkerPop. Gremlin is a functional, data-flow language that enables users to succinctly express complex traversals on (or queries of) their application's property graph. Every Gremlin traversal is composed of a sequence of (potentially nested) steps. A step performs an atomic operation on the data stream. Every step is either a map-step (transforming the objects in the stream), a filter-step (removing objects from the stream), or a sideEffect-step (computing statistics about the stream). The Gremlin step library extends these 3 fundamental operations to provide users with a rich collection of steps that they can compose in order to ask any conceivable question they may have of their data, for Gremlin is Turing complete.
Explaining Gremlin
Gremlin can be explained on different levels. The goal of this page is to cover all 4 levels, with a focus on Java applied to the modern example.
Graph
A graph G = (V, E) consists of a finite set of vertices V and a finite set of edges E ⊆ V×V.
The Modern example
The modern example consists of the vertices:
- person (name: marko, age:29)
- person (name: vadas, age:27)
- software (name: lop, lang: java)
- person (name: josh, age:32)
- software (name: ripple, lang: java)
- person (name: peter, age:35)
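The definition G = (V, E) can be made concrete with a small plain-Java model of the modern example. This is only a sketch and not the TinkerPop API: the class name ModernGraphModel and its Edge helper are made up for illustration, vertices are identified by their name property, and the six edges (with label and weight) are the ones of the modern toy graph as described in the TinkerPop documentation.

```java
import java.util.Arrays;
import java.util.List;

/** Plain-Java model of G = (V, E) for the modern example; hypothetical, not TinkerPop API. */
final class ModernGraphModel {

  /** A directed, labeled edge: out vertex, label, in vertex, weight. */
  static final class Edge {
    final String out;
    final String label;
    final String in;
    final double weight;

    Edge(String out, String label, String in, double weight) {
      this.out = out;
      this.label = label;
      this.in = in;
      this.weight = weight;
    }
  }

  // V: the six vertices, identified here by their name property
  static final List<String> VERTICES =
      Arrays.asList("marko", "vadas", "lop", "josh", "ripple", "peter");

  // E: the six edges, each an ordered pair of vertices plus label and weight
  static final List<Edge> EDGES = Arrays.asList(
      new Edge("marko", "knows", "vadas", 0.5),
      new Edge("marko", "knows", "josh", 1.0),
      new Edge("marko", "created", "lop", 0.4),
      new Edge("josh", "created", "ripple", 1.0),
      new Edge("josh", "created", "lop", 0.4),
      new Edge("peter", "created", "lop", 0.2));

  /** Checks that every edge connects two vertices of V, i.e. E is a subset of V x V. */
  static boolean edgesAreWithinVertices() {
    return EDGES.stream()
        .allMatch(e -> VERTICES.contains(e.out) && VERTICES.contains(e.in));
  }
}
```

Calling ModernGraphModel.edgesAreWithinVertices() confirms the subset condition for this model; both lists have 6 entries.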
GraphTraversal
One of the core concepts of TinkerPop/Gremlin is the GraphTraversal. Its interface has the generic definition:
public interface GraphTraversal<S,E> extends Traversal<S,E>
At https://markorodriguez.com/ the author, Marko Rodriguez, explains the ideas behind using a generic approach for handling graphs. The Java implementation is available on GitHub.
S is a generic Start class, and E is a generic End class as explained in the Apache TinkerPop documentation.
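To illustrate the role of the two type parameters, here is a toy class written only for this page. MiniTraversal is hypothetical and far simpler than the real GraphTraversal interface; it merely shows that the start type S stays fixed while the end type E changes with each step that is appended.

```java
import java.util.List;
import java.util.function.Function;
import java.util.function.Predicate;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical miniature of a typed traversal: S = start type, E = current end type. */
final class MiniTraversal<S, E> {
  private final Function<S, Stream<E>> pipeline;

  private MiniTraversal(Function<S, Stream<E>> pipeline) {
    this.pipeline = pipeline;
  }

  /** A fresh traversal ends where it starts, so E equals S. */
  static <S> MiniTraversal<S, S> start() {
    return new MiniTraversal<S, S>(s -> Stream.of(s));
  }

  /** A map-like step changes the end type from E to E2; S is untouched. */
  <E2> MiniTraversal<S, E2> map(Function<E, E2> f) {
    return new MiniTraversal<S, E2>(s -> pipeline.apply(s).map(f));
  }

  /** A filter-like step keeps both type parameters as they are. */
  MiniTraversal<S, E> filter(Predicate<? super E> p) {
    return new MiniTraversal<S, E>(s -> pipeline.apply(s).filter(p));
  }

  /** Runs the pipeline for every start object and collects the end objects. */
  List<E> toList(List<S> starts) {
    return starts.stream().flatMap(pipeline).collect(Collectors.toList());
  }
}
```

For example, MiniTraversal.<String>start().map(String::length) has the type MiniTraversal<String, Integer>: it starts with strings and ends with integers.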
A Graph Traversal Source is the starting point for working with a graph. The convention is to name this starting point
g
or
g()
JUnit Testcase
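  @Test
  public void testTraversal() {
    // TinkerFactory.createModern() builds the modern example graph in memory
    Graph graph = TinkerFactory.createModern();
    // the traversal source g is the starting point for all traversals
    GraphTraversalSource g = graph.traversal();
    assertEquals(6, g.E().count().next().longValue());
    assertEquals(6, g.V().count().next().longValue());
  }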
E() gives you access to the edges of a graph traversal. V() gives you access to the vertices of a graph traversal. In the above example we simply count the edges and vertices and check our assumption that there are 6 edges and 6 vertices in the modern example graph.
Steps
As explained in Gremlin_Basics: "The Gremlin graph traversal language defines approximately 30 steps which can be understood as the instruction set of the Gremlin traversal machine. These steps are useful in practice, with typically only 10 or so of them being applied in the majority of cases. Each of the provided steps can be understood as being a specification of one of the 5 general types enumerated below".
 
General Steps
filter Step
Continues processing based on the given filter condition.
JUnit Test
  @Test
  public void testFilter() {
    assertEquals(3,g().V().filter(out()).count().next().longValue());
    assertEquals(4,g().V().filter(in()).count().next().longValue());
    assertEquals(5,g().E().filter(values("weight").
      is(P.gte(0.4))).count().next().longValue());
  }
There are 3 vertices having outgoing edges and 4 vertices having incoming edges in the modern example graph. There are 5 edges having a weight >= 0.4.
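The counts asserted in the test can be checked by hand with a plain-Java analogy, in which java.util.stream's filter plays the role of the Gremlin filter step. This is a sketch only: FilterSketch and its arrays are hypothetical helpers, and the edge list and weights are taken from the modern toy graph as described in the TinkerPop documentation.

```java
import java.util.Arrays;
import java.util.List;

/** Stream-based analogy of the filter step; hypothetical, not TinkerPop API. */
final class FilterSketch {
  static final List<String> VERTICES =
      Arrays.asList("marko", "vadas", "lop", "josh", "ripple", "peter");

  // each edge as {out vertex, in vertex}; WEIGHTS holds the weight at the same index
  static final String[][] EDGES = {
      {"marko", "vadas"}, {"marko", "josh"}, {"marko", "lop"},
      {"josh", "ripple"}, {"josh", "lop"}, {"peter", "lop"}};
  static final double[] WEIGHTS = {0.5, 1.0, 0.4, 1.0, 0.4, 0.2};

  /** Keeps only vertices that are the out vertex of at least one edge. */
  static long withOutgoingEdge() {
    return VERTICES.stream()
        .filter(v -> Arrays.stream(EDGES).anyMatch(e -> e[0].equals(v)))
        .count();
  }

  /** Keeps only vertices that are the in vertex of at least one edge. */
  static long withIncomingEdge() {
    return VERTICES.stream()
        .filter(v -> Arrays.stream(EDGES).anyMatch(e -> e[1].equals(v)))
        .count();
  }

  /** Keeps only edges whose weight is at least min. */
  static long edgesWithWeightAtLeast(double min) {
    return Arrays.stream(WEIGHTS).filter(w -> w >= min).count();
  }
}
```

With this data the three methods return 3, 4 and 5 (for a minimum weight of 0.4), matching the three assertions of testFilter.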
map Step
A map step transforms the current element into a new element. As the test below indicates, an element for which the transformation yields nothing is dropped from the stream. See also https://stackoverflow.com/questions/51015636/in-gremlin-how-does-map-really-work
JUnit Test
  @Test
  public void testMap() {
    assertEquals(6,g().V().map(values("name")).count().next().longValue());
    assertEquals(4,g().V().map(hasLabel("person")).count().next().longValue());
    assertEquals(2,g().V().map(has("lang","java")).count().next().longValue());
    List<Edge> outEdges = g().V().map(outE()).toList();
    assertEquals(3,outEdges.size());
    List<Object> edges = g().E().map(has("weight",0.4)).toList();
    assertEquals(2,edges.size());
    for (Object edge:edges) {
      assertTrue(edge instanceof Edge);
    }
  }
There are 6 vertices having a name property. There are 4 vertices with a "person" label. There are 2 vertices whose lang property has the value "java". There are 3 vertices having out edges, and map(outE()) yields one edge for each of them, so toList() returns a list of 3 edges. There are 2 edges having a weight of 0.4; in this last example toList() returns them as generic objects, each of which is an Edge.
flatMap Step
A flatMap step transforms the current element in a one-to-many fashion: each incoming element may yield any number of outgoing elements.
JUnit Test
  @Test
  public void testflatMap() {
    assertEquals(6,g().V().flatMap(values("name")).count().next().longValue());
    assertEquals(4,g().V().flatMap(hasLabel("person")).count().next().longValue());
    assertEquals(2,g().V().flatMap(has("lang","java")).count().next().longValue());
    List<Edge> outEdges = g().V().flatMap(outE()).toList();
    assertEquals(6,outEdges.size());
    List<Object> edges = g().E().flatMap(has("weight",0.4)).toList();
    assertEquals(2,edges.size());
    for (Object edge:edges) {
      assertTrue(edge instanceof Edge);
    }
  }
Note the difference from testMap: only the traversals using outE() behave differently. In the map() case only the first out edge of each vertex is considered, while in the flatMap() case all out edges are considered.
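The difference between the two outE() results (3 versus 6) can be reproduced with a plain-Java analogy. This is a sketch under stated assumptions: MapVersusFlatMapSketch is a hypothetical helper, the edges are those of the modern toy graph, and the map-like variant mimics the behaviour seen in the tests above by keeping at most the first out edge per vertex and dropping vertices without out edges.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Stream-based analogy of map(outE()) versus flatMap(outE()); not TinkerPop API. */
final class MapVersusFlatMapSketch {
  static final List<String> VERTICES =
      Arrays.asList("marko", "vadas", "lop", "josh", "ripple", "peter");

  // each edge as {out vertex, label, in vertex}
  static final String[][] EDGES = {
      {"marko", "knows", "vadas"}, {"marko", "knows", "josh"},
      {"marko", "created", "lop"}, {"josh", "created", "ripple"},
      {"josh", "created", "lop"}, {"peter", "created", "lop"}};

  /** All out edges of the given vertex. */
  static Stream<String[]> outE(String vertex) {
    return Arrays.stream(EDGES).filter(e -> e[0].equals(vertex));
  }

  /** One-to-one: at most the first out edge per vertex. */
  static List<String[]> mapOutE() {
    return VERTICES.stream()
        .map(v -> outE(v).findFirst())
        .filter(Optional::isPresent)
        .map(Optional::get)
        .collect(Collectors.toList());
  }

  /** One-to-many: every out edge of every vertex. */
  static List<String[]> flatMapOutE() {
    return VERTICES.stream()
        .flatMap(MapVersusFlatMapSketch::outE)
        .collect(Collectors.toList());
  }
}
```

In this model mapOutE() yields 3 edges (one each for marko, josh and peter), while flatMapOutE() yields all 6.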
sideEffect Step
A sideEffect step performs some operation on the traverser and passes it on unchanged to the next step.
JUnit Test
  @Test
  public void testSideEffect() {
    assertEquals(6,g().V().sideEffect(addE("sideedge")).outE().
      hasLabel("sideedge").count().next().longValue());
  }
The side effect in this example JUnit test case adds edges "on the fly".
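In plain Java the closest analogy is Stream.peek, which runs an action for every element and hands the element on unchanged. The following sketch is hypothetical (SideEffectSketch is not part of TinkerPop) and assumes that addE() without an explicit from() or to() attaches the new edge to the current vertex on both ends, which is why one self-loop is stored per vertex.

```java
import java.util.List;
import java.util.stream.Collectors;

/** Stream-based analogy of the sideEffect step; hypothetical, not TinkerPop API. */
final class SideEffectSketch {

  /**
   * Passes every vertex through unchanged and, as a side effect,
   * stores one "sideedge" self-loop {out, label, in} per vertex in edgeStore.
   */
  static List<String> traverse(List<String> vertices, List<String[]> edgeStore) {
    return vertices.stream()
        .peek(v -> edgeStore.add(new String[] {v, "sideedge", v})) // the side effect
        .collect(Collectors.toList()); // collect, so that every element is really visited
  }
}
```

Called with the 6 vertex names of the modern example and an empty list, traverse returns the same 6 names and leaves 6 new edges in the list, which corresponds to the count of 6 asserted in testSideEffect.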
branch Step
Splits the traverser and routes it into different sub-traversals.
JUnit Test
  @Test
  public void testBranch() {
   
  }
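Since the test case above is still empty, the idea of branching can at least be sketched in plain Java. BranchSketch is a hypothetical helper and not the TinkerPop API: each vertex of the modern example is routed to one of two branches depending on its label, and the results of both branches are merged back into one stream. In Gremlin itself a step such as choose() plays a comparable role.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Stream-based analogy of a branch step; hypothetical, not TinkerPop API. */
final class BranchSketch {
  // each vertex as {label, name, age or lang}
  static final String[][] VERTICES = {
      {"person", "marko", "29"}, {"person", "vadas", "27"},
      {"software", "lop", "java"}, {"person", "josh", "32"},
      {"software", "ripple", "java"}, {"person", "peter", "35"}};

  /** Branch taken by person vertices. */
  static Stream<String> personBranch(String[] v) {
    return Stream.of(v[1] + " is " + v[2]);
  }

  /** Branch taken by software vertices. */
  static Stream<String> softwareBranch(String[] v) {
    return Stream.of(v[1] + " uses " + v[2]);
  }

  /** Routes each vertex by its label and merges the branch results. */
  static List<String> traverse() {
    return Arrays.stream(VERTICES)
        .flatMap(v -> "person".equals(v[0]) ? personBranch(v) : softwareBranch(v))
        .collect(Collectors.toList());
  }
}
```

The result keeps the order of the input: four entries come from the person branch and two from the software branch.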
Links
- That Conf - Graph Database - What, Why, How - Presentation by Andrew Glassmann
- Practical Gremlin: An Apache TinkerPop Tutorial by Kelvin Lawrence see also https://github.com/krlawrence/graph
- https://github.com/bechbd/gremlin-ide
- Tinkerpop
Stackoverflow Questions
Recipes
Practical Gremlin: An Apache TinkerPop Tutorial by Kelvin Lawrence
Traversing Graphs with Gremlin
