Difference between revisions of "SimpleGraph"

From BITPlan Wiki
 
(61 intermediate revisions by 2 users not shown)
Line 1: Line 1:
{{Article|title=SimpleGraph|text=[https://github.com/BITPlan/com.bitplan.simplegraph SimpleGraph]
+
{{SimpleArticle|title=SimpleGraph|text={{OsProject|id=com.bitplan.simplegraph|owner=BITPlan|title=SimpleGraph System API wrapper|url=https://github.com/BITPlan/com.bitplan.simplegraph|version=0.0.5|date=2019-03-23}}
is an open source project that wraps system APIs so that graph algorithms and storage can be applied. Apache TinkerPop/Gremlin is used as the implementation.
+
is an open source project that wraps system APIs so that graph algorithms and storage can be applied. [http://tinkerpop.apache.org Apache TinkerPop/Gremlin] is used as the implementation.}}
= FileSystem example =
 
  
== Basics ==
+
{{:SimpleGraph/Links}}
A Filesystem is a graph. It consists of File and Directory nodes
+
 
<graphviz>
+
SimpleGraph uses the [http://www.enterpriseintegrationpatterns.com/ramblings/03_hubandspoke.html Hub and Spoke] and Adapter patterns heavily.
  digraph FileSystem {
+
 
    Directory -> File [ label="files" ]
+
http://www.enterpriseintegrationpatterns.com/img/IntegrationSpaghetti.gif
    Directory -> Directory [ label="files" ]
+
http://www.enterpriseintegrationpatterns.com/img/MessageBroker.gif
    File -> Directory [ label="parent" ]
+
 
    Directory -> Directory [ label="parent" ]
+
= Modules =
 +
There are currently {{#ask: [[Concept:SimpleGraphModule]]|format=count}} Modules available for SimpleGraph.
 +
Each module wraps an "external" API to make the functions and data behind that API available for graph processing with Apache Tinkerpop / Gremlin.
 +
== Module Hub and Spoke ==
 +
{{:SimpleGraphModuleHubAndSpoke}}
 +
 
 +
== Module Details ==
 +
{{SimpleGraphModuleMarkup|#userparam=intro}}
 +
{{#ask: [[Concept:SimpleGraphModule]]
 +
| mainlabel=SimpleGraphModule
 +
| ?SimpleGraphModule name = name
 +
| ?SimpleGraphModule modulename = modulename
 +
| ?SimpleGraphModule systemname = systemname
 +
| ?SimpleGraphModule logo = logo
 +
| ?SimpleGraphModule apiname = apiname
 +
| ?SimpleGraphModule apiurl = apiurl
 +
| ?SimpleGraphModule url = url
 +
| ?SimpleGraphModule documentation = documentation
 +
| sort=SimpleGraphModule name
 +
| format=template
 +
| link=none
 +
| userparam=row
 +
| named args=yes
 +
| template=SimpleGraphModuleMarkup
 +
}}
 +
{{SimpleGraphModuleMarkup|#userparam=outro}}
 +
 
 +
= Introduction =
 +
{{SimpleArticle|title=Motivation|text=Solving IT problems across system boundaries can get very difficult. There may be a wealth of APIs that looks helpful at first, but given the diversity of approaches
 +
a daunting task must be tackled to get reasonable results. More often than not, projects are not even started because the cost/benefit ratio is not good enough.
 +
SimpleGraph aims to supply a unified graph API access to Systems for which this makes sense. In fact for most systems it makes sense to have a graph API. Quite a few problems will get much easier to solve if the subdivision of the problem is done with the goal to apply graph algorithms.
 +
See {{Link|target=SiGNaL#What_happens_if_you_view_the_world_as_a_graph.3F|title=What happens if you see the world as a graph?}}
 +
}}
 +
 
 +
{{SimpleArticle|title=Use Cases|text=
 +
== Mix and Match Office and other data ==
 +
Let's assume we have a business that works in the following manner:
 +
# There are price lists for product categories in Microsoft Excel files
 +
# Product specifications are in PDF Format - the files are referenced in the Excel files
 +
# Invoices are written in Microsoft Word
 +
# There is an address book of customers in VCard format
 +
# Orders are handled via e-mail (automatically produced by the company's shop website)
 +
# Monthly reports are created with PowerPoint
 +
In the future, the monthly reports should also give geographical information: essentially, revenue per region and sales count per product and region.
 +
 
 +
The 1000 dollar question - can this report be created efficiently semi-manually with reasonable software effort?
 +
Sadly, it sounds more like a 10,000 or 100,000 dollar question.
 +
 
 +
There are at least 6 different systems involved.
 +
SimpleGraph simplifies accessing all 6 systems.
 +
}}
 +
{{SimpleArticle|title=Examples|text=
 +
== Air Routes ==
 +
The AirRoutes example is taken from [http://kelvinlawrence.net/book/Gremlin-Graph-Guide.html Practical Gremlin: An Apache TinkerPop Tutorial by Kelvin Lawrence]; see also https://github.com/krlawrence/graph.
 +
<uml>
 +
  hide circle
 +
  package AirRoutes {
 +
  note top of airport: 3374
 +
  class airport { 
 +
    country
 +
    code
 +
    longest
 +
    city
 +
    elev
 +
    icao
 +
    lon
 +
    type
 +
    region
 +
    runways
 +
    lat
 +
    desc
 +
  }
 +
  note top of version: 1
 +
  class version { 
 +
    code
 +
    type
 +
    desc
 +
  }
 +
  note top of country: 237
 +
  class country { 
 +
    code
 +
    type
 +
    desc
 +
  }
 +
  note top of continent: 7
 +
  class continent { 
 +
    code
 +
    type
 +
    desc
 +
  }
 
   }
 
   }
</graphviz>
 
There is a parent-child relation between directories and their entries. A directory may contain files and directories. A file is always a leaf of the tree. A directory may be a leaf if it is empty.
 
In our example we'll go from directories to the containing elements via the "files" edge/relation and from the files/directories to their parents via the "parent" edge.
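
The two relations described above can be sketched in plain Java without any graph library. This is only an illustration under stated assumptions: <code>FsNode</code>, <code>scan</code> and <code>demo</code> are hypothetical names for this sketch and are not part of the SimpleGraph API. The sketch builds a small throwaway directory tree, follows the "files" relation downwards and keeps a "parent" reference on every node.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.util.ArrayList;
import java.util.List;

class FileSystemGraphSketch {

    // Hypothetical node type for this sketch; not the SimpleGraph FileNode class.
    static final class FsNode {
        final File file;
        final FsNode parent;                          // "parent" edge, null at the start node
        final List<FsNode> files = new ArrayList<>(); // "files" edges to contained entries

        FsNode(File file, FsNode parent) {
            this.file = file;
            this.parent = parent;
        }

        String label() {
            return file.isDirectory() ? "Directory" : "File";
        }

        // Number of nodes reachable via "files" edges, including this node.
        int count() {
            int total = 1;
            for (FsNode child : files) {
                total += child.count();
            }
            return total;
        }
    }

    // Follow the "files" relation recursively, at most maxDepth levels deep.
    static FsNode scan(File file, FsNode parent, int maxDepth) {
        FsNode node = new FsNode(file, parent);
        File[] children = file.listFiles(); // null for plain files
        if (children != null && maxDepth > 0) {
            for (File child : children) {
                node.files.add(scan(child, node, maxDepth - 1));
            }
        }
        return node;
    }

    // Builds a small throwaway tree below the temp directory and scans it.
    static FsNode demo() {
        try {
            File root = Files.createTempDirectory("src").toFile();
            File main = new File(root, "main");
            main.mkdir();
            new File(main, "header.txt").createNewFile();
            new File(root, "empty").mkdir(); // an empty directory is a leaf
            return scan(root, null, Integer.MAX_VALUE);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        FsNode start = demo();
        System.out.println(start.label() + " with " + start.count() + " nodes");
    }
}
```

Keeping both directions on the node mirrors the two edge labels: walking <code>files</code> goes down the tree, walking <code>parent</code> goes back up.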
 
== Example goal ==
 
We'd like to analyze some part of a filesystem and we'll use the "src" directory of the SimpleGraph project's source code as a starting point.
 
  
The graph below shows the source code structure for the SimpleGraph project with the root of the tree being the "src" directory. The nodes are clickable and will lead you to the corresponding file representation on github.
+
  airport --> airport: route
<graphviz>
+
   note on link: 43400
   digraph FileSystemGraph {
+
 
    rankdir="RL";
+
  continent --> airport: contains
        "header.txt" -> "etc" [ label="parent"]
+
  note on link: 6748
        "SimpleGraphImpl.java" -> "impl" [ label="parent"]
+
</uml>
        "test" -> "src" [ label="parent"]
+
 
        "java" -> "main" [ label="parent"]
+
== JUnit Test case ==
        "RythmContext.java" -> "rythm" [ label="parent"]
+
We'd like to read in the air-routes graph described above and create an Excel workbook from it.
        "etc" -> "src" [ label="parent"]
+
see [https://github.com/BITPlan/com.bitplan.simplegraph/blob/master/simplegraph-excel/src/test/java/com/bitplan/simplegraph/excel/TestExcelSystem.java TestExcelSystem.java]
        "rythm" -> "main" [ label="parent"]
+
=== Java Source Code ===
        "test.rythm" -> "rythm" [ label="parent"]
 
        "main" -> "src" [ label="parent"]
 
        "com" -> "java" [ label="parent"]
 
        "graphvizTree.rythm" -> "rythm" [ label="parent" URL="https://github.com/BITPlan/com.bitplan.simplegraph/blob/master/src/main/rythm/graphvizTree.rythm"]
 
        "air-routes-small.graphml" -> "test" [ label="parent"]
 
        "bitplan" -> "com" [ label="parent"]
 
        "java" -> "test" [ label="parent"]
 
        "filesystem" -> "bitplan" [ label="parent"]
 
        "air-routes.graphml" -> "test" [ label="parent"]
 
        "simplegraph" -> "bitplan" [ label="parent"]
 
        "com" -> "java" [ label="parent"]
 
        "rythm" -> "bitplan" [ label="parent"]
 
        "bitplan" -> "com" [ label="parent"]
 
        "FileSystem.java" -> "filesystem" [ label="parent"]
 
        "simplegraph" -> "bitplan" [ label="parent"]
 
        "FileNode.java" -> "filesystem" [ label="parent"]
 
        "BaseTest.java" -> "simplegraph" [ label="parent"]
 
        "impl" -> "simplegraph" [ label="parent"]
 
        "TestRythm.java" -> "simplegraph" [ label="parent"]
 
        "SimpleGraph.java" -> "simplegraph" [ label="parent"]
 
        "TestTinkerPop3.java" -> "simplegraph" [ label="parent"]
 
        "SimpleNode.java" -> "simplegraph" [ label="parent"]
 
        "TestFileSystem.java" -> "simplegraph" [ label="parent"]
 
        "SimpleSystem.java" -> "simplegraph" [ label="parent"]
 
        "TestDebug.java" -> "simplegraph" [ label="parent"]
 
        "SimpleSystemImpl.java" -> "impl" [ label="parent"]
 
        "TestSuite.java" -> "simplegraph" [ label="parent"]
 
        "SimpleNodeImpl.java" -> "impl" [ label="parent"]
 
  }
 
</graphviz>
 
== explanation ==
 
=== creating the graph ===
 
This graph visualization has been produced with the following Java lines which make sure that the
 
"src" Directory can be handled as a gremlin graph:
 
 
<source lang='java'>
 
<source lang='java'>
// create a new FileSystem access supplying the result as a SimpleSystem API
+
  ExcelSystem es = new ExcelSystem();
SimpleSystem fs=new FileSystem();
+
  Graph graph = TestTinkerPop3.getAirRoutes();
// connect to this system with no extra information (e.g. no credentials) and move to the "src" node
+
  GraphTraversalSource g = graph.traversal();
SimpleNode start = fs.connect("").moveTo("src");
+
  Workbook wb = es.createWorkBook(g);
// do gremlin style out traversals recursively to any depth
+
  assertEquals(6, wb.getNumberOfSheets());
start.recursiveOut("files",Integer.MAX_VALUE);
+
  es.save(wb, testAirRouteFileName);
 
</source>
 
</source>
=== converting the graph to graphviz ===
+
View the resulting [[File:Air-routes.xlsx]] Excel file to
The graph is now available and can be traversed to create a graphviz version of it. We use
+
see what the tabular version of the graph looks like.
the {{Rythm}} template engine to do so. Within Rythm you can use Java code.
+
}}
 
+
= Links =
 +
* [https://stackoverflow.com/questions/tagged/simplegraph SimpleGraph Stackoverflow Questions]
 +
* [https://groups.google.com/forum/#!forum/simplegraph Google SimpleGraph Discussion Group]
  
 +
= Documentation =
 +
{{Article|title=Links|text=
 +
* {{Link|target=SimpleGraph-Installation}}
 +
* {{Link|target=SimpleGraph-Tutorial}}
 +
* {{Link|target=SimpleGraph-Core}}
 
}}
 
}}
 
 
[[Category:frontend]]
 
[[Category:frontend]]
 
[[Category:SiGNaL]]
 
[[Category:SiGNaL]]
 +
[[Category:SimpleGraph]]

Latest revision as of 12:02, 7 March 2021

SimpleGraph

OsProject
id  com.bitplan.simplegraph
state  
owner  BITPlan
title  SimpleGraph System API wrapper
url  https://github.com/BITPlan/com.bitplan.simplegraph
version  0.0.5
description  
date  2019-03-23
since  
until  

SimpleGraph is an open source project that wraps system APIs so that graph algorithms and storage can be applied. Apache TinkerPop/Gremlin is used as the implementation.



SimpleGraph uses the Hub and Spoke and Adapter patterns heavily.

IntegrationSpaghetti.gif MessageBroker.gif
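
How the two patterns fit together can be sketched in plain Java. The sketch assumes a single hub-side interface and one adapter ("spoke") per wrapped system. All names (HubNode, FileSpoke, MapSpoke, describe) are hypothetical and chosen for illustration; they are not the SimpleGraph API.

```java
import java.io.File;
import java.util.LinkedHashMap;
import java.util.Map;

class HubAndSpokeSketch {

    // Hypothetical hub-side contract: every wrapped item looks like this to graph code.
    interface HubNode {
        String label();
        Map<String, Object> properties();
    }

    // Spoke 1: adapter that presents a java.io.File as a HubNode.
    static final class FileSpoke implements HubNode {
        private final File file;

        FileSpoke(File file) {
            this.file = file;
        }

        @Override
        public String label() {
            return file.isDirectory() ? "Directory" : "File";
        }

        @Override
        public Map<String, Object> properties() {
            Map<String, Object> props = new LinkedHashMap<>();
            props.put("name", file.getName());
            return props;
        }
    }

    // Spoke 2: adapter that presents a plain key/value map as a HubNode.
    static final class MapSpoke implements HubNode {
        private final String label;
        private final Map<String, Object> values;

        MapSpoke(String label, Map<String, Object> values) {
            this.label = label;
            this.values = new LinkedHashMap<>(values);
        }

        @Override
        public String label() {
            return label;
        }

        @Override
        public Map<String, Object> properties() {
            return values;
        }
    }

    // Hub-side code depends only on HubNode, never on a concrete system.
    static String describe(HubNode node) {
        return node.label() + node.properties();
    }

    public static void main(String[] args) {
        Map<String, Object> person = new LinkedHashMap<>();
        person.put("name", "Alice");
        System.out.println(describe(new MapSpoke("person", person)));
        System.out.println(describe(new FileSpoke(new File("no-such-file-for-sketch.txt"))));
    }
}
```

With n systems, each one needs a single adapter to the hub instead of up to n-1 point-to-point integrations, which is the difference the two pictures above illustrate.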

Modules

There are currently 21 Modules available for SimpleGraph. Each module wraps an "external" API to make the functions and data behind that API available for graph processing with Apache Tinkerpop / Gremlin.

Module Hub and Spoke

Module Details

{| class="wikitable"
! Module !! System wrapped !! API exposed !! Description
|-
| CalDAV || CalDAV || ical4j library for parsing and building iCalendar data models || makes Calendar data available via ical4j
|-
| CardDAV || CardDAV || || makes VCard data available
|-
| Excel || Excel || Apache POI XSSF/HSSF || makes Microsoft Excel workbooks accessible via the Apache POI API
|-
| FileSystem || FileSystem || java.io.File || makes your FileSystem accessible via the Java FileSystem API
|-
| GeoJSON || GeoJSON || GeoJSON support for Google gson library || makes GeoJSON data available
|-
| GitHub || GitHub || GitHub GraphQL API v4 || makes GitHub content accessible to Graph processing.
|-
| HTML || HTML || HTML Cleaner || makes HTML files accessible via the HTML Cleaner parser
|-
| JSON || JSON || JSON || makes JSON parse results accessible to Graph processing.
|-
| Java || Java || javaparser || makes Java code parse results accessible to Graph processing.
|-
| Mail || Mail || E-Mail access for rfc822 and MIME formatted Mailbox files (e.g. Thunderbird) || makes Mail data available via Apache Mime4J
|-
| MapSystem || MapSystem || java.util.Map || supplies a simple wrapper for a graph with nodes that have key/value pairs in the form of HashMaps. We would not really need this since Apache TinkerPop/Gremlin already supplies us with properties per node/vertex. Still, this system is useful as a helper system and to illustrate the wrapping concepts and possibilities of SimpleGraph.
|-
| MediaWiki || MediaWiki || MediaWiki API || makes MediaWiki site content accessible to Graph processing. It exposes the MediaWiki API using the mediawiki-japi library by BITPlan.
|-
| PDF || PDF || Apache PDFBox || makes Portable Document Format (PDF) files accessible via the Apache PDFBox® API
|-
| PowerPoint || PowerPoint || Apache POI XSLF/HSLF || makes Microsoft PowerPoint presentations accessible via the Apache POI API
|-
| SMW || SMW || Semantic MediaWiki API || makes Semantic MediaWiki accessible via the SMW API
|-
| SNMP || SNMP || SNMP4J Simple Network Management Protocol (SNMP) Java API || makes the Simple Network Management Protocol accessible via SNMP4J
|-
| SQL || SQL || Java Database Connectivity (JDBC) API || makes relational SQL databases accessible via the Java JDBC API.
|-
| TripleStore || TripleStore || SiDIF-TripleStore || makes BITPlan's SiDIF educational TripleStore accessible
|-
| WikiData || WikiData || WikiData Toolkit || makes WikiData data available via the Wikidata Toolkit API
|-
| Word || Word || Apache POI XWPF/HWPF || makes Microsoft Word documents accessible via the Apache POI API
|-
| XML || XML || org.w3c.dom || makes XML DOM parse results accessible to Graph processing.
|}

Introduction

Motivation

Solving IT problems across system boundaries can get very difficult. There may be a wealth of APIs that looks helpful at first, but given the diversity of approaches, a daunting task must be tackled to get reasonable results. More often than not, projects are not even started because the cost/benefit ratio is not good enough. SimpleGraph aims to supply unified graph API access to systems for which this makes sense. In fact, for most systems it makes sense to have a graph API. Quite a few problems become much easier to solve if the problem is subdivided with the goal of applying graph algorithms. See What happens if you see the world as a graph?


Use Cases

Mix and Match Office and other data

Let's assume we have a business that works in the following manner:

  1. There are price lists for product categories in Microsoft Excel files
  2. Product specifications are in PDF Format - the files are referenced in the Excel files
  3. Invoices are written in Microsoft Word
  4. There is an address book of customers in VCard format
  5. Orders are handled via e-mail (automatically produced by the company's shop website)
  6. Monthly reports are created with PowerPoint

In the future, the monthly reports should also give geographical information: essentially, revenue per region and sales count per product and region.

The 1000 dollar question: can this report be created efficiently and semi-manually with reasonable software effort? Sadly, it sounds more like a 10,000 or 100,000 dollar question.

There are at least 6 different systems involved. SimpleGraph simplifies accessing all 6 systems.
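
Once the data from these systems is reachable in one place, the requested report reduces to a join followed by an aggregation. The following plain-Java sketch illustrates only that last step. It assumes that orders (from the e-mails) and a customer-to-region mapping (from the VCard address book) have already been extracted; the Order class, the sample data and all method names are hypothetical and not part of SimpleGraph.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class RegionReportSketch {

    // Hypothetical order record, e.g. as extracted from a shop e-mail.
    static final class Order {
        final String customer;
        final String product;
        final double amount;

        Order(String customer, String product, double amount) {
            this.customer = customer;
            this.product = product;
            this.amount = amount;
        }
    }

    // Join each order with the customer's region and sum the revenue per region.
    static Map<String, Double> revenuePerRegion(List<Order> orders, Map<String, String> regionByCustomer) {
        Map<String, Double> revenue = new TreeMap<>();
        for (Order order : orders) {
            String region = regionByCustomer.getOrDefault(order.customer, "unknown");
            revenue.merge(region, order.amount, Double::sum);
        }
        return revenue;
    }

    // Count the sales per product and region, keyed as "product/region".
    static Map<String, Integer> salesCountPerProductAndRegion(List<Order> orders, Map<String, String> regionByCustomer) {
        Map<String, Integer> counts = new TreeMap<>();
        for (Order order : orders) {
            String region = regionByCustomer.getOrDefault(order.customer, "unknown");
            counts.merge(order.product + "/" + region, 1, Integer::sum);
        }
        return counts;
    }

    // Made-up sample data for illustration only.
    static List<Order> sampleOrders() {
        List<Order> orders = new ArrayList<>();
        orders.add(new Order("alice", "widget", 100.0));
        orders.add(new Order("bob", "widget", 50.0));
        orders.add(new Order("alice", "gadget", 25.0));
        return orders;
    }

    static Map<String, String> sampleRegions() {
        Map<String, String> regions = new HashMap<>();
        regions.put("alice", "North");
        regions.put("bob", "South");
        return regions;
    }

    public static void main(String[] args) {
        System.out.println(revenuePerRegion(sampleOrders(), sampleRegions()));
        System.out.println(salesCountPerProductAndRegion(sampleOrders(), sampleRegions()));
    }
}
```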


Examples

Air Routes

The AirRoutes example is taken from Practical Gremlin: An Apache TinkerPop Tutorial by Kelvin Lawrence; see also https://github.com/krlawrence/graph.

JUnit Test case

We'd like to read in the air-routes graph described above and create an Excel workbook from it. See TestExcelSystem.java.

Java Source Code

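  // wrap Excel as a SimpleGraph system
  ExcelSystem es = new ExcelSystem();
  // load the air-routes sample graph and get a traversal source for it
  Graph graph = TestTinkerPop3.getAirRoutes();
  GraphTraversalSource g = graph.traversal();
  // convert the graph to an Excel workbook
  Workbook wb = es.createWorkBook(g);
  // the air-routes graph is expected to yield 6 sheets
  assertEquals(6, wb.getNumberOfSheets());
  // write the workbook to the test output file
  es.save(wb, testAirRouteFileName);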

View the resulting File:Air-routes.xlsx Excel file to see what the tabular version of the graph looks like.
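
The expected 6 sheets are consistent with one sheet per vertex label (airport, version, country, continent) plus one sheet per edge label (route, contains). That mapping is an assumption made here, not something the test source states. Under that assumption, the conversion from graph to table can be sketched in plain Java, with a hypothetical Element class standing in for TinkerPop vertices and edges and made-up sample values:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class GraphToSheetsSketch {

    // Minimal stand-in for a vertex or an edge: a label plus key/value properties.
    static final class Element {
        final String label;
        final Map<String, Object> properties = new LinkedHashMap<>();

        Element(String label, Object... keyValues) {
            this.label = label;
            for (int i = 0; i + 1 < keyValues.length; i += 2) {
                properties.put(String.valueOf(keyValues[i]), keyValues[i + 1]);
            }
        }
    }

    // One "sheet" per label; each element becomes one row of that sheet.
    static Map<String, List<Map<String, Object>>> toSheets(List<Element> elements) {
        Map<String, List<Map<String, Object>>> sheets = new TreeMap<>();
        for (Element element : elements) {
            sheets.computeIfAbsent(element.label, k -> new ArrayList<>())
                  .add(element.properties);
        }
        return sheets;
    }

    // Tiny hand-made sample that only reuses the labels of the air-routes graph.
    static List<Element> sample() {
        List<Element> elements = new ArrayList<>();
        elements.add(new Element("version", "code", "sample"));
        elements.add(new Element("airport", "code", "AAA"));
        elements.add(new Element("airport", "code", "BBB"));
        elements.add(new Element("country", "code", "XX"));
        elements.add(new Element("continent", "code", "YY"));
        elements.add(new Element("route", "from", "AAA", "to", "BBB"));
        elements.add(new Element("contains", "from", "YY", "to", "AAA"));
        return elements;
    }

    public static void main(String[] args) {
        Map<String, List<Map<String, Object>>> sheets = toSheets(sample());
        for (Map.Entry<String, List<Map<String, Object>>> sheet : sheets.entrySet()) {
            System.out.println(sheet.getKey() + ": " + sheet.getValue().size() + " row(s)");
        }
    }
}
```

Grouping by label keeps rows with the same property keys together, so each sheet can use the property names as its column headers.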

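Links

  * SimpleGraph Stackoverflow Questions: https://stackoverflow.com/questions/tagged/simplegraph
  * Google SimpleGraph Discussion Group: https://groups.google.com/forum/#!forum/simplegraph

Documentation

Links

  * SimpleGraph-Installation
  * SimpleGraph-Tutorial
  * SimpleGraph-Core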

