Talk:SimpleGraph-Excel: Difference between revisions

From BITPlan Wiki
Jump to navigation Jump to search
No edit summary
Line 48: Line 48:


== UML diagram ==
== UML diagram ==
<uml format='svg'>
<mermaid>
hide circle
classDiagram
 
    %% Excel Package
package Excel {
    class Workbook {
  class Workbook {
        +String name
    name
    }
  }
   
 
    class Sheet {
  class Sheet {
        +String name
    name
    }
  }
   
 
    class Row {
  class Row {
        +int rownumber
    rownumber
    }
  }
   
 
    class Cell {
  class Cell {
        +int columnnumber
    columnnumber
        +String value
    value
    }
  }
   
 
    class Range {
  class Range {
        +String startCell
    startCell
        +String endCell
    endCell
        +String name
    name
    }
  }
   
 
    class Formula {
  class Formula {
        +String expression
    expression
        +String result
    result
    }
  }
   
 
    class Reference {
  class Reference {
        +String address
    address
        +boolean isAbsolute
    isAbsolute
        +String sheetName
    sheetName
    }
  }
   
 
    class Comment {
  class Comment {
        +String author
     author
        +String text
     text
        +DateTime timestamp
     timestamp
    }
  }
   
 
    %% Formula AST Package
  Workbook "1" --> "n" Sheet
    class ASTNode {
  Sheet "1" --> "title 1" Row
        <<abstract>>
  Sheet "1" --> "n" Row
        +String nodeType
  Sheet "1" --> "n" Range
    }
  Row "1" --> "n" Cell
   
  Cell "0..1" --> "0..1" Formula
    class FunctionCall {
  Cell "0..1" --> "0..1" Comment
        +String functionName
  Range "1" --> "n" Cell
        +List arguments
  Formula "1" --> "n" Reference
    }
}
   
 
    class BinaryOperation {
package "Formula AST" {
        +String operator
  abstract class ASTNode {
        +ASTNode leftOperand
    nodeType
        +ASTNode rightOperand
  }
    }
 
   
  class FunctionCall {
    class LiteralValue {
    functionName
        +String value
    arguments
        +String dataType
  }
    }
 
   
  class BinaryOperation {
    %% TinkerPop Package
     operator
    class Property {
    leftOperand
        +String name
    rightOperand
        +String value
  }
    }
 
   
  class LiteralValue {
    class Vertex {
    value
        +String id
    dataType
        +String label
  }
    }
 
   
  ASTNode <|-- FunctionCall
    class Edge {
  ASTNode <|-- BinaryOperation
        +String id
  ASTNode <|-- LiteralValue
        +String label
  ASTNode <|-- Reference
    }
 
      
  FunctionCall "1" --> "n" ASTNode : arguments
     class Edges {
  BinaryOperation "1" --> "1" ASTNode : left
    }
  BinaryOperation "1" --> "1" ASTNode : right
   
  Formula "1" --> "1" ASTNode : rootNode
     class Vertices {
}
    }
 
   
package TinkerPop {
    %% Excel Package Relationships
  class Property {
    Workbook "1" --> "n" Sheet
     name
    Sheet "1" --> "1" Row : title
     value
    Sheet "1" --> "n" Row : data
  }
    Sheet "1" --> "n" Range
 
    Row "1" --> "n" Cell
  class Vertex {
    Cell "0..1" --> "0..1" Formula
     id
    Cell "0..1" --> "0..1" Comment
    label
    Range "1" --> "n" Cell
  }
    Formula "1" --> "n" Reference
 
   
  class Edge {
    %% Formula AST Relationships
    id
     ASTNode <|-- FunctionCall
    label
    ASTNode <|-- BinaryOperation
  }
    ASTNode <|-- LiteralValue
 
    ASTNode <|-- Reference
  class Edges {
    FunctionCall "1" --> "n" ASTNode : arguments
  }
    BinaryOperation "1" --> "1" ASTNode : left
 
    BinaryOperation "1" --> "1" ASTNode : right
  class Vertices {
    Formula "1" --> "1" ASTNode : rootNode
  }
      
 
     %% TinkerPop Relationships
  Edge --> "1 in" Vertex
     Edge --> "1" Vertex : in
  Edge --> "1 out" Vertex
    Edge --> "1" Vertex : out
  Vertex --> "n properties" Property
    Vertex --> "n" Property : properties
  Edge --> "n properties" Property
    Edge --> "n" Property : properties
  Edges --> "n edges" Edge
    Edges --> "n" Edge
  Vertices --> "n vertices" Vertex
    Vertices --> "n" Vertex
}
   
 
    %% Cross-package Mappings
Cell -- Property : maps to
    Cell .. Property : maps to
Edge -- Sheet : maps to
    Edge .. Sheet : maps to
Vertex -- Sheet : maps to
    Vertex .. Sheet : maps to
Range -- Vertices : can represent
    Range .. Vertices : can represent
Range -- Edges : can represent
    Range .. Edges : can represent
Formula -- Property : computed property
    Formula .. Property : computed property
Reference -- Vertex : references
    Reference .. Vertex : references
Reference -- Edge : references
    Reference .. Edge : references
Comment -- Property : metadata
    Comment .. Property : metadata
FunctionCall -- Property : graph operation
    FunctionCall .. Property : graph operation
BinaryOperation -- Property : computation
    BinaryOperation .. Property : computation
LiteralValue -- Property : constant value
    LiteralValue .. Property : constant value
</uml>
</mermaid>


== Bidirectional Transformation ==
== Bidirectional Transformation ==

Revision as of 10:08, 27 January 2026

Mental Model

Mapping rules

  1. Each Vertex type (by label) is converted to an Excel sheet
  2. Each Edge type (by label) is converted to an Excel sheet
  3. Each vertex is converted to a row
  4. Each vertex property is converted to a cell
  5. Each edge is converted to a row
  6. Each edge property is converted to a cell
  7. The header row for Vertices has a column for each property key
  8. The header for for Edges has a column for each property key + an in and and out column
  9. The in and out header columns for edges have the label of the corresponding vertex in parentheses

Extended Mapping Rules

Range Mapping

  1. Ranges can represent collections of vertices or edges
  2. Named ranges map to specific vertex/edge label types
  3. Each range spans multiple cells corresponding to multiple graph elements
  4. Range references in formulas map to graph traversals across multiple vertices/edges

Formula Mapping

  1. Formulas stored in cells can represent computed properties
  2. Formula results are mapped to property values in the graph
  3. Formulas can reference other vertices/edges through cell references
  4. Complex formulas map to graph traversal expressions

Reference Mapping

  1. Cell references in formulas map to vertex/edge relationships
  2. Absolute references ($A$1) map to fixed vertex IDs
  3. Relative references (A1) map to traversal patterns
  4. Cross-sheet references map to cross-label relationships between different vertex/edge types

Comment Mapping

  1. Comments on cells map to metadata properties on vertices/edges
  2. Comment author maps to a metadata property
  3. Comment text maps to annotation content
  4. Comment timestamps preserve temporal information in the graph

Formula AST Mapping

  1. The Abstract Syntax Tree of formulas maps to graph operations
  2. FunctionCall nodes map to graph traversal operations:
    1. SUM() → aggregate property values across connected edges
    2. VLOOKUP() → vertex lookup by property value
    3. IF() → conditional edge creation or property assignment
  3. BinaryOperation nodes map to property computations or edge weights:
    1. Arithmetic operators (+, -, *, /) → computed properties from multiple vertex properties
    2. Comparison operators (=, <, >) → edge filtering conditions
  4. LiteralValue nodes map to constant property values directly stored in the graph

UML diagram

classDiagram  %% Excel Package class Workbook { +String name } class Sheet { +String name } class Row { +int rownumber } class Cell { +int columnnumber +String value } class Range { +String startCell +String endCell +String name } class Formula { +String expression +String result } class Reference { +String address +boolean isAbsolute +String sheetName } class Comment { +String author +String text +DateTime timestamp }  %% Formula AST Package class ASTNode { <<abstract>> +String nodeType } class FunctionCall { +String functionName +List arguments } class BinaryOperation { +String operator +ASTNode leftOperand +ASTNode rightOperand } class LiteralValue { +String value +String dataType }  %% TinkerPop Package class Property { +String name +String value } class Vertex { +String id +String label } class Edge { +String id +String label } class Edges { } class Vertices { }  %% Excel Package Relationships Workbook "1" --> "n" Sheet Sheet "1" --> "1" Row : title Sheet "1" --> "n" Row : data Sheet "1" --> "n" Range Row "1" --> "n" Cell Cell "0..1" --> "0..1" Formula Cell "0..1" --> "0..1" Comment Range "1" --> "n" Cell Formula "1" --> "n" Reference  %% Formula AST Relationships ASTNode <|-- FunctionCall ASTNode <|-- BinaryOperation ASTNode <|-- LiteralValue ASTNode <|-- Reference FunctionCall "1" --> "n" ASTNode : arguments BinaryOperation "1" --> "1" ASTNode : left BinaryOperation "1" --> "1" ASTNode : right Formula "1" --> "1" ASTNode : rootNode  %% TinkerPop Relationships Edge --> "1" Vertex : in Edge --> "1" Vertex : out Vertex --> "n" Property : properties Edge --> "n" Property : properties Edges --> "n" Edge Vertices --> "n" Vertex  %% Cross-package Mappings Cell .. Property : maps to Edge .. Sheet : maps to Vertex .. Sheet : maps to Range .. Vertices : can represent Range .. Edges : can represent Formula .. Property : computed property Reference .. Vertex : references Reference .. Edge : references Comment .. Property : metadata FunctionCall .. Property : graph operation BinaryOperation .. Property : computation LiteralValue .. Property : constant value

Bidirectional Transformation

Excel to Graph

  1. Parse workbook structure
  2. Create vertex types from sheet names (non-edge sheets)
  3. Create edge types from sheet names (sheets with in/out columns)
  4. For each row in vertex sheets:
    1. Create vertex with label = sheet name
    2. Map each cell to a property
    3. Evaluate formulas and store results
    4. Attach comments as metadata properties
  5. For each row in edge sheets:
    1. Create edge with label = sheet name
    2. Resolve in/out references to vertices
    3. Map remaining cells to edge properties
  6. Process named ranges as vertex/edge collections
  7. Build formula AST for computed properties

Graph to Excel

  1. Create sheet for each unique vertex label
  2. Create sheet for each unique edge label
  3. For each vertex:
    1. Create row in corresponding sheet
    2. Create cell for each property
    3. Add comments from metadata properties
  4. For each edge:
    1. Create row in corresponding sheet
    2. Add in/out columns with vertex references
    3. Create cell for each property
  5. Create named ranges for logical vertex/edge groupings
  6. Reconstruct formulas from computed properties with AST

Example Mapping

Graph Example

// Vertices
person1 = g.addV('Person').property('name', 'Alice').property('age', 30).next()
person2 = g.addV('Person').property('name', 'Bob').property('age', 25).next()
company1 = g.addV('Company').property('name', 'TechCorp').property('revenue', 1000000).next()

// Edges
g.addE('WORKS_AT').from(person1).to(company1).property('salary', 75000).next()
g.addE('WORKS_AT').from(person2).to(company1).property('salary', 65000).next()
g.addE('KNOWS').from(person1).to(person2).property('since', 2020).next()

Excel Example

Sheet: Person

name age totalSalary (Formula)
Alice 30 =VLOOKUP(A2,WORKS_AT!A:C,3,FALSE)
Bob 25 =VLOOKUP(A3,WORKS_AT!A:C,3,FALSE)

Sheet: Company

name revenue avgEmployeeSalary (Formula)
TechCorp 1000000 =AVERAGE(WORKS_AT!C:C)

Sheet: WORKS_AT (Edge)

out (Person) in (Company) salary
Alice TechCorp 75000
Bob TechCorp 65000

Sheet: KNOWS (Edge)

out (Person) in (Person) since
Alice Bob 2020

Named Ranges:

  • AllEmployees = Person!A2:B3
  • CompanyFinancials = Company!A2:C2
  • Salaries = WORKS_AT!C2:C3

Comments:

  • Cell Person!A2: "CEO of department" (Author: Admin, Date: 2024-01-15)
  • Cell Company!B2: "Annual revenue in USD" (Author: Finance, Date: 2024-01-10)

Implementation Notes

  1. Formula evaluation must handle circular references between sheets
  2. Cross-sheet references require maintaining vertex/edge ID mappings
  3. Named ranges improve query performance by pre-defining common traversals
  4. Comments preserve audit trail and metadata
  5. AST parsing enables complex graph queries from Excel formulas
  6. Absolute vs relative references determine query pattern (fixed vs traversal)