Talk:SimpleGraph-Excel: Difference between revisions

From BITPlan Wiki
Jump to navigation Jump to search
Tag: Manual revert
 
(8 intermediate revisions by the same user not shown)
Line 48: Line 48:


== UML diagram ==
== UML diagram ==
<mermaid>
<uml format='svg'>
classDiagram
hide circle
     %% Excel Package
 
     class Workbook {
package Excel {
        +String name
  class Workbook {
     }
    +name: String
    +sheets: List<Sheet>
  }
  note right of Workbook
     **Top-level container**
    Represents an Excel file
     Contains multiple sheets
  end note
 
  class Sheet {
    +name: String
    +index: int
    +cells: Map<Address, Cell>
  }
  note right of Sheet
    **Worksheet/Tab**
    Named collection of cells
    Organized in grid (rows/columns)
  end note
 
  class Cell {
    +row: int
    +column: int
    +value: Object
    +formulaText: String
     +hasFormula(): boolean
  }
  note right of Cell
    **Core data unit**
    - **value**: Evaluated result (number, text, etc.)
    - **formulaText**: Original formula string (e.g., "=SUM(A1:A10)")
    - **formula**: Parsed AST for evaluation
      
      
     class Sheet {
     Example:
        +String name
    formulaText = "=A1+B1*2"
     }
    formula = BinaryOp(+, CellRef(A1), BinaryOp(*, CellRef(B1), Literal(2)))
    value = 15.5 (after evaluation)
  end note
 
  class CellStyle {
    +font: String
    +fontSize: int
    +color: Color
    +alignment: String
    +border: String
  }
  note bottom of CellStyle
    **Formatting properties**
    Visual appearance only,
    doesn't affect calculations
  end note
 
  class Comment {
    +text: String
    +author: String
     +timestamp: DateTime
  }
 
  abstract class AstNode {
    +evaluate(context): Object
    +toString(): String
  }
  note right of AstNode
    **Formula Abstract Syntax Tree**
    Parsed representation of formula
    for evaluation and manipulation
      
      
     class Row {
     Each node can evaluate itself
        +int rownumber
    given the spreadsheet context
    }
  end note
   
 
    class Cell {
  class Literal {
        +int columnnumber
    +value: Object
        +String value
  }
    }
  note bottom of Literal
   
     Constants: numbers (42),
     class CellRange {
     strings ("hello"),
        +String startCell
     booleans (TRUE)
        +String endCell
  end note
        +String name
 
     }
  class CellReference {
      
    +address: String
    class Formula {
    +isAbsolute: boolean
        +String expression
  }
        +String result
  note bottom of CellReference
    }
     References: A1 (relative),
   
     $A$1 (absolute),
    class CellReference {
     $A1 (mixed)
        +String address
  end note
        +boolean isAbsolute
 
        +String sheetName
  class Function {
    }
    +name: String
      
     +arguments: List<AstNode>
     class Comment {
  }
        +String author
  note bottom of Function
        +String text
     Built-in functions:
        +DateTime timestamp
     SUM, AVERAGE, IF,
     }
     VLOOKUP, etc.
   
  end note
    %% Formula AST Package
 
    class ASTNode {
  class BinaryOp {
        +String nodeType
    +operator: String
    }
    +left: AstNode
      
     +right: AstNode
    class FunctionCall {
  }
        +String functionName
  note bottom of BinaryOp
        +List arguments
     Operators: +, -, *, /,
    }
     ^, =, <>, <, >, etc.
   
  end note
     class BinaryOperation {
 
        +String operator
  ' Hierarchy
     }
  Workbook "1" *-- "1..*" Sheet : contains
      
  Sheet "1" *-- "*" Cell : contains
    class LiteralValue {
 
        +String value
  ' Cell properties
        +String dataType
  Cell "1" *-- "1" CellStyle : styled by
    }
  Cell "1" o-- "0..1" AstNode : formula >
   
  Cell "1" -- "0..*" Comment : annotated by
    %% TinkerPop Package
 
    class GraphProperty {
  note on link
        +String name
     formulaText stores original string
        +String value
     formula stores parsed AST
    }
     Both are optional (cells can be plain values)
      
  end note
    class GraphVertex {
 
        +String id
  ' Formula AST hierarchy
        +String label
  AstNode <|-- Literal
    }
  AstNode <|-- CellReference
   
  AstNode <|-- Function
     class GraphEdge {
  AstNode <|-- BinaryOp
        +String id
 
        +String label
  Function "1" o-- "*" AstNode : has args
     }
  BinaryOp "1" o-- "1" AstNode : left operand
   
  BinaryOp "1" o-- "1" AstNode : right operand
    class EdgeCollection {
}
    }
 
   
note bottom
    class VertexCollection {
  **Key Design Points:**
    }
  1. Cell stores both formula text (user input) and parsed AST
   
  2. AST enables evaluation, dependency tracking, and formula editing
    %% Excel Package Relationships
  3. Separation of data (value), presentation (style), and logic (formula)
    Workbook "1" --> "n" Sheet
  4. AstNode tree can be traversed for recalculation when dependencies change
    Sheet "1" --> "n" Row
end note
    Sheet "1" --> "n" CellRange
</uml>
    Row "1" --> "n" Cell
    Cell "0..1" --> "0..1" Formula
    Cell "0..1" --> "0..1" Comment
     CellRange "1" --> "n" Cell
     Formula "1" --> "n" CellReference
      
    %% Formula AST Relationships
    ASTNode <|-- FunctionCall
    ASTNode <|-- BinaryOperation
    ASTNode <|-- LiteralValue
    ASTNode <|-- CellReference
    FunctionCall "1" --> "n" ASTNode
    BinaryOperation "1" --> "2" ASTNode
    Formula "1" --> "1" ASTNode
   
    %% TinkerPop Relationships
    GraphEdge --> "1" GraphVertex : in
    GraphEdge --> "1" GraphVertex : out
    GraphVertex --> "n" GraphProperty
    GraphEdge --> "n" GraphProperty
    EdgeCollection --> "n" GraphEdge
    VertexCollection --> "n" GraphVertex
</mermaid>


== Bidirectional Transformation ==
== Bidirectional Transformation ==

Latest revision as of 16:13, 27 January 2026

Mental Model

Mapping rules

  1. Each Vertex type (by label) is converted to an Excel sheet
  2. Each Edge type (by label) is converted to an Excel sheet
  3. Each vertex is converted to a row
  4. Each vertex property is converted to a cell
  5. Each edge is converted to a row
  6. Each edge property is converted to a cell
  7. The header row for Vertices has a column for each property key
  8. The header for for Edges has a column for each property key + an in and and out column
  9. The in and out header columns for edges have the label of the corresponding vertex in parentheses

Extended Mapping Rules

Range Mapping

  1. Ranges can represent collections of vertices or edges
  2. Named ranges map to specific vertex/edge label types
  3. Each range spans multiple cells corresponding to multiple graph elements
  4. Range references in formulas map to graph traversals across multiple vertices/edges

Formula Mapping

  1. Formulas stored in cells can represent computed properties
  2. Formula results are mapped to property values in the graph
  3. Formulas can reference other vertices/edges through cell references
  4. Complex formulas map to graph traversal expressions

Reference Mapping

  1. Cell references in formulas map to vertex/edge relationships
  2. Absolute references ($A$1) map to fixed vertex IDs
  3. Relative references (A1) map to traversal patterns
  4. Cross-sheet references map to cross-label relationships between different vertex/edge types

Comment Mapping

  1. Comments on cells map to metadata properties on vertices/edges
  2. Comment author maps to a metadata property
  3. Comment text maps to annotation content
  4. Comment timestamps preserve temporal information in the graph

Formula AST Mapping

  1. The Abstract Syntax Tree of formulas maps to graph operations
  2. FunctionCall nodes map to graph traversal operations:
    1. SUM() → aggregate property values across connected edges
    2. VLOOKUP() → vertex lookup by property value
    3. IF() → conditional edge creation or property assignment
  3. BinaryOperation nodes map to property computations or edge weights:
    1. Arithmetic operators (+, -, *, /) → computed properties from multiple vertex properties
    2. Comparison operators (=, <, >) → edge filtering conditions
  4. LiteralValue nodes map to constant property values directly stored in the graph

UML diagram

Bidirectional Transformation

Excel to Graph

  1. Parse workbook structure
  2. Create vertex types from sheet names (non-edge sheets)
  3. Create edge types from sheet names (sheets with in/out columns)
  4. For each row in vertex sheets:
    1. Create vertex with label = sheet name
    2. Map each cell to a property
    3. Evaluate formulas and store results
    4. Attach comments as metadata properties
  5. For each row in edge sheets:
    1. Create edge with label = sheet name
    2. Resolve in/out references to vertices
    3. Map remaining cells to edge properties
  6. Process named ranges as vertex/edge collections
  7. Build formula AST for computed properties

Graph to Excel

  1. Create sheet for each unique vertex label
  2. Create sheet for each unique edge label
  3. For each vertex:
    1. Create row in corresponding sheet
    2. Create cell for each property
    3. Add comments from metadata properties
  4. For each edge:
    1. Create row in corresponding sheet
    2. Add in/out columns with vertex references
    3. Create cell for each property
  5. Create named ranges for logical vertex/edge groupings
  6. Reconstruct formulas from computed properties with AST

Example Mapping

Graph Example

// Vertices
person1 = g.addV('Person').property('name', 'Alice').property('age', 30).next()
person2 = g.addV('Person').property('name', 'Bob').property('age', 25).next()
company1 = g.addV('Company').property('name', 'TechCorp').property('revenue', 1000000).next()

// Edges
g.addE('WORKS_AT').from(person1).to(company1).property('salary', 75000).next()
g.addE('WORKS_AT').from(person2).to(company1).property('salary', 65000).next()
g.addE('KNOWS').from(person1).to(person2).property('since', 2020).next()

Excel Example

Sheet: Person

name age totalSalary (Formula)
Alice 30 =VLOOKUP(A2,WORKS_AT!A:C,3,FALSE)
Bob 25 =VLOOKUP(A3,WORKS_AT!A:C,3,FALSE)

Sheet: Company

name revenue avgEmployeeSalary (Formula)
TechCorp 1000000 =AVERAGE(WORKS_AT!C:C)

Sheet: WORKS_AT (Edge)

out (Person) in (Company) salary
Alice TechCorp 75000
Bob TechCorp 65000

Sheet: KNOWS (Edge)

out (Person) in (Person) since
Alice Bob 2020

Named Ranges:

  • AllEmployees = Person!A2:B3
  • CompanyFinancials = Company!A2:C2
  • Salaries = WORKS_AT!C2:C3

Comments:

  • Cell Person!A2: "CEO of department" (Author: Admin, Date: 2024-01-15)
  • Cell Company!B2: "Annual revenue in USD" (Author: Finance, Date: 2024-01-10)

Implementation Notes

  1. Formula evaluation must handle circular references between sheets
  2. Cross-sheet references require maintaining vertex/edge ID mappings
  3. Named ranges improve query performance by pre-defining common traversals
  4. Comments preserve audit trail and metadata
  5. AST parsing enables complex graph queries from Excel formulas
  6. Absolute vs relative references determine query pattern (fixed vs traversal)