Talk:SimpleGraph-Excel: Difference between revisions

From BITPlan Wiki
Jump to navigation Jump to search
Tag: Manual revert
 
(11 intermediate revisions by 2 users not shown)
Line 48: Line 48:


== UML diagram ==
== UML diagram ==
<uml>
<uml format='svg'>
hide circle
hide circle


package Excel {
package Excel {
   class Workbook {
   class Workbook {  
     name
     +name: String
    +sheets: List<Sheet>
   }
   }
  note right of Workbook
    **Top-level container**
    Represents an Excel file
    Contains multiple sheets
  end note
    
    
   class Sheet {
   class Sheet {
     name
     +name: String
  }
    +index: int
 
     +cells: Map<Address, Cell>
  class Row {
     rownumber
   }
   }
  note right of Sheet
    **Worksheet/Tab**
    Named collection of cells
    Organized in grid (rows/columns)
  end note
    
    
   class Cell {
   class Cell {
     columnnumber
     +row: int
     value
     +column: int
  }
     +value: Object
 
     +formulaText: String
  class Range {
     +hasFormula(): boolean
     startCell
    endCell
    name
  }
 
  class Formula {
     expression
     result
   }
   }
  note right of Cell
    **Core data unit**
    - **value**: Evaluated result (number, text, etc.)
    - **formulaText**: Original formula string (e.g., "=SUM(A1:A10)")
    - **formula**: Parsed AST for evaluation
   
    Example:
    formulaText = "=A1+B1*2"
    formula = BinaryOp(+, CellRef(A1), BinaryOp(*, CellRef(B1), Literal(2)))
    value = 15.5 (after evaluation)
  end note
    
    
   class Reference {
   class CellStyle {
     address
     +font: String
     isAbsolute
     +fontSize: int
     sheetName
     +color: Color
    +alignment: String
    +border: String
   }
   }
  note bottom of CellStyle
    **Formatting properties**
    Visual appearance only,
    doesn't affect calculations
  end note
    
    
   class Comment {
   class Comment {
     author
     +text: String
     text
     +author: String
     timestamp
     +timestamp: DateTime
   }
   }
    
    
  Workbook "1" --> "n" Sheet
   abstract class AstNode {
  Sheet "1" --> "title 1" Row
     +evaluate(context): Object
  Sheet "1" --> "n" Row
    +toString(): String
  Sheet "1" --> "n" Range
  Row "1" --> "n" Cell
  Cell "0..1" --> "0..1" Formula
  Cell "0..1" --> "0..1" Comment
  Range "1" --> "n" Cell
  Formula "1" --> "n" Reference
}
 
package "Formula AST" {
   abstract class ASTNode {
     nodeType
   }
   }
  note right of AstNode
    **Formula Abstract Syntax Tree**
    Parsed representation of formula
    for evaluation and manipulation
   
    Each node can evaluate itself
    given the spreadsheet context
  end note
    
    
   class FunctionCall {
   class Literal {
     functionName
     +value: Object
    arguments
   }
   }
  note bottom of Literal
    Constants: numbers (42),
    strings ("hello"),
    booleans (TRUE)
  end note
    
    
   class BinaryOperation {
   class CellReference {
     operator
     +address: String
     leftOperand
     +isAbsolute: boolean
    rightOperand
   }
   }
  note bottom of CellReference
    References: A1 (relative),
    $A$1 (absolute),
    $A1 (mixed)
  end note
    
    
   class LiteralValue {
   class Function {
     value
     +name: String
     dataType
     +arguments: List<AstNode>
   }
   }
  note bottom of Function
    Built-in functions:
    SUM, AVERAGE, IF,
    VLOOKUP, etc.
  end note
    
    
   ASTNode <|-- FunctionCall
   class BinaryOp {
  ASTNode <|-- BinaryOperation
    +operator: String
  ASTNode <|-- LiteralValue
    +left: AstNode
  ASTNode <|-- Reference
    +right: AstNode
 
  FunctionCall "1" --> "n" ASTNode : arguments
  BinaryOperation "1" --> "1" ASTNode : left
  BinaryOperation "1" --> "1" ASTNode : right
  Formula "1" --> "1" ASTNode : rootNode
}
 
package TinkerPop {
  class Property {
    name
    value
   }
   }
  note bottom of BinaryOp
    Operators: +, -, *, /,
    ^, =, <>, <, >, etc.
  end note
    
    
   class Vertex {
   ' Hierarchy
    id
  Workbook "1" *-- "1..*" Sheet : contains
    label
   Sheet "1" *-- "*" Cell : contains
   }
    
    
   class Edge {
   ' Cell properties
    id
  Cell "1" *-- "1" CellStyle : styled by
    label
  Cell "1" o-- "0..1" AstNode : formula >
   }
   Cell "1" -- "0..*" Comment : annotated by
    
    
   class Edges {
   note on link
   }
    formulaText stores original string
    formula stores parsed AST
    Both are optional (cells can be plain values)
   end note
    
    
   class Vertices {
   ' Formula AST hierarchy
   }
   AstNode <|-- Literal
  AstNode <|-- CellReference
  AstNode <|-- Function
  AstNode <|-- BinaryOp
    
    
   Edge --> "1 in" Vertex
   Function "1" o-- "*" AstNode : has args
  Edge --> "1 out" Vertex
   BinaryOp "1" o-- "1" AstNode : left operand
   Vertex --> "n properties" Property
   BinaryOp "1" o-- "1" AstNode : right operand
  Edge --> "n properties" Property
   Edges --> "n edges" Edge
  Vertices --> "n vertices" Vertex
}
}


Cell -- Property : maps to
note bottom
Edge -- Sheet : maps to
  **Key Design Points:**
Vertex -- Sheet : maps to
  1. Cell stores both formula text (user input) and parsed AST
Range -- Vertices : can represent
  2. AST enables evaluation, dependency tracking, and formula editing
Range -- Edges : can represent
  3. Separation of data (value), presentation (style), and logic (formula)
Formula -- Property : computed property
  4. AstNode tree can be traversed for recalculation when dependencies change
Reference -- Vertex : references
end note
Reference -- Edge : references
Comment -- Property : metadata
FunctionCall -- Property : graph operation
BinaryOperation -- Property : computation
LiteralValue -- Property : constant value
</uml>
</uml>



Latest revision as of 16:13, 27 January 2026

Mental Model

Mapping rules

  1. Each Vertex type (by label) is converted to an Excel sheet
  2. Each Edge type (by label) is converted to an Excel sheet
  3. Each vertex is converted to a row
  4. Each vertex property is converted to a cell
  5. Each edge is converted to a row
  6. Each edge property is converted to a cell
  7. The header row for Vertices has a column for each property key
  8. The header for for Edges has a column for each property key + an in and and out column
  9. The in and out header columns for edges have the label of the corresponding vertex in parentheses

Extended Mapping Rules

Range Mapping

  1. Ranges can represent collections of vertices or edges
  2. Named ranges map to specific vertex/edge label types
  3. Each range spans multiple cells corresponding to multiple graph elements
  4. Range references in formulas map to graph traversals across multiple vertices/edges

Formula Mapping

  1. Formulas stored in cells can represent computed properties
  2. Formula results are mapped to property values in the graph
  3. Formulas can reference other vertices/edges through cell references
  4. Complex formulas map to graph traversal expressions

Reference Mapping

  1. Cell references in formulas map to vertex/edge relationships
  2. Absolute references ($A$1) map to fixed vertex IDs
  3. Relative references (A1) map to traversal patterns
  4. Cross-sheet references map to cross-label relationships between different vertex/edge types

Comment Mapping

  1. Comments on cells map to metadata properties on vertices/edges
  2. Comment author maps to a metadata property
  3. Comment text maps to annotation content
  4. Comment timestamps preserve temporal information in the graph

Formula AST Mapping

  1. The Abstract Syntax Tree of formulas maps to graph operations
  2. FunctionCall nodes map to graph traversal operations:
    1. SUM() → aggregate property values across connected edges
    2. VLOOKUP() → vertex lookup by property value
    3. IF() → conditional edge creation or property assignment
  3. BinaryOperation nodes map to property computations or edge weights:
    1. Arithmetic operators (+, -, *, /) → computed properties from multiple vertex properties
    2. Comparison operators (=, <, >) → edge filtering conditions
  4. LiteralValue nodes map to constant property values directly stored in the graph

UML diagram

Bidirectional Transformation

Excel to Graph

  1. Parse workbook structure
  2. Create vertex types from sheet names (non-edge sheets)
  3. Create edge types from sheet names (sheets with in/out columns)
  4. For each row in vertex sheets:
    1. Create vertex with label = sheet name
    2. Map each cell to a property
    3. Evaluate formulas and store results
    4. Attach comments as metadata properties
  5. For each row in edge sheets:
    1. Create edge with label = sheet name
    2. Resolve in/out references to vertices
    3. Map remaining cells to edge properties
  6. Process named ranges as vertex/edge collections
  7. Build formula AST for computed properties

Graph to Excel

  1. Create sheet for each unique vertex label
  2. Create sheet for each unique edge label
  3. For each vertex:
    1. Create row in corresponding sheet
    2. Create cell for each property
    3. Add comments from metadata properties
  4. For each edge:
    1. Create row in corresponding sheet
    2. Add in/out columns with vertex references
    3. Create cell for each property
  5. Create named ranges for logical vertex/edge groupings
  6. Reconstruct formulas from computed properties with AST

Example Mapping

Graph Example

// Vertices
person1 = g.addV('Person').property('name', 'Alice').property('age', 30).next()
person2 = g.addV('Person').property('name', 'Bob').property('age', 25).next()
company1 = g.addV('Company').property('name', 'TechCorp').property('revenue', 1000000).next()

// Edges
g.addE('WORKS_AT').from(person1).to(company1).property('salary', 75000).next()
g.addE('WORKS_AT').from(person2).to(company1).property('salary', 65000).next()
g.addE('KNOWS').from(person1).to(person2).property('since', 2020).next()

Excel Example

Sheet: Person

name age totalSalary (Formula)
Alice 30 =VLOOKUP(A2,WORKS_AT!A:C,3,FALSE)
Bob 25 =VLOOKUP(A3,WORKS_AT!A:C,3,FALSE)

Sheet: Company

name revenue avgEmployeeSalary (Formula)
TechCorp 1000000 =AVERAGE(WORKS_AT!C:C)

Sheet: WORKS_AT (Edge)

out (Person) in (Company) salary
Alice TechCorp 75000
Bob TechCorp 65000

Sheet: KNOWS (Edge)

out (Person) in (Person) since
Alice Bob 2020

Named Ranges:

  • AllEmployees = Person!A2:B3
  • CompanyFinancials = Company!A2:C2
  • Salaries = WORKS_AT!C2:C3

Comments:

  • Cell Person!A2: "CEO of department" (Author: Admin, Date: 2024-01-15)
  • Cell Company!B2: "Annual revenue in USD" (Author: Finance, Date: 2024-01-10)

Implementation Notes

  1. Formula evaluation must handle circular references between sheets
  2. Cross-sheet references require maintaining vertex/edge ID mappings
  3. Named ranges improve query performance by pre-defining common traversals
  4. Comments preserve audit trail and metadata
  5. AST parsing enables complex graph queries from Excel formulas
  6. Absolute vs relative references determine query pattern (fixed vs traversal)