Talk:SimpleGraph-Excel: Difference between revisions
Jump to navigation
Jump to search
Simon Bruns (talk | contribs) |
Tag: Manual revert |
||
| (11 intermediate revisions by 2 users not shown) | |||
| Line 48: | Line 48: | ||
== UML diagram == | == UML diagram == | ||
<uml> | <uml format='svg'> | ||
hide circle | hide circle | ||
package Excel { | package Excel { | ||
class Workbook { | class Workbook { | ||
name | +name: String | ||
+sheets: List<Sheet> | |||
} | } | ||
note right of Workbook | |||
**Top-level container** | |||
Represents an Excel file | |||
Contains multiple sheets | |||
end note | |||
class Sheet { | class Sheet { | ||
name | +name: String | ||
+index: int | |||
+cells: Map<Address, Cell> | |||
} | } | ||
note right of Sheet | |||
**Worksheet/Tab** | |||
Named collection of cells | |||
Organized in grid (rows/columns) | |||
end note | |||
class Cell { | class Cell { | ||
+row: int | |||
+column: int | |||
+value: Object | |||
+formulaText: String | |||
+hasFormula(): boolean | |||
} | } | ||
note right of Cell | |||
**Core data unit** | |||
- **value**: Evaluated result (number, text, etc.) | |||
- **formulaText**: Original formula string (e.g., "=SUM(A1:A10)") | |||
- **formula**: Parsed AST for evaluation | |||
Example: | |||
formulaText = "=A1+B1*2" | |||
formula = BinaryOp(+, CellRef(A1), BinaryOp(*, CellRef(B1), Literal(2))) | |||
value = 15.5 (after evaluation) | |||
end note | |||
class | class CellStyle { | ||
+font: String | |||
+fontSize: int | |||
+color: Color | |||
+alignment: String | |||
+border: String | |||
} | } | ||
note bottom of CellStyle | |||
**Formatting properties** | |||
Visual appearance only, | |||
doesn't affect calculations | |||
end note | |||
class Comment { | class Comment { | ||
+text: String | |||
+author: String | |||
timestamp | +timestamp: DateTime | ||
} | } | ||
abstract class AstNode { | |||
+evaluate(context): Object | |||
+toString(): String | |||
abstract class | |||
} | } | ||
note right of AstNode | |||
**Formula Abstract Syntax Tree** | |||
Parsed representation of formula | |||
for evaluation and manipulation | |||
Each node can evaluate itself | |||
given the spreadsheet context | |||
end note | |||
class | class Literal { | ||
+value: Object | |||
} | } | ||
note bottom of Literal | |||
Constants: numbers (42), | |||
strings ("hello"), | |||
booleans (TRUE) | |||
end note | |||
class | class CellReference { | ||
+address: String | |||
+isAbsolute: boolean | |||
} | } | ||
note bottom of CellReference | |||
References: A1 (relative), | |||
$A$1 (absolute), | |||
$A1 (mixed) | |||
end note | |||
class | class Function { | ||
+name: String | |||
+arguments: List<AstNode> | |||
} | } | ||
note bottom of Function | |||
Built-in functions: | |||
SUM, AVERAGE, IF, | |||
VLOOKUP, etc. | |||
end note | |||
class BinaryOp { | |||
+operator: String | |||
+left: AstNode | |||
+right: AstNode | |||
} | } | ||
note bottom of BinaryOp | |||
Operators: +, -, *, /, | |||
^, =, <>, <, >, etc. | |||
end note | |||
' Hierarchy | |||
Workbook "1" *-- "1..*" Sheet : contains | |||
Sheet "1" *-- "*" Cell : contains | |||
' Cell properties | |||
Cell "1" *-- "1" CellStyle : styled by | |||
Cell "1" o-- "0..1" AstNode : formula > | |||
Cell "1" -- "0..*" Comment : annotated by | |||
note on link | |||
formulaText stores original string | |||
formula stores parsed AST | |||
Both are optional (cells can be plain values) | |||
end note | |||
' Formula AST hierarchy | |||
AstNode <|-- Literal | |||
AstNode <|-- CellReference | |||
AstNode <|-- Function | |||
AstNode <|-- BinaryOp | |||
Function "1" o-- "*" AstNode : has args | |||
BinaryOp "1" o-- "1" AstNode : left operand | |||
BinaryOp "1" o-- "1" AstNode : right operand | |||
} | } | ||
note bottom | |||
**Key Design Points:** | |||
1. Cell stores both formula text (user input) and parsed AST | |||
2. AST enables evaluation, dependency tracking, and formula editing | |||
3. Separation of data (value), presentation (style), and logic (formula) | |||
4. AstNode tree can be traversed for recalculation when dependencies change | |||
end note | |||
</uml> | </uml> | ||
Latest revision as of 16:13, 27 January 2026
Mental Model
Mapping rules
- Each Vertex type (by label) is converted to an Excel sheet
- Each Edge type (by label) is converted to an Excel sheet
- Each vertex is converted to a row
- Each vertex property is converted to a cell
- Each edge is converted to a row
- Each edge property is converted to a cell
- The header row for Vertices has a column for each property key
- The header for for Edges has a column for each property key + an in and and out column
- The in and out header columns for edges have the label of the corresponding vertex in parentheses
Extended Mapping Rules
Range Mapping
- Ranges can represent collections of vertices or edges
- Named ranges map to specific vertex/edge label types
- Each range spans multiple cells corresponding to multiple graph elements
- Range references in formulas map to graph traversals across multiple vertices/edges
Formula Mapping
- Formulas stored in cells can represent computed properties
- Formula results are mapped to property values in the graph
- Formulas can reference other vertices/edges through cell references
- Complex formulas map to graph traversal expressions
Reference Mapping
- Cell references in formulas map to vertex/edge relationships
- Absolute references ($A$1) map to fixed vertex IDs
- Relative references (A1) map to traversal patterns
- Cross-sheet references map to cross-label relationships between different vertex/edge types
Comment Mapping
- Comments on cells map to metadata properties on vertices/edges
- Comment author maps to a metadata property
- Comment text maps to annotation content
- Comment timestamps preserve temporal information in the graph
Formula AST Mapping
- The Abstract Syntax Tree of formulas maps to graph operations
- FunctionCall nodes map to graph traversal operations:
- SUM() → aggregate property values across connected edges
- VLOOKUP() → vertex lookup by property value
- IF() → conditional edge creation or property assignment
- BinaryOperation nodes map to property computations or edge weights:
- Arithmetic operators (+, -, *, /) → computed properties from multiple vertex properties
- Comparison operators (=, <, >) → edge filtering conditions
- LiteralValue nodes map to constant property values directly stored in the graph
UML diagram
Bidirectional Transformation
Excel to Graph
- Parse workbook structure
- Create vertex types from sheet names (non-edge sheets)
- Create edge types from sheet names (sheets with in/out columns)
- For each row in vertex sheets:
- Create vertex with label = sheet name
- Map each cell to a property
- Evaluate formulas and store results
- Attach comments as metadata properties
- For each row in edge sheets:
- Create edge with label = sheet name
- Resolve in/out references to vertices
- Map remaining cells to edge properties
- Process named ranges as vertex/edge collections
- Build formula AST for computed properties
Graph to Excel
- Create sheet for each unique vertex label
- Create sheet for each unique edge label
- For each vertex:
- Create row in corresponding sheet
- Create cell for each property
- Add comments from metadata properties
- For each edge:
- Create row in corresponding sheet
- Add in/out columns with vertex references
- Create cell for each property
- Create named ranges for logical vertex/edge groupings
- Reconstruct formulas from computed properties with AST
Example Mapping
Graph Example
// Vertices
person1 = g.addV('Person').property('name', 'Alice').property('age', 30).next()
person2 = g.addV('Person').property('name', 'Bob').property('age', 25).next()
company1 = g.addV('Company').property('name', 'TechCorp').property('revenue', 1000000).next()
// Edges
g.addE('WORKS_AT').from(person1).to(company1).property('salary', 75000).next()
g.addE('WORKS_AT').from(person2).to(company1).property('salary', 65000).next()
g.addE('KNOWS').from(person1).to(person2).property('since', 2020).next()
Excel Example
Sheet: Person
| name | age | totalSalary (Formula) |
|---|---|---|
| Alice | 30 | =VLOOKUP(A2,WORKS_AT!A:C,3,FALSE) |
| Bob | 25 | =VLOOKUP(A3,WORKS_AT!A:C,3,FALSE) |
Sheet: Company
| name | revenue | avgEmployeeSalary (Formula) |
|---|---|---|
| TechCorp | 1000000 | =AVERAGE(WORKS_AT!C:C) |
Sheet: WORKS_AT (Edge)
| out (Person) | in (Company) | salary |
|---|---|---|
| Alice | TechCorp | 75000 |
| Bob | TechCorp | 65000 |
Sheet: KNOWS (Edge)
| out (Person) | in (Person) | since |
|---|---|---|
| Alice | Bob | 2020 |
Named Ranges:
- AllEmployees = Person!A2:B3
- CompanyFinancials = Company!A2:C2
- Salaries = WORKS_AT!C2:C3
Comments:
- Cell Person!A2: "CEO of department" (Author: Admin, Date: 2024-01-15)
- Cell Company!B2: "Annual revenue in USD" (Author: Finance, Date: 2024-01-10)
Implementation Notes
- Formula evaluation must handle circular references between sheets
- Cross-sheet references require maintaining vertex/edge ID mappings
- Named ranges improve query performance by pre-defining common traversals
- Comments preserve audit trail and metadata
- AST parsing enables complex graph queries from Excel formulas
- Absolute vs relative references determine query pattern (fixed vs traversal)