Difference between revisions of "SiDIF"
Latest revision as of 11:51, 23 November 2024
Introduction
The Simple Data Interchange Format (SiDIF) is yet another format for exchanging data between computers.
SiDIF isA DataInterchangeFormat
is valid SiDIF content.
Examples
City Tokyo
City isA Concept
Tokyo isA City
webpage addsTo City
"http://www.tokyo.jp" is webpage of Tokyo
is valid SiDIF.
SiDIF is based on Triples
Each SiDIF statement has a three-part structure:
- subject
- predicate
- object
that is called a Triple
Royal family
The Royal92 SiDIF was created via a GEDCOM import. Together with the Model SiDIF and the MetaModel SiDIF, it is the basis for the content of the Royal Family wiki. A good entry point to browse the structure of that wiki is the Topic table. E.g. you could follow the following links:
SiDIF Structure
SiDIF expressions
A SiDIF expression like
Tokyo isA City
consists of three parts:
- Tokyo is the subject
- isA is the predicate
- City is the object
Such a set of subject / predicate / object is called a Triple
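The decomposition above can be sketched in Python. This is only a minimal illustration and not the API of either implementation: `Triple` and `parse_simple_statement` are hypothetical names, and the sketch assumes a statement of exactly three whitespace-separated words.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """Hypothetical container for the three parts of a statement."""
    subject: str
    predicate: str
    object: str


def parse_simple_statement(statement: str) -> Triple:
    """Split a three-word statement such as 'Tokyo isA City' into a Triple."""
    parts = statement.split()
    if len(parts) != 3:
        raise ValueError(f"expected three parts, got {len(parts)}: {statement!r}")
    return Triple(*parts)


triple = parse_simple_statement("Tokyo isA City")
print(triple.subject, triple.predicate, triple.object)
```

Quoted literals and the "is ... of" form are deliberately left out here to keep the sketch small.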
graphical representation
SiDIF Implementations
See:
- https://github.com/BITPlan/org.sidif.triplestore for the original Java implementation
- https://github.com/WolfgangFahl/py-sidif for the more recent Python version
Comparison to other Knowledge Representation Approaches
Most other triple formats are far more complex. See e.g. https://github.com/BITPlan/org.sidif.triplestore/tree/master/src/test/resources/sidif.canonical for some examples of triple formats, with the same statements also expressed in SiDIF.
https://en.wikipedia.org/wiki/Cyc, for example, explains the Cyc statement:
(#$capitalCity #$France #$Paris)
with "Paris is the capital of France." which in SiDIF would be:
Paris is capital of France
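As a rough illustration of how mechanical this rewriting is, the following Python sketch turns a binary Cyc assertion into an "is ... of" sentence. The function name `cyc_to_sidif` and the argument order mapping are assumptions made for this example. The sketch keeps the Cyc predicate name unchanged, so it yields `capitalCity` rather than the shorter `capital` used above.

```python
import re

# Matches a binary Cyc assertion of the form (#$predicate #$Arg1 #$Arg2).
CYC_BINARY = re.compile(r"\(\s*#\$(\w+)\s+#\$(\w+)\s+#\$(\w+)\s*\)")


def cyc_to_sidif(cyc: str) -> str:
    """Rewrite '(#$p #$A #$B)' as 'B is p of A' (assumed mapping)."""
    match = CYC_BINARY.fullmatch(cyc.strip())
    if match is None:
        raise ValueError(f"not a binary Cyc assertion: {cyc!r}")
    predicate, first, second = match.groups()
    return f"{second} is {predicate} of {first}"


print(cyc_to_sidif("(#$capitalCity #$France #$Paris)"))
```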
The following sections compare three approaches to knowledge representation: RDF, Gellish, and SiDIF, with particular focus on how they handle identity and relationships.
- Gellish: an information representation language, knowledge base and ontology
Theoretical Foundations
RDF Theory
RDF (Resource Description Framework) is based on:
- Statements modeled as triples (subject-predicate-object)
- Uniform Resource Identifiers (URIs) as primary identification mechanism
- Graph-based data model
- Optional fourth element (graph) in RDF Quads for context
Gellish Theory
Gellish is structured as:
- Fixed tabular format with predefined columns
- Relationship-type encoding system
- Partially qualified naming scheme
- Language-aware design
SiDIF Theory
SiDIF uses:
- Natural language-style triple statements
- Explicit separation of identity aspects
- Flexible predicate structure
- Multi-level identification system
Example Representations
The same knowledge represented in each format:
RDF Example
<http://example.org/sensors/12> rdf:type <http://example.org/onto/TemperatureSensor> .
<http://example.org/sensors/12> rdfs:label "Sensor 12" .
<http://example.org/sensors/12> <http://example.org/onto/location> "plant1.line3" .
Gellish Example
1|English|Sensor 12|1|is a|2|TemperatureSensor|491197|specialization|
2|English|Sensor 12|1|has|3|location|123456|plant1.line3|
SiDIF Example
"Sensor 12" isA TemperatureSensor
"plant1.line3.sensor12" is FQN of "Sensor 12"
"urn:plant1:sensor:12" is PID of "Sensor 12"
"http://plant1.company.com/sensors/12" is URI of "Sensor 12"
"opc://plant1/l3/s12" is OPC_URI of "Sensor 12"
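To make the idea of separate identity aspects more concrete, the statements of the SiDIF example can be grouped per entity. The Python sketch below is an illustration only: it does not use the py-sidif parser, the name `collect_identifiers` is hypothetical, and the regular expression assumes one statement per line with double-quoted value and entity.

```python
import re
from collections import defaultdict

STATEMENTS = '''"Sensor 12" isA TemperatureSensor
"plant1.line3.sensor12" is FQN of "Sensor 12"
"urn:plant1:sensor:12" is PID of "Sensor 12"
"http://plant1.company.com/sensors/12" is URI of "Sensor 12"
"opc://plant1/l3/s12" is OPC_URI of "Sensor 12"'''

# "<value>" is <aspect> of "<entity>"
VALUE_STATEMENT = re.compile(r'"([^"]*)"\s+is\s+(\w+)\s+of\s+"([^"]*)"')


def collect_identifiers(text: str) -> dict:
    """Group the identifier values by entity and by identity aspect."""
    identifiers = defaultdict(dict)
    for line in text.splitlines():
        match = VALUE_STATEMENT.fullmatch(line.strip())
        if match:  # lines such as the isA statement are skipped
            value, aspect, entity = match.groups()
            identifiers[entity][aspect] = value
    return dict(identifiers)


for aspect, value in collect_identifiers(STATEMENTS)["Sensor 12"].items():
    print(aspect, "=", value)
```

Under these assumptions, a new identifier type only needs one more statement in the input; the code does not change.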
Critical Analysis
RDF Limitations
- Forces URI usage for identification
- Mixes identity with web location
- Complex syntax reduces readability
- Difficult to handle non-web identifiers
Gellish Limitations
- Rigid tabular structure
- Names not fully qualified
- Limited identifier flexibility
- Complex relationship encoding
SiDIF Advantages
- Separates different aspects of identity (name, FQN, PID, URI)
- Natural language readability
- Flexible identifier system
- Easy addition of new identifier types
Conclusion
While RDF and Gellish each have their strengths for specific use cases, SiDIF offers a more flexible and comprehensive approach to identity management. RDF's URI-centric approach limits its usefulness in non-web contexts, while Gellish's rigid structure makes it difficult to adapt to new requirements. SiDIF's separation of identity aspects (name, FQN, PID, URI) combined with its natural language syntax provides a more versatile and maintainable solution for knowledge representation.
Key benefits of the SiDIF approach:
- Clear separation of identity aspects
- Support for multiple identification systems
- Easy system evolution and maintenance
- Better human readability
- Formal precision through explicit identity qualification
Links
Syntax
BNF for SiDIF.jjt
TOKENS
/* WHITESPACE AND COMMENTS */
<DEFAULT> SKIP : {
  " "
| "\n"
| "\r"
| "\r\n"
| <"#" (~["\n","\r"])* ("\n" | "\r" | "\r\n")>
}

/* TOKENS for Productions */
<DEFAULT> TOKEN : {
  <IS: "is">
| <OF: "of">
| <HAS: "has">
}

/* Literals */
<DEFAULT> TOKEN : {
  <INTEGER_LITERAL: <DECIMAL_LITERAL> (["l","L"])? | <HEX_LITERAL> (["l","L"])? | <OCTAL_LITERAL> (["l","L"])?>
| <#DECIMAL_LITERAL: ["1"-"9"] (["0"-"9"])*>
| <#HEX_LITERAL: "0" ["x","X"] (["0"-"9","a"-"f","A"-"F"])+>
| <#OCTAL_LITERAL: "0" (["0"-"7"])*>
| <FLOATING_POINT_LITERAL: (["0"-"9"])+ "." (["0"-"9"])* (<EXPONENT>)? (["f","F","d","D"])? | "." (["0"-"9"])+ (<EXPONENT>)? (["f","F","d","D"])? | (["0"-"9"])+ <EXPONENT> (["f","F","d","D"])? | (["0"-"9"])+ (<EXPONENT>)? ["f","F","d","D"]>
| <#EXPONENT: ["e","E"] (["+","-"])? (["0"-"9"])+>
| <CHARACTER_LITERAL: "\'" (~["\'","\\","\n","\r"] | "\\" (["n","t","b","r","f","\\","\'","\""] | ["0"-"7"] (["0"-"7"])? | ["0"-"3"] ["0"-"7"] ["0"-"7"])) "\'">
| <STRING_LITERAL: "\"" (~["\"","\\"] | "\\" (["n","t","b","r","f","\\","\'","\""] | ["0"-"7"] (["0"-"7"])? | ["0"-"3"] ["0"-"7"] ["0"-"7"]))* "\"">
| <DATETIME_LITERAL: <DATE_LITERAL> ((<WHITESPACE>)+ <TIME_LITERAL>)?>
| <#DATE_LITERAL: ["0"-"9"] ["0"-"9"] ["0"-"9"] ["0"-"9"] "-" ["0"-"9"] ["0"-"9"] "-" ["0"-"9"] ["0"-"9"]>
| <TIME_LITERAL: ["0"-"9"] ["0"-"9"] ":" ["0"-"9"] ["0"-"9"] (":" ["0"-"9"] ["0"-"9"])?>
| <TRUE: "true">
| <FALSE: "false">
| <NULL: "null">
}

<DEFAULT> TOKEN : {
  <#WHITESPACE: " " | "\t" | "\n" | "\r" | "\f">
}

<DEFAULT> TOKEN : {
  <URI: <SCHEME> (~[" ","\t","\n","\r"])+>
| <#SCHEME: "aaa:" | "aaas:" | "about:" | "acap:" | "acct:" | "cap:" | "cid:" | "coap:" | "coaps:" | "crid:"
    | "data:" | "dav:" | "dict:" | "dns:" | "file:" | "ftp:" | "geo:" | "go:" | "gopher:" | "h323:"
    | "http:" | "https:" | "iax:" | "icap:" | "im:" | "imap:" | "info:" | "ipp:" | "ipps:" | "iris:"
    | "iris.beep:" | "iris.xpc:" | "iris.xpcs:" | "iris.lwz:" | "jabber:" | "ldap:" | "mailto:" | "mid:"
    | "msrp:" | "msrps:" | "mtqp:" | "mupdate:" | "news:" | "nfs:" | "ni:" | "nih:" | "nntp:"
    | "opaquelocktoken:" | "pkcs11:" | "pop:" | "pres:" | "reload:" | "rtsp:" | "rtsps:" | "rtspu:"
    | "service:" | "session:" | "shttp:" | "sieve:" | "sip:" | "sips:" | "sms:" | "snmp:"
    | "soap.beep:" | "soap.beeps:" | "stun:" | "stuns:" | "tag:" | "tel:" | "telnet:" | "tftp:"
    | "thismessage:" | "tn3270:" | "tip:" | "turn:" | "turns:" | "tv:" | "urn:" | "vemmi:"
    | "ws:" | "wss:" | "xcon:" | "xcon-userid:" | "xmlrpc.beep:" | "xmlrpc.beeps:" | "xmpp:"
    | "z39.50r:" | "z39.50s:">
}

/* IDENTIFIER */
<DEFAULT> TOKEN : {
  <IDENTIFIER: <LETTER> (<LETTER> | "_" | <DIGIT>)*>
| <#LETTER: ["$","A"-"Z","a"-"z","\u00c0"-"\u00d6","\u00d8"-"\u00f6","\u00f8"-"\u00ff","\u0100"-"\u1fff","\u3040"-"\u318f","\u3300"-"\u337f","\u3400"-"\u3d2d","\u4e00"-"\u9fff","\uf900"-"\ufaff"]>
| <#DIGIT: ["0"-"9","\u0660"-"\u0669","\u06f0"-"\u06f9","\u0966"-"\u096f","\u09e6"-"\u09ef","\u0a66"-"\u0a6f","\u0ae6"-"\u0aef","\u0b66"-"\u0b6f","\u0be7"-"\u0bef","\u0c66"-"\u0c6f","\u0ce6"-"\u0cef","\u0d66"-"\u0d6f","\u0e50"-"\u0e59","\u0ed0"-"\u0ed9","\u1040"-"\u1049"]>
}

// Catch-all tokens. Must be last.
// Any non-whitespace. Causes a parser exception, rather than a
// token manager error (with hidden line numbers).
<DEFAULT> TOKEN : {
  <#UNKNOWN: (~[" ","\t","\n","\r","\f"])+>
}
NON-TERMINALS
/*******************************************
 * THE SiDIF LANGUAGE GRAMMAR STARTS HERE *
 *******************************************/
/* just as list of links */
Links ::= ( Link | Value )+ <EOF>

/**
 * a single link assignment
 */
Link ::= ( ( <IDENTIFIER> <IDENTIFIER> <IDENTIFIER> ) | ( <IDENTIFIER> <IS> <IDENTIFIER> <OF> <IDENTIFIER> ) | ( <IDENTIFIER> <HAS> <IDENTIFIER> <IDENTIFIER> ) )

/**
 * Literal Value assignment
 */
Value ::= ( Literal <IS> <IDENTIFIER> <OF> <IDENTIFIER> )

/**
 * Handle Literal values
 */
Literal ::= ( <INTEGER_LITERAL> | <FLOATING_POINT_LITERAL> | <CHARACTER_LITERAL> | <STRING_LITERAL> | <DATETIME_LITERAL> | <TIME_LITERAL> | <URI> | <TRUE> | <FALSE> | <NULL> )
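The Link and Value productions can be approximated with regular expressions. The following Python sketch is not the JavaCC parser and makes several simplifying assumptions: one statement per line, ASCII-only identifiers, and double-quoted strings as the only literal type. The names `classify` and `FORMS` are hypothetical.

```python
import re

# Simplified token patterns (assumptions): keywords are excluded from
# identifiers, mirroring that they are declared as separate tokens.
KEYWORDS = r"(?:is|of|has|true|false|null)"
IDENT = rf"(?!{KEYWORDS}\b)[A-Za-z$][A-Za-z0-9_]*"
STRING = r'"(?:[^"\\]|\\.)*"'

# One pattern per alternative of Link, plus the Value production.
FORMS = [
    ("link", re.compile(rf"{IDENT}\s+{IDENT}\s+{IDENT}")),
    ("link_is_of", re.compile(rf"{IDENT}\s+is\s+{IDENT}\s+of\s+{IDENT}")),
    ("link_has", re.compile(rf"{IDENT}\s+has\s+{IDENT}\s+{IDENT}")),
    ("value", re.compile(rf"{STRING}\s+is\s+{IDENT}\s+of\s+{IDENT}")),
]


def classify(statement: str):
    """Return the name of the matching form, or None if nothing matches."""
    text = statement.strip()
    for name, pattern in FORMS:
        if pattern.fullmatch(text):
            return name
    return None


for line in ["Tokyo isA City",
             "Paris is capital of France",
             '"http://www.tokyo.jp" is webpage of Tokyo']:
    print(line, "->", classify(line))
```

Note that in the productions as listed, a literal may only appear in the leading position of a Value. A statement with a quoted string as subject or object, such as `"Sensor 12" isA TemperatureSensor`, is therefore not matched by this sketch.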
References
- A. van Renssen. "Gellish: an information representation language, knowledge base and ontology", pp. 215-228. doi: 10.1109/siit.2003.1251209