The Simple Data Interchange Format (SiDIF) is yet another format for exchanging data between computers.
SiDIF isA DataInterchangeFormat
is valid SiDIF content.
City Tokyo
City isA Concept
Tokyo isA City
webpage addsTo City
"" is webpage of Tokyo
is valid SiDIF.
SiDIF is based on Triples
Each SiDIF statement has a three-part structure:
- subject
- predicate
- object
This structure is called a Triple.
Royal family
The Royal92 SiDIF was created via a GEDCOM import. Together with the Model SiDIF and the MetaModel SiDIF it is the basis for the content of the Royal Family wiki. A good entry point for browsing the structure of that wiki is the Topic table.
SiDIF Structure
SiDIF expressions
A SiDIF expression like
Tokyo isA City
consists of three parts:
- Tokyo is the subject
- isA is the predicate
- City is the object
Such a combination of subject, predicate and object is called a Triple.
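The decomposition above can be sketched in a few lines of Python. This is only an illustration, not the parser of either SiDIF implementation: the names `Triple` and `parse_simple_statement` are hypothetical, and the sketch assumes the simplest case of three whitespace-separated words without quoted strings.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One statement: subject, predicate and object."""
    subject: str
    predicate: str
    object: str


def parse_simple_statement(statement: str) -> Triple:
    """Split a three-word statement such as 'Tokyo isA City' into a Triple."""
    parts = statement.split()
    if len(parts) != 3:
        raise ValueError(f"expected 3 whitespace-separated parts, got {len(parts)}")
    return Triple(*parts)


triple = parse_simple_statement("Tokyo isA City")
print(triple.subject, triple.predicate, triple.object)  # prints: Tokyo isA City
```

Statements with quoted literals or the "is ... of" form need more than a plain split and are rejected by this sketch.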
graphical representation

SiDIF Implementations
- for the original Java
- for the more recent Python version
Comparison to other Knowledge Representation Approaches
Most other Triple formats are far more complex. See e.g. for some examples of triple formats with the same statements expressed in SiDIF.
e.g. explains the Cyc Statement:
(#$capitalCity #$France #$Paris)
with "Paris is the capital of France." which in SiDIF would be:
Paris is capital of France
The following sections compare three approaches to knowledge representation: RDF, Gellish, and SiDIF, with particular focus on how they handle identity and relationships.
- Gellish: an information representation language, knowledge base and ontology
Theoretical Foundations
RDF Theory
RDF (Resource Description Framework) is based on:
- Statements modeled as triples (subject-predicate-object)
- Universal Resource Identifiers (URIs) as primary identification mechanism
- Graph-based data model
- Optional fourth element (graph) in RDF Quads for context
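To make the triple and quad distinction concrete, here is a minimal Python sketch of the data model only. It does not use any RDF library; the class name `Quad`, the helper `is_triple` and the `http://example.org/...` URIs are hypothetical placeholders chosen for illustration.

```python
from typing import NamedTuple, Optional


class Quad(NamedTuple):
    """An RDF-style statement; graph is the optional fourth element."""
    subject: str
    predicate: str
    object: str
    graph: Optional[str] = None  # None means a plain triple without context


def is_triple(statement: Quad) -> bool:
    """True if the statement carries no graph (context) element."""
    return statement.graph is None


# Hypothetical example URIs, not taken from any real dataset.
plain = Quad("http://example.org/Tokyo",
             "http://www.w3.org/1999/02/22-rdf-syntax-ns#type",
             "http://example.org/City")
in_graph = plain._replace(graph="http://example.org/graphs/cities")

print(is_triple(plain), is_triple(in_graph))  # prints: True False
```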
Gellish Theory
Gellish is structured as:
- Fixed tabular format with predefined columns
- Relationship-type encoding system
- Partially qualified naming scheme
- Language-aware design
SiDIF Theory
SiDIF uses:
- Natural language-style triple statements
- Explicit separation of identity aspects
- Flexible predicate structure
- Multi-level identification system
Example Representations
The same knowledge represented in each format:
RDF Example
<> rdf:type <> .
<> rdfs:label "Sensor 12" .
<> <> "plant1.line3" .
Gellish Example
1|English|Sensor 12|1|is a|2|TemperatureSensor|491197|specialization|
2|English|Sensor 12|1|has|3|location|123456|plant1.line3|
SiDIF Example
"Sensor 12" isA TemperatureSensor
"plant1.line3.sensor12" is FQN of "Sensor 12"
"urn:plant1:sensor:12" is PID of "Sensor 12"
"" is URI of "Sensor 12"
"opc://plant1/l3/s12" is OPC_URI of "Sensor 12"
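As a rough illustration of how such statements map to triples, the following Python sketch reads a subset of the SiDIF example. It is not the official parser: the function name `to_triple` is hypothetical, and using `shlex.split` to keep quoted strings together is a shortcut assumed here for brevity rather than an implementation of the SiDIF grammar.

```python
import shlex

# A subset of the statements from the SiDIF example.
SIDIF_LINES = [
    '"Sensor 12" isA TemperatureSensor',
    '"plant1.line3.sensor12" is FQN of "Sensor 12"',
    '"urn:plant1:sensor:12" is PID of "Sensor 12"',
    '"opc://plant1/l3/s12" is OPC_URI of "Sensor 12"',
]


def to_triple(line):
    """Return (subject, predicate, object) for one statement line."""
    tokens = shlex.split(line)  # keeps quoted strings such as "Sensor 12" together
    if len(tokens) == 3:
        # link form: subject predicate object
        return tuple(tokens)
    if len(tokens) == 5 and tokens[1] == "is" and tokens[3] == "of":
        # value form: value is property of subject
        value, _, prop, _, subject = tokens
        return (subject, prop, value)
    raise ValueError(f"unsupported statement: {line!r}")


triples = [to_triple(line) for line in SIDIF_LINES]
for triple in triples:
    print(triple)
```

Every resulting triple has "Sensor 12" as its subject, so the different identifiers end up as separate properties of the same item.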
Critical Analysis
RDF Limitations
- Forces URI usage for identification
- Mixes identity with web location
- Complex syntax reduces readability
- Difficult to handle non-web identifiers
Gellish Limitations
- Rigid tabular structure
- Names not fully qualified
- Limited identifier flexibility
- Complex relationship encoding
SiDIF Advantages
- Separates different aspects of identity (name, FQN, PID, URI)
- Natural language readability
- Flexible identifier system
- Easy addition of new identifier types
While RDF and Gellish each have their strengths for specific use cases, SiDIF offers a more flexible and comprehensive approach to identity management. RDF's URI-centric approach limits its usefulness in non-web contexts, while Gellish's rigid structure makes it difficult to adapt to new requirements. SiDIF's separation of identity aspects (name, FQN, PID, URI) combined with its natural language syntax provides a more versatile and maintainable solution for knowledge representation.
Key benefits of the SiDIF approach:
- Clear separation of identity aspects
- Support for multiple identification systems
- Easy system evolution and maintenance
- Better human readability
- Formal precision through explicit identity qualification
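The separation of identity aspects can be sketched as a small Python data structure. This is an assumption-laden illustration, not part of any SiDIF implementation: the class `Identity` and its methods `add` and `to_sidif` are hypothetical names, and the identifier values are taken from the SiDIF example.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Identity:
    """A display name plus any number of independent identifier kinds."""
    name: str
    identifiers: Dict[str, str] = field(default_factory=dict)

    def add(self, kind: str, value: str) -> None:
        """Register one more identifier kind, e.g. FQN, PID or URI."""
        self.identifiers[kind] = value

    def to_sidif(self) -> List[str]:
        """Render each identifier as a 'value is kind of name' statement."""
        return [f'"{value}" is {kind} of "{self.name}"'
                for kind, value in self.identifiers.items()]


sensor = Identity("Sensor 12")
sensor.add("FQN", "plant1.line3.sensor12")
sensor.add("PID", "urn:plant1:sensor:12")
sensor.add("OPC_URI", "opc://plant1/l3/s12")  # a new identifier kind needs no schema change

for statement in sensor.to_sidif():
    print(statement)
```

Adding a further identifier type is a single `add` call, which mirrors the claim that new identifier types are easy to introduce.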
BNF for SiDIF.jjt
<DEFAULT> SKIP : {
  " "
| "\n"
| "\r"
| "\r\n"
| <"#" (~["\n","\r"])* ("\n" | "\r" | "\r\n")>
}

/* TOKENS for Productions */
<DEFAULT> TOKEN : {
  <IS: "is">
| <OF: "of">
| <HAS: "has">
}

/* Literals */
<DEFAULT> TOKEN : {
  <INTEGER_LITERAL: <DECIMAL_LITERAL> (["l","L"])? | <HEX_LITERAL> (["l","L"])? | <OCTAL_LITERAL> (["l","L"])?>
| <#DECIMAL_LITERAL: ["1"-"9"] (["0"-"9"])*>
| <#HEX_LITERAL: "0" ["x","X"] (["0"-"9","a"-"f","A"-"F"])+>
| <#OCTAL_LITERAL: "0" (["0"-"7"])*>
| <FLOATING_POINT_LITERAL: (["0"-"9"])+ "." (["0"-"9"])* (<EXPONENT>)? (["f","F","d","D"])? | "." (["0"-"9"])+ (<EXPONENT>)? (["f","F","d","D"])? | (["0"-"9"])+ <EXPONENT> (["f","F","d","D"])? | (["0"-"9"])+ (<EXPONENT>)? ["f","F","d","D"]>
| <#EXPONENT: ["e","E"] (["+","-"])? (["0"-"9"])+>
| <CHARACTER_LITERAL: "\'" (~["\'","\\","\n","\r"] | "\\" (["n","t","b","r","f","\\","\'","\""] | ["0"-"7"] (["0"-"7"])? | ["0"-"3"] ["0"-"7"] ["0"-"7"])) "\'">
| <STRING_LITERAL: "\"" (~["\"","\\"] | "\\" (["n","t","b","r","f","\\","\'","\""] | ["0"-"7"] (["0"-"7"])? | ["0"-"3"] ["0"-"7"] ["0"-"7"]))* "\"">
| <DATETIME_LITERAL: <DATE_LITERAL> ((<WHITESPACE>)+ <TIME_LITERAL>)?>
| <#DATE_LITERAL: ["0"-"9"] ["0"-"9"] ["0"-"9"] ["0"-"9"] "-" ["0"-"9"] ["0"-"9"] "-" ["0"-"9"] ["0"-"9"]>
| <TIME_LITERAL: ["0"-"9"] ["0"-"9"] ":" ["0"-"9"] ["0"-"9"] (":" ["0"-"9"] ["0"-"9"])?>
| <TRUE: "true">
| <FALSE: "false">
| <NULL: "null">
}

<DEFAULT> TOKEN : {
  <#WHITESPACE: " " | "\t" | "\n" | "\r" | "\f">
}
<DEFAULT> TOKEN : {
  <URI: <SCHEME> (~[" ","\t","\n","\r"])+>
| <#SCHEME: "aaa:" | "aaas:" | "about:" | "acap:" | "acct:" | "cap:" | "cid:" | "coap:" | "coaps:" | "crid:"
  | "data:" | "dav:" | "dict:" | "dns:" | "file:" | "ftp:" | "geo:" | "go:" | "gopher:" | "h323:"
  | "http:" | "https:" | "iax:" | "icap:" | "im:" | "imap:" | "info:" | "ipp:" | "ipps:" | "iris:"
  | "iris.beep:" | "iris.xpc:" | "iris.xpcs:" | "iris.lwz:" | "jabber:" | "ldap:" | "mailto:" | "mid:" | "msrp:" | "msrps:"
  | "mtqp:" | "mupdate:" | "news:" | "nfs:" | "ni:" | "nih:" | "nntp:" | "opaquelocktoken:" | "pkcs11:" | "pop:"
  | "pres:" | "reload:" | "rtsp:" | "rtsps:" | "rtspu:" | "service:" | "session:" | "shttp:" | "sieve:" | "sip:"
  | "sips:" | "sms:" | "snmp:" | "soap.beep:" | "soap.beeps:" | "stun:" | "stuns:" | "tag:" | "tel:" | "telnet:"
  | "tftp:" | "thismessage:" | "tn3270:" | "tip:" | "turn:" | "turns:" | "tv:" | "urn:" | "vemmi:" | "ws:"
  | "wss:" | "xcon:" | "xcon-userid:" | "xmlrpc.beep:" | "xmlrpc.beeps:" | "xmpp:" | "z39.50r:" | "z39.50s:">
}
<DEFAULT> TOKEN : {
  <IDENTIFIER: <LETTER> (<LETTER> | "_" | <DIGIT>)*>
| <#LETTER: ["$","A"-"Z","a"-"z","\u00c0"-"\u00d6","\u00d8"-"\u00f6","\u00f8"-"\u00ff","\u0100"-"\u1fff","\u3040"-"\u318f","\u3300"-"\u337f","\u3400"-"\u3d2d","\u4e00"-"\u9fff","\uf900"-"\ufaff"]>
| <#DIGIT: ["0"-"9","\u0660"-"\u0669","\u06f0"-"\u06f9","\u0966"-"\u096f","\u09e6"-"\u09ef","\u0a66"-"\u0a6f","\u0ae6"-"\u0aef","\u0b66"-"\u0b6f","\u0be7"-"\u0bef","\u0c66"-"\u0c6f","\u0ce6"-"\u0cef","\u0d66"-"\u0d6f","\u0e50"-"\u0e59","\u0ed0"-"\u0ed9","\u1040"-"\u1049"]>
}

// Catch-all tokens. Must be last.
// Any non-whitespace. Causes a parser exception, rather than a
// token manager error (with hidden line numbers).
<DEFAULT> TOKEN : {
  <#UNKNOWN: (~[" ","\t","\n","\r","\f"])+>
}
/*******************************************
 * THE SiDIF LANGUAGE GRAMMAR STARTS HERE *
 *******************************************/

/* just as list of links */
Links ::= ( Link | Value )+ <EOF>

/** a single link assignment */

/** Literal Value assignment */
Value ::= ( Literal <IS> <IDENTIFIER> <OF> <IDENTIFIER> )

/** Handle Literal values */
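The token definitions and the Value production above can be approximated with a small regular-expression tokenizer in Python. This is a simplified sketch, not a port of the JavaCC grammar: it covers only some token kinds, lists only a handful of the URI schemes, restricts identifiers to ASCII, and the names `tokenize`, `parse_value` and `LITERAL_KINDS` are hypothetical. Which token kinds count as a Literal is an assumption, since the Literal production is not shown here.

```python
import re

# Simplified token patterns (assumption: a subset of the grammar's tokens).
TOKEN_PATTERNS = [
    ("COMMENT", r"#[^\n\r]*"),                     # skipped, like the SKIP rule
    ("STRING_LITERAL", r'"(?:[^"\\]|\\.)*"'),
    ("DATETIME_LITERAL", r"\d{4}-\d{2}-\d{2}(?:\s+\d{2}:\d{2}(?::\d{2})?)?"),
    ("INTEGER_LITERAL", r"\d+"),
    ("URI", r"(?:http|https|urn|file|mailto):[^ \t\n\r]+"),
    ("IDENTIFIER", r"[$A-Za-z][$A-Za-z_0-9]*"),
    ("WHITESPACE", r"\s+"),
    ("UNKNOWN", r"\S+"),                           # catch-all, must be last
]
TOKEN_RE = re.compile("|".join(f"(?P<{name}>{pattern})"
                               for name, pattern in TOKEN_PATTERNS))
KEYWORDS = {"is": "IS", "of": "OF", "has": "HAS"}

# Assumption: these token kinds are treated as literals in this sketch.
LITERAL_KINDS = {"STRING_LITERAL", "DATETIME_LITERAL", "INTEGER_LITERAL", "URI"}


def tokenize(text):
    """Return a list of (kind, value) pairs, dropping whitespace and comments."""
    tokens = []
    for match in TOKEN_RE.finditer(text):
        kind, value = match.lastgroup, match.group()
        if kind in ("WHITESPACE", "COMMENT"):
            continue
        if kind == "UNKNOWN":
            raise ValueError(f"unexpected input: {value!r}")
        if kind == "IDENTIFIER":
            kind = KEYWORDS.get(value, kind)  # 'is', 'of', 'has' become keywords
        tokens.append((kind, value))
    return tokens


def parse_value(tokens):
    """Match the shape: Literal <IS> <IDENTIFIER> <OF> <IDENTIFIER>."""
    kinds = [kind for kind, _ in tokens]
    if (len(tokens) == 5 and kinds[0] in LITERAL_KINDS
            and kinds[1:] == ["IS", "IDENTIFIER", "OF", "IDENTIFIER"]):
        literal, _, prop, _, subject = (value for _, value in tokens)
        return {"subject": subject, "property": prop, "value": literal}
    raise ValueError("not a Value statement")


print(tokenize("Tokyo isA City"))
print(parse_value(tokenize('"Tokyo Tower" is name of tower1 # a comment')))
```

Matching identifiers first and reclassifying `is`, `of` and `has` afterwards keeps a word like `isA` a single identifier, which is the effect the grammar obtains through longest-match tokenization.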
- A. van Renssen. "Gellish: an information representation language, knowledge base and ontology", pp. 215-228. doi: 10.1109/siit.2003.1251209