SiDIF

From BITPlan Wiki
Jump to navigation Jump to search

What links here

Introduction

The Simple Data Interchange Format (SiDIF) is yet another format for exchanging data between computers.

SiDIF isA DataInterchangeFormat

is a valid SiDIF content.

Examples

City Tokyo

City isA Concept
Tokyo isA City
webpage addsTo City
"http://www.tokyo.jp" is webpage of Tokyo

is valid SiDIF.

SiDIF is based on Triples

Each Sidif statement has a three part structure:

  1. subject
  2. predicate
  3. object

that is called a Triple

Royal family

The Royal92 SiDIF was created via a GEDCOM import. Together with the Model SiDIF and the MetaModel SiDIF it is the basis for the content of the Royal Family wiki A good entry point to browse the structure of that Wiki is the Topic table E.g. you could follow the following links:

  1. Person Concept derived from the Person Topic
  2. Help for the Person Topic

SiDIF Structure

SiDIF expressions

A SiDIF expression like

Tokyo isA City

consists of three parts:

  • Tokyo is the subject
  • isA is the predicate
  • City is the object

Such a set of subject / predicate / object is called a Triple

graphical representation

SiDIF Implementations

see

Comparison to other formats

Most other Triple formats are fare more complex. See e.g. https://github.com/BITPlan/org.sidif.triplestore/tree/master/src/test/resources/sidif.canonical for some example of triple formats with the same statements being expressed in SiDIF

e.g. explains the Cyc Statement:

 (#$capitalCity #$France #$Paris)

with "Paris is the capital of France." which in SiDIF would be:

Paris is capital of France

Links

Syntax

BNF for SiDIF.jjt

BNF for SiDIF.jjt

TOKENS

/* WHITESPACE AND COMMENTS */
<DEFAULT> SKIP : {
" "
| "\n"
| "\r"
| "\r\n"
| <"#" (~["\n","\r"])* ("\n" | "\r" | "\r\n")>
}
/* TOKENS for Productions */
<DEFAULT> TOKEN : {
<IS: "is">
| <OF: "of">
| <HAS: "has">
}
/* Literals */
<DEFAULT> TOKEN : {
<INTEGER_LITERAL: <DECIMAL_LITERAL> (["l","L"])? | <HEX_LITERAL> (["l","L"])? | <OCTAL_LITERAL> (["l","L"])?>
| <#DECIMAL_LITERAL: ["1"-"9"] (["0"-"9"])*>
| <#HEX_LITERAL: "0" ["x","X"] (["0"-"9","a"-"f","A"-"F"])+>
| <#OCTAL_LITERAL: "0" (["0"-"7"])*>
| <FLOATING_POINT_LITERAL: (["0"-"9"])+ "." (["0"-"9"])* (<EXPONENT>)? (["f","F","d","D"])? | "." (["0"-"9"])+ (<EXPONENT>)? (["f","F","d","D"])? | (["0"-"9"])+ <EXPONENT> (["f","F","d","D"])? | (["0"-"9"])+ (<EXPONENT>)? ["f","F","d","D"]>
| <#EXPONENT: ["e","E"] (["+","-"])? (["0"-"9"])+>
| <CHARACTER_LITERAL: "\'" (~["\'","\\","\n","\r"] | "\\" (["n","t","b","r","f","\\","\'","\""] | ["0"-"7"] (["0"-"7"])? | ["0"-"3"] ["0"-"7"] ["0"-"7"])) "\'">
| <STRING_LITERAL: "\"" (~["\"","\\"] | "\\" (["n","t","b","r","f","\\","\'","\""] | ["0"-"7"] (["0"-"7"])? | ["0"-"3"] ["0"-"7"] ["0"-"7"]))* "\"">
| <DATETIME_LITERAL: <DATE_LITERAL> ((<WHITESPACE>)+ <TIME_LITERAL>)?>
| <#DATE_LITERAL: ["0"-"9"] ["0"-"9"] ["0"-"9"] ["0"-"9"] "-" ["0"-"9"] ["0"-"9"] "-" ["0"-"9"] ["0"-"9"]>
| <TIME_LITERAL: ["0"-"9"] ["0"-"9"] ":" ["0"-"9"] ["0"-"9"] (":" ["0"-"9"] ["0"-"9"])?>
| <TRUE: "true">
| <FALSE: "false">
| <NULL: "null">
}
<DEFAULT> TOKEN : {
<#WHITESPACE: " " | "\t" | "\n" | "\r" | "\f">
}
<DEFAULT> TOKEN : {
<URI: <SCHEME> (~[" ","\t","\n","\r"])+>
| <#SCHEME: "aaa:" | "aaas:" | "about:" | "acap:" | "acct:" | "cap:" | "cid:" | "coap:" | "coaps:" | "crid:" | "data:" | "dav:" | "dict:" | "dns:" | "file:" | "ftp:" | "geo:" | "go:" | "gopher:" | "h323:" | "http:" | "https:" | "iax:" | "icap:" | "im:" | "imap:" | "info:" | "ipp:" | "ipps:" | "iris:" | "iris.beep:" | "iris.xpc:" | "iris.xpcs:" | "iris.lwz:" | "jabber:" | "ldap:" | "mailto:" | "mid:" | "msrp:" | "msrps:" | "mtqp:" | "mupdate:" | "news:" | "nfs:" | "ni:" | "nih:" | "nntp:" | "opaquelocktoken:" | "pkcs11:" | "pop:" | "pres:" | "reload:" | "rtsp:" | "rtsps:" | "rtspu:" | "service:" | "session:" | "shttp:" | "sieve:" | "sip:" | "sips:" | "sms:" | "snmp:" | "soap.beep:" | "soap.beeps:" | "stun:" | "stuns:" | "tag:" | "tel:" | "telnet:" | "tftp:" | "thismessage:" | "tn3270:" | "tip:" | "turn:" | "turns:" | "tv:" | "urn:" | "vemmi:" | "ws:" | "wss:" | "xcon:" | "xcon-userid:" | "xmlrpc.beep:" | "xmlrpc.beeps:" | "xmpp:" | "z39.50r:" | "z39.50s:">
}
/* IDENTIFIER */
<DEFAULT> TOKEN : {
<IDENTIFIER: <LETTER> (<LETTER> | "_" | <DIGIT>)*>
| <#LETTER: ["$","A"-"Z","a"-"z","\u00c0"-"\u00d6","\u00d8"-"\u00f6","\u00f8"-"\u00ff","\u0100"-"\u1fff","\u3040"-"\u318f","\u3300"-"\u337f","\u3400"-"\u3d2d","\u4e00"-"\u9fff","\uf900"-"\ufaff"]>
| <#DIGIT: ["0"-"9","\u0660"-"\u0669","\u06f0"-"\u06f9","\u0966"-"\u096f","\u09e6"-"\u09ef","\u0a66"-"\u0a6f","\u0ae6"-"\u0aef","\u0b66"-"\u0b6f","\u0be7"-"\u0bef","\u0c66"-"\u0c6f","\u0ce6"-"\u0cef","\u0d66"-"\u0d6f","\u0e50"-"\u0e59","\u0ed0"-"\u0ed9","\u1040"-"\u1049"]>
}
// Catch-all tokens. Must be last.
// Any non-whitespace. Causes a parser exception, rather than a
// token manager error (with hidden line numbers).
<DEFAULT> TOKEN : {
<#UNKNOWN: (~[" ","\t","\n","\r","\f"])+>
}

NON-TERMINALS

[[Category:SiDIF]]
/*******************************************
* THE SiDIF LANGUAGE GRAMMAR STARTS HERE *
*******************************************/
/* just as list of links */
Links ::= ( Link | Value )+ <EOF>
/**
* a single link assignment
*/
Link ::= ( ( <IDENTIFIER> <IDENTIFIER> <IDENTIFIER> ) | ( <IDENTIFIER> <IS> <IDENTIFIER> <OF> <IDENTIFIER> ) | ( <IDENTIFIER> <HAS> <IDENTIFIER> <IDENTIFIER> ) )
/**
* Literal Value assignment
*/
Value ::= ( Literal <IS> <IDENTIFIER> <OF> <IDENTIFIER> )
/**
* Handle Literal values
*/
Literal ::= ( <INTEGER_LITERAL> | <FLOATING_POINT_LITERAL> | <CHARACTER_LITERAL> | <STRING_LITERAL> | <DATETIME_LITERAL> | <TIME_LITERAL> | <URI> | <TRUE> | <FALSE> | <NULL> )

References

  1. ^  A. van Renssen. (None) "Gellish: an information representation language, knowledge base and ontology" - 215-228 pages. doi: 10.1109/siit.2003.1251209