SiDIF

From BITPlan Wiki
Revision as of 20:48, 20 February 2023 by Wf (talk | contribs) (→‎Comparison to other formats)
(diff) ← Older revision | Latest revision (diff) | Newer revision → (diff)
Jump to navigation Jump to search

What links here

Introduction

The Simple Data Interchange Format (SiDIF) is yet another format for exchanging data between computers.

SiDIF isA DataInterchangeFormat

is a valid SiDIF content.

Examples

City Tokyo

City isA Concept
Tokyo isA City
webpage addsTo City
"http://www.tokyo.jp" is webpage of Tokyo

is valid SiDIF.

SiDIF is based on Triples

Each Sidif statement has a three part structure:

  1. subject
  2. predicate
  3. object

that is called a Triple

Royal family

The Royal92 SiDIF was created via a GEDCOM import. Together with the Model SiDIF and the MetaModel SiDIF it is the basis for the content of the Royal Family wiki A good entry point to browse the structure of that Wiki is the Topic table E.g. you could follow the following links:

  1. Person Concept derived from the Person Topic
  2. Help for the Person Topic

SiDIF Structure

SiDIF expressions

A SiDIF expression like

Tokyo isA City

consists of three parts:

  • Tokyo is the subject
  • isA is the predicate
  • City is the object

Such a set of subject / predicate / object is called a Triple

graphical representation

SiDIF Implementations

see

Comparison to other formats

Most other Triple formats are fare more complex. See e.g. https://github.com/BITPlan/org.sidif.triplestore/tree/master/src/test/resources/sidif.canonical for some example of triple formats with the same statements being expressed in SiDIF

e.g. explains the Cyc Statement:

 (#$capitalCity #$France #$Paris)

with "Paris is the capital of France." which in SiDIF would be:

Paris is capital of France

Links

Syntax

BNF for SiDIF.jjt

BNF for SiDIF.jjt

TOKENS

/* WHITESPACE AND COMMENTS */
<DEFAULT> SKIP : {
" "
| "\n"
| "\r"
| "\r\n"
| <"#" (~["\n","\r"])* ("\n" | "\r" | "\r\n")>
}
/* TOKENS for Productions */
<DEFAULT> TOKEN : {
<IS: "is">
| <OF: "of">
| <HAS: "has">
}
/* Literals */
<DEFAULT> TOKEN : {
<INTEGER_LITERAL: <DECIMAL_LITERAL> (["l","L"])? | <HEX_LITERAL> (["l","L"])? | <OCTAL_LITERAL> (["l","L"])?>
| <#DECIMAL_LITERAL: ["1"-"9"] (["0"-"9"])*>
| <#HEX_LITERAL: "0" ["x","X"] (["0"-"9","a"-"f","A"-"F"])+>
| <#OCTAL_LITERAL: "0" (["0"-"7"])*>
| <FLOATING_POINT_LITERAL: (["0"-"9"])+ "." (["0"-"9"])* (<EXPONENT>)? (["f","F","d","D"])? | "." (["0"-"9"])+ (<EXPONENT>)? (["f","F","d","D"])? | (["0"-"9"])+ <EXPONENT> (["f","F","d","D"])? | (["0"-"9"])+ (<EXPONENT>)? ["f","F","d","D"]>
| <#EXPONENT: ["e","E"] (["+","-"])? (["0"-"9"])+>
| <CHARACTER_LITERAL: "\'" (~["\'","\\","\n","\r"] | "\\" (["n","t","b","r","f","\\","\'","\""] | ["0"-"7"] (["0"-"7"])? | ["0"-"3"] ["0"-"7"] ["0"-"7"])) "\'">
| <STRING_LITERAL: "\"" (~["\"","\\"] | "\\" (["n","t","b","r","f","\\","\'","\""] | ["0"-"7"] (["0"-"7"])? | ["0"-"3"] ["0"-"7"] ["0"-"7"]))* "\"">
| <DATETIME_LITERAL: <DATE_LITERAL> ((<WHITESPACE>)+ <TIME_LITERAL>)?>
| <#DATE_LITERAL: ["0"-"9"] ["0"-"9"] ["0"-"9"] ["0"-"9"] "-" ["0"-"9"] ["0"-"9"] "-" ["0"-"9"] ["0"-"9"]>
| <TIME_LITERAL: ["0"-"9"] ["0"-"9"] ":" ["0"-"9"] ["0"-"9"] (":" ["0"-"9"] ["0"-"9"])?>
| <TRUE: "true">
| <FALSE: "false">
| <NULL: "null">
}
<DEFAULT> TOKEN : {
<#WHITESPACE: " " | "\t" | "\n" | "\r" | "\f">
}
<DEFAULT> TOKEN : {
<URI: <SCHEME> (~[" ","\t","\n","\r"])+>
| <#SCHEME: "aaa:" | "aaas:" | "about:" | "acap:" | "acct:" | "cap:" | "cid:" | "coap:" | "coaps:" | "crid:" | "data:" | "dav:" | "dict:" | "dns:" | "file:" | "ftp:" | "geo:" | "go:" | "gopher:" | "h323:" | "http:" | "https:" | "iax:" | "icap:" | "im:" | "imap:" | "info:" | "ipp:" | "ipps:" | "iris:" | "iris.beep:" | "iris.xpc:" | "iris.xpcs:" | "iris.lwz:" | "jabber:" | "ldap:" | "mailto:" | "mid:" | "msrp:" | "msrps:" | "mtqp:" | "mupdate:" | "news:" | "nfs:" | "ni:" | "nih:" | "nntp:" | "opaquelocktoken:" | "pkcs11:" | "pop:" | "pres:" | "reload:" | "rtsp:" | "rtsps:" | "rtspu:" | "service:" | "session:" | "shttp:" | "sieve:" | "sip:" | "sips:" | "sms:" | "snmp:" | "soap.beep:" | "soap.beeps:" | "stun:" | "stuns:" | "tag:" | "tel:" | "telnet:" | "tftp:" | "thismessage:" | "tn3270:" | "tip:" | "turn:" | "turns:" | "tv:" | "urn:" | "vemmi:" | "ws:" | "wss:" | "xcon:" | "xcon-userid:" | "xmlrpc.beep:" | "xmlrpc.beeps:" | "xmpp:" | "z39.50r:" | "z39.50s:">
}
/* IDENTIFIER */
<DEFAULT> TOKEN : {
<IDENTIFIER: <LETTER> (<LETTER> | "_" | <DIGIT>)*>
| <#LETTER: ["$","A"-"Z","a"-"z","\u00c0"-"\u00d6","\u00d8"-"\u00f6","\u00f8"-"\u00ff","\u0100"-"\u1fff","\u3040"-"\u318f","\u3300"-"\u337f","\u3400"-"\u3d2d","\u4e00"-"\u9fff","\uf900"-"\ufaff"]>
| <#DIGIT: ["0"-"9","\u0660"-"\u0669","\u06f0"-"\u06f9","\u0966"-"\u096f","\u09e6"-"\u09ef","\u0a66"-"\u0a6f","\u0ae6"-"\u0aef","\u0b66"-"\u0b6f","\u0be7"-"\u0bef","\u0c66"-"\u0c6f","\u0ce6"-"\u0cef","\u0d66"-"\u0d6f","\u0e50"-"\u0e59","\u0ed0"-"\u0ed9","\u1040"-"\u1049"]>
}
// Catch-all tokens. Must be last.
// Any non-whitespace. Causes a parser exception, rather than a
// token manager error (with hidden line numbers).
<DEFAULT> TOKEN : {
<#UNKNOWN: (~[" ","\t","\n","\r","\f"])+>
}

NON-TERMINALS

[[Category:SiDIF]]
/*******************************************
* THE SiDIF LANGUAGE GRAMMAR STARTS HERE *
*******************************************/
/* just as list of links */
Links ::= ( Link | Value )+ <EOF>
/**
* a single link assignment
*/
Link ::= ( ( <IDENTIFIER> <IDENTIFIER> <IDENTIFIER> ) | ( <IDENTIFIER> <IS> <IDENTIFIER> <OF> <IDENTIFIER> ) | ( <IDENTIFIER> <HAS> <IDENTIFIER> <IDENTIFIER> ) )
/**
* Literal Value assignment
*/
Value ::= ( Literal <IS> <IDENTIFIER> <OF> <IDENTIFIER> )
/**
* Handle Literal values
*/
Literal ::= ( <INTEGER_LITERAL> | <FLOATING_POINT_LITERAL> | <CHARACTER_LITERAL> | <STRING_LITERAL> | <DATETIME_LITERAL> | <TIME_LITERAL> | <URI> | <TRUE> | <FALSE> | <NULL> )