Difference between revisions of "SiDIF"

From BITPlan Wiki
(36 intermediate revisions by the same user not shown)
Line 1: Line 1:
 +
== What links here ==
 +
{{WhatLinksHere}}
 +
 
== Introduction ==
 
== Introduction ==
 
The {{sidif}} is yet another format for exchanging data between computers.
 
The {{sidif}} is yet another format for exchanging data between computers.
 +
<pre>SiDIF isA DataInterchangeFormat</pre> is valid SiDIF content.
 
=== Examples ===
 
=== Examples ===
==== City Tokyo ===
+
==== City Tokyo ====
 
<pre>
 
<pre>
 
City isA Concept
 
City isA Concept
Line 9: Line 13:
 
"http://www.tokyo.jp" is webpage of Tokyo
 
"http://www.tokyo.jp" is webpage of Tokyo
 
</pre>
 
</pre>
 +
is valid SiDIF.
 +
=== SiDIF is based on Triples ===
 +
Each SiDIF statement has a three-part structure:
 +
# subject
 +
# predicate
 +
# object
 +
that is called a {{Link|target=Triple}}
 +
 +
==== Royal family ====
 +
The [http://royal-family.bitplan.com/index.php/Royal92#sidif Royal92 SiDIF] was created via a [http://royal-family.bitplan.com/index.php/GEDCOM_import GEDCOM import].
 +
Together with the [http://royal-family.bitplan.com/index.php/TopicGenerator2015/FamilyContext#FamilyContext_SiDIF_2 Model SiDIF] and the [http://royal-family.bitplan.com/index.php/TopicGenerator2015/MetaModel#MetaModel_SiDIF_2 MetaModel SiDIF] it is the basis for the content of the
 +
[http://royal-family.bitplan.com/index.php Royal Family wiki]
 +
A good entry point to browse the structure of that Wiki is [http://royal-family.bitplan.com/index.php/Main_Page#tab=Topics the Topic table]
 +
E.g. you could follow the following links:
 +
# [http://royal-family.bitplan.com/index.php/Concept:Person Person Concept derived from the Person Topic]
 +
# [http://royal-family.bitplan.com/index.php/Help:Person Help for the Person Topic]
 +
 
== SiDIF Structure ==
 
== SiDIF Structure ==
A "sentence" like  
+
=== SiDIF expressions ===
 +
A SiDIF expression like  
 
<pre>
 
<pre>
 
Tokyo isA City
 
Tokyo isA City
Line 18: Line 40:
 
* isA is the predicate
 
* isA is the predicate
 
* City is the object
 
* City is the object
==== isA ====
+
Such a set of subject / predicate / object is called a {{Link|target=Triple}}
<pre>SiDIF isA DataInterchangeFormat</pre> is a valid SiDIF content.
+
==== graphical representation ====
 +
<graphviz>
 +
digraph cityexample {
 +
  Tokyo->City [label="isA"];
 +
}
 +
</graphviz>
 +
 
 +
== SiDIF Implementations ==
 +
see
 +
* https://github.com/BITPlan/org.sidif.triplestore for the original Java
 +
* https://github.com/WolfgangFahl/py-sidif for the more recent Python version
 +
 
 +
== Comparison to other formats ==
 +
Most other Triple formats are far more complex.
 +
See e.g. https://github.com/BITPlan/org.sidif.triplestore/tree/master/src/test/resources/sidif.canonical for some example of triple formats with the same statements
 +
being expressed  in SiDIF
 +
 
 +
* https://en.wikipedia.org/wiki/Cyc
 +
e.g. explains the Cyc Statement:
 +
<pre>
 +
(#$capitalCity #$France #$Paris)
 +
</pre>
 +
with ''"Paris is the capital of France."'' which in SiDIF would be:
 +
<pre>
 +
Paris is capital of France
 +
</pre>
 +
 
 +
== Links ==
 +
* [https://github.com/BITPlan/org.sidif.triplestore/issues org.sidif.triplestore Issues ]
 +
 
 +
=== Syntax ===
 +
<HTML>
 +
<HEAD>
 +
<TITLE>BNF for SiDIF.jjt</TITLE>
 +
</HEAD>
 +
<BODY>
 +
<H1 ALIGN=CENTER>BNF for SiDIF.jjt</H1>
 +
<H2 ALIGN=CENTER>TOKENS</H2>
 +
<TABLE>
 +
<!-- Special token -->
 +
<TR>
 +
<TD>
 +
<PRE>
 +
/* WHITESPACE AND COMMENTS */</PRE>
 +
</TD>
 +
</TR>
 +
<!-- Token -->
 +
<TR>
 +
<TD>
 +
<PRE>
 +
&lt;DEFAULT&gt; SKIP : {
 +
" "
 +
| "\n"
 +
| "\r"
 +
| "\r\n"
 +
| &lt;"#" (~["\n","\r"])* ("\n" | "\r" | "\r\n")&gt;
 +
}
 +
</PRE>
 +
</TD>
 +
</TR>
 +
<!-- Special token -->
 +
<TR>
 +
<TD>
 +
<PRE>
 +
/* TOKENS for Productions */</PRE>
 +
</TD>
 +
</TR>
 +
<!-- Token -->
 +
<TR>
 +
<TD>
 +
<PRE>
 +
&lt;DEFAULT&gt; TOKEN : {
 +
&lt;IS: "is"&gt;
 +
| &lt;OF: "of"&gt;
 +
| &lt;HAS: "has"&gt;
 +
}
 +
</PRE>
 +
</TD>
 +
</TR>
 +
<!-- Special token -->
 +
<TR>
 +
<TD>
 +
<PRE>
 +
/* Literals */</PRE>
 +
</TD>
 +
</TR>
 +
<!-- Token -->
 +
<TR>
 +
<TD>
 +
<PRE>
 +
&lt;DEFAULT&gt; TOKEN : {
 +
&lt;INTEGER_LITERAL: &lt;DECIMAL_LITERAL&gt; (["l","L"])? | &lt;HEX_LITERAL&gt; (["l","L"])? | &lt;OCTAL_LITERAL&gt; (["l","L"])?&gt;
 +
| &lt;#DECIMAL_LITERAL: ["1"-"9"] (["0"-"9"])*&gt;
 +
| &lt;#HEX_LITERAL: "0" ["x","X"] (["0"-"9","a"-"f","A"-"F"])+&gt;
 +
| &lt;#OCTAL_LITERAL: "0" (["0"-"7"])*&gt;
 +
| &lt;FLOATING_POINT_LITERAL: (["0"-"9"])+ "." (["0"-"9"])* (&lt;EXPONENT&gt;)? (["f","F","d","D"])? | "." (["0"-"9"])+ (&lt;EXPONENT&gt;)? (["f","F","d","D"])? | (["0"-"9"])+ &lt;EXPONENT&gt; (["f","F","d","D"])? | (["0"-"9"])+ (&lt;EXPONENT&gt;)? ["f","F","d","D"]&gt;
 +
| &lt;#EXPONENT: ["e","E"] (["+","-"])? (["0"-"9"])+&gt;
 +
| &lt;CHARACTER_LITERAL: "\'" (~["\'","\\","\n","\r"] | "\\" (["n","t","b","r","f","\\","\'","\""] | ["0"-"7"] (["0"-"7"])? | ["0"-"3"] ["0"-"7"] ["0"-"7"])) "\'"&gt;
 +
| &lt;STRING_LITERAL: "\"" (~["\"","\\"] | "\\" (["n","t","b","r","f","\\","\'","\""] | ["0"-"7"] (["0"-"7"])? | ["0"-"3"] ["0"-"7"] ["0"-"7"]))* "\""&gt;
 +
| &lt;DATETIME_LITERAL: &lt;DATE_LITERAL&gt; ((&lt;WHITESPACE&gt;)+ &lt;TIME_LITERAL&gt;)?&gt;
 +
| &lt;#DATE_LITERAL: ["0"-"9"] ["0"-"9"] ["0"-"9"] ["0"-"9"] "-" ["0"-"9"] ["0"-"9"] "-" ["0"-"9"] ["0"-"9"]&gt;
 +
| &lt;TIME_LITERAL: ["0"-"9"] ["0"-"9"] ":" ["0"-"9"] ["0"-"9"] (":" ["0"-"9"] ["0"-"9"])?&gt;
 +
| &lt;TRUE: "true"&gt;
 +
| &lt;FALSE: "false"&gt;
 +
| &lt;NULL: "null"&gt;
 +
}
 +
</PRE>
 +
</TD>
 +
</TR>
 +
<!-- Token -->
 +
<TR>
 +
<TD>
 +
<PRE>
 +
&lt;DEFAULT&gt; TOKEN : {
 +
&lt;#WHITESPACE: " " | "\t" | "\n" | "\r" | "\f"&gt;
 +
}
 +
</PRE>
 +
</TD>
 +
</TR>
 +
<!-- Token -->
 +
<TR>
 +
<TD>
 +
<PRE>
 +
&lt;DEFAULT&gt; TOKEN : {
 +
&lt;URI: &lt;SCHEME&gt; (~[" ","\t","\n","\r"])+&gt;
 +
| &lt;#SCHEME: "aaa:" | "aaas:" | "about:" | "acap:" | "acct:" | "cap:" | "cid:" | "coap:" | "coaps:" | "crid:" | "data:" | "dav:" | "dict:" | "dns:" | "file:" | "ftp:" | "geo:" | "go:" | "gopher:" | "h323:" | "http:" | "https:" | "iax:" | "icap:" | "im:" | "imap:" | "info:" | "ipp:" | "ipps:" | "iris:" | "iris.beep:" | "iris.xpc:" | "iris.xpcs:" | "iris.lwz:" | "jabber:" | "ldap:" | "mailto:" | "mid:" | "msrp:" | "msrps:" | "mtqp:" | "mupdate:" | "news:" | "nfs:" | "ni:" | "nih:" | "nntp:" | "opaquelocktoken:" | "pkcs11:" | "pop:" | "pres:" | "reload:" | "rtsp:" | "rtsps:" | "rtspu:" | "service:" | "session:" | "shttp:" | "sieve:" | "sip:" | "sips:" | "sms:" | "snmp:" | "soap.beep:" | "soap.beeps:" | "stun:" | "stuns:" | "tag:" | "tel:" | "telnet:" | "tftp:" | "thismessage:" | "tn3270:" | "tip:" | "turn:" | "turns:" | "tv:" | "urn:" | "vemmi:" | "ws:" | "wss:" | "xcon:" | "xcon-userid:" | "xmlrpc.beep:" | "xmlrpc.beeps:" | "xmpp:" | "z39.50r:" | "z39.50s:"&gt;
 +
}
 +
</PRE>
 +
</TD>
 +
</TR>
 +
<!-- Special token -->
 +
<TR>
 +
<TD>
 +
<PRE>
 +
/* IDENTIFIER */</PRE>
 +
</TD>
 +
</TR>
 +
<!-- Token -->
 +
<TR>
 +
<TD>
 +
<PRE>
 +
&lt;DEFAULT&gt; TOKEN : {
 +
&lt;IDENTIFIER: &lt;LETTER&gt; (&lt;LETTER&gt; | "_" | &lt;DIGIT&gt;)*&gt;
 +
| &lt;#LETTER: ["$","A"-"Z","a"-"z","\u00c0"-"\u00d6","\u00d8"-"\u00f6","\u00f8"-"\u00ff","\u0100"-"\u1fff","\u3040"-"\u318f","\u3300"-"\u337f","\u3400"-"\u3d2d","\u4e00"-"\u9fff","\uf900"-"\ufaff"]&gt;
 +
| &lt;#DIGIT: ["0"-"9","\u0660"-"\u0669","\u06f0"-"\u06f9","\u0966"-"\u096f","\u09e6"-"\u09ef","\u0a66"-"\u0a6f","\u0ae6"-"\u0aef","\u0b66"-"\u0b6f","\u0be7"-"\u0bef","\u0c66"-"\u0c6f","\u0ce6"-"\u0cef","\u0d66"-"\u0d6f","\u0e50"-"\u0e59","\u0ed0"-"\u0ed9","\u1040"-"\u1049"]&gt;
 +
}
 +
</PRE>
 +
</TD>
 +
</TR>
 +
<!-- Special token -->
 +
<TR>
 +
<TD>
 +
<PRE>
 +
// Catch-all tokens. Must be last.
 +
// Any non-whitespace. Causes a parser exception, rather than a
 +
// token manager error (with hidden line numbers).
 +
</PRE>
 +
</TD>
 +
</TR>
 +
<!-- Token -->
 +
<TR>
 +
<TD>
 +
<PRE>
 +
&lt;DEFAULT&gt; TOKEN : {
 +
&lt;#UNKNOWN: (~[" ","\t","\n","\r","\f"])+&gt;
 +
}
 +
</PRE>
 +
</TD>
 +
</TR>
 +
</TABLE>
 +
<H2 ALIGN=CENTER>NON-TERMINALS</H2>
 +
<TABLE>
 +
<!-- Special token -->
 +
<TR>
 +
[[Category:SiDIF]]
 +
 
 +
<TD>
 +
<PRE>
 +
/*******************************************
 +
* THE SiDIF LANGUAGE GRAMMAR STARTS HERE *
 +
*******************************************/
 +
/* just a list of links */</PRE>
 +
</TD>
 +
</TR>
 +
<TR>
 +
<TD ALIGN=RIGHT VALIGN=BASELINE><A NAME="prod1">Links</A></TD>
 +
<TD ALIGN=CENTER VALIGN=BASELINE>::=</TD>
 +
<TD ALIGN=LEFT VALIGN=BASELINE>( <A HREF="#prod2">Link</A> | <A HREF="#prod3">Value</A> )+ &lt;EOF&gt;</TD>
 +
</TR>
 +
<!-- Special token -->
 +
<TR>
 +
<TD>
 +
<PRE>
 +
/**
 +
* a single link assignment
 +
*/</PRE>
 +
</TD>
 +
</TR>
 +
<TR>
 +
<TD ALIGN=RIGHT VALIGN=BASELINE><A NAME="prod2">Link</A></TD>
 +
<TD ALIGN=CENTER VALIGN=BASELINE>::=</TD>
 +
<TD ALIGN=LEFT VALIGN=BASELINE>( ( &lt;IDENTIFIER&gt; &lt;IDENTIFIER&gt; &lt;IDENTIFIER&gt; ) | ( &lt;IDENTIFIER&gt; &lt;IS&gt; &lt;IDENTIFIER&gt; &lt;OF&gt; &lt;IDENTIFIER&gt; ) | ( &lt;IDENTIFIER&gt; &lt;HAS&gt; &lt;IDENTIFIER&gt; &lt;IDENTIFIER&gt; ) )</TD>
 +
</TR>
 +
<!-- Special token -->
 +
<TR>
 +
<TD>
 +
<PRE>
 +
/**
 +
* Literal Value assignment
 +
*/</PRE>
 +
</TD>
 +
</TR>
 +
<TR>
 +
<TD ALIGN=RIGHT VALIGN=BASELINE><A NAME="prod3">Value</A></TD>
 +
<TD ALIGN=CENTER VALIGN=BASELINE>::=</TD>
 +
<TD ALIGN=LEFT VALIGN=BASELINE>( <A HREF="#prod4">Literal</A> &lt;IS&gt; &lt;IDENTIFIER&gt; &lt;OF&gt; &lt;IDENTIFIER&gt; )</TD>
 +
</TR>
 +
<!-- Special token -->
 +
<TR>
 +
<TD>
 +
<PRE>
 +
/**
 +
* Handle Literal values
 +
*/</PRE>
 +
</TD>
 +
</TR>
 +
<TR>
 +
<TD ALIGN=RIGHT VALIGN=BASELINE><A NAME="prod4">Literal</A></TD>
 +
<TD ALIGN=CENTER VALIGN=BASELINE>::=</TD>
 +
<TD ALIGN=LEFT VALIGN=BASELINE>( &lt;INTEGER_LITERAL&gt; | &lt;FLOATING_POINT_LITERAL&gt; | &lt;CHARACTER_LITERAL&gt; | &lt;STRING_LITERAL&gt; | &lt;DATETIME_LITERAL&gt; | &lt;TIME_LITERAL&gt; | &lt;URI&gt; | &lt;TRUE&gt; | &lt;FALSE&gt; | &lt;NULL&gt; )</TD>
 +
</TR>
 +
</TABLE>
 +
</BODY>
 +
</HTML>
 +
[[Category:frontend]]
 +
[[Category:SiGNaL]]

Revision as of 20:48, 20 February 2023

Introduction

The Simple Data Interchange Format (SiDIF) is yet another format for exchanging data between computers.

SiDIF isA DataInterchangeFormat

is valid SiDIF content.

Examples

City Tokyo

City isA Concept
Tokyo isA City
webpage addsTo City
"http://www.tokyo.jp" is webpage of Tokyo

is valid SiDIF.
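
To make the example concrete, the following minimal sketch reads the four statements above into triples. It is not the parser from either SiDIF implementation: the names Triple and parse_sidif are hypothetical, the sketch assumes one statement per line (the grammar itself only skips whitespace), it handles only the plain three-word form and the "value is property of subject" form, and the mapping of the latter onto subject / predicate / object is an assumption.

```python
import re
from typing import NamedTuple


class Triple(NamedTuple):
    """Hypothetical container for one SiDIF statement."""
    subject: str
    predicate: str
    obj: str


# A token is either a double-quoted string or a run of non-whitespace.
TOKEN = re.compile(r'"[^"]*"|\S+')


def parse_sidif(text: str) -> list[Triple]:
    """Parse line-oriented SiDIF text into triples (simplified sketch)."""
    triples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip blank lines and comments
            continue
        tokens = TOKEN.findall(line)
        if len(tokens) == 3:
            # subject predicate object
            triples.append(Triple(*tokens))
        elif len(tokens) == 5 and tokens[1] == "is" and tokens[3] == "of":
            # value is property of subject (assumed mapping)
            value, _, prop, _, subject = tokens
            triples.append(Triple(subject, prop, value.strip('"')))
        else:
            raise ValueError(f"unsupported statement: {line}")
    return triples


example = '''City isA Concept
Tokyo isA City
webpage addsTo City
"http://www.tokyo.jp" is webpage of Tokyo'''

triples = parse_sidif(example)
for triple in triples:
    print(triple)
```

The last statement ends up as a triple with subject Tokyo, predicate webpage and the URL as its object.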

SiDIF is based on Triples

Each SiDIF statement has a three-part structure:

  1. subject
  2. predicate
  3. object

Such a structure is called a Triple.

Royal family

The Royal92 SiDIF was created via a GEDCOM import. Together with the Model SiDIF and the MetaModel SiDIF, it is the basis for the content of the Royal Family wiki. A good entry point for browsing the structure of that wiki is the Topic table. For example, you could follow these links:

  1. Person Concept derived from the Person Topic
  2. Help for the Person Topic

SiDIF Structure

SiDIF expressions

A SiDIF expression like

Tokyo isA City

consists of three parts:

  • Tokyo is the subject
  • isA is the predicate
  • City is the object

Such a set of subject / predicate / object is called a Triple.

graphical representation

[Graph: a node Tokyo connected to a node City by an edge labeled isA]
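
A triple maps naturally onto a directed graph: subject and object become nodes and the predicate becomes the edge label. The sketch below emits Graphviz DOT text in the same shape as the digraph used in the wiki source of this page. The function name triples_to_dot is hypothetical, and node names that are not plain identifiers would need quoting, which this sketch does not do.

```python
def triples_to_dot(name, triples):
    """Render (subject, predicate, object) tuples as a Graphviz digraph."""
    lines = [f"digraph {name} {{"]
    for subject, predicate, obj in triples:
        # One edge per triple, labeled with the predicate.
        lines.append(f'  {subject}->{obj} [label="{predicate}"];')
    lines.append("}")
    return "\n".join(lines)


dot = triples_to_dot("cityexample", [("Tokyo", "isA", "City")])
print(dot)
```

For the single triple Tokyo isA City this prints a digraph named cityexample with one edge from Tokyo to City.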

SiDIF Implementations

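See:

  • https://github.com/BITPlan/org.sidif.triplestore for the original Java implementation
  • https://github.com/WolfgangFahl/py-sidif for the more recent Python version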

Comparison to other formats

Most other Triple formats are far more complex. See e.g. https://github.com/BITPlan/org.sidif.triplestore/tree/master/src/test/resources/sidif.canonical for some examples of triple formats, with the same statements also expressed in SiDIF.

The Wikipedia article on Cyc (https://en.wikipedia.org/wiki/Cyc), for example, explains the Cyc statement:

 (#$capitalCity #$France #$Paris)

with "Paris is the capital of France." which in SiDIF would be:

Paris is capital of France
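
The correspondence between the two notations can be sketched mechanically. The snippet below is only an illustration under assumptions: it handles binary Cyc statements of the form (#$predicate #$arg1 #$arg2), treats arg1 as the subject and arg2 as the value, and uses a hypothetical function name cyc_to_sidif. Cyc predicate names are kept unless a rename mapping is supplied, since a name like capitalCity does not automatically become capital.

```python
import re

# (#$predicate #$arg1 #$arg2), binary Cyc statements only
CYC_STATEMENT = re.compile(r"\(\s*#\$(\w+)\s+#\$(\w+)\s+#\$(\w+)\s*\)")


def cyc_to_sidif(statement, rename=None):
    """Rewrite a binary Cyc statement as 'value is property of subject'."""
    match = CYC_STATEMENT.fullmatch(statement.strip())
    if match is None:
        raise ValueError(f"not a binary Cyc statement: {statement}")
    predicate, subject, value = match.groups()
    # Optionally map the Cyc predicate name to a SiDIF property name.
    predicate = (rename or {}).get(predicate, predicate)
    return f"{value} is {predicate} of {subject}"


print(cyc_to_sidif("(#$capitalCity #$France #$Paris)"))
print(cyc_to_sidif("(#$capitalCity #$France #$Paris)",
                   rename={"capitalCity": "capital"}))
```

The second call reproduces the SiDIF statement shown above; the first keeps the Cyc predicate name unchanged.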

Links

  • org.sidif.triplestore Issues: https://github.com/BITPlan/org.sidif.triplestore/issues

Syntax

BNF for SiDIF.jjt

TOKENS

/* WHITESPACE AND COMMENTS */
<DEFAULT> SKIP : {
" "
| "\n"
| "\r"
| "\r\n"
| <"#" (~["\n","\r"])* ("\n" | "\r" | "\r\n")>
}
/* TOKENS for Productions */
<DEFAULT> TOKEN : {
<IS: "is">
| <OF: "of">
| <HAS: "has">
}
/* Literals */
<DEFAULT> TOKEN : {
<INTEGER_LITERAL: <DECIMAL_LITERAL> (["l","L"])? | <HEX_LITERAL> (["l","L"])? | <OCTAL_LITERAL> (["l","L"])?>
| <#DECIMAL_LITERAL: ["1"-"9"] (["0"-"9"])*>
| <#HEX_LITERAL: "0" ["x","X"] (["0"-"9","a"-"f","A"-"F"])+>
| <#OCTAL_LITERAL: "0" (["0"-"7"])*>
| <FLOATING_POINT_LITERAL: (["0"-"9"])+ "." (["0"-"9"])* (<EXPONENT>)? (["f","F","d","D"])? | "." (["0"-"9"])+ (<EXPONENT>)? (["f","F","d","D"])? | (["0"-"9"])+ <EXPONENT> (["f","F","d","D"])? | (["0"-"9"])+ (<EXPONENT>)? ["f","F","d","D"]>
| <#EXPONENT: ["e","E"] (["+","-"])? (["0"-"9"])+>
| <CHARACTER_LITERAL: "\'" (~["\'","\\","\n","\r"] | "\\" (["n","t","b","r","f","\\","\'","\""] | ["0"-"7"] (["0"-"7"])? | ["0"-"3"] ["0"-"7"] ["0"-"7"])) "\'">
| <STRING_LITERAL: "\"" (~["\"","\\"] | "\\" (["n","t","b","r","f","\\","\'","\""] | ["0"-"7"] (["0"-"7"])? | ["0"-"3"] ["0"-"7"] ["0"-"7"]))* "\"">
| <DATETIME_LITERAL: <DATE_LITERAL> ((<WHITESPACE>)+ <TIME_LITERAL>)?>
| <#DATE_LITERAL: ["0"-"9"] ["0"-"9"] ["0"-"9"] ["0"-"9"] "-" ["0"-"9"] ["0"-"9"] "-" ["0"-"9"] ["0"-"9"]>
| <TIME_LITERAL: ["0"-"9"] ["0"-"9"] ":" ["0"-"9"] ["0"-"9"] (":" ["0"-"9"] ["0"-"9"])?>
| <TRUE: "true">
| <FALSE: "false">
| <NULL: "null">
}
<DEFAULT> TOKEN : {
<#WHITESPACE: " " | "\t" | "\n" | "\r" | "\f">
}
<DEFAULT> TOKEN : {
<URI: <SCHEME> (~[" ","\t","\n","\r"])+>
| <#SCHEME: "aaa:" | "aaas:" | "about:" | "acap:" | "acct:" | "cap:" | "cid:" | "coap:" | "coaps:" | "crid:" | "data:" | "dav:" | "dict:" | "dns:" | "file:" | "ftp:" | "geo:" | "go:" | "gopher:" | "h323:" | "http:" | "https:" | "iax:" | "icap:" | "im:" | "imap:" | "info:" | "ipp:" | "ipps:" | "iris:" | "iris.beep:" | "iris.xpc:" | "iris.xpcs:" | "iris.lwz:" | "jabber:" | "ldap:" | "mailto:" | "mid:" | "msrp:" | "msrps:" | "mtqp:" | "mupdate:" | "news:" | "nfs:" | "ni:" | "nih:" | "nntp:" | "opaquelocktoken:" | "pkcs11:" | "pop:" | "pres:" | "reload:" | "rtsp:" | "rtsps:" | "rtspu:" | "service:" | "session:" | "shttp:" | "sieve:" | "sip:" | "sips:" | "sms:" | "snmp:" | "soap.beep:" | "soap.beeps:" | "stun:" | "stuns:" | "tag:" | "tel:" | "telnet:" | "tftp:" | "thismessage:" | "tn3270:" | "tip:" | "turn:" | "turns:" | "tv:" | "urn:" | "vemmi:" | "ws:" | "wss:" | "xcon:" | "xcon-userid:" | "xmlrpc.beep:" | "xmlrpc.beeps:" | "xmpp:" | "z39.50r:" | "z39.50s:">
}
/* IDENTIFIER */
<DEFAULT> TOKEN : {
<IDENTIFIER: <LETTER> (<LETTER> | "_" | <DIGIT>)*>
| <#LETTER: ["$","A"-"Z","a"-"z","\u00c0"-"\u00d6","\u00d8"-"\u00f6","\u00f8"-"\u00ff","\u0100"-"\u1fff","\u3040"-"\u318f","\u3300"-"\u337f","\u3400"-"\u3d2d","\u4e00"-"\u9fff","\uf900"-"\ufaff"]>
| <#DIGIT: ["0"-"9","\u0660"-"\u0669","\u06f0"-"\u06f9","\u0966"-"\u096f","\u09e6"-"\u09ef","\u0a66"-"\u0a6f","\u0ae6"-"\u0aef","\u0b66"-"\u0b6f","\u0be7"-"\u0bef","\u0c66"-"\u0c6f","\u0ce6"-"\u0cef","\u0d66"-"\u0d6f","\u0e50"-"\u0e59","\u0ed0"-"\u0ed9","\u1040"-"\u1049"]>
}
// Catch-all tokens. Must be last.
// Any non-whitespace. Causes a parser exception, rather than a
// token manager error (with hidden line numbers).
<DEFAULT> TOKEN : {
<#UNKNOWN: (~[" ","\t","\n","\r","\f"])+>
}

NON-TERMINALS

/*******************************************
* THE SiDIF LANGUAGE GRAMMAR STARTS HERE *
*******************************************/
/* just a list of links */
Links ::= ( Link | Value )+ <EOF>
/**
* a single link assignment
*/
Link ::= ( ( <IDENTIFIER> <IDENTIFIER> <IDENTIFIER> ) | ( <IDENTIFIER> <IS> <IDENTIFIER> <OF> <IDENTIFIER> ) | ( <IDENTIFIER> <HAS> <IDENTIFIER> <IDENTIFIER> ) )
/**
* Literal Value assignment
*/
Value ::= ( Literal <IS> <IDENTIFIER> <OF> <IDENTIFIER> )
/**
* Handle Literal values
*/
Literal ::= ( <INTEGER_LITERAL> | <FLOATING_POINT_LITERAL> | <CHARACTER_LITERAL> | <STRING_LITERAL> | <DATETIME_LITERAL> | <TIME_LITERAL> | <URI> | <TRUE> | <FALSE> | <NULL> )
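
The Literal production can be approximated with regular expressions to see which token kind a given piece of text would fall under. This is a simplified sketch, not the generated JavaCC token manager: the function name classify_literal is hypothetical, the URI scheme list is abbreviated to a handful of the schemes listed above, and the escape handling in string and character literals is looser than in the token definitions.

```python
import re

EXP = r"[eE][+-]?[0-9]+"

# (token kind, pattern) pairs, loosely following the token definitions above.
LITERAL_PATTERNS = [
    ("TRUE", r"true"),
    ("FALSE", r"false"),
    ("NULL", r"null"),
    ("DATETIME_LITERAL",
     r"[0-9]{4}-[0-9]{2}-[0-9]{2}(?:[ \t\n\r\f]+[0-9]{2}:[0-9]{2}(?::[0-9]{2})?)?"),
    ("TIME_LITERAL", r"[0-9]{2}:[0-9]{2}(?::[0-9]{2})?"),
    ("INTEGER_LITERAL", r"(?:[1-9][0-9]*|0[xX][0-9a-fA-F]+|0[0-7]*)[lL]?"),
    ("FLOATING_POINT_LITERAL",
     rf"(?:[0-9]+\.[0-9]*|\.[0-9]+)(?:{EXP})?[fFdD]?"
     rf"|[0-9]+{EXP}[fFdD]?"
     rf"|[0-9]+(?:{EXP})?[fFdD]"),
    ("CHARACTER_LITERAL", r"'(?:[^'\\\n\r]|\\.)'"),   # escapes simplified
    ("STRING_LITERAL", r'"(?:[^"\\]|\\.)*"'),          # escapes simplified
    ("URI", r"(?:https?|ftp|file|mailto|urn):\S+"),    # abbreviated scheme list
]


def classify_literal(text):
    """Return the first token kind whose pattern matches all of text, else None."""
    for kind, pattern in LITERAL_PATTERNS:
        if re.fullmatch(pattern, text):
            return kind
    return None


for sample in ["42", "0x1F", "3.14", '"Tokyo"', "2023-02-20 20:48",
               "20:48", "http://www.tokyo.jp", "true", "Tokyo"]:
    print(sample, "->", classify_literal(sample))
```

Plain identifiers such as Tokyo are not literals, so the sketch returns None for them. Note that a quoted URL, as in the Tokyo example, is classified as a string literal rather than a URI.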

References

  1. A. van Renssen. "Gellish: an information representation language, knowledge base and ontology", pp. 215-228. doi: 10.1109/siit.2003.1251209