Difference between revisions of "Functional Dependencies"

From BITPlan Wiki
Jump to navigation Jump to search
 
(42 intermediate revisions by the same user not shown)
Line 5: Line 5:
 
<math>AB\to CD</math>
 
<math>AB\to CD</math>
  
= Example =
+
= Simple hierarchical Example =
 
== 9 single entries for 2022 sorted by social media followers ==
 
== 9 single entries for 2022 sorted by social media followers ==
  
=== query ===
+
=== SPARQL query ===
 +
{{HideShow|SPARQLquerySingles2022|
 
<source lang='sparql'>
 
<source lang='sparql'>
 
# WF 2022-06-08
 
# WF 2022-06-08
Line 58: Line 59:
  
 
</source>
 
</source>
 
+
}}
 
[https://query.wikidata.org/#%23%20WF%202022-06-08%0A%23%20Singles%20published%20in%202022%0ASELECT%0A%3Fsingle%20%3Flanguage%20%3FcollectionId%20%3Fcollection%20%3FcollectionType%20%3FperformerId%20%3Fperformer%20%3FfollowerCount%20%3FyoutubeVideoId%20%20%28year%28%3FpublicationDate%29%20as%20%3FpublicationYear%29%0AWHERE%0A%7B%0A%20%20%23%20instanceof%0A%20%20%3FsingleId%20wdt%3AP31%20wd%3AQ134556.%20%23%20Must%20be%20of%20a%20single%0A%20%20%3FsingleId%20rdfs%3Alabel%20%3Fsingle.%0A%20%20FILTER%28LANG%28%3Fsingle%29%3D%27en%27%29%0A%0A%20%20%23https%3A//www.wikidata.org/wiki/Property%3AP361%20-%20part%20of%20/%20ist%20Teil%20von%0A%20%20%3FsingleId%20wdt%3AP361%20%3FcollectionId.%0A%20%20%3FcollectionId%20rdfs%3Alabel%20%3Fcollection.%0A%20%20FILTER%28LANG%28%3Fcollection%29%3D%27en%27%29%0A%0A%20%20%23%20https%3A//www.wikidata.org/wiki/Property%3AP407%20language/%20Sprache%20des%20Werkes%0A%20%20%3FsingleId%20wdt%3AP407%20%3FlanguageId.%0A%20%20%3FlanguageId%20rdfs%3Alabel%20%3Flanguage.%0A%20%20FILTER%28LANG%28%3Flanguage%29%3D%27en%27%29%0A%0A%20%20%23%20https%3A//www.wikidata.org/wiki/Property%3AP1651%20YouTube-Video-Kennung%0A%20%20%3FsingleId%20wdt%3AP1651%20%3FyoutubeVideoId.%0A%0A%20%20%3FcollectionId%20wdt%3AP31%20%3FcollectionTypeId.%0A%20%20%3FcollectionTypeId%20rdfs%3Alabel%20%3FcollectionType.%0A%20%20FILTER%28LANG%28%3FcollectionType%29%3D%27en%27%29.%0A%0A%20%20%23%20https%3A//www.wikidata.org/wiki/Property%3AP175%20Performer%20/Interpret%0A%20%20%3FsingleId%20wdt%3AP175%20%3FperformerId.%0A%20%20%3FperformerId%20rdfs%3Alabel%20%3Fperformer%0A%20%20FILTER%28LANG%28%3Fperformer%29%3D%27en%27%29%0A%0A%20%20%23%20https%3A//www.wikidata.org/wiki/Property%3AP8687%20followers%20/%20Follower%0A%20%20%23%7B%0A%20%20%20%20%20%23SELECT%20%28max%28%3FfollowerCount%29%20as%20%3Ffollower%29%0A%20%20%20%20%20%23WHERE%20%7B%0A%20%20%20%20%20%3FperformerId%20wdt%3AP8687%20%3FfollowerCount.%0A%20%20%20%20%20%23%7D%0A%20%20%20%20%20%23GROUP%20BY%20%3FperformerId%0A%20%20%23%7D%0A%20%20%23https%3A//www.wikidata.org/wiki/Property%3AP577%20Publication%20date%20/%20Ver%C3%B6ffentlichungsdatum%0A%20%20%3FsingleId%20wdt%3AP577%20%3FpublicationDate.%0A%20%20FILTER%28year%28%3FpublicationDate%29%20%3D%202022%29.%0A%7D%20ORDER%20BY%20%20DESC%28%3FfollowerCount%29%0ALIMIT%209%0A try it!]
 
[https://query.wikidata.org/#%23%20WF%202022-06-08%0A%23%20Singles%20published%20in%202022%0ASELECT%0A%3Fsingle%20%3Flanguage%20%3FcollectionId%20%3Fcollection%20%3FcollectionType%20%3FperformerId%20%3Fperformer%20%3FfollowerCount%20%3FyoutubeVideoId%20%20%28year%28%3FpublicationDate%29%20as%20%3FpublicationYear%29%0AWHERE%0A%7B%0A%20%20%23%20instanceof%0A%20%20%3FsingleId%20wdt%3AP31%20wd%3AQ134556.%20%23%20Must%20be%20of%20a%20single%0A%20%20%3FsingleId%20rdfs%3Alabel%20%3Fsingle.%0A%20%20FILTER%28LANG%28%3Fsingle%29%3D%27en%27%29%0A%0A%20%20%23https%3A//www.wikidata.org/wiki/Property%3AP361%20-%20part%20of%20/%20ist%20Teil%20von%0A%20%20%3FsingleId%20wdt%3AP361%20%3FcollectionId.%0A%20%20%3FcollectionId%20rdfs%3Alabel%20%3Fcollection.%0A%20%20FILTER%28LANG%28%3Fcollection%29%3D%27en%27%29%0A%0A%20%20%23%20https%3A//www.wikidata.org/wiki/Property%3AP407%20language/%20Sprache%20des%20Werkes%0A%20%20%3FsingleId%20wdt%3AP407%20%3FlanguageId.%0A%20%20%3FlanguageId%20rdfs%3Alabel%20%3Flanguage.%0A%20%20FILTER%28LANG%28%3Flanguage%29%3D%27en%27%29%0A%0A%20%20%23%20https%3A//www.wikidata.org/wiki/Property%3AP1651%20YouTube-Video-Kennung%0A%20%20%3FsingleId%20wdt%3AP1651%20%3FyoutubeVideoId.%0A%0A%20%20%3FcollectionId%20wdt%3AP31%20%3FcollectionTypeId.%0A%20%20%3FcollectionTypeId%20rdfs%3Alabel%20%3FcollectionType.%0A%20%20FILTER%28LANG%28%3FcollectionType%29%3D%27en%27%29.%0A%0A%20%20%23%20https%3A//www.wikidata.org/wiki/Property%3AP175%20Performer%20/Interpret%0A%20%20%3FsingleId%20wdt%3AP175%20%3FperformerId.%0A%20%20%3FperformerId%20rdfs%3Alabel%20%3Fperformer%0A%20%20FILTER%28LANG%28%3Fperformer%29%3D%27en%27%29%0A%0A%20%20%23%20https%3A//www.wikidata.org/wiki/Property%3AP8687%20followers%20/%20Follower%0A%20%20%23%7B%0A%20%20%20%20%20%23SELECT%20%28max%28%3FfollowerCount%29%20as%20%3Ffollower%29%0A%20%20%20%20%20%23WHERE%20%7B%0A%20%20%20%20%20%3FperformerId%20wdt%3AP8687%20%3FfollowerCount.%0A%20%20%20%20%20%23%7D%0A%20%20%20%20%20%23GROUP%20BY%20%3FperformerId%0A%20%20%23%7D%0A%20%20%23https%3A//www.wikidata.org/wiki/Property%3AP577%20Publication%20date%20/%20Ver%C3%B6ffentlichungsdatum%0A%20%20%3FsingleId%20wdt%3AP577%20%3FpublicationDate.%0A%20%20FILTER%28year%28%3FpublicationDate%29%20%3D%202022%29.%0A%7D%20ORDER%20BY%20%20DESC%28%3FfollowerCount%29%0ALIMIT%209%0A try it!]
  
=== result ===
+
=== result as of 2022-06-13 ===
 
{| class="wikitable" style="text-align: left;"
 
{| class="wikitable" style="text-align: left;"
 
|+ <!-- caption -->
 
|+ <!-- caption -->
Line 114: Line 115:
 
* <math>F\to GH</math>
 
* <math>F\to GH</math>
 
<math>R={{\{A,B,C,D,E,F,G,H,I,J\}},{A\to BCFIJ,C\to DE,F\to GH}}</math>
 
<math>R={{\{A,B,C,D,E,F,G,H,I,J\}},{A\to BCFIJ,C\to DE,F\to GH}}</math>
 +
=== functional dependencies diagram ===
 +
The following diagram shows the functional dependencies of this example. The nodes represent the attributes and the edges the dependencies.
 +
<graphviz format='png'>
 +
#generated by /Users/wf/Documents/pyworkspace/dbis-functional-dependencies/functional_dependencies/BCNF.py on 2022-06-14T09:48:59.758052
 +
digraph functionalDependencySet{
 +
  A [shape=box label="A≡single"]
 +
  B [shape=box label="B≡language"]
 +
  C [shape=box label="C≡collectionId"]
 +
  D [shape=box label="D≡collection"]
 +
  E [shape=box label="E≡collectionType"]
 +
  F [shape=box label="F≡performerId"]
 +
  G [shape=box label="G≡performer"]
 +
  H [shape=box label="H≡followerCount"]
 +
  I [shape=box label="I≡youtubeVideoId"]
 +
  J [shape=box label="J≡publicationYear"]
 +
  A->B
 +
  A->C
 +
  A->F
 +
  A->I
 +
  A->J
 +
  C->D
 +
  C->E
 +
  F->G
 +
  F->H
 +
}</graphviz>
 +
 +
== Attribute Closures ==
 +
The attribute closure is the set of all attributes that are "reachable/determinable" from a given attribute node. In this example the results are quite obvious since the
 +
functional dependencies form a tree and the reachable nodes are all nodes of the subtree of a given attribute.
 +
 +
{| class="wikitable" style="text-align: left;"
 +
|+ <!-- caption -->
 +
|-
 +
! attribute  !! closure
 +
|-
 +
| A          || ABCDEFGHIJ
 +
|-
 +
| B          || B
 +
|-
 +
| C          || CDE
 +
|-
 +
| D          || D
 +
|-
 +
| E          || E
 +
|-
 +
| F          || FGH
 +
|-
 +
| G          || G
 +
|-
 +
| H          || H
 +
|-
 +
| I          || I
 +
|-
 +
| J          || J
 +
|}
 +
 +
== Keys for the example ==
 +
There is only a single [https://en.wikipedia.org/wiki/Candidate_key candidate key]: A≡single
 +
 +
It is the only attribute that has the full set of attributes as it's closure.
 +
A≡single is also a [https://en.wikipedia.org/wiki/Super_key super key] and [https://en.wikipedia.org/wiki/Primary_key primary key]
 +
 +
With a slightly modified query we'll see that wikidata uses a [https://en.wikipedia.org/wiki/Surrogate_key surrogate key] as the primary key for all items.
 +
Querying Wikidata's graph data more often than not leads to results that need to be normalized to be useful in the context of relational databases.
 +
 +
Extending the above query with a column for the  wikidata Id of the single shows how the surrogate key is related to the single entries.
 +
=== 9 single entries for 2022 sorted by social media followers with wikidata Id link ===
 +
 +
==== query ====
 +
{{HideShow|SPARQL_Query_With_WikidataId|
 +
<source lang='sparql'>
 +
# WF 2022-06-08
 +
# Singles published in 2022
 +
SELECT
 +
?singleId ?single ?language ?collectionId ?collection ?collectionType ?performerId ?performer ?followerCount ?youtubeVideoId  (year(?publicationDate) as ?publicationYear)
 +
WHERE
 +
{
 +
  # instanceof
 +
  ?singleId wdt:P31 wd:Q134556. # Must be of a single
 +
  ?singleId rdfs:label ?single.
 +
  FILTER(LANG(?single)='en')
 +
 +
  #https://www.wikidata.org/wiki/Property:P361 - part of / ist Teil von
 +
  ?singleId wdt:P361 ?collectionId.
 +
  ?collectionId rdfs:label ?collection.
 +
  FILTER(LANG(?collection)='en')
 +
 +
  # https://www.wikidata.org/wiki/Property:P407 language/ Sprache des Werkes
 +
  ?singleId wdt:P407 ?languageId.
 +
  ?languageId rdfs:label ?language.
 +
  FILTER(LANG(?language)='en')
 +
 +
  # https://www.wikidata.org/wiki/Property:P1651 YouTube-Video-Kennung
 +
  ?singleId wdt:P1651 ?youtubeVideoId.
 +
 +
  ?collectionId wdt:P31 ?collectionTypeId.
 +
  ?collectionTypeId rdfs:label ?collectionType.
 +
  FILTER(LANG(?collectionType)='en').
 +
 +
  # https://www.wikidata.org/wiki/Property:P175 Performer /Interpret
 +
  ?singleId wdt:P175 ?performerId.
 +
  ?performerId rdfs:label ?performer
 +
  FILTER(LANG(?performer)='en')
 +
 +
  # https://www.wikidata.org/wiki/Property:P8687 followers / Follower
 +
  #{
 +
    #SELECT (max(?followerCount) as ?follower)
 +
    #WHERE {
 +
    ?performerId wdt:P8687 ?followerCount.
 +
    #}
 +
    #GROUP BY ?performerId
 +
  #}
 +
  #https://www.wikidata.org/wiki/Property:P577 Publication date / Veröffentlichungsdatum
 +
  ?singleId wdt:P577 ?publicationDate.
 +
  FILTER(year(?publicationDate) = 2022).
 +
} ORDER BY  DESC(?followerCount)
 +
LIMIT 9
 +
 +
</source>
 +
}}
 +
[https://query.wikidata.org/#%23%20WF%202022-06-08%0A%23%20Singles%20published%20in%202022%0ASELECT%0A%3FsingleId%20%3Fsingle%20%3Flanguage%20%3FcollectionId%20%3Fcollection%20%3FcollectionType%20%3FperformerId%20%3Fperformer%20%3FfollowerCount%20%3FyoutubeVideoId%20%20%28year%28%3FpublicationDate%29%20as%20%3FpublicationYear%29%0AWHERE%0A%7B%0A%20%20%23%20instanceof%0A%20%20%3FsingleId%20wdt%3AP31%20wd%3AQ134556.%20%23%20Must%20be%20of%20a%20single%0A%20%20%3FsingleId%20rdfs%3Alabel%20%3Fsingle.%0A%20%20FILTER%28LANG%28%3Fsingle%29%3D%27en%27%29%0A%0A%20%20%23https%3A//www.wikidata.org/wiki/Property%3AP361%20-%20part%20of%20/%20ist%20Teil%20von%0A%20%20%3FsingleId%20wdt%3AP361%20%3FcollectionId.%0A%20%20%3FcollectionId%20rdfs%3Alabel%20%3Fcollection.%0A%20%20FILTER%28LANG%28%3Fcollection%29%3D%27en%27%29%0A%0A%20%20%23%20https%3A//www.wikidata.org/wiki/Property%3AP407%20language/%20Sprache%20des%20Werkes%0A%20%20%3FsingleId%20wdt%3AP407%20%3FlanguageId.%0A%20%20%3FlanguageId%20rdfs%3Alabel%20%3Flanguage.%0A%20%20FILTER%28LANG%28%3Flanguage%29%3D%27en%27%29%0A%0A%20%20%23%20https%3A//www.wikidata.org/wiki/Property%3AP1651%20YouTube-Video-Kennung%0A%20%20%3FsingleId%20wdt%3AP1651%20%3FyoutubeVideoId.%0A%0A%20%20%3FcollectionId%20wdt%3AP31%20%3FcollectionTypeId.%0A%20%20%3FcollectionTypeId%20rdfs%3Alabel%20%3FcollectionType.%0A%20%20FILTER%28LANG%28%3FcollectionType%29%3D%27en%27%29.%0A%0A%20%20%23%20https%3A//www.wikidata.org/wiki/Property%3AP175%20Performer%20/Interpret%0A%20%20%3FsingleId%20wdt%3AP175%20%3FperformerId.%0A%20%20%3FperformerId%20rdfs%3Alabel%20%3Fperformer%0A%20%20FILTER%28LANG%28%3Fperformer%29%3D%27en%27%29%0A%0A%20%20%23%20https%3A//www.wikidata.org/wiki/Property%3AP8687%20followers%20/%20Follower%0A%20%20%23%7B%0A%20%20%20%20%20%23SELECT%20%28max%28%3FfollowerCount%29%20as%20%3Ffollower%29%0A%20%20%20%20%20%23WHERE%20%7B%0A%20%20%20%20%20%3FperformerId%20wdt%3AP8687%20%3FfollowerCount.%0A%20%20%20%20%20%23%7D%0A%20%20%20%20%20%23GROUP%20BY%20%3FperformerId%0A%20%20%23%7D%0A%20%20%23https%3A//www.wikidata.org/wiki/Property%3AP577%20Publication%20date%20/%20Ver%C3%B6ffentlichungsdatum%0A%20%20%3FsingleId%20wdt%3AP577%20%3FpublicationDate.%0A%20%20FILTER%28year%28%3FpublicationDate%29%20%3D%202022%29.%0A%7D%20ORDER%20BY%20%20DESC%28%3FfollowerCount%29%0ALIMIT%209%0A try it!]
 +
=== result ===
 +
{| class="wikitable" style="text-align: left;"
 +
|+ <!-- caption -->
 +
|-
 +
! singleId                                  !! single                                  !! language  !! collectionId                              !! collection          !! collectionType  !! performerId                              !! performer    !! align="right"|  followerCount !! youtubeVideoId  !! align="right"|  publicationYear
 +
|-
 +
| http://www.wikidata.org/entity/Q108914561 || Let Somebody Go                        || English    || http://www.wikidata.org/entity/Q107597380 || Music of the Spheres || album            || http://www.wikidata.org/entity/Q83287    || Selena Gomez || align="right"|        65629044 || EptPhiK_q0E      || align="right"|              2022
 +
|-
 +
| http://www.wikidata.org/entity/Q112316936 || Skin of My Teeth                        || English    || http://www.wikidata.org/entity/Q112270102 || Holy Fvck            || album            || http://www.wikidata.org/entity/Q41173    || Demi Lovato  || align="right"|        54350780 || HgrC_h8-2FM      || align="right"|              2022
 +
|-
 +
| http://www.wikidata.org/entity/Q112317311 || Yet to Come (The Most Beautiful Moment) || English    || http://www.wikidata.org/entity/Q111849557 || Proof                || album            || http://www.wikidata.org/entity/Q13580495 || BTS          || align="right"|        45300000 || kXpOEzNZ8hQ      || align="right"|              2022
 +
|-
 +
| http://www.wikidata.org/entity/Q112317311 || Yet to Come (The Most Beautiful Moment) || Korean    || http://www.wikidata.org/entity/Q111849557 || Proof                || album            || http://www.wikidata.org/entity/Q13580495 || BTS          || align="right"|        45300000 || kXpOEzNZ8hQ      || align="right"|              2022
 +
|-
 +
| http://www.wikidata.org/entity/Q111397510 || As It Was                              || English    || http://www.wikidata.org/entity/Q111343757 || Harry's House        || album            || http://www.wikidata.org/entity/Q3626966  || Harry Styles || align="right"|        36156370 || H5v3kku4y6Q      || align="right"|              2022
 +
|-
 +
| http://www.wikidata.org/entity/Q112075432 || Late Night Talking                      || English    || http://www.wikidata.org/entity/Q111343757 || Harry's House        || album            || http://www.wikidata.org/entity/Q3626966  || Harry Styles || align="right"|        36156370 || RwT77rlp2CE      || align="right"|              2022
 +
|-
 +
| http://www.wikidata.org/entity/Q112317311 || Yet to Come (The Most Beautiful Moment) || English    || http://www.wikidata.org/entity/Q111849557 || Proof                || album            || http://www.wikidata.org/entity/Q13580495 || BTS          || align="right"|        27300000 || kXpOEzNZ8hQ      || align="right"|              2022
 +
|-
 +
| http://www.wikidata.org/entity/Q112317311 || Yet to Come (The Most Beautiful Moment) || Korean    || http://www.wikidata.org/entity/Q111849557 || Proof                || album            || http://www.wikidata.org/entity/Q13580495 || BTS          || align="right"|        27300000 || kXpOEzNZ8hQ      || align="right"|              2022
 +
|-
 +
| http://www.wikidata.org/entity/Q108914561 || Let Somebody Go                        || English    || http://www.wikidata.org/entity/Q107597380 || Music of the Spheres || album            || http://www.wikidata.org/entity/Q45188    || Coldplay    || align="right"|        23541189 || EptPhiK_q0E      || align="right"|              2022
 +
|}
 +
 +
= Normal forms =
 +
== 1st Normal Form - 1NF ==
 +
''A relation is in first normal form if and only if no attribute domain has relations as elements'' [https://en.wikipedia.org/wiki/First_normal_form Wikipedia Article on First normal form]
 +
* attribute domain means the type of attribute (think of attribute a cell of a table in spreadsheet and the domain as the type you selected for the cells of a column)
 +
* the type not having releations is sometime also describe as "atomic" meaning having a base type such a string, float, integer and not tuples
 +
* an example for a cell having a relations a elements would be having LAT and LON of a geocoordinate in the same cell
 +
* if you have lists of values or even full tables in a cell than the relation is surely not in 1st normal Form 1NF
 +
* if a single cell in a column violates 1NF the whole columns type is violating 1NF which makes the whole table violate 1NF
 +
* in modern systems where you want to be lenient with user input its often hard to enforce 1NF - database designers might try to "cutting corners" around this by treating the type
 +
as string event if it contains relations data such as a full list of comma separated elements, a json or xml encoded or even uuencoded binary string.
 +
 +
The example above is in first normal form the types of the columns are:
 +
* string: A≡single, B≡language, D≡collection, E≡collectionType, G≡performer, I≡youtubeVideoId
 +
* integer: H≡followerCount, J≡publicationYear
 +
* url: C≡collectionId, F≡performerId
 +
 +
== 2nd Normal Form ==
 +
== 3rd Normal Form ==
 +
 +
== Boyce Codd Normal Form ==
  
 
= Synthesis =
 
= Synthesis =
Line 119: Line 286:
  
 
= Links =
 
= Links =
 +
== Wikipedia ==
 +
The following graph shows some relevant wikipedia articles. If the nodes are not clickable in your browser you might want to open the svg image in a separate tab to get a clickable version of this graph.
 +
<graphviz format='svg'>
 +
digraph fd {
 +
  rankdir="LR"
 +
  dt [ label="Database theory" URL="https://en.wikipedia.org/wiki/Category:Database_theory"]
 +
  dt -> fd
 +
  fd [ label="Functional Dependencies" URL="https://en.wikipedia.org/wiki/Functional_dependency"] 
 +
  fd -> dbn
 +
  fd -> key
 +
  fd -> aa
 +
  aa [ label="Armstrong's axioms" URL="https://en.wikipedia.org/wiki/Armstrong%27s_axioms" ]
 +
  dbn [ label="Database normalization" URL="https://en.wikipedia.org/wiki/Database_normalization"]
 +
  dbn -> nf1
 +
  dbn -> nf2
 +
  dbn -> synthesis -> nf3
 +
  dbn -> decompose -> nf4
 +
  synthesis [ label="Synthesis Algorithm" ]
 +
  decompose [ label="Decomposition Algorithm" ]
 +
  nf1 [ label="1st normal form" URL="https://en.wikipedia.org/wiki/First_normal_form"]
 +
  nf2 [ label="2nd normal form "URL="https://en.wikipedia.org/wiki/Second_normal_form"]
 +
  nf3 [ label="3rd normal form" URL="https://en.wikipedia.org/wiki/Third_normal_form"]
 +
  nf4 [ label="Boyce-Code normal form" URL="https://en.wikipedia.org/wiki/Boyce%E2%80%93Codd_normal_form"]
 +
  ljd [ label="Lossless Join Decomposition" URL="https://en.wikipedia.org/wiki/Lossless_join_decomposition"]
 +
  nf4 -> ljd
 +
  key [ label="Unique key" URL="https://en.wikipedia.org/wiki/Unique_key" ]
 +
  pk [ label="Primary key" URL="https://en.wikipedia.org/wiki/Primary_key" ]
 +
  ck [ label="Candidate key" URL="https://en.wikipedia.org/wiki/Candidate_key"]
 +
  suk [ label="Super key" URL="https://en.wikipedia.org/wiki/Superkey" ]
 +
  sgk [ label="Surrogate key" URL="https://en.wikipedia.org/wiki/Surrogate_key"]
 +
  key->pk
 +
  key->ck
 +
  key->sgk
 +
  key->suk
 +
  {rank = same; dt; aa;fd;dbn;key}
 +
}
 +
</graphviz>
 
* https://en.wikipedia.org/wiki/Category:Database_theory
 
* https://en.wikipedia.org/wiki/Category:Database_theory
 
* https://en.wikipedia.org/wiki/Functional_dependency
 
* https://en.wikipedia.org/wiki/Functional_dependency
Line 129: Line 333:
 
* https://en.wikipedia.org/wiki/Candidate_key
 
* https://en.wikipedia.org/wiki/Candidate_key
 
* https://en.wikipedia.org/wiki/Surrogate_key
 
* https://en.wikipedia.org/wiki/Surrogate_key
 +
* https://en.wikipedia.org/wiki/Superkey
 +
 +
== Other ==
 
* https://stackoverflow.com/questions/tagged/functional-dependencies
 
* https://stackoverflow.com/questions/tagged/functional-dependencies
 +
* https://www.geeksforgeeks.org/introduction-of-database-normalization/
 +
 
[[Category:Tutorial]]
 
[[Category:Tutorial]]
 +
[[Category:DBIS-VL]]

Latest revision as of 16:27, 16 February 2023

Notation

In Database theory: [math]\{A,B\}\to \{C,D\}[/math] is abbreviated to: [math]AB\to CD[/math]

Simple hierarchical Example

9 single entries for 2022 sorted by social media followers

SPARQL query

try it!

result as of 2022-06-13

A B C D E F G H I J
single language collectionId collection collectionType performerId performer followerCount youtubeVideoId publicationYear
Let Somebody Go English http://www.wikidata.org/entity/Q107597380 Music of the Spheres album http://www.wikidata.org/entity/Q83287 Selena Gomez 65629044 EptPhiK_q0E 2022
As It Was English http://www.wikidata.org/entity/Q111343757 Harry's House album http://www.wikidata.org/entity/Q3626966 Harry Styles 36156370 H5v3kku4y6Q 2022
Let Somebody Go English http://www.wikidata.org/entity/Q107597380 Music of the Spheres album http://www.wikidata.org/entity/Q45188 Coldplay 23541189 EptPhiK_q0E 2022
Bam Bam English http://www.wikidata.org/entity/Q56071495 Camila Cabello singles discography singles discography http://www.wikidata.org/entity/Q47447 Ed Sheeran 17661915 -8VfKZCOo_I 2022
Bam Bam English http://www.wikidata.org/entity/Q111083420 Familia album http://www.wikidata.org/entity/Q47447 Ed Sheeran 17661915 -8VfKZCOo_I 2022
Sacrifice English http://www.wikidata.org/entity/Q110400486 Dawn FM album http://www.wikidata.org/entity/Q2121062 The Weeknd 15549868 VafTMsrnSTU 2022
Out of Time English http://www.wikidata.org/entity/Q110400486 Dawn FM album http://www.wikidata.org/entity/Q2121062 The Weeknd 15549868 2fDzCWNS3ig 2022
Bam Bam English http://www.wikidata.org/entity/Q56071495 Camila Cabello singles discography singles discography http://www.wikidata.org/entity/Q18810940 Camila Cabello 12717885 -8VfKZCOo_I 2022
Bam Bam English http://www.wikidata.org/entity/Q111083420 Familia album http://www.wikidata.org/entity/Q18810940 Camila Cabello 12717885 -8VfKZCOo_I 2022

Functional Dependencies

The functional dependencies in this example are derived from the structure of the SPARQL query

  • A≡single
  • B≡language
  • C≡collectionId
  • D≡collection
  • E≡collectionType
  • F≡performerId
  • G≡performer
  • H≡followerCount
  • I≡youtubeVideoId
  • J≡publicationYear
  • [math]A\to BCFIJ[/math]
  • [math]C\to DE[/math]
  • [math]F\to GH[/math]

[math]R={{\{A,B,C,D,E,F,G,H,I,J\}},{A\to BCFIJ,C\to DE,F\to GH}}[/math]

functional dependencies diagram

The following diagram shows the functional dependencies of this example. The nodes represent the attributes and the edges the dependencies.

Attribute Closures

The attribute closure is the set of all attributes that are "reachable/determinable" from a given attribute node. In this example the results are quite obvious since the functional dependencies form a tree and the reachable nodes are all nodes of the subtree of a given attribute.

attribute closure
A ABCDEFGHIJ
B B
C CDE
D D
E E
F FGH
G G
H H
I I
J J

Keys for the example

There is only a single candidate key: A≡single

It is the only attribute that has the full set of attributes as it's closure. A≡single is also a super key and primary key

With a slightly modified query we'll see that wikidata uses a surrogate key as the primary key for all items. Querying Wikidata's graph data more often than not leads to results that need to be normalized to be useful in the context of relational databases.

Extending the above query with a column for the wikidata Id of the single shows how the surrogate key is related to the single entries.

9 single entries for 2022 sorted by social media followers with wikidata Id link

query

try it!

result

singleId single language collectionId collection collectionType performerId performer followerCount youtubeVideoId publicationYear
http://www.wikidata.org/entity/Q108914561 Let Somebody Go English http://www.wikidata.org/entity/Q107597380 Music of the Spheres album http://www.wikidata.org/entity/Q83287 Selena Gomez 65629044 EptPhiK_q0E 2022
http://www.wikidata.org/entity/Q112316936 Skin of My Teeth English http://www.wikidata.org/entity/Q112270102 Holy Fvck album http://www.wikidata.org/entity/Q41173 Demi Lovato 54350780 HgrC_h8-2FM 2022
http://www.wikidata.org/entity/Q112317311 Yet to Come (The Most Beautiful Moment) English http://www.wikidata.org/entity/Q111849557 Proof album http://www.wikidata.org/entity/Q13580495 BTS 45300000 kXpOEzNZ8hQ 2022
http://www.wikidata.org/entity/Q112317311 Yet to Come (The Most Beautiful Moment) Korean http://www.wikidata.org/entity/Q111849557 Proof album http://www.wikidata.org/entity/Q13580495 BTS 45300000 kXpOEzNZ8hQ 2022
http://www.wikidata.org/entity/Q111397510 As It Was English http://www.wikidata.org/entity/Q111343757 Harry's House album http://www.wikidata.org/entity/Q3626966 Harry Styles 36156370 H5v3kku4y6Q 2022
http://www.wikidata.org/entity/Q112075432 Late Night Talking English http://www.wikidata.org/entity/Q111343757 Harry's House album http://www.wikidata.org/entity/Q3626966 Harry Styles 36156370 RwT77rlp2CE 2022
http://www.wikidata.org/entity/Q112317311 Yet to Come (The Most Beautiful Moment) English http://www.wikidata.org/entity/Q111849557 Proof album http://www.wikidata.org/entity/Q13580495 BTS 27300000 kXpOEzNZ8hQ 2022
http://www.wikidata.org/entity/Q112317311 Yet to Come (The Most Beautiful Moment) Korean http://www.wikidata.org/entity/Q111849557 Proof album http://www.wikidata.org/entity/Q13580495 BTS 27300000 kXpOEzNZ8hQ 2022
http://www.wikidata.org/entity/Q108914561 Let Somebody Go English http://www.wikidata.org/entity/Q107597380 Music of the Spheres album http://www.wikidata.org/entity/Q45188 Coldplay 23541189 EptPhiK_q0E 2022

Normal forms

1st Normal Form - 1NF

A relation is in first normal form if and only if no attribute domain has relations as elements Wikipedia Article on First normal form

  • attribute domain means the type of attribute (think of attribute a cell of a table in spreadsheet and the domain as the type you selected for the cells of a column)
  • the type not having releations is sometime also describe as "atomic" meaning having a base type such a string, float, integer and not tuples
  • an example for a cell having a relations a elements would be having LAT and LON of a geocoordinate in the same cell
  • if you have lists of values or even full tables in a cell than the relation is surely not in 1st normal Form 1NF
  • if a single cell in a column violates 1NF the whole columns type is violating 1NF which makes the whole table violate 1NF
  • in modern systems where you want to be lenient with user input its often hard to enforce 1NF - database designers might try to "cutting corners" around this by treating the type

as string event if it contains relations data such as a full list of comma separated elements, a json or xml encoded or even uuencoded binary string.

The example above is in first normal form the types of the columns are:

  • string: A≡single, B≡language, D≡collection, E≡collectionType, G≡performer, I≡youtubeVideoId
  • integer: H≡followerCount, J≡publicationYear
  • url: C≡collectionId, F≡performerId

2nd Normal Form

3rd Normal Form

Boyce Codd Normal Form

Synthesis

Links

Wikipedia

The following graph shows some relevant wikipedia articles. If the nodes are not clickable in your browser you might want to open the svg image in a separate tab to get a clickable version of this graph.

Other