Difference between revisions of "Functional Dependencies"
(36 intermediate revisions by the same user not shown) | |||
Line 5: | Line 5: | ||
<math>AB\to CD</math> | <math>AB\to CD</math> | ||
− | = Example = | + | = Simple hierarchical Example = |
== 9 single entries for 2022 sorted by social media followers == | == 9 single entries for 2022 sorted by social media followers == | ||
− | === query === | + | === SPARQL query === |
+ | {{HideShow|SPARQLquerySingles2022| | ||
<source lang='sparql'> | <source lang='sparql'> | ||
# WF 2022-06-08 | # WF 2022-06-08 | ||
Line 58: | Line 59: | ||
</source> | </source> | ||
− | + | }} | |
[https://query.wikidata.org/#%23%20WF%202022-06-08%0A%23%20Singles%20published%20in%202022%0ASELECT%0A%3Fsingle%20%3Flanguage%20%3FcollectionId%20%3Fcollection%20%3FcollectionType%20%3FperformerId%20%3Fperformer%20%3FfollowerCount%20%3FyoutubeVideoId%20%20%28year%28%3FpublicationDate%29%20as%20%3FpublicationYear%29%0AWHERE%0A%7B%0A%20%20%23%20instanceof%0A%20%20%3FsingleId%20wdt%3AP31%20wd%3AQ134556.%20%23%20Must%20be%20of%20a%20single%0A%20%20%3FsingleId%20rdfs%3Alabel%20%3Fsingle.%0A%20%20FILTER%28LANG%28%3Fsingle%29%3D%27en%27%29%0A%0A%20%20%23https%3A//www.wikidata.org/wiki/Property%3AP361%20-%20part%20of%20/%20ist%20Teil%20von%0A%20%20%3FsingleId%20wdt%3AP361%20%3FcollectionId.%0A%20%20%3FcollectionId%20rdfs%3Alabel%20%3Fcollection.%0A%20%20FILTER%28LANG%28%3Fcollection%29%3D%27en%27%29%0A%0A%20%20%23%20https%3A//www.wikidata.org/wiki/Property%3AP407%20language/%20Sprache%20des%20Werkes%0A%20%20%3FsingleId%20wdt%3AP407%20%3FlanguageId.%0A%20%20%3FlanguageId%20rdfs%3Alabel%20%3Flanguage.%0A%20%20FILTER%28LANG%28%3Flanguage%29%3D%27en%27%29%0A%0A%20%20%23%20https%3A//www.wikidata.org/wiki/Property%3AP1651%20YouTube-Video-Kennung%0A%20%20%3FsingleId%20wdt%3AP1651%20%3FyoutubeVideoId.%0A%0A%20%20%3FcollectionId%20wdt%3AP31%20%3FcollectionTypeId.%0A%20%20%3FcollectionTypeId%20rdfs%3Alabel%20%3FcollectionType.%0A%20%20FILTER%28LANG%28%3FcollectionType%29%3D%27en%27%29.%0A%0A%20%20%23%20https%3A//www.wikidata.org/wiki/Property%3AP175%20Performer%20/Interpret%0A%20%20%3FsingleId%20wdt%3AP175%20%3FperformerId.%0A%20%20%3FperformerId%20rdfs%3Alabel%20%3Fperformer%0A%20%20FILTER%28LANG%28%3Fperformer%29%3D%27en%27%29%0A%0A%20%20%23%20https%3A//www.wikidata.org/wiki/Property%3AP8687%20followers%20/%20Follower%0A%20%20%23%7B%0A%20%20%20%20%20%23SELECT%20%28max%28%3FfollowerCount%29%20as%20%3Ffollower%29%0A%20%20%20%20%20%23WHERE%20%7B%0A%20%20%20%20%20%3FperformerId%20wdt%3AP8687%20%3FfollowerCount.%0A%20%20%20%20%20%23%7D%0A%20%20%20%20%20%23GROUP%20BY%20%3FperformerId%0A%20%20%23%7D%0A%20%20%23https%3A//www.wikidata.org/wiki/Property%3AP577%20Publication%20date%20/%20Ver%C3%B6ffentlichungsdatum%0A%20%20%3FsingleId%20wdt%3AP577%20%3FpublicationDate.%0A%20%20FILTER%28year%28%3FpublicationDate%29%20%3D%202022%29.%0A%7D%20ORDER%20BY%20%20DESC%28%3FfollowerCount%29%0ALIMIT%209%0A try it!] | [https://query.wikidata.org/#%23%20WF%202022-06-08%0A%23%20Singles%20published%20in%202022%0ASELECT%0A%3Fsingle%20%3Flanguage%20%3FcollectionId%20%3Fcollection%20%3FcollectionType%20%3FperformerId%20%3Fperformer%20%3FfollowerCount%20%3FyoutubeVideoId%20%20%28year%28%3FpublicationDate%29%20as%20%3FpublicationYear%29%0AWHERE%0A%7B%0A%20%20%23%20instanceof%0A%20%20%3FsingleId%20wdt%3AP31%20wd%3AQ134556.%20%23%20Must%20be%20of%20a%20single%0A%20%20%3FsingleId%20rdfs%3Alabel%20%3Fsingle.%0A%20%20FILTER%28LANG%28%3Fsingle%29%3D%27en%27%29%0A%0A%20%20%23https%3A//www.wikidata.org/wiki/Property%3AP361%20-%20part%20of%20/%20ist%20Teil%20von%0A%20%20%3FsingleId%20wdt%3AP361%20%3FcollectionId.%0A%20%20%3FcollectionId%20rdfs%3Alabel%20%3Fcollection.%0A%20%20FILTER%28LANG%28%3Fcollection%29%3D%27en%27%29%0A%0A%20%20%23%20https%3A//www.wikidata.org/wiki/Property%3AP407%20language/%20Sprache%20des%20Werkes%0A%20%20%3FsingleId%20wdt%3AP407%20%3FlanguageId.%0A%20%20%3FlanguageId%20rdfs%3Alabel%20%3Flanguage.%0A%20%20FILTER%28LANG%28%3Flanguage%29%3D%27en%27%29%0A%0A%20%20%23%20https%3A//www.wikidata.org/wiki/Property%3AP1651%20YouTube-Video-Kennung%0A%20%20%3FsingleId%20wdt%3AP1651%20%3FyoutubeVideoId.%0A%0A%20%20%3FcollectionId%20wdt%3AP31%20%3FcollectionTypeId.%0A%20%20%3FcollectionTypeId%20rdfs%3Alabel%20%3FcollectionType.%0A%20%20FILTER%28LANG%28%3FcollectionType%29%3D%27en%27%29.%0A%0A%20%20%23%20https%3A//www.wikidata.org/wiki/Property%3AP175%20Performer%20/Interpret%0A%20%20%3FsingleId%20wdt%3AP175%20%3FperformerId.%0A%20%20%3FperformerId%20rdfs%3Alabel%20%3Fperformer%0A%20%20FILTER%28LANG%28%3Fperformer%29%3D%27en%27%29%0A%0A%20%20%23%20https%3A//www.wikidata.org/wiki/Property%3AP8687%20followers%20/%20Follower%0A%20%20%23%7B%0A%20%20%20%20%20%23SELECT%20%28max%28%3FfollowerCount%29%20as%20%3Ffollower%29%0A%20%20%20%20%20%23WHERE%20%7B%0A%20%20%20%20%20%3FperformerId%20wdt%3AP8687%20%3FfollowerCount.%0A%20%20%20%20%20%23%7D%0A%20%20%20%20%20%23GROUP%20BY%20%3FperformerId%0A%20%20%23%7D%0A%20%20%23https%3A//www.wikidata.org/wiki/Property%3AP577%20Publication%20date%20/%20Ver%C3%B6ffentlichungsdatum%0A%20%20%3FsingleId%20wdt%3AP577%20%3FpublicationDate.%0A%20%20FILTER%28year%28%3FpublicationDate%29%20%3D%202022%29.%0A%7D%20ORDER%20BY%20%20DESC%28%3FfollowerCount%29%0ALIMIT%209%0A try it!] | ||
− | === result === | + | === result as of 2022-06-13 === |
{| class="wikitable" style="text-align: left;" | {| class="wikitable" style="text-align: left;" | ||
|+ <!-- caption --> | |+ <!-- caption --> | ||
Line 116: | Line 117: | ||
=== functional dependencies diagram === | === functional dependencies diagram === | ||
The following diagram shows the functional dependencies of this example. The nodes represent the attributes and the edges the dependencies. | The following diagram shows the functional dependencies of this example. The nodes represent the attributes and the edges the dependencies. | ||
− | <graphviz> | + | <graphviz format='png'> |
#generated by /Users/wf/Documents/pyworkspace/dbis-functional-dependencies/functional_dependencies/BCNF.py on 2022-06-14T09:48:59.758052 | #generated by /Users/wf/Documents/pyworkspace/dbis-functional-dependencies/functional_dependencies/BCNF.py on 2022-06-14T09:48:59.758052 | ||
digraph functionalDependencySet{ | digraph functionalDependencySet{ | ||
Line 140: | Line 141: | ||
}</graphviz> | }</graphviz> | ||
− | = Attribute Closures = | + | == Attribute Closures == |
The attribute closure is the set of all attributes that are "reachable/determinable" from a given attribute node. In this example the results are quite obvious since the | The attribute closure is the set of all attributes that are "reachable/determinable" from a given attribute node. In this example the results are quite obvious since the | ||
functional dependencies form a tree and the reachable nodes are all nodes of the subtree of a given attribute. | functional dependencies form a tree and the reachable nodes are all nodes of the subtree of a given attribute. | ||
Line 169: | Line 170: | ||
| J || J | | J || J | ||
|} | |} | ||
+ | |||
+ | == Keys for the example == | ||
+ | There is only a single [https://en.wikipedia.org/wiki/Candidate_key candidate key]: A≡single | ||
+ | |||
+ | It is the only attribute that has the full set of attributes as it's closure. | ||
+ | A≡single is also a [https://en.wikipedia.org/wiki/Super_key super key] and [https://en.wikipedia.org/wiki/Primary_key primary key] | ||
+ | |||
+ | With a slightly modified query we'll see that wikidata uses a [https://en.wikipedia.org/wiki/Surrogate_key surrogate key] as the primary key for all items. | ||
+ | Querying Wikidata's graph data more often than not leads to results that need to be normalized to be useful in the context of relational databases. | ||
+ | |||
+ | Extending the above query with a column for the wikidata Id of the single shows how the surrogate key is related to the single entries. | ||
+ | === 9 single entries for 2022 sorted by social media followers with wikidata Id link === | ||
+ | |||
+ | ==== query ==== | ||
+ | {{HideShow|SPARQL_Query_With_WikidataId| | ||
+ | <source lang='sparql'> | ||
+ | # WF 2022-06-08 | ||
+ | # Singles published in 2022 | ||
+ | SELECT | ||
+ | ?singleId ?single ?language ?collectionId ?collection ?collectionType ?performerId ?performer ?followerCount ?youtubeVideoId (year(?publicationDate) as ?publicationYear) | ||
+ | WHERE | ||
+ | { | ||
+ | # instanceof | ||
+ | ?singleId wdt:P31 wd:Q134556. # Must be of a single | ||
+ | ?singleId rdfs:label ?single. | ||
+ | FILTER(LANG(?single)='en') | ||
+ | |||
+ | #https://www.wikidata.org/wiki/Property:P361 - part of / ist Teil von | ||
+ | ?singleId wdt:P361 ?collectionId. | ||
+ | ?collectionId rdfs:label ?collection. | ||
+ | FILTER(LANG(?collection)='en') | ||
+ | |||
+ | # https://www.wikidata.org/wiki/Property:P407 language/ Sprache des Werkes | ||
+ | ?singleId wdt:P407 ?languageId. | ||
+ | ?languageId rdfs:label ?language. | ||
+ | FILTER(LANG(?language)='en') | ||
+ | |||
+ | # https://www.wikidata.org/wiki/Property:P1651 YouTube-Video-Kennung | ||
+ | ?singleId wdt:P1651 ?youtubeVideoId. | ||
+ | |||
+ | ?collectionId wdt:P31 ?collectionTypeId. | ||
+ | ?collectionTypeId rdfs:label ?collectionType. | ||
+ | FILTER(LANG(?collectionType)='en'). | ||
+ | |||
+ | # https://www.wikidata.org/wiki/Property:P175 Performer /Interpret | ||
+ | ?singleId wdt:P175 ?performerId. | ||
+ | ?performerId rdfs:label ?performer | ||
+ | FILTER(LANG(?performer)='en') | ||
+ | |||
+ | # https://www.wikidata.org/wiki/Property:P8687 followers / Follower | ||
+ | #{ | ||
+ | #SELECT (max(?followerCount) as ?follower) | ||
+ | #WHERE { | ||
+ | ?performerId wdt:P8687 ?followerCount. | ||
+ | #} | ||
+ | #GROUP BY ?performerId | ||
+ | #} | ||
+ | #https://www.wikidata.org/wiki/Property:P577 Publication date / Veröffentlichungsdatum | ||
+ | ?singleId wdt:P577 ?publicationDate. | ||
+ | FILTER(year(?publicationDate) = 2022). | ||
+ | } ORDER BY DESC(?followerCount) | ||
+ | LIMIT 9 | ||
+ | |||
+ | </source> | ||
+ | }} | ||
+ | [https://query.wikidata.org/#%23%20WF%202022-06-08%0A%23%20Singles%20published%20in%202022%0ASELECT%0A%3FsingleId%20%3Fsingle%20%3Flanguage%20%3FcollectionId%20%3Fcollection%20%3FcollectionType%20%3FperformerId%20%3Fperformer%20%3FfollowerCount%20%3FyoutubeVideoId%20%20%28year%28%3FpublicationDate%29%20as%20%3FpublicationYear%29%0AWHERE%0A%7B%0A%20%20%23%20instanceof%0A%20%20%3FsingleId%20wdt%3AP31%20wd%3AQ134556.%20%23%20Must%20be%20of%20a%20single%0A%20%20%3FsingleId%20rdfs%3Alabel%20%3Fsingle.%0A%20%20FILTER%28LANG%28%3Fsingle%29%3D%27en%27%29%0A%0A%20%20%23https%3A//www.wikidata.org/wiki/Property%3AP361%20-%20part%20of%20/%20ist%20Teil%20von%0A%20%20%3FsingleId%20wdt%3AP361%20%3FcollectionId.%0A%20%20%3FcollectionId%20rdfs%3Alabel%20%3Fcollection.%0A%20%20FILTER%28LANG%28%3Fcollection%29%3D%27en%27%29%0A%0A%20%20%23%20https%3A//www.wikidata.org/wiki/Property%3AP407%20language/%20Sprache%20des%20Werkes%0A%20%20%3FsingleId%20wdt%3AP407%20%3FlanguageId.%0A%20%20%3FlanguageId%20rdfs%3Alabel%20%3Flanguage.%0A%20%20FILTER%28LANG%28%3Flanguage%29%3D%27en%27%29%0A%0A%20%20%23%20https%3A//www.wikidata.org/wiki/Property%3AP1651%20YouTube-Video-Kennung%0A%20%20%3FsingleId%20wdt%3AP1651%20%3FyoutubeVideoId.%0A%0A%20%20%3FcollectionId%20wdt%3AP31%20%3FcollectionTypeId.%0A%20%20%3FcollectionTypeId%20rdfs%3Alabel%20%3FcollectionType.%0A%20%20FILTER%28LANG%28%3FcollectionType%29%3D%27en%27%29.%0A%0A%20%20%23%20https%3A//www.wikidata.org/wiki/Property%3AP175%20Performer%20/Interpret%0A%20%20%3FsingleId%20wdt%3AP175%20%3FperformerId.%0A%20%20%3FperformerId%20rdfs%3Alabel%20%3Fperformer%0A%20%20FILTER%28LANG%28%3Fperformer%29%3D%27en%27%29%0A%0A%20%20%23%20https%3A//www.wikidata.org/wiki/Property%3AP8687%20followers%20/%20Follower%0A%20%20%23%7B%0A%20%20%20%20%20%23SELECT%20%28max%28%3FfollowerCount%29%20as%20%3Ffollower%29%0A%20%20%20%20%20%23WHERE%20%7B%0A%20%20%20%20%20%3FperformerId%20wdt%3AP8687%20%3FfollowerCount.%0A%20%20%20%20%20%23%7D%0A%20%20%20%20%20%23GROUP%20BY%20%3FperformerId%0A%20%20%23%7D%0A%20%20%23https%3A//www.wikidata.org/wiki/Property%3AP577%20Publication%20date%20/%20Ver%C3%B6ffentlichungsdatum%0A%20%20%3FsingleId%20wdt%3AP577%20%3FpublicationDate.%0A%20%20FILTER%28year%28%3FpublicationDate%29%20%3D%202022%29.%0A%7D%20ORDER%20BY%20%20DESC%28%3FfollowerCount%29%0ALIMIT%209%0A try it!] | ||
+ | === result === | ||
+ | {| class="wikitable" style="text-align: left;" | ||
+ | |+ <!-- caption --> | ||
+ | |- | ||
+ | ! singleId !! single !! language !! collectionId !! collection !! collectionType !! performerId !! performer !! align="right"| followerCount !! youtubeVideoId !! align="right"| publicationYear | ||
+ | |- | ||
+ | | http://www.wikidata.org/entity/Q108914561 || Let Somebody Go || English || http://www.wikidata.org/entity/Q107597380 || Music of the Spheres || album || http://www.wikidata.org/entity/Q83287 || Selena Gomez || align="right"| 65629044 || EptPhiK_q0E || align="right"| 2022 | ||
+ | |- | ||
+ | | http://www.wikidata.org/entity/Q112316936 || Skin of My Teeth || English || http://www.wikidata.org/entity/Q112270102 || Holy Fvck || album || http://www.wikidata.org/entity/Q41173 || Demi Lovato || align="right"| 54350780 || HgrC_h8-2FM || align="right"| 2022 | ||
+ | |- | ||
+ | | http://www.wikidata.org/entity/Q112317311 || Yet to Come (The Most Beautiful Moment) || English || http://www.wikidata.org/entity/Q111849557 || Proof || album || http://www.wikidata.org/entity/Q13580495 || BTS || align="right"| 45300000 || kXpOEzNZ8hQ || align="right"| 2022 | ||
+ | |- | ||
+ | | http://www.wikidata.org/entity/Q112317311 || Yet to Come (The Most Beautiful Moment) || Korean || http://www.wikidata.org/entity/Q111849557 || Proof || album || http://www.wikidata.org/entity/Q13580495 || BTS || align="right"| 45300000 || kXpOEzNZ8hQ || align="right"| 2022 | ||
+ | |- | ||
+ | | http://www.wikidata.org/entity/Q111397510 || As It Was || English || http://www.wikidata.org/entity/Q111343757 || Harry's House || album || http://www.wikidata.org/entity/Q3626966 || Harry Styles || align="right"| 36156370 || H5v3kku4y6Q || align="right"| 2022 | ||
+ | |- | ||
+ | | http://www.wikidata.org/entity/Q112075432 || Late Night Talking || English || http://www.wikidata.org/entity/Q111343757 || Harry's House || album || http://www.wikidata.org/entity/Q3626966 || Harry Styles || align="right"| 36156370 || RwT77rlp2CE || align="right"| 2022 | ||
+ | |- | ||
+ | | http://www.wikidata.org/entity/Q112317311 || Yet to Come (The Most Beautiful Moment) || English || http://www.wikidata.org/entity/Q111849557 || Proof || album || http://www.wikidata.org/entity/Q13580495 || BTS || align="right"| 27300000 || kXpOEzNZ8hQ || align="right"| 2022 | ||
+ | |- | ||
+ | | http://www.wikidata.org/entity/Q112317311 || Yet to Come (The Most Beautiful Moment) || Korean || http://www.wikidata.org/entity/Q111849557 || Proof || album || http://www.wikidata.org/entity/Q13580495 || BTS || align="right"| 27300000 || kXpOEzNZ8hQ || align="right"| 2022 | ||
+ | |- | ||
+ | | http://www.wikidata.org/entity/Q108914561 || Let Somebody Go || English || http://www.wikidata.org/entity/Q107597380 || Music of the Spheres || album || http://www.wikidata.org/entity/Q45188 || Coldplay || align="right"| 23541189 || EptPhiK_q0E || align="right"| 2022 | ||
+ | |} | ||
+ | |||
+ | = Normal forms = | ||
+ | == 1st Normal Form - 1NF == | ||
+ | ''A relation is in first normal form if and only if no attribute domain has relations as elements'' [https://en.wikipedia.org/wiki/First_normal_form Wikipedia Article on First normal form] | ||
+ | * attribute domain means the type of attribute (think of attribute a cell of a table in spreadsheet and the domain as the type you selected for the cells of a column) | ||
+ | * the type not having releations is sometime also describe as "atomic" meaning having a base type such a string, float, integer and not tuples | ||
+ | * an example for a cell having a relations a elements would be having LAT and LON of a geocoordinate in the same cell | ||
+ | * if you have lists of values or even full tables in a cell than the relation is surely not in 1st normal Form 1NF | ||
+ | * if a single cell in a column violates 1NF the whole columns type is violating 1NF which makes the whole table violate 1NF | ||
+ | * in modern systems where you want to be lenient with user input its often hard to enforce 1NF - database designers might try to "cutting corners" around this by treating the type | ||
+ | as string event if it contains relations data such as a full list of comma separated elements, a json or xml encoded or even uuencoded binary string. | ||
+ | |||
+ | The example above is in first normal form the types of the columns are: | ||
+ | * string: A≡single, B≡language, D≡collection, E≡collectionType, G≡performer, I≡youtubeVideoId | ||
+ | * integer: H≡followerCount, J≡publicationYear | ||
+ | * url: C≡collectionId, F≡performerId | ||
+ | |||
+ | == 2nd Normal Form == | ||
+ | == 3rd Normal Form == | ||
+ | |||
+ | == Boyce Codd Normal Form == | ||
= Synthesis = | = Synthesis = | ||
Line 174: | Line 286: | ||
= Links = | = Links = | ||
+ | == Wikipedia == | ||
+ | The following graph shows some relevant wikipedia articles. If the nodes are not clickable in your browser you might want to open the svg image in a separate tab to get a clickable version of this graph. | ||
+ | <graphviz format='svg'> | ||
+ | digraph fd { | ||
+ | rankdir="LR" | ||
+ | dt [ label="Database theory" URL="https://en.wikipedia.org/wiki/Category:Database_theory"] | ||
+ | dt -> fd | ||
+ | fd [ label="Functional Dependencies" URL="https://en.wikipedia.org/wiki/Functional_dependency"] | ||
+ | fd -> dbn | ||
+ | fd -> key | ||
+ | fd -> aa | ||
+ | aa [ label="Armstrong's axioms" URL="https://en.wikipedia.org/wiki/Armstrong%27s_axioms" ] | ||
+ | dbn [ label="Database normalization" URL="https://en.wikipedia.org/wiki/Database_normalization"] | ||
+ | dbn -> nf1 | ||
+ | dbn -> nf2 | ||
+ | dbn -> synthesis -> nf3 | ||
+ | dbn -> decompose -> nf4 | ||
+ | synthesis [ label="Synthesis Algorithm" ] | ||
+ | decompose [ label="Decomposition Algorithm" ] | ||
+ | nf1 [ label="1st normal form" URL="https://en.wikipedia.org/wiki/First_normal_form"] | ||
+ | nf2 [ label="2nd normal form "URL="https://en.wikipedia.org/wiki/Second_normal_form"] | ||
+ | nf3 [ label="3rd normal form" URL="https://en.wikipedia.org/wiki/Third_normal_form"] | ||
+ | nf4 [ label="Boyce-Code normal form" URL="https://en.wikipedia.org/wiki/Boyce%E2%80%93Codd_normal_form"] | ||
+ | ljd [ label="Lossless Join Decomposition" URL="https://en.wikipedia.org/wiki/Lossless_join_decomposition"] | ||
+ | nf4 -> ljd | ||
+ | key [ label="Unique key" URL="https://en.wikipedia.org/wiki/Unique_key" ] | ||
+ | pk [ label="Primary key" URL="https://en.wikipedia.org/wiki/Primary_key" ] | ||
+ | ck [ label="Candidate key" URL="https://en.wikipedia.org/wiki/Candidate_key"] | ||
+ | suk [ label="Super key" URL="https://en.wikipedia.org/wiki/Superkey" ] | ||
+ | sgk [ label="Surrogate key" URL="https://en.wikipedia.org/wiki/Surrogate_key"] | ||
+ | key->pk | ||
+ | key->ck | ||
+ | key->sgk | ||
+ | key->suk | ||
+ | {rank = same; dt; aa;fd;dbn;key} | ||
+ | } | ||
+ | </graphviz> | ||
* https://en.wikipedia.org/wiki/Category:Database_theory | * https://en.wikipedia.org/wiki/Category:Database_theory | ||
* https://en.wikipedia.org/wiki/Functional_dependency | * https://en.wikipedia.org/wiki/Functional_dependency | ||
Line 184: | Line 333: | ||
* https://en.wikipedia.org/wiki/Candidate_key | * https://en.wikipedia.org/wiki/Candidate_key | ||
* https://en.wikipedia.org/wiki/Surrogate_key | * https://en.wikipedia.org/wiki/Surrogate_key | ||
+ | * https://en.wikipedia.org/wiki/Superkey | ||
+ | |||
+ | == Other == | ||
* https://stackoverflow.com/questions/tagged/functional-dependencies | * https://stackoverflow.com/questions/tagged/functional-dependencies | ||
+ | * https://www.geeksforgeeks.org/introduction-of-database-normalization/ | ||
+ | |||
[[Category:Tutorial]] | [[Category:Tutorial]] | ||
+ | [[Category:DBIS-VL]] |
Latest revision as of 16:27, 16 February 2023
Notation
In Database theory: [math]\{A,B\}\to \{C,D\}[/math] is abbreviated to: [math]AB\to CD[/math]
Simple hierarchical Example
9 single entries for 2022 sorted by social media followers
SPARQL query
result as of 2022-06-13
A | B | C | D | E | F | G | H | I | J |
---|---|---|---|---|---|---|---|---|---|
single | language | collectionId | collection | collectionType | performerId | performer | followerCount | youtubeVideoId | publicationYear |
Let Somebody Go | English | http://www.wikidata.org/entity/Q107597380 | Music of the Spheres | album | http://www.wikidata.org/entity/Q83287 | Selena Gomez | 65629044 | EptPhiK_q0E | 2022 |
As It Was | English | http://www.wikidata.org/entity/Q111343757 | Harry's House | album | http://www.wikidata.org/entity/Q3626966 | Harry Styles | 36156370 | H5v3kku4y6Q | 2022 |
Let Somebody Go | English | http://www.wikidata.org/entity/Q107597380 | Music of the Spheres | album | http://www.wikidata.org/entity/Q45188 | Coldplay | 23541189 | EptPhiK_q0E | 2022 |
Bam Bam | English | http://www.wikidata.org/entity/Q56071495 | Camila Cabello singles discography | singles discography | http://www.wikidata.org/entity/Q47447 | Ed Sheeran | 17661915 | -8VfKZCOo_I | 2022 |
Bam Bam | English | http://www.wikidata.org/entity/Q111083420 | Familia | album | http://www.wikidata.org/entity/Q47447 | Ed Sheeran | 17661915 | -8VfKZCOo_I | 2022 |
Sacrifice | English | http://www.wikidata.org/entity/Q110400486 | Dawn FM | album | http://www.wikidata.org/entity/Q2121062 | The Weeknd | 15549868 | VafTMsrnSTU | 2022 |
Out of Time | English | http://www.wikidata.org/entity/Q110400486 | Dawn FM | album | http://www.wikidata.org/entity/Q2121062 | The Weeknd | 15549868 | 2fDzCWNS3ig | 2022 |
Bam Bam | English | http://www.wikidata.org/entity/Q56071495 | Camila Cabello singles discography | singles discography | http://www.wikidata.org/entity/Q18810940 | Camila Cabello | 12717885 | -8VfKZCOo_I | 2022 |
Bam Bam | English | http://www.wikidata.org/entity/Q111083420 | Familia | album | http://www.wikidata.org/entity/Q18810940 | Camila Cabello | 12717885 | -8VfKZCOo_I | 2022 |
Functional Dependencies
The functional dependencies in this example are derived from the structure of the SPARQL query
- A≡single
- B≡language
- C≡collectionId
- D≡collection
- E≡collectionType
- F≡performerId
- G≡performer
- H≡followerCount
- I≡youtubeVideoId
- J≡publicationYear
- [math]A\to BCFIJ[/math]
- [math]C\to DE[/math]
- [math]F\to GH[/math]
[math]R={{\{A,B,C,D,E,F,G,H,I,J\}},{A\to BCFIJ,C\to DE,F\to GH}}[/math]
functional dependencies diagram
The following diagram shows the functional dependencies of this example. The nodes represent the attributes and the edges the dependencies.
Attribute Closures
The attribute closure is the set of all attributes that are "reachable/determinable" from a given attribute node. In this example the results are quite obvious since the functional dependencies form a tree and the reachable nodes are all nodes of the subtree of a given attribute.
attribute | closure |
---|---|
A | ABCDEFGHIJ |
B | B |
C | CDE |
D | D |
E | E |
F | FGH |
G | G |
H | H |
I | I |
J | J |
Keys for the example
There is only a single candidate key: A≡single
It is the only attribute that has the full set of attributes as it's closure. A≡single is also a super key and primary key
With a slightly modified query we'll see that wikidata uses a surrogate key as the primary key for all items. Querying Wikidata's graph data more often than not leads to results that need to be normalized to be useful in the context of relational databases.
Extending the above query with a column for the wikidata Id of the single shows how the surrogate key is related to the single entries.
9 single entries for 2022 sorted by social media followers with wikidata Id link
query
result
Normal forms
1st Normal Form - 1NF
A relation is in first normal form if and only if no attribute domain has relations as elements Wikipedia Article on First normal form
- attribute domain means the type of attribute (think of attribute a cell of a table in spreadsheet and the domain as the type you selected for the cells of a column)
- the type not having releations is sometime also describe as "atomic" meaning having a base type such a string, float, integer and not tuples
- an example for a cell having a relations a elements would be having LAT and LON of a geocoordinate in the same cell
- if you have lists of values or even full tables in a cell than the relation is surely not in 1st normal Form 1NF
- if a single cell in a column violates 1NF the whole columns type is violating 1NF which makes the whole table violate 1NF
- in modern systems where you want to be lenient with user input its often hard to enforce 1NF - database designers might try to "cutting corners" around this by treating the type
as string event if it contains relations data such as a full list of comma separated elements, a json or xml encoded or even uuencoded binary string.
The example above is in first normal form the types of the columns are:
- string: A≡single, B≡language, D≡collection, E≡collectionType, G≡performer, I≡youtubeVideoId
- integer: H≡followerCount, J≡publicationYear
- url: C≡collectionId, F≡performerId
2nd Normal Form
3rd Normal Form
Boyce Codd Normal Form
Synthesis
Links
Wikipedia
The following graph shows some relevant wikipedia articles. If the nodes are not clickable in your browser you might want to open the svg image in a separate tab to get a clickable version of this graph.
- https://en.wikipedia.org/wiki/Category:Database_theory
- https://en.wikipedia.org/wiki/Functional_dependency
- https://en.wikipedia.org/wiki/Database_normalization
- https://en.wikipedia.org/wiki/First_normal_form
- https://en.wikipedia.org/wiki/Second_normal_form
- https://en.wikipedia.org/wiki/Third_normal_form
- https://en.wikipedia.org/wiki/Boyce%E2%80%93Codd_normal_form
- https://en.wikipedia.org/wiki/Lossless_join_decomposition
- https://en.wikipedia.org/wiki/Candidate_key
- https://en.wikipedia.org/wiki/Surrogate_key
- https://en.wikipedia.org/wiki/Superkey