diff --git a/00_Introduction.qmd b/00_Introduction.qmd index 39bebacea7e4e158aad13e98070b87dee237eb14..34bd408e8c71528e05a7f9dcd6fe809214d97b94 100644 --- a/00_Introduction.qmd +++ b/00_Introduction.qmd @@ -173,7 +173,7 @@ business value ::: {.column width="47.5%"} -- The examination form of this lecture will a sma ll project be to answer a couple of questions as a team for a given dataset using the methodolog learnt over the course of this lecture +- The examination form of this lecture will be a small project to answer a couple of questions as a team for a given dataset using the methodology learnt over the course of this lecture - More details will follow ::: @@ -271,7 +271,7 @@ plot(teams_code_new)#, col = c("white", "grey60")) ::: {.column width="40%"} - +{#fig-jupyter} ::: ::: @@ -317,7 +317,7 @@ sum_xy(5,6) ::: columns ::: {.column width="47.5%"} - +{#fig-jupyter1} ::: ::: {.column width="5%"} @@ -335,7 +335,7 @@ sum_xy(5,6) ::: columns ::: {.column width="47.5%"} - +{#fig-jupyter2} ::: ::: {.column width="5%"} @@ -354,7 +354,7 @@ sum_xy(5,6) ::: columns ::: {.column width="47.5%"} - +{#fig-jupyter3} ::: ::: {.column width="5%"} @@ -374,7 +374,7 @@ sum_xy(5,6) ::: columns ::: {.column width="47.5%"} - +{#fig-jupyter4} ::: ::: {.column width="5%"} diff --git a/01_data.qmd b/01_data.qmd index 0a65d605b7ab55123da8f0f50a6fe725c6552b6b..241d9a29b285b9fbba57e06333ad77cc431c1acb 100644 --- a/01_data.qmd +++ b/01_data.qmd @@ -2,7 +2,7 @@ --- title: "Data Literacy" -subtitle: "Chapter 1: Data" +subtitle: "Chapter 1: Data and Data Bases" author: Prof. Dr. 
Michael Bücker number-offset: [1,0] bibliography: references.bib @@ -35,8 +35,12 @@ bibliography: references.bib - Most important aspect of analog data: stepless - Digitization of analog data: analog information is measured in short time intervals and digital information is determined for each measured value - The quality of this conversion depends on the short time interval between two measurements and on the accuracy of the measurement + {#fig-quantized} + + + ::: :::: @@ -69,9 +73,9 @@ Please watch the following video: ## Data types - +{#fig-pythondatatypes} -# Data types in Python +## Data types in Python :::: {.columns} @@ -128,27 +132,27 @@ type(s) ## From data to wisdom (1/4) {#sec-datawisdom} - +{#fig-datapyramid1} ## [-@sec-datawisdom] From data to wisdom (2/4) {.unnumbered} - +{#fig-datapyramid2} ## [-@sec-datawisdom] From data to wisdom (3/4) {.unnumbered} - +{#fig-datapyramid3} ## [-@sec-datawisdom] From data to wisdom (4/4) {.unnumbered} - +{#fig-datapyramid4} ## Data characteristics - +{#fig-datchar} ## Data types @@ -161,16 +165,166 @@ type(s) ## Motivation -## Relational data +:::: {.columns} + +::: {.column width="47.5%"} + +- **Structured Storage**: Organizes data in a defined manner, allowing for relationship establishment between data types. +- **Data Integrity and Accuracy**: Ensures data remains accurate and consistent through integrity constraints and validation mechanisms. +- **Ease of Data Retrieval**: Facilitates data extraction through sophisticated querying and reporting capabilities. +- **Data Security**: Provides robust protection features to safeguard sensitive data through access controls. +- **Concurrency Control**: Supports simultaneous data access by multiple users while maintaining data consistency. + + +::: + +::: {.column width="5%"} + +::: + +::: {.column width="47.5%"} +- **Data Backup and Recovery**: Offers built-in features to protect against data loss and enables data restoration. 
+- **Scalability and Performance**: Efficiently handles growing data and transactions, ensuring application responsiveness. +- **Compliance and Auditing**: Supports regulatory compliance and provides auditing tools for tracking data access. +- **Cost Efficiency**: Reduces total ownership cost through consolidated data management and automation. +- **Data Analysis and Decision-Making**: Enables data mining and analysis for informed decision-making and insights. + +::: +:::: + ## Relational data models +- **Definition**: A relational data model organizes data into tables (or relations) where each table represents a different entity, and each row in a table represents a unique instance of that entity. Columns within the tables represent attributes of the entities. + +- **Normalization**: A technique used to minimize data redundancy and avoid undesirable characteristics like insertion, update, and deletion anomalies by organizing data in a way that eliminates repeating groups and ensures data dependencies make sense. + +- **ACID Properties**: + - **Atomicity**: Ensures that all parts of a transaction are completed successfully or not at all. + - **Consistency**: Ensures that the database remains in a consistent state before and after the transaction. + - **Isolation**: Ensures that transactions are securely and independently processed at the same time without interference. + - **Durability**: Ensures that the effects of a transaction are permanent and can withstand system failures. + + +- **Schema**: Defines the structure of the relational database including tables, fields, and the relationships between them. The schema acts as a blueprint for how data is organized and how relationships between data are handled. + + + + +## Relational data schemas + +A Relational Model is a type of database model based on the concept of relations, which are akin to tables of data. In a relation, data is organized in tuples (rows) and attributes (columns). 
+ +:::: {.columns} + +::: {.column width="47.5%"} + + +#### 1. Relations (Tables) +- A **Relation** is a set of tuples. +- Each **Tuple** represents a single item. +- Each **Attribute** in a tuple has a specific data type. + + +#### 2. Relationship cardinalities +- **One-to-One (1:1):** Each item in one relation is linked to exactly one item in another relation. +- **One-to-Many (1:M):** One item in a relation can be linked to many items in another relation. +- **Many-to-Many (M:M):** Items in one relation can be linked to multiple items in another relation. + +::: + +::: {.column width="5%"} + +::: + +::: {.column width="47.5%"} + + +#### 3. Keys +- **Primary Key:** A unique identifier for each tuple within a relation. +- **Foreign Key:** A field in one relation that refers to the primary key in another relation. + +#### 4. Integrity Constraints +- **Entity Integrity:** E.g. no primary key value can be null. +- **Referential Integrity:** Ensures that relationships between relations are maintained. +- ... +::: +:::: + + + + +## Visualization of relational data models + + +:::: {.columns} + +::: {.column width="47.5%"} + + +{#fig-relmod} + + +::: + +::: {.column width="5%"} + +::: + +::: {.column width="47.5%"} + +- In a visualization of relational data models, each **table** is represented by a box with the table's name on top and the list of **columns/attributes** below +- Special columns like **primary and foreign keys** are marked +- **Relationships** are represented by connections between the tables with respective notations for the **cardinalities** (see [@fig-cardinalities]) + +{#fig-cardinalities} +::: +:::: + ## Accessing data bases ## Working with data bases - SQL ## Other types of data bases +Traditional Relational Database Management Systems (RDBMS) have been the standard for data storage and management. However, with the advent of big data and real-time applications, other database models have emerged to address specific needs. 
+ +:::: {.columns} + +::: {.column width="47.5%"} +#### 1. NoSQL Databases +- **Key-Value Stores:** Simple and highly scalable, e.g., Redis, DynamoDB. +- **Document Stores:** Store, retrieve, and manage document-oriented information, e.g., MongoDB, CouchDB. +- **Column-family Stores:** Ideal for handling large data sets, e.g., Cassandra, HBase. +- **Graph Databases:** Excellent for managing interconnected data, e.g., Neo4j, Amazon Neptune. + +#### 2. NewSQL Databases +- Aim to provide the scalability of NoSQL databases while maintaining the ACID properties of relational databases, e.g. Google Spanner, CockroachDB. +::: + +::: {.column width="5%"} + +::: + +::: {.column width="47.5%"} + + +#### 3. In-Memory Databases (IMDBs) +- Store data in the main memory (instead of disk) for faster data access, e.g., Redis, SAP HANA. + +#### 4. Time Series Databases (TSDBs) +- Optimized for handling time-series data, e.g., InfluxDB, Prometheus. + +#### 5. Multi-model Databases +- Support multiple data models within a single, integrated backend, e.g., ArangoDB, OrientDB. 
+ +::: +:::: + + + + + # References {.unnumbered .scrollable} ::: {#refs} diff --git a/output/00_Introduction.html b/output/00_Introduction.html index 443639522e7010a891b6baa2d810944461e6bf1b..f887020d8b920acf60629bd8efd64636b94f18a6 100644 --- a/output/00_Introduction.html +++ b/output/00_Introduction.html @@ -608,7 +608,7 @@ Chief Economist at Google</p> <div class="columns"> <div class="column" style="width:47.5%;"> <ul> -<li>The examination form of this lecture will a sma ll project be to answer a couple of questions as a team for a given dataset using the methodolog learnt over the course of this lecture</li> +<li>The examination form of this lecture will a small project be to answer a couple of questions as a team for a given dataset using the methodolog learnt over the course of this lecture</li> <li>More details will follow</li> </ul> </div><div class="column" style="width:5%;"> @@ -677,7 +677,12 @@ Chief Economist at Google</p> </div><div class="column" style="width:5%;"> </div><div class="column" style="width:40%;"> +<div id="fig-jupyter" class="quarto-figure quarto-figure-center"> +<figure> <p><img data-src="https://docs.oracle.com/de-de/iaas/Content/bigdata/images/jupyterhub-launch.png"></p> +<figcaption>Figure 4: The user interface of JupyterHub</figcaption> +</figure> +</div> </div> </div> </section> @@ -711,7 +716,12 @@ Chief Economist at Google</p> <h3><span class="header-section-number">0.4.3</span> Accessing the FH Münster Jupyter Hub (1/4)</h3> <div class="columns"> <div class="column" style="width:47.5%;"> +<div id="fig-jupyter1" class="quarto-figure quarto-figure-center"> +<figure> <p><img data-src="img/jupyter1.png"></p> +<figcaption>Figure 5: The login screen for the JupyterHub at FH Münster</figcaption> +</figure> +</div> </div><div class="column" style="width:5%;"> </div><div class="column" style="width:47.5%;"> @@ -726,7 +736,12 @@ Chief Economist at Google</p> <h3><a href="#/sec-jupyterhub">0.4.3</a> Accessing the FH Münster Jupyter Hub 
(2/4)</h3> <div class="columns"> <div class="column" style="width:47.5%;"> +<div id="fig-jupyter2" class="quarto-figure quarto-figure-center"> +<figure> <p><img data-src="img/jupyter2.png"></p> +<figcaption>Figure 6: Selection of the server for this class</figcaption> +</figure> +</div> </div><div class="column" style="width:5%;"> </div><div class="column" style="width:47.5%;"> @@ -742,7 +757,12 @@ Chief Economist at Google</p> <h3><a href="#/sec-jupyterhub">0.4.3</a> Accessing the FH Münster Jupyter Hub (3/4)</h3> <div class="columns"> <div class="column" style="width:47.5%;"> +<div id="fig-jupyter3" class="quarto-figure quarto-figure-center"> +<figure> <p><img data-src="img/jupyter3.png"></p> +<figcaption>Figure 7: Spinning up the server</figcaption> +</figure> +</div> </div><div class="column" style="width:5%;"> </div><div class="column" style="width:47.5%;"> @@ -759,7 +779,12 @@ Chief Economist at Google</p> <h3><a href="#/sec-jupyterhub">0.4.3</a> Accessing the FH Münster Jupyter Hub (4/4)</h3> <div class="columns"> <div class="column" style="width:47.5%;"> +<div id="fig-jupyter4" class="quarto-figure quarto-figure-center"> +<figure> <p><img data-src="img/jupyter4.png"></p> +<figcaption>Figure 8: The user interface after the login</figcaption> +</figure> +</div> </div><div class="column" style="width:5%;"> </div><div class="column" style="width:47.5%;"> @@ -793,7 +818,7 @@ Chief Economist at Google</p> <div id="fig-stackoverflow" class="quarto-figure quarto-figure-center"> <figure> <p><img data-src="img/stackoverflow.png"></p> -<figcaption>Figure 4: Questions and answers on <a href="https://stackoverflow.com/questions/tagged/pandas">Stack Overflow</a> with regards to the Python library <a href="https://pandas.pydata.org/">pandas</a></figcaption> +<figcaption>Figure 9: Questions and answers on <a href="https://stackoverflow.com/questions/tagged/pandas">Stack Overflow</a> with regards to the Python library <a 
href="https://pandas.pydata.org/">pandas</a></figcaption> </figure> </div> </div> @@ -813,7 +838,7 @@ Chief Economist at Google</p> <div id="fig-youtube" class="quarto-figure quarto-figure-center"> <figure> <p><img data-src="img/youtube.png"></p> -<figcaption>Figure 5: The <a href="https://www.youtube.com/@coreyms">coreyms</a> channel on YouTube with many Python coding instruction videos</figcaption> +<figcaption>Figure 10: The <a href="https://www.youtube.com/@coreyms">coreyms</a> channel on YouTube with many Python coding instruction videos</figcaption> </figure> </div> </div> @@ -847,7 +872,7 @@ Chief Economist at Google</p> <div id="fig-chatgpt" class="quarto-figure quarto-figure-center"> <figure> <p><img data-src="img/chatgpt.png"></p> -<figcaption>Figure 6: ChatGPT writing and testing Python code</figcaption> +<figcaption>Figure 11: ChatGPT writing and testing Python code</figcaption> </figure> </div> </div> diff --git a/output/01_data.html b/output/01_data.html index 28cb3abcdd6e089a21ddd05e9d980d55c09509f3..7be3567a5acf715ad571bf9a470493d7607ec7dd 100644 --- a/output/01_data.html +++ b/output/01_data.html @@ -412,7 +412,7 @@ <section id="title-slide" data-background-image="img/title.png" data-background-size="cover" class="quarto-title-block center"> <h1 class="title">Data Literacy</h1> - <p class="subtitle">Chapter 1: Data</p> + <p class="subtitle">Chapter 1: Data and Data Bases</p> <div class="quarto-title-authors"> <div class="quarto-title-author"> @@ -427,8 +427,7 @@ Prof. Dr. 
Michael Bücker <h2 id="toc-title">Table of contents</h2> <ul> <li><a href="#/data" id="/toc-data"><span class="header-section-number">1.1</span> Data</a></li> -<li><a href="#/data-types-in-python" id="/toc-data-types-in-python"><span class="header-section-number">1.2</span> Data types in Python</a></li> -<li><a href="#/databases" id="/toc-databases"><span class="header-section-number">1.3</span> Databases</a></li> +<li><a href="#/databases" id="/toc-databases"><span class="header-section-number">1.2</span> Databases</a></li> <li><a href="#/references" id="/toc-references">References</a></li> </ul> </nav> @@ -455,8 +454,14 @@ Prof. Dr. Michael Bücker <ul> <li>Most important aspect of analog data: stepless</li> <li>Digitization of analog data: analog information is measured in short time intervals and digital information is determined for each measured value</li> -<li>The quality of this conversion depends on the short time interval between two measurements and on the accuracy of the measurement <img data-src="https://upload.wikimedia.org/wikipedia/commons/7/70/Quantized.signal.svg" id="fig-quantized" alt="An example for quantization of an analog signal"></li> +<li>The quality of this conversion depends on the short time interval between two measurements and on the accuracy of the measurement</li> </ul> +<div id="fig-quantized" class="quarto-figure quarto-figure-center"> +<figure> +<p><img data-src="https://upload.wikimedia.org/wikipedia/commons/7/70/Quantized.signal.svg"></p> +<figcaption>Figure 1.1: An example for quantization of an analog signal</figcaption> +</figure> +</div> </div> </div> </section> @@ -498,10 +503,9 @@ Prof. Dr. 
Michael Bücker <section id="data-types" class="slide level3" data-number="1.1.5"> <h3><span class="header-section-number">1.1.5</span> Data types</h3> -<img data-src="https://i.imgur.com/6cg2E9Q.png" class="r-stretch"></section></section> -<section> -<section id="data-types-in-python" class="title-slide slide level2 center" data-number="1.2"> -<h2><span class="header-section-number">1.2</span> Data types in Python</h2> +<img data-src="https://i.imgur.com/6cg2E9Q.png" class="r-stretch quarto-figure-center"><p class="caption">Figure 1.2: Python data types</p></section> +<section id="data-types-in-python" class="slide level3" data-number="1.1.6"> +<h3><span class="header-section-number">1.1.6</span> Data types in Python</h3> <div class="columns"> <div class="column" style="width:47.5%;"> <ul> @@ -515,7 +519,7 @@ Prof. Dr. Michael Bücker <div class="cell-output cell-output-stdout"> <pre><code>1</code></pre> </div> -<div class="cell-output cell-output-display" data-execution_count="5"> +<div class="cell-output cell-output-display" data-execution_count="33"> <pre><code>int</code></pre> </div> </div> @@ -530,7 +534,7 @@ Prof. Dr. Michael Bücker <div class="cell-output cell-output-stdout"> <pre><code>1.1</code></pre> </div> -<div class="cell-output cell-output-display" data-execution_count="6"> +<div class="cell-output cell-output-display" data-execution_count="34"> <pre><code>float</code></pre> </div> </div> @@ -548,7 +552,7 @@ Prof. Dr. Michael Bücker <div class="cell-output cell-output-stdout"> <pre><code>True</code></pre> </div> -<div class="cell-output cell-output-display" data-execution_count="7"> +<div class="cell-output cell-output-display" data-execution_count="35"> <pre><code>bool</code></pre> </div> </div> @@ -563,60 +567,182 @@ Prof. Dr. 
Michael Bücker <div class="cell-output cell-output-stdout"> <pre><code>Text</code></pre> </div> -<div class="cell-output cell-output-display" data-execution_count="8"> +<div class="cell-output cell-output-display" data-execution_count="36"> <pre><code>str</code></pre> </div> </div> </div> </div> </section> -<section id="sec-datawisdom" class="slide level3" data-number="1.2.1"> -<h3><span class="header-section-number">1.2.1</span> From data to wisdom (1/4)</h3> +<section id="sec-datawisdom" class="slide level3" data-number="1.1.7"> +<h3><span class="header-section-number">1.1.7</span> From data to wisdom (1/4)</h3> -<img data-src="img/pyramid1.png" class="r-stretch"></section> +<img data-src="img/pyramid1.png" class="r-stretch quarto-figure-center"><p class="caption">Figure 1.3: The data pyramid (part 1)</p></section> <section id="sec-datawisdom-from-data-to-wisdom-24" class="slide level3 unnumbered"> -<h3><a href="#/sec-datawisdom">1.2.1</a> From data to wisdom (2/4)</h3> +<h3><a href="#/sec-datawisdom">1.1.7</a> From data to wisdom (2/4)</h3> -<img data-src="img/pyramid2.png" class="r-stretch"></section> +<img data-src="img/pyramid2.png" class="r-stretch quarto-figure-center"><p class="caption">Figure 1.4: The data pyramid (part 2)</p></section> <section id="sec-datawisdom-from-data-to-wisdom-34" class="slide level3 unnumbered"> -<h3><a href="#/sec-datawisdom">1.2.1</a> From data to wisdom (3/4)</h3> +<h3><a href="#/sec-datawisdom">1.1.7</a> From data to wisdom (3/4)</h3> -<img data-src="img/pyramid3.png" class="r-stretch"></section> +<img data-src="img/pyramid3.png" class="r-stretch quarto-figure-center"><p class="caption">Figure 1.5: The data pyramid (part 3)</p></section> <section id="sec-datawisdom-from-data-to-wisdom-44" class="slide level3 unnumbered"> -<h3><a href="#/sec-datawisdom">1.2.1</a> From data to wisdom (4/4)</h3> +<h3><a href="#/sec-datawisdom">1.1.7</a> From data to wisdom (4/4)</h3> -<img data-src="img/pyramid4.png" class="r-stretch"></section> 
-<section id="data-characteristics" class="slide level3" data-number="1.2.2"> -<h3><span class="header-section-number">1.2.2</span> Data characteristics</h3> +<img data-src="img/pyramid4.png" class="r-stretch quarto-figure-center"><p class="caption">Figure 1.6: The data pyramid (part 4)</p></section> +<section id="data-characteristics" class="slide level3" data-number="1.1.8"> +<h3><span class="header-section-number">1.1.8</span> Data characteristics</h3> -<img data-src="img/datacharacteristics.png" class="r-stretch"></section> -<section id="data-types-1" class="slide level3" data-number="1.2.3"> -<h3><span class="header-section-number">1.2.3</span> Data types</h3> +<img data-src="img/datacharacteristics.png" class="r-stretch quarto-figure-center"><p class="caption">Figure 1.7: Types of data characteristics</p></section> +<section id="data-types-1" class="slide level3" data-number="1.1.9"> +<h3><span class="header-section-number">1.1.9</span> Data types</h3> </section></section> <section> -<section id="databases" class="title-slide slide level2 center" data-background-color="#0014a0" data-number="1.3"> -<h2><span class="header-section-number">1.3</span> Databases</h2> +<section id="databases" class="title-slide slide level2 center" data-background-color="#0014a0" data-number="1.2"> +<h2><span class="header-section-number">1.2</span> Databases</h2> <div class="footer"> </div> </section> -<section id="motivation" class="slide level3" data-number="1.3.1"> -<h3><span class="header-section-number">1.3.1</span> Motivation</h3> +<section id="motivation" class="slide level3" data-number="1.2.1"> +<h3><span class="header-section-number">1.2.1</span> Motivation</h3> +<div class="columns"> +<div class="column" style="width:47.5%;"> +<ul> +<li><strong>Structured Storage</strong>: Organizes data in a defined manner, allowing for relationship establishment between data types.</li> +<li><strong>Data Integrity and Accuracy</strong>: Ensures data remains accurate and consistent 
through integrity constraints and validation mechanisms.</li> +<li><strong>Ease of Data Retrieval</strong>: Facilitates data extraction through sophisticated querying and reporting capabilities.</li> +<li><strong>Data Security</strong>: Provides robust protection features to safeguard sensitive data through access controls.</li> +<li><strong>Concurrency Control</strong>: Supports simultaneous data access by multiple users while maintaining data consistency.</li> +</ul> +</div><div class="column" style="width:5%;"> + +</div><div class="column" style="width:47.5%;"> +<ul> +<li><strong>Data Backup and Recovery</strong>: Offers built-in features to protect against data loss and enables data restoration.</li> +<li><strong>Scalability and Performance</strong>: Efficiently handles growing data and transactions, ensuring application responsiveness.</li> +<li><strong>Compliance and Auditing</strong>: Supports regulatory compliance and provides auditing tools for tracking data access.</li> +<li><strong>Cost Efficiency</strong>: Reduces total ownership cost through consolidated data management and automation.</li> +<li><strong>Data Analysis and Decision-Making</strong>: Enables data mining and analysis for informed decision-making and insights.</li> +</ul> +</div> +</div> +</section> +<section id="relational-data-models" class="slide level3" data-number="1.2.2"> +<h3><span class="header-section-number">1.2.2</span> Relational data models</h3> +<ul> +<li><p><strong>Definition</strong>: A relational data model organizes data into tables (or relations) where each table represents a different entity, and each row in a table represents a unique instance of that entity. 
Columns within the tables represent attributes of the entities.</p></li> +<li><p><strong>Normalization</strong>: A technique used to minimize data redundancy and avoid undesirable characteristics like insertion, update, and deletion anomalies by organizing data in a way that eliminates repeating groups and ensures data dependencies make sense.</p></li> +<li><p><strong>ACID Properties</strong>:</p> +<ul> +<li><strong>Atomicity</strong>: Ensures that all parts of a transaction are completed successfully or not at all.</li> +<li><strong>Consistency</strong>: Ensures that the database remains in a consistent state before and after the transaction.</li> +<li><strong>Isolation</strong>: Ensures that transactions are securely and independently processed at the same time without interference.</li> +<li><strong>Durability</strong>: Ensures that the effects of a transaction are permanent and can withstand system failures.</li> +</ul></li> +<li><p><strong>Schema</strong>: Defines the structure of the relational database including tables, fields, and the relationships between them. The schema acts as a blueprint for how data is organized and how relationships between data are handled.</p></li> +</ul> </section> -<section id="relational-data" class="slide level3" data-number="1.3.2"> -<h3><span class="header-section-number">1.3.2</span> Relational data</h3> +<section id="relational-data-schemas" class="slide level3" data-number="1.2.3"> +<h3><span class="header-section-number">1.2.3</span> Relational data schemas</h3> +<p>A Relational Model is a type of database model based on the concept of relations, which are akin to tables of data. In a relation, data is organized in tuples (rows) and attributes (columns).</p> +<div class="columns"> +<div class="column" style="width:47.5%;"> +<h5 id="relations-tables">1. 
Relations (Tables)</h5> +<ul> +<li>A <strong>Relation</strong> is a set of tuples.</li> +<li>Each <strong>Tuple</strong> represents a single item.</li> +<li>Each <strong>Attribute</strong> in a tuple has a specific data type.</li> +</ul> +<h5 id="relationship-cardinalities">2. Relationship cardinalities</h5> +<ul> +<li><strong>One-to-One (1:1):</strong> Each item in one relation is linked to exactly one item in another relation.</li> +<li><strong>One-to-Many (1:M):</strong> One item in a relation can be linked to many items in another relation.</li> +<li><strong>Many-to-Many (M:M):</strong> Items in one relation can be linked to multiple items in another relation.</li> +</ul> +</div><div class="column" style="width:5%;"> + +</div><div class="column" style="width:47.5%;"> +<h5 id="keys">3. Keys</h5> +<ul> +<li><strong>Primary Key:</strong> A unique identifier for each tuple within a relation.</li> +<li><strong>Foreign Key:</strong> A field in one relation that refers to the primary key in another relation.</li> +</ul> +<h5 id="integrity-constraints">4. Integrity Constraints</h5> +<ul> +<li><strong>Entity Integrity:</strong> E.g. 
no primary key value can be null.</li> +<li><strong>Referential Integrity:</strong> Ensures that relationships between relations are maintained.</li> +<li>…</li> +</ul> +</div> +</div> </section> -<section id="relational-data-models" class="slide level3" data-number="1.3.3"> -<h3><span class="header-section-number">1.3.3</span> Relational data models</h3> +<section id="visualization-of-relational-data-models" class="slide level3" data-number="1.2.4"> +<h3><span class="header-section-number">1.2.4</span> Visualization of relational data models</h3> +<div class="columns"> +<div class="column" style="width:47.5%;"> +<div id="fig-relmod" class="quarto-figure quarto-figure-center"> +<figure> +<p><img data-src="https://dev.mysql.com/doc/employee/en/images/employees-schema.png"></p> +<figcaption>Figure 1.8: Example for the visualization of a relational data model</figcaption> +</figure> +</div> +</div><div class="column" style="width:5%;"> + +</div><div class="column" style="width:47.5%;"> +<ul> +<li>In a visualization of relational data models, each <strong>table</strong> is represented by a box with the table’s name on top and the list of <strong>columns/attributes</strong> below</li> +<li>Special columns like <strong>primary and foreign keys</strong> are marked</li> +<li><strong>Relationships</strong> are represented by connections between the tables with respective notations for the <strong>cardinalities</strong> (see <a href="#/visualization-of-relational-data-models">Figure 1.9</a>)</li> +</ul> +<div id="fig-cardinalities" class="quarto-figure quarto-figure-center"> +<figure> +<p><img data-src="https://d2slcw3kip6qmk.cloudfront.net/marketing/pages/chart/erd-symbols/ERD-Notation.PNG"></p> +<figcaption>Figure 1.9: Notation of relationship cardinalities</figcaption> +</figure> +</div> +</div> +</div> </section> -<section id="accessing-data-bases" class="slide level3" data-number="1.3.4"> -<h3><span class="header-section-number">1.3.4</span> Accessing data bases</h3> 
+<section id="accessing-data-bases" class="slide level3" data-number="1.2.5"> +<h3><span class="header-section-number">1.2.5</span> Accessing data bases</h3> </section> -<section id="working-with-data-bases---sql" class="slide level3" data-number="1.3.5"> -<h3><span class="header-section-number">1.3.5</span> Working with data bases - SQL</h3> +<section id="working-with-data-bases---sql" class="slide level3" data-number="1.2.6"> +<h3><span class="header-section-number">1.2.6</span> Working with data bases - SQL</h3> </section> -<section id="other-types-of-data-bases" class="slide level3" data-number="1.3.6"> -<h3><span class="header-section-number">1.3.6</span> Other types of data bases</h3> +<section id="other-types-of-data-bases" class="slide level3" data-number="1.2.7"> +<h3><span class="header-section-number">1.2.7</span> Other types of data bases</h3> +<p>Traditional Relational Database Management Systems (RDBMS) have been the standard for data storage and management. However, with the advent of big data and real-time applications, other database models have emerged to address specific needs.</p> +<div class="columns"> +<div class="column" style="width:47.5%;"> +<h5 id="nosql-databases">1. NoSQL Databases</h5> +<ul> +<li><strong>Key-Value Stores:</strong> Simple and highly scalable, e.g., Redis, DynamoDB.</li> +<li><strong>Document Stores:</strong> Store, retrieve, and manage document-oriented information, e.g., MongoDB, CouchDB.</li> +<li><strong>Column-family Stores:</strong> Ideal for handling large data sets, e.g., Cassandra, HBase.</li> +<li><strong>Graph Databases:</strong> Excellent for managing interconnected data, e.g., Neo4j, Amazon Neptune.</li> +</ul> +<h5 id="newsql-databases">2. NewSQL Databases</h5> +<ul> +<li>Aim to provide the scalability of NoSQL databases while maintaining the ACID properties of relational databases, e.g. 
Google Spanner, CockroachDB.</li> +</ul> +</div><div class="column" style="width:5%;"> + +</div><div class="column" style="width:47.5%;"> +<h5 id="in-memory-databases-imdbs">3. In-Memory Databases (IMDBs)</h5> +<ul> +<li>Store data in the main memory (instead of disk) for faster data access, e.g., Redis, SAP HANA.</li> +</ul> +<h5 id="time-series-databases-tsdbs">4. Time Series Databases (TSDBs)</h5> +<ul> +<li>Optimized for handling time-series data, e.g., InfluxDB, Prometheus.</li> +</ul> +<h5 id="multi-model-databases">5. Multi-model Databases</h5> +<ul> +<li>Support multiple data models within a single, integrated backend, e.g., ArangoDB, OrientDB.</li> +</ul> +</div> +</div> </section></section> <section id="references" class="title-slide slide level2 unnumbered scrollable smaller"> <h2>References</h2>