====== Decision Support Databases (662AA) A.Y. 2021/22 ======

The course presents the main approaches to the design and implementation of decision support databases, and the characteristics of business intelligence tools and computer-based information systems used to produce summary information to facilitate appropriate decision-making processes and make them faster and more objective. Particular attention will be paid to themes such as conceptual and logical Data Warehouse design, data analysis using analytic SQL, algorithms for selecting materialized views, and data warehouse systems technology (indexes, star query optimization, physical design, query rewrite methods to use materialized views). Part of the course will be dedicated to a collection of case studies.

=====Instructor=====

  * **Salvatore Ruggieri** (Lectures)
    * Università di Pisa
    * [[http://pages.di.unipi.it/ruggieri/]]
    * [[salvatore.ruggieri@unipi.it]]
    * **Office hours:** Tuesdays, 14:00 - 16:00, or by appointment, Department of Computer Science, room 321/DO.
    * **Office hours only via Skype or Teams by appointment. Skype contact: salvatore.ruggieri**

=====Classes=====

Lessons will also be live-streamed on the [[https://teams.microsoft.com/l/team/19%3a3WUVDFKLmNbj2SXQnQzK1_BADY_6B1nZNbVXjg4qo8Y1%40thread.tacv2/conversations?groupId=a4c2f53f-0175-451f-96dc-f9f0bf8a1819&tenantId=c7456b31-a220-47f5-be52-473828670aa1|Teams space]].\\
^ Day of Week ^ Hour ^ Room ^
| Tuesday | 9:00 - 11:00 | Fib M1 |
| Friday | 16:00 - 18:00 | Fib C |

=====Mandatory teaching material =====

  * **[DW]** A. Albano, S. Ruggieri. [[http://fondamentidibasididati.it/wp-content/uploads/2020/11/DWessential-2021-C3-12-21.pdf|Decision Support Databases Essentials]], University of Pisa, 2 December 2021.
  * **[DB]** A. Albano. [[http://fondamentidibasididati.it/wp-content/uploads/2020/11/DBEssential-2021-C30-11-21.pdf|DB Essentials]] and [[http://fondamentidibasididati.it/wp-content/uploads/2020/11/DBEssential-2020-Soluzioni-C30-11-21.pdf|solutions to exercises]], University of Pisa, 1 December 2020. This is a self-contained excerpt (in English) from the book [[http://fondamentidibasididati.it|Fondamenti di basi di dati]] (in Italian, free download).
  * Examples of [[http://patterns.di.unipi.it/dsd/DSDsamples.pdf|written exams with solutions]] and [[http://patterns.di.unipi.it/dsd/dsd2020sample.pdf|written exam]].

=====Software=====

  * [[http://fondamentidibasididati.it/index.php/about-jrs/|JRS]] for practicing with logical and physical SQL query plans. JRS requires [[https://www.oracle.com/java/technologies/downloads/#java8|Java SE 8]] (registration is required to download).
  * [[https://docs.microsoft.com/en-us/sql/azure-data-studio/download|Azure Data Studio]] client for connecting to the Foodmart database on the SQL Server DBMS.
  * [[https://start.unipi.it/en/help-ict/vpn/|Access to University digital services through VPN]]: connect to the unipi VPN (unless you are already on the unipi.it network) to access the Foodmart database.

=====Preliminary program and calendar=====

  * [[https://esami.unipi.it/programma.php?c=52424&aa=2021|Preliminary program]].
  * [[https://didattica.di.unipi.it/en/master-programme-in-data-science-and-business-informatics/academic-calendar-2021-2022/|Calendar of lessons]].

=====Exams=====

__//There are no mid-terms//.__ The exam consists of a written part and an oral part. The written part consists of open questions, small exercises, and a Data Warehouse design problem. Each question is assigned a score, and the scores sum up to 30 points. Students are admitted to the oral part if they receive a grade of at least 18 points. The oral part consists of a critical discussion of the written part, and of open questions and problem solving on the topics of the course.
Registration for exams is mandatory (**look at the deadline for registering!**): [[https://esami.unipi.it/esami2/|register here]]\\
^ Date ^ Hour ^ Room ^ Notes ^

=====Class calendar =====

Lessons will also be live-streamed on the [[https://teams.microsoft.com/l/team/19%3a3WUVDFKLmNbj2SXQnQzK1_BADY_6B1nZNbVXjg4qo8Y1%40thread.tacv2/conversations?groupId=a4c2f53f-0175-451f-96dc-f9f0bf8a1819&tenantId=c7456b31-a220-47f5-be52-473828670aa1|Teams space]].\\
Recordings and teaching material are password protected. Ask the teacher for credentials.\\
To watch the recordings, right-click on the link and download the whole file. To play the files locally on your computer, you can use, e.g., [[http://www.videolan.org/vlc/|VLC media player]].

**2021-01.** //Tuesday 14 September 2021, 9-11// **[DW: 1.1-1.2]** [[http://patterns.di.unipi.it/dsd/video/dsd01_20210914.mp4|rec01 audio-video (.mp4)]] Course overview. Need for Strategic Information. Information Systems in Organizations: Operational and Decision support. Data-driven Decision support systems and Business Intelligence applications. From data to information for decision making. Types of data synthesis: Reports, Multidimensional data analysis, Exploratory data analysis.

**2021-02.** //Friday 17 September 2021, 16-18// **[DW: 1.3-1.7]** [[http://patterns.di.unipi.it/dsd/video/dsd02_20180919.flv|rec02 audio-video (.flv) past years]] The data warehouse (DW) and DW architectures. What to model in a DW: Facts, measures, dimensions and dimensional hierarchies. Examples of data analysis. Exercises on data analysis in SQL.

**2021-03.** //Tuesday 21 September 2021, 9-11// **[DB: 1.1, 2.1-2.5]** [[http://patterns.di.unipi.it/dsd/video/dsd03_20210921.mp4|rec03 audio-video (.mp4)]] Review: the Object Data Model. [[http://patterns.di.unipi.it/dsd/dsd.03.assignments.pdf|Exercises at home for the lesson 2021-05]].

**2021-04.** //Friday 24 September 2021, 16-18// **[DW: 2.1]** [[http://patterns.di.unipi.it/dsd/video/dsd04_20170929.flv|rec04 audio-video (.flv) past years]] DW modeling. A conceptual multidimensional data model. Representation of facts, measures, dimensions, attributes and dimensional hierarchies. Key steps in conceptual design from business questions. How to identify fact types, fact granularity and measure types. How to identify dimensions, dimensional attributes and hierarchies. Examples. [[http://patterns.di.unipi.it/dsd/dsd.04.assignments.pdf|Exercises at home for the lesson 2021-05]].

**2021-05.** //Tuesday 28 September 2021, 9-11// **[DW: 2.1, A.1]** [[http://patterns.di.unipi.it/dsd/video/dsd05_20210928.mp4|rec05 audio-video (.mp4)]] An example data model for Master program exams. Presentation and discussion of the Hospital case study. [[http://patterns.di.unipi.it/dsd/dsd.05.assignments.pdf|Exercises at home for the lesson 2021-07]].

**2021-06.** //Friday 1 October 2021, 16-18// **[DB: 3.1-3.2]** [[http://patterns.di.unipi.it/dsd/video/dsd06_20211001.mp4|rec06 audio-video (.mp4)]] Review: the relational model and relational algebra. Exercises. [[http://patterns.di.unipi.it/dsd/dsd.06.assignments.pdf|Exercises at home for the lesson 2021-08]].

**2021-07.** //Tuesday 5 October 2021, 9-11// **[DW: 2.1, 2.2, A.1, B.1]** [[http://patterns.di.unipi.it/dsd/video/dsd07_20211005.mp4|rec07 audio-video (.mp4)]] More about data mart conceptual design, changing dimensions and advanced data model features. From conceptual design to relational logical design. Star model, snowflake, and constellation. Logical schema of the Hospital case study. [[http://patterns.di.unipi.it/dsd/dsd.07.assignments.pdf|Exercises at home for the lesson 2021-09]].

**2021-08.** //Friday 8 October 2021, 16-18// **[DB: 3.2-3.4]** [[http://patterns.di.unipi.it/dsd/video/dsd08_20211008.mp4|rec08 audio-video (.mp4)]] Review: the relational model and relational algebra. Logical trees.
[[http://patterns.di.unipi.it/dsd/dsd.08.exercises.pdf|Exercises with JRS]]. [[http://patterns.di.unipi.it/dsd/dsd.08.assignments.pdf|Exercises at home for the lesson 2021-09]].

**2021-09.** //Tuesday 12 October 2021, 9-11// **[DW: A.2, B.2]** [[http://patterns.di.unipi.it/dsd/video/dsd09_20211012.mp4|rec09 audio-video (.mp4)]] Discussion of students' solutions to the conceptual and logical design case studies.

**2021-10.** //Friday 15 October 2021, 16-18// **[DW: 3.1-3.5]** [[http://patterns.di.unipi.it/dsd/video/dsd10_20211015.mp4|rec10 audio-video (.mp4)]] Data Warehouse design approaches. Data mart logical design.

**2021-11.** //Tuesday 19 October 2021, 9-11// **[DW: 3.1-3.5]** [[http://patterns.di.unipi.it/dsd/video/dsd11_20211019.mp4|rec11 audio-video (.mp4)]] Slowly changing dimensions, fast changing dimensions, shared dimensions. Recursive hierarchies. Multivalued dimensions. [[http://patterns.di.unipi.it/dsd/dsd.11.assignments.pdf|Exercises at home for the lesson 2021-12]].

**2021-12.** //Friday 22 October 2021, 16-18// **[DW: 4.1-4.8]** [[http://patterns.di.unipi.it/dsd/video/dsd12_20211022.mp4|rec12 audio-video (.mp4)]] A DW to support Analytical CRM Analysis. Wrap up on DW design. [[http://patterns.di.unipi.it/dsd/dsd.12.assignments.pdf|Exercises at home for the lesson 2021-14]].

**2021-13.** //Tuesday 26 October 2021, 9-11// **[DW: 2.3, 2.4]** [[http://patterns.di.unipi.it/dsd/video/dsd13_20211026.mp4|rec13 audio-video (.mp4)]] Multidimensional Cube model: OLAP Operations. The extended cube and the lattice of cuboids. Pivot tables in Excel. PowerPivot.\\
**Additional learning material:**
  * G. Harvey. Excel 2013 All-in-One For Dummies, 2013. [[http://patterns.di.unipi.it/dsd/PivotTable2013BookVIIchpt2.pdf|Chp. VII-2]] and [[http://patterns.di.unipi.it/dsd/HerbalTeas.xlsx|example data for pivot table]].
  * [[https://support.office.com/en-us/article/power-pivot-overview-and-learning-f9001958-7901-4caa-ad80-028a6d2432ed|Power Pivot overview]].

**2021-14.** //Friday 29 October 2021, 16-18// **[DB: 4.1-4.2, 5.1-5.11]** [[http://patterns.di.unipi.it/dsd/video/dsd14_20211029.mp4|rec14 audio-video (.mp4)]] Review of: DBMS, from SQL to extended relational algebra. Exercises. [[http://patterns.di.unipi.it/dsd/dsd.14.assignments.pdf|Exercises at home for the lesson 2021-15]].

**2021-15.** //Tuesday 2 November 2021, 9-11// **[DW: 5.1-5.3]** [[http://patterns.di.unipi.it/dsd/video/dsd15_20211102.mp4|rec15 audio-video (.mp4)]] OLAP systems. Data Analysis Using SQL. Simple reports. Examples. Moderately Difficult Reports. Solutions in SQL. [[http://patterns.di.unipi.it/dsd/dsd.15.foodmart.pdf|Foodmart data warehouse schema]].

**2021-16.** //Friday 5 November 2021, 16-18// **[DW: 5.4-5.5]** [[http://patterns.di.unipi.it/dsd/video/dsd16_20211105.mp4|rec16 audio-video (.mp4)]] Examples of variance reports. Very Difficult Reports without Analytic SQL. Example of reports with ranks. Analytic Functions with the use of partitions and running totals. Examples. [[http://patterns.di.unipi.it/dsd/dsd.16.assignments.pdf|Exercises at home for the lesson 2021-17]].

**2021-17.** //Tuesday 9 November 2021, 9-11// **[DW: 5.5-5.6]** [[http://patterns.di.unipi.it/dsd/video/dsd17_20211109.mp4|rec17 audio-video (.mp4)]] Analytic Functions with the use of moving windows. Examples. Exercises on Analytic SQL. [[http://patterns.di.unipi.it/dsd/dsd.17.assignments.pdf|Exercises during the lesson and at home]] and [[http://patterns.di.unipi.it/dsd/dsd.17.solutions.txt|solutions]].

**2021-18.** //Friday 12 November 2021, 16-18// **[DB: 6.1-6.6, 6.8, 7.1-7.2]** [[http://patterns.di.unipi.it/dsd/video/dsd18_20211112.mp4|rec18 audio-video (.mp4)]] Review of relational DBMS internals: Storage, Indexing and Query Evaluation. Physical operators and physical plans for projection, selection, joins and grouping. Examples.

**2021-19.** //Tuesday 16 November 2021, 9-11// **[DW: 6.1-6.4]** [[http://patterns.di.unipi.it/dsd/video/dsd19_20211116.mp4|rec19 audio-video (.mp4)]] Data Warehouse Systems: Special-Purpose Indexes and Star Query Plan. Bitmap indexes. Join indexes. Star query optimization and query plans. Examples. Table partitioning.

**2021-20.** //Friday 19 November 2021, 16-18// **[DW: 7.1-7.7]** [[http://patterns.di.unipi.it/dsd/video/dsd20_20211119.mp4|rec20 audio-video (.mp4)]] The problem of materialized view selection. The lattice of views and the HRU greedy algorithm for the selection of materialized views. Examples. Other algorithms for choosing the views to materialize, given a workload and dimensional hierarchies. [[http://patterns.di.unipi.it/dsd/dsd.20.assignments.pdf|Exercises at home for the lesson 2021-21]].

**2021-21.** //Tuesday 23 November 2021, 9-11// **[DW: 8.1-8.2, DB: 3.5.1-3.5.4]** [[http://patterns.di.unipi.it/dsd/video/dsd21_20211123.mp4|rec21 audio-video (.mp4)]] Review of functional dependency properties and how they are used to reason about the properties of the result of a query. Properties of the group-by operator.

**2021-22.** //Friday 26 November 2021, 16-18// **[DW: 8.3-8.6]** [[http://patterns.di.unipi.it/dsd/video/dsd22_20211126.mp4|rec22 audio-video (.mp4)]] The problem of evaluating the group-by before the join operator. First case: invariant grouping. Examples. Other cases: double grouping, grouping and counting. Examples with star queries.

**2021-23.** //Tuesday 30 November 2021, 9-11// **[DW: 9.1-9.4]** [[http://patterns.di.unipi.it/dsd/video/dsd23_20211130.mp4|rec23 audio-video (.mp4)]] The problem of query rewrite to use a materialized view. Hypotheses and two approaches: with a compensation on the logical view plan, and with a transformation of the logical query plan. Examples.

**2021-24.** //Friday 3 December 2021, 16-18// **[DW: 6.5-6.8]** [[http://patterns.di.unipi.it/dsd/video/dsd24_20211203.mp4|rec24 audio-video (.mp4)]] Data Warehousing trends: column-oriented DW, main-memory DW, Big Data framework.

=====Previous years=====

  * [[mds:dsd:2020|Decision Support Databases A.Y. 2020/21]] (special edition)