Lecture contents: Motivation, BI Architectures, BI Modeling, Star Schema, Multi-Dimensional Models, Special DW/BI Operators, Optimization of DWs: partitioning, aggregates, histograms and other query optimization techniques, Special index methods, Smart implementation of operators, Back room operations, ETL processes: data extraction, data cleansing, data loading, Column-Oriented Databases in Business Intelligence, Data Warehousing Appliances, Cloud Data Analytics
Exercise contents: Loading Data (ETL), Indexing Data, Working with Multi-Dimensional Data (CUBE, Special Query Languages), Visualization, Finding Interesting Data Faster (Histograms, Skyline), Using the Cloud for Map-Reduce, Storing Data in Columns Instead of Rows, In-Memory Solutions
Available on the university network only. Currently these are the slides from the previous edition of the course, held in 2011. Slides will be updated as the course proceeds. Preliminary version of the slides (PDF, 5MB)
The exercise sheets will be published via TUCaN. Because the exercises are practical, most of them require a computer. We will start working on them in the classroom, so please bring your laptop to the exercise session.
A list of papers discussed throughout the course. (Continuously updated!)
- Goetz Graefe. A survey of B-tree locking techniques. ACM Transactions on Database Systems, 35(3):16:1-16:26, July 2010. Available online at http://dl.acm.org/citation.cfm?id=1806908
- Ming-Chuan Wu and Alejandro P. Buchmann. 1998. Encoded Bitmap Indexing for Data Warehouses. In Proceedings of the Fourteenth International Conference on Data Engineering (ICDE '98). IEEE Computer Society, Washington, DC, USA, 220-230. (PDF, 500KB)
- Jim Gray, Adam Bosworth, Andrew Layman, and Hamid Pirahesh. 1996. Data Cube: A Relational Aggregation Operator Generalizing Group-By, Cross-Tab, and Sub-Total. In Proceedings of the Twelfth International Conference on Data Engineering (ICDE '96), Stanley Y. W. Su (Ed.). IEEE Computer Society, Washington, DC, USA, 152-159.
- Jim Gray, Surajit Chaudhuri, Adam Bosworth, Andrew Layman, Don Reichart, Murali Venkatrao, Frank Pellow, and Hamid Pirahesh. 1997. Data Cube: A Relational Aggregation Operator Generalizing Group-By, Cross-Tab, and Sub-Totals. Data Min. Knowl. Discov. 1, 1 (January 1997) (PDF, 672 KB)
- Michael Stonebraker, Daniel J. Abadi, Adam Batkin, et al. C-Store: A Column-oriented DBMS. In Proceedings of the 31st international conference on Very Large Data Bases (VLDB '05). VLDB Endowment, 553-564. Available online at http://dl.acm.org/citation.cfm?id=1083658.
- Stephan Borzsonyi, Donald Kossmann, and Konrad Stocker. 2001. The Skyline Operator. In Proceedings of the 17th International Conference on Data Engineering. IEEE Computer Society, Washington, DC, USA, 421-430.
(You may also consider the following additional references: "Progressive skyline computation in database systems" and "R-trees: a dynamic index structure for spatial searching".)
- Viswanath Poosala, Venkatesh Ganti, Yannis E. Ioannidis: Approximate Query Answering using Histograms. IEEE Data Eng. Bull. 22(4): 5-14 (1999)
- Yannis Ioannidis. 2003. The history of histograms (abridged). In Proceedings of the 29th international conference on Very large data bases - Volume 29 (VLDB '2003) (PDF, 348KB)
- Dimitris Tsirogiannis, Stavros Harizopoulos, Mehul A. Shah, Janet L. Wiener, Goetz Graefe: Query Processing Techniques for Solid State Drives. SIGMOD '09 (PDF, 372KB)
- Daniel J. Abadi, Samuel R. Madden, Nabil Hachem: Column-Stores vs. Row-Stores: How Different Are They Really? SIGMOD '08 (PDF, 414KB)
- Alfons Kemper, Thomas Neumann: HyPer: A hybrid OLTP&OLAP main memory database system based on virtual memory snapshots. ICDE 2011: 195-206 (PDF, 780KB)
Additional References
You might find it useful to additionally consider the following OPTIONAL references (will be continuously updated!):
- O'Reilly. SQL in a Nutshell. Chapter 4. This chapter highlights aspects of the OLAP operators and their implementation in different database systems.
- Dimitris Papadias, Yufei Tao, Greg Fu, and Bernhard Seeger. 2005. Progressive skyline computation in database systems. ACM Trans. Database Syst. 30, 1 (March 2005), 41-82. (PDF, 890KB)
- Antonin Guttman. 1984. R-trees: a dynamic index structure for spatial searching. SIGMOD Rec. 14, 2 (June 1984), 47-57 (PDF, 1.1MB)