Indice

Algorithm Engineering A.A. 2009-2010

Thanks to P. Sanders Teacher: Paolo Ferragina

CFU: 6.

Question time: Monday, 16-19, Room 295, Dept of Computer Science.

Time table: Monday, 9-11 (Room C1); Wednesday, 14-16 (Room B).


Goals

In this course we will study, design and analyze (theoretically and experimentally) advanced algorithms and data structures for the efficient solution of combinatorial problems involving all basic data types, such as integers, strings, (geometric) points, trees and graphs. These algorithmic tools will be designed and analyzed in several models of computation— such as RAM, 2-level memory, cache-oblivious, streaming— in order to take into account the architectural features and the memory hierarchy of modern PCs.

Every lecture will follow a problem-driven approach that starts from a real software-design problem, abstracts it in a combinatorial way (suitable for an algorithmic investigation), and then introduces algorithmic solutions aimed at minimizing the use of some computational resources like time, space, communication, I/O, etc. Some of these solutions will be discussed at an experimental level, in order to introduce proper engineering and tuning tools for algorithmic development.

Exam

Dates Room
11/01/2010 Text
11/02/2010 Text
1/06/2010 Text
22/06/2010 Text
15/07/2010 Text
10/09/2010 room C, hr 15.00, Polo Fibonacci

Background

If you wish to refresh your mind on Algorithms and Data Structures, I suggest you to follow the Video Lectures by Erik Demaine and Charles Leiserson, specifically Lectures 1-5, 7 and 10. There it is missing the part on basic graph problems (representation, DFS, BFS, topological sort) which you may browse in any book, such as Introduction to Algorithms by Cormen-Leiserson-Rivest-Stein, third edition.

Books, Notes, etc.

Since the course covers many different algorithmic themes and will deal with advanced results and issues which are not yet part of books, I'll distribute notes and copies of papers/books which will constitute the reading material of the course.

I'll project no slides for teaching, and use just the old-fashioned blackboard.

Here follows a list of books from which I've taken some parts (details in the lectures):

List of Lectures

Date Lecture Biblio Scribe
Background: If you wish to refresh your mind on basic Algorithms and Data Structures, please look at the above section “Background”. Here it is the lecture's scheme to take notes.
21/09/2009 Introduction to the course: algorithm, asymptotics, computational models, experiments. Algorithm Engineering as a methodology: What's new? An example: max-subArray sum. ✔ [B]: column 7 “Algorithm design techniques”.
23/09/2009 Back of the Envelope calculations on the impact of Disks. Uniform sampling of M integers in [1,N]. ✔ [B]: column 11 “Searching”.
✔ Look also at this post and this other post for the reservoir sampling.
Material first lecture my notes
28/09/2009 No lecture today !
30/09/2009 Sorting on 2-level memories: multi-way mergesort, snow plow, and the complexity of sorting vs permuting. Lower bound notes and Sect 5.7.1 of [MS09]
05/10/2009 Sorting on 2-level memories: sample-sort with multi-pivots. Sorting with multi-disks (sketch of Greed sort). Chap 9 of [PS]
Material second lecture my notes
07/10/2009 String sort: qsort-based, multi-key quicksort, radix sort Chap 9 of [PS]
12/10/2009 String search: array of pointers, Compacted trie, Blind Trie, LPFC Chap 28 of [MS05]
Material third lecture my notes
14/10/2009 Scan-based text search: Karp-Rabin fingerprint, search with one pattern, search with many patterns. Sect 4.4 in [G]
19/10/2009 Scan-based text search: Knuth-Morris-Pratt automata. The tool AGREP. pp 149-154 in [M] and Sect 4-4.2 in [G]
Material fourth lecture my notes
21/10/2009 Esercitazione.
26/10/2009 Competitive analysis and on-line algorithms: search in a list with biased accesses. Move-To-Front: definition and analysis. Paging: LRU, FIFO and LFU.
28/10/2009 Competitive analysis and optimality of LRU. On-line randomized algorithms and oblivious adversary. Marking Algorithm and its analysis.
Material fifth lecture
02/11/2009 no lecture today!
04/11/2009 no lecture today!
09/11/2009 Search Engines: definition, structure. Inverted Lists: compressed storage (gamma, delta, Rice, interpolative, byte-aligned, <s,c>, PForDelta). Query resolution: two terms, more terms.
11/11/2009 Search Engines: text-based and link-based ranking.
Material sixth lecture my notes
16/11/2009 Data compression: notation and terminology. Entropy. Huffman: classic and canonical.
18/11/2009 Data compression: Arithmetic.
20/11/2009 Data compression: Lempel-Ziv (LZ77, gzip, LZ78), Burrows-Wheeler Transform (bzip), MTF, RLE.
Material seventh lecture. Study Chap 4 of [WMB] except sections “computing huffman length”, “maintaining cumulative counts”, “prediction by partial matching”, “dynamic markov compression”, “word-based compression”, “lzw”, “synchronization”. In the paper on “MTF analysis”, study only the proof of theorem 1. my notes
23/11/09 no lecture today!
25/11/09 no lecture today!
27/11/09 Suffix Trees and Suffix Arrays: definition, properties and example of applications. Material eigth lecture.
30/11/09 Hashing: recalling the basics in chaining, Universal.
02/12/09 Perfect and Cuckoo hashing. Material nineth lecture. No proof of theo/cor 11.6-11.8
07/12/09 Bloom filter: definition, properties, applications. Material tenth lecture
09/12/09 Randomized dictionary: Skip lists. Material eleventh lecture
11/12/09 Q&A
14/12/09 Q&A