BIG DATA AND DATA MINING

Teachers: 
Credits: 
6
Academic year: 
2021/2022
Unit Coordinator: 
Disciplinary Sector: 
Computer Science
Semester: 
Second semester
Year of study: 
1
Language of instruction: 
Italian

Learning outcomes of the course unit

By the end of the course, students should have acquired knowledge and skills related to knowledge representation techniques and data mining algorithms. In particular, students are expected to:
- Know the main problems of Big Data and the objectives of Data Mining.
- Know the main knowledge representation techniques.
- Be able to use formalisms appropriately to represent knowledge.
- Be able to use the main data mining techniques and algorithms.
- Be able to present a project.
- Be able to analyze a problem and develop a data mining project.

Prerequisites

A good knowledge of the relational data model is strongly recommended. Knowledge of imperative programming languages is also a prerequisite.

Course contents summary

■ Semi-structured and unstructured data models
■ The limits of SQL and an introduction to SQL/XML and XQuery
■ Information retrieval models and web information retrieval
■ Data warehousing and data mining

Course contents

■ Part I
  ■ Introduction
  ■ Semi-structured and unstructured data models
■ Part II
  ■ Introduction to XML
  ■ The SQL/XML language
  ■ The XQuery language
  ■ XQuery and database management systems
  ■ NoSQL databases
■ Part III
  ■ Introduction to Information Retrieval
  ■ Ranking
  ■ Web Information Retrieval
  ■ Information Retrieval evaluation
  ■ Advanced methods
■ Part IV
  ■ Data analytics
  ■ Data warehouses
  ■ Data mining: association rules, classification and clustering

Recommended readings

■ A. Møller, M. Schwartzbach - Introduzione a XML - Pearson, 2007, ISBN: 9788871923734
■ P.-N. Tan, M. Steinbach, V. Kumar - Introduction to data mining - Addison Wesley, 2005, ISBN: 0321420527
■ C.D. Manning, P. Raghavan, H. Schütze - Introduction to Information Retrieval - Cambridge University Press, 2008, ISBN: 0521865719
■ M. Golfarelli, S. Rizzi - Datawarehouse. Teoria e pratica della progettazione - McGraw-Hill Education, 2006, ISBN: 9788838662911

Teaching methods

Teaching takes place partly in the classroom.

Assessment methods and criteria

The assessment consists of the discussion of a scientific article. The student explores an advanced topic, starting from a research paper chosen among those proposed, and prepares a presentation to be used during the exam. The discussion focuses mainly on the topics of the chosen article. Alternatively, with the instructor's approval, the student can carry out a project on a course topic; the results of the project must be discussed during the exam. To take part in an exam session, students must register at least 7 days before the exam date. Further public health guidelines and restrictions may require the exam to be held remotely.