Data Mining (ETF RIO DM 51060)

General information

Module title

Data Mining

Module code

ETF RIO DM 51060

Study

ETF-B

Department

Computing and Informatics

Year

2

Semester

4

Module type

Mandatory

ECTS

5

Hours

60

Lectures

35

Exercises

25

Tutorials

0

Module goal - Knowledge and skill to be achieved by students

Introduction to the principles of data analysis in random contexts and to finding new relations and information useful for strategic decision making.
Introduction to the elements of an internal search process: defining the search targets, collecting the selected data, preparing it for filtering, and an introduction to data mining techniques and algorithms.
Acquiring the knowledge necessary for choosing the best technique for solving a particular knowledge discovery problem.
Acquiring knowledge about the application of data mining techniques and algorithms, as well as the interpretation and presentation of the results obtained.

Syllabus

INTRODUCTION TO DATA MINING
1. Strategic decision making
2. Strategic planning
3. Knowledge discovery process, defining targets
4. Choice of source data (text, web, image)
DATA MINING TECHNIQUES
5. String matching, brute-force string matching
6. Linear editing algorithms, finite-automata-based string matching, Knuth-Morris-Pratt algorithm, approximate string matching, Wagner-Fischer algorithm for computing string distances
7. Classification, decision-tree-based classification, Bayesian classification, distance-based learners, support vector machines
8. Fuzzy decision trees
9. Clustering, distance measures and symbolic objects, clustering categories, scalable clustering algorithms, approaches based on soft computing, hierarchical symbolic clustering, segmentation
10. Association rules, candidate generation and test methods, rules of interest, multilevel rules, online rule generation, generalized rules, temporal association rules
11. Filtering and data transformation, validation and visualization of results
ARCHITECTURES AND STANDARDIZATION
12. Data mining system architectures, standardization of information obtained by data mining

Literature

Recommended:
1. Notes and slides from lectures (see the Faculty website)
2. Han, Kamber: Data Mining - Concepts and Techniques, Morgan Kaufmann, 2000.
Additional:
1. Hand, Mannila, Smyth: Principles of Data Mining, MIT Press, 2001.

Didactic methods

Through lectures, students will learn about the theory, tasks and applied examples within the thematic units. Lectures consist of a theoretical part, descriptive examples, and the formulation and solution of specific tasks. In this way, students will gain a basis for applying the material covered in engineering applications. Additional examples and exam tasks are discussed and solved during the laboratory exercises. Laboratory practice and home assignments will enable students to work continuously and to verify their knowledge.

Exams

During the course, students collect points according to the following system:
- Attending lectures, exercises and tutorial classes: 10 points; a student with more than three absences from lectures, exercises and/or tutorials cannot earn these points;
- Home assignments: a maximum of 10 points, assuming 5 to 10 assignments evenly distributed throughout the semester;
- Partial exams: two written partial exams, with a maximum of 20 points for each passed partial exam.
A student who earned fewer than 20 points during the semester must re-enroll in the course.
A student who earned 40 or more points during the semester may take the final oral exam. The exam consists of discussing the partial exam tasks and home assignments, and of answering simple questions related to the course topics.
The final oral exam provides a maximum of 40 points. To achieve a passing final grade, a student must earn at least 20 points in this exam. Students who do not reach this minimum may take the makeup oral exam.
A student who earned 20 or more but fewer than 40 points during the semester may take the makeup exam. The makeup exam is structured as follows:
- A written part, structured in the same way as a partial written exam, in which students solve problems from the topics they failed in the partial exams (fewer than 10 points earned);
- An oral part, structured in the same way as the final oral exam.
Only students who reach a total score of 40 or more points after passing the written part of the makeup exam may take the oral makeup exam. This score consists of the points earned through class attendance, home assignments, passed partial exams and the written part of the makeup exam.
The oral makeup exam provides a maximum of 40 points. To achieve a passing final grade, a student must earn at least 20 points in this exam. Students who do not reach this minimum must re-enroll in the course.
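
The point thresholds above can be summarised in a short sketch. It is only an illustration, not an official grading tool: the function names `exam_route` and `passes_oral` and the returned labels are hypothetical, and only the thresholds (20 and 40 semester points, 20 oral exam points) are taken from the rules stated above.

```python
def exam_route(semester_points: float) -> str:
    """Map the points collected during the semester to the next step.

    Thresholds follow the rules above; the labels are illustrative only.
    """
    if semester_points < 20:
        return "re-enroll"
    if semester_points < 40:
        return "makeup exam"
    return "final oral exam"


def passes_oral(oral_points: float) -> bool:
    """An oral exam (final or makeup) is passed with at least 20 of 40 points."""
    return oral_points >= 20
```

For example, under these assumptions a student with 35 semester points would be routed to the makeup exam, while a student with 45 points would go directly to the final oral exam.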

Additional notes