COURSE UNIT TITLE: INTRODUCTION TO TEXT AND WEB MINING

Description of Individual Course Units

Course Unit Code Course Unit Title Type Of Course D U L ECTS
BIL 3102 INTRODUCTION TO TEXT AND WEB MINING ELECTIVE 3 0 0 5

Offered By

Computer Science

Level of Course Unit

First Cycle Programmes (Bachelor's Degree)

Course Coordinator

ASSISTANT PROFESSOR METE EMINAĞAOĞLU

Offered to

Computer Science

Course Objective

This course aims to introduce queries and documents, document pre-processing, word distribution, patch assessment, automatic indexing and tagging, character matching, query expansion, random graph models, social network analysis, graph-based methods, semi-supervised text correction, spamming and anti-spamming techniques, text summarization, natural language processing, classification of web pages, and knowledge extraction from the web.

Learning Outcomes of the Course Unit

1   Have general information about text mining techniques,
2   Have general information about web mining techniques,
3   Be capable of analysing text based documents,
4   Be capable of searching text based documents,
5   Have information about web search and indexing.

Mode of Delivery

Face-to-Face

Prerequisites and Co-requisites

None

Recommended Optional Programme Components

None

Course Contents

Week Subject Description
1 Introduction to text mining, Boolean retrieval
2 Dictionaries
3 Index construction, compression
4 Scoring, term weighting
5 Computing scores
6 Information retrieval
7 XML retrieval
8 Midterm Exam
9 Language models
10 Text classification
11 Vector space classification
12 Support vector machines, Machine learning on documents
13 Flat and hierarchical clustering
14 Web search basics, web crawling and indexes, link analysis

Recommended or Required Reading

Textbook(s):
- Baldi, P., Frasconi, P., Smyth, P., Modeling the Internet and the Web: Probabilistic Methods and Algorithms, John Wiley and Sons, 2003.
Supplementary Book(s):
- Song, M. and Wu, Y.-F. B. (eds.), Handbook of Research on Text and Web Mining Technologies, Volume I-II, IGI Global, 2007.
- Pal, S.K. and Mitra, P., Pattern Recognition Algorithms for Data Mining, Chapman & Hall/CRC, 2004.

Planned Learning Activities and Teaching Methods

The course is taught through lectures, class presentations and discussions. In addition to the lectures, assigned groups prepare presentations and deliver them in a discussion session. In some weeks, the results of previously assigned homework are discussed.

Assessment Methods

SORTING NUMBER SHORT CODE LONG CODE FORMULA
1 ASG ASSIGNMENT
2 MTE MIDTERM EXAM
3 FIN FINAL EXAM
4 FCG FINAL COURSE GRADE ASG * 0.45 + MTE * 0.25 + FIN * 0.30
5 RST RESIT
6 FCGR FINAL COURSE GRADE (RESIT) ASG * 0.45 + MTE * 0.25 + RST * 0.30
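
The two formulas above can be sketched in Python as a quick check of how the weights combine. This is only an illustration: the function name and the sample scores are hypothetical, and a 0-100 grading scale is assumed, since the table does not state one.

```python
def final_course_grade(asg: float, mte: float, exam: float) -> float:
    """Weighted grade from the table: ASG * 0.45 + MTE * 0.25 + exam * 0.30.

    `exam` is the final exam score (FIN) when computing FCG,
    or the resit score (RST) when computing FCGR.
    """
    return asg * 0.45 + mte * 0.25 + exam * 0.30


# Hypothetical scores, assumed to be on a 0-100 scale.
fcg = final_course_grade(asg=80, mte=60, exam=70)   # uses FIN
fcgr = final_course_grade(asg=80, mte=60, exam=90)  # uses RST instead of FIN

print(round(fcg, 2))
print(round(fcgr, 2))
```

Because the assignment and midterm weights are the same in both formulas, a resit can change the final course grade by at most 30% of the difference between the resit and final exam scores.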


Further Notes About Assessment Methods

None

Assessment Criteria

To be announced.

Language of Instruction

Turkish

Course Policies and Rules

Students must arrive to class on time. Attendance at 70% of the classes is mandatory.

Contact Details for the Lecturer(s)

alper.vahaplar@deu.edu.tr
emel.kuruoglu@deu.edu.tr

Office Hours

Will be announced.

Work Placement(s)

None

Workload Calculation

Activities Number Time (hours) Total Work Load (hours)
Lectures 13 3 39
Preparations before/after weekly lectures 12 1 12
Preparation for midterm exam 1 15 15
Preparation for final exam 1 20 20
Preparing assignments 2 15 30
Final 1 2 2
Midterm 1 2 2
TOTAL WORKLOAD (hours) 120
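
The total above can be reproduced by multiplying each activity's number by its time and summing the results. The short Python sketch below does this with the rows copied from the table. The variable names are illustrative, and the hours-per-credit figure is simply the total divided by the 5 ECTS listed for the course, not an official conversion rule.

```python
# (activity, number, hours each), copied from the workload table.
activities = [
    ("Lectures", 13, 3),
    ("Preparations before/after weekly lectures", 12, 1),
    ("Preparation for midterm exam", 1, 15),
    ("Preparation for final exam", 1, 20),
    ("Preparing assignments", 2, 15),
    ("Final", 1, 2),
    ("Midterm", 1, 2),
]

# Total workload is the sum of number * hours over all activities.
total_hours = sum(number * hours for _, number, hours in activities)

# ECTS value taken from the course unit table at the top of the page.
ects = 5
hours_per_ects = total_hours / ects

print(total_hours)
print(hours_per_ects)
```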

Contribution of Learning Outcomes to Programme Outcomes

PO/LO PO.1 PO.2 PO.3 PO.4 PO.5 PO.6 PO.7 PO.8 PO.9 PO.10 PO.11 PO.12 PO.13
LO.1 3
LO.2 3
LO.3 3 4 4 4
LO.4 3 4 4 4
LO.5 4 3 5 4 4