Schedule for 17:610:551

This schedule is subject to change.

(Legend for the readings: MBK = Meadow, Boyce and Kraft; vR = van Rijsbergen; RB = Rik Belew; SJW = Sparck Jones and Willett)

- Week - - Topics / Activities - - Students' responsibilities (during and/or after class) -
Textbook IR: general topics

* 1 *

Tue,
Jan 18

Slides in HTML and PDF

Introduction and overview of the course.

Get familiar with the course website. Set up your course website on eden.
Send me email with your details (use students.xml template).

Model.xls for homework.

* 2 *

Tue,
Jan 25

Slides in HTML and PDF

Introduction to IR. Information vs data retrieval.

What do we want from IR? Introduction to evaluation.

 

* 3 *

Tue,
Feb 01

Slides in HTML and PDF

IR concepts. Aboutness. Relevance.

Rationalist vs. empiricist approaches (AI vs. statistics).

Design decisions for IRS; automatic vs. manual/intellectual systems.

 

* 4 *

Tue,
Feb 08

Slides in HTML and PDF

Indexing.

Document and query representation. Manual vs. automatic indexing.

Cynthia Hammell's presentation - Furnas, G.W., et al. (1987) "The vocabulary problem in human-system communication"

Look at an example of a document collection, a stopword list, an indexed collection and an inverted file.
Formulate a few boolean queries and figure out the result of a boolean search.
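
As a starting point for this exercise, here is a minimal sketch of an inverted file and a few boolean queries. The three documents, the stopword list and the helper names (`tokenize`, `build_inverted_file`, `postings`) are made up for illustration; they are not the collection or the tools used in class.

```python
import re

# Hypothetical toy collection and stopword list, invented for illustration.
DOCS = {
    1: "The cat sat on the mat",
    2: "The dog chased the cat",
    3: "A dog and a bird",
}
STOPWORDS = {"a", "and", "on", "the"}


def tokenize(text):
    """Lowercase the text, keep runs of letters, and drop stopwords."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]


def build_inverted_file(docs):
    """Map each index term to the set of document ids that contain it."""
    index = {}
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index.setdefault(term, set()).add(doc_id)
    return index


INDEX = build_inverted_file(DOCS)


def postings(term):
    """Return the set of document ids for a term (empty if the term is not indexed)."""
    return INDEX.get(term, set())


# Boolean operators correspond to set operations on the postings.
cat_and_dog = postings("cat") & postings("dog")   # cat AND dog
cat_or_bird = postings("cat") | postings("bird")  # cat OR bird
dog_not_cat = postings("dog") - postings("cat")   # dog AND NOT cat

print(sorted(cat_and_dog), sorted(cat_or_bird), sorted(dog_not_cat))
```

Try working out each result by hand from the inverted file before running the code, then compare.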

* 5 *

Tue,
Feb 15

Slides in HTML and PDF


Automatic indexing. Lexical analysis. Weighting. Data structures.

Homework (to be graded).

* 6 *

Tue,
Feb 22

Hands-on indexing.

Michael Giarlo's presentation - Salton & Buckley (1988) "Term weighting approaches in automatic text retrieval"
Jingjing Liu's presentation - Mikheev (2000) "Document centered approach to text normalization"

WebClusterLite lab work. Optional homework.

Lemur lab work. Optional homework.

* 7 *

Tue,
Mar 01

Slides in HTML and PDF

Models of IR.

Interaction models.

Information Retrieval as interaction.

Judy Vioreanu's presentation - Marcia Bates (1988) The Berrypicking model
Beth Chopin's presentation - van Rijsbergen (1986) "A new theoretical framework for information retrieval"
Valerie Forrestal's presentation - Bates, M. (1990) "Where should the person stop and the information search interface start?"
Sara Ridder's presentation - Saracevic, T. (1996) "Interactive models in information retrieval (IR): Progress, problems, proposal"

 

* 8 *

Tue,
Mar 08

Models of IR.

Cognitive models.
Jingjing Liu's presentation - Nick Belkin et al. (1982) "ASK for information retrieval: Part I. Background and theory"

Relevance estimation models.
Vector space model. Probabilistic model. Language models.

Topic models. User models.

An introduction to probabilities for IR - slides in HTML and PDF

 

* 9 *

Tue,
Mar 15

Spring break (no class).

 

* 10 *

Tue,
Mar 22

Slides in HTML and PDF

User interfaces and Information Visualization for IR
Part I : Interaction models.

ClusterBook (HTML and PDF).

Cynthia Hammell's presentation - Brajnik, Mizzaro et al. (2002) "Strategic Help in User Interfaces for IR"

 

* 11 *

Tue,
Mar 29

Slides in HTML and PDF

Evaluation of interactive systems.

Valerie Forrestal's presentation - Lucas and Topi (2004) "Training for Web search: Will it get you in shape?"
Charles Dennis' presentation - Chu, H. and Rosenthal, M. (1996) "Search Engines for the World Wide Web"

 

* 12 *

Tue,
Apr 05

Slides in HTML and PDF

Evaluation of IR systems.

TREC - Tracks: Ad-hoc, Filtering, HARD, Web, Robust, Genomics, Enterprise, Spam.

Charles Dennis' presentation - Jakob Nielsen (2003) "Risks of Quantitative Studies"
Sara Ridder's presentation - He et al. (2004) "HARD Experiment at Maryland: from Need Negotiation to Automated HARD Process"

Michael Giarlo's presentation - Urban et al. (2003) "An Adaptive Approach Towards Content-Based Image Retrieval"

Lab work / homework.
(See the top of the table in my spreadsheet, and my diagrams.)

Optional homework.

* 13 *

Tue,
Apr 12

Slides in HTML and PDF

User interfaces and Information Visualization for IR
Part II: Tools and techniques.

Beth Chopin's presentation - Reiterer et al. (2005) "INSYDER: a content-based visual-information-seeking system for the Web"
Judy Vioreanu's presentation - Karlson et al. (2005) "AppLens and LaunchTile: Two Designs for One-Handed Thumb Use on Small Devices"
Cynthia Hammell's presentation - Information Visualization in IR.

 
Advanced IR: current research topics

* 14 *

Tue,
Apr 19

Charles Dennis' presentation - INEX.

TREC 2003 Interactive sub-track - Jingjing Liu (topic presentation).

AI and IR.
Machine learning and data mining for IR.

Statistical models for IR.
Language models.
Topic modeling.

Structure. Clustering vs. classification. IR for structured documents. INEX.

Natural language processing for IR.

(Also see Lewis' tutorial)

* 15 *

Tue,
Apr 26

Web IR - Sara Ridder (term project).

The Semantic Web - Charles Dennis (term project).

Valerie Forrestal and Michael Giarlo's presentation - Integrating KEA (Keyphrase Extraction Algorithm) in Fedora (term project)

Information Visualization - Judy Vioreanu (topic presentation)

Informetrics and IR.

 

* 16 *

Tue,
May 03

Evaluation of Mediated IR on the Web - Jingjing Liu (term project)

Lucene - Judy Vioreanu (term project)

Cheshire II - Cynthia Hammell (term project)

System evaluation - Beth Chopin (term project)

Collaborative and recommender systems - Sara Ridder (topic presentation)

Cross-language IR.

Personalization and user modeling.

Implicit vs. explicit feedback.

Document summarization.

Information extraction.

Multimedia IR (image, video, music, ...).