INFORMATION RETRIEVAL

17:610:551
Spring 2002
Gheorghe Muresan

SYNOPSIS



Aims of the course

The aims of this course are that the students gain an understanding of:

 

 

Course conduct

 

On each meeting day there will be a lecture on the scheduled topic, with readings assigned for that topic. Students are expected to participate in discussion of the topic, based on the readings. To structure the discussion, students will submit notes on the readings. In addition to the lectures and discussion, there will be periodic lab sessions held during the class period, as well as periodic exercises with information retrieval systems outside of class time.

 



Assignments

 

I. Notes on the readings for three topics. Each set of notes should be no longer than two pages and should focus on questions, problems and other issues arising from the readings. The notes should, in general, be critical and evaluative.

II. Two- to five-page reports on practical exercises with information retrieval systems. There will be several such exercises, done outside of class time.

III. Presentation of a paper. Each student is required to select one paper from an extended list that will be made available on the web site, to study that paper thoroughly, and to present its key results to the entire class on a date to be selected. The presentation may include slides, handouts, or web pages. The presentation should make clear: (1) what problem the paper addresses; (2) how it relates to the prior literature it cites; (3) what idea it proposes to solve or alleviate the problem; (4) what was done to implement that idea; (5) what results were found; and (6) what suggestions were made for further work.

IV. A final project, which can take various forms, for instance:

·        a review of an IR topic not covered in detail in lectures. Some suggestions:

1.     The Semantic Web

2.     Agents in IR

3.     Query-expansion techniques

4.     Relevance feedback

5.     Information extraction

6.     Information filtering

7.     Recommender systems

8.     User profiles in IR

·        a detailed description and critique of some operational information retrieval system. Examples:

1.     MG, at RMIT

I.H. Witten, A. Moffat, and T.C. Bell, “Managing Gigabytes: Compressing and Indexing Documents and Images”, 2nd ed., 1999.

Further references are listed on the MG web page.

2.     Inquery, from CIIR, University of Massachusetts, Amherst

3.     Lemur, at Carnegie Mellon University

4.     Cheshire, at UC Berkeley

5.     Okapi

More information on the Okapi project is available from City University.

6.     Mapuccino

7.     AntWorld

8.     Onix

Note: Only the last three are Windows-based; the others work only on Unix or Linux.

·        a comparison of two Web search engines based on their functionality (“what we expect from an IR system”) and on the support the user interface offers. Some of the functionality is not documented (the order and weighting of query terms, for example), so informed guesses, based on observing the output for various inputs, will be necessary;

·        a paper discussing changes taking place (or likely to occur) in operational information retrieval systems within the next five years, with specific reference to factors leading to such changes, and their likely effects;

·        the construction and documentation of (a significant part of) an information retrieval system, evaluated by demonstration (this could be a group project).