NJIT ETD: "Improving document representation by accumulating relevance feedback : the relevance feedback accumulation (RFA) algorithm" by Bot, Razvan Stefan

Current location: https://digitalcommons.njit.edu/dissertations/727

The New Jersey Institute of Technology's
Electronic Theses & Dissertations Project

Title: Improving document representation by accumulating relevance feedback : the relevance feedback accumulation (RFA) algorithm

Author: Bot, Razvan Stefan

View Online: njit-etd2005-127
(xiv, 132 pages ~ 6.8 MB pdf)

Department: Department of Information Systems

Degree: Doctor of Philosophy

Program: Information Systems

Document Type: Dissertation

Advisory Committee: Wu, Yi-Fang Brook (Committee chair)
Turoff, Murray (Committee member)
Oria, Vincent (Committee member)
Belkin, Nicholas J. (Committee member)
Van de Walle, Bartel Albrecht (Committee member)

Date: 2005-05

Keywords: Information retrieval
Relevance feedback
Document representation

Availability: Unrestricted

Abstract:
Document representation (indexing) techniques are dominated by variants of term-frequency analysis, which assumes that the more often a term occurs in a document, the more important the term is in that document. Inherent drawbacks of this approach include poor index quality, large document representations, and the word mismatch problem. To tackle these drawbacks, a document representation improvement method called the Relevance Feedback Accumulation (RFA) algorithm is presented. The algorithm provides a mechanism to continuously accumulate relevance assessments over time and across users. It also provides a document representation modification function, or document representation learning function, that gradually improves the quality of the document representations. To improve document representations, the learning function uses a data mining measure called "support" to analyze the accumulated relevance feedback.
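The abstract does not give the details of the learning function, so the following is only a minimal sketch of the general idea, under stated assumptions. It assumes that each relevance assessment arrives as a (document, query terms) pair judged relevant by some user, that the "support" of a term for a document is the fraction of that document's accumulated assessments whose query contains the term, and that terms at or above a threshold are kept as index terms. The class name FeedbackAccumulator, its methods, and the min_support parameter are hypothetical and are not taken from the dissertation.

```python
from collections import Counter, defaultdict


class FeedbackAccumulator:
    """Hypothetical sketch of accumulating relevance feedback per document.

    Assumption: support(term, doc) = (number of relevant judgments for doc
    whose query contains term) / (number of relevant judgments for doc).
    This is the usual data mining definition of support applied to queries;
    the dissertation's actual learning function may differ.
    """

    def __init__(self, min_support=0.5):
        self.min_support = min_support            # assumed threshold
        self.term_counts = defaultdict(Counter)   # doc_id -> term -> count
        self.feedback_counts = Counter()          # doc_id -> number of judgments

    def record(self, doc_id, query_terms):
        """Accumulate one 'relevant' judgment from any user at any time."""
        self.feedback_counts[doc_id] += 1
        # Use a set so a term repeated within one query counts once.
        self.term_counts[doc_id].update(set(query_terms))

    def support(self, doc_id, term):
        """Fraction of the document's judgments whose query contained term."""
        total = self.feedback_counts[doc_id]
        if total == 0:
            return 0.0
        return self.term_counts[doc_id][term] / total

    def representation(self, doc_id):
        """Index terms whose support meets the threshold."""
        return {
            term
            for term in self.term_counts[doc_id]
            if self.support(doc_id, term) >= self.min_support
        }


# Three judgments for the same document, possibly from different users.
acc = FeedbackAccumulator(min_support=0.5)
acc.record("d1", ["relevance", "feedback"])
acc.record("d1", ["feedback", "indexing"])
acc.record("d1", ["feedback", "relevance", "retrieval"])

index_terms = acc.representation("d1")
```

In this sketch the representation is recomputed from running counts, so each new assessment shifts term supports by a small amount. That is one way a representation could change gradually as feedback accumulates, and terms that rarely appear in relevant queries drop out, which keeps the representation small.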

Evaluation is done by comparing the RFA algorithm to four other algorithms. The four measures used for evaluation are (a) the average number of index terms per document; (b) the quality of the document representations as assessed by human judges; (c) retrieval effectiveness; and (d) the quality of the document representation learning function. The evaluation results show that (1) the algorithm is able to substantially reduce document representation size while maintaining retrieval effectiveness; (2) the algorithm provides a smooth and steady document representation learning function; and (3) the algorithm improves the quality of the document representations. The RFA algorithm's approach is consistent with efficiency considerations that hold in real information retrieval systems.

The major contribution made by this research is the design and implementation of a novel, simple, efficient, and scalable technique for document representation improvement.
