10 Jul
On Ranking Techniques for Desktop Search
PubDate(2008), PubPlace(TIS) Author(Cohen,Domshlak,Zwerdling)
keyword(Desktop search, Learning to rank)
Summary
A desktop search experiment using a known-item search task.
Content
Background
- Stuff I’ve Seen (SIS): users sort results by last-update date more often than by IR ranking
- The older the data, the less often it is used
- Types of information stored on the desktop
  - Ephemeral: reminders, short-lived
  - Working: related to ongoing work
  - Archive: long-term resources
Contribution
- Novel features
  - Level: the distance of a file from the top-level directory
  - DirRank: the probability of opening a file in a given directory is proportional to the number of files previously opened from that directory and its subdirectories (normalized by the number of files in the directory)
  - Selectivity: combine the content-similarity feature values, weighting each by the inverse of the number of files with a non-zero value for that feature
    - e.g. if the user's query matches the filename field of 100 files, the name-field score is divided by 100
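
A rough sketch of how these three features might be computed, to make the definitions above concrete. This is not the paper's implementation: the function names, the data layout (`opened_paths`, `file_counts`, `field_scores`), the choice to count a file directly under the root as level 1, and the use of a plain sum to combine the weighted fields are all assumptions of mine.

```python
from pathlib import PurePosixPath


def level(path: str, root: str) -> int:
    """Distance of a file from the top-level directory.

    Assumption: a file directly inside `root` has level 1.
    """
    rel = PurePosixPath(path).relative_to(PurePosixPath(root))
    return len(rel.parts)


def dir_rank(directory: str, opened_paths: list[str],
             file_counts: dict[str, int]) -> float:
    """Files previously opened under `directory` (including subdirectories),
    normalized by the number of files in that directory.

    Assumption: `file_counts` maps a directory path to its file count.
    """
    d = PurePosixPath(directory)
    opened = sum(1 for p in opened_paths if d in PurePosixPath(p).parents)
    total = file_counts.get(str(d), 0)
    return opened / total if total else 0.0


def selectivity_score(field_scores: dict[str, dict[str, float]],
                      doc_id: str) -> float:
    """Combine per-field similarity scores, dividing each field's score by
    the number of files with a non-zero score in that field.

    Assumption: the weighted field scores are simply summed.
    """
    total = 0.0
    for scores in field_scores.values():
        matches = sum(1 for s in scores.values() if s != 0)
        if matches:
            total += scores.get(doc_id, 0.0) / matches
    return total


# Hypothetical data, for illustration only.
opened = ["/home/u/docs/a.txt", "/home/u/docs/p/b.txt", "/home/u/pics/c.png"]
counts = {"/home/u/docs": 4, "/home/u/pics": 10}
fields = {
    "name": {"a": 0.8, "b": 0.4},                      # 2 files match
    "body": {"a": 0.5, "b": 0.5, "c": 0.5, "d": 0.5},  # 4 files match
}
```

With this data, the name field matches only two files, so it carries more weight than the body field, which matches four. A field that matches few files is more selective and so contributes more to the final score.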
Experiment
- Queries are grouped by the number of results returned
- The more results a query returns, the more useful the date-related features become (selectivity)
- Combining results by selectivity was found to be as effective as learning-based methods