25 Apr

iLab -- A Platform for IR Experiments

One of the things you realize as a CS grad student is that writing good code is not considered as important as you thought it would be. Since research code only needs to produce the data for your experiment, this downplaying of coding seems to make sense.

When I ask my fellow grad students, their response is: “Why do you care about code that will be used once and then thrown away?” For this reason, most people end up writing ad-hoc scripts that are seldom reused (or even re-read, since they are typically written in Perl, a write-only language).

I took a different view. Three years of experience as a software developer taught me that it is not good for your well-being(!) to see ugly code every day. And the experiments we do as IR grad students should not be entirely different every time.

A year and a half has passed since I got here, and I seem to know a bit more about IR experiments than before. For every new experiment I ran, I tried to extend and generalize existing code rather than starting from scratch, which left me with a considerable amount of Ruby code. (Yes, my language of choice is Ruby. After all, it is a language purportedly designed for the pleasure of programming. How appealing is that?)

The resulting software, dubbed ‘iLab’, consists of a framework that is common to every experiment and a part that supports individual experiments. Since the framework provides object abstractions for the usual things IR experimenters deal with (documents, queries, retrieval engines), you can build your experiment just by calling into it.
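To give a feel for what such object abstractions might look like, here is a minimal sketch in Ruby. It is not iLab's actual API: the names Document, Query, ToyEngine and run_experiment are made up for illustration, and the "engine" is a toy that ranks documents by how many query terms they contain.

```ruby
# Hypothetical sketch; none of these names come from iLab itself.
Document = Struct.new(:id, :text)
Query    = Struct.new(:id, :text)

# A toy retrieval engine that ranks documents by query-term overlap.
class ToyEngine
  def initialize(documents)
    @documents = documents
  end

  # Returns up to top_k document ids, best match first (ties broken by id).
  def search(query, top_k: 10)
    terms = query.text.downcase.split
    scored = @documents.map do |doc|
      words = doc.text.downcase.split
      [doc.id, terms.count { |t| words.include?(t) }]
    end
    scored.select { |_, score| score > 0 }
          .sort_by { |id, score| [-score, id] }
          .first(top_k)
          .map(&:first)
  end
end

# An experiment here is simply: run every query through the engine
# and collect the ranked ids per query id.
def run_experiment(engine, queries, top_k: 10)
  queries.to_h { |q| [q.id, engine.search(q, top_k: top_k)] }
end

docs = [
  Document.new("d1", "ruby is a language designed for programmer happiness"),
  Document.new("d2", "perl scripts are hard to read"),
  Document.new("d3", "information retrieval experiments in ruby")
]
queries = [Query.new("q1", "ruby experiments"), Query.new("q2", "perl")]

results = run_experiment(ToyEngine.new(docs), queries)
results.each { |qid, ids| puts "#{qid}: #{ids.join(' ')}" }
```

The point of the sketch is only that once documents, queries and engines are objects, switching to a new collection or engine means passing different objects to the same experiment code.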

The advantage of using iLab over building ad-hoc code for each experiment seems evident. When you want to work on a new collection or task, you can do it with a simple set of API calls. Compare this with having to copy and paste existing code, which results in a pile of buggy code that you may not want to look at again.

iLab also enables you to do things that would hardly be possible with ad-hoc scripting. For instance, you can build a more sophisticated experiment by combining smaller, simpler experiments. This is useful if, say, you need to run cross-validation of a machine learning algorithm.
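As a rough illustration of composing experiments, the sketch below treats a "simple experiment" as any block that takes a training set and a test set and returns a score, and builds k-fold cross-validation on top of it. Again, this is an assumption about how such composition could look, not iLab's real interface; k_folds, cross_validate and the toy per-query scores are all invented for the example.

```ruby
# Hypothetical sketch; not iLab's actual API.

# Splits items into k folds round-robin, so fold sizes differ by at most one.
def k_folds(items, k)
  folds = Array.new(k) { [] }
  items.each_with_index { |item, i| folds[i % k] << item }
  folds
end

# Composite experiment: runs the given simple experiment once per fold,
# holding that fold out as the test set, and averages the scores.
def cross_validate(items, k, &experiment)
  folds = k_folds(items, k)
  scores = folds.each_index.map do |i|
    test  = folds[i]
    train = folds.each_with_index.reject { |_, j| j == i }.flat_map(&:first)
    experiment.call(train, test)
  end
  { fold_scores: scores, mean: scores.sum.to_f / scores.size }
end

def mean(values)
  values.sum.to_f / values.size
end

# Toy per-query effectiveness scores for two made-up parameter settings.
query_scores = [
  { a: 0.2, b: 0.4 }, { a: 0.3, b: 0.5 }, { a: 0.6, b: 0.1 },
  { a: 0.4, b: 0.6 }, { a: 0.5, b: 0.7 }, { a: 0.1, b: 0.3 }
]

# Simple experiment: pick the setting that does best on the training
# queries, then report its mean score on the held-out queries.
result = cross_validate(query_scores, 3) do |train, test|
  best = [:a, :b].max_by { |setting| mean(train.map { |q| q[setting] }) }
  mean(test.map { |q| q[best] })
end

puts result.inspect
```

Because the inner experiment is just a block, the same cross_validate wrapper can be reused with any other train-then-test experiment.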

If you’re interested, here are the slides that briefly introduce iLab. Also, check out the following experimental result, which is auto-generated by iLab. (Tip: if you click a heading, you can sort the reports by that criterion.)

Since iLab is not yet ready for distribution, please let me know if you want to try it, so that I am motivated to put further effort into it.

Tags: IR, Experiment