window office RSS

sporadic ramblings of a comp sci grad student studying information retrieval Me @ CMU

Archive

Jun
5th
Thu
permalink

Committee Perceptron implementation notes

I’ve received a few questions regarding our committee perceptron paper & thought I’d post a some notes on our implementation.  Unfortunately, some of the details of the algorithm aren’t clear from the paper — sorry about that!  Also unfortunately, I haven’t made my implementation publicly available yet.  But, it is a simple algorithm & quite easy to implement.

The questions mostly stem from our use of the alpha-bound.  This is a technique for removing outlier observations from the data set and is used in online error-driven learners that make multiple passes through the data.  See Noise Tolerant Variants of the Perceptron Algorithm by Khardon & Wachman for details.  

In our algorithm, we view pairs of documents as observations, and thus consider individual pairs for removal rather than documents.  This approach seems to be a bit more conservative than removing documents (and thus all the pairs that the documents belongs to).  It is also more in keeping with the original motivation of the alpha-bound.  But, the pairwise removal does sever the connection between the alpha-bound threshold and the alpha weights used in the dual form of our ranking formula (equation 2).  Its not immediately clear whether pair-wise or document-wise removal would perform better in practice, and this is an experiment we have not yet run.

I also neglected to mention in the paper that we let the algorithm run for some lag time before considering any observations for removal.  This is necessary to accumulate some statistics on how many times each observation has been incorrectly ordered.  In our testing, we used a lag time of 5 iterations through the data.  This effect is clear in figure 1, where the very beginning of the learning curve is quite erratic.  Much of the smoothing of the learning curve over epochs comes from outliers being removed from the data set, and the resulting learned hypotheses being more stable and of a higher quality.

blog comments powered by Disqus