December 2008
5 posts
Twitter Scrape (rough draft)  →
A fairly large Twitter dataset: 2.7M users, 10M messages, 58M edges. [via Flowing Data]
Dec 27th
argmax & Python performance
Intrigued & a little surprised by Daniel Lemire’s posts on computing a fast argmax in Python, I decided to reproduce his somewhat counterintuitive results myself. All of these methods have the same asymptotic complexity.  The first option, array.index(max(array)), is clearly more readable, and faster than the alternatives.  BUT, it’s traversing the array twice in the worst case, whereas...
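For reference, a minimal sketch of the kind of comparison the excerpt describes. The function names, the input size, and the timing harness are illustrative assumptions, not the code from the original posts; only array.index(max(array)) comes from the text.

```python
import random
import timeit


def argmax_two_pass(array):
    # The readable option from the post: one pass for max(), a second for index().
    return array.index(max(array))


def argmax_one_pass(array):
    # Hypothetical alternative: a single explicit loop in pure Python.
    best_index, best_value = 0, array[0]
    for i, value in enumerate(array):
        if value > best_value:
            best_index, best_value = i, value
    return best_index


def argmax_key(array):
    # Hypothetical alternative: max() over the indices, keyed by the values.
    return max(range(len(array)), key=array.__getitem__)


def time_all(n=100_000, repeats=20):
    # Returns seconds per variant; the numbers depend on machine and Python version.
    rng = random.Random(0)
    data = [rng.random() for _ in range(n)]
    variants = (argmax_two_pass, argmax_one_pass, argmax_key)
    return {
        fn.__name__: timeit.timeit(lambda: fn(data), number=repeats)
        for fn in variants
    }
```

All three are O(n) and return the first index of the maximum. Calling time_all() gives a rough way to see how they compare on a given machine.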
Dec 19th
Key Scientific Challenges Program →
$5,000 in unrestricted funding + access to data for PhD students in a variety of areas, including search & IR.  
Dec 19th
How to write a Good (No Great) PhD Dissertation →
(pdf) slides from Priya Narasimhan at CMU.
Dec 10th
E-mail as the social network →
Dec 10th