December 2008
5 posts
Twitter Scrape (rough draft) →
A fairly large Twitter dataset: 2.7M users, 10M messages, 58M edges. [via Flowing Data]
argmax & Python performance
Intrigued & a little surprised by Daniel Lemire’s posts on computing a fast argmax in Python, I decided to reproduce his somewhat counterintuitive results myself:
All of these methods have the same asymptotic complexity. The first option, array.index(max(array)), is clearly more readable, and faster than the alternatives. BUT, it's traversing the array twice in the worst case, whereas...
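For reference, here is a rough sketch of the kind of comparison involved. Only array.index(max(array)) comes from the excerpt above; the other two variants, the function names and the time_all helper are my own assumptions, not necessarily the methods from the original posts. All three are O(n) and return the first index of the maximum.

```python
import random
import timeit


def argmax_index(values):
    # Two passes at C speed: max() scans the whole list, then index()
    # scans again until it reaches the first occurrence of that value.
    return values.index(max(values))


def argmax_loop(values):
    # One explicit pass in pure Python (assumed alternative).
    best_index, best_value = 0, values[0]
    for i, v in enumerate(values):
        if v > best_value:
            best_index, best_value = i, v
    return best_index


def argmax_key(values):
    # One pass over the indices, comparing by the value stored at each
    # index (assumed alternative).
    return max(range(len(values)), key=values.__getitem__)


def time_all(values, number=20):
    # Hypothetical helper: total seconds for `number` calls of each variant.
    variants = (argmax_index, argmax_loop, argmax_key)
    return {f.__name__: timeit.timeit(lambda: f(values), number=number)
            for f in variants}


if __name__ == "__main__":
    data = [random.random() for _ in range(100000)]
    for name, seconds in time_all(data).items():
        print(name, round(seconds, 4))
```

Timings will vary by machine and Python version, so treat any numbers as indicative only. The point is that the two-pass version runs both scans inside the interpreter's built-ins, while the explicit loop pays per-element bytecode overhead.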
Key Scientific Challenges Program →
$5,000 in unrestricted funding + access to data for PhD students in a variety of areas, including search & IR.
How to write a Good (No Great) PhD Dissertation →
(pdf) slides from Priya Narasimhan at CMU.
E-mail as the social network →