window office RSS

sporadic ramblings of a comp sci grad student studying information retrieval
Me @ CMU

Archive

Jun
8th
Mon
permalink

Confmaster sucks

At every conference I’ve submitted papers to recently that used Confmaster, the system has completely buckled under the pressure of submission day.  I’ve been trying to upload the final version of a CIKM submission for over an hour, and all I’m getting is a stuck progress meter.  This is really unacceptable.

I maintained the Confmaster system for SIGIR 05, and the system sucks from the administration end, too.

May
26th
Tue
permalink

Just received my first request for a paid link on my website (I assume it’s my academic homepage).  The funny thing is, the requested link is actually for a search engine marketing company.  Anyone else get these?  Is this really how the world of search engine marketing works?

And the real question… how much money should I ask for?

May
25th
Mon
permalink
May
8th
Fri
permalink
Twitter Search will index the content of [linked] pages

Hey @Google - @Twitter To Start Indexing Links For Search

(and more here)

A somewhat shallow post at TechCrunch about Twitter search evolving, but there are some interesting bits in there.  Everyone likes to compare Twitter search to Google, but of course they’re complementary.  Thinking about Twitter indexing content (not just tweets) is interesting.  They have the advantage of a “push” indexing model, where users deliver links to them, rather than having to crawl to discover new content.  As the linked-to post points out, it won’t be as complete as Google’s index, but it *might* be fresher, at least for the portion of the web that’s interesting to Twitter users.

Most likely, Google makes heavy use of two types of data outside of the documents when ranking: links & anchor text from other documents on the web, and usage data from queries that result in clicks on the document.  If Twitter indexed page content, the tweets linking to a page would be a third type of external data to use in ranking.  How much information does this provide that’s not already taken into account from anchor text & query-clicks?

Apr
30th
Thu
permalink
Apr
29th
Wed
permalink
Apr
28th
Tue
permalink

version control with Apple's Time Machine

Time Machine is great.  It’s brain-dead simple and has saved my a** a few times.  But it’s not a proper version control system, and when I’m trying to re-create poorly documented experimental results from 2 years ago with code I have re-written time & time again, I really need a version control system that can handle proper diffs (and tags and branches…).

Here are some tools that seem quite handy.  Although they don’t fill the gap between Time Machine and a proper version control system, they do have diffing capabilities, which is more than I’ve got now:

time machine diff script seems to work as advertised, generating colored diffs across all versions of a file in the backup volume.
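
For the curious, the basic idea of diffing one file across backup snapshots can be sketched in a few lines of Python.  This is not the script linked above, just a rough illustration: the function name `diff_across_snapshots` is made up, and it assumes you already have a list of snapshot directories sorted oldest to newest, each holding a full copy of the file tree.

```python
import difflib
from pathlib import Path


def diff_across_snapshots(snapshot_dirs, relative_path):
    """Yield (older_dir, newer_dir, diff_text) for each consecutive pair of
    snapshots in which the file's contents changed.

    snapshot_dirs is assumed to be sorted oldest to newest (hypothetical
    layout: one directory per backup, each mirroring the original tree).
    """
    previous_dir, previous_lines = None, None
    for snapshot in snapshot_dirs:
        candidate = Path(snapshot) / relative_path
        if not candidate.is_file():
            continue  # the file does not exist in this snapshot; skip it
        lines = candidate.read_text(errors="replace").splitlines(keepends=True)
        if previous_lines is not None and lines != previous_lines:
            diff = "".join(difflib.unified_diff(
                previous_lines,
                lines,
                fromfile=str(Path(previous_dir) / relative_path),
                tofile=str(candidate),
            ))
            yield str(previous_dir), str(snapshot), diff
        # Remember the most recent version seen, changed or not.
        previous_dir, previous_lines = snapshot, lines
```

You would call it with something like `sorted(backup_root.iterdir())` for the snapshot list, where `backup_root` is wherever your dated backup folders live.  No colors, and it only handles text files, but it shows where each change landed.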

And these I haven’t tried yet:

Time Tracker from Charles Soft

tms, a CVS-like command line tool.  Think I’ll wait for the full-featured 1.0 before giving this a whirl, but it looks like it will be the most polished of the bunch when it emerges from beta.

Apr
26th
Sun
permalink