<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"><channel><description>sporadic ramblings of a comp sci grad student studying information retrievalMe @ CMU</description><title>window office</title><generator>Tumblr (3.0; @windowoffice)</generator><link>http://windowoffice.tumblr.com/</link><item><title>Le Zhao's research tricks</title><description>&lt;a href="http://www.cs.cmu.edu/~lezhao/researchtricks.htm"&gt;Le Zhao's research tricks&lt;/a&gt;: &lt;p&gt;&lt;a href="http://www.cs.cmu.edu/~lezhao/"&gt;One of my colleagues at CMU&lt;/a&gt; has posted quite a few nice tips &amp; tricks for conducting CS &amp; IR research.&lt;/p&gt;</description><link>http://windowoffice.tumblr.com/post/237404698</link><guid>http://windowoffice.tumblr.com/post/237404698</guid><pubDate>Sun, 08 Nov 2009 14:43:46 -0800</pubDate></item><item><title>Yisong Yue on Self-Improving Systems that Learn Through Human Interaction</title><description>&lt;a href="http://www.scientificblogging.com/stated_degree_confidence/blog/selfimproving_systems_learn_through_human_interaction"&gt;Yisong Yue on Self-Improving Systems that Learn Through Human Interaction&lt;/a&gt;: &lt;p&gt;(via &lt;a href="http://hunch.net/?p=1014"&gt;hunch.net&lt;/a&gt;)&lt;/p&gt;</description><link>http://windowoffice.tumblr.com/post/235916757</link><guid>http://windowoffice.tumblr.com/post/235916757</guid><pubDate>Sat, 07 Nov 2009 04:33:30 -0800</pubDate></item><item><title>another good InfoVis at xkcd</title><description>&lt;a href="http://xkcd.com/657/"&gt;another good InfoVis at xkcd&lt;/a&gt;</description><link>http://windowoffice.tumblr.com/post/230720181</link><guid>http://windowoffice.tumblr.com/post/230720181</guid><pubDate>Mon, 02 Nov 2009 03:53:29 -0800</pubDate></item><item><title>M45 Enables Web-Scale Information Extraction Research (Hadoop and Distributed Computing at Yahoo!)</title><description>&lt;a href="http://developer.yahoo.net/blogs/hadoop/2009/10/m45_enables_webscale_informati.html"&gt;M45 Enables Web-Scale Information Extraction 
Research (Hadoop and Distributed Computing at Yahoo!)&lt;/a&gt;: &lt;p&gt;A post by a couple of my CMU colleagues on the Yahoo! Developer Network blog.&lt;/p&gt;</description><link>http://windowoffice.tumblr.com/post/225833340</link><guid>http://windowoffice.tumblr.com/post/225833340</guid><pubDate>Wed, 28 Oct 2009 04:51:13 -0700</pubDate></item><item><title>CFP ACM SIGIR 2010</title><description>&lt;a href="http://sigir2010.org/doku.php?id=cfp"&gt;CFP ACM SIGIR 2010&lt;/a&gt;: &lt;p&gt;Jan. 22 paper deadline.&lt;/p&gt;</description><link>http://windowoffice.tumblr.com/post/221767133</link><guid>http://windowoffice.tumblr.com/post/221767133</guid><pubDate>Sat, 24 Oct 2009 05:18:58 -0700</pubDate></item><item><title>Web 2.0 Summit 09:  Qi Lu and Tim O’Reilly on search at MS...</title><description>&lt;object width="400" height="336"&gt;&lt;param name="movie" value="http://www.youtube.com/v/WT2wqXrBQHI&amp;rel=0&amp;egm=0&amp;showinfo=0&amp;fs=1"&gt;&lt;/param&gt;&lt;param name="wmode" value="transparent"&gt;&lt;/param&gt;&lt;param name="allowFullScreen" value="true"&gt;&lt;/param&gt;&lt;embed src="http://www.youtube.com/v/WT2wqXrBQHI&amp;rel=0&amp;egm=0&amp;showinfo=0&amp;fs=1" type="application/x-shockwave-flash" width="400" height="336" allowFullScreen="true" wmode="transparent"&gt;&lt;/embed&gt;&lt;/object&gt;&lt;br/&gt;&lt;br/&gt;&lt;p&gt;Web 2.0 Summit 09:  Qi Lu and Tim O’Reilly on search at MS (via &lt;a href="http://www.cs.cmu.edu/~yangboz/"&gt;Yangbo&lt;/a&gt;)&lt;/p&gt;</description><link>http://windowoffice.tumblr.com/post/221761166</link><guid>http://windowoffice.tumblr.com/post/221761166</guid><pubDate>Sat, 24 Oct 2009 05:05:00 -0700</pubDate></item><item><title>The On-Line Encyclopedia of Integer Sequences</title><description>&lt;a href="http://www.research.att.com/~njas/sequences/"&gt;The On-Line Encyclopedia of Integer Sequences&lt;/a&gt;: &lt;p&gt;A clever retrieval system over some very interesting data.&lt;/p&gt;</description><link>http://windowoffice.tumblr.com/post/221454248</link><guid>http://windowoffice.tumblr.com/post/221454248</guid><pubDate>Fri, 23 Oct 2009 19:34:42 -0700</pubDate></item><item><title>Binary marble adding machine</title><description>&lt;a href="http://woodgears.ca/marbleadd/index.html"&gt;Binary marble adding machine&lt;/a&gt;</description><link>http://windowoffice.tumblr.com/post/215592398</link><guid>http://windowoffice.tumblr.com/post/215592398</guid><pubDate>Sat, 17 Oct 2009 
09:51:41 -0700</pubDate></item><item><title>Got the wrong Bob?</title><description>&lt;a href="http://gmailblog.blogspot.com/2009/10/new-in-labs-got-wrong-bob.html"&gt;Got the wrong Bob?&lt;/a&gt;: &lt;p&gt;New Gmail Labs feature, looks very similar to &lt;a href="http://www.cs.cmu.edu/%7Evitor/papers/ecir2008.pdf"&gt;a paper written by a good friend and CMU grad.&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;(via &lt;a href="http://www.cs.cmu.edu/~wcohen/"&gt;William Cohen&lt;/a&gt;)&lt;/p&gt;</description><link>http://windowoffice.tumblr.com/post/212449840</link><guid>http://windowoffice.tumblr.com/post/212449840</guid><pubDate>Tue, 13 Oct 2009 18:46:28 -0700</pubDate></item><item><title>United States Gross National Happiness on Facebook</title><description>&lt;a href="http://apps.facebook.com/usa_gnh/?_fb_fromhash=9229a923ed77cf1fc76374bd85617304"&gt;United States Gross National Happiness on Facebook&lt;/a&gt;: &lt;p&gt;Large scale sentiment analysis. (via &lt;a href="http://flowingdata.com/2009/10/05/facebook-measures-happiness-in-status-updates/"&gt;FlowingData&lt;/a&gt;)&lt;/p&gt;
&lt;p&gt;They label the spikes in happiness, but I really wish they’d labeled the dips, too.&lt;/p&gt;</description><link>http://windowoffice.tumblr.com/post/204982730</link><guid>http://windowoffice.tumblr.com/post/204982730</guid><pubDate>Mon, 05 Oct 2009 04:19:00 -0700</pubDate></item><item><title>Google Search Guru Singhal: We Will Try Outlandish Ideas - BusinessWeek</title><description>&lt;a href="http://www.businessweek.com/the_thread/techbeat/archives/2009/10/google_search_g.html"&gt;Google Search Guru Singhal: We Will Try Outlandish Ideas - BusinessWeek&lt;/a&gt;: &lt;p&gt;(via &lt;a href="http://twitter.com/dtunkelang"&gt;@dtunkelang&lt;/a&gt;)&lt;/p&gt;</description><link>http://windowoffice.tumblr.com/post/204227786</link><guid>http://windowoffice.tumblr.com/post/204227786</guid><pubDate>Sun, 04 Oct 2009 07:53:36 -0700</pubDate></item><item><title>On Facebook, Comments, and Implications</title><description>&lt;a href="http://battellemedia.com/archives/005027.php"&gt;On Facebook, Comments, and Implications&lt;/a&gt;: &lt;p&gt;John Battelle on why Facebook comments are in some ways more interesting than Twitter.  It’s all about structure.&lt;/p&gt;</description><link>http://windowoffice.tumblr.com/post/201711305</link><guid>http://windowoffice.tumblr.com/post/201711305</guid><pubDate>Thu, 01 Oct 2009 05:18:37 -0700</pubDate></item><item><title>Ad-hoc retrieval: measurably going nowhere «  IREvalEtAl</title><description>&lt;a href="http://blog.codalism.com/?p=1029"&gt;Ad-hoc retrieval: measurably going nowhere «  IREvalEtAl&lt;/a&gt;: &lt;p&gt;A must-read post over at Will Webber’s blog.  It raises some issues I know several of my readers have been &lt;a href="http://probablyirrelevant.org/2008/10/is-the-science-of-ir-improving/"&gt;pondering for a while&lt;/a&gt;.  
(That is… if I still have any readers.)&lt;/p&gt;</description><link>http://windowoffice.tumblr.com/post/199101792</link><guid>http://windowoffice.tumblr.com/post/199101792</guid><pubDate>Mon, 28 Sep 2009 04:48:27 -0700</pubDate></item><item><title>"computer science is mathematical engineering"</title><description>“computer science is mathematical engineering”&lt;br/&gt;&lt;br/&gt; - &lt;em&gt;&lt;p&gt;&lt;a href="http://blog.codalism.com/?p=938"&gt;Computer science is not real science «  IREvalEtAl&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Interesting post from Will Webber.  Not sure I agree completely: much of what computer scientists do now involves studying user behavior, or other observations of the “natural world,” and describing, modeling, and learning from them in order to improve system performance in some way.&lt;/p&gt;&lt;/em&gt;</description><link>http://windowoffice.tumblr.com/post/178777611</link><guid>http://windowoffice.tumblr.com/post/178777611</guid><pubDate>Thu, 03 Sep 2009 06:28:04 -0700</pubDate></item><item><title>The network connection here at SIGIR is so bad, it might be hard to upload longer posts.</title><description>&lt;p&gt;The network connection here at SIGIR is so bad, it might be hard to upload longer posts.&lt;/p&gt;</description><link>http://windowoffice.tumblr.com/post/145391145</link><guid>http://windowoffice.tumblr.com/post/145391145</guid><pubDate>Mon, 20 Jul 2009 08:11:51 -0700</pubDate></item><item><title>Sue talking about the work we did together while I was at MSR</title><description>&lt;img src="http://4.media.tumblr.com/k6d1kyDYHq4tarovGB3QFtuto1_500.jpg"/&gt;&lt;br/&gt;&lt;br/&gt;&lt;p&gt;Sue talking about the work we did together while I was at MSR&lt;/p&gt;</description><link>http://windowoffice.tumblr.com/post/145356496</link><guid>http://windowoffice.tumblr.com/post/145356496</guid><pubDate>Mon, 20 Jul 2009 07:03:05 -0700</pubDate></item><item><title>Susan Dumais -- Salton Award Talk</title><description>&lt;p&gt;As you might have heard, &lt;a href="http://research.microsoft.com/en-us/um/people/sdumais/"&gt;Sue Dumais&lt;/a&gt; was awarded the &lt;a href="http://www.sigir.org/awards/awards.html#salton"&gt;Salton award&lt;/a&gt; this year.  My notes from her talk:&lt;/p&gt;
&lt;p&gt;&lt;b&gt;An Interdisciplinary Perspective on IR&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;- Awards are not won by the individual, but by the team of colleagues we work with&lt;/p&gt;
&lt;p&gt;- Tag cloud of collaborator names (I’m in there somewhere)&lt;/p&gt;
&lt;p&gt;- Sue’s Salton number is 2 or 3&lt;/p&gt;
&lt;p&gt;- Background in mathematics &amp; psychology, studying vision &amp; perception &amp; developing quantitative models of those&lt;/p&gt;
&lt;p&gt;- After PhD, started in the HCI group at Bell Labs (1979), and has been in industrial research since.  This was the first HCI research group.&lt;/p&gt;
&lt;p&gt;&lt;b&gt;From Verbal Disagreement to LSI&lt;/b&gt;: Mismatch between how people organize information and how they want to retrieve it, e.g. Unix command names “grep” “ls” “tr”&lt;/p&gt;
&lt;p&gt;- Tremendous diversity across users to describe objects or actions&lt;/p&gt;
&lt;p&gt;- “repeat rate” (Zipf) generally 5-20%; “the long tail”&lt;/p&gt;
&lt;p&gt;- need to recognize the fact that there is a long tail in the way people want to refer to an object&lt;/p&gt;
&lt;p&gt;- CHI ’82 paper:  “How can a computer use what people name things to guess what things people mean when they name things?”&lt;/p&gt;
&lt;p&gt;- soon became interested in applying retrieval technologies to this problem, and full text indexing&lt;/p&gt;
&lt;p&gt;- “Rich Aliasing” — multiple names for the same object&lt;/p&gt;
&lt;p&gt;- “Adaptive Indexing” — associate failed queries to destination objects, basically as new fields to the document objects&lt;/p&gt;
&lt;p&gt;- “Latent Semantic Indexing” — model relationships among words, using dimension reduction, esp. useful for short documents&lt;/p&gt;
&lt;p&gt;- Rich aliasing &amp; adaptive indexing are still here today: full text index (rich aliases from the author); anchor text/tags (rich aliases from other users); query-click data (adaptive indexing with implicit measures)&lt;/p&gt;
&lt;p&gt;&lt;b&gt;Common Themes&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;- The last 10-20 years have been amazing for IR; search is everywhere&lt;/p&gt;
&lt;p&gt;- Lots of progress, but some tasks are still really hard.  How can we improve quality of search systems?&lt;/p&gt;
&lt;p&gt;&lt;b&gt;Web search @ age 15&lt;/b&gt;:&lt;/p&gt;
&lt;p&gt;- pages indexed: Lycos, 7/1994: 54,000 pages, only the first few hundred words from each document; now, &gt;10^10 pages?&lt;/p&gt;
&lt;p&gt;- Many types of content&lt;/p&gt;
&lt;p&gt;- how is it accessed?  basically SAME search box over the years; same ranked list (title, summary, url)&lt;/p&gt;
&lt;p&gt;&lt;b&gt;
&lt;p&gt;&lt;b&gt;Support for searchers&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;- spelling, query suggestions, auto-complete, inline answers, rich summaries (deep links)&lt;/p&gt;
&lt;p&gt;- but much more can be done by understanding context&lt;/p&gt;
&lt;p&gt;- great quote from a NYTimes article about Sue getting fired if search still has the same interface in 10 years&lt;/p&gt;
&lt;p&gt;&lt;b&gt;Search and Context&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;- Query context: where do the queries come from?  pay attention to information interactions, past queries&lt;/p&gt;
&lt;p&gt;- Document context: documents aren’t independent from each other&lt;/p&gt;
&lt;p&gt;- Task/Use context: we don’t say “I want to search”; we want to solve a problem.  We need to understand the problem.&lt;/p&gt;
&lt;p&gt;&lt;b&gt;Re-finding on the desktop:&lt;/b&gt; “Stuff I’ve Seen”&lt;/p&gt;
&lt;p&gt;- People don’t use query operators, but do use UI elements to express more sophisticated queries&lt;/p&gt;
&lt;p&gt;- Date is by far the most common document attribute for sorting results, especially for re-finding settings like the desktop&lt;/p&gt;
&lt;p&gt;- This paper was between CHI and SIGIR, with interface, user studies, and ranking algorithms.  interesting reviews&lt;/p&gt;
&lt;p&gt;&lt;b&gt;Re-finding on the Web:&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;- see SIGIR 07 Teevan &amp; Jones — HUGE number of repeat queries &amp; page visits&lt;/p&gt;
&lt;p&gt;- there’s not much work on algorithms for integrating re-finding &amp; re-visitation into ranking&lt;/p&gt;
&lt;p&gt;&lt;b&gt;Personalization &lt;/b&gt;&lt;/p&gt;
&lt;p&gt;- results are typically independent of recent behavior (see PSearch papers, SIGIR 05, SIGIR 07)&lt;/p&gt;
&lt;p&gt;- Works well for some queries, awful for others; when does it work?&lt;/p&gt;&lt;/b&gt;&lt;/p&gt;</description><link>http://windowoffice.tumblr.com/post/145350310</link><guid>http://windowoffice.tumblr.com/post/145350310</guid><pubDate>Mon, 20 Jul 2009 06:50:00 -0700</pubDate></item><item><title>Sue’s award talk</title><description>&lt;img src="http://20.media.tumblr.com/k6d1kyDYHq4qkzwykIgmTVPKo1_500.jpg"/&gt;&lt;br/&gt;&lt;br/&gt;&lt;p&gt;Sue’s award talk&lt;/p&gt;</description><link>http://windowoffice.tumblr.com/post/145323137</link><guid>http://windowoffice.tumblr.com/post/145323137</guid><pubDate>Mon, 20 Jul 2009 05:47:07 -0700</pubDate></item><item><title>Susan Dumais receiving the Salton award.</title><description>&lt;img src="http://4.media.tumblr.com/k6d1kyDYHq4qczufPmK9t32ko1_500.jpg"/&gt;&lt;br/&gt;&lt;br/&gt;&lt;p&gt;Susan Dumais receiving the Salton award.&lt;/p&gt;</description><link>http://windowoffice.tumblr.com/post/145320494</link><guid>http://windowoffice.tumblr.com/post/145320494</guid><pubDate>Mon, 20 Jul 2009 05:40:50 -0700</pubDate></item><item><title>Detexify LaTeX handwritten symbol recognition</title><description>&lt;a href="http://detexify.kirelabs.org/classify.html"&gt;Detexify LaTeX handwritten symbol recognition&lt;/a&gt;: &lt;p&gt;(via &lt;a href="http://blog.codalism.com/?p=610"&gt;Will Webber&lt;/a&gt;)&lt;/p&gt;</description><link>http://windowoffice.tumblr.com/post/142062406</link><guid>http://windowoffice.tumblr.com/post/142062406</guid><pubDate>Wed, 15 Jul 2009 04:40:19 -0700</pubDate></item></channel></rss>
