<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: How Search Really Works: Relevance (2) &#8211; Vector Space</title>
	<atom:link href="http://www.searchenginepeople.com/blog/how-search-really-works-relevance-2-vector-space.html/feed" rel="self" type="application/rss+xml" />
	<link>http://www.searchenginepeople.com/blog/how-search-really-works-relevance-2-vector-space.html</link>
	<description>Canada's Search and Social Media Authority</description>
	<lastBuildDate>Mon, 13 Feb 2012 17:01:05 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.2.1</generator>
	<item>
		<title>By: SEOs and their IDF Myths: Part 2 &#171; IR Thoughts</title>
		<link>http://www.searchenginepeople.com/blog/how-search-really-works-relevance-2-vector-space.html#comment-2370</link>
		<dc:creator>SEOs and their IDF Myths: Part 2 &#171; IR Thoughts</dc:creator>
		<pubDate>Thu, 03 Jul 2008 13:24:25 +0000</pubDate>
		<guid isPermaLink="false">http://www.searchenginepeople.com/blog/how-search-really-works-relevance-2-vector-space.html#comment-2370</guid>
		<description>[...] Others have claimed that it is not possible to evaluate the IDF of a phrase. Even some that plan to teach IR have claimed that calling log(N/n) &#8220;inverse document frequency&#8221; is an &#8220;insult to students&#8221;. Before making a fool of themselves they should read Robertson and Sparck Jones legacy papers on the topic. [...]</description>
		<content:encoded><![CDATA[<p>[...] Others have claimed that it is not possible to evaluate the IDF of a phrase. Even some that plan to teach IR have claimed that calling log(N/n) &#034;inverse document frequency&#034; is an &#034;insult to students&#034;. Before making a fool of themselves they should read Robertson and Sparck Jones legacy papers on the topic. [...]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Vector Space Models and Search Engines &#171; IR Thoughts</title>
		<link>http://www.searchenginepeople.com/blog/how-search-really-works-relevance-2-vector-space.html#comment-1473</link>
		<dc:creator>Vector Space Models and Search Engines &#171; IR Thoughts</dc:creator>
		<pubDate>Mon, 21 Apr 2008 15:36:31 +0000</pubDate>
		<guid isPermaLink="false">http://www.searchenginepeople.com/blog/how-search-really-works-relevance-2-vector-space.html#comment-1473</guid>
		<description>[...] That said, today&#8217;s post is in reaction to the article at http://www.searchenginepeople.com/blog/how-search-really-works-relevance-2-vector-space.html [...]</description>
		<content:encoded><![CDATA[<p>[...] That said, today&#039;s post is in reaction to the article at <a href="http://www.searchenginepeople.com/blog/how-search-really-works-relevance-2-vector-space.html">http://www.searchenginepeople.com/blog/how-search-really-works-relevance-2-vector-space.html</a> [...]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: How Search Engines Do Not Work &#171; IR Thoughts</title>
		<link>http://www.searchenginepeople.com/blog/how-search-really-works-relevance-2-vector-space.html#comment-1474</link>
		<dc:creator>How Search Engines Do Not Work &#171; IR Thoughts</dc:creator>
		<pubDate>Thu, 17 Apr 2008 12:40:37 +0000</pubDate>
		<guid isPermaLink="false">http://www.searchenginepeople.com/blog/how-search-really-works-relevance-2-vector-space.html#comment-1474</guid>
		<description>[...] 1. http://www.searchenginepeople.com/blog/how-search-really-works-relevance-2-vector-space.html [...]</description>
		<content:encoded><![CDATA[<p>[...] 1. <a href="http://www.searchenginepeople.com/blog/how-search-really-works-relevance-2-vector-space.html">http://www.searchenginepeople.com/blog/how-search-really-works-relevance-2-vector-space.html</a> [...]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Malte Landwehr</title>
		<link>http://www.searchenginepeople.com/blog/how-search-really-works-relevance-2-vector-space.html#comment-1469</link>
		<dc:creator>Malte Landwehr</dc:creator>
		<pubDate>Wed, 16 Apr 2008 21:06:29 +0000</pubDate>
		<guid isPermaLink="false">http://www.searchenginepeople.com/blog/how-search-really-works-relevance-2-vector-space.html#comment-1469</guid>
		<description>An excellent analysis of how to weight terms by their frequency. But I doubt that a two-dimensional space is enough to represent the complexity needed to maintain an index of millions of documents.</description>
		<content:encoded><![CDATA[<p>An excellent analysis of how to weight terms by their frequency. But I doubt that a two-dimensional space is enough to represent the complexity needed to maintain an index of millions of documents.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Dev Basu</title>
		<link>http://www.searchenginepeople.com/blog/how-search-really-works-relevance-2-vector-space.html#comment-1470</link>
		<dc:creator>Dev Basu</dc:creator>
		<pubDate>Mon, 14 Apr 2008 21:22:14 +0000</pubDate>
		<guid isPermaLink="false">http://www.searchenginepeople.com/blog/how-search-really-works-relevance-2-vector-space.html#comment-1470</guid>
		<description>As usual Ruud this is a great post. It&#039;s always interesting to learn the inner workings of an SE :)</description>
		<content:encoded><![CDATA[<p>As usual Ruud this is a great post. It&#039;s always interesting to learn the inner workings of an SE <img src='http://www.searchenginepeople.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Ruud Hein</title>
		<link>http://www.searchenginepeople.com/blog/how-search-really-works-relevance-2-vector-space.html#comment-1471</link>
		<dc:creator>Ruud Hein</dc:creator>
		<pubDate>Sat, 12 Apr 2008 03:20:14 +0000</pubDate>
		<guid isPermaLink="false">http://www.searchenginepeople.com/blog/how-search-really-works-relevance-2-vector-space.html#comment-1471</guid>
		<description>Good indeed to point that out. Doing any of this at run time is extremely costly. There are cost-reducing procedures, such as working with the top N documents or leader/follower samples.

Yet I too think that this isn&#039;t used at run time (read: query time) because the TFxIDF vector space model is geared towards words. The IDF of a word is computed, not of phrases. All in all it doesn&#039;t deliver enough bang for its buck.

Worse: it&#039;s typically a model for a clean index. Boosting TF for a high IDF word is too easy when you have search access to the whole collection.

It&#039;s interesting though to see how this model can find related documents.</description>
		<content:encoded><![CDATA[<p>Good indeed to point that out. Doing any of this at run time is extremely costly. There are cost-reducing procedures, such as working with the top N documents or leader/follower samples.</p>
<p>Yet I too think that this isn&#039;t used at run time (read: query time) because the TFxIDF vector space model is geared towards words. The IDF of a word is computed, not of phrases. All in all it doesn&#039;t deliver enough bang for its buck.</p>
<p>Worse: it&#039;s typically a model for a clean index. Boosting TF for a high IDF word is too easy when you have search access to the whole collection.</p>
<p>It&#039;s interesting though to see how this model can find related documents.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Hamlet Batista</title>
		<link>http://www.searchenginepeople.com/blog/how-search-really-works-relevance-2-vector-space.html#comment-1472</link>
		<dc:creator>Hamlet Batista</dc:creator>
		<pubDate>Fri, 11 Apr 2008 22:08:39 +0000</pubDate>
		<guid isPermaLink="false">http://www.searchenginepeople.com/blog/how-search-really-works-relevance-2-vector-space.html#comment-1472</guid>
		<description>Hi Ruud,

Excellent post as usual. It is important to mention that the vector space model for ranking is not currently practical for the top search engines due to the size of their index (and the corresponding size of the document vectors). While they use huge matrices for computing the importance of the links (PageRank), the process is done offline and is query-independent. Computing such vectors at query time would be prohibitively expensive in time and resources.

Cheers</description>
		<content:encoded><![CDATA[<p>Hi Ruud,</p>
<p>Excellent post as usual. It is important to mention that the vector space model for ranking is not currently practical for the top search engines due to the size of their index (and the corresponding size of the document vectors). While they use huge matrices for computing the importance of the links (PageRank), the process is done offline and is query-independent. Computing such vectors at query time would be prohibitively expensive in time and resources.</p>
<p>Cheers</p>
]]></content:encoded>
	</item>
</channel>
</rss>

