Wed, 24 January 2007
This show takes a look at trends in the broadband industry and some projections for 2007. References:
Budde.com: http://www.budde.com
JP Morgan: http://www.jpmorgan.com
Nielsen Media Research: http://www.nielsen.com/
In-Stat: http://www.instat.com/
US Bancorp: http://www.usbank.com/
Lehman Brothers: http://www.lehman.com/
Verizon: http://www.verizon.com/
Cox: http://www.cox.com
Comcast: http://www.comcast.com
Time Warner: http://www.timewarner.com
Sun, 7 January 2007
Intro: Right before the 2006 holidays, Jimmy Wales, creator of the online encyclopedia Wikipedia, announced the Search Wikia project. The project will rely on the future site's community of users to shape its search results. In this podcast we take a look at popular search engine technologies and discuss the Search Wikia project concept.
Question: I know this project was really just announced. Before we get into the technology involved, can you tell us what phase the project is in?
Question: What makes this concept fundamentally different from what Google or Yahoo! are doing?
Question: This sounds a lot like Digg. Am I on the right track?
Question: Can you provide a bit more detail on how Google works?
Source: www.google.com
PageRank is Google's system for ranking web pages. A page with a higher PageRank is deemed more important and is more likely to be listed above a page with a lower PageRank. Source: www.google.com
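To make the idea of a link-based importance score concrete, here is a minimal Python sketch of the PageRank calculation as it is described in the published literature. It is not Google's production code. The toy link graph, the `pagerank` function name, the damping factor of 0.85, and the fixed iteration count are all assumptions chosen for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Estimate PageRank by repeated redistribution of rank along links.

    `links` maps each page name to the list of pages it links to.
    All names and parameter values here are illustrative assumptions.
    """
    # Collect every page that appears as a source or as a link target.
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page starts each round with the same small base share.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its damped rank evenly among its outgoing links.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank over all pages.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical four-page web: "home" is linked to by every other page.
toy_web = {
    "home": ["about", "news"],
    "about": ["home"],
    "news": ["home", "about"],
    "orphan": ["home"],
}
ranks = pagerank(toy_web)
ordered = sorted(ranks, key=ranks.get, reverse=True)
```

In this toy graph the page that receives the most incoming links ends up first in `ordered`, and the page that nobody links to ends up last, which matches the description above: a page with a higher PageRank is more likely to be listed above a page with a lower one.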
1. The user accesses a Google web server at google.com and submits a query.
2. The web server sends the query to the index servers. The content inside the index servers is similar to the index in the back of a book: it tells which pages contain the words that match any particular query term.
3. The query travels to the doc servers, which actually retrieve the stored documents. Snippets are generated to describe each search result.
4. The search results are returned to the user in a fraction of a second.
Source: www.google.com
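The index-server and doc-server steps described above can be sketched with a small inverted index in Python. This is only an illustration of the "index in the back of a book" idea, not how Google implements it. The sample documents and the function names `tokenize`, `build_index`, `search`, and `snippet` are hypothetical.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Split text into lowercase alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Map each word to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in tokenize(text):
            index[word].add(doc_id)
    return index


def search(index, query):
    """Return the ids of documents that contain every query term."""
    terms = tokenize(query)
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


def snippet(text, query, width=40):
    """Return a short excerpt around the first query term found in the text.

    Matching here is a plain substring search, which is cruder than the
    word-level matching used by the index.
    """
    lowered = text.lower()
    for term in tokenize(query):
        pos = lowered.find(term)
        if pos != -1:
            start = max(0, pos - width // 2)
            return text[start:start + width]
    return text[:width]


# Hypothetical stored documents, standing in for the doc servers.
docs = {
    "doc1": "Lucene is an open source information retrieval library.",
    "doc2": "Nutch is an open source search engine built on Lucene.",
    "doc3": "Wikipedia is an online encyclopedia.",
}
index = build_index(docs)
hits = search(index, "open source Lucene")
previews = {doc_id: snippet(docs[doc_id], "Lucene") for doc_id in hits}
```

The lookup in `search` never scans the documents themselves. It only intersects the per-word sets in the index, and the stored text is fetched afterward to build the snippets, mirroring the split between index servers and doc servers.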
A couple of projects, Nutch and Lucene, along with some others, can now provide the background infrastructure needed to build a new kind of search engine, one that relies on human intelligence to do what algorithms cannot. Let's take a quick look at these projects.
Lucene: Lucene is a free and open source information retrieval API, originally implemented in Java by Doug Cutting. It is supported by the Apache Software Foundation and is released under the Apache License.
Nutch: We mentioned Nutch earlier. Nutch is a project to develop an open source search engine. It is supported by the Apache Software Foundation and has been a subproject of Lucene since 2005.
References:
Wikipedia creator turns to search: http://news.bbc.co.uk/2/hi/technology/6216619.stm
How Google Works: http://www.googleguide.com/google_works.html
Search Wikia website: http://search.wikia.com
Search Wikia Nutch website: http://search.wikia.com/wiki/Nutch
Lucene website: http://lucene.apache.org/java/docs/
Wikipedia website: http://wikipedia.org/