English translation

Views: 11 | Replies: 3 | 2008-11-21 15:01:52
Future Work
A large-scale web search engine is a complex system and much remains to be done. Our immediate goals are to improve search efficiency and to scale to approximately 100 million web pages. Some simple improvements to efficiency include query caching, smart disk allocation, and subindices. Another area which requires much research is updates. We must have smart algorithms to decide what old web pages should be recrawled and what new ones should be crawled. Work toward this goal has been done in [Cho 98]. One promising area of research is using proxy caches to build search databases, since they are demand driven. We are planning to add simple features supported by commercial search engines like Boolean operators, negation, and stemming. However, other features are just starting to be explored, such as relevance feedback and clustering (Google currently supports a simple hostname-based clustering). We also plan to support user context (like the user’s location) and result summarization. We are also working to extend the use of link structure and link text. Simple experiments indicate PageRank can be personalized by increasing the weight of a user’s home page or bookmarks. As for link text, we are experimenting with using text surrounding links in addition to the link text itself. A Web search engine is a very rich environment for research ideas. We have far too many to list here, so we do not expect this Future Work section to become much shorter in the near future.
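The remark about personalizing PageRank can be made concrete with a small sketch. This is not the authors' implementation: the function name, the toy graph, the damping value of 0.85, and the handling of pages without outgoing links are all assumptions chosen for illustration. The idea shown is only that the random jump is biased toward pages the user cares about (for example a home page or bookmarks) instead of being uniform.

```python
def personalized_pagerank(links, weights, damping=0.85, iterations=50):
    """Power-iteration PageRank with a non-uniform jump distribution.

    links   : dict mapping each page to the list of pages it links to
    weights : dict mapping pages to non-negative personalization weights
              (hypothetical input; pages not listed get weight 0)
    """
    pages = sorted(set(links) | {t for ts in links.values() for t in ts})
    total = sum(weights.get(p, 0.0) for p in pages)
    if total <= 0:
        raise ValueError("at least one page needs a positive weight")
    # Normalized jump distribution: where the random surfer restarts.
    jump = {p: weights.get(p, 0.0) / total for p in pages}

    rank = dict(jump)
    for _ in range(iterations):
        new_rank = {p: (1.0 - damping) * jump[p] for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # Split this page's rank evenly among its out-links.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Assumption: a page with no out-links hands its rank
                # back according to the jump distribution.
                for p in pages:
                    new_rank[p] += damping * rank[page] * jump[p]
        rank = new_rank
    return rank


# Toy graph (hypothetical page names).
graph = {"home": ["a"], "a": ["b"], "b": ["a"], "c": ["a"]}
uniform = personalized_pagerank(graph, {p: 1.0 for p in ["home", "a", "b", "c"]})
personal = personalized_pagerank(graph, {"home": 1.0, "a": 1.0, "b": 1.0, "c": 10.0})
```

In this toy graph nothing links to page "c", so its score comes entirely from the jump term; raising its weight raises its rank relative to the uniform case, which is the effect the paragraph describes.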
High Quality Search
The biggest problem facing users of web search engines today is the quality of the results they get back. While the results are often amusing and expand users’ horizons, they are often frustrating and consume precious time. For example, the top result for a search for “Bill Clinton” on one of the most popular commercial search engines was the Bill Clinton Joke of the Day: April 14, 1997. Google is designed to provide higher quality search so that, as the Web continues to grow rapidly, information can be found easily. In order to accomplish this, Google makes heavy use of hypertextual information consisting of link structure and link (anchor) text. Google also uses proximity and font information. While evaluation of a search engine is difficult, we have subjectively found that Google returns higher quality search results than current commercial search engines. The analysis of link structure via PageRank allows Google to evaluate the quality of web pages. The use of link text as a description of what the link points to helps the search engine return relevant (and to some degree high quality) results. Finally, the use of proximity information helps increase relevance a great deal for many queries.
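The point about link text can be illustrated with a minimal sketch. It assumes a simplified model in which each link is a (source, target, anchor text) triple and the anchor words are indexed under the page the link points to, not the page that contains it. The function names and the example.org URLs are hypothetical, and this is not how Google's index is actually built.

```python
from collections import defaultdict


def build_anchor_index(links):
    """Map each anchor-text word to the set of pages that links point to."""
    index = defaultdict(set)
    for _source, target, anchor_text in links:
        for word in anchor_text.lower().split():
            # Credit the word to the target page, not the source page.
            index[word].add(target)
    return index


def search(index, query):
    """Return pages whose incoming anchor text contains every query word."""
    words = query.lower().split()
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results


# Hypothetical link data.
links = [
    ("http://example.org/news", "http://example.org/whitehouse", "President Bill Clinton"),
    ("http://example.org/blog", "http://example.org/whitehouse", "Bill Clinton official page"),
    ("http://example.org/fun", "http://example.org/jokes", "Clinton joke of the day"),
]
anchor_index = build_anchor_index(links)
hits = search(anchor_index, "bill clinton")
```

Here a page can be found by words that never appear on it, because other pages describe it in their link text.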
Before answering, please read your translation through yourself first and make sure it actually reads smoothly, OK?

千问 | 2008-11-21 15:01:52
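Future Work. A large-scale web search engine is a complex system, and there is still much work to do. Our immediate goal is to improve search efficiency and to scale to about 100 million web pages. Some simple efficiency improvements include query caching, smart disk allocation, and subindices. Another area that needs a great deal of research is updates. We must have smart algorithms to decide which old web pages should be recrawled and which new ones should be crawled. Work toward this goal has been done in [Cho 98]. One promising area of research is using proxy caches to build search databases, since they are demand driven. We are planning to add simple features supported by commercial search engines, such as Boolean operators, negation, and stemming. However, other features are only just beginning to be explored, such as relevance feedback and clustering (Google currently supports a simple hostname-based clustering). We also plan to support user context (such as the user's location) and result summarization. We are also working to extend the use of link...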

千问 | 2008-11-21 15:01:52
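Future Work. A large-scale web search engine is a complex system, and much remains to be done. Our immediate goals are to improve search efficiency and to scale to approximately 100 million web pages. Some simple improvements to efficiency include query caching, smart disk allocation, and subindices. Another area that requires research is updates. We must have smart algorithms to decide which old web pages should be recrawled and which new ones should be crawled. Work toward this goal has been done in [Cho 98]. One promising area of research is using proxy...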

千问 | 2008-11-21 15:01:52
Add me on QQ: 88886666 and I'll send it to you...