1) David Hawking, Web Search Engines:
I could not access the article.
2) Shreeves, S. L., Habing, T. O., Hagedorn, K., & Young, J. A. (2005). Current developments and future trends for the OAI protocol for metadata harvesting. Library Trends, 53(4), 576-589.
- Current Developments and Future Trends for the OAI Protocol for Metadata Harvesting
- The Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) has been widely adopted since its initial release in 2001.
- This article provides a brief overview of the OAI environment, two years out from the release of the production version of the protocol.
- It looks into some of the interesting developments within the OAI world, particularly the use of the protocol within specific communities of interest, the development of a comprehensive registry of OAI data providers, and a resolver for OAI identifiers that extends the protocol beyond its traditional use.
- It documents some of the current challenges for both data providers and service providers.
- It also outlines some possible future directions for the OAI protocol and community.
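To make the data provider / service provider split a bit more concrete for myself, here is a minimal sketch of what a harvesting request and response might look like. The verb `ListIdentifiers`, the `metadataPrefix` value `oai_dc`, and the `resumptionToken` element come from the OAI-PMH specification, not from the article; the base URL `http://example.org/oai`, the helper names, and the sample response are my own assumptions for illustration, and nothing here contacts a real repository.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Namespace used by OAI-PMH 2.0 responses.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def build_oai_request(base_url, verb, params=None):
    """Build an OAI-PMH request URL (hypothetical helper, not a library call)."""
    query = {"verb": verb}
    query.update(params or {})
    return base_url + "?" + urlencode(query)


def parse_identifiers(xml_text):
    """Return (record identifiers, resumption token or None) from a response."""
    root = ET.fromstring(xml_text.strip())
    ids = [el.text for el in root.findall(".//oai:identifier", OAI_NS)]
    token_el = root.find(".//oai:resumptionToken", OAI_NS)
    token = token_el.text if token_el is not None and token_el.text else None
    return ids, token


# Hand-written sample response (an assumption, heavily trimmed).
SAMPLE_RESPONSE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListIdentifiers>
    <header><identifier>oai:example.org:1</identifier></header>
    <header><identifier>oai:example.org:2</identifier></header>
    <resumptionToken>token-abc</resumptionToken>
  </ListIdentifiers>
</OAI-PMH>
"""

url = build_oai_request(
    "http://example.org/oai",  # hypothetical data provider
    "ListIdentifiers",
    {"metadataPrefix": "oai_dc"},
)
identifiers, token = parse_identifiers(SAMPLE_RESPONSE)
print(url)
print(identifiers, token)
```

A service provider would keep repeating the request with the returned resumption token until no token comes back, which is how a harvester pages through a large repository.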
3) Michael K. Bergman, “The Deep Web: Surfacing Hidden Value” http://www.press.umich.edu/jep/07-01/bergman.html
- According to Michael K. Bergman:
- While a great deal may be caught in the net, there is still a wealth of information that is deep and, therefore, missed.
- Traditional search engines create their indices by spidering or crawling surface Web pages.
- Traditional search engines cannot "see" or retrieve content in the deep Web.
- Deep Web sources store their content in searchable databases that only produce results dynamically in response to a direct request. But a direct query is a "one at a time" laborious way to search. BrightPlanet's search technology automates the process of making dozens of direct queries simultaneously using multiple-thread technology and thus is the only search technology, so far, that is capable of identifying, retrieving, qualifying, classifying, and organizing both "deep" and "surface" content.
- Search engines obtain their listings in two ways: Authors may submit their own Web pages, or the search engines "crawl" or "spider" documents by following one hypertext link to another.
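The crawling idea in the points above can be sketched in a few lines, which also shows why database-backed pages get missed. This is only a toy under my own assumptions: the "web" is an in-memory dictionary named `TOY_WEB`, the page paths are made up, and the `LinkExtractor` and `crawl` names are hypothetical. It is not how any real search engine or BrightPlanet's technology is implemented.

```python
from collections import deque
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collect the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


# Toy web (an assumption): the results page exists only as the answer
# to a form query, so no hyperlink anywhere points to it.
TOY_WEB = {
    "/home": '<a href="/about">About</a> <a href="/search">Search</a>',
    "/about": '<a href="/home">Home</a>',
    "/search": '<form action="/results"><input name="q"></form>',
    "/results?q=oai": "<p>Record generated from a database query</p>",
}


def crawl(start, pages):
    """Breadth-first crawl that follows hyperlinks only."""
    seen = {start}
    queue = deque([start])
    while queue:
        url = queue.popleft()
        parser = LinkExtractor()
        parser.feed(pages.get(url, ""))
        for link in parser.links:
            if link in pages and link not in seen:
                seen.add(link)
                queue.append(link)
    return seen


reached = crawl("/home", TOY_WEB)
missed = set(TOY_WEB) - reached
print(sorted(reached))
print(sorted(missed))
```

The crawler reaches the search form but never the results page, because that page only appears in response to a direct query. That is the "deep" content Bergman describes.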
Friday, November 6, 2009
2 comments:
Before reading this article, I was unaware of the vast amount of hidden information on the web! I'm curious to know how much of that information is accessible to the public.
It is really scary to know how much is out there. In class Dr. Tomer said that Pitt has kept every single email message ever sent out via a Pitt email account. That drives me crazy, although I am very happy that I never use that account. It's not that there is anything to hide; it is just weird to know that it is all out there.