Thursday, November 13, 2008

Reading Response #10: Harvest Time

Of the readings assigned this week, I found “Current Developments and Future Trends for the OAI Protocol for Metadata Harvesting” to be the most interesting as it peripherally touched upon the relationship between metadata harvesting and “The Deep Web.”

Typically, harvesting culls information from Web pages using the metadata tags embedded in the HTML code. Initially, there was some dispute as to whether the metadata should be generated by software or by hand, the concern being that people would create too many disparate terms while a purely software-based procedure would miss important semantic relationships. With Dublin Core rapidly becoming a standard metadata schema, Web pages are increasingly described in a common format. But does this improvement in search methods extend to the “Deep Web”?
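
To make the idea of harvesting from embedded tags concrete, here is a minimal sketch in Python. It assumes a page that follows the common convention of putting Dublin Core elements in `<meta name="DC.…">` tags; the sample page, the class name `DublinCoreParser`, and the function `extract_dublin_core` are all made up for illustration and are not taken from the reading.

```python
from html.parser import HTMLParser


class DublinCoreParser(HTMLParser):
    """Collect <meta name="DC.*" content="..."> tags from an HTML page."""

    def __init__(self):
        super().__init__()
        # Maps a Dublin Core element (e.g. "title") to a list of values,
        # since elements such as "subject" may repeat.
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        name = attr.get("name") or ""
        content = attr.get("content")
        # Keep only tags whose name starts with the "DC." prefix.
        if name.lower().startswith("dc.") and content is not None:
            self.fields.setdefault(name[3:].lower(), []).append(content)


def extract_dublin_core(html):
    """Return the Dublin Core metadata found in an HTML string."""
    parser = DublinCoreParser()
    parser.feed(html)
    return parser.fields


# Hypothetical page used only for this example.
SAMPLE_PAGE = """
<html><head>
<meta name="DC.title" content="Harvest Time">
<meta name="DC.creator" content="Example Author">
<meta name="DC.subject" content="metadata">
<meta name="DC.subject" content="harvesting">
<meta name="viewport" content="width=device-width">
</head><body></body></html>
"""

print(extract_dublin_core(SAMPLE_PAGE))
```

The non-Dublin Core `viewport` tag is ignored, which is roughly why a shared schema helps: a harvester only needs to know one set of element names.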

The “Deep Web” includes information that is available to the public but resides outside the scope of traditional search engines because the data is stored in proprietary databases that are only accessible by direct queries. However, the OAI Protocol allows search engines without normal access to this information to index content held on the “Deep Web” through OAI repositories, which expose their metadata records for harvesting. As a significant portion of digital information resides on the “Deep Web,” this new avenue of access is important because it helps to promote openness and transparency regarding public information.
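
As a rough illustration of how a harvester talks to an OAI repository, the sketch below builds an OAI-PMH `ListRecords` request and pulls titles out of a response. The `verb` and `metadataPrefix` parameters and the XML namespaces come from the OAI-PMH specification; the repository address `https://repository.example.org/oai`, the sample response, and the helper names are assumptions invented for this example. No network request is made.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Namespaces defined by OAI-PMH 2.0 and Dublin Core.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def list_records_url(base_url, metadata_prefix="oai_dc"):
    """Build a ListRecords request URL for a repository's base URL."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return base_url + "?" + query


def titles_from_response(xml_text):
    """Return every dc:title found in the records of a ListRecords response."""
    root = ET.fromstring(xml_text)
    titles = []
    for record in root.iterfind(".//oai:record", NS):
        for title in record.iterfind(".//dc:title", NS):
            titles.append(title.text)
    return titles


# Hypothetical, heavily trimmed response used only for this example.
SAMPLE_RESPONSE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Sample record from a database-backed collection</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>
""".strip()

print(list_records_url("https://repository.example.org/oai"))
print(titles_from_response(SAMPLE_RESPONSE))
```

The point of the sketch is that the harvester never queries the underlying database directly; it only asks the repository for metadata records in an agreed format.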
