Thursday, November 6, 2008

Reading Response #9: Gone Fishing

Michael Bergman’s article “The Deep Web: Surfacing Hidden Value” is an important white paper because it addresses both the limitations of current search engines and the structure of information on the Web. Google, today’s most popular search engine, relies on a ranking formula to produce a list of query results that is intended to minimize duplicates and put the most relevant resources first. However, this approach is inherently imperfect because it relies on a system of popularity through web citation (similar to the way the most prominently published scientific journal articles cite each other). As a result, a web page with relevant information might end up further down a list of query results simply because too few other web pages link to it.
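To make the "popularity through citation" idea concrete, here is a minimal sketch of a link-based ranking loop in the spirit of PageRank. It is a simplified illustration, not Google's actual formula: the function name `link_rank`, the damping value, and the tiny link graph (including the page names) are all hypothetical.

```python
def link_rank(links, damping=0.85, iterations=50):
    """Score pages by who links to them (simplified PageRank-style iteration).

    `links` maps each page to the list of pages it links to.
    All names and values here are illustrative assumptions.
    """
    # Collect every page that appears as a source or as a link target.
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}

    for _ in range(iterations):
        # Every page starts each round with a small baseline score.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its score evenly among the pages it cites.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its score over all pages.
                share = damping * rank[page] / n
                for p in pages:
                    new_rank[p] += share
        rank = new_rank
    return rank


# Hypothetical link graph: several pages cite "popular", nobody cites
# "relevant_but_uncited", no matter how useful its content might be.
links = {
    "popular": ["hub"],
    "hub": ["popular"],
    "blog_a": ["popular"],
    "blog_b": ["popular"],
    "relevant_but_uncited": ["popular"],
}

ranks = link_rank(links)
for page, score in sorted(ranks.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{page}: {score:.3f}")
```

In this toy graph the page that nobody links to stays at the baseline score regardless of what it contains, which is the weakness described above.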

A bigger problem with this search method is that it skims over the much larger repository of information available in the Deep Web. Most of the information there is digitally available, but instead of being hosted on a “surface page,” it is stored in searchable databases that produce their pages only in response to a direct query, so a search engine that discovers content by following links never reaches it.

The Deep Web should be a primary concern for several reasons. Currently, a great deal of development is going into more semantic and comprehensive search capabilities. For this work to stay functional and current, it has to adapt to the exponential increase in digital information and to where that information lives, both on surface web pages and in the Deep Web.

Also, the availability of information is one of the most important components of digital network systems because without it, the democratic intention of the web is meaningless. Bergman gives the example of several federal organizations that post their information online but not in a format accessible to commercial search engines; the majority of the information is hidden in the “Deep Web.” Though not intentionally deceptive, this unexplored territory of information could inadvertently become an iron curtain. As the format of information transitions from analog to digital, it is important that at least as much information remain readily available.

2 comments:

Samantha Le Blanc said...

The iron curtain is up, and in some cases I think it is appropriate. Not because it's a national security issue per se, but to protect individuals from identity theft, for example.

spk said...

everything that you say is true. i used to like fishing, then i started to feel bad for the fish; now i couldn't enjoy it if you paid me.