Warning: pg_query(): Query failed: ERROR: missing chunk number 0 for toast value 29512337 in pg_toast_2619 in /dati/webiit-old/includes/database.pgsql.inc on line 138 Warning: ERROR: missing chunk number 0 for toast value 29512337 in pg_toast_2619 query: SELECT data, created, headers, expire, serialized FROM cache_page WHERE cid = 'https://www-old.iit.cnr.it/node/1038' in /dati/webiit-old/includes/database.pgsql.inc on line 159 Warning: pg_query(): Query failed: ERROR: missing chunk number 0 for toast value 29512337 in pg_toast_2619 in /dati/webiit-old/includes/database.pgsql.inc on line 138 Warning: ERROR: missing chunk number 0 for toast value 29512337 in pg_toast_2619 query: SELECT data, created, headers, expire, serialized FROM cache_page WHERE cid = 'https://www-old.iit.cnr.it/node/1038' in /dati/webiit-old/includes/database.pgsql.inc on line 159 Collaborative Crawling | IIT - CNR - Istituto di Informatica e Telematica
IIT Home Page CNR Home Page

Collaborative Crawling

Web crawler design presents many different challenges: architecture, strategies, performance and more. One of the most important research topics concerns improving the selection of "interesting" Web pages (for the user), according to importance metrics. Another relevant point is content freshness, i.e. maintaining freshness and consistency of temporary stored copies. For this, the crawler periodically repeats its activity going over stored contents (recrawling process). We propose a scheme to permit a crawler to acquire information about the global state of a Website before the crawling process takes place. This scheme requires Web server cooperation in order to collect and publish information on its content, useful for enabling a crawler to tune its visit strategy. If this information is unavailable or not updated the crawler still acts in the usual manner. In this sense the proposed scheme is not invasive and is independent from any crawling strategy and architecture.


1st Latin American Web Congress, Santiago, Chile, 2003

Autori: Marina Buzzi
Autori IIT:

Tipo: Abstract/poster in Atti di convegno internazionale
Area di disciplina: Information Technology and Communication Systems