J4

• Computer Science •

Solve the cycle links problem in Internet crawling by directed graph

HE Feng-ling, ZUO Wan-Li   

  1. College of Computer Science and Technology, Jilin University, Changchun 130012, China
  • Received:2003-12-17 Revised:1900-01-01 Online:2004-07-26 Published:2004-07-26
  • Contact: HE Feng-ling

Abstract: This paper addresses the problem of cycle links in Internet crawling using a directed graph. First, the problem is stated. Then, a formal definition of cycle links in Internet crawling is given. Finally, an algorithm that solves the problem using a directed graph is presented. The key problem for a crawler is how to find directed loops effectively among the web pages it has crawled. The algorithm described in this paper keeps the crawler from falling into the trap created by cycle links.
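
As a rough illustration of what finding directed loops among crawled pages can look like, the Python sketch below models pages as nodes and hyperlinks as directed edges, then searches for a cycle with an iterative depth-first traversal. It is not the algorithm from the paper, whose details the abstract does not give. The function name `find_cycle`, the dictionary representation of the link graph, and the example page names are all assumptions made for illustration.

```python
from typing import Dict, List, Optional

# Node states for the depth-first search (illustrative names).
WHITE, GRAY, BLACK = 0, 1, 2  # unvisited, on the current path, fully explored


def find_cycle(links: Dict[str, List[str]]) -> Optional[List[str]]:
    """Return one directed cycle as a list of pages (first == last), or None.

    `links` maps each crawled page to the pages it links to. This layout is
    an assumption for the sketch, not something taken from the paper.
    """
    color: Dict[str, int] = {}
    for start in links:
        if color.get(start, WHITE) != WHITE:
            continue
        color[start] = GRAY
        path = [start]  # pages on the current DFS path
        stack = [(start, iter(links.get(start, [])))]
        while stack:
            node, children = stack[-1]
            for child in children:
                state = color.get(child, WHITE)
                if state == GRAY:
                    # Back edge: the child is already on the current path.
                    return path[path.index(child):] + [child]
                if state == WHITE:
                    color[child] = GRAY
                    path.append(child)
                    stack.append((child, iter(links.get(child, []))))
                    break
            else:
                # All outgoing links of `node` explored without a loop.
                color[node] = BLACK
                stack.pop()
                path.pop()
    return None


if __name__ == "__main__":
    # Hypothetical link graph: page_a -> page_b -> page_c -> page_a.
    web = {"page_a": ["page_b"], "page_b": ["page_c"], "page_c": ["page_a"]}
    print(find_cycle(web))
```

A crawler built along these lines could run such a check on the graph of pages fetched so far and decline to follow any link that would close a loop. A simpler and more common safeguard is a set of already-visited URLs, which avoids refetching without identifying the loops themselves.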

Key words: crawler, internet search engine, hyperlink, directed graph

CLC Number: 

  • TP393.09