Fast-Crawler for Harvesting Deep-Web Interfaces
Keywords: Deep web; two-stage crawler; feature selection; ranking; adaptive learning; TF-IDF
Due to the large volume of the internet and the dynamic nature of the deep web, achieving both wide coverage and high
efficiency is a difficult task. We therefore propose a two-stage framework, Fast Crawler, for efficiently harvesting deep-web interfaces.
In the first stage, Fast Crawler performs site-based searching for center pages with the help of search engines, which
avoids visiting a large number of pages. To obtain more accurate results than a focused crawl, Fast Crawler
ranks websites to prioritize highly relevant ones for a given topic. In the second stage, Fast Crawler achieves fast in-site searching by excavating the most relevant links with adaptive link ranking. This avoids visiting a large number of
irrelevant pages, so good results can be achieved.
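The site ranking in the first stage can be illustrated with a small sketch. The following is not the paper's actual implementation, only a minimal TF-IDF scorer over hypothetical inputs: it assumes each candidate site is represented by a list of tokens extracted from its homepage, and ranks sites by the summed TF-IDF weight of the topic terms.

```python
import math
from collections import Counter

def tfidf_rank(sites, topic_terms):
    """Rank candidate sites by summed TF-IDF weight of topic terms.

    `sites` maps a site URL to a list of homepage tokens
    (a hypothetical input format, assumed for illustration).
    """
    n = len(sites)
    # Document frequency: in how many sites each topic term appears.
    df = {t: sum(1 for toks in sites.values() if t in toks)
          for t in topic_terms}
    scores = {}
    for url, toks in sites.items():
        counts = Counter(toks)
        score = 0.0
        for t in topic_terms:
            if df[t] == 0 or not toks:
                continue
            tf = counts[t] / len(toks)              # term frequency
            idf = math.log((1 + n) / (1 + df[t])) + 1  # smoothed IDF
            score += tf * idf
        scores[url] = score
    # Highest-scoring (most topic-relevant) sites first.
    return sorted(scores, key=scores.get, reverse=True)

sites = {
    "siteA.example": "flight booking airline ticket flight".split(),
    "siteB.example": "recipe cooking food kitchen".split(),
}
print(tfidf_rank(sites, ["flight", "airline"]))
# "siteA.example" ranks first for the flight topic
```

A crawler would then pop sites from the top of this ranking, keeping the crawl budget focused on the most topic-relevant sites first.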