Analyzing Web Structure Mining Techniques for Classification of Web Data
Interest in the World Wide Web has grown dramatically over the last decade, and retrieving the right information from the Web in a short time has become very important. Requested documents must be found at an acceptable speed, and information must be collected by efficient methods. When we search the Web with a search engine, we find nearly everything we are looking for. Moreover, the speed at which the documents are found is reasonable, especially considering that Google indexes more than 8 billion web pages.
In this paper, we present a progressive review of the link-structure based clustering approaches that have been introduced to make this possible, and we study these techniques comparatively. These clustering algorithms (such as Companion and Cocitation) identify related web pages quickly and with high precision by using the links on the pages and the order in which those links appear. The purpose of this paper is to consider some of the most challenging aspects of link-structure based clustering and the means of improving them.
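As a rough illustration of the cocitation idea mentioned above, the following minimal sketch ranks pages by how many common parent pages link to both them and a query page. The function name `cocitation_related`, the `links` dictionary, and the toy page names are assumptions made for this example. It does not reproduce the exact published algorithms and ignores the link-order information that they also use.

```python
from collections import Counter


def cocitation_related(links, query, top_k=3):
    """Rank pages by the number of parents they share with `query`.

    links: hypothetical dict mapping a page to the ordered list of pages
           it links to (a stand-in for a real crawled link graph).
    """
    counts = Counter()
    for parent, children in links.items():
        if query in children:
            # Every other page linked from the same parent is cocited with query.
            for sibling in set(children):
                if sibling != query:
                    counts[sibling] += 1
    # Highest cocitation count first; ties broken by name for determinism.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:top_k]


# Toy link graph (illustrative data only).
links = {
    "p1": ["a", "b", "c"],
    "p2": ["a", "b"],
    "p3": ["b", "d"],
    "p4": ["a", "c"],
}
related = cocitation_related(links, "a")
print(related)
```

In this toy graph, pages "b" and "c" each share two parents with "a", so both are returned as related, while "d" shares none and is omitted.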