Related publications (4)
Please note that this is not a complete list of this person’s publications. It includes only semantically relevant works. For a full list, please refer to Infoscience.
As the World Wide Web grows rapidly, it is becoming increasingly challenging to gather representative information about it. Instead of crawling the web exhaustively, one has to resort to other techniques, such as sampling, to determine the properties of the ...
The World Wide Web is one of the most widely used information resources. Understanding the web better will enable us to benefit more from it. In this thesis we develop techniques to learn properties of web pages, such as language and topic, using only the ...
EPFL, 2009
Given only the URL of a web page, can we identify its topic? This is the question that we examine in this paper. Usually, web pages are classified using their content, but a URL-only classifier is preferable (i) when speed is crucial, (ii) to enable conte ...