
Frontiers of Information Technology & Electronic Engineering, 2015, Volume 16, Issue 6. doi: 10.1631/FITEE.1400377

A sampling method based on URL clustering for fast web accessibility evaluation

College of Computer Science and Technology, Zhejiang University, Hangzhou 310027, China

Available online: 2016-01-05


Abstract

When evaluating the accessibility of a large website, we rely on sampling methods to reduce the cost of evaluation. Sampling can bias the evaluation when the distribution of checkpoint violations across the website is skewed and the selected samples do not represent the entire website well. To improve sampling quality, stratified sampling methods first cluster the web pages of a site and then draw samples from each cluster. In existing stratified sampling methods, however, every page in the website must be analyzed for clustering, which incurs high I/O and computation costs. To address this issue, we propose a novel page sampling method based on URL clustering for web accessibility evaluation, named URLSamp. Because it uses only URL information for stratified page sampling, URLSamp scales efficiently to large websites. Moreover, by exploiting similarities in URL patterns, URLSamp clusters pages by their generating scripts and can thus effectively detect accessibility problems that originate in web page templates. We validate our method on a data set of 45 websites. Experimental results show that URLSamp is both effective and efficient for web accessibility evaluation.
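
The general idea of stratifying pages by URL pattern can be illustrated with a short sketch. This is not the URLSamp algorithm, whose details the abstract does not give; it is a minimal example under stated assumptions. The pattern rule (mask digit runs in the path and keep only the sorted query parameter names), the function names `url_pattern`, `cluster_urls`, and `stratified_sample`, and the per-cluster sampling fraction are all hypothetical choices made for illustration.

```python
import random
import re
from collections import defaultdict
from urllib.parse import urlsplit, parse_qsl


def url_pattern(url: str) -> str:
    """Reduce a URL to a coarse pattern (an assumed heuristic, not URLSamp's).

    Digit runs in the path are masked and only the sorted query parameter
    names are kept, so pages likely produced by the same script share a key.
    """
    parts = urlsplit(url)
    path = re.sub(r"\d+", "{n}", parts.path)
    keys = sorted({k for k, _ in parse_qsl(parts.query, keep_blank_values=True)})
    return path + ("?" + "&".join(keys) if keys else "")


def cluster_urls(urls):
    """Group URLs by their pattern without fetching any page content."""
    clusters = defaultdict(list)
    for url in urls:
        clusters[url_pattern(url)].append(url)
    return dict(clusters)


def stratified_sample(urls, fraction, seed=0):
    """Draw roughly `fraction` of each cluster, and at least one page per cluster."""
    rng = random.Random(seed)
    sample = []
    for _, members in sorted(cluster_urls(urls).items()):
        k = min(len(members), max(1, round(len(members) * fraction)))
        sample.extend(rng.sample(members, k))
    return sample


if __name__ == "__main__":
    pages = [
        "http://example.com/news/1.html",
        "http://example.com/news/22.html",
        "http://example.com/item.php?id=3",
        "http://example.com/item.php?id=4&page=2",
        "http://example.com/about",
    ]
    for pattern, members in sorted(cluster_urls(pages).items()):
        print(pattern, len(members))
    print(stratified_sample(pages, fraction=0.5))
```

Only the sampled pages would then need to be downloaded and checked, which is where the I/O saving over content-based clustering comes from. Taking at least one page from every cluster is a design choice in this sketch so that small templates are not missed.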
