
OPAL: a passe-partout for web forms

Xiaonan Guo, Jochen Kranzdorf, Tim Furche, Giovanni Grasso, Giorgio Orsi and Christian Schallhart

Abstract

Web forms are the interfaces of the deep web. Though modern web browsers provide facilities to assist in form filling, this assistance is limited to prior form fillings or keyword matching. Automatic form understanding enables a broad range of applications, including crawlers, meta-search engines, and usability and accessibility support for enhanced web browsing. In this demonstration, we use a novel form understanding approach, OPAL, to assist in form filling even for complex, previously unknown forms. OPAL associates form labels with fields by analyzing structural properties in the HTML encoding and visual features of the page rendering. OPAL then interprets this labeling and classifies the fields according to a given domain ontology. The combination of these two steps allows OPAL to deal effectively with many forms beyond the grasp of existing form filling techniques. In the UK real estate domain, OPAL achieves more than 99 percent accuracy in form understanding.
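
The two steps named in the abstract (associating labels with fields, then classifying fields against a domain ontology) can be illustrated with a toy sketch. This is not OPAL's algorithm: the `Box` type, the nearest-label-to-the-left-or-above heuristic, and the keyword-based `ONTOLOGY` table are all hypothetical simplifications chosen only to make the idea concrete.

```python
from dataclasses import dataclass
from math import hypot


@dataclass(frozen=True)
class Box:
    """A rendered element: label text or field name, plus its top-left position."""
    name: str
    x: float
    y: float


def associate_labels(labels, fields):
    """Map each field name to the nearest label lying to its left or above it.

    Assumed visual heuristic for illustration; fields with no such label are skipped.
    """
    result = {}
    for field in fields:
        candidates = [l for l in labels if l.x <= field.x and l.y <= field.y]
        if candidates:
            nearest = min(
                candidates,
                key=lambda l: hypot(field.x - l.x, field.y - l.y),
            )
            result[field.name] = nearest.name
    return result


# Hypothetical toy "ontology": keyword -> domain concept (real estate flavoured).
ONTOLOGY = {"price": "PRICE", "bedroom": "BEDROOMS", "location": "LOCATION"}


def classify(label_text):
    """Return the first concept whose keyword occurs in the label text, else None."""
    text = label_text.lower()
    for keyword, concept in ONTOLOGY.items():
        if keyword in text:
            return concept
    return None


labels = [Box("Min price", 0, 0), Box("Bedrooms", 0, 30)]
fields = [Box("f1", 100, 0), Box("f2", 100, 30)]
labeling = associate_labels(labels, fields)
concepts = {field: classify(label) for field, label in labeling.items()}
```

Here `labeling` pairs each field with a label and `concepts` assigns each field a domain concept; a real system would draw on far richer structural and visual evidence than a single distance measure.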

Book Title
Proc. of the 21st World Wide Web Conf. (WWW Companion Volume)
Note
Demonstration
Pages
353–356
Year
2012