TagSoup

TagSoup is a SAX-compliant parser written in Java that, instead of parsing well-formed or valid XML, parses HTML as it is found in the wild: poor, nasty and brutish, though quite often far from short. TagSoup is designed for people who have to process this stuff using some semblance of a rational application design. By providing a SAX interface, it allows standard XML tools to be applied to even the worst HTML. TagSoup also includes a command-line processor that reads HTML files and can generate either clean HTML or well-formed XML that is a close approximation to XHTML.
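Because TagSoup exposes a SAX interface, code written against the standard org.xml.sax.XMLReader API should work with it unchanged. The following is a minimal sketch rather than an example from the TagSoup documentation: the class name SaxElementCounter and the method countElements are made up for illustration, and the demo in main uses the JDK's built-in XML parser so that it runs without extra dependencies. With the TagSoup jar on the classpath, the reader could instead be created as new org.ccil.cowan.tagsoup.Parser(), which would allow the same handler to receive events from malformed HTML.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, named here only for illustration.
public class SaxElementCounter {

    // Counts how often each element name is reported by the given SAX reader.
    // Any XMLReader implementation can be passed in.
    static Map<String, Integer> countElements(XMLReader reader, String markup)
            throws Exception {
        Map<String, Integer> counts = new LinkedHashMap<>();
        reader.setContentHandler(new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                counts.merge(qName, 1, Integer::sum);
            }
        });
        reader.parse(new InputSource(new StringReader(markup)));
        return counts;
    }

    public static void main(String[] args) throws Exception {
        // JDK parser: only accepts well-formed input.
        XMLReader reader =
                SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        // Assumption: with TagSoup on the classpath, swap in its reader instead:
        // XMLReader reader = new org.ccil.cowan.tagsoup.Parser();
        String markup = "<html><body><p>one</p><p>two</p></body></html>";
        System.out.println(countElements(reader, markup));
    }
}
```

The handler never refers to a concrete parser class, so the choice of parser stays confined to the single line that creates the XMLReader.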

homepage: home.ccil.org/~cowan/XML/tagsoup
last release: 5 years ago, first release: 5 years ago
packaging: jar
get this artifact from: clojars




Chart: how often this artifact is used as a dependency by other Maven artifacts in the Central repository and on GitHub.



© Jiri Pinkas 2015 - 2016.
Apache and Apache Maven are trademarks of the Apache Software Foundation. The Central Repository is a service mark of Sonatype, Inc.