TagSoup is a library for parsing HTML/XML.
It supports the HTML 5 specification and can parse either
well-formed XML or unstructured and malformed HTML from the web.
The library also provides useful functions to extract information from
an HTML document, making it ideal for screen-scraping.
Users should start from the "Text.HTML.TagSoup" module.
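
As a rough illustration of the extraction functions mentioned above, the following minimal sketch collects link targets from a deliberately malformed fragment. It uses parseTags, isTagOpenName and fromAttrib from "Text.HTML.TagSoup"; the helper name linkTargets and the sample markup are made up for this example and are not part of the library.

```haskell
module Main where

import Text.HTML.TagSoup (fromAttrib, isTagOpenName, parseTags)

-- | Hypothetical helper (not part of TagSoup): collect the href value
-- of every opening <a> tag, skipping anchors that have no href.
linkTargets :: String -> [String]
linkTargets html =
  [ href
  | tag <- parseTags html            -- tokenise, tolerating malformed input
  , isTagOpenName "a" tag            -- keep only opening <a> tags
  , let href = fromAttrib "href" tag -- empty string when the attribute is absent
  , not (null href)
  ]

main :: IO ()
main = mapM_ putStrLn (linkTargets sample)
  where
    -- Illustrative input: the second <a> and the last <p> are never closed.
    sample =
      "<p>See <a href='https://example.com/'>this</a> and <a href='/about'>that<p>unclosed"
```

Because parseTags yields a flat list of tags rather than a tree, the unclosed elements in the sample need no special handling here; ordinary list functions and comprehensions are enough for simple scraping tasks of this kind.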