HtmlRewriter (open)

An HTML parser

Syntax

LOADLIB "wh::filetypes/html.whlib";

OBJECTTYPE HtmlRewriter

Constructor

Variables

  • STRING ARRAY allowed_attrs

    List of allowed attributes (in addition to our standard per-tag list), if strict filtering is enabled

  • STRING ARRAY allowed_tags

    List of allowed tags, if strict filtering is enabled

  • BOOLEAN allow_comments

    Whether to allow comment tags

  • BOOLEAN allow_scripting

    Whether to allow tags that directly or indirectly allow scripting. Disabled by default

  • BOOLEAN cleanup_msoffice

    If enabled (default), try to clean up MS Office noise (strictly, this breaks (X)HTML conformance)

  • BOOLEAN clean_newlines

    Replace all newlines with spaces (for Flash, which renders newlines)

  • BOOLEAN debug

    If enabled, prints out a lot of debug info

  • STRING ARRAY disallowed_attrs

    List of disallowed attributes

  • RECORD ARRAY htmltags

    HTML tag listing used to filter (see beginning of this file for its format)

  • INTEGER max_content_length

    Maximum text length (defaults to -1, meaning no limit)

  • FUNCTION PTR rewrite_hyperlink

    If set, this function is called for every hyperlink encountered within attributes (see the 'links' attribute in the html array for the attributes that are handled as hyperlinks). It must return the rewritten hyperlink (signature: STRING FUNCTION(STRING hyperlink))

  • FUNCTION PTR rewrite_img

    If set, this function is called for every image url encountered within attributes (see the 'imgs' attribute in the html array for the attributes that are handled as image urls). It must return the rewritten image url (signature: STRING FUNCTION(STRING imageurl))

  • FUNCTION PTR rewrite_link

    If set, this function is called for every link encountered within attributes (see the 'links' attribute in the html array for the attributes that are handled as links). It must return the rewritten link (signature: STRING FUNCTION(STRING link))

  • BOOLEAN strict_filtering

    Enable filtering based on allowed_tags and allowed_attrs, and filter all unknown attributes

  • BOOLEAN trim_whitespace

    Strip whitespace (whitespace characters and empty elements) from the beginning and end of the document
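
The variables above are typically set on an instance before it processes any input. The fragment below is a minimal sketch of such a configuration, not a definitive recipe. It assumes the constructor takes no arguments (the constructor is not documented here) and that tag and attribute names are written in lowercase. MakeAbsolute and the https://example.org prefix are hypothetical names chosen for illustration; only the member names and the callback signature come from the listing above.

```harescript
<?wh
LOADLIB "wh::filetypes/html.whlib";

// Hypothetical callback matching the documented signature
// STRING FUNCTION(STRING link). The prefix is a placeholder.
STRING FUNCTION MakeAbsolute(STRING link)
{
  // Only rewrite site-relative links; leave everything else untouched
  IF (link LIKE "/*")
    RETURN "https://example.org" || link;
  RETURN link;
}

// Assumption: the constructor takes no arguments
OBJECT rewriter := NEW HtmlRewriter;

// Filter on the lists below and drop unknown attributes
rewriter->strict_filtering := TRUE;
rewriter->allowed_tags := [ "p", "a", "b", "i" ];   // assumed lowercase
rewriter->allowed_attrs := [ "title" ];             // extends the per-tag list
rewriter->allow_comments := FALSE;
rewriter->max_content_length := -1;                 // -1 means no limit

// Called for every link found in link-carrying attributes
rewriter->rewrite_link := PTR MakeAbsolute;
```

The fragment stops at configuration because the functions that actually run the rewriter are not listed in this section.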

Properties

Functions