HTML::Detoxifier(3pm)          User Contributed Perl Documentation          HTML::Detoxifier(3pm)



NAME
       HTML::Detoxifier - practical module to strip harmful HTML

SYNOPSIS
               use HTML::Detoxifier qw<detoxify>;

               my $clean_html = detoxify $html;

               my $cleaner_html = detoxify($html, disallow =>
                       [qw(dynamic images document)]);

               my $stripped_html = detoxify($html, disallow => [qw(everything)]);

DESCRIPTION
       HTML::Detoxifier is a practical module to remove harmful tags from HTML input.  It's
       intended to be used for web sites that accept user input in the form of HTML and then
       present that information in some form.

       Accepting all HTML from untrusted users is generally a very bad idea; typically, all HTML
       should be run through some kind of filter before being presented to end users. Cross-site
       scripting (XSS) vulnerabilities can run rampant without a filter. The most common and
       obvious HTML vulnerability lies in stealing users' login cookies through JavaScript.

       Unlike other modules, HTML::Detoxifier is intended to be a practical solution that hides
       the details of whitelisting individual tags while remaining easy to use and secure. Tags
       are divided into functional groups, each of which can be disallowed or allowed as you
       wish. Additionally, HTML::Detoxifier knows how to clean inline CSS; with HTML::Detoxifier,
       you can securely allow users to use style sheets without allowing cross-site scripting
       vulnerabilities. (Yes, it is possible to execute JavaScript from CSS!)

       In addition to this main purpose, HTML::Detoxifier cleans up some common mistakes in
       HTML: all tags are closed, empty tags are converted to valid XML (that is, with a trailing
       /), and images lacking the ALT text that HTML 4.0 requires are given a plain ALT
       attribute. The module does its best to emit valid XHTML 1.0; it even adds XML declarations
       and DOCTYPE declarations where needed.

HTML TAG GROUPS
       The following groups can be disallowed or allowed as you choose. Some tags belong to more
       than one group. Such a tag is kept only if every group it belongs to is allowed;
       otherwise the tag will be removed.

   everything
       All HTML.

   document
       Markup that defines the basic structure of a document (e.g. html, head, body).

   aesthetic
       Markup that alters the appearance of text (e.g. strong, strike, b, i, em).

   size-altering
       Markup that can alter the size of text (e.g. big, small).

   block
       Most block-level markup as defined in the HTML4 specification.

   comments
       HTML comments.

   forms
       Markup used to create fill-in forms.

   layout
       Markup that creates tables or otherwise controls page layout.

   images
       Markup that creates images.

   annoying
       Markup that creates "annoying" effects that most web users find undesirable (e.g.
       marquee, blink).

   dynamic
       Markup that specifies JavaScript or some other embedded format (SVG, Flash, Java, etc.).
       Possibly dangerous.

   misc
       Seldom-used, typically harmless HTML tags that specify special types of inline text
       (e.g. abbr, dd, span).
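
       Because groups are passed as plain strings, a misspelled name is easy to miss. The
       following is a minimal sketch of a check against the group names listed above. The
       helper unknown_groups is hypothetical and not part of HTML::Detoxifier; whether the
       module itself rejects an unknown name is not stated here.

```perl
use strict;
use warnings;

# Group names exactly as listed in HTML TAG GROUPS.
my %KNOWN_GROUP = map { $_ => 1 } qw(
    everything document aesthetic size-altering block comments
    forms layout images annoying dynamic misc
);

# Hypothetical helper (not part of HTML::Detoxifier): returns the names
# that are not documented groups, so typos can be caught before the call.
sub unknown_groups {
    my @names = @_;
    return grep { !$KNOWN_GROUP{$_} } @names;
}

my @requested = qw(dynamic images anoying);    # third name is misspelled
my @bad       = unknown_groups(@requested);
print "Unknown tag group(s): @bad\n" if @bad;
```

       Running such a check before building the disallow or allow_only list keeps a typo from
       silently changing which tags are filtered.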

INVOCATION
               detoxify(html, options)

       Call detoxify to detoxify html with the given options. The most common key in the options
       hash is disallow, which disallows certain features of HTML. See above for the list of
       acceptable values. Pass a reference to an array of strings naming groups as the value of
       the optional disallow key. You may also specify allow_only, which has the same syntax but
       performs the reverse action: only the specified tag groups are allowed. If no options are
       specified, only dynamic content is removed.
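
       As an illustration of the difference between disallow and allow_only, the sketch below
       keeps a few option lists in a table and passes the chosen one to detoxify. The profile
       names and the clean_for wrapper are hypothetical, introduced only for this example; the
       option keys and group names are those documented above. The module is loaded with
       require on first use, which is a choice made for this sketch rather than a requirement.

```perl
use strict;
use warnings;

# Hypothetical option profiles; the profile names are illustrative only.
my %profile = (
    # Remove the dynamic, images, and document groups; keep everything else.
    comment   => [ disallow   => [qw(dynamic images document)] ],
    # Allow only the aesthetic and block groups.
    signature => [ allow_only => [qw(aesthetic block)] ],
    # No options: per the text above, only dynamic content is removed.
    trusted   => [],
);

# Thin wrapper: look up a profile and pass its options through to detoxify.
sub clean_for {
    my ($name, $html) = @_;
    my $opts = $profile{$name} or die "no such profile: $name\n";
    require HTML::Detoxifier;    # loaded on first use
    return HTML::Detoxifier::detoxify($html, @$opts);
}
```

       A call such as clean_for('signature', $html) would then be equivalent to calling
       detoxify($html, allow_only => [qw(aesthetic block)]) directly.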

       If you want to detoxify a document in multiple stages, set the section key in the options
       hash to 'first' for the first stage and 'next' for every subsequent stage. This postpones
       the tag-closing mechanism until you pass 'last' as the value of the section key.
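
       A staged run might be sketched as follows. The helpers section_for and detoxify_chunks
       are hypothetical and not part of the module. The sketch assumes that 'last' is passed
       together with the final chunk of input, which the description above does not spell out,
       and that a document supplied in a single piece needs no section key at all.

```perl
use strict;
use warnings;

# Hypothetical helper: choose the section value for chunk $i (0-based) of
# $n chunks, following the 'first' / 'next' / 'last' sequence described
# above. A single chunk gets no section key, so nothing is returned.
sub section_for {
    my ($i, $n) = @_;
    return         if $n < 2;
    return 'first' if $i == 0;
    return 'last'  if $i == $n - 1;
    return 'next';
}

# Detoxify a list of HTML chunks in order and return the cleaned chunks.
sub detoxify_chunks {
    my @chunks = @_;
    require HTML::Detoxifier;    # loaded on first use
    my @clean;
    for my $i (0 .. $#chunks) {
        my $section = section_for($i, scalar @chunks);
        my @opts = defined $section ? (section => $section) : ();
        push @clean, HTML::Detoxifier::detoxify($chunks[$i], @opts);
    }
    return @clean;
}
```

       With three chunks, the calls would carry section values of 'first', 'next', and 'last' in
       that order, so tags left open in an early chunk are not closed until the final call.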

AUTHOR
       Patrick Walton <pwalton AT metajournal.net>

SEE ALSO
       HTML::Sanitizer, HTML::Scrubber, HTML::StripScripts, HTML::Parser

COPYRIGHT
       Copyright (c) 2004 Patrick Walton. You may redistribute this module under the same terms
       as Perl itself. For more information, see the appropriate LICENSE file.



perl v5.10.0                                2004-03-01                      HTML::Detoxifier(3pm)
