source: exist/trunk/python/elementtree-1.3a6-20070310-badc/docs/pythondoc-elementtree.HTMLTreeBuilder.html @ 3578

Subversion URL: http://proj.badc.rl.ac.uk/svn/ndg/exist/trunk/python/elementtree-1.3a6-20070310-badc/docs/pythondoc-elementtree.HTMLTreeBuilder.html@3578
Revision 3578, 5.7 KB checked in by pjkersha, 11 years ago

Latest releases from Fredrik Lundh. 10 March release has exclusive C14N support with namespace prefixes.

<!DOCTYPE html PUBLIC '-//W3C//DTD XHTML 1.0 Strict//EN' 'http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd'>
<html>
<head>
<meta http-equiv='Content-Type' content='text/html; charset=us-ascii' />
<title>The elementtree.HTMLTreeBuilder Module</title>
<link rel='stylesheet' href='effbot.css' type='text/css' />
</head>
<body>
<h1>The elementtree.HTMLTreeBuilder Module</h1>
<p>Tools to build element trees from HTML files.</p>
<h2>Module Contents</h2>
<dl>
<dt><b>HTMLTreeBuilder(builder=None, encoding=None)</b> (class) [<a href='#elementtree.HTMLTreeBuilder.HTMLTreeBuilder-class'>#</a>]</dt>
<dd>
<p>ElementTree builder for HTML source code.</p>
<dl>
<dt><i>builder=</i></dt>
<dd>
Optional builder object.  If omitted, the parser
    uses the standard <b>elementtree</b> builder.
</dd>
<dt><i>encoding=</i></dt>
<dd>
Optional character encoding, if known.  If omitted,
    the parser looks for META tags inside the document.  If no tags
    are found, the parser defaults to ISO-8859-1.  Note that if your
    document uses an encoding that is not ASCII-compatible, you must
    decode the document before parsing.</dd>
</dl><br />
<p>For more information about this class, see <a href='#elementtree.HTMLTreeBuilder.HTMLTreeBuilder-class'><i>The HTMLTreeBuilder Class</i></a>.</p>
</dd>
<dt><a id='elementtree.HTMLTreeBuilder.parse-function' name='elementtree.HTMLTreeBuilder.parse-function'><b>parse(source, encoding=None)</b></a> [<a href='#elementtree.HTMLTreeBuilder.parse-function'>#</a>]</dt>
<dd>
<p>Parse an HTML document or document fragment.</p>
<dl>
<dt><i>source</i></dt>
<dd>
A filename or file object containing HTML data.</dd>
<dt><i>encoding</i></dt>
<dd>
Optional character encoding, if known.  If omitted,
    the parser looks for META tags inside the document.  If no tags
    are found, the parser defaults to ISO-8859-1.</dd>
<dt>Returns:</dt>
<dd>
An ElementTree instance.</dd>
</dl><br />
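<p>The encoding fallback described above can be illustrated with a
minimal sketch.  It is not the module's own code: <b>sniff_encoding</b>
is a hypothetical helper, and the regular expression is an assumed
simplification of META tag detection that uses only the Python 3
standard library.</p>

```python
import re

# Assumption: a simple pattern for charset=... inside a META tag.
# The real parser inspects META tags while parsing; this regex is a stand-in.
_META_CHARSET = re.compile(
    rb"<meta[^>]*charset\s*=\s*[\"']?\s*([A-Za-z0-9._:-]+)", re.IGNORECASE
)


def sniff_encoding(data, encoding=None):
    """Pick an encoding for raw HTML bytes (hypothetical helper)."""
    if encoding:
        return encoding  # an explicit encoding argument wins
    match = _META_CHARSET.search(data)
    if match:
        return match.group(1).decode("ascii").lower()
    return "iso-8859-1"  # documented default when no META tag is found


page = (b'<html><head><meta http-equiv="Content-Type" '
        b'content="text/html; charset=utf-8"></head></html>')
detected = sniff_encoding(page)
fallback = sniff_encoding(b"<p>no meta here</p>")
explicit = sniff_encoding(page, encoding="latin-1")
```

<p>The order of the three checks mirrors the description: an explicit
argument first, then META tags, then ISO-8859-1.</p>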
</dd>
<dt><a id='elementtree.HTMLTreeBuilder.TreeBuilder-variable' name='elementtree.HTMLTreeBuilder.TreeBuilder-variable'><b>TreeBuilder</b></a> (variable) [<a href='#elementtree.HTMLTreeBuilder.TreeBuilder-variable'>#</a>]</dt>
<dd>
<p>An alias for the <b>HTMLTreeBuilder</b> class.
</p></dd>
</dl>
<h2><a id='elementtree.HTMLTreeBuilder.HTMLTreeBuilder-class' name='elementtree.HTMLTreeBuilder.HTMLTreeBuilder-class'>The HTMLTreeBuilder Class</a></h2>
<dl>
<dt><b>HTMLTreeBuilder(builder=None, encoding=None)</b> (class) [<a href='#elementtree.HTMLTreeBuilder.HTMLTreeBuilder-class'>#</a>]</dt>
<dd>
<p>ElementTree builder for HTML source code.  This builder converts an
HTML document or fragment to an ElementTree.
</p><p>
The parser is relatively picky, and requires balanced tags for most
elements.  However, elements belonging to the following group are
automatically closed: P, LI, TR, TH, and TD.  In addition, for
elements in the following group, the parser inserts an end tag
immediately after the start tag and ignores any explicit end tags:
IMG, HR, META, and LINK.

</p><dl>
<dt><i>builder=</i></dt>
<dd>
Optional builder object.  If omitted, the parser
    uses the standard <b>elementtree</b> builder.
</dd>
<dt><i>encoding=</i></dt>
<dd>
Optional character encoding, if known.  If omitted,
    the parser looks for META tags inside the document.  If no tags
    are found, the parser defaults to ISO-8859-1.  Note that if your
    document uses an encoding that is not ASCII-compatible, you must
    decode the document before parsing.</dd>
</dl><br />
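<p>The tag handling described above can be sketched with the Python 3
standard library.  This is an illustration under stated assumptions,
not this class's implementation: <b>SketchBuilder</b> is a hypothetical
name, it uses <b>html.parser.HTMLParser</b> and
<b>xml.etree.ElementTree.TreeBuilder</b> as stand-ins, and it is more
forgiving about unbalanced tags than the description suggests.</p>

```python
from html.parser import HTMLParser
from xml.etree.ElementTree import TreeBuilder

AUTOCLOSE = {"p", "li", "tr", "th", "td"}   # closed implicitly, per the text
IGNOREEND = {"img", "hr", "meta", "link"}   # treated as empty elements


class SketchBuilder(HTMLParser):
    """Hypothetical stand-in for the documented behaviour, not the real class."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._builder = TreeBuilder()
        self._open = []  # names of the currently open elements

    def handle_starttag(self, tag, attrs):
        # A new P, LI, ... closes an open element of the same name first.
        if tag in AUTOCLOSE and self._open and self._open[-1] == tag:
            self.handle_endtag(tag)
        self._builder.start(tag, {name: value or "" for name, value in attrs})
        if tag in IGNOREEND:
            self._builder.end(tag)  # end tag inserted right after the start tag
        else:
            self._open.append(tag)

    def handle_endtag(self, tag):
        if tag in IGNOREEND or tag not in self._open:
            return  # end tags for empty or unopened elements are ignored
        # Close open elements from the innermost one up to the matching tag.
        while self._open:
            last = self._open.pop()
            self._builder.end(last)
            if last == tag:
                break

    def handle_data(self, data):
        self._builder.data(data)

    def close(self):
        super().close()                # flush the parser buffers
        return self._builder.close()   # return the root Element


parser = SketchBuilder()
parser.feed("<div><ul><li>one<li>two</ul>"
            "<p>first<img src='x.png'><p>second</div>")
root = parser.close()
items = [li.text for li in root.iter("li")]
paragraphs = root.findall("p")
```

<p>In this sketch the two LI elements end up as siblings, and the IMG
element is closed as soon as it is opened, so the second P is not
nested inside it.</p>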
</dd>
<dt><a id='elementtree.HTMLTreeBuilder.HTMLTreeBuilder.close-method' name='elementtree.HTMLTreeBuilder.HTMLTreeBuilder.close-method'><b>close()</b></a> [<a href='#elementtree.HTMLTreeBuilder.HTMLTreeBuilder.close-method'>#</a>]</dt>
<dd>
<p>Flushes the parser buffers and returns the root element.</p>
<dl>
<dt>Returns:</dt>
<dd>
An Element instance.</dd>
</dl><br />
</dd>
<dt><a id='elementtree.HTMLTreeBuilder.HTMLTreeBuilder.handle_charref-method' name='elementtree.HTMLTreeBuilder.HTMLTreeBuilder.handle_charref-method'><b>handle_charref(char)</b></a> [<a href='#elementtree.HTMLTreeBuilder.HTMLTreeBuilder.handle_charref-method'>#</a>]</dt>
<dd>
<p>(Internal) Handles character references.</p>
</dd>
<dt><a id='elementtree.HTMLTreeBuilder.HTMLTreeBuilder.handle_data-method' name='elementtree.HTMLTreeBuilder.HTMLTreeBuilder.handle_data-method'><b>handle_data(data)</b></a> [<a href='#elementtree.HTMLTreeBuilder.HTMLTreeBuilder.handle_data-method'>#</a>]</dt>
<dd>
<p>(Internal) Handles character data.</p>
</dd>
<dt><a id='elementtree.HTMLTreeBuilder.HTMLTreeBuilder.handle_endtag-method' name='elementtree.HTMLTreeBuilder.HTMLTreeBuilder.handle_endtag-method'><b>handle_endtag(tag)</b></a> [<a href='#elementtree.HTMLTreeBuilder.HTMLTreeBuilder.handle_endtag-method'>#</a>]</dt>
<dd>
<p>(Internal) Handles end tags.</p>
</dd>
<dt><a id='elementtree.HTMLTreeBuilder.HTMLTreeBuilder.handle_entityref-method' name='elementtree.HTMLTreeBuilder.HTMLTreeBuilder.handle_entityref-method'><b>handle_entityref(name)</b></a> [<a href='#elementtree.HTMLTreeBuilder.HTMLTreeBuilder.handle_entityref-method'>#</a>]</dt>
<dd>
<p>(Internal) Handles entity references.</p>
</dd>
<dt><a id='elementtree.HTMLTreeBuilder.HTMLTreeBuilder.handle_starttag-method' name='elementtree.HTMLTreeBuilder.HTMLTreeBuilder.handle_starttag-method'><b>handle_starttag(tag, attrs)</b></a> [<a href='#elementtree.HTMLTreeBuilder.HTMLTreeBuilder.handle_starttag-method'>#</a>]</dt>
<dd>
<p>(Internal) Handles start tags.</p>
</dd>
<dt><a id='elementtree.HTMLTreeBuilder.HTMLTreeBuilder.unknown_entityref-method' name='elementtree.HTMLTreeBuilder.HTMLTreeBuilder.unknown_entityref-method'><b>unknown_entityref(name)</b></a> [<a href='#elementtree.HTMLTreeBuilder.HTMLTreeBuilder.unknown_entityref-method'>#</a>]</dt>
<dd>
<p>(Hook) Handles unknown entity references.  The default action
is to ignore unknown entities.</p>
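<p>A hook of this kind is normally overridden in a subclass.  The
following sketch shows the pattern with the Python 3 standard library
only; <b>EntityDemo</b>, <b>KeepUnknown</b>, and <b>text_of</b> are
hypothetical names, and the dispatch in <b>handle_entityref</b> is an
assumption about how such a hook might be called, not this module's
code.</p>

```python
from html.entities import name2codepoint
from html.parser import HTMLParser


class EntityDemo(HTMLParser):
    """Collects text and routes unknown entities to a hook (hypothetical)."""

    def __init__(self):
        # convert_charrefs=False so that handle_entityref is actually called.
        super().__init__(convert_charrefs=False)
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)

    def handle_entityref(self, name):
        codepoint = name2codepoint.get(name)
        if codepoint is not None:
            self.chunks.append(chr(codepoint))  # known entity: emit its character
        else:
            self.unknown_entityref(name)        # unknown entity: defer to the hook

    def unknown_entityref(self, name):
        pass  # default action: ignore unknown entities


class KeepUnknown(EntityDemo):
    def unknown_entityref(self, name):
        # Override: keep the reference in the output as literal text.
        self.chunks.append("&%s;" % name)


def text_of(parser_class, markup):
    parser = parser_class()
    parser.feed(markup)
    parser.close()
    return "".join(parser.chunks)


ignored = text_of(EntityDemo, "a &amp; b &bogus; c")
kept = text_of(KeepUnknown, "a &amp; b &bogus; c")
```

<p>With the default hook the unknown reference simply disappears from
the text; the subclass keeps it verbatim.</p>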
</dd>
</dl>
</body></html>