Groups > comp.lang.python > #66973 > unrolled thread

insert html into ElementTree without parsing it

Started by	graeme.pietersz@gmail.com
First post	2014-02-24 01:45 -0800
Last post	2014-03-01 15:26 +0100
Articles	2 — 2 participants


  insert html into ElementTree without parsing it graeme.pietersz@gmail.com - 2014-02-24 01:45 -0800
    Re: insert html into ElementTree without parsing it Stefan Behnel <stefan_ml@behnel.de> - 2014-03-01 15:26 +0100

#66973 — insert html into ElementTree without parsing it

From	graeme.pietersz@gmail.com
Date	2014-02-24 01:45 -0800
Subject	insert html into ElementTree without parsing it
Message-ID	<b7c05dd5-d25f-428e-b1d5-2ab75ab451f8@googlegroups.com>

I am building HTML pages using ElementTree.

I need to insert chunks of untrusted HTML into the page. I do not need or want to parse this, just insert it at a particular point as is.

The best solutions I can think of are rather ugly ones: manipulating the string created by tostring.

Is there a nicer way of doing this? Is it possible, for example, to customise how an element is converted to a string representation? I am open to using something else (e.g. lxml) if necessary.
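
A minimal sketch of the string-manipulation approach described above, using only the standard library. The function name `render_with_raw_html` and the `RAW-HTML-` marker format are hypothetical, introduced here for illustration. The raw fragment is spliced in verbatim, so this does nothing to make untrusted HTML safe.

```python
import uuid
import xml.etree.ElementTree as ET


def render_with_raw_html(root, target, raw_html):
    """Serialize `root` with `raw_html` spliced verbatim inside `target`.

    A unique placeholder temporarily replaces the element's text, the tree
    is serialized, and the placeholder is swapped out in the resulting string.
    """
    # Hypothetical marker: letters, digits and hyphens only, so tostring()
    # has nothing to escape and the marker survives serialization unchanged.
    marker = "RAW-HTML-" + uuid.uuid4().hex
    saved_text = target.text
    target.text = marker
    try:
        serialized = ET.tostring(root, encoding="unicode", method="html")
    finally:
        # Leave the tree as it was found.
        target.text = saved_text
    return serialized.replace(marker, raw_html)


page = ET.Element("html")
body = ET.SubElement(page, "body")
slot = ET.SubElement(body, "div", {"class": "user-content"})
print(render_with_raw_html(page, slot, "<b>hello</b> & goodbye"))
```

Any text the target element already holds is left out of the output, so the sketch assumes an otherwise empty element reserved for the fragment.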


#67315

From	Stefan Behnel <stefan_ml@behnel.de>
Date	2014-03-01 15:26 +0100
Message-ID	<mailman.7513.1393683982.18130.python-list@python.org>
In reply to	#66973

graeme.pietersz@gmail.com, 24.02.2014 10:45:
> I am building HTML pages using ElementTree.
> I need to insert chunks of untrusted HTML into the page. I do not need or want to parse this, just insert it at a particular point as is.

How would you determine whether it can be safely inserted without
parsing it?
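
To illustrate the point, here is a rough sketch of why a safety decision needs a parser: even listing the tags and attributes in a fragment means parsing it. It uses the standard library's `html.parser`. The allowlist `ALLOWED_TAGS` and the names `TagAudit` and `unexpected_tags` are assumptions made for this example. This is an audit, not a sanitizer, and should not be relied on to make untrusted HTML safe.

```python
from html.parser import HTMLParser

# Hypothetical allowlist, chosen only for illustration.
ALLOWED_TAGS = {"b", "i", "em", "strong", "p", "br"}


class TagAudit(HTMLParser):
    """Record every start tag that is off the allowlist or has attributes."""

    def __init__(self):
        super().__init__()
        self.unexpected = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names before calling this handler.
        if tag not in ALLOWED_TAGS or attrs:
            self.unexpected.append(tag)


def unexpected_tags(fragment):
    """Return the start tags in `fragment` that the audit flags."""
    audit = TagAudit()
    audit.feed(fragment)
    audit.close()
    return audit.unexpected


print(unexpected_tags('<p onclick="x()">hi</p><script>alert(1)</script>'))
```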


> The best solutions I can think of are rather ugly ones: manipulating the string created by tostring.
> 
> Is there a nicer way of doing this? Is it possible, for example, to customise how an element is converted to a string representation? I am open to using something else (e.g. lxml) if necessary.

lxml has a tool to discard potentially unsafe content from HTML files:

http://lxml.de/lxmlhtml.html#cleaning-up-html

Stefan

