| Header | Value |
|---|---|
| Date | Sat, 1 Mar 2014 10:10:58 -0800 (PST) |
| From | "Golam Md. Shibly" <shiblydu60@yahoo.com> |
| Subject | How to extract contents of inner text of html tag? |
| To | "python-list@python.org" <python-list@python.org> |
| Newsgroups | comp.lang.python |
| Message-ID | <mailman.7534.1393704366.18130.python-list@python.org> |
Hi,
###in.txt
<kbd class="command">
    cp -v --remove-destination /usr/share/zoneinfo/
    <em class="replaceable"><code><xxx></code></em>
       \
    /etc/localtime
</kbd>
import sys
import unicodedata
from bs4 import BeautifulSoup

file_name="in.txt"
html_doc=open(file_name,'r')
soup=BeautifulSoup(html_doc)
#print soup.prettify().encode('utf-8')
#file_to_write.writelines( soup.prettify().encode() )
all_kbd=soup.find_all('kbd')
for line in all_kbd:
    if line.string == None:
        extract_code=line.code.extract().string
        #store_code=line.code.decompose()
        for inside_line in line:
            if "<<" not in inside_line and "EOF" not in inside_line:
                if len(inside_line)>0:
                    print inside_line
                    print extract_code
Expected output:
cp -v --remove-destination /usr/share/zoneinfo/<xxx>\
/etc/localtime
Got output:
cp -v --remove-destination /usr/share/zoneinfo/
None
\
/etc/localtime
None
shibly
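
A likely cause of the `None` lines: the parser reads the literal `<xxx>` as an (empty) tag rather than as text, so `<code>` has no string child and `.string` returns `None`. The following is a minimal sketch of one way to get the expected output, not a definitive fix. It deliberately swaps BeautifulSoup for the standard library's `html.parser` so that it runs without extra installs, and it is written for Python 3 rather than the Python 2 used above. The names `KNOWN_TAGS`, `KbdTextExtractor`, and `join_lines` are hypothetical, and the rule "any tag outside `KNOWN_TAGS` is placeholder text" is an assumption made for this input only.

```python
from html.parser import HTMLParser

# Assumption for this sketch: these are the only real markup tags in the
# input; any other tag name (such as <xxx>) is treated as placeholder text.
KNOWN_TAGS = {"kbd", "em", "code"}


def join_lines(text):
    """Strip every line and glue the fragments together, keeping a line
    break only after a trailing backslash (shell line continuation)."""
    out = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # drop lines that were only indentation
        out.append(line)
        if line.endswith("\\"):
            out.append("\n")
    return "".join(out)


class KbdTextExtractor(HTMLParser):
    """Collect the text of each <kbd> element, re-emitting unknown tags
    as literal text instead of dropping them."""

    def __init__(self):
        super().__init__()
        self.depth = 0       # nesting depth inside <kbd>
        self.pieces = []     # text fragments of the current <kbd>
        self.commands = []   # one cleaned-up string per finished <kbd>

    def handle_starttag(self, tag, attrs):
        if tag == "kbd":
            self.depth += 1
        elif self.depth and tag not in KNOWN_TAGS:
            # Note: html.parser lowercases tag names before this call.
            self.pieces.append("<%s>" % tag)

    def handle_endtag(self, tag):
        if tag == "kbd" and self.depth:
            self.depth -= 1
            if self.depth == 0:
                self.commands.append(join_lines("".join(self.pieces)))
                self.pieces = []

    def handle_data(self, data):
        if self.depth:
            self.pieces.append(data)


# Same markup as in.txt above, inlined so the sketch is self-contained.
html_doc = r'''<kbd class="command">
    cp -v --remove-destination /usr/share/zoneinfo/
    <em class="replaceable"><code><xxx></code></em>
       \
    /etc/localtime
</kbd>'''

parser = KbdTextExtractor()
parser.feed(html_doc)
parser.close()
for command in parser.commands:
    print(command)
```

If the source file can be changed instead, writing the placeholder as `&lt;xxx&gt;` should let any HTML parser, BeautifulSoup included, see it as ordinary text.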