| Date | 2012-12-16 22:10 +0100 |
|---|---|
| Subject | Unicode |
| From | Anatoli Hristov <tolidtm@gmail.com> |
| Newsgroups | comp.lang.python |
| Message-ID | <mailman.941.1355692240.29569.python-list@python.org> |
Hello guys,
I'm using CentOS Linux and Python 2.4 with MySQL 5.xx. I get an error
with Unicode. I have tried many things that I found on the net, but none
of them work.
If I don't use UTF-8, the data is inserted into the DB, but some French
characters are not decoded correctly. Could you please help me?
Thanks
import urllib   # used by GetSpecsFR
import MySQLdb  # used by InsertSpecsDB

def PrepareSpecs(product_id, icecat_prod_id, icecat_image_url, name):
    """Gets the specifications of a product from Icecat.biz and inserts
    them into the DB
    """
    # Keys are language ids: 1 = EN, 2 = FR, 3 = NL
    specs = {3: GetSpecsNL(icecat_prod_id),
             2: GetSpecsFR(icecat_prod_id).decode('utf-8'),
             1: GetSpecsEN(icecat_prod_id)}
    SpecsToSQL(product_id, specs, name)
    CategorySQL(product_id)
    StoreSQL(product_id)
    GetIMG(icecat_image_url, icecat_prod_id)
    return
def GetSpecsFR(icecat_prod_id):
    opener = urllib.FancyURLopener({})
    ffr = opener.open("http://prf.icecat.biz/index.cgi?product_id=%s;mi=start;smi=product;shopname=openICEcat-url;lang=fr"
                      % icecat_prod_id)
    specsfr = ffr.read()
    #specsfr = specsfr.decode('utf-8')
    specsfr = RemoveHTML(specsfr)
    ##specsfr = "%r" % specsfr
    ## if specsfr:
    ##     try:
    ##         specsfr = str(specsfr)
    ##     except UnicodeEncodeError:
    ##         specsfr = str(specsfr.encode('utf-16'))
    return specsfr
def RemoveHTML(specs):
    # Strip the page-level tags, in both lower and upper case
    specs = specs.replace("<html>","")
    specs = specs.replace("<HTML>","")
    specs = specs.replace("</html>","")
    specs = specs.replace("</HTML>","")
    specs = specs.replace("<head>","")
    specs = specs.replace("<HEAD>","")
    specs = specs.replace("</head>","")
    specs = specs.replace("</HEAD>","")
    specs = specs.replace("<body>","")
    specs = specs.replace("</body>","")
    specs = specs.replace("<BODY>","")
    specs = specs.replace("</BODY>","")
    specs = specs.replace("<TITLE>","")
    specs = specs.replace("</TITLE>","")
    specs = specs.replace("<title>","")
    specs = specs.replace("</title>","")
    specs = specs.replace("<p>","")
    specs = specs.replace("</p>","")
    return specs
def SpecsToSQL(product_id, specs, name):
    for lang, spec in specs.iteritems():
        InsertSpecsDB(product_id, spec, lang, name)
    return
def InsertSpecsDB(product_id, spec, lang, name):
    db = MySQLdb.connect("localhost","getit","opencart")
    cursor = db.cursor()
    sql = ("INSERT INTO product_description (product_id, language_id, "
           "name, description) VALUES (%s,%s,%s,%s)")
    params = (product_id, lang, name, spec)
    cursor.execute(sql, params)
    id = cursor.lastrowid
    print "Updated ID %s description %s" % (int(id), lang)
    return
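
One possible direction, offered as a sketch rather than a confirmed fix: decode every fetched page to unicode as soon as it is read (not only the French one), and ask MySQLdb for a UTF-8 connection. The helper names `to_unicode` and `connect_utf8`, the Latin-1 fallback, and the assumption that the tables are stored as UTF-8 are illustrative and do not come from the post above. `charset` and `use_unicode` are keyword arguments of `MySQLdb.connect`. The code avoids `b''` literals and `except ... as`, so it should also parse on Python 2.4.

```python
def to_unicode(raw, encodings=("utf-8", "latin-1")):
    """Decode a byte string fetched from the network into unicode text.

    Hypothetical helper: tries each encoding in order and returns the
    first result that decodes cleanly. Latin-1 never fails, so it acts
    as a last-resort fallback (assumption: pages are UTF-8 or Latin-1).
    """
    for encoding in encodings:
        try:
            return raw.decode(encoding)
        except UnicodeDecodeError:
            continue
    raise ValueError("could not decode input with %r" % (encodings,))


def connect_utf8(host, user, passwd, db):
    """Open a MySQLdb connection that sends and receives UTF-8.

    Hypothetical wrapper; assumes the tables themselves use a UTF-8
    character set. MySQLdb is imported inside the function so that
    to_unicode() can be used without the driver installed.
    """
    import MySQLdb
    return MySQLdb.connect(host=host, user=user, passwd=passwd, db=db,
                           charset="utf8", use_unicode=True)


# Byte strings as they might come back from ffr.read():
utf8_bytes = u"sp\xe9cifications".encode("utf-8")
latin1_bytes = u"sp\xe9cifications".encode("latin-1")

# Both inputs decode to the same unicode text.
assert to_unicode(utf8_bytes) == u"sp\xe9cifications"
assert to_unicode(latin1_bytes) == u"sp\xe9cifications"
```

With a connection opened this way, the decoded unicode strings should be usable directly as parameters to `cursor.execute(sql, params)`, leaving the encoding step to the driver.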