Newsgroups: comp.lang.python
From: Chris Rebert
To: anilkumar.dannina@gmail.com
Date: Tue, 1 Jan 2013 22:32:34 -0800
Subject: Re: Handling Special characters in python

On Jan 1, 2013 8:48 PM, <anilkumar.dannina@gmail.com> wrote:
> On Wednesday, January 2, 2013 12:00:06 AM UTC+5:30, Chris Rebert wrote:
> > On Jan 1, 2013 3:41 AM, <anilkuma...@gmail.com> wrote:
> >
> > > I am facing one issue in my module. I am gathering data from sql server database. In the data that I got from db contains special characters like "endash". Python was taking it as "\x96". I require the same character(endash). How can I perform that. Can you please help me in resolving this issue.
> >
> > 1. What library are you using to access the database?
> > 2. To confirm, it's a Microsoft SQL Server database?
> > 3. What OS are you on?
>
> 1. I am using "pymssql" module to access the database.
> 2. Yes, It is a SQL server database.
> 3. I am on Ubuntu 11.10

Did you set "client charset" (to "UTF-8", unless you have good reason to choose otherwise) in freetds.conf? That should at least ensure that the driver itself is exchanging bytestrings via a well-defined encoding.
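
For reference, a minimal sketch of that setting. I'm assuming you put it in the [global] section so that it applies to every server entry; where the file lives depends on your install (often /etc/freetds/freetds.conf on Ubuntu, but check what your FreeTDS build actually reads).

```
# freetds.conf (sketch; the path and any other settings depend on your install)
[global]
    client charset = UTF-8
```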
If you want to work in Unicode natively (recommended), you'll probably need to ensure that the columns are of type NVARCHAR as opposed to VARCHAR. Unless you're using SQLAlchemy or similar (which I personally would recommend), you may need to do the .encode() and .decode() calls manually, using the charset you specified in freetds.conf.
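
A rough sketch of what the manual route could look like, assuming the driver hands you bytestrings in the charset from freetds.conf. The names CLIENT_CHARSET, to_text and to_bytes are made up for illustration; they are not part of pymssql or FreeTDS. The last lines also show why you saw "\x96": that byte is the en dash in Windows-1252.

```python
# Hypothetical helpers (not part of pymssql or FreeTDS).
# CLIENT_CHARSET must match the "client charset" value in freetds.conf.
CLIENT_CHARSET = "utf-8"


def to_text(value, charset=CLIENT_CHARSET):
    """Decode a bytestring from the driver; pass any other value through."""
    if isinstance(value, bytes):
        return value.decode(charset)
    return value


def to_bytes(text, charset=CLIENT_CHARSET):
    """Encode text before handing it to the driver as a bytestring."""
    return text.encode(charset)


# An en dash (U+2013) round-trips through the configured charset.
en_dash = to_text(b"\xe2\x80\x93")
round_trip = to_bytes(en_dash)

# The byte from the original question, read as Windows-1252, is the same character.
legacy_en_dash = b"\x96".decode("cp1252")
```

If the data really is arriving as Windows-1252 bytes, decoding with "cp1252" rather than the freetds.conf charset would be the thing to try, but that is a guess about your setup.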

Sorry my advice is a tad general. I went the alternative route of SQLAlchemy + PyODBC + Microsoft's SQL Server ODBC driver for Linux ( http://www.microsoft.com/en-us/download/details.aspx?id=28160 ) for my current project, which likewise needs to fetch data from MS SQL to an Ubuntu box. The driver is intended for Red Hat and isn't packaged nicely (it installs via a shell script), but after that was dealt with, things have gone smoothly. Unicode, in particular, seems to work properly.
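
In case it helps, here is a sketch of how a connection URL for that stack can be put together, not necessarily how my project does it. It uses only the standard library to build an "odbc_connect" style URL for SQLAlchemy's mssql+pyodbc dialect. The helper name mssql_pyodbc_url and every value passed to it are placeholders; in particular, the driver name must be whatever your odbcinst.ini actually registers.

```python
from urllib.parse import quote_plus


def mssql_pyodbc_url(driver, server, database, user, password):
    """Build a SQLAlchemy URL that hands a raw ODBC connection string to pyodbc."""
    odbc = "DRIVER={%s};SERVER=%s;DATABASE=%s;UID=%s;PWD=%s" % (
        driver, server, database, user, password)
    # The ODBC string has to be URL-quoted before it goes into the query part.
    return "mssql+pyodbc:///?odbc_connect=" + quote_plus(odbc)


# All values below are placeholders, not real names or credentials.
url = mssql_pyodbc_url(
    driver="YOUR ODBC DRIVER NAME",  # as registered in odbcinst.ini
    server="dbhost.example.com",
    database="mydb",
    user="myuser",
    password="secret",
)

# With SQLAlchemy and pyodbc installed, the URL would then be used as:
#   import sqlalchemy
#   engine = sqlalchemy.create_engine(url)
```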
