Subject: Re: Handling Special characters in python
From: Chris Rebert
To: anilkumar.dannina@gmail.com
Date: Wed, 2 Jan 2013 22:00:16 -0800
Cc: python-list@python.org
Newsgroups: comp.lang.python

On Wed, Jan 2, 2013 at 5:39 AM, wrote:
> On Wednesday, January 2, 2013 12:02:34 PM UTC+5:30, Chris Rebert wrote:
>> On Jan 1, 2013 8:48 PM, wrote:
>> > On Wednesday, January 2, 2013 12:00:06 AM UTC+5:30, Chris Rebert wrote:
>> > > On Jan 1, 2013 3:41 AM, wrote:
>> > > > I am facing one issue in my module. I am gathering data from sql server database. In the data that I got from db contains special characters like "endash". Python was taking it as "\x96". I require the same character(endash). How can I perform that. Can you please help me in resolving this issue.
>>
>> > > 1. What library are you using to access the database?
>> > > 2. To confirm, it's a Microsoft SQL Server database?
>> > > 3. What OS are you on?
>>
>> > 1. I am using "pymssql" module to access the database.
>> > 2. Yes, It is a SQL server database.
>> > 3. I am on Ubuntu 11.10
>>
>> Did you set "client charset" (to "UTF-8", unless you have good reason to choose otherwise) in freetds.conf? That should at least ensure that the driver itself is exchanging bytestrings via a well-defined encoding.
>> If you want to work in Unicode natively (Recommended), you'll probably need to ensure that the columns are of type NVARCHAR as opposed to VARCHAR.
>> Unless you're using SQLAlchemy or similar (which I personally would recommend using), you may need to do the .encode() and .decode()-ing manually, using the charset you specified in freetds.conf.
>>
>> Sorry my advice is a tad general. I went the alternative route of SQLAlchemy + PyODBC + Microsoft's SQL Server ODBC driver for Linux (http://www.microsoft.com/en-us/download/details.aspx?id=28160 ) for my current project, which likewise needs to fetch data from MS SQL to an Ubuntu box. The driver is intended for Red Hat and isn't packaged nicely (it installs via a shell script), but after that was dealt with, things have gone smoothly. Unicode, in particular, seems to work properly.
>
> Thanks Chris Rebert for your suggestion, I tried with PyODBC module, But at the place of "en dash(-)", I am getting '?' symbol. How can I overcome this.

I would recommend first trying the advice in the initial part of my response rather than the latter part. The latter part was more for completeness and for the sake of the archives, although I can give more details on its approach if you insist.

Additionally, giving more information as to what exactly you tried would be helpful. What config / connection settings did you use? Of what datatype is the relevant column of the table? What does your code snippet look like? Etc.

Regards,
Chris
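
P.S. For the archives, here is a minimal sketch of the manual .decode()/.encode() step mentioned above. It assumes Python 3 and assumes the VARCHAR data is in Windows-1252 (cp1252), where byte 0x96 is the en dash; the thread does not confirm the actual server collation, and the sample bytes are made up for illustration.

```python
# Assumption: the column data is cp1252-encoded; 0x96 is the en dash there.
# The sample value below is hypothetical, not taken from the real table.
raw = b"2012\x962013"

# bytes -> Unicode text, using the encoding the bytes were written in
text = raw.decode("cp1252")

# Encoding into a charset that lacks the en dash, with replacement enabled,
# turns it into '?', which matches the symptom reported with PyODBC.
lossy = text.encode("ascii", errors="replace")

# UTF-8 can represent the character, so this round trip loses nothing.
utf8_bytes = text.encode("utf-8")
round_tripped = utf8_bytes.decode("utf-8")

print(repr(text), lossy, utf8_bytes)
```

If the '?' already arrives from the driver, the replacement is probably happening in the driver or connection layer rather than in your own code, which is why the charset setting and the column datatype matter.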