Date: Tue, 07 May 2013 18:00:43 -0500
From: Andrew Berg
To: "comp.lang.python"
Subject: Re: Making safe file names
Newsgroups: comp.lang.python

On 2013.05.07 17:01, Terry Jan Reedy wrote:
> Sounds like you want something like the html escape or urlencode
> functions, which serve the same purpose of encoding special chars.
> Rather than invent a new transformation, you could use the same scheme
> used for html entities. (Sorry, I forget the details.) It is possible
> that one of the functions would work for you as is, or with little
> modification.

This has the problem of mangling non-ASCII characters (and artist names with
non-ASCII characters are not rare). I most definitely want to keep as many
characters untouched as possible so that the files are easy to identify by
looking at the file name. Ideally, only characters that file systems don't
like would be transformed.

-- 
CPython 3.3.1 | Windows NT 6.2.9200 / FreeBSD 9.1
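
A minimal sketch of the "only transform what file systems don't like" idea from the message above, not a definitive implementation. The function name make_safe_name and the UNSAFE set are assumptions made for illustration: the set covers the characters Windows rejects in file names, plus "/" and the "%" used as the escape marker. Everything else, including non-ASCII characters, passes through untouched.

```python
# Hypothetical helper, not part of the original post.
# Assumption: these are the characters worth escaping (Windows-reserved
# characters, "/" for POSIX systems, and "%" so the escaping stays reversible).
UNSAFE = set('<>:"/\\|?*%')


def make_safe_name(name):
    """Percent-encode only unsafe characters; leave all others as they are."""
    out = []
    for ch in name:
        if ch in UNSAFE or ord(ch) < 32:
            # Encode the character's UTF-8 bytes as %XX, as urlencode would.
            out.append("".join("%{:02X}".format(b) for b in ch.encode("utf-8")))
        else:
            # Non-ASCII characters are kept so names stay recognizable.
            out.append(ch)
    return "".join(out)
```

Because the escapes use the same %XX form as URL quoting, urllib.parse.unquote() should recover the original name. The sketch does not handle Windows reserved device names (CON, NUL, and so on) or trailing dots and spaces, which would need separate treatment.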