Re: Making safe file names

From Roy Smith <roy@panix.com>
Newsgroups comp.lang.python
Subject Re: Making safe file names
Date 2013-05-07 20:22 -0400
Organization PANIX Public Access Internet and UNIX, NYC
Message-ID <roy-EE293A.20221707052013@news.panix.com>
References <51895D03.4000300@gmail.com> <mailman.1428.1367972114.3114.python-list@python.org>


In article <mailman.1428.1367972114.3114.python-list@python.org>,
 Dave Angel <davea@davea.name> wrote:

> On 05/07/2013 03:58 PM, Andrew Berg wrote:
> > Currently, I keep Last.fm artist data caches to avoid unnecessary API calls 
> > and have been naming the files using the artist name. However,
> > artist names can have characters that are not allowed in file names for 
> > most file systems (e.g., C/A/T has forward slashes). Are there any
> > recommended strategies for naming such files while avoiding conflicts (I 
> > wouldn't want to run into problems for an artist named C-A-T or
> > CAT, for example)? I'd like to make the files easily identifiable, and 
> > there really are no limits on what characters can be in an artist name.
> >
> 
> So what you need first is a list of allowable characters for all your 
> target OS versions.  And don't forget that the allowable characters may 
> vary depending on the particular file system(s) mounted on a given OS.
> 
> You also need to decide how to handle Unicode characters, since they're 
> different for different OS.  In Windows on NTFS, filenames are in 
> Unicode, while on Unix, filenames are bytes.  So on one of those, you 
> will be encoding/decoding if your code is to be mostly portable.
> 
> Don't forget that ls and rm may not use the same encoding you're using.
> So it may not be enough to make the names legal; you may also want
> them to be easily typeable in the shell.

One tool that may help you here is unidecode
(https://pypi.python.org/pypi/Unidecode).  It doesn't solve your whole
problem, but it does help get Unicode text into a form that is both
7-bit clean and human readable.
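
As a rough sketch of how that could fit together (not a complete answer):
transliterate the artist name to ASCII, replace anything outside a
conservative character set, and append a short hash of the original name
so that C/A/T, C-A-T and CAT end up in different files.  The function name
safe_filename, the ".json" suffix, the allowed character set, and the
8-digit hash length are all assumptions made for illustration.  If
Unidecode is not installed, the sketch falls back to a cruder stdlib
approximation that simply drops characters it cannot decompose to ASCII.

```python
import hashlib
import re
import unicodedata

try:
    from unidecode import unidecode  # third-party: pip install Unidecode
except ImportError:
    def unidecode(text):
        # Crude stand-in, NOT equivalent to Unidecode: decompose accented
        # characters and drop whatever is still non-ASCII afterwards.
        decomposed = unicodedata.normalize("NFKD", text)
        return decomposed.encode("ascii", "ignore").decode("ascii")


def safe_filename(artist, suffix=".json"):
    """Build a readable, ASCII-only cache file name for an artist name."""
    # Human-readable part: ASCII transliteration, with every run of
    # characters outside [A-Za-z0-9._-] collapsed to one underscore.
    ascii_name = unidecode(artist)
    slug = re.sub(r"[^A-Za-z0-9._-]+", "_", ascii_name).strip("._-")
    if not slug:
        slug = "artist"  # nothing usable survived transliteration
    # Disambiguating part: hash of the exact original name, so names that
    # produce the same slug still map to different files.
    digest = hashlib.sha1(artist.encode("utf-8")).hexdigest()[:8]
    return "{}-{}{}".format(slug, digest, suffix)


if __name__ == "__main__":
    for name in (u"C/A/T", u"C-A-T", u"CAT", u"Bj\u00f6rk"):
        print(safe_filename(name))
```

The slug keeps the files easy to identify by eye, while the hash carries
the uniqueness.  Truncating the hash to 8 hex digits makes a collision
unlikely rather than impossible; use more digits if that matters.  The
mapping is one-way, so the original artist name should be stored inside
the cache file rather than recovered from the file name.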
