Groups > comp.lang.python > #44918 > unrolled thread

Re: Making safe file names

Started by: Dave Angel <davea@davea.name>
First post: 2013-05-07 20:14 -0400
Last post: 2013-05-07 20:22 -0400
Articles: 2 (2 participants)


This discussion starts older than the indexed window; earlier articles aren't shown. The article labeled Started by below is the oldest one visible, not the original post.


Contents

  Re: Making safe file names Dave Angel <davea@davea.name> - 2013-05-07 20:14 -0400
    Re: Making safe file names Roy Smith <roy@panix.com> - 2013-05-07 20:22 -0400

#44918 — Re: Making safe file names

From: Dave Angel <davea@davea.name>
Date: 2013-05-07 20:14 -0400
Subject: Re: Making safe file names
Message-ID: <mailman.1428.1367972114.3114.python-list@python.org>

On 05/07/2013 03:58 PM, Andrew Berg wrote:
> Currently, I keep Last.fm artist data caches to avoid unnecessary API calls and have been naming the files using the artist name. However,
> artist names can have characters that are not allowed in file names for most file systems (e.g., C/A/T has forward slashes). Are there any
> recommended strategies for naming such files while avoiding conflicts (I wouldn't want to run into problems for an artist named C-A-T or
> CAT, for example)? I'd like to make the files easily identifiable, and there really are no limits on what characters can be in an artist name.
>

So what you need first is a list of allowable characters for all your 
target OS versions.  And don't forget that the allowable characters may 
vary depending on the particular file system(s) mounted on a given OS.
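
One way to sidestep the per-file-system list, at least as a first pass, is to escape everything outside a small conservative set. Here is a minimal sketch using percent-encoding from the standard library. The function names are made up for illustration, and this is only one possible scheme, not something prescribed in this thread. It keeps C/A/T, C-A-T and CAT distinct and is reversible, but it does not deal with reserved names (e.g. CON on Windows), the names "." and "..", case-insensitive file systems, or length limits.

```python
from urllib.parse import quote, unquote


def safe_filename(artist: str) -> str:
    """Percent-encode everything except ASCII letters, digits and '_.-~'."""
    # safe="" makes quote() escape "/" as well (by default it is left alone).
    return quote(artist, safe="")


def artist_from_filename(filename: str) -> str:
    """Reverse safe_filename() to recover the original artist name."""
    return unquote(filename)


for name in ("C/A/T", "C-A-T", "CAT"):
    print(name, "->", safe_filename(name))
```

Because "%" itself gets escaped, two different artist names can never map to the same file name under this scheme.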

You also need to decide how to handle Unicode characters, since they're 
handled differently on different OSes.  On Windows with NTFS, filenames 
are Unicode, while on Unix, filenames are bytes.  So on one of those, 
you will be encoding/decoding if your code is to be mostly portable.
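
To make the encoding step concrete, here is a small sketch that picks one explicit convention: normalize to NFC, then encode as UTF-8. Both choices are assumptions for the sake of the example (as is the helper name); the standard library's os.fsencode() is the alternative if you'd rather follow whatever encoding the running system reports.

```python
import unicodedata


def to_filename_bytes(name: str) -> bytes:
    """Turn a str name into bytes using a fixed, explicit convention."""
    # NFC composes "e" + combining acute accent into a single code point,
    # so two spellings that look identical produce the same bytes.
    normalized = unicodedata.normalize("NFC", name)
    return normalized.encode("utf-8")


composed = "Beyonc\u00e9"      # precomposed e-acute
decomposed = "Beyonce\u0301"   # plain "e" followed by a combining accent

print(composed == decomposed)
print(to_filename_bytes(composed) == to_filename_bytes(decomposed))
```

Without the normalization step, those two spellings would end up as two different cache files for what is visibly the same artist.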

Don't forget that ls and rm may not use the same encoding you're using. 
So it may not be enough just to make the names legal; you may also want 
them to be easily typeable in the shell.
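
As a rough check for the "typeable in the shell" part, one option is to lean on shlex.quote(), which returns its argument unchanged only when it is non-empty and made up entirely of characters a POSIX shell treats literally. This is a sketch, and the helper name is made up. It says nothing about whether the name is legal on a given file system ("C/A/T" passes this check), and a name starting with "-" can still confuse rm.

```python
import shlex


def is_shell_plain(filename: str) -> bool:
    """True if the name can be typed in a POSIX shell without any quoting."""
    # shlex.quote() only adds quotes when it sees a character outside
    # ASCII letters, digits, "_" and "@%+=:,./-", or an empty string.
    return shlex.quote(filename) == filename


for candidate in ("C%2FA%2FT", "Sigur R\u00f3s", "Guns N' Roses"):
    print(repr(candidate), is_shell_plain(candidate))
```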

-- 
DaveA



#44919

From: Roy Smith <roy@panix.com>
Date: 2013-05-07 20:22 -0400
Message-ID: <roy-EE293A.20221707052013@news.panix.com>
In reply to: #44918

One possible tool that may help you here is unidecode 
(https://pypi.python.org/pypi/Unidecode).  It doesn't solve your whole 
problem, but it does help get Unicode text into a form that is both 
7-bit clean and human readable.
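
For what it's worth, a quick sketch of what that looks like. Unidecode is a third-party package, so the except branch below is a rough stand-in of my own (it only strips accents and is not equivalent to the real thing), there just so the snippet runs either way.

```python
import unicodedata

try:
    from unidecode import unidecode  # third-party: pip install Unidecode
except ImportError:
    def unidecode(text: str) -> str:
        # Rough stand-in, NOT equivalent to Unidecode: decompose accented
        # letters, then drop anything that is not plain ASCII.
        decomposed = unicodedata.normalize("NFKD", text)
        return decomposed.encode("ascii", "ignore").decode("ascii")


for name in ("Bj\u00f6rk", "Mot\u00f6rhead", "Sigur R\u00f3s"):
    print(name, "->", unidecode(name))
```

Note that the mapping is lossy: an artist spelled with the umlaut and one spelled without it come out as the same ASCII name, so the conflict question still needs something else on top (an escaping scheme or a hash suffix, say).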


