Sanitizing filename strings across platforms

Started by: Tim Chase <python.list@tim.thechases.com>
First post: 2011-05-31 21:17 -0500
Last post: 2011-05-31 20:17 -0700
Articles: 2, from 2 participants

#6775 — Sanitizing filename strings across platforms

From: Tim Chase <python.list@tim.thechases.com>
Date: 2011-05-31 21:17 -0500
Subject: Sanitizing filename strings across platforms
Message-ID: <mailman.2350.1306896665.9059.python-list@python.org>

Scenario: file names from potentially untrusted sources may be odd
and need to be sanitized for the underlying OS.
On *nix, this generally just means "don't use '/' or \x00 in your
string", while on Win32 there is a host of verboten characters
and file names.  Then there's also checking the abspath/normpath
of the resulting name to make sure it's still in the intended folder.
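
The abspath/normpath check mentioned above could look roughly like the sketch below. The helper name is_inside and the example directory are made up for illustration, and os.path.commonpath needs Python 3.4 or later. Symlinks are not resolved here; swapping in os.path.realpath would be one way to handle them.

```python
import os

def is_inside(base_dir, fname):
    """Return True if base_dir/fname resolves to a path within base_dir.

    Hypothetical helper, not taken from any library.
    """
    base = os.path.abspath(base_dir)
    # abspath also normalizes, so "a/../b" style components are collapsed
    target = os.path.abspath(os.path.join(base, fname))
    try:
        return os.path.commonpath([base, target]) == base
    except ValueError:
        # raised on Win32 when the two paths are on different drives
        return False

print(is_inside('/srv/uploads', 'report.txt'))
print(is_inside('/srv/uploads', '../etc/passwd'))
```

Note that an empty fname resolves to the base directory itself and counts as "inside", so an empty-name check would still be needed separately.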


I've read through [1] and have started to glom together various
bits from that thread.  My current course of action is something like:

  import string

  # device names that Win32 reserves regardless of directory
  SACRED_WIN32_FNAMES = set(
    ['CON', 'PRN', 'CLOCK$', 'AUX', 'NUL'] +
    ['LPT%i' % i for i in range(32)] +
    ['COM%i' % i for i in range(32)]
  )

  def sanitize_filename(fname):
    # keep only characters assumed safe on both *nix and Win32
    sane = set(string.ascii_letters + string.digits + '-_.[]{}()$')
    results = ''.join(c for c in fname if c in sane)
    # might have to check sans-extension
    if results.upper() in SACRED_WIN32_FNAMES:
      results = "_" + results
    return results

but if somebody already has war-hardened code they'd be willing 
to share, I'd appreciate any thoughts.
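
As for the "sans-extension" comment in the code: Win32 treats a reserved device name followed by an extension (NUL.txt, for example) the same as the bare name, so comparing only the part before the first dot is probably the safer test. A minimal sketch under that assumption follows. The names RESERVED, SANE and sanitize_filename_stem are made up for illustration, and the reserved list here is a shortened one rather than a complete or authoritative set.

```python
import string

# Assumed reserved list for illustration: device names plus COM1-9 and LPT1-9.
RESERVED = set(['CON', 'PRN', 'AUX', 'NUL', 'CLOCK$'] +
               ['COM%i' % i for i in range(1, 10)] +
               ['LPT%i' % i for i in range(1, 10)])
SANE = set(string.ascii_letters + string.digits + '-_.[]{}()$')

def sanitize_filename_stem(fname):
    # drop every character outside the whitelist
    results = ''.join(c for c in fname if c in SANE)
    # compare only the part before the first dot, so "nul.txt" is caught too
    stem = results.split('.', 1)[0]
    if stem.upper() in RESERVED:
        results = '_' + results
    return results

for name in ['nul.txt', 'CON', 'console.txt', 'a/b:c.txt']:
    print(name, '->', sanitize_filename_stem(name))
```

This still leaves other Win32 quirks, such as trailing dots and spaces, unhandled.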

Thanks,

-tkc

[1] http://stackoverflow.com/questions/295135/turn-a-string-into-a-valid-filename-in-python






#6778

From: Jean-Paul Calderone <calderone.jeanpaul@gmail.com>
Date: 2011-05-31 20:17 -0700
Message-ID: <65ba3cf2-5163-473d-b8b3-b6321c47d6a5@dn9g2000vbb.googlegroups.com>
In reply to: #6775

On May 31, 10:17 pm, Tim Chase <python.l...@tim.thechases.com> wrote:
> Scenario: a file-name from potentially untrusted sources may have
> odd filenames that need to be sanitized for the underlying OS.
> On *nix, this generally just means "don't use '/' or \x00 in your
> string", while on Win32, there are a host of verboten characters
> and file-names.  Then there's also checking the abspath/normpath
> of the resulting name to make sure it's still in the intended folder.
>
> I've read through [1] and have started to glom together various
> bits from that thread.  My current course of action is something like
>
>   SACRED_WIN32_FNAMES = set(
>     ['CON', 'PRN', 'CLOCK$', 'AUX', 'NUL'] +
>     ['LPT%i' % i for i in range(32)] +
>     ['CON%i' % i for i in range(32)] +
>
>   def sanitize_filename(fname):
>     sane = set(string.letters + string.digits + '-_.[]{}()$')
>     results = ''.join(c for c in fname if c in sane)
>     # might have to check sans-extension
>     if results.upper() in SACRED_WIN32_FNAMES:
>       results = "_" + results
>     return results
>
> but if somebody already has war-hardened code they'd be willing
> to share, I'd appreciate any thoughts.
>

There's http://pypi.python.org/pypi/filepath/0.1 (taken from
twisted.python.filepath).

Jean-Paul
