Groups > comp.lang.python > #44974 > unrolled thread

Re: Making safe file names

Started by: Dennis Lee Bieber <wlfraed@ix.netcom.com>
First post: 2013-05-08 19:37 -0400
Last post: 2013-05-09 13:53 +1000
Articles: 11 (7 participants)

This discussion starts older than the indexed window; earlier articles aren't shown. The article labeled Started by below is the oldest one visible, not the original post.


Contents

  Re: Making safe file names Dennis Lee Bieber <wlfraed@ix.netcom.com> - 2013-05-08 19:37 -0400
    Re: Making safe file names Roy Smith <roy@panix.com> - 2013-05-08 20:16 -0400
      Re: Making safe file names Chris Angelico <rosuav@gmail.com> - 2013-05-09 10:27 +1000
      Re: Making safe file names Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-05-09 01:49 +0000
        Re: Making safe file names Roy Smith <roy@panix.com> - 2013-05-08 21:56 -0400
      Re: Making safe file names Andrew Berg <bahamutzero8825@gmail.com> - 2013-05-08 21:11 -0500
        Re: Making safe file names Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-05-09 03:08 +0000
          Re: Making safe file names Roy Smith <roy@panix.com> - 2013-05-09 08:55 -0400
            Re: Making safe file names Gregory Ewing <greg.ewing@canterbury.ac.nz> - 2013-05-10 12:04 +1200
              Re: Making safe file names Tim Chase <python.list@tim.thechases.com> - 2013-05-09 19:17 -0500
          Re: Making safe file names Chris Angelico <rosuav@gmail.com> - 2013-05-09 13:53 +1000

#44974 — Re: Making safe file names

From: Dennis Lee Bieber <wlfraed@ix.netcom.com>
Date: 2013-05-08 19:37 -0400
Subject: Re: Making safe file names
Message-ID: <mailman.1465.1368056269.3114.python-list@python.org>
On Tue, 07 May 2013 18:10:25 -0500, Andrew Berg
<bahamutzero8825@gmail.com> declaimed the following in
gmane.comp.python.general:

> None of these would work because I would have no idea which file stores data for which artist without writing code to figure it out. If I
> were to end up writing a bug that messed up a few of my cache files and noticed it with a specific artist (e.g., doing a "now playing" and
> seeing the wrong tags), I would either have to manually match up the hash or base64 encoding in order to delete just that file so that it
> gets regenerated or nuke and regenerate my entire cache.
>
	And now you've seen why music players don't show the user the
physical file name, but maintain a database mapping the internal data
(name, artist, track#, album, etc.) to whatever mangled name was needed
to satisfy the file system.
-- 
	Wulfraed                 Dennis Lee Bieber         AF6VN
        wlfraed@ix.netcom.com    HTTP://wlfraed.home.netcom.com/
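
A minimal sketch of the kind of mapping described above, assuming a SHA-1 digest of the artist name as the mangled file name and a plain dict as the "database". The names mangled_name, register and file_for are made up for illustration and do not come from any particular music player.

```python
import hashlib
import json


def mangled_name(artist):
    """Return a name built only from hex digits, safe on common file systems."""
    digest = hashlib.sha1(artist.encode("utf-8")).hexdigest()
    return digest + ".cache"


def register(index, artist):
    """Record which artist a mangled file name belongs to and return the name."""
    name = mangled_name(artist)
    index[name] = artist
    return name


def file_for(index, artist):
    """Reverse lookup: find the cache file for an artist, or None if unknown."""
    for name, known in index.items():
        if known == artist:
            return name
    return None


# Hypothetical usage: the index is the part a human (or a cleanup script) reads.
index = {}
for artist in ["Blue Öyster Cult", "Ke$ha", "ki††y c▲t"]:
    register(index, artist)

# Readable text form of the index, e.g. for writing next to the cache files.
index_text = json.dumps(index, ensure_ascii=False, indent=2)
```

With an index like this, deleting the cache file for one artist is a lookup rather than a manual hash match, which is the trade-off the quoted objection is about.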


#44982

From: Roy Smith <roy@panix.com>
Date: 2013-05-08 20:16 -0400
Message-ID: <roy-1B572B.20162508052013@news.panix.com>
In reply to: #44974
In article <mailman.1465.1368056269.3114.python-list@python.org>,
 Dennis Lee Bieber <wlfraed@ix.netcom.com> wrote:

> On Tue, 07 May 2013 18:10:25 -0500, Andrew Berg
> <bahamutzero8825@gmail.com> declaimed the following in
> gmane.comp.python.general:
> 
> > None of these would work because I would have no idea which file stores 
> > data for which artist without writing code to figure it out. If I
> > were to end up writing a bug that messed up a few of my cache files and 
> > noticed it with a specific artist (e.g., doing a "now playing" and
> > seeing the wrong tags), I would either have to manually match up the hash 
> > or base64 encoding in order to delete just that file so that it
> > gets regenerated or nuke and regenerate my entire cache.
> >
> 	And now you've seen why music players don't show the user the
> physical file name, but maintain a database mapping the internal data
> (name, artist, track#, album, etc.) to whatever mangled name was needed
> to satisfy the file system.

Yup.  At Songza, we deal with this crap every day.  It usually bites us 
the worst when trying to do keyword searches.  When somebody types in 
"Blue Oyster Cult", they really mean "Blue Oyster Cult", and our search 
results need to reflect that.  Likewise for Ke$ha, Beyonce, and I don't 
even want to think about the artist formerly known as an unpronounceable 
glyph.

Pro-tip, guys.  If you want to form a band, and expect people to be able 
to find your stuff in a search engine some day, don't play cute with 
your name.


#44984

From: Chris Angelico <rosuav@gmail.com>
Date: 2013-05-09 10:27 +1000
Message-ID: <mailman.1471.1368059242.3114.python-list@python.org>
In reply to: #44982
On Thu, May 9, 2013 at 10:16 AM, Roy Smith <roy@panix.com> wrote:
> Pro-tip, guys.  If you want to form a band, and expect people to be able
> to find your stuff in a search engine some day, don't play cute with
> your name.

It's the modern equivalent of names like Catherine Withekay.

ChrisA


#44987

From: Steven D'Aprano <steve+comp.lang.python@pearwood.info>
Date: 2013-05-09 01:49 +0000
Message-ID: <518b00a2$0$29997$c3e8da3$5496439d@news.astraweb.com>
In reply to: #44982
On Wed, 08 May 2013 20:16:25 -0400, Roy Smith wrote:

> Yup.  At Songza, we deal with this crap every day.  It usually bites us
> the worst when trying to do keyword searches.  When somebody types in
> "Blue Oyster Cult", they really mean "Blue Oyster Cult", 

Surely they really mean Blue Öyster Cult.


> and our search
> results need to reflect that.  Likewise for Ke$ha, Beyonce, and I don't
> even want to think about the artist formerly known as an unpronounceable
> glyph.

Dropped or incorrect accents are no different from any other misspelling, 
and good search engines (whether online or in a desktop application) 
should be able to deal with a tolerable number of misspellings.

Googling for "Blue Oyster Cult" brings up four of the top ten hits 
spelled correctly with the accent, "Blue Öyster Cult". Even misspelled as 
"blew oytser cult", Google does the right thing.

Even Bing manages to find Ke$ha's wikipedia page, her official website, 
youtube channel, facebook and myspace pages from the misspelling "kehsha".
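
As a rough illustration of treating dropped accents like any other misspelling, here is a sketch that folds accents with unicodedata and then does fuzzy matching with difflib. The artist list, the fold and search helpers, and the 0.6 cutoff are assumptions chosen for the example; this is not how Google, Bing or Songza actually work.

```python
import difflib
import unicodedata


def fold(text):
    """Strip combining accents (Ö -> O, é -> e) and ignore case."""
    decomposed = unicodedata.normalize("NFKD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()


# Sample data, assumed for the example.
ARTISTS = ["Blue Öyster Cult", "Ke$ha", "Beyoncé", "The The"]

# Map each folded name back to the properly spelled one.
INDEX = {fold(name): name for name in ARTISTS}


def search(query, cutoff=0.6):
    """Return artists whose folded names are similar to the folded query."""
    close = difflib.get_close_matches(fold(query), list(INDEX), n=3, cutoff=cutoff)
    return [INDEX[key] for key in close]


results_plain = search("Blue Oyster Cult")   # accent dropped
results_typo = search("blew oytser cult")    # accent dropped and misspelled
results_kesha = search("kehsha")             # symbol replaced and misspelled
```

Folding handles the missing umlaut exactly, while the similarity cutoff absorbs the remaining typos; a real search index would need something that scales better than comparing against every name.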



> Pro-tip, guys.  If you want to form a band, and expect people to be able
> to find your stuff in a search engine some day, don't play cute with
> your name.

Googling for "the the" (including quotes) brings up 145 million hits, 
nine of the first ten hits being relevant to the band. 

On the other hand, I wouldn't want to be in a band called "The Beetles".


-- 
Steven


#44989

From: Roy Smith <roy@panix.com>
Date: 2013-05-08 21:56 -0400
Message-ID: <roy-B50670.21564108052013@news.panix.com>
In reply to: #44987
In article <518b00a2$0$29997$c3e8da3$5496439d@news.astraweb.com>,
 Steven D'Aprano <steve+comp.lang.python@pearwood.info> wrote:

> > When somebody types in
> > "Blue Oyster Cult", they really mean "Blue Oyster Cult", 
> 
> Surely they really mean Blue Öyster Cult.

Yes.  The oomlaut was there when I typed it.  Who knows what happened to 
it by the time it hit the wire.


#44990

From: Andrew Berg <bahamutzero8825@gmail.com>
Date: 2013-05-08 21:11 -0500
Message-ID: <mailman.1474.1368065495.3114.python-list@python.org>
In reply to: #44982
On 2013.05.08 19:16, Roy Smith wrote:
> Yup.  At Songza, we deal with this crap every day.  It usually bites us 
> the worst when trying to do keyword searches.  When somebody types in 
> "Blue Oyster Cult", they really mean "Blue Oyster Cult", and our search 
> results need to reflect that.  Likewise for Ke$ha, Beyonce, and I don't 
> even want to think about the artist formerly known as an unpronounceable 
> glyph.
>
> Pro-tip, guys.  If you want to form a band, and expect people to be able 
> to find your stuff in a search engine some day, don't play cute with 
> your name.
It's a thing (especially in witch house) to make names with odd glyphs in order to be harder to find and be more "underground". Very silly.
Try doing searches for these artists with names like these:
http://www.last.fm/music/%E2%96%BC%E2%96%A1%E2%96%A0%E2%96%A1%E2%96%A0%E2%96%A1%E2%96%A0
http://www.last.fm/music/ki%E2%80%A0%E2%80%A0y+c%E2%96%B2t
-- 
CPython 3.3.1 | Windows NT 6.2.9200 / FreeBSD 9.1
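
For reference, the two URLs above decode as shown in this sketch, which uses urllib.parse from the standard library. It assumes that '+' stands for a space in these URLs. The safe_filename helper is a hypothetical illustration of reusing the same percent-encoding as a reversible, still partly readable file name; it does not handle names such as "." or "..", or names reserved by the operating system.

```python
from urllib.parse import quote, unquote, unquote_plus

URLS = [
    "http://www.last.fm/music/%E2%96%BC%E2%96%A1%E2%96%A0%E2%96%A1%E2%96%A0%E2%96%A1%E2%96%A0",
    "http://www.last.fm/music/ki%E2%80%A0%E2%80%A0y+c%E2%96%B2t",
]


def artist_from_url(url):
    """Decode the last path segment; '+' is assumed to mean a space."""
    return unquote_plus(url.rsplit("/", 1)[-1])


def safe_filename(artist):
    """Percent-encode everything except ASCII letters, digits and '_.-~'."""
    return quote(artist, safe="")


def restore(filename):
    """Invert safe_filename to get the original artist name back."""
    return unquote(filename)


names = [artist_from_url(url) for url in URLS]
encoded = [safe_filename(name) for name in names]
```

Unlike a hash, the encoded form keeps the plain ASCII letters visible, so a name like the second one is still recognizable in a directory listing.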


#44996

From: Steven D'Aprano <steve+comp.lang.python@pearwood.info>
Date: 2013-05-09 03:08 +0000
Message-ID: <518b133b$0$29997$c3e8da3$5496439d@news.astraweb.com>
In reply to: #44990
On Wed, 08 May 2013 21:11:28 -0500, Andrew Berg wrote:

> It's a thing (especially in witch house) to make names with odd glyphs
> in order to be harder to find and be more "underground". Very silly. Try
> doing searches for these artists with names like these:

Challenge accepted.

> http://www.last.fm/music/%E2%96%BC%E2%96%A1%E2%96%A0%E2%96%A1%E2%96%A0%E2%96%A1%E2%96%A0
> http://www.last.fm/music/ki%E2%80%A0%E2%80%A0y+c%E2%96%B2t


The second one is trivial. Googling for "kitty cat" "witch house" 
(including quotes) gives at least 3 relevant links out of the top 4 
hits. (I'm not sure about the Youtube page.) That gets you the correct 
spelling, "ki††y c△t", and googling for that brings up many more hits.

The first one is a tad trickier, since googling for "▼□■□■□■" brings up 
nothing at all, and "mourning star" doesn't give any relevant hits on the 
first page. But "mourning star" "witch house" (inc. quotes) is successful.

I suspect that the only way to be completely ungoogleable would be to 
name yourself something common, not something obscure. Say, if you called 
yourself "Hard Rock Band", and did hard rock. But then, googling for 
"Heavy Metal" alone brings up the magazine as the fourth hit, so if you 
get famous enough, even that won't work.



-- 
Steven


#45031

From: Roy Smith <roy@panix.com>
Date: 2013-05-09 08:55 -0400
Message-ID: <roy-313758.08553509052013@news.panix.com>
In reply to: #44996
In article <518b133b$0$29997$c3e8da3$5496439d@news.astraweb.com>,
 Steven D'Aprano <steve+comp.lang.python@pearwood.info> wrote:

> I suspect that the only way to be completely ungoogleable would be to 
> name yourself something common, not something obscure.

http://en.wikipedia.org/wiki/The_band


#45069

From: Gregory Ewing <greg.ewing@canterbury.ac.nz>
Date: 2013-05-10 12:04 +1200
Message-ID: <av2rtdF5igdU1@mid.individual.net>
In reply to: #45031
Roy Smith wrote:
> In article <518b133b$0$29997$c3e8da3$5496439d@news.astraweb.com>,
>  Steven D'Aprano <steve+comp.lang.python@pearwood.info> wrote:
> 
>>I suspect that the only way to be completely ungoogleable would be to 
>>name yourself something common, not something obscure.
> 
> http://en.wikipedia.org/wiki/The_band

Nope... googling for "the band" brings that up as the
very first result.

The Google knows all. You cannot escape The Google...

-- 
Greg


#45072

From: Tim Chase <python.list@tim.thechases.com>
Date: 2013-05-09 19:17 -0500
Message-ID: <mailman.1515.1368148651.3114.python-list@python.org>
In reply to: #45069
On 2013-05-10 12:04, Gregory Ewing wrote:
> Roy Smith wrote:
> > http://en.wikipedia.org/wiki/The_band
> 
> Nope... googling for "the band" brings that up as the
> very first result.
> 
> The Google knows all. You cannot escape The Google...

That does it.  I'm naming my band "Google". :-)

-tkc


#45138

From: Chris Angelico <rosuav@gmail.com>
Date: 2013-05-09 13:53 +1000
Message-ID: <mailman.1550.1368273954.3114.python-list@python.org>
In reply to: #44996
On Thu, May 9, 2013 at 1:08 PM, Steven D'Aprano
<steve+comp.lang.python@pearwood.info> wrote:
> I suspect that the only way to be completely ungoogleable would be to
> name yourself something common, not something obscure. Say, if you called
> yourself "Hard Rock Band", and did hard rock. But then, googling for
> "Heavy Metal" alone brings up the magazine as the fourth hit, so if you
> get famous enough, even that won't work.

Yeah, so why are ubergeneric domain names worth so much? Whatevs.

The best way to be findable in a web search is to have content on your
web site. Real crawlable content. I guarantee you'll be found. Even if
you're some tiny thing tucked away in a corner of teh interwebs, you
can be found.

http://www.google.com/search?q=minstrel+hall

The song is there, but so is an obscure little D&D MUD.

ChrisA
