Groups | Search | Server Info | Keyboard shortcuts | Login | Register [http] [https] [nntp] [nntps]


Groups > comp.lang.python > #47773

Re: Turnign greek-iso filenames => utf-8 iso

From Νικόλαος Κούρας <support@superhost.gr>
Newsgroups comp.lang.python
Subject Re: Turnign greek-iso filenames => utf-8 iso
Date 2013-06-12 13:40 +0300
Organization National Technical University of Athens, Greece
Message-ID <kp9j6l$2tle$1@news.ntua.gr> (permalink)
References <kp99ug$v4c$1@news.ntua.gr> <51b83b6d$0$29998$c3e8da3$5496439d@news.astraweb.com>

Show all headers | View raw


Thanks Steven , i made some alternations to the variables names and at 
the end of the way that i check a database filename against and hdd 
filename. Here is the code:

# 
=================================================================================================================
# Convert wrongly encoded filenames to utf-8
# 
=================================================================================================================
path = b'/home/nikos/public_html/data/apps/'
filenames = os.listdir( path )

utf8_filenames = []

for filename in filenames:
	# Compute 'path/to/filename'
	filename_bytes = path + filename
	encoding = guess_encoding( filename_bytes )
	
	if encoding == 'utf-8':
		# File name is valid UTF-8, so we can skip to the next file.
		utf8_filenames.append( filename_bytes )
		continue
	elif encoding is None:
		# No idea what the encoding is. Hit it with a hammer until it stops 
moving.
		filename = filename_bytes.decode( 'utf-8', 'xmlcharrefreplace' )
	else:
		filename = filename_bytes.decode( encoding )

	# Rename the file to something which ought to be UTF-8 clean.
	newname_bytes = filename.encode('utf-8')
	os.rename( filename_bytes, newname_bytes )
	utf8_filenames.append( newname_bytes )
	
	# Once we get here, the file ought to be UTF-8 clean and the Unicode 
name ought to exist:
	assert os.path.exists( newname_bytes.decode('utf-8') )


# Switch filenames from utf8 bytestrings => unicode strings
filenames = []

for utf8_filename in utf8_filenames:
	filenames.append( utf8_filename.decode('utf-8') )

# Check the presence of a database file against the dir files and delete 
record if it doesn't exist
cur.execute('''SELECT url FROM files''')
data = cur.fetchall()

for url in data:
	if url not in filenames:
		# Delete spurious
		cur.execute('''DELETE FROM files WHERE url = %s''', url )
=========================

Now 'http://superhost.gr/?page=files.py' is not erring out at all but 
also it doesn't display the big filename table for users to download.

Here is how i try to print the filenames with button for the users:

=================================================================================================================
#Display ALL files, each with its own download button# 
=================================================================================================================
print('''<body background='/data/images/star.jpg'>
		 <center><img src='/data/images/download.gif'><br><br>
		 <table border=5 cellpadding=5 bgcolor=green>
''')

try:
	cur.execute( '''SELECT * FROM files ORDER BY lastvisit DESC''' )
	data = cur.fetchall()
	
	for row in data:
		(filename, hits, host, lastvisit) = row
		lastvisit = lastvisit.strftime('%A %e %b, %H:%M')
		
		print('''
		<form method="get" action="/cgi-bin/files.py">
			<tr>
				<td> <center> <input type="submit" name="filename" value="%s"> </td>
				<td> <center> <font color=yellow size=5> %s </td>
				<td> <center> <font color=orange size=4> %s </td>
				<td> <center> <font color=silver size=4> %s </td>
			</tr>
		</form>
		''' % (filename, hits, host, lastvisit) )
	print( '''</table><br><br>''' )
except pymysql.ProgrammingError as e:
	print( repr(e) )

Back to comp.lang.python | Previous | NextPrevious in thread | Next in thread | Find similar | Unroll thread


Thread

Turnign greek-iso filenames => utf-8 iso Νικόλαος Κούρας <support@superhost.gr> - 2013-06-12 08:02 +0000
  Re: Turnign greek-iso filenames => utf-8 iso Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-06-12 08:31 +0000
    Re: Turnign greek-iso filenames => utf-8 iso Νικόλαος Κούρας <support@superhost.gr> - 2013-06-12 12:00 +0300
      Re: Turnign greek-iso filenames => utf-8 iso Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-06-12 09:17 +0000
        Re: Turnign greek-iso filenames => utf-8 iso Νικόλαος Κούρας <support@superhost.gr> - 2013-06-12 12:24 +0300
          Re: Turnign greek-iso filenames => utf-8 iso Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-06-12 09:37 +0000
            Re: Turnign greek-iso filenames => utf-8 iso Νικόλαος Κούρας <support@superhost.gr> - 2013-06-12 14:32 +0300
              Re: Turnign greek-iso filenames => utf-8 iso Νικόλαος Κούρας <support@superhost.gr> - 2013-06-12 15:42 +0300
                Re: Turnign greek-iso filenames => utf-8 iso Mark Lawrence <breamoreboy@yahoo.co.uk> - 2013-06-12 15:42 +0100
                Re: Turnign greek-iso filenames => utf-8 iso rusi <rustompmody@gmail.com> - 2013-06-12 09:14 -0700
                Re: Turnign greek-iso filenames => utf-8 iso Neil Cerutti <neilc@norwich.edu> - 2013-06-12 16:18 +0000
                Re: Turnign greek-iso filenames => utf-8 iso Νικόλαος Κούρας <support@superhost.gr> - 2013-06-12 20:16 +0300
                Re: Turnign greek-iso filenames => utf-8 iso Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-06-13 00:22 +0000
                Re: Turnign greek-iso filenames => utf-8 iso Νικόλαος Κούρας <support@superhost.gr> - 2013-06-12 20:14 +0300
                Re: Turnign greek-iso filenames => utf-8 iso Νικόλαος Κούρας <support@superhost.gr> - 2013-06-12 20:20 +0300
                Re: Turnign greek-iso filenames => utf-8 iso Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-06-13 00:20 +0000
                Re: Turnign greek-iso filenames => utf-8 iso Νικόλαος Κούρας <support@superhost.gr> - 2013-06-12 20:27 +0300
                Re: Turnign greek-iso filenames => utf-8 iso Νικόλαος Κούρας <support@superhost.gr> - 2013-06-12 22:05 +0300
    Re: Turnign greek-iso filenames => utf-8 iso Νικόλαος Κούρας <support@superhost.gr> - 2013-06-12 12:04 +0300
  Re: Turnign greek-iso filenames => utf-8 iso Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-06-12 09:12 +0000
    Re: Turnign greek-iso filenames => utf-8 iso Νικόλαος Κούρας <support@superhost.gr> - 2013-06-12 13:40 +0300
      Re: Turnign greek-iso filenames => utf-8 iso Νικόλαος Κούρας <support@superhost.gr> - 2013-06-13 09:49 +0300
        Re: Turnign greek-iso filenames => utf-8 iso Chris Angelico <rosuav@gmail.com> - 2013-06-13 17:54 +1000
          Re: Turnign greek-iso filenames => utf-8 iso Νικόλαος Κούρας <support@superhost.gr> - 2013-06-13 11:15 +0300
            Re: Turnign greek-iso filenames => utf-8 iso Chris Angelico <rosuav@gmail.com> - 2013-06-13 19:25 +1000
              Re: Turnign greek-iso filenames => utf-8 iso Νικόλαος Κούρας <support@superhost.gr> - 2013-06-13 12:43 +0300
                Re: Turnign greek-iso filenames => utf-8 iso Chris Angelico <rosuav@gmail.com> - 2013-06-14 00:05 +1000
        Re: Turnign greek-iso filenames => utf-8 iso Νικόλαος Κούρας <support@superhost.gr> - 2013-06-13 17:28 +0300
        Re: Turnign greek-iso filenames => utf-8 iso Zero Piraeus <schesis@gmail.com> - 2013-06-13 10:16 -0400
          Re: Turnign greek-iso filenames => utf-8 iso Νικόλαος Κούρας <support@superhost.gr> - 2013-06-13 19:20 +0300
            Re: Turnign greek-iso filenames => utf-8 iso Grant Edwards <invalid@invalid.invalid> - 2013-06-13 17:17 +0000
            Re: Turnign greek-iso filenames => utf-8 iso Zero Piraeus <schesis@gmail.com> - 2013-06-13 13:27 -0400
              Re: Turnign greek-iso filenames => utf-8 iso Νικόλαος Κούρας <support@superhost.gr> - 2013-06-13 20:48 +0300
                Re: Turnign greek-iso filenames => utf-8 iso Grant Edwards <invalid@invalid.invalid> - 2013-06-13 17:53 +0000
                Re: Turnign greek-iso filenames => utf-8 iso Chris Angelico <rosuav@gmail.com> - 2013-06-14 07:46 +1000
                Re: Turnign greek-iso filenames => utf-8 iso Dave Angel <davea@davea.name> - 2013-06-13 18:20 -0400
            Re: Turnign greek-iso filenames => utf-8 iso Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-06-14 03:05 +0000
          Re: Turnign greek-iso filenames => utf-8 iso Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-06-14 01:28 +0000

csiph-web