Turnign greek-iso filenames => utf-8 iso

Started by: Νικόλαος Κούρας <support@superhost.gr>
First post: 2013-06-12 08:02 +0000
Last post: 2013-06-14 01:28 +0000
Articles: 20 on this page of 38, 9 participants


Contents

  Turnign greek-iso filenames => utf-8 iso Νικόλαος Κούρας <support@superhost.gr> - 2013-06-12 08:02 +0000
    Re: Turnign greek-iso filenames => utf-8 iso Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-06-12 08:31 +0000
      Re: Turnign greek-iso filenames => utf-8 iso Νικόλαος Κούρας <support@superhost.gr> - 2013-06-12 12:00 +0300
        Re: Turnign greek-iso filenames => utf-8 iso Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-06-12 09:17 +0000
          Re: Turnign greek-iso filenames => utf-8 iso Νικόλαος Κούρας <support@superhost.gr> - 2013-06-12 12:24 +0300
            Re: Turnign greek-iso filenames => utf-8 iso Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-06-12 09:37 +0000
              Re: Turnign greek-iso filenames => utf-8 iso Νικόλαος Κούρας <support@superhost.gr> - 2013-06-12 14:32 +0300
                Re: Turnign greek-iso filenames => utf-8 iso Νικόλαος Κούρας <support@superhost.gr> - 2013-06-12 15:42 +0300
                  Re: Turnign greek-iso filenames => utf-8 iso Mark Lawrence <breamoreboy@yahoo.co.uk> - 2013-06-12 15:42 +0100
                    Re: Turnign greek-iso filenames => utf-8 iso rusi <rustompmody@gmail.com> - 2013-06-12 09:14 -0700
                    Re: Turnign greek-iso filenames => utf-8 iso Neil Cerutti <neilc@norwich.edu> - 2013-06-12 16:18 +0000
                      Re: Turnign greek-iso filenames => utf-8 iso Νικόλαος Κούρας <support@superhost.gr> - 2013-06-12 20:16 +0300
                      Re: Turnign greek-iso filenames => utf-8 iso Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-06-13 00:22 +0000
                    Re: Turnign greek-iso filenames => utf-8 iso Νικόλαος Κούρας <support@superhost.gr> - 2013-06-12 20:14 +0300
                      Re: Turnign greek-iso filenames => utf-8 iso Νικόλαος Κούρας <support@superhost.gr> - 2013-06-12 20:20 +0300
                    Re: Turnign greek-iso filenames => utf-8 iso Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-06-13 00:20 +0000
                  Re: Turnign greek-iso filenames => utf-8 iso Νικόλαος Κούρας <support@superhost.gr> - 2013-06-12 20:27 +0300
                    Re: Turnign greek-iso filenames => utf-8 iso Νικόλαος Κούρας <support@superhost.gr> - 2013-06-12 22:05 +0300
      Re: Turnign greek-iso filenames => utf-8 iso Νικόλαος Κούρας <support@superhost.gr> - 2013-06-12 12:04 +0300
    Re: Turnign greek-iso filenames => utf-8 iso Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-06-12 09:12 +0000
      Re: Turnign greek-iso filenames => utf-8 iso Νικόλαος Κούρας <support@superhost.gr> - 2013-06-12 13:40 +0300
        Re: Turnign greek-iso filenames => utf-8 iso Νικόλαος Κούρας <support@superhost.gr> - 2013-06-13 09:49 +0300
          Re: Turnign greek-iso filenames => utf-8 iso Chris Angelico <rosuav@gmail.com> - 2013-06-13 17:54 +1000
            Re: Turnign greek-iso filenames => utf-8 iso Νικόλαος Κούρας <support@superhost.gr> - 2013-06-13 11:15 +0300
              Re: Turnign greek-iso filenames => utf-8 iso Chris Angelico <rosuav@gmail.com> - 2013-06-13 19:25 +1000
                Re: Turnign greek-iso filenames => utf-8 iso Νικόλαος Κούρας <support@superhost.gr> - 2013-06-13 12:43 +0300
                  Re: Turnign greek-iso filenames => utf-8 iso Chris Angelico <rosuav@gmail.com> - 2013-06-14 00:05 +1000
          Re: Turnign greek-iso filenames => utf-8 iso Νικόλαος Κούρας <support@superhost.gr> - 2013-06-13 17:28 +0300
          Re: Turnign greek-iso filenames => utf-8 iso Zero Piraeus <schesis@gmail.com> - 2013-06-13 10:16 -0400
            Re: Turnign greek-iso filenames => utf-8 iso Νικόλαος Κούρας <support@superhost.gr> - 2013-06-13 19:20 +0300
              Re: Turnign greek-iso filenames => utf-8 iso Grant Edwards <invalid@invalid.invalid> - 2013-06-13 17:17 +0000
              Re: Turnign greek-iso filenames => utf-8 iso Zero Piraeus <schesis@gmail.com> - 2013-06-13 13:27 -0400
                Re: Turnign greek-iso filenames => utf-8 iso Νικόλαος Κούρας <support@superhost.gr> - 2013-06-13 20:48 +0300
                  Re: Turnign greek-iso filenames => utf-8 iso Grant Edwards <invalid@invalid.invalid> - 2013-06-13 17:53 +0000
                  Re: Turnign greek-iso filenames => utf-8 iso Chris Angelico <rosuav@gmail.com> - 2013-06-14 07:46 +1000
                  Re: Turnign greek-iso filenames => utf-8 iso Dave Angel <davea@davea.name> - 2013-06-13 18:20 -0400
              Re: Turnign greek-iso filenames => utf-8 iso Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-06-14 03:05 +0000
            Re: Turnign greek-iso filenames => utf-8 iso Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-06-14 01:28 +0000


#47751 — Turnign greek-iso filenames => utf-8 iso

From: Νικόλαος Κούρας <support@superhost.gr>
Date: 2013-06-12 08:02 +0000
Subject: Turnign greek-iso filenames => utf-8 iso
Message-ID: <kp99ug$v4c$1@news.ntua.gr>
#========================================================
import os

# Collect directory and its filenames as bytes
path = b'/home/nikos/public_html/data/apps/'
files = os.listdir( path )

for filename in files:
	# Compute 'path/to/filename'
	filepath_bytes = path + filename

	# Try the most likely encodings in order; latin-1 never fails, so it is last
	for encoding in ('utf-8', 'iso-8859-7', 'latin-1'):
		try:
			filepath = filepath_bytes.decode( encoding )
		except UnicodeDecodeError:
			continue

		# Rename to something valid in UTF-8
		if encoding != 'utf-8':
			os.rename( filepath_bytes, filepath.encode('utf-8') )

		assert os.path.exists( filepath.encode('utf-8') )
		break
	else:
		# This only runs if we never reached the break
		raise ValueError( 'unable to clean filename %r' % filepath_bytes )


# Collect filenames of the path dir as strings
filenames = os.listdir( '/home/nikos/public_html/data/apps/' )

# Build a set of 'path/to/filename' based on the objects of path dir
filepaths = set()
for filename in filenames:
	filepaths.add( filename )

==================
# Load'em
for filename in filenames:
	try:
		# Check the presence of a file against the database and insert if it doesn't exist
		cur.execute('''SELECT url FROM files WHERE url = %s''', filename )
		data = cur.fetchone()

====================
[Wed Jun 12 10:56:56 2013] [error] [client 79.103.41.173] Traceback (most recent call last):, referer: http://superhost.gr/
[Wed Jun 12 10:56:56 2013] [error] [client 79.103.41.173]   File "/home/nikos/public_html/cgi-bin/files.py", line 102, in <module>, referer: http://superhost.gr/
[Wed Jun 12 10:56:56 2013] [error] [client 79.103.41.173]     print( filename ), referer: http://superhost.gr/
[Wed Jun 12 10:56:56 2013] [error] [client 79.103.41.173]   File "/usr/local/lib/python3.3/codecs.py", line 355, in write, referer: http://superhost.gr/
[Wed Jun 12 10:56:56 2013] [error] [client 79.103.41.173]     data, consumed = self.encode(object, self.errors), referer: http://superhost.gr/
[Wed Jun 12 10:56:56 2013] [error] [client 79.103.41.173] UnicodeEncodeError: 'utf-8' codec can't encode character '\\udcce' in position 0: surrogates not allowed, referer: http://superhost.gr/
=====================

I tried inserting

print( filename )
sys.exit(0)

just before the execute call, and the output is just Pacman.exe, as seen at

http://superhost.gr/?page=files.py

It seems the encoding procedure successfully converted all the filenames
from Greek ISO to UTF-8 without failing, so why would there still be
encoding issues when it comes to the execute call?
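
One possible reading of the traceback, offered as a sketch rather than a confirmed diagnosis of this server: when os.listdir() is given a str path, Python 3 decodes each name with the filesystem encoding and the surrogateescape error handler, so any byte it cannot decode comes back as a lone surrogate such as '\udcce'. Assuming the CGI process runs under an ASCII (C/POSIX) locale, a name that is already valid UTF-8 on disk would still arrive this way, and print() to a UTF-8 stream would then fail exactly as in the log. The filename below is an illustrative stand-in, not one taken from the server.

```python
# Bytes of a Greek filename that is already valid UTF-8 on disk (illustrative).
raw = 'Ευχή.mp3'.encode('utf-8')

# What os.listdir('<str path>') would hand back if the filesystem encoding
# were ASCII: each undecodable byte is carried through as a lone surrogate.
as_listed = raw.decode('ascii', 'surrogateescape')
print(ascii(as_listed))

# Encoding that string as UTF-8 (which print() does on a UTF-8 stream)
# raises the same kind of error that appears in the Apache log.
try:
    as_listed.encode('utf-8')
except UnicodeEncodeError as exc:
    error_reason = exc.reason
    print('UnicodeEncodeError:', error_reason)
```

If this is what is happening, the rename loop did its job and the remaining problem is how the names are decoded when they are listed, not how they are stored.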


#47753

From: Steven D'Aprano <steve+comp.lang.python@pearwood.info>
Date: 2013-06-12 08:31 +0000
Message-ID: <51b831f2$0$29998$c3e8da3$5496439d@news.astraweb.com>
In reply to: #47751
On Wed, 12 Jun 2013 08:02:24 +0000, Νικόλαος Κούρας wrote:

> I tried inserting
> print( filename )
> sys.exit(0)

That's not very useful. That will just print ONE file name, then stop. 
You have how many files in there? Two? Twenty? What if the problem does 
not lie with the first one?

> just before the execute call, and the output is just Pacman.exe, as
> seen at
> 
> http://superhost.gr/?page=files.py

Wrong. The output is:

Internal Server Error

The server encountered an internal error or misconfiguration and was 
unable to complete your request.  ...


> It seems the encoding procedure successfully converted all the filenames
> from Greek ISO to UTF-8 without failing, so why would there still be
> encoding issues when it comes to the execute call?

Because the problems are unrelated. Just because you fix one bug, doesn't 
mean all the other bugs magically disappear.



-- 
Steven


#47758

From: Νικόλαος Κούρας <support@superhost.gr>
Date: 2013-06-12 12:00 +0300
Message-ID: <kp9dbn$1lop$1@news.ntua.gr>
In reply to: #47753
On 12/6/2013 11:31 πμ, Steven D'Aprano wrote:
> On Wed, 12 Jun 2013 08:02:24 +0000, Νικόλαος Κούρας wrote:
>
>> i tried to insert
>> print( filename )
>> sys.exit(0)
>
> That's not very useful. That will just print ONE file name, then stop.
> You have how many files in there? Two? Twenty? What if the problem does
> not lie with the first one?
>
>> just before the execute
>> and the output is just Pacman.exe as seen in
>>
>> http://superhost.gr/?page=files.py
>
> Wrong. The output is:
>
> Internal Server Error

No, it does not. It loads properly, and if you visit it again you will
see all the files being displayed, since I now use:

print( filenames )
sys.exit(0)

The Greek ones are displayed in Chrome as
'\udcce\udc95\udccf\udc85\udccf\udc87\udcce\udcae \udccf\udc84\udcce\udcbf\udccf\udc85 \udcce\udc99\udcce\udcb7\udccf\udc83\udcce\udcbf\udccf\udc8d.mp3'

I don't know why, since the procedure above was supposed to turn them
into UTF-8.

ls -l apps via PuTTY, though, displays all filenames correctly.

===============
# Collect filenames of the path dir as strings
filenames = os.listdir( '/home/nikos/public_html/data/apps/' )

# Build a set of 'path/to/filename' based on the objects of path dir
filepaths = set()
for filename in filenames:
	filepaths.add( filename )

# Load'em
for filename in filenames:
	try:
		# Check the presence of a file against the database and insert if it doesn't exist
		print( filenames )
		sys.exit(0)
		cur.execute('''SELECT url FROM files WHERE url = %s''', filename )
		data = cur.fetchone()
===============================

Only two English filenames, pacman.exe and one other, were inserted
into the database before the script broke at the execute statement.
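
The escaped string shown in the browser can be mapped back to the bytes that are on disk, which is a quick way to check whether the names really are UTF-8. This is a minimal sketch under the assumption that the escapes were produced by the surrogateescape handler; recover_name is a hypothetical helper, not something from the script.

```python
def recover_name(listed_name, real_encoding='utf-8'):
    """Undo surrogateescape: turn the lone surrogates back into the original
    bytes, then decode those bytes with the encoding the names are assumed
    to be stored in (UTF-8 by default)."""
    raw = listed_name.encode('ascii', 'surrogateescape')
    return raw.decode(real_encoding)

# The string exactly as it was displayed in the browser.
shown_in_browser = ('\udcce\udc95\udccf\udc85\udccf\udc87\udcce\udcae '
                    '\udccf\udc84\udcce\udcbf\udccf\udc85 '
                    '\udcce\udc99\udcce\udcb7\udccf\udc83\udcce\udcbf\udccf\udc8d.mp3')

# Decodes cleanly as UTF-8, so the bytes on disk are valid UTF-8.
recovered = recover_name(shown_in_browser)
```

That the string decodes without error suggests the files were renamed correctly and that only the listing step is decoding them with the wrong encoding, which would also explain why ls shows them properly.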


#47764

From: Steven D'Aprano <steve+comp.lang.python@pearwood.info>
Date: 2013-06-12 09:17 +0000
Message-ID: <51b83cab$0$29998$c3e8da3$5496439d@news.astraweb.com>
In reply to: #47758
On Wed, 12 Jun 2013 12:00:38 +0300, Νικόλαος Κούρας wrote:

> On 12/6/2013 11:31 πμ, Steven D'Aprano wrote:
>> On Wed, 12 Jun 2013 08:02:24 +0000, Νικόλαος Κούρας wrote:

>>> and the output is just Pacman.exe as seen in
>>>
>>> http://superhost.gr/?page=files.py
>>
>> Wrong. The output is:
>>
>> Internal Server Error
> 
> print( filenames )
> sys.exit(0)
> 
> 
> No, it does not. It loads properly, and if you visit it again you will
> see all the files being displayed, since I now use:

Wrong again. It still gives Internal Error. I have just revisited the 
page three times now, and every time it still fails.

I am not lying, I am not making this up. Here is the text:



Internal Server Error

The server encountered an internal error or misconfiguration and was 
unable to complete your request.

Please contact the server administrator, support@superhost.gr and inform 
them of the time the error occurred, and anything you might have done 
that may have caused the error.

More information about this error may be available in the server error 
log.

Additionally, a 404 Not Found error was encountered while trying to use 
an ErrorDocument to handle the request.
Apache/2.2.24 (Unix) mod_ssl/2.2.24 OpenSSL/1.0.0-fips 
mod_auth_passthrough/2.1 mod_bwlimited/1.4 FrontPage/5.0.2.2635 Server at 
superhost.gr Port 80





-- 
Steven


#47766

From: Νικόλαος Κούρας <support@superhost.gr>
Date: 2013-06-12 12:24 +0300
Message-ID: <kp9eoa$1ssu$1@news.ntua.gr>
In reply to: #47764
On 12/6/2013 12:17 μμ, Steven D'Aprano wrote:
> On Wed, 12 Jun 2013 12:00:38 +0300, Νικόλαος Κούρας wrote:
>
>> On 12/6/2013 11:31 πμ, Steven D'Aprano wrote:
>>> On Wed, 12 Jun 2013 08:02:24 +0000, Νικόλαος Κούρας wrote:
>
>>>> and the output is just Pacman.exe as seen in
>>>>
>>>> http://superhost.gr/?page=files.py
>>>
>>> Wrong. The output is:
>>>
>>> Internal Server Error
>>
>> print( filenames )
>> sys.exit(0)
>>
>>
>> No it dosnt not, it loads properly and if you visit it again you will
>> see all the files being displayed since now i:
>
> Wrong again. It still gives Internal Error. I have just revisited the
> page three times now, and every time it still fails.
>
> I am not lying, I am not making this up. Here is the text:
>
>
>
> Internal Server Error
>
> The server encountered an internal error or misconfiguration and was
> unable to complete your request.
>
> Please contact the server administrator, support@superhost.gr and inform
> them of the time the error occurred, and anything you might have done
> that may have caused the error.
>
> More information about this error may be available in the server error
> log.
>
> Additionally, a 404 Not Found error was encountered while trying to use
> an ErrorDocument to handle the request.
> Apache/2.2.24 (Unix) mod_ssl/2.2.24 OpenSSL/1.0.0-fips
> mod_auth_passthrough/2.1 mod_bwlimited/1.4 FrontPage/5.0.2.2635 Server at
> superhost.gr Port 80

I know that you are not lying, and I think I know why *you*
specifically cannot load my website.

I think your IP address does not have a PTR entry (reverse DNS entry),
so when you try to load superhost.gr this line fails for you, and hence
it errors out with an Internal Server Error:

host = socket.gethostbyaddr( os.environ['REMOTE_ADDR'] )[0]

I just changed the line above to the following, to cope with missing PTRs:

host = socket.gethostbyaddr( os.environ['REMOTE_ADDR'] )[0] or 'UnResolved'

Please try now.
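
One caveat about the changed line: socket.gethostbyaddr() reports a failed lookup by raising socket.herror (or socket.gaierror) rather than by returning an empty value, so the "or 'UnResolved'" part would not be reached for an address without a PTR record. A hedged sketch of a fallback that does catch the failure; resolve_host and its lookup parameter are hypothetical names introduced here for illustration.

```python
import socket

def resolve_host(addr, lookup=socket.gethostbyaddr, default='UnResolved'):
    """Return the reverse-DNS name for addr, or default if the lookup fails.

    lookup is injectable only so the function can be exercised without DNS.
    """
    try:
        return lookup(addr)[0] or default
    except OSError:
        # socket.herror and socket.gaierror are both subclasses of OSError.
        return default
```

In the script this would be called as host = resolve_host( os.environ['REMOTE_ADDR'] ).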


#47768

From: Steven D'Aprano <steve+comp.lang.python@pearwood.info>
Date: 2013-06-12 09:37 +0000
Message-ID: <51b84151$0$29872$c3e8da3$5496439d@news.astraweb.com>
In reply to: #47766
On Wed, 12 Jun 2013 12:24:24 +0300, Νικόλαος Κούρας wrote:

> On 12/6/2013 12:17 μμ, Steven D'Aprano wrote:
>> On Wed, 12 Jun 2013 12:00:38 +0300, Νικόλαος Κούρας wrote:
>>
>>> On 12/6/2013 11:31 πμ, Steven D'Aprano wrote:
>>>> On Wed, 12 Jun 2013 08:02:24 +0000, Νικόλαος Κούρας wrote:
>>
>>>>> and the output is just Pacman.exe as seen in
>>>>>
>>>>> http://superhost.gr/?page=files.py

Νικόλαος, look at the URL you have given me. It still fails. I have 
tested it repeatedly.[1]


Now look at this URL:

http://superhost.gr/data/apps/

Notice that it is a different URL? That one works and shows the file 
listing.

Also, you need to include an encoding line in the page, otherwise people 
viewing it will see mojibake. You need a line like:

<meta charset="UTF-8" />

in the generated HTML.
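
As a rough sketch of what that could look like in a CGI script, the encoding can be declared both in the HTTP header and in the page itself. build_page is a hypothetical helper and the body text is illustrative; the real script builds its HTML differently.

```python
def build_page(body_html):
    """Return a complete CGI response: header, blank line, then the HTML."""
    header = 'Content-Type: text/html; charset=utf-8\r\n\r\n'
    html = ('<!DOCTYPE html>\n<html>\n<head>\n'
            '<meta charset="UTF-8" />\n'
            '</head>\n<body>\n' + body_html + '\n</body>\n</html>\n')
    return header + html

response = build_page('<p>Ευχή.mp3</p>')

# In the CGI script the bytes would be written explicitly as UTF-8, e.g.:
#   sys.stdout.buffer.write(response.encode('utf-8'))
```

Writing encoded bytes to sys.stdout.buffer avoids depending on whatever encoding the web server's environment gives sys.stdout.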



[1] Correction. While I was typing this, it came good, for about 20 
seconds, and displayed a hideously ugly background pattern and a cute 
smiling face waving, and then broke again.



-- 
Steven


#47784

From: Νικόλαος Κούρας <support@superhost.gr>
Date: 2013-06-12 14:32 +0300
Message-ID: <kp9m8c$dic$1@news.ntua.gr>
In reply to: #47768
On 12/6/2013 12:37 μμ, Steven D'Aprano wrote:
> On Wed, 12 Jun 2013 12:24:24 +0300, Νικόλαος Κούρας wrote:

>
> [1] Correction. While I was typing this, it came good, for about 20
> seconds, and displayed a hideously ugly background pattern and a cute
> smiling face waving, and then broke again.


Ah, sorry Steven, I made the change

host = socket.gethostbyaddr( os.environ['REMOTE_ADDR'] )[0] or 'UnResolved'

to metrites.py instead of files.py.

Now I have made both changes.

You can see the webpage now; yes, the one with the cute smiley face.

Behind that we should be seeing all the files in a table-like format
for users to download; instead, not a single file is being displayed.

Here is the print process. Actually, here is everything I have so far
after the modifications, for you to take an overall look.

# =================================================================================================================
# Convert wrongly encoded filenames to utf-8
# =================================================================================================================
path = b'/home/nikos/public_html/data/apps/'
filenames = os.listdir( path )

utf8_filenames = []

for filename in filenames:
	# Compute 'path/to/filename'
	filename_bytes = path + filename
	encoding = guess_encoding( filename_bytes )
	
	if encoding == 'utf-8':
		# File name is valid UTF-8, so we can skip to the next file.
		utf8_filenames.append( filename_bytes )
		continue
	elif encoding is None:
		# No idea what the encoding is. Hit it with a hammer until it stops moving.
		filename = filename_bytes.decode( 'utf-8', 'xmlcharrefreplace' )
	else:
		filename = filename_bytes.decode( encoding )

	# Rename the file to something which ought to be UTF-8 clean.
	newname_bytes = filename.encode('utf-8')
	os.rename( filename_bytes, newname_bytes )
	utf8_filenames.append( newname_bytes )
	
	# Once we get here, the file ought to be UTF-8 clean and the Unicode name ought to exist:
	assert os.path.exists( newname_bytes.decode('utf-8') )


# Switch filenames from utf8 bytestrings => unicode strings
filenames = []

for utf8_filename in utf8_filenames:
	filenames.append( utf8_filename.decode('utf-8') )

# Check the presence of a database file against the dir files and delete record if it doesn't exist
cur.execute('''SELECT url FROM files''')
data = cur.fetchall()

for url in data:
	if url not in filenames:
		# Delete spurious
		cur.execute('''DELETE FROM files WHERE url = %s''', url )


# =================================================================================================================
# Display ALL files, each with its own download button
# =================================================================================================================
print('''<body background='/data/images/star.jpg'>
		 <center><img src='/data/images/download.gif'><br><br>
		 <table border=5 cellpadding=5 bgcolor=green>
''')

try:
	cur.execute( '''SELECT * FROM files ORDER BY lastvisit DESC''' )
	data = cur.fetchall()
	
	for row in data:
		(filename, hits, host, lastvisit) = row
		lastvisit = lastvisit.strftime('%A %e %b, %H:%M')
		
		print('''
		<form method="get" action="/cgi-bin/files.py">
			<tr>
				<td> <center> <input type="submit" name="filename" value="%s"> </td>
				<td> <center> <font color=yellow size=5> %s </td>
				<td> <center> <font color=orange size=4> %s </td>
				<td> <center> <font color=silver size=4> %s </td>
			</tr>
		</form>
		''' % (filename, hits, host, lastvisit) )
	print( '''</table><br><br>''' )
except pymysql.ProgrammingError as e:
	print( repr(e) )
	
sys.exit(0)

==========
I'm happy that at least it does not error out!
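
A possible reason the table comes out empty, offered as a reading of the code above rather than something confirmed in the thread: with pymysql's default cursor, fetchall() returns each row as a tuple, so url in the cleanup loop is a 1-tuple such as ('Pacman.exe',) and is never found in a list of strings. In addition, filenames holds full paths (path + filename), not bare names. Either way, every record would be deleted before the display query runs. The sketch below assumes the url column stores bare filenames; find_stale_urls and the sample rows are hypothetical.

```python
import os

def find_stale_urls(rows, filepaths):
    """Return the urls from rows (1-tuples, as fetchall() yields them)
    that have no matching file among filepaths."""
    on_disk = {os.path.basename(p) for p in filepaths}
    return [url for (url,) in rows if url not in on_disk]

rows = [('Pacman.exe',), ('deleted.zip',)]
filepaths = ['/home/nikos/public_html/data/apps/Pacman.exe']

# The comparison used in the post: a tuple is never equal to a string,
# so this is True even though the file exists on disk.
always_missing = rows[0] not in filepaths

# Unpacking the row and comparing bare names flags only the missing file.
stale = find_stale_urls(rows, filepaths)
```

Each stale url could then be deleted with cur.execute('''DELETE FROM files WHERE url = %s''', (url,) ).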


#47796

From: Νικόλαος Κούρας <support@superhost.gr>
Date: 2013-06-12 15:42 +0300
Message-ID: <kp9qck$pp8$2@news.ntua.gr>
In reply to: #47784

Can you please shed some light on what the problem might be, and why it
is not displaying the filenames, since they are UTF-8 encoded correctly?

Is there something you want me to try?


#47809

From: Mark Lawrence <breamoreboy@yahoo.co.uk>
Date: 2013-06-12 15:42 +0100
Message-ID: <mailman.3110.1371048306.3114.python-list@python.org>
In reply to: #47796
On 12/06/2013 13:42, Νικόλαος Κούρας wrote:
>
> Something you want me to try?

I'd suggest suicide but that would no doubt start another stream of 
questions along the lines of "How do I do it?".

-- 
"Steve is going for the pink ball - and for those of you who are 
watching in black and white, the pink is next to the green." Snooker 
commentator 'Whispering' Ted Lowe.

Mark Lawrence


#47813

From: rusi <rustompmody@gmail.com>
Date: 2013-06-12 09:14 -0700
Message-ID: <8eb29ebf-e832-4bfa-9e07-3c0bf6c36f8f@a9g2000pbq.googlegroups.com>
In reply to: #47809
On Jun 12, 7:42 pm, Mark Lawrence <breamore...@yahoo.co.uk> wrote:
> On 12/06/2013 13:42, Νικόλαος Κούρας wrote:
>
>
>
> > Something you want me to try?
>
> I'd suggest suicide but that would no doubt start another stream of
> questions along the lines of "How do I do it?".

There's a saying in some Indian languages (unfortunately sounds tame
when I translate it):
Counterfeit coin refuses to be lost…


#47814

From: Neil Cerutti <neilc@norwich.edu>
Date: 2013-06-12 16:18 +0000
Message-ID: <b1rlauFp6jcU1@mid.individual.net>
In reply to: #47809
On 2013-06-12, Mark Lawrence <breamoreboy@yahoo.co.uk> wrote:
> On 12/06/2013 13:42, ???????????????? ???????????? wrote:
>>
>> Something you want me to try?
>
> I'd suggest suicide but that would no doubt start another
> stream of questions along the lines of "How do I do it?".

hi. I loopet rope aroung and jumped, but bruise happen and erron
do the death.

Pls heelp!

Nikos


#47821

From: Νικόλαος Κούρας <support@superhost.gr>
Date: 2013-06-12 20:16 +0300
Message-ID: <kpaadp$275g$3@news.ntua.gr>
In reply to: #47814
On 12/6/2013 7:18 μμ, Neil Cerutti wrote:
> On 2013-06-12, Mark Lawrence <breamoreboy@yahoo.co.uk> wrote:
>> On 12/06/2013 13:42, ???????????????? ???????????? wrote:
>>>
>>> Something you want me to try?
>>
>> I'd suggest suicide but that would no doubt start another
>> stream of questions along the lines of "How do I do it?".
>
> hi. I loopet rope aroung and jumped, but bruise happen and erron
> do the death.
>
> Pls heelp!
>
> Nikos
>
looooool :-)

Guys, I'm really not trolling here; I just learn along the way as errors
appear. This is the last question I need answered to get my website
working. Every other script I have works properly; it's just this
files.py error that needs fixing, and then I'll stop asking, at least
for a while :)


#47869

From: Steven D'Aprano <steve+comp.lang.python@pearwood.info>
Date: 2013-06-13 00:22 +0000
Message-ID: <51b910bb$0$29997$c3e8da3$5496439d@news.astraweb.com>
In reply to: #47814
On Wed, 12 Jun 2013 16:18:38 +0000, Neil Cerutti wrote:

> On 2013-06-12, Mark Lawrence <breamoreboy@yahoo.co.uk> wrote:
>> On 12/06/2013 13:42, ???????????????? ???????????? wrote:
>>>
>>> Something you want me to try?
>>
>> I'd suggest suicide but that would no doubt start another stream of
>> questions along the lines of "How do I do it?".
> 
> hi. I loopet rope aroung and jumped, but bruise happen and erron do the
> death.
> 
> Pls heelp!
> 
> Nikos


Oh god I shouldn't laugh but that is funny.

Still not cool though. Please stop.


-- 
Steven


#47820

From: Νικόλαος Κούρας <support@superhost.gr>
Date: 2013-06-12 20:14 +0300
Message-ID: <kpaa9b$275g$2@news.ntua.gr>
In reply to: #47809
On 12/6/2013 5:42 μμ, Mark Lawrence wrote:
> On 12/06/2013 13:42, Νικόλαος Κούρας wrote:
>>
>> Something you want me to try?
>
> I'd suggest suicide but that would no doubt start another stream of
> questions along the lines of "How do I do it?".

Okay, that was indeed very funny; I even laughed at my own expense :)


#47822

From: Νικόλαος Κούρας <support@superhost.gr>
Date: 2013-06-12 20:20 +0300
Message-ID: <kpaaks$275g$4@news.ntua.gr>
In reply to: #47820
On 12/6/2013 8:14 μμ, Νικόλαος Κούρας wrote:
> On 12/6/2013 5:42 μμ, Mark Lawrence wrote:
>> On 12/06/2013 13:42, Νικόλαος Κούρας wrote:
>>>
>>> Something you want me to try?
>>
>> I'd suggest suicide but that would no doubt start another stream of
>> questions along the lines of "How do I do it?".
>
> Okey that was indeed very finny, i even laughed at my own expense :)
>

Hahahaha, damn life, there is a how-to for everything in this world,
even when it comes to leaving it for the afterlife :)


#47868

From: Steven D'Aprano <steve+comp.lang.python@pearwood.info>
Date: 2013-06-13 00:20 +0000
Message-ID: <51b91038$0$29997$c3e8da3$5496439d@news.astraweb.com>
In reply to: #47809
On Wed, 12 Jun 2013 15:42:07 +0100, Mark Lawrence wrote:

> On 12/06/2013 13:42, Νικόλαος Κούρας wrote:
>>
>> Something you want me to try?
> 
> I'd suggest suicide 

Mark, not cool. Seriously not cool.



-- 
Steven


#47824

From: Νικόλαος Κούρας <support@superhost.gr>
Date: 2013-06-12 20:27 +0300
Message-ID: <51B8AF7C.4080904@superhost.gr>
In reply to: #47796
On 12/6/2013 3:42 μμ, Νικόλαος Κούρας wrote:

>> =================================================================================================================
>> # Convert wrongly encoded filenames to utf-8
>> #
>> =================================================================================================================
>>
>>
>> path = b'/home/nikos/public_html/data/apps/'
>> filenames = os.listdir( path )
>>
>> utf8_filenames = []
>>
>> for filename in filenames:
>>      # Compute 'path/to/filename'
>>      filename_bytes = path + filename
>>      encoding = guess_encoding( filename_bytes )
>>
>>      if encoding == 'utf-8':
>>          # File name is valid UTF-8, so we can skip to the next file.
>>          utf8_filenames.append( filename_bytes )
>>          continue
>>      elif encoding is None:
>>          # No idea what the encoding is. Hit it with a hammer until it
>> stops moving.
>>          filename = filename_bytes.decode( 'utf-8', 'xmlcharrefreplace' )
>>      else:
>>          filename = filename_bytes.decode( encoding )
>>
>>      # Rename the file to something which ought to be UTF-8 clean.
>>      newname_bytes = filename.encode('utf-8')
>>      os.rename( filename_bytes, newname_bytes )
>>      utf8_filenames.append( newname_bytes )
>>
>>      # Once we get here, the file ought to be UTF-8 clean and the
>> Unicode name ought to exist:
>>      assert os.path.exists( newname_bytes.decode('utf-8') )
>>
>>
>> # Switch filenames from utf8 bytestrings => unicode strings
>> filenames = []
>>
>> for utf8_filename in utf8_filenames:
>>      filenames.append( utf8_filename.decode('utf-8') )
>>
>> # Check the presence of a database file against the dir files and delete
>> record if it doesn't exist
>> cur.execute('''SELECT url FROM files''')
>> data = cur.fetchall()
>>
>> for url in data:
>>      if url not in filenames:
>>          # Delete spurious
>>          cur.execute('''DELETE FROM files WHERE url = %s''', url )
>>
>>
>> #
>> =================================================================================================================
>>
>>
>> # Display ALL files, each with its own download button
>> #
>> =================================================================================================================
>>
>>
>> print('''<body background='/data/images/star.jpg'>
>>           <center><img src='/data/images/download.gif'><br><br>
>>           <table border=5 cellpadding=5 bgcolor=green>
>> ''')
>>
>> try:
>>      cur.execute( '''SELECT * FROM files ORDER BY lastvisit DESC''' )
>>      data = cur.fetchall()
>>
>>      for row in data:
>>          (filename, hits, host, lastvisit) = row
>>          lastvisit = lastvisit.strftime('%A %e %b, %H:%M')
>>
>>          print('''
>>          <form method="get" action="/cgi-bin/files.py">
>>              <tr>
>>                  <td> <center> <input type="submit" name="filename" value="%s"> </td>
>>                  <td> <center> <font color=yellow size=5> %s </td>
>>                  <td> <center> <font color=orange size=4> %s </td>
>>                  <td> <center> <font color=silver size=4> %s </td>
>>              </tr>
>>          </form>
>>          ''' % (filename, hits, host, lastvisit) )
>>      print( '''</table><br><br>''' )
>> except pymysql.ProgrammingError as e:
>>      print( repr(e) )
>>
>> sys.exit(0)
>>
>> ==========

Please help: the script does not error out, but it does not print the
files along with the download buttons for users to download either.

What else should I try in order to find out where the logical error (if
it is one) might be?

Once this issue is corrected, this is going to be my last question.
Every other script is fixed; it is just this issue, which has dragged on
for almost 15 days now.



#47831

From: Νικόλαος Κούρας <support@superhost.gr>
Date: 2013-06-12 22:05 +0300
Message-ID: <kpagq6$qvj$1@news.ntua.gr>
In reply to: #47824

On 12/6/2013 8:27 PM, Νικόλαος Κούρας wrote:
> On 12/6/2013 3:42 PM, Νικόλαος Κούρας wrote:
>
>>> [snip code]
>
> Please help: the script does not error out, but it does not print the
> files along with the download buttons for users to download either.
>
> What else should I try in order to find out where the logical error (if
> it is one) might be?
>
> Once this issue is corrected, this is going to be my last question.
> Every other script is fixed; it is just this issue, which has dragged on
> for almost 15 days now.

Any ideas please on how to check this?



#47760

From: Νικόλαος Κούρας <support@superhost.gr>
Date: 2013-06-12 12:04 +0300
Message-ID: <kp9din$1mvp$1@news.ntua.gr>
In reply to: #47753

root@nikos [/home/nikos/www/data/apps]# ls -l
total 412788
drwxr-xr-x 2 nikos nikos     4096 Jun 12 12:03 ./
drwxr-xr-x 6 nikos nikos     4096 May 26 21:13 ../
-rwxr-xr-x 1 nikos nikos 13157283 Mar 17 12:57 100\ Mythoi\ tou\ Aiswpou.pdf*
-rwxr-xr-x 1 nikos nikos 29524686 Mar 11 18:17 Anekdotologio.exe*
-rw-r--r-- 1 nikos nikos 42413964 Jun  2 20:29 Battleship.exe
-rw-r--r-- 1 nikos nikos 51819750 Jun  2 20:04 Luxor\ Evolved.exe
-rw-r--r-- 1 nikos nikos 60571648 Jun  2 14:59 Monopoly.exe
-rwxr-xr-x 1 nikos nikos  1788164 Mar 14 11:31 Online\ Movie\ Player.zip*
-rw-r--r-- 1 nikos nikos  5277287 Jun  1 18:35 O\ Nomos\ tou\ Merfy\ v1-2-3.zip
-rwxr-xr-x 1 nikos nikos 16383001 Jun 22  2010 Orthodoxo\ Imerologio.exe*
-rw-r--r-- 1 nikos nikos  6084806 Jun  1 18:22 Pac-Man.exe
-rw-r--r-- 1 nikos nikos 45297713 Jun 10 12:38 Raptor\ Chess.exe
-rw-r--r-- 1 nikos nikos 25476584 Jun  2 19:50 Scrabble.exe
-rwxr-xr-x 1 nikos nikos 49141166 Mar 17 12:48 To\ 1o\ mou\ vivlio\ gia\ to\ skaki.pdf*
-rwxr-xr-x 1 nikos nikos  3298310 Mar 17 12:45 Vivlos\ gia\ Atheofovous.pdf*
-rw-r--r-- 1 nikos nikos  1764864 May 29 21:50 V-Radio\ v2.4.msi
-rw-r--r-- 1 nikos nikos  3511233 Jun  4 14:11 Ευχή\ του\ Ιησού.mp3
-rwxr-xr-x 1 nikos nikos 66896732 Mar 17 13:13 Κοσμάς\ Αιτωλός\ -\ Προφητείες.pdf*
-rw-r--r-- 1 nikos nikos   236032 Jun  4 14:10 Σκέψου\ έναν\ αριθμό.exe
root@nikos [/home/nikos/www/data/apps]#



#47763

From: Steven D'Aprano <steve+comp.lang.python@pearwood.info>
Date: 2013-06-12 09:12 +0000
Message-ID: <51b83b6d$0$29998$c3e8da3$5496439d@news.astraweb.com>
In reply to: #47751

On Wed, 12 Jun 2013 08:02:24 +0000, Νικόλαος Κούρας wrote:

> # Collect directory and its filenames as bytes
> path = b'/home/nikos/public_html/data/apps/'
> files = os.listdir( path )
[snip code]


I realised that the version I gave you earlier, or rather the modified 
version you came up with, was subject to a race condition. If somebody 
uploaded a file while the script was running, and that file name was not 
UTF-8 clean, the script would fail.

This version may be more robust and should be resistant to race 
conditions when files are uploaded. (However, do not *delete* files while 
this script is running.) As before, I have not tested this. I recommend 
that you test it thoroughly before deploying it live.
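
The failure mode described above can be reproduced without touching the
file system. The following is a minimal sketch, not part of the original
post: the sample file name is borrowed from the directory listing posted
in this thread, and the helper is_utf8_clean is a hypothetical name
introduced here purely for illustration.

```python
# A Greek file name, encoded the way a legacy ISO-8859-7 client might send it.
greek_name = 'Ευχή του Ιησού.mp3'
raw_name = greek_name.encode('iso-8859-7')


def is_utf8_clean(name_bytes):
    """Return True if the bytes decode as UTF-8, False otherwise."""
    try:
        name_bytes.decode('utf-8')
    except UnicodeDecodeError:
        return False
    return True


print(is_utf8_clean(raw_name))                    # ISO-8859-7 bytes
print(is_utf8_clean(greek_name.encode('utf-8')))  # UTF-8 bytes
```

A name of the first kind, arriving between the clean-up pass and a later
directory scan, is what would make code that assumes UTF-8 fail.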



def guess_encoding(bytestring):
    for encoding in ('utf-8', 'iso-8859-7', 'latin-1'):
        try:
            bytestring.decode(encoding)
        except UnicodeDecodeError:
            # Decoding failed. Try the next one.
            pass
        else:
            # Decoding succeeded. This is our guess.
            return encoding
    # If we get here, none of the encodings worked. We cannot guess.
    return None


path = b'/home/nikos/public_html/data/apps/'
files = os.listdir( path )
clean_files = []
for filename in files:
    # Compute 'path/to/filename'
    filepath_bytes = path + filename
    encoding = guess_encoding(filepath_bytes)
    if encoding == 'utf-8':
        # File name is valid UTF-8, so we can skip to the next file.
        clean_files.append(filepath_bytes)
        continue
    if encoding is None:
        # No idea what the encoding is. Hit it with a hammer until it 
        # stops moving. ('xmlcharrefreplace' only works when encoding,
        # so use 'replace' when decoding.)
        filename = filepath_bytes.decode('utf-8', 'replace')
    else:
        filename = filepath_bytes.decode(encoding)
    # Rename the file to something which ought to be UTF-8 clean.
    newname_bytes = filename.encode('utf-8')
    os.rename(filepath_bytes, newname_bytes)
    clean_files.append(newname_bytes)
    # Once we get here, the file ought to be UTF-8 clean,
    # and the Unicode name ought to exist:
    assert os.path.exists(newname_bytes.decode('utf-8'))

# Dump the old list of file names, it is no longer valid.
del files

# DO NOT CALL listdir again. Somebody might have uploaded a 
# new file, with a broken file name. That will be fixed next 
# time this script runs, but for now, we ignore the dirty file
# name and just use the list of clean file names we built above.

clean_files = set(clean_files)

for name_as_bytes in sorted(clean_files):
    filename = name_as_bytes.decode('utf-8')
    # Check the presence of a file against the database 
    # and insert if it doesn't exist
    cur.execute('SELECT url FROM files WHERE url = %s', (filename,))
    data = cur.fetchone()
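
As a rough illustration of the guess-then-re-encode step used in the
script above, the sketch below applies the same idea to in-memory byte
strings instead of real files. The helper to_utf8 and the sample names
are assumptions made for this example only. Note that 'latin-1' can
decode any byte sequence, so with this list of candidates the None
branch acts purely as a safety net.

```python
def guess_encoding(bytestring):
    """Return the first candidate encoding that decodes the bytes, else None."""
    for encoding in ('utf-8', 'iso-8859-7', 'latin-1'):
        try:
            bytestring.decode(encoding)
        except UnicodeDecodeError:
            continue  # Decoding failed. Try the next candidate.
        return encoding
    return None


def to_utf8(name_bytes):
    """Hypothetical helper: re-encode a file name as UTF-8 bytes."""
    encoding = guess_encoding(name_bytes)
    if encoding is None:
        # Safety net only: 'latin-1' accepts every byte value.
        text = name_bytes.decode('utf-8', 'replace')
    else:
        text = name_bytes.decode(encoding)
    return text.encode('utf-8')


greek_name = 'Σκέψου έναν αριθμό.exe'
legacy_bytes = greek_name.encode('iso-8859-7')

print(guess_encoding(legacy_bytes))
print(to_utf8(legacy_bytes) == greek_name.encode('utf-8'))
print(guess_encoding(b'Pac-Man.exe'))
```

The order of the candidates is a heuristic: almost any byte string that
is valid Latin-1 also decodes as ISO-8859-7, so a name that really was
Latin-1 would be treated as Greek.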




-- 
Steven


