

Groups > comp.lang.python > #37455 > unrolled thread

Converting a number back to it's original string (that was hashed to generate that number)

Started by: Ferrous Cranus <nikos.gr33k@gmail.com>
First post: 2013-01-23 04:21 -0800
Last post: 2013-01-23 05:38 -0800
Articles: 9 (4 participants)



Contents

  Converting a number back to it's original string (that was hashed to generate that number) Ferrous Cranus <nikos.gr33k@gmail.com> - 2013-01-23 04:21 -0800
    Re: Converting a number back to it's original string (that was hashed to generate that number) Lele Gaifax <lele@metapensiero.it> - 2013-01-23 14:06 +0100
      Re: Converting a number back to it's original string (that was hashed to generate that number) newspost2012@gmx.de - 2013-01-23 05:24 -0800
      Re: Converting a number back to it's original string (that was hashed to generate that number) newspost2012@gmx.de - 2013-01-23 05:24 -0800
      Re: Converting a number back to it's original string (that was hashed to generate that number) Ferrous Cranus <nikos.gr33k@gmail.com> - 2013-01-23 05:38 -0800
        Re: Converting a number back to it's original string (that was hashed to generate that number) Dave Angel <d@davea.name> - 2013-01-23 08:58 -0500
          Re: Converting a number back to it's original string (that was hashed to generate that number) Ferrous Cranus <nikos.gr33k@gmail.com> - 2013-01-23 07:30 -0800
          Re: Converting a number back to it's original string (that was hashed to generate that number) Ferrous Cranus <nikos.gr33k@gmail.com> - 2013-01-23 07:30 -0800
      Re: Converting a number back to it's original string (that was hashed to generate that number) Ferrous Cranus <nikos.gr33k@gmail.com> - 2013-01-23 05:38 -0800

#37455 — Converting a number back to it's original string (that was hashed to generate that number)

From: Ferrous Cranus <nikos.gr33k@gmail.com>
Date: 2013-01-23 04:21 -0800
Subject: Converting a number back to it's original string (that was hashed to generate that number)
Message-ID: <2c2351fb-2044-4351-af3e-63cff4fbf0f8@googlegroups.com>

Now my website finally works as intended. Just visit the following links plz. 
------------------------------------------------------------------------------ 
1. http://superhost.gr 

2. http://superhost.gr/?show=log 

3. http://i.imgur.com/89Eqmtf.png  (this displays the database's column 'pin', a 5-digit number acting as a filepath indicator) 

4. http://i.imgur.com/9js4Pz0.png   (this is the detailed HTML page information, keyed by the 'pin' column instead of something like '/home/nikos/public_html/index.html')

Isn't it a nice solution?

I believe it is.

But what happens at http://superhost.gr/?show=stats ?

I just see the number that corresponds to a specific HTML page, and hence I need to convert that number back to its original string.

# ==========================================================
# generate a 5-digit integer from the filepath to identify the current HTML page
# ==========================================================

pin = int( htmlpage.encode("hex"), 16 ) % 100000

Now I need the opposite procedure. Will hex.decode(number) convert back to the original string?

I think not, because this isn't a hash procedure.
But how can it be done then?



#37459

From: Lele Gaifax <lele@metapensiero.it>
Date: 2013-01-23 14:06 +0100
Message-ID: <mailman.891.1358946424.2939.python-list@python.org>
In reply to: #37455

Ferrous Cranus <nikos.gr33k@gmail.com> writes:

> pin = int( htmlpage.encode("hex"), 16 ) % 100000
>
> Now i need the opposite procedure.

As several people have already said in this thread, there
is no way to recover the original string from a particular
“pin”: the function you are using is “lossy”, that is, information is
lost in reducing a BIG string to a SMALL five-digit integer.

>  Will hex.decode(number) convert back to the original string?

NO. As people have shown you, you will run into collisions very
quickly if you insist on going this way (even if you devise a “smarter” way to
get a checksum out of your string by weighting the individual
characters differently, there is still a high chance of collisions, not
counting the final “modulo” operation). Once you get such a collision,
there is not enough information in that single tiny number to get back the
one string that generated it.
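
To make the collision argument concrete, here is a minimal sketch. It assumes a Python 3 rewrite of the original expression (`str.encode("hex")` exists only in Python 2), and the helper names and generated page paths are hypothetical. With 100,001 distinct paths and only 100,000 possible pins, at least two paths must share a pin.

```python
def pin_for(path):
    """Python 3 equivalent of int(path.encode("hex"), 16) % 100000."""
    return int(path.encode("utf-8").hex(), 16) % 100000


def first_collision(paths):
    """Return (earlier_path, later_path, pin) for the first repeated pin, or None."""
    seen = {}  # pin -> first path that produced it
    for path in paths:
        pin = pin_for(path)
        if pin in seen:
            return seen[pin], path, pin
        seen[pin] = path
    return None


# Hypothetical page names: 100,001 distinct paths, only 100,000 possible pins.
paths = ("/home/nikos/public_html/page%d.html" % i for i in range(100001))
print(first_collision(paths))
```

Whichever pair is reported, both paths produce the same pin, so nothing computed from the pin alone can tell them apart.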

Imagine that, instead of using an integer checksum of your full path,
you “shrink” it by replacing each name in the path with its starting
letter, that is:

 /home/ferrous/public_html/index.html => /h/f/p/i

That is evidently much shorter than the original, but you LOST information,
and you cannot write code in any language that reproduces the
original.

The only way out is either to use the full path as the primary key of your
table, or to use a mapping table with a bidirectional one-to-one mapping
between each full path and its corresponding "short" integer value.
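
The mapping-table idea can be sketched in a few lines of plain Python. This is only an illustration under assumptions: the class name `PathRegistry` and its methods are made up, the table lives in memory, and a real site would have to persist it somewhere (a file or a database table).

```python
class PathRegistry:
    """Two-way mapping between full paths and small integers."""

    def __init__(self):
        self._pin_by_path = {}   # path -> pin
        self._path_by_pin = []   # pin -> path (the list index is the pin)

    def pin_for(self, path):
        """Return the pin for path, assigning the next free one if it is new."""
        pin = self._pin_by_path.get(path)
        if pin is None:
            pin = len(self._path_by_pin)
            self._pin_by_path[path] = pin
            self._path_by_pin.append(path)
        return pin

    def path_for(self, pin):
        """Return the path that was assigned this pin."""
        return self._path_by_pin[pin]


registry = PathRegistry()
pin = registry.pin_for("/home/ferrous/public_html/index.html")
print(pin, registry.path_for(pin))
```

The integer carries no information about the path by itself; the round trip works only because the table remembers every pair.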

ciao, lele.
-- 
nickname: Lele Gaifax | Quando vivrò di quello che ho pensato ieri
real: Emanuele Gaifas | comincerò ad aver paura di chi mi copia.
lele@metapensiero.it  |                 -- Fortunato Depero, 1929.



#37460

From: newspost2012@gmx.de
Date: 2013-01-23 05:24 -0800
Message-ID: <fd985823-2ba8-4b82-8b2d-b23f7daadda1@googlegroups.com>
In reply to: #37459

please don't feed the troll.

cu,
Kurt



#37463

From: Ferrous Cranus <nikos.gr33k@gmail.com>
Date: 2013-01-23 05:38 -0800
Message-ID: <ab623ed2-5f3e-454b-b41e-86301fb0c89f@googlegroups.com>
In reply to: #37459

Please DON'T tell me to save both the pin <=> filepath and associate them (that can be done with SQL commands, I know).
I will not create any kind of primary/unique keys in the database.
I will not store the filepath in the database, just the number that indicates the filepath (HTML page).
Also, no external table associating filepaths and numbers.
I want this to be solved only with Python code, not in the database.


That is: I need to be able to map both ways, in a one-to-one relation, 5-digit-integer <=> string.

int( hex ( string ) ) can encode a string to a number. Can this be decoded back? I guess it can also be converted back, because it's not losing any information. It's encoding, not compressing.

But it's the % modulo that breaks the back-and-forth association.

So, the question is:

HOW to map both ways, in a one-to-one relation (5-digit-integer <=> string), without losing any information?



#37465

From: Dave Angel <d@davea.name>
Date: 2013-01-23 08:58 -0500
Message-ID: <mailman.894.1358949546.2939.python-list@python.org>
In reply to: #37463

On 01/23/2013 08:38 AM, Ferrous Cranus wrote:
> Please DON'T tell me to save both the pin <=> filepath and associate them (that can be done with SQL commands, I know).
> I will not create any kind of primary/unique keys in the database.
> I will not store the filepath in the database, just the number that indicates the filepath (HTML page).
> Also, no external table associating filepaths and numbers.
> I want this to be solved only with Python code, not in the database.
>
>
> That is: I need to be able to map both ways, in a one-to-one relation, 5-digit-integer <=> string.
>
> int( hex ( string ) ) can encode a string to a number. Can this be decoded back? I guess it can also be converted back, because it's not losing any information. It's encoding, not compressing.
>
> But it's the % modulo that breaks the back-and-forth association.
>
> So, the question is:
>
> HOW to map both ways, in a one-to-one relation (5-digit-integer <=> string), without losing any information?
>

Simple.  Predefine the 100,000 legal strings, and don't let the user use
anything else.  One way to do that would be to require a path string of
no more than 5 characters, and require them all to be of a restricted
alphabet of 10 characters (e.g. the alphabet could be 0-9, which is
obvious, or it could be ".aehilmpst": no uppercase, no underscore, no
digits, no non-ASCII, etc.).

In the realistic case of file paths or URLs, it CANNOT be done.
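
A minimal sketch of the restricted-alphabet scheme described above, in Python 3. It assumes names of exactly five characters, so that there are exactly 10**5 = 100,000 of them (allowing shorter names as well would need padding or a larger number range), and it treats each character's position in the alphabet as one decimal digit. The function names `to_pin` and `from_pin` are made up for illustration.

```python
ALPHABET = ".aehilmpst"   # 10 symbols, so each character acts as one decimal digit
WIDTH = 5                 # exactly 5 characters -> 10**5 legal strings


def to_pin(name):
    """Map a legal 5-character name to an integer in range(100000)."""
    if len(name) != WIDTH or any(ch not in ALPHABET for ch in name):
        raise ValueError("not a legal name: %r" % (name,))
    pin = 0
    for ch in name:
        pin = pin * 10 + ALPHABET.index(ch)
    return pin


def from_pin(pin):
    """Inverse of to_pin: rebuild the name from its integer."""
    if not 0 <= pin < 10 ** WIDTH:
        raise ValueError("pin out of range: %r" % (pin,))
    chars = []
    for _ in range(WIDTH):
        pin, digit = divmod(pin, 10)
        chars.append(ALPHABET[digit])
    return "".join(reversed(chars))


print(to_pin("a.htm"))   # 10396
print(from_pin(10396))   # a.htm
```

The mapping is reversible only because the set of legal strings has been cut down to exactly as many members as there are pins.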


-- 
DaveA



#37481

From: Ferrous Cranus <nikos.gr33k@gmail.com>
Date: 2013-01-23 07:30 -0800
Message-ID: <eead39ac-cad2-4a7a-86d2-380cf610b61d@googlegroups.com>
In reply to: #37465

On Wednesday, 23 January 2013 3:58:45 PM UTC+2, Dave Angel wrote:
> On 01/23/2013 08:38 AM, Ferrous Cranus wrote:
> > Please DON'T tell me to save both the pin <=> filepath and associate them (that can be done with SQL commands, I know).
> > I will not create any kind of primary/unique keys in the database.
> > I will not store the filepath in the database, just the number that indicates the filepath (HTML page).
> > Also, no external table associating filepaths and numbers.
> > I want this to be solved only with Python code, not in the database.
> >
> > That is: I need to be able to map both ways, in a one-to-one relation, 5-digit-integer <=> string.
> >
> > int( hex ( string ) ) can encode a string to a number. Can this be decoded back? I guess it can also be converted back, because it's not losing any information. It's encoding, not compressing.
> >
> > But it's the % modulo that breaks the back-and-forth association.
> >
> > So, the question is:
> >
> > HOW to map both ways, in a one-to-one relation (5-digit-integer <=> string), without losing any information?
>
> Simple.  Predefine the 100,000 legal strings, and don't let the user use
> anything else.  One way to do that would be to require a path string of
> no more than 5 characters, and require them all to be of a restricted
> alphabet of 10 characters (e.g. the alphabet could be 0-9, which is
> obvious, or it could be ".aehilmpst": no uppercase, no underscore, no
> digits, no non-ASCII, etc.).
>
> In the realistic case of file paths or URLs, it CANNOT be done.

OK, it's not doable. I'll stop asking for it.
CHANGE of plans.
I will use the database solution, which is the easiest way to do it:

============================================================

	# insert new page record in table counters or update it if already exists
	try:
		cursor.execute( '''INSERT INTO counters(page, hits) VALUES(%s, %s)
								ON DUPLICATE KEY UPDATE hits = hits + 1''', (htmlpage, 1) )
	except MySQLdb.Error as e:
		print ( "Query Error: %s" % e )

	# update existing visitor record if same pin and same host found
	try:
		cursor.execute( '''UPDATE visitors SET hits = hits + 1, useros = %s, browser = %s, date = %s WHERE pin = %s AND host = %s''', (useros, browser, date, page, host))
	except MySQLdb.Error as e:
		print ( "Error %d: %s" % (e.args[0], e.args[1]) )

	# insert new visitor record if above update did not affect a row
	if cursor.rowcount == 0:
		cursor.execute( '''INSERT INTO visitors(hits, host, useros, browser, date) VALUES(%s, %s, %s, %s, %s)''', (1, host, useros, browser, date) )

============================================================

I can INSERT a row into the table "counters".
I cannot UPDATE or INSERT into the table "visitors" without knowing the "pin" primary key number the database created.

Can you help with this, please?




