Newsgroup: comp.lang.python

Printing a drop down menu for a specific field.

Started by: Nick the Gr33k <nikos.gr33k@gmail.com>
First post: 2013-10-26 17:10 +0300
Last post: 2013-10-27 22:22 -0700
Articles: 11 (4 participants)



Contents

  Printing a drop down menu for a specific field. Nick the Gr33k <nikos.gr33k@gmail.com> - 2013-10-26 17:10 +0300
    Re: Printing a drop down menu for a specific field. Nick the Gr33k <nikos.gr33k@gmail.com> - 2013-10-26 18:40 +0300
      Re: Printing a drop down menu for a specific field. Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-10-26 16:09 +0000
    Re: Printing a drop down menu for a specific field. rurpy@yahoo.com - 2013-10-26 11:33 -0700
      Re: Printing a drop down menu for a specific field. Nick the Gr33k <nikos.gr33k@gmail.com> - 2013-10-27 02:31 +0300
        Re: Printing a drop down menu for a specific field. Nick the Gr33k <nikos.gr33k@gmail.com> - 2013-10-27 02:52 +0300
          Re: Printing a drop down menu for a specific field. Nick the Gr33k <nikos.gr33k@gmail.com> - 2013-10-27 02:11 +0200
            Re: Printing a drop down menu for a specific field. rurpy@yahoo.com - 2013-10-26 21:00 -0700
              Re: Printing a drop down menu for a specific field. Nick the Gr33k <nikos.gr33k@gmail.com> - 2013-10-27 09:31 +0200
                Re: Printing a drop down menu for a specific field. Dave Angel <davea@davea.name> - 2013-10-27 11:52 +0000
                Re: Printing a drop down menu for a specific field. rurpy@yahoo.com - 2013-10-27 22:22 -0700

#57608 — Printing a drop down menu for a specific field.

From: Nick the Gr33k <nikos.gr33k@gmail.com>
Date: 2013-10-26 17:10 +0300
Subject: Printing a drop down menu for a specific field.
Message-ID: <l4gihj$n8f$1@dont-email.me>
[QUOTE=turvey]Say your data is like the following:
data = [('alice', 1), ('alice', 2), ('bob', 5), ('bob', 10), ('carrie', 3)]

Where the first entry is your user and the second entry is a timestamp. 
Your data is structured basically like this, except I stripped the 
irrelevant details.

[CODE]
user_to_timestamps = {}

# Gather all your users together.
for user,timestamp in data:
     if user not in user_to_timestamps:
         user_to_timestamps[user] = []
     user_to_timestamps[user].append(timestamp)

# You now have a data structure like this
# {'alice': [1, 2], 'bob': [5, 10], 'carrie': [3]}

for user, timestamps in user_to_timestamps.iteritems():
     print user
     for timestamp in timestamps:
         print "<select>%s</select>" % timestamp
[/CODE]

There. That's how you would do it. It shouldn't be much work to get your 
code into that form.[/QUOTE]
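
For comparison, here is a minimal Python 3 sketch of the same grouping idea. It uses collections.defaultdict for the grouping and emits one <select> per user with one <option> per timestamp. The function names group_by_user and render_select are made up for illustration; the sample data is the list from the quote.

```python
from collections import defaultdict

data = [('alice', 1), ('alice', 2), ('bob', 5), ('bob', 10), ('carrie', 3)]

def group_by_user(rows):
    """Collect every timestamp under its user.

    Plain dicts keep insertion order on Python 3.7+, so users come out
    in the order they were first seen.
    """
    grouped = defaultdict(list)
    for user, timestamp in rows:
        grouped[user].append(timestamp)
    return dict(grouped)

def render_select(timestamps):
    """Build one drop-down whose options are the given timestamps."""
    options = "".join("<option>%s</option>" % t for t in timestamps)
    return "<select>%s</select>" % options

user_to_timestamps = group_by_user(data)
for user, timestamps in user_to_timestamps.items():
    print(user, render_select(timestamps))
```

The point of the two small functions is that the grouping step can be checked on its own before any HTML is involved.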

I'm sorry, but I still cannot transform my code:

[CODE]
	try:
		cur.execute( '''SELECT host, city, useros, browser, ref, hits, 
lastvisit FROM visitors
						WHERE counterID = (SELECT ID FROM counters WHERE url = %s) ORDER 
BY lastvisit DESC''', page )
		data = cur.fetchall()
		
		for row in data:
			(host, city, useros, browser, ref, hits, lastvisit) = row
			lastvisit = lastvisit.strftime('%A %e %b, %H:%M')
			
			print( "<tr>" )
			for item in (host, city, useros, browser, ref, hits, lastvisit):
				print( "<td><center><b><font color=white> %s </td>" % item )
	except pymysql.ProgrammingError as e:
		print( repr(e) )
[/CODE]

to the solution you presented :(
I just don't know how to write it.



#57624

From: Nick the Gr33k <nikos.gr33k@gmail.com>
Date: 2013-10-26 18:40 +0300
Message-ID: <l4gnq5$kl6$1@dont-email.me>
In reply to: #57608

Can someone write this properly? I tried but cannot make it work.

-- 
What is now proved was at first only imagined! & WebHost
<http://superhost.gr>



#57632

From: Steven D'Aprano <steve+comp.lang.python@pearwood.info>
Date: 2013-10-26 16:09 +0000
Message-ID: <526be93f$0$29972$c3e8da3$5496439d@news.astraweb.com>
In reply to: #57624
On Sat, 26 Oct 2013 18:40:52 +0300, Nick the Gr33k wrote:

> Can someone write this properly? I tried but cannot make it work.

Start by writing down what problem you are trying to solve with this 
code, and what you expect the code to do. In detail. What input data does 
it take, what result should it produce.

Then write down, in Greek, how *you* would solve this problem, again, in 
detail. What steps would *you* take? If you don't know how to solve the 
problem in real life, how do you expect to write instructions for a 
computer to solve it?

Once you know how to solve the problem yourself, then you need to change 
those steps into steps the computer can perform. Then, and only then, 
should you even think about writing Python code.

That is what programmers do. Since you want to be a programmer, you need 
to learn how to do this. Good luck! It's lots of hard work, and sometimes 
time consuming. That's all part of the job.


-- 
Steven



#57643

From: rurpy@yahoo.com
Date: 2013-10-26 11:33 -0700
Message-ID: <03c4c838-a6da-489f-b4b5-9342b3496fa3@googlegroups.com>
In reply to: #57608
On 10/20/2013 05:30 PM, Νίκος Αλεξόπουλος wrote:
> try:
> 	cur.execute( '''SELECT host, city, useros, browser, ref, hits, 
> lastvisit FROM visitors WHERE counterID = (SELECT ID FROM counters WHERE 
> url = %s) ORDER BY lastvisit DESC''', page )
> 	data = cur.fetchall()
> 		
> 	for row in data:
> 		(host, city, useros, browser, ref, hits, lastvisit) = row
> 		lastvisit = lastvisit.strftime('%A %e %b, %H:%M')
> 			
> 		print( "<tr>" )
> 		for item in (host, city, useros, browser, ref, hits, lastvisit):
> 			print( "<td><center><b><font color=white> %s </td>" % item )
> except pymysql.ProgrammingError as e:
> 	print( repr(e) )
> ===========================================
> 
> In the above code I print each record of the MySQL table 'visitors' as a 
> row, like this:  http://superhost.gr/?show=log&page=index.html
> 
> Now I wish to do the same thing, but print the 'lastvisit' field inside 
> a <select></select> tag, so that all prior visits for the same host 
> appear in a drop-down menu. As it is now, I print only the datetime of 
> the latest visit of each host and not all of its visit datetimes.

Perhaps something like this is what you are looking for?

> try:
> 	cur.execute( '''SELECT host, city, useros, browser, ref, hits, 
> lastvisit FROM visitors WHERE counterID = (SELECT ID FROM counters WHERE 
> url = %s) ORDER BY lastvisit DESC''', page )
> 	data = cur.fetchall()
>
        newdata = coalesce( data )
        for row in newdata:
            (host, city, useros, browser, ref, hits, visits) = row
              # Note that 'visits' is now a list of visit times.
            print( "<tr>" )
            for item in (host, city, useros, browser, ref, hits):
                print( "<td><center><b><font color=white> %s </td>" % item )
            print( "<td><select>" )
            for n, visit in enumerate (visits):
                visittime = visit.strftime('%A %e %b, %H:%M')
                if n == 0: op_selected = 'selected="selected"'
                else: op_selected = ''
                print( "<option %s>%s</option>" % (op_selected, visittime) )
            print( "</select></td>" )
            print( "</tr>" )

    def coalesce (data):
        '''Combine multiple data rows differing only in the 'hits' and
        'visits' fields into a single row with 'visits' changed into a
        list of the multiple visits values, and hits changed into the
        sum of the multiple 'hits' values.  Order of rows is preserved
        so that rows with most recent visits still come first.'''

        newdata = []
        seen = {}
        for host, city, useros, browser, ref, hits, visit in data:
            # Here you have to decide how to group the rows together.
            # For example, if you have
            #   178-20-236.static.cyta.gr | Europe/Athens | Windows | Explorer | Direct Hit | 1 | Παρασκευή 25 Οκτ, 20:48
            #   178-20-236.static.cyta.gr | Europe/Athens | Windows | Explorer | Direct Hit | 3 | Παρασκευή 25 Οκτ, 20:06
            # do you want those as one row on the html page, or two?
            # If one, what value do you want to show for 'hits' (Επανάληψη)?
            # "1", "3", "4"?
            # I'll assume that you want an html row for every unique
            # combination of (host, city, useros, browser) and that hits
            # should be summed together.
            key = host, city, useros, browser, ref
            if key not in seen:
                newdata.append ([host, city, useros, browser, ref, hits, [visit]])
                seen[key] = len (newdata) - 1    # Save index (for 'newdata') of this row.
            else:  # This row is a duplicate row with a different visit time.
                rowindex = seen[key]
                newdata[rowindex][5] += hits
                newdata[rowindex][6].append (visit)
        return newdata

Several caveats...  
The code above is untested, you'll probably have to fix some errors
but hopefully it is clear enough.
I only read this group intermittently these days, so if you respond 
with a question about the code, or need more help, it is likely I
will not see your message, so don't be surprised if you don't get an
answer.

Hope this is more helpful than the other answers you got.
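
To try the grouping logic without a database, a sketch like the following may help. It restates coalesce() in a slightly different form (the dictionary stores the merged row itself rather than its index in 'newdata') and feeds it made-up rows. The host names and timestamps are invented sample values, not data from the real table.

```python
from datetime import datetime

def coalesce(data):
    """Merge rows sharing (host, city, useros, browser, ref): sum the hits
    and collect the visit times in a list. Input order is preserved."""
    newdata = []
    seen = {}  # key -> the merged row already stored in newdata
    for host, city, useros, browser, ref, hits, visit in data:
        key = (host, city, useros, browser, ref)
        row = seen.get(key)
        if row is None:
            row = [host, city, useros, browser, ref, hits, [visit]]
            seen[key] = row
            newdata.append(row)
        else:
            row[5] += hits
            row[6].append(visit)
    return newdata

# Invented sample rows, newest first, shaped like cur.fetchall() output.
sample = [
    ("a.example", "Athens", "Windows", "Explorer", "Direct Hit", 1,
     datetime(2013, 10, 25, 20, 48)),
    ("b.example", "Athens", "Linux", "Firefox", "Direct Hit", 2,
     datetime(2013, 10, 25, 20, 30)),
    ("a.example", "Athens", "Windows", "Explorer", "Direct Hit", 3,
     datetime(2013, 10, 25, 20, 6)),
]

merged = coalesce(sample)
for row in merged:
    print(row[0], row[5], [v.strftime("%H:%M") for v in row[6]])
```

Because the input is sorted newest first, the first visit in each merged list is the most recent one, which is the one the rendering loop marks as selected.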



#57684

From: Nick the Gr33k <nikos.gr33k@gmail.com>
Date: 2013-10-27 02:31 +0300
Message-ID: <l4hjcm$lf6$1@dont-email.me>
In reply to: #57643

Thank you very much Rurpy, I appreciate your help very much.
Even though you haven't tested it, your code runs flawlessly.

Only one side effect:
if a visitor comes from a referrer link, then its timestamp is not 
added to the visits list.

It is only added in the case of a direct hit, when there is no referrer present.








#57686

From: Nick the Gr33k <nikos.gr33k@gmail.com>
Date: 2013-10-27 02:52 +0300
Message-ID: <l4hkkn$qba$1@dont-email.me>
In reply to: #57684
Ah, found it. I had to change this line in your code:
			key = host, city, useros, browser, ref

to this line:

			key = host, city, useros, browser

so that 'ref' is not part of the unique combination key.
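
A small sketch of why the key matters: rows are merged only when their keys compare equal, so a key that includes 'ref' keeps a direct hit and a referred visit from the same host in separate rows. The rows and the helper name count_groups are invented for illustration.

```python
rows = [
    # (host, city, useros, browser, ref)
    ("a.example", "Athens", "Windows", "Explorer", "Direct Hit"),
    ("a.example", "Athens", "Windows", "Explorer", "http://search.example/"),
    ("b.example", "Athens", "Linux", "Firefox", "Direct Hit"),
]

def count_groups(rows, use_ref):
    """Return how many distinct grouping keys the rows produce."""
    keys = set()
    for host, city, useros, browser, ref in rows:
        if use_ref:
            keys.add((host, city, useros, browser, ref))
        else:
            keys.add((host, city, useros, browser))
    return len(keys)

print(count_groups(rows, use_ref=True))   # 3
print(count_groups(rows, use_ref=False))  # 2
```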

I am still trying to understand the logic of your code, and trying to 
create a history list column for the 'referrers'.

I dont know how to write it though to produce the sam



#57689

From: Nick the Gr33k <nikos.gr33k@gmail.com>
Date: 2013-10-27 02:11 +0200
Message-ID: <l4hlnc$tp9$1@dont-email.me>
In reply to: #57686

I am trying.

I don't know how to write it, though, to produce the same output for referrers.

The best I came up with is:

[code]
def coalesce( data ):
		newdata = []
		seen = {}
		for host, city, useros, browser, ref, hits, visit in data:
			# Here i have to decide how to group the rows together.
			# I want an html row for every unique combination of (host, city, useros, browser) and that hits should be summed together.
			key = host, city, useros, browser
			if key not in seen:
				newdata.append( [host, city, useros, browser, [ref], hits, [visit]] )
				seen[key] = len( newdata ) - 1		# Save index (for 'newdata') of this row.
			else:		# This row is a duplicate row with a different visit time.
				rowindex = seen[key]
				newdata[rowindex][4].append( ref )
				newdata[rowindex][5] += hits
				newdata[rowindex][6].append( visit )
		return newdata

		
	cur.execute( '''SELECT host, city, useros, browser, ref, hits, 
lastvisit FROM visitors
					WHERE counterID = (SELECT ID FROM counters WHERE url = %s) ORDER BY 
lastvisit DESC''', page )
	data = cur.fetchall()

	
	newdata = coalesce( data )
	for row in newdata:
		(host, city, useros, browser, refs, hits, visits) = row
		# Note that 'ref' & 'visits' are now lists of visit times.
		
		print( "<tr>" )
		for item in (host, city, useros, browser):
			print( "<td><center><b><font color=white> %s </td>" % item )
			
		print( "<td><select>" )
		for n, ref in enumerate( refs ):
			if n == 0:
				op_selected = 'selected="selected"'
			else:
				op_selected = ''
		print( "<option %s>%s</option>" % (op_selected, ref) )
		print( "</select></td>" )

		for item in (hits):
			print( "<td><center><b><font color=white> %s </td>" % item )
			
		print( "<td><select>" )
		for n, visit in enumerate( visits ):
			visittime = visit.strftime('%A %e %b, %H:%M')
			if n == 0:
				op_selected = 'selected="selected"'
			else:
				op_selected = ''
			print( "<option %s>%s</option>" % (op_selected, visittime) )
		print( "</select></td>" )
		
		print( "</tr>" )
[/code]

But this doesn't work correctly for refs, and for some reason it also does 
not print the hits and visit columns.



#57712

From: rurpy@yahoo.com
Date: 2013-10-26 21:00 -0700
Message-ID: <ee80b72a-1c2f-4aa5-8172-f2a0a990bbe5@googlegroups.com>
In reply to: #57689
On 10/26/2013 06:11 PM, Nick the Gr33k wrote:
> But this doesn't work correctly for refs, and for some reason it also does 
> not print the hits and visit columns.

Without a traceback it is hard to figure out what is happening.
(Actually in this case there is one obvious error, but there are
also some unobvious ones.)

Here is what I did to find the problems, and what you can do
the next time.  The main thing was to extract the code from
the cgi script so that I could run it outside of the web server
and without needing access to the database.  Then you can add
print statements (or run with the pdb debugger) and see tracebacks
and other errors easily.

1. Copy and paste the code from your message into a .py file.
2. Put a "def main():" line at the top of your main code.
3. Add a line at the bottom to call main().
4. Copy and paste a part of the web page you gave that lists all the visits.
5. Edit it (change TAB to "|", add commas after each line, etc.) to 
 create a statement that will create a variable, DATA.
6. Add a few statements to turn DATA into a variable 'data' which has a 
 format similar to the format returned by your cur.fetchall() call.
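
Steps 5 and 6 could be sketched roughly as below. This is only an illustration under stated assumptions: the sample lines are invented, the ISO-style timestamp format is an assumption made so that datetime.strptime can parse it, and parse_rows is a hypothetical helper name. The rows it builds carry an int for 'hits' and a datetime for the visit time, like the rows the database query returns.

```python
from datetime import datetime

# Invented sample lines: host|city|useros|browser|ref|hits|timestamp.
# The ISO-style timestamp format is an assumption made for this sketch.
DATA = [
    "a.example|Athens|Windows|Explorer|Direct Hit|1|2013-10-26 18:49",
    "a.example|Athens|Windows|Explorer|Direct Hit|2|2013-10-26 18:48",
]

def parse_rows(lines):
    """Turn pipe-separated lines into tuples shaped like fetchall() rows."""
    rows = []
    for ln in lines:
        host, city, useros, browser, ref, hits, stamp = ln.split("|")
        visit = datetime.strptime(stamp, "%Y-%m-%d %H:%M")
        rows.append((host, city, useros, browser, ref, int(hits), visit))
    return rows

data = parse_rows(DATA)
print(data[0][5], data[0][6].strftime("%H:%M"))
```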

This all took just 10 or 15 minutes, and I ended up with the 
following code:
--------------
    def main():
        data = [ln.split('|') for ln in DATA]
        for r in data: r[5] = int(r[5])  # Change the 'hit' values from str to int.
        print ('<table border="1">')

        newdata = coalesce( data )
        for row in newdata:
                (host, city, useros, browser, refs, hits, visits) = row
                # Note that 'ref' & 'visits' are now lists of visit times.

                print( "<tr>" )
                for item in (host, city, useros, browser):
                        print( "<td><center><b><font color=white> %s </td>" % item )

                print( "<td><select>" )
                for n, ref in enumerate( refs ):
                        if n == 0:
                                op_selected = 'selected="selected"'
                        else:
                                op_selected = ''
                print( "<option %s>%s</option>" % (op_selected, ref) )
                print( "</select></td>" )

                for item in (hits):
                        print( "<td><center><b><font color=white> %s </td>" % item )

                print( "<td><select>" )
                for n, visit in enumerate( visits ):
                        visittime = visit.strftime('%A %e %b, %H:%M')
                        if n == 0:
                                op_selected = 'selected="selected"'
                        else:
                                op_selected = ''
                        print( "<option %s>%s</option>" % (op_selected, visittime) )
                print( "</select></td>" )
                print( "</tr>" )

    def coalesce( data ):
                newdata = []
                seen = {}
                for host, city, useros, browser, ref, hits, visit in data:
                        # Here i have to decide how to group the rows together.
                        # I want an html row for every unique combination of (host, city, useros, browser) and that hits should be summed together.
                        key = host, city, useros, browser
                        if key not in seen:
                                newdata.append( [host, city, useros, browser, [ref], hits, [visit]] )
                                seen[key] = len( newdata ) - 1		# Save index (for 'newdata') of this row.
                        else:		# This row is a duplicate row with a different visit time.
                                rowindex = seen[key]
                                newdata[rowindex][4].append( ref )
                                newdata[rowindex][5] += hits
                                newdata[rowindex][6].append( visit )
                return newdata

    DATA = [
    '209.133.77.165.T01713-01.above.net|Άγνωστη Πόλη|Windows|Explorer|Direct Hit|1|Σάββατο 26 Οκτ, 18:59',
    'mail14.ess.barracuda.com|Άγνωστη Πόλη|Windows|Explorer|Direct Hit|1|Σάββατο 26 Οκτ, 18:49',
    'mail14.ess.barracuda.com|Άγνωστη Πόλη|Windows|Explorer|Direct Hit|1|Σάββατο 26 Οκτ, 18:48',
    'mail0.ess.barracuda.com|Άγνωστη Πόλη|Windows|Explorer|Direct Hit|1|Σάββατο 26 Οκτ, 18:48',
    'mail0.ess.barracuda.com|Άγνωστη Πόλη|Windows|Explorer|Direct Hit|1|Σάββατο 26 Οκτ, 18:47',
    '209.133.77.164.T01713-01.above.net|Άγνωστη Πόλη|Windows|Explorer|Direct Hit|1|Σάββατο 26 Οκτ, 18:47',
    '89-145-108-206.as29017.net|Άγνωστη Πόλη|Windows|Explorer|Direct Hit|1|Σάββατο 26 Οκτ, 18:47',
    ]

    main()
-------------

I then ran it and it reported:
    [...some html output...]
    Traceback (most recent call last):
      File "xx2.py", line 88, in <module>
        if __name__ == '__main__': main()
      File "xx2.py", line 26, in main
        for item in (hits):
    TypeError: 'int' object is not iterable

Line 26 is:

        for item in (hits):

Remember that 'hits' is just an integer number, not a list.
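
The failure can be reproduced in isolation. This is only a sketch: the value given to hits is made up for illustration. Parentheses around a single name do not create a tuple, so (hits) is still the plain integer:

```python
hits = 3  # hypothetical value; in the script it comes from the coalesced row

# (hits) is just hits; only a trailing comma makes a one-element tuple.
assert (hits) == hits
assert (hits,) == (3,)

try:
    for item in (hits):
        pass
    error = None
except TypeError as exc:
    error = str(exc)

print(error)  # the TypeError message for iterating over an int

# Formatting the integer directly needs no loop at all.
cell = "<td> %s </td>" % hits
print(cell)
```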
So I changed:

        for item in (hits):
                print( "<td><center><b><font color=white> %s </td>" % item )

to:

        print( "<td><center><b><font color=white> %s </td>" % hits )

and ran again.  This time:

    Traceback (most recent call last):
      File "xx2.py", line 87, in <module>
        if __name__ == '__main__': main()
      File "xx2.py", line 30, in main
        visittime = visit.strftime('%A %e %b, %H:%M')
    AttributeError: 'str' object has no attribute 'strftime'

This is not a problem with the program but with the input data.  When
you get data from your database, 'visit' is a datetime object.  But
when it comes from the synthetic data we created, it is an already-
formatted string.

So I replaced:
    visittime = visit.strftime('%A %e %b, %H:%M')

with

    visittime = visit   #.strftime('%A %e %b, %H:%M')
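
An alternative that avoids having to undo this change later is a small helper that formats only real datetime values. This is a sketch, not part of the original script: the name format_visit is invented here, and it uses %d instead of %e on the assumption that %e is not supported by every platform's strftime.

```python
from datetime import datetime

def format_visit(visit, fmt='%A %d %b, %H:%M'):
    """Return a display string for a visit time.

    Database rows carry datetime objects; the synthetic test rows carry
    strings that are already formatted, so those pass through unchanged.
    """
    if isinstance(visit, datetime):
        return visit.strftime(fmt)
    return str(visit)

print(format_visit(datetime(2013, 10, 26, 18, 49)))   # formatted by strftime
print(format_visit('Σάββατο 26 Οκτ, 18:49'))          # returned as-is
```

With such a helper the same loop body would work for both the test data and the database rows.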

I also realized you use white fonts so I changed the table to have
a blue background color:

    print ('<table border="1">')

to 

    print ('<table border="1" bgcolor="blue">')

Now when I run the program it runs without errors and produces html
output.  So I ran it again, this time saving the output to a file:

    $ python3 test.py > test.html

Then I opened the file in a browser.  It looks OK, but none of the 'ref'
dropdowns has more than one entry.  So I edited the data to be:

    DATA = [
    '209.133.77.165.T01713-01.above.net|Άγνωστη Πόλη|Windows|Explorer|Direct Hit|1|Σάββατο 26 Οκτ, 18:59',
    'mail14.ess.barracuda.com|Άγνωστη Πόλη|Windows|Explorer|Direct Hit|1|Σάββατο 26 Οκτ, 18:49',
    'mail14.ess.barracuda.com|Άγνωστη Πόλη|Windows|Explorer|http://superhost.gr/|1|Σάββατο 26 Οκτ, 18:48',
    'mail0.ess.barracuda.com|Άγνωστη Πόλη|Windows|Explorer|Direct Hit|1|Σάββατο 26 Οκτ, 18:48',
    'mail0.ess.barracuda.com|Άγνωστη Πόλη|Windows|Explorer|Direct Hit|1|Σάββατο 26 Οκτ, 18:47',
    '209.133.77.164.T01713-01.above.net|Άγνωστη Πόλη|Windows|Explorer|Direct Hit|1|Σάββατο 26 Οκτ, 18:47',
    'mail14.ess.barracuda.com|Άγνωστη Πόλη|Windows|Explorer|http://mythosweb.gr/|1|Σάββατο 26 Οκτ, 18:22',
    ]

And ran again.  But still, the 'ref' dropdown for mail14.ess.barracuda.com
showed only one item in the list.  After looking at the generated
source code and adding some print statements to the Python code, I realized
that you had:
                for n, ref in enumerate( refs ):
                        if n == 0:
                                op_selected = 'selected="selected"'
                        else:
                                op_selected = ''
                print( "<option %s>%s</option>" % (op_selected, ref) )

but what you want is:

                for n, ref in enumerate( refs ):
                        if n == 0:
                                op_selected = 'selected="selected"'
                        else:
                                op_selected = ''
                        print( "<option %s>%s</option>" % (op_selected, ref) )
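
The effect of that one level of indentation is easy to reproduce on its own. In this sketch the two helper functions and the sample referrers are invented for the demonstration, and the option strings are collected in a list instead of being printed:

```python
def options_buggy(refs):
    """The append sits outside the loop, so only the last ref is emitted."""
    out = []
    for n, ref in enumerate(refs):
        if n == 0:
            op_selected = 'selected'
        else:
            op_selected = ''
    out.append("<option %s>%s</option>" % (op_selected, ref))
    return out

def options_fixed(refs):
    """The append sits inside the loop, so every ref gets an option."""
    out = []
    for n, ref in enumerate(refs):
        if n == 0:
            op_selected = 'selected'
        else:
            op_selected = ''
        out.append("<option %s>%s</option>" % (op_selected, ref))
    return out

refs = ['Direct Hit', 'http://superhost.gr/', 'http://mythosweb.gr/']
print(options_buggy(refs))   # one option only, built from the last ref
print(options_fixed(refs))   # three options, the first marked selected
```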

I also realized after looking at the HTML spec for the OPTION element
  (http://www.w3.org/TR/html401/interact/forms.html#h-17.6)
that 'selected="selected"' should be just 'selected'.  So in 
both places they occur, I changed

                        if n == 0:
                                op_selected = 'selected="selected"'
to 

                        if n == 0:
                                op_selected = 'selected'

Now the generated page looks right and the only thing left to do
is to copy the fixed code back into your main cgi script, remembering
to undo the temporary change made for testing:

    visittime = visit   #.strftime('%A %e %b, %H:%M')

back to

    visittime = visit.strftime('%A %e %b, %H:%M')


So here is the fixed code:
----------------
        newdata = coalesce( data )
        for row in newdata:
                (host, city, useros, browser, refs, hits, visits) = row
                # Note that 'refs' & 'visits' are now lists (of referrers and visit times).

                print( "<tr>" )
                for item in (host, city, useros, browser):
                        print( "<td><center><b><font color=white> %s </td>" % item )

                print( "<td><select>" )
                for n, ref in enumerate( refs ):
                        if n == 0:
                                op_selected = 'selected'
                        else:
                                op_selected = ''
                        print( "<option %s>%s</option>" % (op_selected, ref) )
                print( "</select></td>" )

                print( "<td><center><b><font color=white> %s </td>" % hits )

                print( "<td><select>" )
                for n, visit in enumerate( visits ):
                        visittime = visit.strftime('%A %e %b, %H:%M')
                        if n == 0:
                                op_selected = 'selected'
                        else:
                                op_selected = ''
                        print( "<option %s>%s</option>" % (op_selected, visittime) )
                print( "</select></td>" )
                print( "</tr>" )

        def coalesce( data ):
                newdata = []
                seen = {}
                for host, city, useros, browser, ref, hits, visit in data:
                        # Here i have to decide how to group the rows together.
                        # I want an html row for every unique combination of (host, city, useros, browser) and that hits should be summed together.
                        key = host, city, useros, browser
                        if key not in seen:
                                newdata.append( [host, city, useros, browser, [ref], hits, [visit]] )
                                seen[key] = len( newdata ) - 1		# Save index (for 'newdata') of this row.
                        else:		# This row is a duplicate row with a different visit time.
                                rowindex = seen[key]
                                newdata[rowindex][4].append( ref )
                                newdata[rowindex][5] += hits
                                newdata[rowindex][6].append( visit )
                return newdata
----------------



#57727

From: Nick the Gr33k <nikos.gr33k@gmail.com>
Date: 2013-10-27 09:31 +0200
Message-ID: <l4iffs$uhb$1@dont-email.me>
In reply to: #57712
On 27/10/2013 6:00 AM, rurpy@yahoo.com wrote:

Once again I personally thank you very much for your help, and especially
for taking the time and effort to explain to me in detail the logic you
followed to make the code work.

I read it thoroughly and tested it and it works as it should.

I just wanted to mention that the definition of the function coalesce()
must come before:

>          newdata = coalesce( data )
>          for row in newdata:

because a function must be defined before we try to call it and pass
data to it, so I placed it just before that.

Also I have changed the data insertion to be:

		# Insert a new record for every visit, even if the visitor already exists.
		cur.execute('''INSERT INTO visitors (counterID, host, city, useros, browser, ref, lastvisit)
					   VALUES (%s, %s, %s, %s, %s, %s, %s)''',
					   (cID, host, city, useros, browser, ref, lastvisit) )

I removed the 'ON DUPLICATE KEY UPDATE' clause I had and also removed the
unique index (counterID, host),

so that every time a visitor comes to my website a new database entry is
created, even if the hostname is the same as before.

I almost understand your code, but this part is not so clear to me:

if key not in seen:

     seen[key] = len( newdata ) - 1 # Save index (for 'newdata') of this row.
else:    # This row is a duplicate row with a different referrer & visit time.
     rowindex = seen[key]
     newdata[rowindex][4].append( ref )
     newdata[rowindex][5] += hits
     newdata[rowindex][6].append( visit )


I could not have written this myself at all.





#57738

From: Dave Angel <davea@davea.name>
Date: 2013-10-27 11:52 +0000
Message-ID: <mailman.1646.1382874775.18130.python-list@python.org>
In reply to: #57727
On 27/10/2013 03:31, Nick the Gr33k wrote:

> On 27/10/2013 6:00 AM, rurpy@yahoo.com wrote:

     <snip>
>
> I read it thoroughly and tested it and it works as it should.
>
> I just wanted to mention that the definition of the function coalesce()
> must come before:
>
>>          newdata = coalesce( data )
>>          for row in newdata:
>
> because a function must be defined before we try to call it and pass
> data to it, so I placed it just before that.
>

I found the above two lines in the function main(), in Rurpy's code.  If
that's the place you were talking about, the comment about order does not
apply.

If you are calling from one function into another (in this case from
main() into coalesce()), and the two functions are defined at global
scope, then the functions may appear in either order.

It's only when you're calling a function from global scope, or sometimes
when nesting functions inside each other, that the order of
definition of the two functions matters.  Naturally, the call to main()
from top-level has to be after both functions are defined.
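
A minimal sketch of the rule described above. The function bodies are stand-ins written for this demonstration, and too_early is a made-up name:

```python
def main():
    # 'coalesce' is looked up when this line runs, not when main() is defined,
    # so it does not matter that coalesce is defined further down the file.
    return coalesce([1, 2, 3])

def coalesce(data):
    # Stand-in body for the demonstration only.
    return list(data)

result = main()   # fine: both functions exist by the time main() is called
print(result)

# Calling a function before its def statement has run does fail.
try:
    too_early()
    outcome = 'called'
except NameError:
    outcome = 'NameError'

def too_early():
    return 'defined too late for the call above'

print(outcome)
```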

-- 
DaveA



#57785

From: rurpy@yahoo.com
Date: 2013-10-27 22:22 -0700
Message-ID: <4f80dce1-c878-48ae-adc5-5a8230fd55a9@googlegroups.com>
In reply to: #57727
On 10/27/2013 01:31 AM, Nick the Gr33k wrote:
> On 27/10/2013 6:00 AM, rurpy@yahoo.com wrote:
>[...] 
[following quote lightly edited for clarity] 
> I almost understand your code, but this part is not so clear to me:
>
  key = host, city, useros, browser
> if key not in seen:
       newdata.append( [host, city, useros, browser, [ref], hits, [visit]] )
>      seen[key] = len( newdata ) - 1 # Save index (for 'newdata') of this row.
> else:    # This row is a duplicate row with a different referrer & visit time.
>      rowindex = seen[key]
>      newdata[rowindex][4].append( ref )
>      newdata[rowindex][5] += hits
>      newdata[rowindex][6].append( visit )

I'm not sure exactly what part is not clear to you so I'll give 
you a very long-winded explanation and you can ignore any parts 
that are already obvious to you.

The code above is inside a loop that looks at each row in <data>.

In <data> there can be several rows for the same visitor, where you
define a visitor as a unique combination of <host>, <city>, <useros> 
and <browser>.

What you want to do is combine all of the rows that are for the same
visitor into one row.  That one row, instead of having a single value
for <ref> and <lastvisit> will have lists of all the <ref>s and 
<lastvisit>s from all the rows that have the same visitor value.

So first, for each row, we set <key> to a tuple that identifies the
visitor.  (Actually, I should have named that variable "visitor"
instead of "key".)  Then we use an ordinary python dictionary <seen>
to record each visitor as we see them.  Remember that a dictionary
can use a tuple as a key (unlike Perl, where a hash key has to be a
string).
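
A short sketch of tuples as dictionary keys. The values are taken from the sample data in this thread; the variable names same_visitor, other_visitor and list_key_ok are invented for the demonstration:

```python
seen = {}
key = ('mail14.ess.barracuda.com', 'Άγνωστη Πόλη', 'Windows', 'Explorer')
seen[key] = 0

# An equal tuple built separately finds the same entry.
same_visitor = ('mail14.ess.barracuda.com', 'Άγνωστη Πόλη', 'Windows', 'Explorer')
other_visitor = ('mail0.ess.barracuda.com', 'Άγνωστη Πόλη', 'Windows', 'Explorer')

print(same_visitor in seen)    # membership test is by value, not identity
print(other_visitor in seen)
print(seen[same_visitor])

# A list cannot be used as a key because it is mutable (unhashable).
try:
    seen[list(key)] = 1
    list_key_ok = True
except TypeError:
    list_key_ok = False
print(list_key_ok)
```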

For each row we look in the dictionary <seen> to see if this visitor
is a new one that we haven't seen before.  If we haven't seen them
before we create a new row in <newdata> for them that is a copy of
the row in <data> except we change the <ref> and <lastvisit> fields
from single values to lists.  We also add an entry to <seen> whose
key is the visitor, and whose value is the index of the visitor's row
in <newdata>.

If the visitor *was* seen before (because we find an entry for the
visitor in <seen>), then the value of that entry tells us the index
of that visitor's row in <newdata> and instead of adding a new row
to <newdata> we update the visitor's row that is already there.

Maybe it's easier to see what is happening by looking at how the 
code actually runs.

Suppose the data you get from your database is

    data = [['mail14.ess.barracuda.com',           'Άγνωστη Πόλη', 'Windows', 'Explorer', 'Direct Hit',           1, 'Σάββατο 26 Οκτ, 18:49'],
            ['209.133.77.165.T01713-01.above.net', 'Άγνωστη Πόλη', 'Windows', 'Explorer', 'Direct Hit',           1, 'Σάββατο 26 Οκτ, 18:59'],
            ['mail14.ess.barracuda.com',           'Άγνωστη Πόλη', 'Windows', 'Explorer', 'http://superhost.gr/', 1, 'Σάββατο 26 Οκτ, 18:48'],
           ]

When the first row of <data> is processed, <key> will be
set to the 4-tuple:

  ('mail14.ess.barracuda.com','Άγνωστη Πόλη','Windows','Explorer').

Then "if key not in seen" is executed.  This looks in
dictionary <seen> to see if there is an entry in it with a key that
matches the tuple above.  Since <seen> is still an empty dictionary,
<key> is not in the dictionary because there is nothing in the
dictionary and "if key not in seen" is true.

So the first branch of the if statement runs:

       newdata.append( [host, city, useros, browser, [ref], hits, [visit]] )
       seen[key] = len( newdata ) - 1 # Save index (for 'newdata') of this row.

Now, <newdata> contains 1 row:

    [ 'mail14.ess.barracuda.com','Άγνωστη Πόλη','Windows','Explorer', ['Direct Hit'], 1, ['Σάββατο 26 Οκτ, 18:49'] ]

And, <seen> contains:

  { ('mail14.ess.barracuda.com','Άγνωστη Πόλη','Windows','Explorer'): 0 }

Note that the 0 value in the <seen> dictionary is the index of the
corresponding row in <newdata>.

When the second row of <data> is processed, the same thing happens.
<key> is the tuple

  ('209.133.77.165.T01713-01.above.net','Άγνωστη Πόλη','Windows','Explorer')

but since the only key in <seen> is 

  ('mail14.ess.barracuda.com','Άγνωστη Πόλη','Windows','Explorer')

again the "not in" branch is executed.  When it runs this time it
adds another row to <newdata> so <newdata> now looks like:

    [ ['mail14.ess.barracuda.com','Άγνωστη Πόλη','Windows','Explorer', ['Direct Hit'], 1, ['Σάββατο 26 Οκτ, 18:49']],
      ['209.133.77.165.T01713-01.above.net','Άγνωστη Πόλη','Windows','Explorer', ['Direct Hit'], 1, ['Σάββατο 26 Οκτ, 18:59']], ]

and adds another entry to <seen> so that <seen> is now:

  { ('mail14.ess.barracuda.com','Άγνωστη Πόλη','Windows','Explorer'): 0, 
    ('209.133.77.165.T01713-01.above.net','Άγνωστη Πόλη','Windows','Explorer'): 1 }

Again, the 1 is the index of the corresponding row in <newdata>.

Now the third row of <data> is processed.  <key> is set to

  ('mail14.ess.barracuda.com','Άγνωστη Πόλη','Windows','Explorer')

This time when "if key not in seen" is executed, it is false because
that key *is* in seen; it was added when the first <data> row was
processed (look at <seen> above).  So the statements

       rowindex = seen[key]
       newdata[rowindex][4].append( ref )
       newdata[rowindex][5] += hits
       newdata[rowindex][6].append( visit )

are executed.  These statements will update the existing row for
visitor <key> in <newdata>.  The value of <key> is

  ('mail14.ess.barracuda.com','Άγνωστη Πόλη','Windows','Explorer')

and seen[key] is 0, so <rowindex> is set to 0.  newdata[0] is
the row in <newdata> for the same visitor.  The next three lines
update that row: the current <data> row's <ref> is appended to
newdata[0]'s refs list, the hits field of newdata[0] is incremented
by the current <data> row's hits field, and the current row's
<lastvisit> time is appended to newdata[0]'s visits list.
Afterwards, <newdata> looks like:

    [ ['mail14.ess.barracuda.com','Άγνωστη Πόλη','Windows','Explorer', ['Direct Hit',
                                                                        'http://superhost.gr/'], 2, ['Σάββατο 26 Οκτ, 18:49',
                                                                                                     'Σάββατο 26 Οκτ, 18:48']],

      ['209.133.77.165.T01713-01.above.net','Άγνωστη Πόλη','Windows','Explorer', ['Direct Hit'], 1, ['Σάββατο 26 Οκτ, 18:59']], ]

And so on for the rest of the data.  When a new visitor is seen
a row is added to <newdata> and the visitor (identified as the
tuple <key>) is saved in <seen> along with the index of that
visitor's row in <newdata>.  If the same visitor is seen again
later in <data>, the corresponding row in <newdata> is updated
rather than adding a new row to <newdata>.
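
The walk-through above can be reproduced by running the function on the same three rows. This is a sketch under a few assumptions: the function body is the one from this thread, the print call inside the loop is added only to show the state after each row, hits are integers, and the visit times are left as strings as in the test data.

```python
def coalesce(data):
    newdata = []
    seen = {}
    for host, city, useros, browser, ref, hits, visit in data:
        key = host, city, useros, browser
        if key not in seen:
            newdata.append([host, city, useros, browser, [ref], hits, [visit]])
            seen[key] = len(newdata) - 1   # index of this visitor's row
        else:
            rowindex = seen[key]
            newdata[rowindex][4].append(ref)
            newdata[rowindex][5] += hits
            newdata[rowindex][6].append(visit)
        # Added for the demonstration: show the state after each input row.
        print(len(newdata), 'row(s) in newdata; indexes in seen:', sorted(seen.values()))
    return newdata

data = [
    ('mail14.ess.barracuda.com', 'Άγνωστη Πόλη', 'Windows', 'Explorer',
     'Direct Hit', 1, 'Σάββατο 26 Οκτ, 18:49'),
    ('209.133.77.165.T01713-01.above.net', 'Άγνωστη Πόλη', 'Windows', 'Explorer',
     'Direct Hit', 1, 'Σάββατο 26 Οκτ, 18:59'),
    ('mail14.ess.barracuda.com', 'Άγνωστη Πόλη', 'Windows', 'Explorer',
     'http://superhost.gr/', 1, 'Σάββατο 26 Οκτ, 18:48'),
]

newdata = coalesce(data)
for row in newdata:
    print(row)
```

The third input row does not add a row to the result; it only extends the lists and the hit count of the first one.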

Did that help make it clearer?  It is a lot easier to write
code than to explain it :-) 
