| Started by | Jay T <jt11378@gmail.com> |
|---|---|
| First post | 2015-02-19 18:42 -0800 |
| Last post | 2015-02-20 15:27 +0100 |
| Articles | 4 — 3 participants |
Python - parsing nested information and provide it in proper format from log file Jay T <jt11378@gmail.com> - 2015-02-19 18:42 -0800
Re: Python - parsing nested information and provide it in proper format from log file Peter Otten <__peter__@web.de> - 2015-02-20 14:10 +0100
Re: Python - parsing nested information and provide it in proper format from log file jt11380@gmail.com - 2015-02-20 05:31 -0800
Re: Python - parsing nested information and provide it in proper format from log file Peter Otten <__peter__@web.de> - 2015-02-20 15:27 +0100
| From | Jay T <jt11378@gmail.com> |
|---|---|
| Date | 2015-02-19 18:42 -0800 |
| Subject | Python - parsing nested information and provide it in proper format from log file |
| Message-ID | <0097dab0-301c-42e1-a6be-b21eb5356567@googlegroups.com> |
I have a log file with nested data that I want to filter, reporting totals for each student.

Here is my log file sample:

    Student name is ABC
    Student age is 12
    student was late
    student was late
    student was late
    Student name is DEF
    student age is 13
    student was late
    student was late

I want to parse it and show the student name, the student age, and the number of times the student was late, e.g.:

    Name Age TotalCount
    ABC 12 3
    DEF 13 2

Please help me with a solution; I would be really grateful.

thanks,
Jt
| From | Peter Otten <__peter__@web.de> |
|---|---|
| Date | 2015-02-20 14:10 +0100 |
| Message-ID | <mailman.18920.1424437896.18130.python-list@python.org> |
| In reply to | #85939 |
Jay T wrote:
> have some log file which has nested data which i want to filter and
> provide specific for student with total counts
>
> Here is my log file sample:
> Student name is ABC
> Student age is 12
> student was late
> student was late
> student was late
> Student name is DEF
> student age is 13
> student was late
> student was late
>
> i want to parse and show data as Student name, student age , number of
> counts how many times student was late e:g Name Age TotalCount
> ABC 12 3
> DEF 13 2
>
> Please help me with solution that will be really grateful.
What have you tried? Please show us some code.
The basic idea would be to iterate over the lines and split the current line
into words.
If the second word is "name" and it's not the first iteration print the
student's name, age, and was_late count. Then set the name variable to the
new name and reset age and was_late to 0. To detect the first iteration you
can set
    name = None
before you enter the loop and then check for that value before printing:
    if name is not None:
        ... # print student data
If the second word is "age" convert the 4th word to integer and set the age
variable.
If the second word is "was" increment the was_late counter.
Remember that when the loop ends and the file was not empty you have one
more student's data to print.
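The outline above can be sketched roughly as follows. This is a minimal sketch, assuming Python 3 and the line format shown in the sample. The function name `summarize_students`, the choice to return a list of tuples instead of printing, and the skipping of blank lines are illustrative assumptions, not part of the original post.

```python
def summarize_students(lines):
    """Collect (name, age, was_late) tuples from an iterable of log lines.

    Hypothetical helper: the name and the return format are assumptions.
    """
    records = []
    name = None          # None means "no student seen yet"
    age = 0
    was_late = 0
    for line in lines:
        words = line.split()
        if len(words) < 2:
            continue     # assumption: silently skip blank or too-short lines
        keyword = words[1]
        if keyword == "name":
            if name is not None:
                # A new student starts: store the previous one first.
                records.append((name, age, was_late))
            name = " ".join(words[3:])
            age = 0
            was_late = 0
        elif keyword == "age":
            age = int(words[3])
        elif keyword == "was":
            was_late += 1
    if name is not None:
        # The last student's data is still pending when the loop ends.
        records.append((name, age, was_late))
    return records


sample = """\
Student name is ABC
Student age is 12
student was late
student was late
student was late
Student name is DEF
student age is 13
student was late
student was late
"""

for name, age, was_late in summarize_students(sample.splitlines()):
    print(name, age, was_late)
```

Returning the records instead of printing them inside the loop keeps the parsing separate from the output, which makes the function easier to check on its own.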
| From | jt11380@gmail.com |
|---|---|
| Date | 2015-02-20 05:31 -0800 |
| Message-ID | <03cf1f25-b31b-4c48-96c9-be86a2ecdbc8@googlegroups.com> |
| In reply to | #85968 |
I tried to implement the code below and got stuck on how to do a nested loop to count, instead of doing another round of logic and parsing:
    import re
    def GetName(input_string):
        myName=input_string.split()
        myName1= myName[1]
        return myName1
    def GetAge(input_string):
        myAge=input_string.split()
        myAge1= myAge[2]
        return myAge1
    file = open('mylogfile')
    log_data = file.readlines()
    print 'entered'
    for eachline in log_data:
        input_string = eachline
        if 'name' in input_string:
            sometextval = GetName(input_string)
            print "name", sometextval
        if 'Age' in input_string:
            sometextval2 = GetAge(input_string)
            print "Age", sometextval2
Now I am stuck on getting the total_late count: it belongs to a name and age, so how do I write logic that counts it as part of that group?
Any help would be appreciated.
-J
| From | Peter Otten <__peter__@web.de> |
|---|---|
| Date | 2015-02-20 15:27 +0100 |
| Message-ID | <mailman.18922.1424442491.18130.python-list@python.org> |
| In reply to | #85970 |
jt11380@gmail.com wrote:
> Now I am stuck on getting the total_late count: it belongs to a name and
> age, so how do I write logic that counts it as part of that group?
>
> Any help would be appreciated.
Try to write code that does what I describe in my outline. Initialise name
before the loop and dump the collected data when you encounter a new name.
    name = None
    total_late = 0
    age = "unknown"
    with open("student.txt") as instream:
        for line in instream:
            words = line.split()
            if words[1] == "name":
                if name is not None:
                    print name, age, total_late
                name = " ".join(words[3:])
                age = "unknown"
                total_late = 0
            elif words[1] == "was":
                total_late += 1
            elif words[1] == "age":
                age = int(words[3])
            else:
                print "don't know what to do with line %r" %line
    if name is not None:
        print name, age, total_late
Checking whole words has the advantage that there will be no match if the
string "name" or "age" is part of the student's name.
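To illustrate the point about whole words, here is a small sketch (Python 3; the student name "Sage" is a made-up example) that compares a substring test with a test on the second word of the line:

```python
line = "Student name is Sage"   # hypothetical name that contains "age"

# Substring test: matches, although this is not an age line.
substring_match = "age" in line

# Whole-word test: the second word is "name", so there is no false match.
words = line.split()
word_match = words[1] == "age"

print(substring_match, word_match)   # prints: True False
```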
> I tried to implement the code below and got stuck on how to do a nested loop
> to count, instead of doing another round of logic and parsing:
>
> import re
> def GetName(input_string):
>     myName=input_string.split()
>     myName1= myName[1]
That's the wrong index: after split() the name is at index 3; index 1 is the word "name".
>     return myName1
> def GetAge(input_string):
>     myAge=input_string.split()
>     myAge1= myAge[2]
That's the wrong index, too: the age is at index 3; index 2 is the word "is".
>     return myAge1
In general you should test your functions independently from the whole
program. That way you can build on known-good components and thus reduce the
area where to look for remaining bugs.
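As a sketch of what checking the helpers in isolation could look like (Python 3; the names `get_name` and `get_age` and the use of plain `assert` statements are assumptions for illustration, with the value taken from index 3 as the sample lines require):

```python
def get_name(input_string):
    # "Student name is ABC".split() -> ['Student', 'name', 'is', 'ABC']
    # Join everything from index 3 on, so multi-word names survive.
    return " ".join(input_string.split()[3:])


def get_age(input_string):
    # "Student age is 12".split() -> ['Student', 'age', 'is', '12']
    return int(input_string.split()[3])


# Quick checks that run before the helpers are used in the main loop.
assert get_name("Student name is ABC") == "ABC"
assert get_name("Student name is Mary Ann") == "Mary Ann"
assert get_age("student age is 13") == 13
print("all checks passed")
```

If one of the assertions fails, the bug is known to be in that helper and not in the loop that calls it.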
> file = open('mylogfile')
> log_data = file.readlines()
The file is probably short enough that it doesn't matter here, but iterating
over the file directly is a good habit to get into. Example:
    with open("mylogfile") as log_data:
        for eachline in log_data:
            ...
> print 'entered'
> for eachline in log_data:
>     input_string = eachline
>     if 'name' in input_string:
>         sometextval = GetName(input_string)
>         print "name", sometextval
>     if 'Age' in input_string:
This test is problematic because Python takes case into account when
comparing strings:
    >>> "age" == "AGE"
    False
    >>> s = "RAGE"
    >>> "age" in s
    False
If case isn't consistent you should convert the string to lowercase:
    >>> "age" == "AGE".lower()
    True
    >>> "age" in s.lower()
    True
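Combining this with the whole-word check, one possible sketch (Python 3; the helper name `keyword_of` is hypothetical) lowercases only the second word before comparing. That way "Age", "AGE" and "age" are treated alike, while a name such as "RAGE" elsewhere on the line is ignored:

```python
def keyword_of(line):
    """Return the lowercased second word of a line, or None if there is none."""
    words = line.split()
    return words[1].lower() if len(words) > 1 else None


for line in ["Student Age is 12", "student AGE is 13", "Student name is RAGE", ""]:
    print(repr(line), "->", keyword_of(line))
```

The loop can then branch on `keyword_of(line) == "age"` and so on, regardless of how the log capitalises the keyword.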
>         sometextval2 = GetAge(input_string)
>         print "Age", sometextval2