

Groups > comp.lang.python > #103570 > unrolled thread

Lookahead while doing: for line in fh.readlines():

Started by: "Veek. M" <vek.m1234@gmail.com>
First post: 2016-02-27 15:09 +0530
Last post: 2016-03-05 09:01 +0530
Articles: 5 (3 participants)



Contents

  Lookahead while doing: for line in fh.readlines(): "Veek. M" <vek.m1234@gmail.com> - 2016-02-27 15:09 +0530
    Re: Lookahead while doing: for line in fh.readlines(): Terry Reedy <tjreedy@udel.edu> - 2016-02-27 06:46 -0500
      Re: Lookahead while doing: for line in fh.readlines(): "Veek. M" <vek.m1234@gmail.com> - 2016-03-04 18:34 +0530
        Re: Lookahead while doing: for line in fh.readlines(): MRAB <python@mrabarnett.plus.com> - 2016-03-04 18:36 +0000
          Re: Lookahead while doing: for line in fh.readlines(): "Veek. M" <vek.m1234@gmail.com> - 2016-03-05 09:01 +0530

#103570 — Lookahead while doing: for line in fh.readlines():

From: "Veek. M" <vek.m1234@gmail.com>
Date: 2016-02-27 15:09 +0530
Subject: Lookahead while doing: for line in fh.readlines():
Message-ID: <narqnp$4bm$1@dont-email.me>

I want to do something like:

#!/usr/bin/env python3

fh = open('/etc/motd')
for line in fh.readlines():
    print(fh.tell())

Why doesn't this work as expected? I thought fh.readlines() would return a 
generator object and fh.tell() would start at 0.

Instead I get the final count repeated for the number of lines.

What I'm trying to do is lookahead:
#!whatever

fh = open(whatever)
for line in fh.readlines():
    x = fh.tell()
    temp = fh.readline()
    fh.seek(x)



#103578

From: Terry Reedy <tjreedy@udel.edu>
Date: 2016-02-27 06:46 -0500
Message-ID: <mailman.170.1456573604.20994.python-list@python.org>
In reply to: #103570

On 2/27/2016 4:39 AM, Veek. M wrote:
> I want to do something like:
>
> #!/usr/bin/env python3
>
> fh = open('/etc/motd')
> for line in fh.readlines():
>      print(fh.tell())
>
> why doesn't this work as expected.. fh.readlines() should return a
> generator object and fh.tell() ought to start at 0 first.

Not after you have already read some data.  readlines() reads the entire 
file and splits it into lines.  readline() reads at least a single block: 
reading a single byte or character at a time looking for \n would be 
too slow, so even after readline(), the file pointer will be somewhere 
past the end of the last line returned.
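
A minimal sketch of the readlines() point, assuming a small throwaway file in place of /etc/motd (the helper name positions_after_readlines and the sample contents are made up for illustration). It opens the file in binary mode so that tell() is a plain byte offset, and checks that readlines() hands back a list and leaves the position at the end of the file:

```python
import os
import tempfile


def positions_after_readlines(path):
    """Return (type name of the readlines() result, tell() afterwards, file size)."""
    with open(path, "rb") as fh:      # binary mode: tell() is a plain byte offset
        lines = fh.readlines()        # the whole file is read here, up front
        return type(lines).__name__, fh.tell(), os.path.getsize(path)


# Hypothetical sample file standing in for /etc/motd.
with tempfile.NamedTemporaryFile("wb", suffix=".txt", delete=False) as tmp:
    tmp.write(b"first\nsecond\nthird\n")
    sample_path = tmp.name

kind, pos, size = positions_after_readlines(sample_path)
print(kind, pos, size)
os.remove(sample_path)
```

So by the time the first loop iteration runs there is nothing left to read, which is why every tell() inside the loop reports the same number.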

> Instead i get the final count repeated for the number of lines.
>
> What i'm trying to do is lookahead:
> #!whatever
>
> fh = open(whatever)
> for line in fh.readlines():
>      x = fh.tell()
>      temp = fh.readline()
>      fh.seek(x)
>


-- 
Terry Jan Reedy



#104032

From: "Veek. M" <vek.m1234@gmail.com>
Date: 2016-03-04 18:34 +0530
Message-ID: <nbc0uf$b3m$1@dont-email.me>
In reply to: #103578

I get that readlines() would slurp the whole file for efficiency 
reasons. Why doesn't fh.seek() work, though? Object 'fh' is a data 
structure wrapping the OS file descriptor, similar to FILE in C:
<class '_io.TextIOWrapper'>

So if seek works in C, how come it doesn't work in Python with respect to 
readlines(), which is just a method? What prevents seek from functioning 
after readlines()?

fh.tell() works at the line level, and fh.readline() works with 
fh.seek(0).
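
For comparison, here is a rough sketch showing that tell() and seek() do behave as they would in C when the lines are pulled one at a time with readline() rather than all at once. The names peek_line and lines_with_peek and the sample file are hypothetical, not anything from the standard library:

```python
import os
import tempfile


def peek_line(fh):
    """Return the next line without consuming it, using tell()/seek()."""
    pos = fh.tell()          # remember where the next line starts
    line = fh.readline()     # read ahead
    fh.seek(pos)             # rewind so the caller sees that line again
    return line


def lines_with_peek(path):
    """Collect (current line, following line) pairs; '' marks end of file."""
    pairs = []
    with open(path) as fh:
        while True:
            line = fh.readline()   # one line at a time, not readlines()
            if not line:
                break
            pairs.append((line, peek_line(fh)))
    return pairs


# Hypothetical sample file standing in for /etc/motd.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("first\nsecond\nthird\n")
    sample_path = tmp.name

pairs = lines_with_peek(sample_path)
print(pairs)
os.remove(sample_path)
```

The difference from the original loop is only in when the reading happens: here the position moves forward one line per readline() call, so saving and restoring it is meaningful.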



#104046

From: MRAB <python@mrabarnett.plus.com>
Date: 2016-03-04 18:36 +0000
Message-ID: <mailman.195.1457116569.20602.python-list@python.org>
In reply to: #104032

On 2016-03-04 13:04, Veek. M wrote:
> I get that readlines() would slurp the whole file for efficiency
> reasons. Why doesn't fh.seek() work though. Object 'fh' is a data
> structure for the OS file descriptor similar to FILE in C.
> <class '_io.TextIOWrapper'>
>
> So if seek works in C, how come it doesn't work in python wrt
> readlines() which is just a method. What obviates the functioning of
> seek wrt readlines()?
>
> fh.tell() works at the line level.. and fh.readline() works with
> fh.seek(0)
>

fh.readlines() reads the entire file.

At this point, it's at the end of the file.

The 'body' of the 'for' loop is then executed.

fh.tell() returns the position of the end of the file because it's 
at the end of the file.

fh.readline() returns an empty string because it's at the end of the file.

fh.seek(x) seeks to the end of the file, which is where it already is.

Is that clearer?
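
The sequence above can be traced with a short sketch. It assumes a small temporary file in place of the original one, and the function name trace_original_loop is invented for the example; the loop body is the same tell()/readline()/seek() sequence from the question:

```python
import os
import tempfile


def trace_original_loop(path):
    """Run the loop from the question and record what each iteration sees."""
    trace = []
    with open(path) as fh:
        for line in fh.readlines():    # the list is built before the loop body runs
            x = fh.tell()              # already the end-of-file position
            temp = fh.readline()       # nothing left, so this is ''
            fh.seek(x)                 # moves to where the file already is
            trace.append((line, x, temp))
    return trace


# Hypothetical sample file standing in for the real one.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("first\nsecond\nthird\n")
    sample_path = tmp.name

trace = trace_original_loop(sample_path)
for line, x, temp in trace:
    print(repr(line), x, repr(temp))
os.remove(sample_path)
```

Each iteration gets a different line from the list, but the same saved position and the same empty lookahead.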



#104078

From: "Veek. M" <vek.m1234@gmail.com>
Date: 2016-03-05 09:01 +0530
Message-ID: <nbdjob$6qd$1@dont-email.me>
In reply to: #104046

MRAB wrote:

> fh.readlines() reads the entire file.
> 
> At this point, it's at the end of the file.
> 
> The 'body' of the 'for' loop is then executed.
> 
> fh.tell() returns the position of the end of the file because it's
> at the end of the file.
> 
> fh.readline() returns an empty string because it's at the end of the
> file.
> 
> fh.seek(x) seeks to the end of the file, which is where it already is.
> 
> Is that clearer?
Ah, right, got it. Sorry for being thick. readlines() slurps the whole 
darn thing, so the file pointer is at EOF, and within the loop body 
I'm just saving that EOF position and restoring it every time.
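
For the lookahead itself, one possible approach (not something proposed in the thread) is to skip tell() and seek() entirely and keep one line buffered. A minimal sketch, where with_lookahead is a made-up helper name and io.StringIO stands in for a real open file:

```python
import io


def with_lookahead(lines):
    """Yield (current, next) pairs from any iterable of lines.

    The final pair has None as its second element.
    """
    it = iter(lines)
    try:
        current = next(it)
    except StopIteration:
        return                    # empty input: nothing to yield
    for following in it:
        yield current, following
        current = following
    yield current, None           # last line has no successor


# io.StringIO is used here only so the example runs without a file on disk.
fh = io.StringIO("first\nsecond\nthird\n")
pairs = list(with_lookahead(fh))
print(pairs)
```

This works on any iterable, including a file object iterated directly, so the file is read lazily and never needs to be seekable.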
