

Groups > comp.lang.python > #12819 > unrolled thread

Relative seeks on string IO

Started by: Pierre Quentel <pierre.quentel@gmail.com>
First post: 2011-09-06 00:18 -0700
Last post: 2011-09-07 00:00 -0700
Articles: 3 (2 participants)



Contents

  Relative seeks on string IO Pierre Quentel <pierre.quentel@gmail.com> - 2011-09-06 00:18 -0700
    Re: Relative seeks on string IO Terry Reedy <tjreedy@udel.edu> - 2011-09-06 16:49 -0400
      Re: Relative seeks on string IO Pierre Quentel <pierre.quentel@gmail.com> - 2011-09-07 00:00 -0700

#12819 — Relative seeks on string IO

From: Pierre Quentel <pierre.quentel@gmail.com>
Date: 2011-09-06 00:18 -0700
Subject: Relative seeks on string IO
Message-ID: <48a795dc-992f-4665-9ada-60b2807fc3b8@u19g2000vbm.googlegroups.com>

Hi,

I am wondering why relative seeks fail on string IO in Python 3.2

Example :

    from io import StringIO
    txt = StringIO('Favourite Worst Nightmare')
    txt.seek(8) # no problem with absolute seek

but

    txt.seek(2,1) # 2 characters from current position

raises "IOError: Can't do nonzero cur-relative seeks" (tested with
Python 3.2.2 on Windows XP).

A seek relative to the end of the string IO raises the same IOError.
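
A minimal sketch of the behaviour described above, assuming CPython 3.3 or later, where IOError is an alias of OSError. It suggests that a zero offset is accepted for relative seeks and that only nonzero offsets are rejected. The `results` dictionary is just an illustrative name.

```python
from io import StringIO

txt = StringIO('Favourite Worst Nightmare')
txt.seek(8)      # absolute seek: accepted
txt.seek(0, 1)   # zero offset from the current position: accepted

# Try a nonzero offset relative to the current position (whence=1)
# and relative to the end (whence=2), recording what happens.
results = {}
for whence, offset in ((1, 2), (2, -2)):
    try:
        txt.seek(offset, whence)
        results[whence] = 'ok'
    except OSError as exc:
        results[whence] = 'OSError: %s' % exc

print(results)
```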

However, it is not difficult to write a class that simulates
relative seeks on strings:

====================
class FakeIO:

    def __init__(self,value):
        self.value = value
        self.pos = 0

    def read(self,nb=None):
        # return the requested slice and advance the position past it
        if nb is None:
            data = self.value[self.pos:]
        else:
            data = self.value[self.pos:self.pos+nb]
        self.pos += len(data)
        return data

    def seek(self,offset,whence=0):
        if whence==0:
            self.pos = offset
        elif whence==1: # relative to current position
            self.pos += offset
        elif whence==2: # relative to end of string
            self.pos = len(self.value)+offset

txt = FakeIO('Favourite Worst Nightmare')
txt.seek(8)
txt.seek(2,1)
txt.seek(-8,2)
=====================
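
The same effect can probably be had on a real StringIO by turning each relative seek into an absolute one. This is only a sketch: `seek_relative` is a hypothetical helper, not part of the io module, and it assumes that `tell()` on a StringIO returns a character index, which holds in CPython but is documented only as an opaque number for text streams in general.

```python
from io import StringIO

def seek_relative(stream, offset, whence=0):
    """Hypothetical helper: emulate relative seeks with absolute ones."""
    if whence == 0:
        target = offset
    elif whence == 1:
        # relative to the current position (assumes tell() is a character index)
        target = stream.tell() + offset
    elif whence == 2:
        # relative to the end of the buffer
        target = len(stream.getvalue()) + offset
    else:
        raise ValueError('invalid whence: %r' % (whence,))
    return stream.seek(target)

txt = StringIO('Favourite Worst Nightmare')
seek_relative(txt, 8)
seek_relative(txt, 2, 1)    # position 10
word = txt.read(5)
seek_relative(txt, -9, 2)   # 9 characters before the end
tail = txt.read()
print(word, tail)
```

Note that `getvalue()` copies the whole buffer, so the end-relative case costs time proportional to the buffer size.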

Is there any reason why relative seeks on string IO are not allowed in
Python 3.2, or is it a bug that could be fixed in a future version?

- Pierre



#12849

From: Terry Reedy <tjreedy@udel.edu>
Date: 2011-09-06 16:49 -0400
Message-ID: <mailman.810.1315342227.27778.python-list@python.org>
In reply to: #12819

On 9/6/2011 3:18 AM, Pierre Quentel wrote:

> I am wondering why relative seeks fail on string IO in Python 3.2

Good question.

>      from io import StringIO
>      txt = StringIO('Favourite Worst Nightmare')
>      txt.seek(8) # no problem with absolute seek

Please post code without non-code indents, like so:

from io import StringIO
txt = StringIO('Favourite Worst Nightmare')
txt.seek(8,0) # no problem with absolute seek
txt.seek(0,1) #  0 characters from current position ok, and useless
txt.seek(-2,2) # end-relative gives error message for cur-relative

so someone can copy and paste without deleting indents.
I verified with 3.2.2 on Win7. I am curious what 2.7 and 3.1 do.

What system are you using? Does it have a narrow or wide unicode build?
(I.e., what is the value of sys.maxunicode?)
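
For reference, the value can be checked as sketched below. Narrow builds report 65535 and wide builds report 1114111. From Python 3.3 onward (PEP 393) the narrow/wide distinction no longer exists and the value is always 1114111. The `build` variable is only an illustrative name.

```python
import sys

# 0xFFFF (65535) indicates a narrow build, 0x10FFFF (1114111) a wide one.
build = 'narrow' if sys.maxunicode == 0xFFFF else 'wide'
print(sys.maxunicode, build)
```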

>      txt.seek(2,1) # 2 characters from current position
>
> raises "IOError: Can't do nonzero cur-relative seeks" (tested with
> Python3.2.2 on WindowsXP)
>
> A seek relative to the end of the string IO raises the same IOError

> Is there any reason why relative seeks on string IO are not allowed in
> Python3.2, or is it a bug that could be fixed in a next version ?

Since StringIO seeks by fixed-size code units (depending on the build), 
making seeking from the current position and end trivial, I consider 
this a behavior bug. At minimum, it is a doc bug. I opened
http://bugs.python.org/issue12922

As noted there, I suspect the limitation is inherited from TextIOBase.
But I challenge that it should be.

I was somewhat surprised that seeking (from the start) is not limited to 
the existing text. Seeking past the end fills in with nulls. (They are 
typically a nuisance though.)

from io import StringIO
txt = StringIO('0123456789')
txt.seek(15,0) # no problem with absolute seek
txt.write('xxx')
s  = txt.getvalue()
print(ord(s[12]))
# 0
-- 
Terry Jan Reedy



#12886

From: Pierre Quentel <pierre.quentel@gmail.com>
Date: 2011-09-07 00:00 -0700
Message-ID: <30c6f7f0-b846-47a9-be49-e234a8bb8c16@e9g2000yqb.googlegroups.com>
In reply to: #12849

>
> Please post code without non-code indents, like so:
>
Sorry about that. After the line "Example :" I indented the next
block, out of habit ;-)
>
> What system are you using? Does it have a narrow or wide unicode build?
> (IE, what is the value of sys.maxunicode?)
>
I use Windows XP Pro, version 2002, SP3. sys.maxunicode is 65535

I have the same behaviour with 3.1.1 and with 2.7

I don't understand why variable-sized code units would cause problems.
On text file objects, read(nb) reads nb characters, regardless of the
number of bytes used to encode them, and tell() returns a position in
the text stream just after the last (Unicode) character read.
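
A small sketch of the read(nb) point, assuming a UTF-8 file in a temporary directory (the file name and sample text are made up for illustration). Each Korean syllable here takes three bytes in UTF-8, yet read(3) still returns three characters.

```python
import os
import tempfile

text = "ab" + "머니" + "cd"   # 6 characters, 10 bytes in UTF-8

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.txt")
    with open(path, "w", encoding="utf-8") as out:
        out.write(text)
    size_in_bytes = os.path.getsize(path)
    with open(path, encoding="utf-8") as fobj:
        chunk = fobj.read(3)   # counts characters, not bytes

print(size_in_bytes, repr(chunk))
```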

As for StringIO, a wrapper around file objects can simulate correct
behaviour for relative seeks:

====================
txt = "abcdef"
txt += "تخصيص هذه الطبعة"
txt += "머니투데이"
txt += "endof file"

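# write the sample text, closing the file even if write() fails
with open("test.txt","w",encoding="utf-8") as out:
    out.write(txt)

# a plain text file object refuses nonzero relative seeks
fobj = open("test.txt",encoding="utf-8")
fobj.seek(3)
try:
    fobj.seek(2,1)
except IOError:
    print('raises IOError')
fobj.close()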

class _file:

    def __init__(self,file_obj):
        self.file_obj = file_obj

    def read(self,nb=None):
        if nb is None:
            return self.file_obj.read()
        else:
            return self.file_obj.read(nb)

    def seek(self,offset,whence=0):
        if whence==0:
            self.file_obj.seek(offset)
        else:
            if whence==2:
                # read till EOF
                while True:
                    buf = self.file_obj.read()
                    if not buf:
                        break
            self.file_obj.seek(self.file_obj.tell()+offset)

fobj = _file(open("test.txt",encoding="utf-8"))
fobj.seek(3)
fobj.seek(2,1)
fobj.seek(-5,2)
print(fobj.read(3))
==========================
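
One caveat with the wrapper above: for text files, tell() is documented as returning an opaque number, and in CPython with UTF-8 it behaves like a byte offset, so adding a character offset to it is only safe while the characters being skipped are single-byte. A slower but character-based alternative could look like the sketch below. `CharSeekFile` is a hypothetical class that tracks its own position in characters and re-reads from the start on every seek; it assumes the file contains no newline translation issues and is small enough to re-read.

```python
import os
import tempfile

class CharSeekFile:
    """Hypothetical wrapper: seeks are counted in characters, not bytes."""

    def __init__(self, file_obj):
        self.file_obj = file_obj
        self.pos = 0   # current position, in characters

    def read(self, nb=-1):
        data = self.file_obj.read(nb)
        self.pos += len(data)
        return data

    def seek(self, offset, whence=0):
        if whence == 0:
            target = offset
        elif whence == 1:
            target = self.pos + offset
        elif whence == 2:
            # count the characters in the whole file
            self.file_obj.seek(0)
            target = len(self.file_obj.read()) + offset
        else:
            raise ValueError('invalid whence: %r' % (whence,))
        if target < 0:
            raise ValueError('negative seek position: %r' % (target,))
        # rewind, then skip `target` characters by reading them
        self.file_obj.seek(0)
        self.pos = len(self.file_obj.read(target))
        return self.pos

txt = "abcdef" + "머니투데이" + "endof file"
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "test.txt")
    with open(path, "w", encoding="utf-8") as out:
        out.write(txt)
    with open(path, encoding="utf-8") as raw:
        fobj = CharSeekFile(raw)
        fobj.seek(3)
        fobj.seek(2, 1)              # character position 5
        after_relative = fobj.read(1)
        fobj.seek(-15, 2)            # 15 characters before the end
        korean = fobj.read(5)

print(after_relative, korean)
```

The trade-off is cost: every seek re-reads the file from the beginning, so this only suits small files.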

- Pierre
