| Started by | Pierre Quentel <pierre.quentel@gmail.com> |
|---|---|
| First post | 2011-09-06 00:18 -0700 |
| Last post | 2011-09-07 00:00 -0700 |
| Articles | 3 — 2 participants |
Relative seeks on string IO Pierre Quentel <pierre.quentel@gmail.com> - 2011-09-06 00:18 -0700
Re: Relative seeks on string IO Terry Reedy <tjreedy@udel.edu> - 2011-09-06 16:49 -0400
Re: Relative seeks on string IO Pierre Quentel <pierre.quentel@gmail.com> - 2011-09-07 00:00 -0700
| From | Pierre Quentel <pierre.quentel@gmail.com> |
|---|---|
| Date | 2011-09-06 00:18 -0700 |
| Subject | Relative seeks on string IO |
| Message-ID | <48a795dc-992f-4665-9ada-60b2807fc3b8@u19g2000vbm.googlegroups.com> |
Hi,
I am wondering why relative seeks fail on string IO in Python 3.2.
Example:
from io import StringIO
txt = StringIO('Favourite Worst Nightmare')
txt.seek(8) # no problem with absolute seek
but
txt.seek(2,1) # 2 characters from current position
raises "IOError: Can't do nonzero cur-relative seeks" (tested with
Python3.2.2 on WindowsXP)
A seek relative to the end of the string IO raises the same IOError.
However, it is not difficult to write a class that performs
relative seeks on strings:
====================
class FakeIO:
    def __init__(self,value):
        self.value = value
        self.pos = 0
    def read(self,nb=None):
        if nb is None:
            return self.value[self.pos:]
        else:
            return self.value[self.pos:self.pos+nb]
    def seek(self,offset,whence=0):
        if whence==0:
            self.pos = offset
        elif whence==1: # relative to current position
            self.pos += offset
        elif whence==2: # relative to end of string
            self.pos = len(self.value)+offset

txt = FakeIO('Favourite Worst Nightmare')
txt.seek(8)
txt.seek(2,1)
txt.seek(-8,2)
=====================
Is there any reason why relative seeks on string IO are not allowed in
Python 3.2, or is it a bug that could be fixed in a future version?
- Pierre
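
A workaround that stays within io.StringIO itself can be sketched as follows. It assumes that, in CPython, tell() on a StringIO returns a plain character index and that zero-offset seeks with whence=1 or whence=2 are accepted, as reported in this thread. The helper name relative_seek is hypothetical and is not part of the io module.

```python
from io import StringIO

def relative_seek(stream, offset, whence=0):
    """Emulate whence=1 and whence=2 seeks using absolute seeks only.

    Hypothetical helper: assumes stream.tell() is a character index,
    which holds for io.StringIO in CPython.
    """
    if whence == 0:
        target = offset
    elif whence == 1:
        # current position plus offset
        target = stream.tell() + offset
    elif whence == 2:
        # a zero-offset seek to the end is allowed and returns the length
        target = stream.seek(0, 2) + offset
    else:
        raise ValueError("whence must be 0, 1 or 2")
    return stream.seek(target)

txt = StringIO('Favourite Worst Nightmare')
txt.seek(8)
relative_seek(txt, 2, 1)    # 2 characters from current position
word = txt.read(5)
relative_seek(txt, -9, 2)   # 9 characters before the end
tail = txt.read()
```

Because every call ends in an absolute seek, a target before the start of the string still raises the ValueError that StringIO.seek itself raises for negative positions.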
| From | Terry Reedy <tjreedy@udel.edu> |
|---|---|
| Date | 2011-09-06 16:49 -0400 |
| Message-ID | <mailman.810.1315342227.27778.python-list@python.org> |
| In reply to | #12819 |
On 9/6/2011 3:18 AM, Pierre Quentel wrote:
> I am wondering why relative seeks fail on string IO in Python 3.2
Good question.
> from io import StringIO
> txt = StringIO('Favourite Worst Nightmare')
> txt.seek(8) # no problem with absolute seek
Please post code without non-code indents, like so:
from io import StringIO
txt = StringIO('Favourite Worst Nightmare')
txt.seek(8,0) # no problem with absolute seek
txt.seek(0,1) # 0 characters from current position ok, and useless
txt.seek(-2,2) # end-relative gives error message for cur-relative
so someone can copy and paste without deleting indents.
I verified with 3.2.2 on Win7. I am curious what 2.7 and 3.1 do.
What system are you using? Does it have a narrow or wide unicode build?
(I.e., what is the value of sys.maxunicode?)
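
For readers who want to check their own interpreter, the value can be inspected with a minimal sketch like the one below. The narrow/wide distinction applies to CPython builds before 3.3; from 3.3 onward (PEP 393) sys.maxunicode is always 1114111. The variable name build_kind is only illustrative.

```python
import sys

# 0xFFFF (65535) indicates a narrow build; 0x10FFFF (1114111) indicates
# a wide build, or any CPython 3.3+ interpreter.
build_kind = "narrow" if sys.maxunicode == 0xFFFF else "wide"
print(sys.maxunicode, build_kind)
```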
> txt.seek(2,1) # 2 characters from current position
>
> raises "IOError: Can't do nonzero cur-relative seeks" (tested with
> Python3.2.2 on WindowsXP)
>
> A seek relative to the end of the string IO raises the same IOError
> Is there any reason why relative seeks on string IO are not allowed in
> Python3.2, or is it a bug that could be fixed in a next version ?
Since StringIO seeks by fixed-size code units (depending on the build),
making seeking from the current position and end trivial, I consider
this a behavior bug. At minimum, it is a doc bug. I opened
http://bugs.python.org/issue12922
As noted there, I suspect the limitation is inherited from TextIOBase.
But I challenge that it should be.
I was somewhat surprised that seeking (from the start) is not limited to
the existing text. Seeking past the end fills in with nulls. (They are
typically a nuisance though.)
from io import StringIO
txt = StringIO('0123456789')
txt.seek(15,0) # no problem with absolute seek
txt.write('xxx')
s = txt.getvalue()
print(ord(s[12]))
# 0
--
Terry Jan Reedy
| From | Pierre Quentel <pierre.quentel@gmail.com> |
|---|---|
| Date | 2011-09-07 00:00 -0700 |
| Message-ID | <30c6f7f0-b846-47a9-be49-e234a8bb8c16@e9g2000yqb.googlegroups.com> |
| In reply to | #12849 |
>
> Please post code without non-code indents, like so:
>
Sorry about that. After the line "Example :" I indented the next
block, out of habit ;-)
>
> What system are you using? Does it have a narrow or wide unicode build?
> (IE, what is the value of sys.maxunicode?)
>
I use Windows XP Pro, version 2002, SP3. sys.maxunicode is 65535.
I have the same behaviour with 3.1.1 and with 2.7.
I don't understand why variable-sized code units would cause problems.
On text file objects, read(nb) reads nb characters, regardless of the
number of bytes used to encode them, and tell() returns a position in
the text stream just after the next (unicode) character read.
As for StringIO, a wrapper around file objects simulates a correct
behaviour for relative seeks:
====================
txt = "abcdef"
txt += "تخصيص هذه الطبعة"
txt += "머니투데이"
txt += "endof file"
out = open("test.txt","w",encoding="utf-8")
out.write(txt)
out.close()
fobj = open("test.txt",encoding="utf-8")
fobj.seek(3)
try:
    fobj.seek(2,1)
except IOError:
    print('raises IOError')

class _file:
    def __init__(self,file_obj):
        self.file_obj = file_obj
    def read(self,nb=None):
        if nb is None:
            return self.file_obj.read()
        else:
            return self.file_obj.read(nb)
    def seek(self,offset,whence=0):
        if whence==0:
            self.file_obj.seek(offset)
        else:
            if whence==2:
                # read till EOF
                while True:
                    buf = self.file_obj.read()
                    if not buf:
                        break
            self.file_obj.seek(self.file_obj.tell()+offset)

fobj = _file(open("test.txt",encoding="utf-8"))
fobj.seek(3)
fobj.seek(2,1)
fobj.seek(-5,2)
print(fobj.read(3))
==========================
- Pierre
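
The io documentation describes tell() on text streams as returning an opaque number, so adding a character offset to it is not guaranteed to land on a character boundary once multi-byte characters are involved. A more conservative variant of the wrapper above can be sketched by counting characters instead. This is only a sketch under stated assumptions: the class name CharSeekFile is hypothetical, the wrapped stream is assumed to start at position 0, and an in-memory BytesIO stands in for test.txt.

```python
import io

class CharSeekFile:
    """Hypothetical wrapper that tracks its position in characters."""

    def __init__(self, file_obj):
        self.file_obj = file_obj
        self.pos = 0  # characters consumed so far (stream assumed at start)

    def read(self, nb=None):
        data = self.file_obj.read() if nb is None else self.file_obj.read(nb)
        self.pos += len(data)
        return data

    def seek(self, offset, whence=0):
        if whence == 0:
            target = offset
        elif whence == 1:
            target = self.pos + offset
        elif whence == 2:
            self.read()  # consume to EOF so self.pos equals the length
            target = self.pos + offset
        else:
            raise ValueError("whence must be 0, 1 or 2")
        if target < 0:
            raise ValueError("negative seek position")
        # Rewind, then skip 'target' characters by reading them.
        self.file_obj.seek(0)
        self.pos = 0
        self.read(target)
        return self.pos

text = "abcdef" + "تخصيص هذه الطبعة" + "머니투데이" + "endof file"
raw = io.BytesIO(text.encode("utf-8"))
fobj = CharSeekFile(io.TextIOWrapper(raw, encoding="utf-8"))
fobj.seek(3)
fobj.seek(2, 1)
first = fobj.read(1)
fobj.seek(-5, 2)
last = fobj.read(3)
```

The trade-off is cost: every seek rereads the stream from the beginning, so this suits small files or occasional seeks rather than heavy random access.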