Date: Tue, 12 Feb 2013 03:09:51 +0000
From: MRAB <python@mrabarnett.plus.com>
User-Agent: Mozilla/5.0 (Windows NT 5.1; rv:17.0) Gecko/20130107 Thunderbird/17.0.2
MIME-Version: 1.0
To: python-list@python.org
Subject: Re: Python recv loop
References: <mailman.1612.1360544258.2939.python-list@python.org> <roy-CC4632.21243210022013@news.panix.com> <E4DA0A42-324E-43D0-A366-21267C538D5D@grep.my> <51190A32.7070105@mrabarnett.plus.com> <CAPTjJmoe0CJKEFc0dDZr9F68kNyerM1Fp8_ACpWRWjvU6sdtZg@mail.gmail.com> <6F9402BA-4753-4543-A8E7-05E5234660EA@grep.my> <CAPTjJmpGYJqofaL_R6W-TUTaRGTh=f4XVDASmskQKnRnCnu+CA@mail.gmail.com>
In-Reply-To: <CAPTjJmpGYJqofaL_R6W-TUTaRGTh=f4XVDASmskQKnRnCnu+CA@mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Precedence: list
Reply-To: python-list@python.org
Newsgroups: comp.lang.python
Message-ID: <mailman.1678.1360638583.2939.python-list@python.org>

On 2013-02-12 02:20, Chris Angelico wrote:
> On Tue, Feb 12, 2013 at 12:41 PM, Ihsan Junaidi Ibrahim <ihsan@grep.my> wrote:
>>
>> On Feb 11, 2013, at 11:24 PM, Chris Angelico <rosuav@gmail.com> wrote:
>>
>>> On Tue, Feb 12, 2013 at 2:11 AM, MRAB <python@mrabarnett.plus.com> wrote:
>>>> I probably wouldn't make it fixed length. I'd have the length in
>>>> decimal followed by, say, "\n".
>>>
>>> Or even "followed by any non-digit". Chances are your JSON data begins
>>> with a non-digit, so you'd just have to insert a space in the event
>>> that you're JSON-encoding a flat integer. (Which might not ever
>>> happen, if you know that your data will always be an object.)
>>>
>>> ChrisA
>>
>> So on the first recv() call, I set the buffer at 1 character and I iterate
>> over single character until a non-digit character is encountered?
>
> More efficient would be to guess that it'll be, say, 10 bytes, and
> then retain any excess for your JSON read loop. But you'd need to sort
> that out between the halves of your code.
>
If the length is always followed by a space then it's easier to split
it off the input:

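     # Read the start of the message: decimal length, a space, then data.
     buf = sock.recv(10)
     space_pos = buf.find(b" ")
     nbuf = int(buf[:space_pos])
     buf = buf[space_pos + 1:]

     # Keep receiving until the whole payload has arrived.
     while len(buf) < nbuf:
         chunk = sock.recv(nbuf - len(buf))
         if not chunk:
             # The other end closed the connection early.
             break

         buf += chunk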

I'm assuming that:

1. The initial recv returns the length followed by a space. It could,
of course, return fewer bytes (space_pos == -1), so you may need to
recv some more bytes, as the while loop does for the rest of the data.

2. At least 10 bytes were sent. Imagine what would happen if the sender
sent b"2 []" immediately followed by b"2 []". The initial recv could
return all 8 bytes, leaving the second message in buf after the first.
In that case you could save the excess until next time. Alternatively,
the sender could guarantee that it would never send fewer than 10
bytes, padding with b" " if necessary.
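
If you'd rather not rely on either assumption, one option is to keep a
buffer between calls and loop for both the length and the data. Here is
a rough sketch of that idea. The names FramedReader and send_message
and the 4096-byte read size are made up for illustration, and I'm
assuming Python 3 and that a closed connection should raise an error
rather than return short data:

```python
import json
import socket


class FramedReader:
    """Reads b"<decimal length> <payload>" messages from a stream socket."""

    def __init__(self, sock):
        self.sock = sock
        self.buf = b""  # bytes received but not yet consumed

    def _fill(self):
        # Pull in whatever is available; an empty result means the peer closed.
        chunk = self.sock.recv(4096)
        if not chunk:
            raise ConnectionError("socket closed before a complete message arrived")
        self.buf += chunk

    def read_message(self):
        # Keep receiving until the space that ends the length has arrived.
        while b" " not in self.buf:
            self._fill()
        header, _, self.buf = self.buf.partition(b" ")
        nbytes = int(header)

        # Keep receiving until the whole payload has arrived.
        while len(self.buf) < nbytes:
            self._fill()

        # Hand back exactly one payload and keep the excess for next time.
        payload, self.buf = self.buf[:nbytes], self.buf[nbytes:]
        return payload


def send_message(sock, obj):
    # Hypothetical sender: decimal byte count, one space, then the JSON.
    payload = json.dumps(obj).encode("utf-8")
    sock.sendall(str(len(payload)).encode("ascii") + b" " + payload)
```

With this, if the sender writes b"2 []2 []" in one go, two successive
read_message() calls each return b"[]", because the second message is
held in self.buf instead of being thrown away. No padding is needed on
the sending side.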