| Started by | "James Harris" <james.harris.1@gmail.com> |
|---|---|
| First post | 2015-09-20 11:22 +0100 |
| Last post | 2015-09-22 22:28 +0100 |
| Articles | 14 — 8 participants |
Lightwight socket IO wrapper "James Harris" <james.harris.1@gmail.com> - 2015-09-20 11:22 +0100
Re: Lightwight socket IO wrapper Akira Li <4kir4.1i@gmail.com> - 2015-09-20 16:15 +0300
Re: Lightwight socket IO wrapper "James Harris" <james.harris.1@gmail.com> - 2015-09-20 23:36 +0100
Re: Lightwight socket IO wrapper Dennis Lee Bieber <wlfraed@ix.netcom.com> - 2015-09-20 20:19 -0400
Re: Lightwight socket IO wrapper Gregory Ewing <greg.ewing@canterbury.ac.nz> - 2015-09-21 17:46 +1200
Re: Lightwight socket IO wrapper Jorgen Grahn <grahn+nntp@snipabacken.se> - 2015-09-21 11:25 +0000
Re: Lightwight socket IO wrapper "James Harris" <james.harris.1@gmail.com> - 2015-09-22 20:45 +0100
Re: Lightwight socket IO wrapper Random832 <random832@fastmail.com> - 2015-09-22 19:52 -0400
Re: Lightwight socket IO wrapper Gregory Ewing <greg.ewing@canterbury.ac.nz> - 2015-09-23 12:47 +1200
Re: Lightwight socket IO wrapper Chris Angelico <rosuav@gmail.com> - 2015-09-21 10:34 +1000
Re: Lightwight socket IO wrapper Akira Li <4kir4.1i@gmail.com> - 2015-09-21 06:07 +0300
Re: Lightwight socket IO wrapper "James Harris" <james.harris.1@gmail.com> - 2015-09-22 21:05 +0100
Re: Lightwight socket IO wrapper Marko Rauhamaa <marko@pacujo.net> - 2015-09-23 00:00 +0300
Re: Lightwight socket IO wrapper "James Harris" <james.harris.1@gmail.com> - 2015-09-22 22:28 +0100
| From | "James Harris" <james.harris.1@gmail.com> |
|---|---|
| Date | 2015-09-20 11:22 +0100 |
| Subject | Lightwight socket IO wrapper |
| Message-ID | <mtm18o$9fm$1@dont-email.me> |
I guess there have been many attempts to make socket IO easier to handle and a good number of those have been in Python.

The trouble with trying to improve something which is already well designed (and consciously left as is) is that the so-called improvement can become much more complex and overly elaborate. That can apply to the initial idea, for sure, but when writing helper or convenience functions perhaps it applies more to the temptation to keep adding just a little bit extra. The end result can be overly elaborate such as a framework which is fine where such is needed but is overkill for simpler requirements.

Do you guys have any recommendations of some *lightweight* additions to Python socket IO before I write any more of my own? Something built in to Python would be much preferred over any modules which have to be added. I had in the back of my mind that there was a high-level socket-IO library - much as threading was added as a wrapper to the basic thread module - but I cannot find anything above socket. Is there any?

A current specific to illustrate where basic socket IO is limited: it normally provides no guarantees over how many bytes are transferred at a time (AFAICS that's true for both streams and datagrams) so the delimiting of messages/records needs to be handled by the sender and receiver. I do already handle some of this myself but I wondered if there was a prebuilt solution that I should be using instead - to save me adding just a little bit extra. ;-)

James
| From | Akira Li <4kir4.1i@gmail.com> |
|---|---|
| Date | 2015-09-20 16:15 +0300 |
| Message-ID | <mailman.37.1442754893.21674.python-list@python.org> |
| In reply to | #96872 |
"James Harris" <james.harris.1@gmail.com> writes: > I guess there have been many attempts to make socket IO easier to > handle and a good number of those have been in Python. > > The trouble with trying to improve something which is already well > designed (and conciously left as is) is that the so-called improvement > can become much more complex and overly elaborate. That can apply to > the initial idea, for sure, but when writing helper or convenience > functions perhaps it applies more to the temptation to keep adding > just a little bit extra. The end result can be overly elaborate such > as a framework which is fine where such is needed but is overkill for > simpler requirements. > > Do you guys have any recommendations of some *lightweight* additions > to Python socket IO before I write any more of my own? Something built > in to Python would be much preferred over any modules which have to be > added. I had in the back of my mind that there was a high-level > socket-IO library - much as threading was added as a wrapper to the > basic thread module - but I cannot find anything above socket. Is > there any? Does ØMQ qualify as lightweight? > A current specific to illustrate where basic socket IO is limited: it > normally provides no guarantees over how many bytes are transferred at > a time (AFAICS that's true for both streams and datagrams) so the > delimiting of messages/records needs to be handled by the sender and > receiver. I do already handle some of this myself but I wondered if > there was a prebuilt solution that I should be using instead - to save > me adding just a little bit extra. ;-) There are already convenience functions in stdlib such as sock.sendall(), sock.sendfile(), socket.create_connection() in addition to BSD Sockets API. If you want to extend this list and have specific suggestions; see https://docs.python.org/devguide/stdlibchanges.html Or just describe your current specific issue in more detail here.
| From | "James Harris" <james.harris.1@gmail.com> |
|---|---|
| Date | 2015-09-20 23:36 +0100 |
| Message-ID | <mtnc9q$pqs$1@dont-email.me> |
| In reply to | #96875 |
"Akira Li" <4kir4.1i@gmail.com> wrote in message news:mailman.37.1442754893.21674.python-list@python.org... > "James Harris" <james.harris.1@gmail.com> writes: > >> I guess there have been many attempts to make socket IO easier to >> handle and a good number of those have been in Python. >> >> The trouble with trying to improve something which is already well >> designed (and conciously left as is) is that the so-called >> improvement >> can become much more complex and overly elaborate. That can apply to >> the initial idea, for sure, but when writing helper or convenience >> functions perhaps it applies more to the temptation to keep adding >> just a little bit extra. The end result can be overly elaborate such >> as a framework which is fine where such is needed but is overkill for >> simpler requirements. >> >> Do you guys have any recommendations of some *lightweight* additions >> to Python socket IO before I write any more of my own? Something >> built >> in to Python would be much preferred over any modules which have to >> be >> added. I had in the back of my mind that there was a high-level >> socket-IO library - much as threading was added as a wrapper to the >> basic thread module - but I cannot find anything above socket. Is >> there any? > > Does ØMQ qualify as lightweight? It's certainly interesting. It's puzzling, too. For example, http://zguide.zeromq.org/py:hwserver The Python code there includes message = socket.recv() but given that this is a TCP socket it doesn't look like there is any way for the stack to know how many bytes to return. Either ZeroMQ layers another end-to-end protocol on top of TCP (which would be no good) or it will be guessing (which would not be good either). There are probably answers to that query but there is a lot of documentation, including on reliable communication, and that in itself makes ZeroMQ seem overkill, even if it can be persuaded to do what I want. I am impressed that they show code in many languages. 
I may come back to it but for the moment it doesn't seem to be what I was looking for. And it is not built in. >> A current specific to illustrate where basic socket IO is limited: it >> normally provides no guarantees over how many bytes are transferred >> at >> a time (AFAICS that's true for both streams and datagrams) so the >> delimiting of messages/records needs to be handled by the sender and >> receiver. I do already handle some of this myself but I wondered if >> there was a prebuilt solution that I should be using instead - to >> save >> me adding just a little bit extra. ;-) > > There are already convenience functions in stdlib such as > sock.sendall(), sock.sendfile(), socket.create_connection() in > addition > to BSD Sockets API. > > If you want to extend this list and have specific suggestions; see > https://docs.python.org/devguide/stdlibchanges.html That may be a bit overkill just now but it's a good suggestion. > Or just describe your current specific issue in more detail here. There are a few things and more crop up as time goes on. For example, over TCP it would be helpful to have a function to receive a specific number of bytes or one to read bytes until reaching a certain delimiter such as newline or zero or space etc. Even better would be to be able to use the iteration protocol so you could just code next() and get the next such chunk of read in a for loop. When sending it would be good to just say to send a bunch of bytes but know that you will get told how many were sent (or didn't get sent) if it fails. Sock.sendall() doesn't do that. I thought UDP would deliver (or drop) a whole datagram but cannot find anything in the Python documentaiton to guarantee that. In fact documentation for the send() call says that apps are responsible for checking that all data has been sent. They may mean that to apply to stream protocols only but it doesn't state that. 
(Of course, UDP datagrams are limited in size so the call may validly indicate incomplete transmission even when the first part of a big message is sent successfully.) Receiving no bytes is taken as indicating the end of the communication. That's OK for TCP but not for UDP so there should be a way to distinguish between the end of data and receiving an empty datagram. The recv calls require a buffer size to be supplied which is a technical detail. A Python wrapper could save the programmer dealing with that. Reminder to self: encoding issues. None of the above is difficult to write and I have written the bits I need myself but, basically, there are things that would make socket IO easier and yet still compatible with more long-winded code. So I wondered if there were already some Python modules which were more convenient than what I found in the documentation. James
| From | Dennis Lee Bieber <wlfraed@ix.netcom.com> |
|---|---|
| Date | 2015-09-20 20:19 -0400 |
| Message-ID | <mailman.12.1442794762.28679.python-list@python.org> |
| In reply to | #96901 |
On Sun, 20 Sep 2015 23:36:30 +0100, "James Harris"
<james.harris.1@gmail.com> declaimed the following:
>
>There are a few things and more crop up as time goes on. For example,
>over TCP it would be helpful to have a function to receive a specific
>number of bytes or one to read bytes until reaching a certain delimiter
>such as newline or zero or space etc. Even better would be to be able to
>use the iteration protocol so you could just code next() and get the
>next such chunk of read in a for loop. When sending it would be good to
>just say to send a bunch of bytes but know that you will get told how
>many were sent (or didn't get sent) if it fails. Sock.sendall() doesn't
>do that.
Note that the "buffer size" option on a TCP socket.recv() gives you
your "specific number of bytes" -- if available at that time.
I wouldn't want to use .recv(1) though to implement your "reaching a
certain delimiter"... Much better to read as much as available and search
it for the delimiter. I'll confess, adding a .readln() FOR TCP ONLY, might
be a nice extension over BSD sockets (might need to allow option for
whether line-ends are Internet standard <cr><lf> or some other marker, and
whether they should be converted upon reading to the native format for the
host).
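
The read-then-scan approach described in the paragraph above might look roughly like the sketch below. It assumes a connected stream socket; `iter_delimited` is a hypothetical name, not something in the standard library, and the delimiter is left on each chunk in the same way `readline()` keeps the newline.

```python
import socket


def iter_delimited(sock, delimiter=b"\n", bufsize=4096):
    """Yield delimiter-terminated chunks from a connected stream socket.

    Hypothetical helper: receive whatever is available, append it to a
    buffer, and scan the buffer for the delimiter.
    """
    buffer = bytearray()
    while True:
        # Hand out every complete chunk already sitting in the buffer.
        while True:
            index = buffer.find(delimiter)
            if index < 0:
                break
            end = index + len(delimiter)
            yield bytes(buffer[:end])  # delimiter stays on the chunk
            del buffer[:end]
        data = sock.recv(bufsize)
        if not data:  # peer closed the connection
            if buffer:
                yield bytes(buffer)  # trailing data with no delimiter
            return
        buffer += data


# Usage over a local socket pair standing in for a TCP connection.
left, right = socket.socketpair()
left.sendall(b"first\r\nsecond\r\npartial")
left.close()
for chunk in iter_delimited(right, b"\r\n"):
    print(chunk)
right.close()
```

Because it is a generator, `next()` and `for` loops both work on it, which covers the iteration-protocol wish quoted earlier in this post.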
>
>I thought UDP would deliver (or drop) a whole datagram but cannot find
>anything in the Python documentation to guarantee that. In fact
>documentation for the send() call says that apps are responsible for
>checking that all data has been sent. They may mean that to apply to
>stream protocols only but it doesn't state that. (Of course, UDP
>datagrams are limited in size so the call may validly indicate
>incomplete transmission even when the first part of a big message is
>sent successfully.)
>
Looking in the wrong documentation <G>
You probably should be looking at the UDP RFC. Or maybe just
http://www.diffen.com/difference/TCP_vs_UDP
"""
Packets are sent individually and are checked for integrity only if they
arrive. Packets have definite boundaries which are honored upon receipt,
meaning a read operation at the receiver socket will yield an entire
message as it was originally sent.
"""
Even if the IP layer has to fragment a UDP packet to meet limits of the
transport media, it should put them back together on the other end before
passing it up to the UDP layer. To my knowledge, UDP does not have a size
limit on the message (well -- a 16-bit length field in the UDP header). But
since it /is/ "got it all" or "dropped" with no inherent confirmation, one
would have to embed their own protocol within it -- sequence numbers with
ACK/NAK, for example. Problem: if using LARGE UDP packets, this protocol
would mean having LARGE resends should packets be dropped or arrive out of
sequence (and since the ACK/NAK could be dropped too, you may have to
handle the case of a duplicated packet -- also large).
TCP is a stream protocol -- the protocol will ensure that all data
arrives, and that it arrives in order, but does not enforce any boundaries
on the data; what started as a relatively large packet at one end may
arrive as lots of small packets due to intermediate transport limits (one
can visualize a worst case: each TCP packet is broken up to fit Hollerith
cards; 20bytes for header and 60 bytes of data -- then fed to a reader and
sent on AS-IS). Boundaries are the end-user responsibility... line endings
(look at SMTP, where an email message ends on a line containing just a ".")
or embedded length counter (not the TCP packet length).
>Receiving no bytes is taken as indicating the end of the communication.
>That's OK for TCP but not for UDP so there should be a way to
>distinguish between the end of data and receiving an empty datagram.
>
I don't believe UDP supports a truly empty datagram (length of 0) --
presuming a sending stack actually sends one, the receiving stack will
probably drop it as there is no data to pass on to a client (there is a PR
at work because we have a UDP driver that doesn't drop 0-length messages,
but also can't deliver them -- so the circular buffer might fill with
undeliverable headers)
--
Wulfraed Dennis Lee Bieber AF6VN
wlfraed@ix.netcom.com HTTP://wlfraed.home.netcom.com/
| From | Gregory Ewing <greg.ewing@canterbury.ac.nz> |
|---|---|
| Date | 2015-09-21 17:46 +1200 |
| Message-ID | <d69jtnFp9qfU1@mid.individual.net> |
| In reply to | #96903 |
Dennis Lee Bieber wrote:

> worst case: each TCP packet is broken up to fit Hollerith
> cards;

Or printed on strips of paper and tied to pigeons:

https://en.wikipedia.org/wiki/IP_over_Avian_Carriers

--
Greg
| From | Jorgen Grahn <grahn+nntp@snipabacken.se> |
|---|---|
| Date | 2015-09-21 11:25 +0000 |
| Message-ID | <slrnmvvq8v.eij.grahn+nntp@frailea.sa.invalid> |
| In reply to | #96903 |
On Mon, 2015-09-21, Dennis Lee Bieber wrote:
> On Sun, 20 Sep 2015 23:36:30 +0100, "James Harris"
> <james.harris.1@gmail.com> declaimed the following:
...
>>I thought UDP would deliver (or drop) a whole datagram but cannot find
>>anything in the Python documentation to guarantee that. In fact
>>documentation for the send() call says that apps are responsible for
>>checking that all data has been sent. They may mean that to apply to
>>stream protocols only but it doesn't state that. (Of course, UDP
>>datagrams are limited in size so the call may validly indicate
>>incomplete transmission even when the first part of a big message is
>>sent successfully.)
>>
> Looking in the wrong documentation <G>
>
> You probably should be looking at the UDP RFC. Or maybe just
>
> http://www.diffen.com/difference/TCP_vs_UDP
>
> """
> Packets are sent individually and are checked for integrity only if they
> arrive. Packets have definite boundaries which are honored upon receipt,
> meaning a read operation at the receiver socket will yield an entire
> message as it was originally sent.
> """
>
> Even if the IP layer has to fragment a UDP packet to meet limits of the
> transport media, it should put them back together on the other end before
> passing it up to the UDP layer. To my knowledge, UDP does not have a size
> limit on the message (well -- a 16-bit length field in the UDP header).
So they are "limited in size" like the OP wrote. (A TCP stream OTOH is
potentially infinite.)
But also, the IPv4 RFC says:
All hosts must be prepared to accept datagrams of up to 576 octets
(whether they arrive whole or in fragments). It is recommended
that hosts only send datagrams larger than 576 octets if they have
assurance that the destination is prepared to accept the larger
datagrams.
As for "all or nothing" with UDP datagrams, you also have the socket
layer case where the user does read() into a 1000 octet buffer and the
datagram was 1200 octets. With BSD sockets you can (if you try)
detect this, but the extra 200 octets are lost forever.
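
One way to see the truncation case on a Unix-like system is sketched below. It assumes `socket.recvmsg()` and `socket.MSG_TRUNC` are available (they are not on every platform) and uses the loopback interface, so treat it as an illustration, not portable code.

```python
import socket

receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
receiver.settimeout(2.0)

sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sender.sendto(b"x" * 1200, receiver.getsockname())  # 1200-octet datagram

# Read into a 1000-octet buffer; the flags report whether data was cut off.
data, ancdata, msg_flags, address = receiver.recvmsg(1000)
truncated = bool(msg_flags & socket.MSG_TRUNC)
print(len(data), truncated)

sender.close()
receiver.close()
```

The octets beyond the buffer size are discarded with the datagram, so a following read returns the next datagram, not the remainder of this one.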
> But since it /is/ "got it all" or "dropped" with no inherent confirmation, one
> would have to embed their own protocol within it -- sequence numbers with
> ACK/NAK, for example. Problem: if using LARGE UDP packets, this protocol
> would mean having LARGE resends should packets be dropped or arrive out of
> sequence (and since the ACK/NAK could be dropped too, you may have to
> handle the case of a duplicated packet -- also large).
>
> TCP is a stream protocol -- the protocol will ensure that all data
> arrives, and that it arrives in order, but does not enforce any boundaries
> on the data; what started as a relatively large packet at one end may
> arrive as lots of small packets due to intermediate transport limits (one
> can visualize a worst case: each TCP packet is broken up to fit Hollerith
> cards; 20bytes for header and 60 bytes of data -- then fed to a reader and
> sent on AS-IS).
The problem is IMO more this: the chunks of data that the application
writes don't map to what the other application reads. In the lower
layers, I don't expect TCP segments to be split, and IP fragmentation
(if it happens at all) operates at an even lower level.
However the end result is still just as you write:
> Boundaries are the end-user responsibility... line endings
> (look at SMTP, where an email message ends on a line containing just a ".")
> or embedded length counter (not the TCP packet length).
>
>>Receiving no bytes is taken as indicating the end of the communication.
>>That's OK for TCP but not for UDP so there should be a way to
>>distinguish between the end of data and receiving an empty datagram.
>>
> I don't believe UDP supports a truly empty datagram (length of 0) --
> presuming a sending stack actually sends one, the receiving stack will
> probably drop it as there is no data to pass on to a client
UDP datagrams of length 0 work (just tried it on Linux). There's
nothing special about it.
> (there is a PR
> at work because we have a UDP driver that doesn't drop 0-length messages,
> but also can't deliver them -- so the circular buffer might fill with
> undeliverable headers)
Those messages should be delivered to the receiving socket, in the
sense that they are sanity-checked, used to wake up the application
and mark the socket readable, fill up one entry in the read queue and
so on ...
Of course your system at work may have the rights to be more
restrictive, if it's special-purpose.
/Jorgen
--
// Jorgen Grahn <grahn@ Oo o. . .
\X/ snipabacken.se> O o .
| From | "James Harris" <james.harris.1@gmail.com> |
|---|---|
| Date | 2015-09-22 20:45 +0100 |
| Message-ID | <mtsb10$uoj$1@dont-email.me> |
| In reply to | #96903 |
"Dennis Lee Bieber" <wlfraed@ix.netcom.com> wrote in message news:mailman.12.1442794762.28679.python-list@python.org... > On Sun, 20 Sep 2015 23:36:30 +0100, "James Harris" > <james.harris.1@gmail.com> declaimed the following: > > >> >>There are a few things and more crop up as time goes on. For example, >>over TCP it would be helpful to have a function to receive a specific >>number of bytes or one to read bytes until reaching a certain >>delimiter >>such as newline or zero or space etc. Even better would be to be able >>to >>use the iteration protocol so you could just code next() and get the >>next such chunk of read in a for loop. When sending it would be good >>to >>just say to send a bunch of bytes but know that you will get told how >>many were sent (or didn't get sent) if it fails. Sock.sendall() >>doesn't >>do that. > > Note that the "buffer size" option on a TCP socket.recv() gives you > your "specific number of bytes" -- if available at that time. "If" is a big word! AIUI the buffer size is not guaranteed to relate to the number of bytes returned except that you won't/shouldn't(!) get more than the buffer size. > I wouldn't want to user .recv(1) though to implement your "reaching a > certain delimiter"... Much better to read as much as available and > search > it for the delimiter. Yes, that's what I do at the moment. I keep a block of bytes, add any new stuff to it and scan it for delimiters. > I'll confess, adding a .readln() FOR TCP ONLY, might > be a nice extension over BSD sockets (might need to allow option for > whether line-ends are Internet standard <cr><lf> or some other marker, > and > whether they should be converted upon reading to the native format for > the > host). Akira Li pointed out that there is just such an extension: makefile. Scanning to <lf> is what I do just now as that includes <cr><lf> too and I leave them on the string. IIRC file.readline works in the same way. 
>>I thought UDP would deliver (or drop) a whole datagram but cannot find >>anything in the Python documentaiton to guarantee that. In fact >>documentation for the send() call says that apps are responsible for >>checking that all data has been sent. They may mean that to apply to >>stream protocols only but it doesn't state that. (Of course, UDP >>datagrams are limited in size so the call may validly indicate >>incomplete transmission even when the first part of a big message is >>sent successfully.) >> > Looking in the wrong documentation <G> > > You probably should be looking at the UDP RFC. Or maybe just > > http://www.diffen.com/difference/TCP_vs_UDP > > """ > Packets are sent individually and are checked for integrity only if > they > arrive. Packets have definite boundaries which are honored upon > receipt, > meaning a read operation at the receiver socket will yield an entire > message as it was originally sent. > """ I would rather see it in the Python docs because we program to the language standard and there can be - and often are, for good reason - areas where Python does not work in the same way as underlying systems. > Even if the IP layer has to fragment a UDP packet to meet limits of > the > transport media, it should put them back together on the other end > before > passing it up to the UDP layer. To my knowledge, UDP does not have a > size > limit on the message (well -- a 16-bit length field in the UDP > header). But > since it /is/ "got it all" or "dropped" with no inherent confirmation, > one > would have to embed their own protocol within it -- sequence numbers > with > ACK/NAK, for example. Problem: if using LARGE UDP packets, this > protocol > would mean having LARGE resends should packets be dropped or arrive > out of > sequence (and since the ACK/NAK could be dropped too, you may have to > handle the case of a duplicated packet -- also large). Yes, it was the 16-bit limitation that I was talking about. 
> TCP is a stream protocol -- the protocol will ensure that all data > arrives, and that it arrives in order, but does not enforce any > boundaries > on the data; what started as a relatively large packet at one end may > arrive as lots of small packets due to intermediate transport limits > (one > can visualize a worst case: each TCP packet is broken up to fit > Hollerith > cards; 20bytes for header and 60 bytes of data -- then fed to a reader > and > sent on AS-IS). Boundaries are the end-user responsibility... line > endings > (look at SMTP, where an email message ends on a line containing just a > ".") > or embedded length counter (not the TCP packet length). Yes. >>Receiving no bytes is taken as indicating the end of the >>communication. >>That's OK for TCP but not for UDP so there should be a way to >>distinguish between the end of data and receiving an empty datagram. >> > I don't believe UDP supports a truly empty datagram (length of 0) -- > presuming a sending stack actually sends one, the receiving stack will > probably drop it as there is no data to pass on to a client (there is > a PR > at work because we have a UDP driver that doesn't drop 0-length > messages, > but also can't deliver them -- so the circular buffer might fill with > undeliverable headers) As others have pointed out, UDP implementations do seem to work with zero-byte datagrams properly. Again, I would rather see that in the Python documentation which is what, effectively, forms a contract that we should be able to rely on. James
| From | Random832 <random832@fastmail.com> |
|---|---|
| Date | 2015-09-22 19:52 -0400 |
| Message-ID | <mailman.84.1442965978.28679.python-list@python.org> |
| In reply to | #96988 |
On Tue, Sep 22, 2015, at 15:45, James Harris wrote:

> "Dennis Lee Bieber" <wlfraed@ix.netcom.com> wrote in message
> news:mailman.12.1442794762.28679.python-list@python.org...
> > On Sun, 20 Sep 2015 23:36:30 +0100, "James Harris"
> > <james.harris.1@gmail.com> declaimed the following:
> >>Receiving no bytes is taken as indicating the end of the
> >>communication. That's OK for TCP but not for UDP so there should be
> >>a way to distinguish between the end of data and receiving an empty
> >>datagram.
> >>
> > I don't believe UDP supports a truly empty datagram (length of 0) --
> > presuming a sending stack actually sends one, the receiving stack will
> > probably drop it as there is no data to pass on to a client (there is
> > a PR at work because we have a UDP driver that doesn't drop 0-length
> > messages, but also can't deliver them -- so the circular buffer might
> > fill with undeliverable headers)
>
> As others have pointed out, UDP implementations do seem to work with
> zero-byte datagrams properly. Again, I would rather see that in the
> Python documentation which is what, effectively, forms a contract that
> we should be able to rely on.

Isn't this technically the same problem as pressing ctrl-d at a terminal - it's not _really_ the end of the input (you can continue reading after), but it sends the program something it will interpret as such?
| From | Gregory Ewing <greg.ewing@canterbury.ac.nz> |
|---|---|
| Date | 2015-09-23 12:47 +1200 |
| Message-ID | <d6eb4uFi26U1@mid.individual.net> |
| In reply to | #97011 |
Random832 wrote:

> Isn't this technically the same problem as pressing ctrl-d at a terminal
> - it's not _really_ the end of the input (you can continue reading
> after), but it sends the program something it will interpret as such?

Yes. There's no concept of "closing the connection" with UDP, because there's no connection. So if a read returns 0 bytes, it must be because someone sent you a 0-length datagram.

--
Greg
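
A quick way to check the zero-length case locally is sketched below. It assumes the loopback interface delivers both datagrams in order, which is typical but is not something UDP itself promises.

```python
import socket

receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
receiver.settimeout(2.0)

sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sender.sendto(b"", receiver.getsockname())       # empty datagram
sender.sendto(b"hello", receiver.getsockname())  # ordinary datagram

# An empty result is just an empty datagram; later reads keep working.
first, first_addr = receiver.recvfrom(2048)
second, second_addr = receiver.recvfrom(2048)
print(len(first), len(second))

sender.close()
receiver.close()
```

On a UDP socket an empty result therefore cannot be treated as end of data the way it is on a TCP stream.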
| From | Chris Angelico <rosuav@gmail.com> |
|---|---|
| Date | 2015-09-21 10:34 +1000 |
| Message-ID | <mailman.14.1442795696.28679.python-list@python.org> |
| In reply to | #96901 |
On Mon, Sep 21, 2015 at 10:19 AM, Dennis Lee Bieber <wlfraed@ix.netcom.com> wrote:

> Even if the IP layer has to fragment a UDP packet to meet limits of the
> transport media, it should put them back together on the other end before
> passing it up to the UDP layer. To my knowledge, UDP does not have a size
> limit on the message (well -- a 16-bit length field in the UDP header). But
> since it /is/ "got it all" or "dropped" with no inherent confirmation, one
> would have to embed their own protocol within it -- sequence numbers with
> ACK/NAK, for example. Problem: if using LARGE UDP packets, this protocol
> would mean having LARGE resends should packets be dropped or arrive out of
> sequence (and since the ACK/NAK could be dropped too, you may have to
> handle the case of a duplicated packet -- also large).
>

If you're going to add sequencing and acknowledgements to UDP, wouldn't it be easier to use TCP and simply prefix every message with a two-byte length?

UDP is great when order doesn't matter and each packet stands entirely alone. DNS is a well-known example - the question "What is the IP address for www.rosuav.com?" doesn't in any way affect the question "What is the mail server for gmail.com?", so you fire off UDP packets for each one, and get responses whenever you get them. UDP's also perfect for a heartbeat system - you send out a packet every however-often, and if the monitor hasn't heard from you in X seconds, it starts alerting people. No need for responses of any kind there.

But for working with a stream, I usually find it's a lot easier to build on top of TCP than UDP.

ChrisA
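
The two-byte length prefix suggested here could be sketched as follows. The function names are hypothetical, and the choice of network byte order for the prefix is an assumption; the post only specifies "a two-byte length".

```python
import socket
import struct

HEADER = struct.Struct("!H")  # two-byte unsigned length, network byte order


def recv_exactly(sock, nbytes):
    """Loop on recv() until exactly nbytes have arrived or the peer closes."""
    chunks = []
    remaining = nbytes
    while remaining:
        chunk = sock.recv(remaining)  # may return fewer bytes than asked for
        if not chunk:
            raise ConnectionError("closed with %d bytes outstanding" % remaining)
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def send_message(sock, payload):
    """Send one message as <length><payload>."""
    if len(payload) > 0xFFFF:
        raise ValueError("payload too long for a two-byte length prefix")
    sock.sendall(HEADER.pack(len(payload)) + payload)


def recv_message(sock):
    """Receive one complete message, however TCP happened to split it."""
    (length,) = HEADER.unpack(recv_exactly(sock, HEADER.size))
    return recv_exactly(sock, length)


# Usage over a local socket pair standing in for a TCP connection.
left, right = socket.socketpair()
send_message(left, b"hello")
send_message(left, b"")  # an empty message is still a distinct message
print(recv_message(right), recv_message(right))
left.close()
right.close()
```

With this framing an empty message and a closed connection are distinguishable: the first returns `b""`, the second raises while reading the header.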
| From | Akira Li <4kir4.1i@gmail.com> |
|---|---|
| Date | 2015-09-21 06:07 +0300 |
| Message-ID | <mailman.18.1442804862.28679.python-list@python.org> |
| In reply to | #96901 |
"James Harris" <james.harris.1@gmail.com> writes:
...
> There are a few things and more crop up as time goes on. For example,
> over TCP it would be helpful to have a function to receive a specific
> number of bytes or one to read bytes until reaching a certain
> delimiter such as newline or zero or space etc.
The answer is sock.makefile('rb') then `file.read(nbytes)` returns a
specific number of bytes.
`file.readline()` reads until a newline (b'\n'). There is a Python issue:
"Add support for reading records with arbitrary separators to the
standard IO stack"
http://bugs.python.org/issue1152248
See also
http://bugs.python.org/issue17083
Perhaps, it is easier to implement read_until(sep) that is best suited
for a particular case.
> Even better would be to be able to use the iteration protocol so you
> could just code next() and get the next such chunk of read in a for
> loop.
file is an iterator over lines i.e., next(file) works.
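
A short sketch of the `makefile()` route, using a local socket pair in place of a real TCP connection. `read_until` is a hypothetical helper along the lines suggested above, not a standard-library function.

```python
import socket


def read_until(reader, sep):
    """Hypothetical helper: read up to and including sep from a binary file."""
    data = bytearray()
    while not data.endswith(sep):
        byte = reader.read(1)  # served from the reader's buffer
        if not byte:
            break  # EOF before the separator was seen
        data += byte
    return bytes(data)


left, right = socket.socketpair()
left.sendall(b"HEADfield\x00line one\nline two\n")
left.close()

with right.makefile("rb") as reader:
    header = reader.read(4)              # 4 bytes, fewer only at EOF
    field = read_until(reader, b"\x00")  # arbitrary separator
    first_line = reader.readline()       # up to and including b"\n"
    remaining = list(reader)             # iteration yields the other lines
right.close()
print(header, field, first_line, remaining)
```

Reading a byte at a time from the buffered file object avoids one `recv()` call per byte, though a version that scans larger blocks would likely be faster.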
> When sending it would be good to just say to send a bunch of bytes but
> know that you will get told how many were sent (or didn't get sent) if
> it fails. Sock.sendall() doesn't do that.
sock.send() returns the number of bytes sent that may be less than given.
You could reimplement sock.sendall() to include the number of bytes
successfully sent in case of an error.
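
Such a reimplementation might look like the sketch below. `send_all_counted` and `PartialSendError` are hypothetical names, and the count only says how many bytes were handed to the local stack, not how many reached the peer.

```python
import socket


class PartialSendError(OSError):
    """Hypothetical exception recording how much was sent before a failure."""

    def __init__(self, sent, total, cause):
        super().__init__("sent %d of %d bytes: %s" % (sent, total, cause))
        self.sent = sent
        self.total = total


def send_all_counted(sock, data):
    """Like sock.sendall(), but report the byte count if sending fails."""
    view = memoryview(data)
    sent = 0
    while sent < len(view):
        try:
            sent += sock.send(view[sent:])  # may accept only part of the data
        except OSError as exc:
            raise PartialSendError(sent, len(view), exc) from exc
    return sent


# Usage over a local socket pair standing in for a TCP connection.
left, right = socket.socketpair()
print(send_all_counted(left, b"x" * 1000))
left.close()
right.close()
```

Subclassing `OSError` keeps existing `except OSError` handlers working while exposing `sent` and `total` to callers that want them.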
> I thought UDP would deliver (or drop) a whole datagram but cannot find
> anything in the Python documentation to guarantee that. In fact
> documentation for the send() call says that apps are responsible for
> checking that all data has been sent. They may mean that to apply to
> stream protocols only but it doesn't state that. (Of course, UDP
> datagrams are limited in size so the call may validly indicate
> incomplete transmission even when the first part of a big message is
> sent successfully.)
>
> Receiving no bytes is taken as indicating the end of the
> communication. That's OK for TCP but not for UDP so there should be a
> way to distinguish between the end of data and receiving an empty
> datagram.
There is no end of communication in UDP and therefore there is no end of
data. If you get zero bytes in return, it means that you've
received a zero-length datagram.
sock.recvfrom() is a thin wrapper around the corresponding C
function. You could read any docs you like about UDP sockets.
http://stackoverflow.com/questions/5307031/how-to-detect-receipt-of-a-0-length-udp-datagram
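A small loopback experiment to see this from Python. It assumes a platform that delivers empty datagrams to the application (Linux does); the timeout is there so the script fails rather than hangs where that assumption does not hold.

```python
import socket

receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
receiver.settimeout(2.0)         # fail instead of blocking forever

sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sender.sendto(b"", receiver.getsockname())         # zero-length datagram
sender.sendto(b"payload", receiver.getsockname())  # ordinary datagram

# recvfrom() returns (data, address); a sender address is present even
# when the data is empty, which is what marks it as a real datagram.
data1, addr1 = receiver.recvfrom(2048)
data2, addr2 = receiver.recvfrom(2048)

sender.close()
receiver.close()
```

On a connectionless socket an empty result is simply an empty message; there is no "peer closed" condition for it to be confused with.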
> The recv calls require a buffer size to be supplied which is a
> technical detail. A Python wrapper could save the programmer dealing
> with that.
It is not just a buffer size. It is the maximum amount of data to be
received at once i.e., sock.recv() may return less but never more.
You could use makefile() and read() if recv() is too low-level.
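For comparison, this is roughly the loop that makefile() and read() save you from writing. The helper name recv_exactly is hypothetical, and the sketch assumes a blocking stream socket.

```python
import socket

def recv_exactly(sock, nbytes):
    """Hypothetical helper: keep calling recv() until exactly `nbytes`
    have arrived; raise EOFError if the peer closes first."""
    chunks = []
    remaining = nbytes
    while remaining > 0:
        chunk = sock.recv(remaining)  # returns at most `remaining` bytes, possibly fewer
        if not chunk:                 # b'' on a stream socket: peer closed
            raise EOFError(
                "connection closed with {} bytes still expected".format(remaining))
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)
```

Passing `remaining` as the size argument is what guarantees the helper never consumes bytes belonging to the next message.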
> Reminder to self: encoding issues.
>
> None of the above is difficult to write and I have written the bits I
> need myself but, basically, there are things that would make socket IO
> easier and yet still compatible with more long-winded code. So I
> wondered if there were already some Python modules which were more
> convenient than what I found in the documentation.
>
> James
| From | "James Harris" <james.harris.1@gmail.com> |
|---|---|
| Date | 2015-09-22 21:05 +0100 |
| Message-ID | <mtsc60$41t$1@dont-email.me> |
| In reply to | #96910 |
"Akira Li" <4kir4.1i@gmail.com> wrote in message
news:mailman.18.1442804862.28679.python-list@python.org...
> "James Harris" <james.harris.1@gmail.com> writes:
> ...
>> There are a few things and more crop up as time goes on. For example,
>> over TCP it would be helpful to have a function to receive a specific
>> number of bytes or one to read bytes until reaching a certain
>> delimiter such as newline or zero or space etc.
>
> The answer is sock.makefile('rb') then `file.read(nbytes)` returns a
> specific number of bytes.
Thanks, I hadn't seen that. Now I know of it I see references to it all
over the place but beforehand it was in hiding....
It is exactly the type of convenience wrapper I was expecting Python to
have but expected it to be in another module. It looks as though it will
definitely cover some of the issues I had.
> `file.readline()` reads until newline (b'\n') There is Python Issue:
> "Add support for reading records with arbitrary separators to the
> standard IO stack"
> http://bugs.python.org/issue1152248
> See also
> http://bugs.python.org/issue17083
>
> Perhaps, it is easier to implement read_until(sep) that is best suited
> for a particular case.
OK.
...
>> When sending it would be good to just say to send a bunch of bytes but
>> know that you will get told how many were sent (or didn't get sent) if
>> it fails. Sock.sendall() doesn't do that.
>
> sock.send() returns the number of bytes sent that may be less than given.
> You could reimplement sock.sendall() to include the number of bytes
> successfully sent in case of an error.
I know. As mentioned, I wondered if there were already such functions to
save me using my own.
>> I thought UDP would deliver (or drop) a whole datagram but cannot find
>> anything in the Python documentation to guarantee that. In fact
>> documentation for the send() call says that apps are responsible for
>> checking that all data has been sent. They may mean that to apply to
>> stream protocols only but it doesn't state that. (Of course, UDP
>> datagrams are limited in size so the call may validly indicate
>> incomplete transmission even when the first part of a big message is
>> sent successfully.)
>>
>> Receiving no bytes is taken as indicating the end of the
>> communication. That's OK for TCP but not for UDP so there should be a
>> way to distinguish between the end of data and receiving an empty
>> datagram.
>
> There is no end of communication in UDP and therefore there is no end of
> data. If you've got a zero bytes in return then it means that you've
> received a zero length datagram.
>
> sock.recvfrom() is a thin wrapper around the corresponding C
> function. You could read any docs you like about UDP sockets.
>
> http://stackoverflow.com/questions/5307031/how-to-detect-receipt-of-a-0-length-udp-datagram
As mentioned to Dennis just now, I would prefer to write code to conform
with the documented behaviour of Python and its libraries, as long as
they were known to be reliable implementations of what was documented,
of course.
I agree with what you say. A zero-length UDP datagram should be possible
and not indicate end of input but is that guaranteed and portable?
(Rhetorical.) It seems not. Even the Linux man page for recv says: "If
no messages are available at the socket, the receive calls wait
for a message to arrive, unless the socket is nonblocking...." In that
case, of course, what it defines as a "message" - and whether it can be
zero length or not - is not stated.
>> The recv calls require a buffer size to be supplied which is a
>> technical detail. A Python wrapper could save the programmer dealing
>> with that.
>
> It is not just a buffer size. It is the maximum amount of data to be
> received at once i.e., sock.recv() may return less but never more.
My point was that we might want to request the entire next line or next
field of input and not know a maximum length. *C* programmers are used
to giving buffers fixed sizes often because then they can avoid fiddling
with memory management but Python normally does that for us. I was
suggesting that the thin wrapper around the socket recv() call is too
thin! The makefile() approach that you mentioned seems more Pythonesque,
though.
> You could use makefile() and read() if recv() is too low-level.
Yes.
James
| From | Marko Rauhamaa <marko@pacujo.net> |
|---|---|
| Date | 2015-09-23 00:00 +0300 |
| Message-ID | <8737y6cgp6.fsf@elektro.pacujo.net> |
| In reply to | #96990 |
"James Harris" <james.harris.1@gmail.com>:
> I agree with what you say. A zero-length UDP datagram should be
> possible and not indicate end of input but is that guaranteed and
> portable?
The zero-length payload size shouldn't be an issue, but UDP doesn't make
any guarantees about delivering the message. Your UDP application must
be prepared for some, most or all of the messages disappearing without
any error indication.
In practice, you'd end up implementing your own TCP on top of UDP
(retries, timeouts, acknowledgements, sequence numbers etc).
Marko
| From | "James Harris" <james.harris.1@gmail.com> |
|---|---|
| Date | 2015-09-22 22:28 +0100 |
| Message-ID | <mtsh2k$obt$1@dont-email.me> |
| In reply to | #96993 |
"Marko Rauhamaa" <marko@pacujo.net> wrote in message
news:8737y6cgp6.fsf@elektro.pacujo.net...
> "James Harris" <james.harris.1@gmail.com>:
>
>> I agree with what you say. A zero-length UDP datagram should be
>> possible and not indicate end of input but is that guaranteed and
>> portable?
>
> The zero-length payload size shouldn't be an issue, but UDP doesn't make
> any guarantees about delivering the message. Your UDP application must
> be prepared for some, most or all of the messages disappearing without
> any error indication.
>
> In practice, you'd end up implementing your own TCP on top of UDP
> (retries, timeouts, acknowledgements, sequence numbers etc).
The unreliability of UDP was not the case in point here. Rather, it was
about whether different platforms could be relied upon to deliver
zero-length datagrams to the app if the datagrams got safely across the
network.
James