| Started by | hroyd hroyd <hroyd@mailinator.com> |
|---|---|
| First post | 2011-04-20 03:26 -0500 |
| Last post | 2011-04-21 12:18 -0500 |
| Articles | 9 — 4 participants |
splitting binary data hroyd hroyd <hroyd@mailinator.com> - 2011-04-20 03:26 -0500
Re: splitting binary data 7stud -- <bbxx789_05ss@yahoo.com> - 2011-04-20 12:54 -0500
Re: splitting binary data Iñaki Baz Castillo <ibc@aliax.net> - 2011-04-21 05:52 -0500
Re: splitting binary data 7stud -- <bbxx789_05ss@yahoo.com> - 2011-04-21 12:13 -0500
Re: splitting binary data Iñaki Baz Castillo <ibc@aliax.net> - 2011-04-21 12:27 -0500
Re: splitting binary data 7stud -- <bbxx789_05ss@yahoo.com> - 2011-04-21 12:56 -0500
Re: splitting binary data "Y. NOBUOKA" <nobuoka@r-definition.com> - 2011-04-26 04:47 -0500
Re: splitting binary data hroyd hroyd <hroyd@mailinator.com> - 2011-04-21 05:01 -0500
Re: splitting binary data 7stud -- <bbxx789_05ss@yahoo.com> - 2011-04-21 12:18 -0500
| From | hroyd hroyd <hroyd@mailinator.com> |
|---|---|
| Date | 2011-04-20 03:26 -0500 |
| Subject | splitting binary data |
| Message-ID | <a27c4fbb30b28a4543b4ed0920037c24@ruby-forum.com> |
Hello

First post (I am new to Ruby :-)). Can you help?

I am using EventMachine to read TCP segments off the network. I read in a TCP segment that contains 4 messages. The segment's binary data is shown below, where
\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\ is the marker for each message. I would like to split the data into the 4 messages, but am having trouble doing so. When I split the data, the whole message gets inserted into the first array element.

I understand I may need to escape the \, but how would I do that for the following message? I can split it by unpacking to hex and then splitting, but that is inefficient for my needs as I use bindata to inspect the packet.

Any help is appreciated.

Thanks

\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\x00R\x02\x00\x00\x00'@\x01\x01\x00@\x02\x00@\x03\x04\n\x10\x8E\xC8\x80\x04\x04\x00\x00\x002@\x05\x04\x00\x00\x00d\xC0\b\b\x00\x01\x00\x01\x00\x01\x00\x02\x18\n\x10\x8E\b\x04\x18\x02\x02\x02 \n\x13\x00\x01 \x01\x01\x01\x01\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\x00A\x02\x00\x00\x00'@\x01\x01\x00@\x02\x00@\x03\x04\n\x10\x8E\xC8\x80\x04\x04\x00\x00\x002@\x05\x04\x00\x00\x00d\xC0\b\b\x00\x01\x00\x01\x00\x01\x00\x02\x10\x03\x03\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\x00d\x02\x00\x00\x00I@\x01\x01\x00@\x02\x1E\x02\x0E=\xD6R\x132H2H2H2H2H2H2H2H\x8A\xEA\x8A\xEA\x8A\xEA\x8A\xEA@\x03\x04\n\x10\x8E\n\x80\x04\x04\x00\x00\x00\x00@\x05\x04\x00\x00\x00d\xC0\b\f=\xD6\x01,=\xD6\x011=\xD6\v\xEB\x16.\xAE\xF0\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\x00N\x02\x00\x00\x003@\x01\x01\x00@\x02\b\x02\x03=\xD6R\xE3\xC0\x1F@\x03\x04\n\x10\x8E\n\x80\x04\x04\x00\x00\x00\x00@\x05\x04\x00\x00\x00d\xC0\b\f=\xD6\x01,=\xD6\x011=\xD6\v\xEB\x16.\xAD\xA8

--
Posted via http://www.ruby-forum.com/.
| From | 7stud -- <bbxx789_05ss@yahoo.com> |
|---|---|
| Date | 2011-04-20 12:54 -0500 |
| Message-ID | <cfb58e3ca13b30817c52febc64a9b79e@ruby-forum.com> |
| In reply to | #3221 |
hroyd hroyd wrote in post #993957:
> Hello
>
> First post (i am new to ruby :-)). Can you help?
>
> I am using eventmachine to read in TCP segments off the network. I read
> in a TCP segment that contains 4 messages. The TCP segment binary data
> is shown below, where
> \xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\ is the
> marker for each message. I would like to split the data intot he 4
> messages, but am having trouble doing so. When I split the data, the
> whole message gets inserted into the first array element.
I'm not seeing that. Your message starts with the delimiter, so the
first element of the array will be a blank string:
str = "\xFF\xFF" +
"\x61" +
"\xFF\xFF" +
"\x62" +
"\xFF\xFF" +
"\x63" +
"\xFF\xFF" +
"\x64"
pattern = "\xFF\xFF"
p str.split(pattern)
--output:--
["", "a", "b", "c", "d"]
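A small sketch of the point above, with one addition that is not from the thread: `String#split` keeps the leading empty string but drops trailing empty strings by default, and a negative limit keeps them. The two-byte delimiter and the variable names are just toy values for illustration, and dropping empty pieces with `reject` is only one possible choice (it would also discard a genuinely empty message sitting between two adjacent delimiters).

```ruby
# Toy two-byte delimiter from the example, tagged as binary so the
# result does not depend on the source file's encoding.
delimiter = "\xFF\xFF".dup.force_encoding(Encoding::BINARY)

# Data that both starts and ends with the delimiter.
data = delimiter + "a" + delimiter + "b" + delimiter

parts_default = data.split(delimiter)      # trailing empty strings are dropped
parts_all     = data.split(delimiter, -1)  # a negative limit keeps them
messages      = parts_default.reject(&:empty?)

p parts_default
p parts_all
p messages
```

If only the leading empty element is unwanted, removing just the first element when it is empty would be a more conservative alternative.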
| From | Iñaki Baz Castillo <ibc@aliax.net> |
|---|---|
| Date | 2011-04-21 05:52 -0500 |
| Message-ID | <BANLkTiktwYj=Q-2B+FFDhEnymiucqHp8CQ@mail.gmail.com> |
| In reply to | #3261 |
2011/4/20 7stud -- <bbxx789_05ss@yahoo.com>:
> str = "\xFF\xFF" +
> "\x61" +
> "\xFF\xFF" +
> "\x62" +
> "\xFF\xFF" +
> "\x63" +
> "\xFF\xFF" +
> "\x64"
>
> pattern = "\xFF\xFF"
> p str.split(pattern)
>
> --output:--
> ["", "a", "b", "c", "d"]
Note that this fails under Ruby 1.9:
p str.split(pattern)
ArgumentError: invalid byte sequence in UTF-8
from (irb):10:in `split'
--
Iñaki Baz Castillo
<ibc@aliax.net>
| From | 7stud -- <bbxx789_05ss@yahoo.com> |
|---|---|
| Date | 2011-04-21 12:13 -0500 |
| Message-ID | <bbe459046541b7ad39aca75e5822aa7d@ruby-forum.com> |
| In reply to | #3307 |
"Iñaki Baz Castillo" <ibc@aliax.net> wrote in post #994264: > 2011/4/20 7stud -- <bbxx789_05ss@yahoo.com>: >> p str.split(pattern) >> >> --output:-- >> ["", "a", "b", "c", "d"] > > Note that this fails under Ruby1.9: > > p str.split(pattern) > ArgumentError: invalid byte sequence in UTF-8 > from (irb):10:in `split' I guess you missed this: puts RUBY_VERSION .. .. .. --output:-- 1.9.2 -- Posted via http://www.ruby-forum.com/.
| From | Iñaki Baz Castillo <ibc@aliax.net> |
|---|---|
| Date | 2011-04-21 12:27 -0500 |
| Message-ID | <BANLkTikpNdaK9nVbavU4BNzkaNY=s+SXJQ@mail.gmail.com> |
| In reply to | #3322 |
2011/4/21 7stud -- <bbxx789_05ss@yahoo.com>:
>> Note that this fails under Ruby1.9:
>>
>> p str.split(pattern)
>> ArgumentError: invalid byte sequence in UTF-8
>> from (irb):10:in `split'
>
> I guess you missed this:
>
> puts RUBY_VERSION
>
> ...
>
> --output:--
> 1.9.2

Interesting. I also use 1.9.2, but I have noticed that it fails under irb and not when I run the above code from a separate file.

--
Iñaki Baz Castillo
<ibc@aliax.net>
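A likely explanation for the irb-versus-file difference, consistent with the encoding-based reply further down the thread, is that the same bytes end up tagged with different encodings in the two environments. The sketch below does not try to reproduce the `ArgumentError` (whether `split` raises on invalid UTF-8 may depend on the Ruby version); it only shows how to inspect the two taggings. The variable names are illustrative.

```ruby
# The same six bytes, tagged two different ways.
raw = "\xFF\xFF\x61\xFF\xFF\x62"
as_utf8   = raw.dup.force_encoding(Encoding::UTF_8)
as_binary = raw.dup.force_encoding(Encoding::BINARY)

p as_utf8.valid_encoding?    # \xFF can never appear in valid UTF-8
p as_binary.valid_encoding?  # every byte sequence is valid as binary

# Splitting works when both the string and the pattern are binary.
pattern = "\xFF\xFF".dup.force_encoding(Encoding::BINARY)
p as_binary.split(pattern)
```

Checking `str.encoding` and `str.valid_encoding?` in both irb and the script is a quick way to confirm whether this is what differs on a given setup.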
| From | 7stud -- <bbxx789_05ss@yahoo.com> |
|---|---|
| Date | 2011-04-21 12:56 -0500 |
| Message-ID | <1c812474cfe40a58c9c935e0b1559f78@ruby-forum.com> |
| In reply to | #3326 |
"Iñaki Baz Castillo" <ibc@aliax.net> wrote in post #994342: > 2011/4/21 7stud -- <bbxx789_05ss@yahoo.com>: >> ... >> >> --output:-- >> 1.9.2 > > > Interesting, I also use 1.9.2, but have realized that it fails under > irb, but not in case I run the above code in a separate file.a I never use irb like interfaces in any language anymore--they are unreliable. -- Posted via http://www.ruby-forum.com/.
| From | "Y. NOBUOKA" <nobuoka@r-definition.com> |
|---|---|
| Date | 2011-04-26 04:47 -0500 |
| Message-ID | <BANLkTin=eOvsZfo5fD9bNc0twZUuBJ5A+g@mail.gmail.com> |
| In reply to | #3329 |
On Ruby 1.9, a String object knows its own encoding. If a String object contains byte sequences that are invalid for that encoding, the String#split method raises an error.

Without the magic comment, it does not matter that a string literal includes non-ASCII bytes.

## example: OK!!
#-------------------------------------------------
#! ruby-1.9.2
str = "\xFF\xFF\x61\xFF\xFF\x62\xFF\xFF\x63\xFF\xFF\x64"
p str.encoding #=> #<Encoding:ASCII-8BIT>
p str.valid_encoding? #=> true
pattern = "\xFF\xFF"
p str.split( pattern ) #=> ["", "a", "b", "c", "d"]
#-------------------------------------------------

However, when the magic comment declares the file encoding to be UTF-8, it does matter that a string literal includes non-ASCII bytes.

## example: NG
#-------------------------------------------------
#! ruby-1.9.2
# coding: UTF-8
str = "\xFF\xFF\x61\xFF\xFF\x62\xFF\xFF\x63\xFF\xFF\x64"
p str.encoding #=> #<Encoding:UTF-8>
p str.valid_encoding? #=> false
pattern = "\xFF\xFF"
p pattern.valid_encoding? #=> false
p str.split( pattern ) # ERROR OCCURS!!!
#-------------------------------------------------

To avoid this problem, you must change the encoding of any string that includes non-ASCII bytes to ASCII-8BIT.

## example: avoiding the problem
#-------------------------------------------------
#! ruby-1.9.2
# coding: UTF-8
str = "\xFF\xFF\x61\xFF\xFF\x62\xFF\xFF\x63\xFF\xFF\x64"
# change the encoding of the string
str.force_encoding Encoding::ASCII_8BIT
p str.encoding #=> #<Encoding:ASCII-8BIT>
p str.valid_encoding? #=> true
pattern = "\xFF\xFF".force_encoding Encoding::ASCII_8BIT
p pattern.valid_encoding? #=> true
p str.split( pattern ) #=> ["", "a", "b", "c", "d"]
#-------------------------------------------------

Kind regards,

--
NOBUOKA Yu
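The `force_encoding` approach above could be wrapped in a small helper, sketched here under a few assumptions: the method name `split_messages` and its parameters are hypothetical (not from the thread or any library), the sample payloads are shortened stand-ins for the real messages, and empty pieces are simply discarded. It works on copies so the caller's strings keep their original encoding tags.

```ruby
# Hypothetical helper: the name and signature are illustrative only.
# Splits data on marker after tagging copies of both as binary, so the
# result does not depend on how the inputs were originally tagged.
def split_messages(data, marker)
  binary_data   = data.dup.force_encoding(Encoding::BINARY)
  binary_marker = marker.dup.force_encoding(Encoding::BINARY)
  # The data starts with a marker, so split yields a leading empty
  # string; drop empty pieces.
  binary_data.split(binary_marker).reject(&:empty?)
end

marker = "\xFF" * 16
# Shortened stand-in payloads, not the full messages from the thread.
data = marker + "\x00R\x02" + marker + "\x00A\x02"

messages = split_messages(data, marker)
p messages.length
p messages
```

Note that `split` discards the marker itself, so a parser that expects each message to begin with the marker would need it prepended again.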
| From | hroyd hroyd <hroyd@mailinator.com> |
|---|---|
| Date | 2011-04-21 05:01 -0500 |
| Message-ID | <7568e966e6a587834b24d6893c5b6c41@ruby-forum.com> |
| In reply to | #3221 |
Thanks for the reply, that works.

I was trying to split on
"\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\"
but dropping the last \ was what I was missing:
"\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF"

["", "\x00R\x02\x00\x00\x00'@\x01\x01\x00@\x02\x00@\x03\x04\n\x10\x8E\xC8\x80\x04\x04\x00\x00\x002@\x05\x04\x00\x00\x00d\xC0\b\b\x00\x01\x00\x01\x00\x01\x00\x02\x18\n\x10\x8E\b\x04\x18\x02\x02\x02 \n\x13\x00\x01 \x01\x01\x01\x01", "\x00A\x02\x00\x00\x00'@\x01\x01\x00@\x02\x00@\x03\x04\n\x10\x8E\xC8\x80\x04\x04\x00\x00\x002@\x05\x04\x00\x00\x00d\xC0\b\b\x00\x01\x00\x01\x00\x01\x00\x02\x10\x03\x03", "\x00d\x02\x00\x00\x00I@\x01\x01\x00@\x02\x1E\x02\x0E=\xD6R\x132H2H2H2H2H2H2H2H\x8A\xEA\x8A\xEA\x8A\xEA\x8A\xEA@\x03\x04\n\x10\x8E\n\x80\x04\x04\x00\x00\x00\x00@\x05\x04\x00\x00\x00d\xC0\b\f=\xD6\x01,=\xD6\x011=\xD6\v\xEB\x16.\xAE\xF0", "\x00N\x02\x00\x00\x003@\x01\x01\x00@\x02\b\x02\x03=\xD6R\xE3\xC0\x1F@\x03\x04\n\x10\x8E\n\x80\x04\x04\x00\x00\x00\x00@\x05\x04\x00\x00\x00d\xC0\b\f=\xD6\x01,=\xD6\x011=\xD6\v\xEB\x16.\xAD\xA8"]

Thanks for your help
| From | 7stud -- <bbxx789_05ss@yahoo.com> |
|---|---|
| Date | 2011-04-21 12:18 -0500 |
| Message-ID | <bd8e6df73eed51dd17b0d11bfdc15f5b@ruby-forum.com> |
| In reply to | #3305 |
hroyd hroyd wrote in post #994257:
>
> Thanks for your help
>

Sure. Also, note that Ruby lets you do this:

pattern = "\xFF" * 16
p pattern

--output:--
"\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF"

...so that you don't have to write that out by hand.
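As a variation on the same idea (not something proposed in the thread), the marker can also be built from integers with `Array#pack`. A sketch: `pack("C*")` returns a binary (ASCII-8BIT) string, so the marker's encoding does not depend on the source file's encoding or on a magic comment.

```ruby
# Build the 16-byte marker from integer byte values.
# Array#pack("C*") packs each integer as one unsigned byte and returns
# a binary (ASCII-8BIT) string.
marker = ([0xFF] * 16).pack("C*")

p marker.bytesize
p marker.encoding
p marker.valid_encoding?
```

A marker built this way can be passed directly as the pattern to `split` on data that is itself tagged as binary.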