| Started by | "Aaron D. Gifford" <astounding@gmail.com> |
|---|---|
| First post | 2011-04-06 12:52 -0500 |
| Last post | 2011-04-06 18:33 -0500 |
| Articles | 5 — 3 participants |
String#each_*slice* methods (like Enumerable#each_slice) "Aaron D. Gifford" <astounding@gmail.com> - 2011-04-06 12:52 -0500
Re: String#each_*slice* methods (like Enumerable#each_slice) Quintus <sutniuq@gmx.net> - 2011-04-06 13:42 -0500
Re: String#each_*slice* methods (like Enumerable#each_slice) "Aaron D. Gifford" <astounding@gmail.com> - 2011-04-06 13:49 -0500
Re: String#each_*slice* methods (like Enumerable#each_slice) "Aaron D. Gifford" <astounding@gmail.com> - 2011-04-06 14:09 -0500
Re: String#each_*slice* methods (like Enumerable#each_slice) 7stud -- <bbxx789_05ss@yahoo.com> - 2011-04-06 18:33 -0500
| From | "Aaron D. Gifford" <astounding@gmail.com> |
|---|---|
| Date | 2011-04-06 12:52 -0500 |
| Subject | String#each_*slice* methods (like Enumerable#each_slice) |
| Message-ID | <BANLkTinMv0-L01+pwiiLc0Gruk+56B1GeQ@mail.gmail.com> |
Hi,
I find I periodically need to iterate over slices of a string.
Enumerable has the useful each_slice method, but in Ruby 1.9, I don't
see an equivalent for the String class.
So I've monkey-patched String a bit like this:
## Monkeypatch String to add some each_*slice* methods:
class String
  ## Like Enumerable#each_slice() only it yields a string
  ## of chars characters (the slice):
  def each_slice(chars)
    self.scan(/.{1,#{chars}}/m).each do |s|
      yield s
    end
  end

  ## Like Enumerable#each_slice() only it yields an array
  ## of Fixnum bytes from the string (the slice):
  def each_byteslice(bytes)
    self.bytes.to_a.each_slice(bytes) do |s|
      yield s
    end
  end

  ## Like Enumerable#each_slice() only it yields a binary
  ## string of specified bytes (the slice):
  def each_bslice(bytes)
    if encoding == Encoding::BINARY
      str = self
    else
      str = self.dup.force_encoding(Encoding::BINARY)
    end
    str.scan(/.{1,#{bytes}}/m).each do |s|
      yield s
    end
  end
end
So now for the question. Is there a better way to accomplish
something similar? I'm not debating whether to do it as a monkey
patch or not--that's irrelevant to me. But is there a more efficient
way to slice up strings and iterate over fixed sized chunks?
One alternative each_bslice implementation I tried used
str.bytes.to_a.map(&:chr).each_slice(x){|c| p c.join} but it was a bit
slower in benchmarks versus the str.scan method.
Aaron out.
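One further option for the binary-slice case, not raised in the thread: String#byteslice, which to my knowledge is available from Ruby 1.9.3 onward. The sketch below is only an illustration under that assumption; the method name each_byte_chunk is hypothetical, and no claim is made here about how it compares with str.scan in speed.

```ruby
# Minimal sketch, assuming Ruby 1.9.3+ (String#byteslice).
# each_byte_chunk is a hypothetical name, not part of the original post.
def each_byte_chunk(str, size)
  # Return an enumerator when called without a block.
  return enum_for(:each_byte_chunk, str, size) unless block_given?

  # Step through byte offsets 0, size, 2*size, ...
  (0...str.bytesize).step(size) do |offset|
    # byteslice returns a new string, so forcing its encoding
    # does not touch the original.
    yield str.byteslice(offset, size).force_encoding(Encoding::BINARY)
  end
end

each_byte_chunk("hello world", 4) { |chunk| p chunk }
```

Like each_bslice above, this can cut a multibyte character in half, which is why each chunk is tagged as binary.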
| From | Quintus <sutniuq@gmx.net> |
|---|---|
| Date | 2011-04-06 13:42 -0500 |
| Message-ID | <4D9CB445.8000504@gmx.net> |
| In reply to | #2389 |
On 06.04.2011 19:52, Aaron D. Gifford wrote:
> So now for the question. Is there a better way to accomplish
> something similar? I'm not debating whether to do it as a monkey
> patch or not--that's irrelevant to me. But is there a more efficient
> way to slice up strings and iterate over fixed sized chunks?
>
> One alternative each_bslice implementation I tried used
> str.bytes.to_a.map(&:chr).each_slice(x){|c| p c.join} but it was a bit
> slower in benchmarks versus the str.scan method.
>
> Aaron out.
>
>
Use Enumerators:
================================
irb(main):001:0> str = "ÄÄÄÖÖÖÜÜÜ"
=> "ÄÄÄÖÖÖÜÜÜ"
irb(main):002:0> str.chars.each_slice(3){|x| p x}
["Ä", "Ä", "Ä"]
["Ö", "Ö", "Ö"]
["Ü", "Ü", "Ü"]
=> nil
irb(main):003:0> str.bytes.each_slice(3){|x| p x}
[195, 132, 195]
[132, 195, 132]
[195, 150, 195]
[150, 195, 150]
[195, 156, 195]
[156, 195, 156]
=> nil
irb(main):004:0>
================================
Vale,
Marvin
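The irb session above yields arrays of characters or bytes. If substrings are wanted instead, each array can be joined back together. A minimal sketch of that step follows; the helper name substring_slices is hypothetical and not from the post.

```ruby
# encoding: utf-8

# Hypothetical helper: split str into substrings of at most n characters
# by joining each array that each_slice produces.
def substring_slices(str, n)
  str.chars.each_slice(n).map(&:join)
end

p substring_slices("ÄÄÄÖÖÖÜÜÜ", 3)
```

The join step builds one extra string per slice, which may be part of why scan can come out ahead when substrings are the goal.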
| From | "Aaron D. Gifford" <astounding@gmail.com> |
|---|---|
| Date | 2011-04-06 13:49 -0500 |
| Message-ID | <BANLkTi=HCJY8KW0VR6qZ+cRmXFLBFy0OUg@mail.gmail.com> |
| In reply to | #2392 |
Quintus <sutniuq@gmx.net> replied:
> Use Enumerators:
> ================================
> irb(main):001:0> str = "ÄÄÄÖÖÖÜÜÜ"
> => "ÄÄÄÖÖÖÜÜÜ"
> irb(main):002:0> str.chars.each_slice(3){|x| p x}
> ["Ä", "Ä", "Ä"]
> ["Ö", "Ö", "Ö"]
> ["Ü", "Ü", "Ü"]
> => nil
> irb(main):003:0> str.bytes.each_slice(3){|x| p x}
> [195, 132, 195]
> [132, 195, 132]
> [195, 150, 195]
> [150, 195, 150]
> [195, 156, 195]
> [156, 195, 156]
> => nil
> irb(main):004:0>
> ================================
>
> Vale,
> Marvin
Yes, I agree, that can work.
As I said in my original post:
>> One alternative each_bslice implementation I tried used
>> str.bytes.to_a.map(&:chr).each_slice(x){|c| p c.join} but it was a bit
>> slower in benchmarks versus the str.scan method.
That implementation did use enumerators. But it was slower than
str.scan. Hence my asking if there was a better (faster/more
efficient) way.
I didn't try benchmarking str.chars.each_slice vs str.scan. I'll have
to check that out. Thanks for pointing that out to me!
Aaron out.
| From | "Aaron D. Gifford" <astounding@gmail.com> |
|---|---|
| Date | 2011-04-06 14:09 -0500 |
| Message-ID | <BANLkTi=7bSbp32tPbDOL1vSBTOR=m+Ce+w@mail.gmail.com> |
| In reply to | #2393 |
Looking more closely at str.scan vs str.chars.each_slice for string slicing, it appears that the best one to use depends on what form of slice one needs.
If I need a substring (a slice yielded as a string) rather than an array of characters or an array of bytes, then the scan method is consistently faster on my machine.
However, if I want an array of characters or bytes, then str.chars.each_slice or str.bytes.each_slice is faster.
Most of the time, however, I need a substring slice.
Aaron out.
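A comparison of this kind can be reproduced with the standard Benchmark library. The sketch below is only an assumption about how such a test might look: the helper names, sample string, and run count are made up for illustration, and no timings are claimed, since results depend on the Ruby version and machine.

```ruby
require 'benchmark'

# Hypothetical helpers, not from the thread. Both return substrings.
def slices_via_scan(str, n)
  str.scan(/.{1,#{n}}/m)
end

def slices_via_chars(str, n)
  str.chars.each_slice(n).map(&:join)
end

sample = ("a".."z").to_a.join * 100   # 2,600 characters
runs   = 200                          # arbitrary; raise for steadier numbers

Benchmark.bm(18) do |bm|
  bm.report("scan:")             { runs.times { slices_via_scan(sample, 3) } }
  bm.report("chars.each_slice:") { runs.times { slices_via_chars(sample, 3) } }
end
```

Both helpers produce the same substrings, so the timing difference reflects only how the slices are built.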
| From | 7stud -- <bbxx789_05ss@yahoo.com> |
|---|---|
| Date | 2011-04-06 18:33 -0500 |
| Message-ID | <14aa0a86278a4755ef0050004fb530c9@ruby-forum.com> |
| In reply to | #2389 |
Aaron D. Gifford wrote in post #991274:
> Hi,
>
> I find I periodically need to iterate over slices of a string.
> Enumerable has the useful each_slice method, but in Ruby 1.9, I don't
> see an equivalent for the String class.
>
How about:
str = "hello world"
while str.size > 0
  substr = str.slice!(0, 3)  # (offset, length); slice! removes the chunk from str
  puts "-->#{substr}<--"
end
--output:--
-->hel<--
-->lo <--
-->wor<--
-->ld<--
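Note that slice! consumes the string, so str is empty once the loop finishes. If the original needs to survive, the same chunks can be read by stepping over offsets instead. A minimal sketch, with the hypothetical helper name each_substring:

```ruby
# Hypothetical helper: yields substrings of at most `length` characters
# without modifying str.
def each_substring(str, length)
  # Return an enumerator when called without a block.
  return enum_for(:each_substring, str, length) unless block_given?

  (0...str.size).step(length) do |offset|
    yield str[offset, length]   # non-destructive (offset, length) lookup
  end
end

str = "hello world"
each_substring(str, 3) { |substr| puts "-->#{substr}<--" }
```

After the call, str still holds "hello world".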
--
Posted via http://www.ruby-forum.com/.