Groups > comp.lang.java.programmer > #21301 > unrolled thread
| Started by | Jan Burse <janburse@fastmail.fm> |
|---|---|
| First post | 2013-01-10 14:38 +0100 |
| Last post | 2013-01-11 14:30 +0100 |
| Articles | 20 on this page of 21 — 7 participants |
Substring changes (JDK 1.7) Jan Burse <janburse@fastmail.fm> - 2013-01-10 14:38 +0100
Re: Substring changes (JDK 1.7) markspace <markspace@nospam.nospam> - 2013-01-10 08:15 -0800
Re: Substring changes (JDK 1.7) Joshua Cranmer <Pidgeot18@verizon.invalid> - 2013-01-10 10:48 -0600
Re: Substring changes (JDK 1.7) markspace <markspace@nospam.nospam> - 2013-01-10 09:22 -0800
Re: Substring changes (JDK 1.7) Lars Enderin <lars.enderin@telia.com> - 2013-01-10 19:35 +0100
Re: Substring changes (JDK 1.7) Jan Burse <janburse@fastmail.fm> - 2013-01-10 20:08 +0100
Re: Substring changes (JDK 1.7) Jan Burse <janburse@fastmail.fm> - 2013-01-10 20:12 +0100
Re: Substring changes (JDK 1.7) Roedy Green <see_website@mindprod.com.invalid> - 2013-01-10 12:22 -0800
Re: Substring changes (JDK 1.7) Robert Klemme <shortcutter@googlemail.com> - 2013-01-10 22:58 +0100
Re: Substring changes (JDK 1.7) Roedy Green <see_website@mindprod.com.invalid> - 2013-01-10 15:20 -0800
Re: Substring changes (JDK 1.7) Robert Klemme <shortcutter@googlemail.com> - 2013-01-11 07:29 +0100
Re: Substring changes (JDK 1.7) Roedy Green <see_website@mindprod.com.invalid> - 2013-01-10 23:19 -0800
Re: Substring changes (JDK 1.7) Robert Klemme <shortcutter@googlemail.com> - 2013-01-12 17:50 +0100
Re: Substring changes (JDK 1.7) markspace <markspace@nospam.nospam> - 2013-01-12 09:59 -0800
Re: Substring changes (JDK 1.7) Jan Burse <janburse@fastmail.fm> - 2013-01-12 19:14 +0100
Re: Substring changes (JDK 1.7) markspace <markspace@nospam.nospam> - 2013-01-12 10:36 -0800
Re: Substring changes (JDK 1.7) Jan Burse <janburse@fastmail.fm> - 2013-01-13 12:09 +0100
Re: Substring changes (JDK 1.7) Jan Burse <janburse@fastmail.fm> - 2013-01-11 09:23 +0100
Re: Substring changes (JDK 1.7) "Chris Uppal" <chris.uppal@metagnostic.REMOVE-THIS.org> - 2013-01-11 08:29 +0000
Re: Substring changes (JDK 1.7) Jan Burse <janburse@fastmail.fm> - 2013-01-11 10:56 +0100
Re: Substring changes (JDK 1.7) Jan Burse <janburse@fastmail.fm> - 2013-01-11 14:30 +0100
| From | Jan Burse <janburse@fastmail.fm> |
|---|---|
| Date | 2013-01-10 14:38 +0100 |
| Subject | Substring changes (JDK 1.7) |
| Message-ID | <kcmg8p$7ee$1@news.albasani.net> |
Dear All,

> Recent versions of the JDK do not reuse the backing char[].
> The reason is that the offset and length fields have been
> removed from String to save memory.

Did this affect some of your code?

Bye
| From | markspace <markspace@nospam.nospam> |
|---|---|
| Date | 2013-01-10 08:15 -0800 |
| Message-ID | <kcmpfs$6jd$1@dont-email.me> |
| In reply to | #21301 |
On 1/10/2013 5:38 AM, Jan Burse wrote:
> Dear All,
>
> > Recent versions of the JDK do not reuse the backing char[].
> > The reason is that the offset and length fields have been
> > removed from String to save memory.
>
> Did this affect some of your code?
>
> Bye

Wrong on both counts. Where did you read this nonsense?

<http://hg.openjdk.java.net/jdk7/jdk7-gate/jdk/file/tip/src/share/classes/java/lang/String.java>
| From | Joshua Cranmer <Pidgeot18@verizon.invalid> |
|---|---|
| Date | 2013-01-10 10:48 -0600 |
| Message-ID | <kcmrdb$joo$1@dont-email.me> |
| In reply to | #21302 |
On 1/10/2013 10:15 AM, markspace wrote:
> On 1/10/2013 5:38 AM, Jan Burse wrote:
>> Dear All,
>>
>> > Recent versions of the JDK do not reuse the backing char[].
>> > The reason is that the offset and length fields have been
>> > removed from String to save memory.
>>
>> Did this affect some of your code?
>>
>> Bye
>
> Wrong on both counts. Where did you read this nonsense?
>
> <http://hg.openjdk.java.net/jdk7/jdk7-gate/jdk/file/tip/src/share/classes/java/lang/String.java>

<http://hg.openjdk.java.net/jdk8/jdk8-gate/jdk/rev/2c773daa825d> suggests differently...

-- 
Beware of bugs in the above code; I have only proved it correct, not tried it. -- Donald E. Knuth
| From | markspace <markspace@nospam.nospam> |
|---|---|
| Date | 2013-01-10 09:22 -0800 |
| Message-ID | <kcmtcg$1oq$1@dont-email.me> |
| In reply to | #21303 |
On 1/10/2013 8:48 AM, Joshua Cranmer wrote:
>
> <http://hg.openjdk.java.net/jdk8/jdk8-gate/jdk/rev/2c773daa825d>
> suggests differently...

That's 8, not 7. If you're going to ask about JDK 8, don't put "JDK 1.7" in your subject title.
| From | Lars Enderin <lars.enderin@telia.com> |
|---|---|
| Date | 2013-01-10 19:35 +0100 |
| Message-ID | <50EF0A03.20405@telia.com> |
| In reply to | #21304 |
2013-01-10 18:22, markspace wrote:
> On 1/10/2013 8:48 AM, Joshua Cranmer wrote:
>>
>> <http://hg.openjdk.java.net/jdk8/jdk8-gate/jdk/rev/2c773daa825d>
>> suggests differently...
>
> That's 8, not 7. If you're going to ask about JDK 8, don't put "JDK
> 1.7" in your subject title.

The only question was in the OP. Jan Burse set the title, not Joshua.

-- 
Lars Enderin
| From | Jan Burse <janburse@fastmail.fm> |
|---|---|
| Date | 2013-01-10 20:08 +0100 |
| Message-ID | <kcn3ir$fak$1@news.albasani.net> |
| In reply to | #21301 |
Jan Burse wrote:
> Dear All,
>
> > Recent versions of the JDK do not reuse the backing char[].
> > The reason is that the offset and length fields have been
> > removed from String to save memory.
>
> Did this affect some of your code?
>
> Bye
It's from JDK 1.7 Update 10.
Have a look:
C:\Users\Jan Burse>java -version
java version "1.7.0_10"
Java(TM) SE Runtime Environment (build 1.7.0_10-b18)
Java HotSpot(TM) 64-Bit Server VM (build 23.6-b04, mixed mode)
rt.jar:
public final class String
    implements java.io.Serializable, Comparable<String>, CharSequence {
    /** The value is used for character storage. */
    private final char value[];
    /** Cache the hash code for the string */
    private int hash; // Default to 0
    /** use serialVersionUID from JDK 1.0.2 for interoperability */
    private static final long serialVersionUID = -6849794470754667710L;
-- and --
    public String substring(int beginIndex, int endIndex) {
        if (beginIndex < 0) {
            throw new StringIndexOutOfBoundsException(beginIndex);
        }
        if (endIndex > value.length) {
            throw new StringIndexOutOfBoundsException(endIndex);
        }
        int subLen = endIndex - beginIndex;
        if (subLen < 0) {
            throw new StringIndexOutOfBoundsException(subLen);
        }
        return ((beginIndex == 0) && (endIndex == value.length)) ? this
                : new String(value, beginIndex, subLen);
    }
-- and --
    public String(char value[], int offset, int count) {
        if (offset < 0) {
            throw new StringIndexOutOfBoundsException(offset);
        }
        if (count < 0) {
            throw new StringIndexOutOfBoundsException(count);
        }
        // Note: offset or count might be near -1>>>1.
        if (offset > value.length - count) {
            throw new StringIndexOutOfBoundsException(offset + count);
        }
        this.value = Arrays.copyOfRange(value, offset, offset+count);
    }
| From | Jan Burse <janburse@fastmail.fm> |
|---|---|
| Date | 2013-01-10 20:12 +0100 |
| Message-ID | <kcn3pu$fak$2@news.albasani.net> |
| In reply to | #21307 |
Hi,

It was originally observed in a Scala newsgroup:

why is String grouped() so slow?
https://groups.google.com/forum/?fromgroups=#!topic/scala-user/D1qmblInfyg

Bye

Jan Burse wrote:
> Jan Burse wrote:
>> Dear All,
>>
>> > Recent versions of the JDK do not reuse the backing char[].
>> > The reason is that the offset and length fields have been
>> > removed from String to save memory.
>>
>> Did this affect some of your code?
>>
>> Bye
>
> It's from JDK 1.7 Update 10.
>
> Have a look:
>
> C:\Users\Jan Burse>java -version
> java version "1.7.0_10"
> Java(TM) SE Runtime Environment (build 1.7.0_10-b18)
> Java HotSpot(TM) 64-Bit Server VM (build 23.6-b04, mixed mode)
>
> rt.jar:
| From | Roedy Green <see_website@mindprod.com.invalid> |
|---|---|
| Date | 2013-01-10 12:22 -0800 |
| Message-ID | <dk8ue8p3nml7pv1rj34ojtgtc9i9cdn80n@4ax.com> |
| In reply to | #21301 |
On Thu, 10 Jan 2013 14:38:36 +0100, Jan Burse <janburse@fastmail.fm> wrote, quoted or indirectly quoted someone who said :

>Did this affect some of your code?

If this change happens, you would no longer consider using new String(String) to unencumber a substring.

You no longer have to worry about a tiny substring holding a meg+ sized base string around in memory.

-- 
Roedy Green Canadian Mind Products http://mindprod.com
Students who hire or con others to do their homework are as foolish as couch potatoes who hire others to go to the gym for them.
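For readers who have not met the idiom Roedy mentions, here is a minimal sketch of it. The class and method names are made up for illustration. Under the older sharing implementation the extra new String(...) forced a trimmed private copy; with the 1.7.0_10 source quoted earlier in the thread, substring() already copies, so the wrapper is merely redundant.

```java
// Illustrative only: class and method names are hypothetical.
class Unencumber {
    // Returns the first space-delimited token of a (possibly huge) string.
    static String firstToken(String big) {
        int end = big.indexOf(' ');
        if (end < 0) {
            end = big.length();
        }
        // Under the old sharing implementation, substring() alone would keep
        // big's entire char[] reachable; wrapping it in new String(String)
        // produced a copy holding only the token's characters.
        return new String(big.substring(0, end));
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder("key");
        for (int i = 0; i < 1000000; i++) {
            sb.append(" filler");
        }
        String token = firstToken(sb.toString());
        System.out.println(token + " (" + token.length() + " chars)");
    }
}
```

Either way the caller ends up with a short String whose lifetime does not depend on the large one.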
| From | Robert Klemme <shortcutter@googlemail.com> |
|---|---|
| Date | 2013-01-10 22:58 +0100 |
| Message-ID | <al8rs6Fo963U1@mid.individual.net> |
| In reply to | #21309 |
On 10.01.2013 21:22, Roedy Green wrote:
> If this change happens, you would no longer consider using new String(
> String) to unencumber a substring.
>
> You no longer have to worry about a tiny substring holding a meg+
> sized base string around in memory.

Instead you have to worry about tons of substrings drawn from the same input String occupying a lot more memory and slowing down GC. Trade offs, trade offs...

Cheers

robert

-- 
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/
| From | Roedy Green <see_website@mindprod.com.invalid> |
|---|---|
| Date | 2013-01-10 15:20 -0800 |
| Message-ID | <iviue8df2qf1v83jaljd9rmpnkbi6ojlt2@4ax.com> |
| In reply to | #21312 |
On Thu, 10 Jan 2013 22:58:24 +0100, Robert Klemme <shortcutter@googlemail.com> wrote, quoted or indirectly quoted someone who said :

>Instead you have to worry about tons of substrings drawn from the same
>input String to occupy a lot more memory and slowing down GC. Trade
>offs, trade offs...

I wonder if it could work like this. Perhaps GC could notice a giant string encumbered by a few small strings, and could do a new String for you and gc the base string.

If you don't need the base string itself, I think most of the time you are best off doing the new String.

For what I do, I am peeling off small strings from a big string which represents a file image. I keep the big string to the last minute. Encumbering works well for me.

-- 
Roedy Green Canadian Mind Products http://mindprod.com
Students who hire or con others to do their homework are as foolish as couch potatoes who hire others to go to the gym for them.
| From | Robert Klemme <shortcutter@googlemail.com> |
|---|---|
| Date | 2013-01-11 07:29 +0100 |
| Message-ID | <al9pq3Fua06U1@mid.individual.net> |
| In reply to | #21312 |
On 11.01.2013 06:26, Stefan Ram wrote:
> Robert Klemme <shortcutter@googlemail.com> writes:
>> On 10.01.2013 21:22, Roedy Green wrote:
>>> You no longer have to worry about a tiny substring holding a meg+
>>> sized base string around in memory.
>> Instead you have to worry about tons of substrings drawn from the same
>> input String to occupy a lot more memory and slowing down GC. Trade
>> offs, trade offs...
>
> But this is more natural, it fulfills the expectation of non-expert
> programmers.

But it would be a significant change. There is so much software written under the assumption of the old implementation. That change might actually break existing programs (break in the sense of worse performance or new GC issues). Then again it might be that there are just not that many programs which make use of that knowledge. Who knows?

> Expert programmers can implement a custom string class
> with the previous behaviour,

Well, shouldn't such a basic thing be part of the standard library?

> or, - possibly better - a custom
> implementation of CharSequence (if only more APIs would use
> CharSequence instead of String!).

I agree. But unfortunately public classes and APIs are set in stone.

Kind regards

robert

-- 
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/
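As a rough illustration of the "custom implementation of CharSequence" idea quoted above, the sketch below shows one way such a class might look. CharView is a hypothetical name, not a JDK class, and the design (one up-front copy, then constant-time windows) is only an assumption about what a poster might write.

```java
// Hypothetical class, not part of the JDK: a read-only window onto a shared char[].
final class CharView implements CharSequence {
    private final char[] data;   // shared between views, never copied by subSequence
    private final int offset;
    private final int length;

    CharView(String s) {
        this(s.toCharArray(), 0, s.length());  // one copy up front
    }

    private CharView(char[] data, int offset, int length) {
        this.data = data;
        this.offset = offset;
        this.length = length;
    }

    @Override
    public int length() {
        return length;
    }

    @Override
    public char charAt(int index) {
        if (index < 0 || index >= length) {
            throw new StringIndexOutOfBoundsException(index);
        }
        return data[offset + index];
    }

    @Override
    public CharView subSequence(int start, int end) {
        if (start < 0 || end > length || start > end) {
            throw new StringIndexOutOfBoundsException("start " + start + ", end " + end);
        }
        // Constant time: a new window onto the same array, as the old substring did.
        return new CharView(data, offset + start, end - start);
    }

    @Override
    public String toString() {
        // Copies only when a real String is needed.
        return new String(data, offset, length);
    }
}
```

The catch raised in the quoted text still applies: any API that insists on String forces a toString() copy at the boundary.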
| From | Roedy Green <see_website@mindprod.com.invalid> |
|---|---|
| Date | 2013-01-10 23:19 -0800 |
| Message-ID | <n5fve81rlc7vk6rpogslgse33hfiov73rj@4ax.com> |
| In reply to | #21317 |
On Fri, 11 Jan 2013 07:29:16 +0100, Robert Klemme <shortcutter@googlemail.com> wrote, quoted or indirectly quoted someone who said :

>Well, shouldn't such a basic thing be part of the standard library?

String is final and many things take a String parameter and nothing else. You can create something similar and use it like String.

-- 
Roedy Green Canadian Mind Products http://mindprod.com
Students who hire or con others to do their homework are as foolish as couch potatoes who hire others to go to the gym for them.
| From | Robert Klemme <shortcutter@googlemail.com> |
|---|---|
| Date | 2013-01-12 17:50 +0100 |
| Message-ID | <aldijmFpv7lU1@mid.individual.net> |
| In reply to | #21318 |
On 11.01.2013 08:19, Roedy Green wrote:
> On Fri, 11 Jan 2013 07:29:16 +0100, Robert Klemme
> <shortcutter@googlemail.com> wrote, quoted or indirectly quoted
> someone who said :
>
>> Well, shouldn't such a basic thing be part of the standard library?
>
> String is final and many things take a String parm and nothing else.
> You can create something similar and use it like String.

There is no reason in what you say that it should not be part of the std lib.

Kind regards

robert

-- 
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/
| From | markspace <markspace@nospam.nospam> |
|---|---|
| Date | 2013-01-12 09:59 -0800 |
| Message-ID | <kcs8b2$lkd$1@dont-email.me> |
| In reply to | #21361 |
On 1/12/2013 8:50 AM, Robert Klemme wrote:
> On 11.01.2013 08:19, Roedy Green wrote:
>> On Fri, 11 Jan 2013 07:29:16 +0100, Robert Klemme
>> <shortcutter@googlemail.com> wrote, quoted or indirectly quoted
>> someone who said :
>>
>>> Well, shouldn't such a basic thing be part of the standard library?
>>
>> String is final and many things take a String parm and nothing else.
>> You can create something similar and use it like String.
>
> There is no reason in what you say that it should not be part of the std
> lib.

javax.swing.text.Segment preserves the semantics of a shared buffer. It's not a drop-in replacement for String (many of the methods differ or are absent). But Segment is extensible, so critical missing methods could be added.

I wonder if the best way to go would be to cheat and have String extended into a SharedString with the old implementation. This would violate the finality of String, but it's possible to synthesize these sorts of things if one has control of the JVM. Obviously, this needs to come from Oracle.
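A small sketch of the Segment suggestion, for anyone who has not used the class. SegmentDemo and view are illustrative names only; the Segment constructor and its public array field are the real javax.swing.text API.

```java
import javax.swing.text.Segment;

// Sketch: class and method names here are illustrative.
class SegmentDemo {
    // Returns a view of chars[from, to) without copying the array.
    static Segment view(char[] chars, int from, int to) {
        return new Segment(chars, from, to - from);
    }

    public static void main(String[] args) {
        char[] buffer = "usr/local/bin".toCharArray();
        Segment name = view(buffer, 10, 13);
        // Segment.toString() builds a String copy of just the viewed range.
        System.out.println(name);
        // The view still refers to the caller's array, so nothing was copied.
        System.out.println(name.array == buffer);
    }
}
```

Note that Segment exposes a mutable array, so unlike String it gives no immutability guarantee; whoever owns the buffer can change what every view sees.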
| From | Jan Burse <janburse@fastmail.fm> |
|---|---|
| Date | 2013-01-12 19:14 +0100 |
| Message-ID | <kcs964$6pd$1@news.albasani.net> |
| In reply to | #21362 |
markspace wrote:
> javax.swing.text.Segment preserves the semantics of a shared buffer.
> It's not a drop-in replacement for String (many of the methods differ or
> are absent). But Segment is extensible, so critical missing methods
> could be added.

Not available on Android I guess, :-(
| From | markspace <markspace@nospam.nospam> |
|---|---|
| Date | 2013-01-12 10:36 -0800 |
| Message-ID | <kcsafr$48r$1@dont-email.me> |
| In reply to | #21363 |
On 1/12/2013 10:14 AM, Jan Burse wrote:
> markspace wrote:
>> javax.swing.text.Segment preserves the semantics of a shared buffer.
>> It's not a drop-in replacement for String (many of the methods differ or
>> are absent). But Segment is extensible, so critical missing methods
>> could be added.
>
> Not available on Android I guess, :-(

Or you could just write your own from scratch. It's not hard. But again I kind of doubt anyone is doing enough heavy string processing on a small embedded device like Android where this kind of thing is going to affect actual performance.

Didn't your original complaint come from the Scala group? What does Scala have to do with Android?
| From | Jan Burse <janburse@fastmail.fm> |
|---|---|
| Date | 2013-01-13 12:09 +0100 |
| Message-ID | <kcu4l0$jaf$1@news.albasani.net> |
| In reply to | #21366 |
markspace wrote:
> Or you could just write your own from scratch. It's not hard. But
> again I kind of doubt anyone is doing enough heavy string processing on
> a small embedded device like Android where this kind of thing is going
> to affect actual performance.

Yes, Android Dalvik is quite slow. I observe 100x slower code execution compared to Windoze. But this is partly to do with the lower CPU frequencies on Android devices and also with the absence of heavy JITing due to battery conservation.

> Didn't your original complaint come from the Scala group? What
> does Scala have to do with Android?

I am not drawing any connection between where I picked up the observation and my interest in developing portably for Android and non-Android.

Bye
| From | Jan Burse <janburse@fastmail.fm> |
|---|---|
| Date | 2013-01-11 09:23 +0100 |
| Message-ID | <kcoi5q$dng$1@news.albasani.net> |
| In reply to | #21309 |
Roedy Green wrote:
> On Thu, 10 Jan 2013 14:38:36 +0100, Jan Burse <janburse@fastmail.fm>
> wrote, quoted or indirectly quoted someone who said :
>
>> Did this affect some of your code?
>
> If this change happens, you would no longer consider using new String(
> String) to unencumber a substring.
>
> You no longer have to worry about a tiny substring holding a meg+
> sized base string around in memory.
I have to sift through my code and check,
for every line that uses substring(), whether
there is some better solution.
For example, I caught myself doing things like:
    int k = path.lastIndexOf('/');
    while (k!=-1) {
        String name = path.substring(k+1);
        /* do something with name */
        path = path.substring(0,k);
        k = path.lastIndexOf('/');
    }
I guess the compiler cannot eliminate the copying
in the last substring(0,k), since String does not
have a length field anymore.
It would need to introduce an extra variable in the
code; this is also how I would rewrite the code,
using the two-argument variant of lastIndexOf.
But I guess the JIT cannot do it automatically,
or will it? Ever seen a tool that shows the
JITed assembler?
Bye
P.S.: I also wonder how well java.io.File
performs now.
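The rewrite described in this post (an extra index variable plus the two-argument lastIndexOf) might look roughly like the sketch below. PathWalk and namesFromRight are hypothetical names, and collecting the names into a list stands in for the "do something with name" step.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the rewrite; the class and method names are made up for illustration.
class PathWalk {
    // Visits the components after each '/' from right to left, like the loop
    // above, but tracks the logical end of the path in an int instead of
    // re-assigning path to an ever shorter copy.
    static List<String> namesFromRight(String path) {
        List<String> names = new ArrayList<String>();
        int end = path.length();                   // exclusive end of the remaining prefix
        int k = path.lastIndexOf('/', end - 1);    // a negative fromIndex yields -1
        while (k != -1) {
            names.add(path.substring(k + 1, end)); // copies only the component itself
            end = k;
            k = path.lastIndexOf('/', end - 1);
        }
        return names;
    }
}
```

With this shape the only characters copied are those of the extracted names, so the cost no longer depends on whether substring() shares or copies the prefix.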
| From | "Chris Uppal" <chris.uppal@metagnostic.REMOVE-THIS.org> |
|---|---|
| Date | 2013-01-11 08:29 +0000 |
| Message-ID | <g6WdneAGWPt1UHLNnZ2dnUVZ8qSdnZ2d@bt.com> |
| In reply to | #21320 |
Jan Burse wrote:
> Have to sift through my code and check
> every line that uses substring() whether
> there is some better solution.
>
> For example I trapped myself doing things like:
>
>     int k = path.lastIndexOf('/');
>     while (k!=-1) {
>         String name = path.substring(k+1);
>         /* do something with name */
>         path = path.substring(0,k);
>         k = path.lastIndexOf('/');
>     }
>
And what's wrong with that? Seems a sensible approach to me.
If you mean that it's suddenly /significantly/ slower, then I don't believe
you. (Though I freely admit that there will be a tiny few cases where it
/does/ matter -- in which cases I will be wrong.)
-- chris
| From | Jan Burse <janburse@fastmail.fm> |
|---|---|
| Date | 2013-01-11 10:56 +0100 |
| Message-ID | <kconk8$q09$1@news.albasani.net> |
| In reply to | #21321 |
Chris Uppal wrote:
>> For example I trapped myself doing things like:
>>
>>     int k = path.lastIndexOf('/');
>>     while (k!=-1) {
>>         String name = path.substring(k+1);
>>         /* do something with name */
>>         path = path.substring(0,k);
>>         k = path.lastIndexOf('/');
>>     }
>>
> And what's wrong with that? Seems a sensible approach to me.
>
> If you mean that it's suddenly /significantly/ slower, then I don't believe
> you. (Though I freely admit that there will be a tiny few cases where it
> /does/ matter -- in which cases I will be wrong.)
>
> -- chris
With the sharing semantics, its complexity is O(n+m), where
n is the length of the string and m is the number of
slashes. The m accounts for the creation of the
shared String shells.
Without the sharing semantics, when substring copies, its
complexity is O(n^2), assuming m is not too small (O(n*m)
in general). In each of the m iterations you no longer
create just a String shell; instead, the following statement
    path = path.substring(0,k);
copies a fair amount of path. JDK 1.7 Update 10 no longer
has the sharing semantics. So when my m is not too small,
it's probably a good idea to rewrite the code.
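To put a number on the estimate above, the following sketch simply counts how many characters the prefix copies would duplicate if every substring() call copies its range. CopyCost is a made-up helper for illustration; it measures nothing about the real JVM, it only adds up the lengths.

```java
// Hypothetical helper: counts characters, it does not time anything.
class CopyCost {
    // Sums the lengths of all prefixes produced by path = path.substring(0, k)
    // in the loop discussed above.
    static long prefixCharsCopied(String path) {
        long copied = 0;
        int k = path.lastIndexOf('/');
        while (k != -1) {
            copied += k;                 // substring(0, k) has k characters
            path = path.substring(0, k);
            k = path.lastIndexOf('/');
        }
        return copied;
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 1000; i++) {
            sb.append("x/");
        }
        // n = 2000 characters, m = 1000 separators
        System.out.println(prefixCharsCopied(sb.toString()));
    }
}
```

For the input built in main (n = 2000, m = 1000) the count is 1 + 3 + ... + 1999 = 1,000,000, which is n^2/4 and matches the quadratic growth argued above. With only a handful of separators the total stays close to a small multiple of n.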
Bye