

Groups > comp.lang.java.programmer > #21301 > unrolled thread

Substring changes (JDK 1.7)

Started by: Jan Burse <janburse@fastmail.fm>
First post: 2013-01-10 14:38 +0100
Last post: 2013-01-11 14:30 +0100
Articles 20 on this page of 21 — 7 participants



Contents

  Substring changes (JDK 1.7) Jan Burse <janburse@fastmail.fm> - 2013-01-10 14:38 +0100
    Re: Substring changes (JDK 1.7) markspace <markspace@nospam.nospam> - 2013-01-10 08:15 -0800
      Re: Substring changes (JDK 1.7) Joshua Cranmer <Pidgeot18@verizon.invalid> - 2013-01-10 10:48 -0600
        Re: Substring changes (JDK 1.7) markspace <markspace@nospam.nospam> - 2013-01-10 09:22 -0800
          Re: Substring changes (JDK 1.7) Lars Enderin <lars.enderin@telia.com> - 2013-01-10 19:35 +0100
    Re: Substring changes (JDK 1.7) Jan Burse <janburse@fastmail.fm> - 2013-01-10 20:08 +0100
      Re: Substring changes (JDK 1.7) Jan Burse <janburse@fastmail.fm> - 2013-01-10 20:12 +0100
    Re: Substring changes (JDK 1.7) Roedy Green <see_website@mindprod.com.invalid> - 2013-01-10 12:22 -0800
      Re: Substring changes (JDK 1.7) Robert Klemme <shortcutter@googlemail.com> - 2013-01-10 22:58 +0100
        Re: Substring changes (JDK 1.7) Roedy Green <see_website@mindprod.com.invalid> - 2013-01-10 15:20 -0800
        Re: Substring changes (JDK 1.7) Robert Klemme <shortcutter@googlemail.com> - 2013-01-11 07:29 +0100
          Re: Substring changes (JDK 1.7) Roedy Green <see_website@mindprod.com.invalid> - 2013-01-10 23:19 -0800
            Re: Substring changes (JDK 1.7) Robert Klemme <shortcutter@googlemail.com> - 2013-01-12 17:50 +0100
              Re: Substring changes (JDK 1.7) markspace <markspace@nospam.nospam> - 2013-01-12 09:59 -0800
                Re: Substring changes (JDK 1.7) Jan Burse <janburse@fastmail.fm> - 2013-01-12 19:14 +0100
                  Re: Substring changes (JDK 1.7) markspace <markspace@nospam.nospam> - 2013-01-12 10:36 -0800
                    Re: Substring changes (JDK 1.7) Jan Burse <janburse@fastmail.fm> - 2013-01-13 12:09 +0100
      Re: Substring changes (JDK 1.7) Jan Burse <janburse@fastmail.fm> - 2013-01-11 09:23 +0100
        Re: Substring changes (JDK 1.7) "Chris Uppal" <chris.uppal@metagnostic.REMOVE-THIS.org> - 2013-01-11 08:29 +0000
          Re: Substring changes (JDK 1.7) Jan Burse <janburse@fastmail.fm> - 2013-01-11 10:56 +0100
            Re: Substring changes (JDK 1.7) Jan Burse <janburse@fastmail.fm> - 2013-01-11 14:30 +0100



#21301 — Substring changes (JDK 1.7)

From: Jan Burse <janburse@fastmail.fm>
Date: 2013-01-10 14:38 +0100
Subject: Substring changes (JDK 1.7)
Message-ID: <kcmg8p$7ee$1@news.albasani.net>
Dear All,

 > Recent versions of the JDK do not reuse the backing char[].
 > The reason is that the offset and length fields have been
 > removed from String to save memory.

Did this affect some of your code?

Bye



#21302

From: markspace <markspace@nospam.nospam>
Date: 2013-01-10 08:15 -0800
Message-ID: <kcmpfs$6jd$1@dont-email.me>
In reply to: #21301
On 1/10/2013 5:38 AM, Jan Burse wrote:
> Dear All,
>
>  > Recent versions of the JDK do not reuse the backing char[].
>  > The reason is that the offset and length fields have been
>  > removed from String to save memory.
>
> Did this affect some of your code?
>
> Bye


Wrong on both counts.  Where did you read this nonsense?

<http://hg.openjdk.java.net/jdk7/jdk7-gate/jdk/file/tip/src/share/classes/java/lang/String.java>



#21303

From: Joshua Cranmer <Pidgeot18@verizon.invalid>
Date: 2013-01-10 10:48 -0600
Message-ID: <kcmrdb$joo$1@dont-email.me>
In reply to: #21302
On 1/10/2013 10:15 AM, markspace wrote:
> On 1/10/2013 5:38 AM, Jan Burse wrote:
>> Dear All,
>>
>>  > Recent versions of the JDK do not reuse the backing char[].
>>  > The reason is that the offset and length fields have been
>>  > removed from String to save memory.
>>
>> Did this affect some of your code?
>>
>> Bye
>
>
> Wrong on both counts.  Where did you read this nonsense?
>
> <http://hg.openjdk.java.net/jdk7/jdk7-gate/jdk/file/tip/src/share/classes/java/lang/String.java>

<http://hg.openjdk.java.net/jdk8/jdk8-gate/jdk/rev/2c773daa825d> 
suggests differently...


-- 
Beware of bugs in the above code; I have only proved it correct, not 
tried it. -- Donald E. Knuth



#21304

From: markspace <markspace@nospam.nospam>
Date: 2013-01-10 09:22 -0800
Message-ID: <kcmtcg$1oq$1@dont-email.me>
In reply to: #21303
On 1/10/2013 8:48 AM, Joshua Cranmer wrote:

>
> <http://hg.openjdk.java.net/jdk8/jdk8-gate/jdk/rev/2c773daa825d>
> suggests differently...


That's 8, not 7.  If you're going to ask about JDK 8, don't put "JDK 
1.7" in your subject title.



#21306

From: Lars Enderin <lars.enderin@telia.com>
Date: 2013-01-10 19:35 +0100
Message-ID: <50EF0A03.20405@telia.com>
In reply to: #21304
2013-01-10 18:22, markspace wrote:
> On 1/10/2013 8:48 AM, Joshua Cranmer wrote:
> 
>>
>> <http://hg.openjdk.java.net/jdk8/jdk8-gate/jdk/rev/2c773daa825d>
>> suggests differently...
> 
> 
> That's 8, not 7.  If you're going to ask about JDK 8, don't put "JDK
> 1.7" in your subject title.
> 
> 
The only question was in the OP. Jan Burse set the title, not Joshua.

-- 
Lars Enderin



#21307

From: Jan Burse <janburse@fastmail.fm>
Date: 2013-01-10 20:08 +0100
Message-ID: <kcn3ir$fak$1@news.albasani.net>
In reply to: #21301
Jan Burse wrote:
> Dear All,
>
>  > Recent versions of the JDK do not reuse the backing char[].
>  > The reason is that the offset and length fields have been
>  > removed from String to save memory.
>
> Did this affect some of your code?
>
> Bye

It's from JDK 1.7 Update 10.

Have a look:

C:\Users\Jan Burse>java -version
java version "1.7.0_10"
Java(TM) SE Runtime Environment (build 1.7.0_10-b18)
Java HotSpot(TM) 64-Bit Server VM (build 23.6-b04, mixed mode)

rt.jar:

public final class String
     implements java.io.Serializable, Comparable<String>, CharSequence {
     /** The value is used for character storage. */
     private final char value[];

     /** Cache the hash code for the string */
     private int hash; // Default to 0

     /** use serialVersionUID from JDK 1.0.2 for interoperability */
     private static final long serialVersionUID = -6849794470754667710L;

-- and --

     public String substring(int beginIndex, int endIndex) {
         if (beginIndex < 0) {
             throw new StringIndexOutOfBoundsException(beginIndex);
         }
         if (endIndex > value.length) {
             throw new StringIndexOutOfBoundsException(endIndex);
         }
         int subLen = endIndex - beginIndex;
         if (subLen < 0) {
             throw new StringIndexOutOfBoundsException(subLen);
         }
         return ((beginIndex == 0) && (endIndex == value.length)) ? this
                 : new String(value, beginIndex, subLen);
     }

-- and --

     public String(char value[], int offset, int count) {
         if (offset < 0) {
             throw new StringIndexOutOfBoundsException(offset);
         }
         if (count < 0) {
             throw new StringIndexOutOfBoundsException(count);
         }
         // Note: offset or count might be near -1>>>1.
         if (offset > value.length - count) {
             throw new StringIndexOutOfBoundsException(offset + count);
         }
         this.value = Arrays.copyOfRange(value, offset, offset+count);
     }



#21308

From: Jan Burse <janburse@fastmail.fm>
Date: 2013-01-10 20:12 +0100
Message-ID: <kcn3pu$fak$2@news.albasani.net>
In reply to: #21307
Hi,

It was originally observed in a Scala newsgroup:

why is String grouped() so slow?
https://groups.google.com/forum/?fromgroups=#!topic/scala-user/D1qmblInfyg

Bye

Jan Burse wrote:
> Jan Burse wrote:
>> Dear All,
>>
>>  > Recent versions of the JDK do not reuse the backing char[].
>>  > The reason is that the offset and length fields have been
>>  > removed from String to save memory.
>>
>> Did this affect some of your code?
>>
>> Bye
>
> It's from JDK 1.7 Update 10.
>
> Have a look:
>
> C:\Users\Jan Burse>java -version
> java version "1.7.0_10"
> Java(TM) SE Runtime Environment (build 1.7.0_10-b18)
> Java HotSpot(TM) 64-Bit Server VM (build 23.6-b04, mixed mode)
>
> rt.jar:



#21309

From: Roedy Green <see_website@mindprod.com.invalid>
Date: 2013-01-10 12:22 -0800
Message-ID: <dk8ue8p3nml7pv1rj34ojtgtc9i9cdn80n@4ax.com>
In reply to: #21301
On Thu, 10 Jan 2013 14:38:36 +0100, Jan Burse <janburse@fastmail.fm>
wrote, quoted or indirectly quoted someone who said :

>
>Did this affect some of your code?

If this change happens, you would no longer consider using new String(
String) to unencumber a substring.

You no longer have to worry about a tiny substring holding a meg+
sized base string around in memory.
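
For reference, the idiom in question looks roughly like this. It is only a sketch: the class and method names are made up for illustration, and the comments describe the older JDKs in which substring() shared the parent's char[].

```java
class Unencumber {
    /**
     * Returns the characters big[from, to) as a String.
     * On JDKs where substring() shares the parent's char[], the copy
     * constructor gave the result its own trimmed array, so the big
     * string could be collected. Where substring() already copies,
     * the extra new String(...) is redundant.
     */
    static String detach(String big, int from, int to) {
        return new String(big.substring(from, to));
    }

    public static void main(String[] args) {
        String big = "a very large file image would go here";
        System.out.println(detach(big, 2, 6));
    }
}
```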
-- 
Roedy Green Canadian Mind Products http://mindprod.com
Students who hire or con others to do their homework are as foolish 
as couch potatoes who hire others to go to the gym for them. 



#21312

From: Robert Klemme <shortcutter@googlemail.com>
Date: 2013-01-10 22:58 +0100
Message-ID: <al8rs6Fo963U1@mid.individual.net>
In reply to: #21309
On 10.01.2013 21:22, Roedy Green wrote:

> If this change happens, you would no longer consider using new String(
> String) to unencumber a substring.
>
> You no longer have to worry about a tiny substring holding a meg+
> sized base string around in memory.

Instead you have to worry about tons of substrings drawn from the same 
input String occupying a lot more memory and slowing down GC.  Trade-offs, 
trade-offs...

Cheers

	robert

-- 
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/



#21314

From: Roedy Green <see_website@mindprod.com.invalid>
Date: 2013-01-10 15:20 -0800
Message-ID: <iviue8df2qf1v83jaljd9rmpnkbi6ojlt2@4ax.com>
In reply to: #21312
On Thu, 10 Jan 2013 22:58:24 +0100, Robert Klemme
<shortcutter@googlemail.com> wrote, quoted or indirectly quoted
someone who said :

>Instead you have to worry about tons of substrings drawn from the same 
>input String occupying a lot more memory and slowing down GC.  Trade-offs, 
>trade-offs...

I wonder if it could work like this.

Perhaps GC could notice a giant string encumbered by a few small
strings, and could do a new String for you and gc the base string.

If you don't need the base string itself, I think most of the time you
are best off to do the new string.

For what I do, I am peeling off small strings from a big string which
represents a file image.  I keep the big string to the last minute.
Encumbering works well for me.
-- 
Roedy Green Canadian Mind Products http://mindprod.com
Students who hire or con others to do their homework are as foolish 
as couch potatoes who hire others to go to the gym for them. 



#21317

From: Robert Klemme <shortcutter@googlemail.com>
Date: 2013-01-11 07:29 +0100
Message-ID: <al9pq3Fua06U1@mid.individual.net>
In reply to: #21312
On 11.01.2013 06:26, Stefan Ram wrote:
> Robert Klemme <shortcutter@googlemail.com> writes:
>> On 10.01.2013 21:22, Roedy Green wrote:
>>> You no longer have to worry about a tiny substring holding a meg+
>>> sized base string around in memory.
>> Instead you have to worry about tons of substrings drawn from the same
>> input String occupying a lot more memory and slowing down GC.  Trade-offs,
>> trade-offs...
>
>    But this is more natural, it fulfills the expectation of non-expert
>    programmers.

But it would be a significant change.  There is so much software written 
under the assumption of the old implementation.  That change might 
actually break existing programs (break in the sense of less performance 
or new GC issues).

Then again it might be that there are just not that many programs which 
make use of that knowledge.  Who knows?

> Expert programmers can implement a custom string class
>    with the previous behaviour,

Well, shouldn't such a basic thing be part of the standard library?

> or, - possibly better - a custom
>    implementation of CharSequence (if only more APIs would use
>    CharSequence instead of String!).

I agree.  But unfortunately public classes and APIs are set in stone.
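
To make the CharSequence idea concrete, here is a minimal sketch of such a view class. The name CharSlice is hypothetical and nothing like it exists in the standard library; it only shows how subSequence() can share one char[] the way the old substring() did.

```java
final class CharSlice implements CharSequence {
    private final char[] buf;   // shared between all slices, never copied by subSequence()
    private final int offset;   // start of this slice within buf
    private final int count;    // number of characters in this slice

    CharSlice(String s) {
        this(s.toCharArray(), 0, s.length());
    }

    private CharSlice(char[] buf, int offset, int count) {
        this.buf = buf;
        this.offset = offset;
        this.count = count;
    }

    @Override
    public int length() {
        return count;
    }

    @Override
    public char charAt(int index) {
        if (index < 0 || index >= count) {
            throw new IndexOutOfBoundsException("index " + index);
        }
        return buf[offset + index];
    }

    @Override
    public CharSlice subSequence(int start, int end) {
        if (start < 0 || end > count || start > end) {
            throw new IndexOutOfBoundsException(start + ".." + end);
        }
        // O(1): only a new small object, the characters stay where they are
        return new CharSlice(buf, offset + start, end - start);
    }

    @Override
    public String toString() {
        // The only place that copies characters
        return new String(buf, offset, count);
    }
}
```

The catch is the one already mentioned: any API that insists on String forces a toString() call, and that copies.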

Kind regards

	robert

-- 
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/



#21318

From: Roedy Green <see_website@mindprod.com.invalid>
Date: 2013-01-10 23:19 -0800
Message-ID: <n5fve81rlc7vk6rpogslgse33hfiov73rj@4ax.com>
In reply to: #21317
On Fri, 11 Jan 2013 07:29:16 +0100, Robert Klemme
<shortcutter@googlemail.com> wrote, quoted or indirectly quoted
someone who said :

>Well, shouldn't such a basic thing be part of the standard library?

String is final and many things take a String parm and nothing else.
You can create something similar and use it like String.
-- 
Roedy Green Canadian Mind Products http://mindprod.com
Students who hire or con others to do their homework are as foolish 
as couch potatoes who hire others to go to the gym for them. 



#21361

From: Robert Klemme <shortcutter@googlemail.com>
Date: 2013-01-12 17:50 +0100
Message-ID: <aldijmFpv7lU1@mid.individual.net>
In reply to: #21318
On 11.01.2013 08:19, Roedy Green wrote:
> On Fri, 11 Jan 2013 07:29:16 +0100, Robert Klemme
> <shortcutter@googlemail.com> wrote, quoted or indirectly quoted
> someone who said :
>
>> Well, shouldn't such a basic thing be part of the standard library?
>
> String is final and many things take a String parm and nothing else.
> You can create something similar and use it like String.

Nothing in what you say is a reason why it should not be part of the 
standard library.

Kind regards

	robert

-- 
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/



#21362

From: markspace <markspace@nospam.nospam>
Date: 2013-01-12 09:59 -0800
Message-ID: <kcs8b2$lkd$1@dont-email.me>
In reply to: #21361
On 1/12/2013 8:50 AM, Robert Klemme wrote:
> On 11.01.2013 08:19, Roedy Green wrote:
>> On Fri, 11 Jan 2013 07:29:16 +0100, Robert Klemme
>> <shortcutter@googlemail.com> wrote, quoted or indirectly quoted
>> someone who said :
>>
>>> Well, shouldn't such a basic thing be part of the standard library?
>>
>> String is final and many things take a String parm and nothing else.
>> You can create something similar and use it like String.
>
> There is no reason in what you say that it should not be part of the std
> lib.

javax.swing.text.Segment preserves the semantics of a shared buffer. 
It's not a drop-in replacement for String (many of the methods differ or 
are absent). But Segment is extensible, so critical missing methods 
could be added.
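
A short sketch of what that looks like in practice. The helper name slice is made up for illustration; the Segment constructor and its public array, offset and count fields are the real API.

```java
import javax.swing.text.Segment;

class SegmentDemo {
    /** Wraps buf[from, to) without copying; the Segment points into the shared array. */
    static Segment slice(char[] buf, int from, int to) {
        return new Segment(buf, from, to - from);
    }

    public static void main(String[] args) {
        char[] buf = "usr/local/lib".toCharArray();
        Segment name = slice(buf, 4, 9);
        // toString() is where the characters finally get copied into a String
        System.out.println(name);
        // The view still refers to the caller's buffer
        System.out.println(name.array == buf);
    }
}
```

Note that a Segment is mutable and so is the array behind it, so the immutability guarantees of String are gone.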

I wonder if the best way to go would be to cheat and have String 
extended into a SharedString with the old implementation.  This would 
violate the finality of String, but it's possible to synthesize these 
sorts of things if one has control of the JVM.  Obviously, this needs to 
come from Oracle.




#21363

From: Jan Burse <janburse@fastmail.fm>
Date: 2013-01-12 19:14 +0100
Message-ID: <kcs964$6pd$1@news.albasani.net>
In reply to: #21362
markspace wrote:
> javax.swing.text.Segment preserves the semantics of a shared buffer.
> It's not a drop-in replacement for String (many of the methods differ or
> are absent). But Segment is extensible, so critical missing methods
> could be added.

Not available on Android I guess, :-(



#21366

From: markspace <markspace@nospam.nospam>
Date: 2013-01-12 10:36 -0800
Message-ID: <kcsafr$48r$1@dont-email.me>
In reply to: #21363
On 1/12/2013 10:14 AM, Jan Burse wrote:
> markspace wrote:
>> javax.swing.text.Segment preserves the semantics of a shared buffer.
>> It's not a drop-in replacement for String (many of the methods differ or
>> are absent). But Segment is extensible, so critical missing methods
>> could be added.
>
> Not available on Android I guess, :-(


Or you could just write your own from scratch.  It's not hard.  But 
again I kind of doubt anyone is doing enough heavy string processing on 
a small embedded device like Android where this kind of thing is going 
to affect actual performance.

Didn't your original complaint come from the Scala group?  What does 
Scala have to do with Android?




#21376

From: Jan Burse <janburse@fastmail.fm>
Date: 2013-01-13 12:09 +0100
Message-ID: <kcu4l0$jaf$1@news.albasani.net>
In reply to: #21366
markspace wrote:
> Or you could just write your own from scratch.  It's not hard.  But
> again I kind of doubt anyone is doing enough heavy string processing on
> a small embedded device like Android where this kind of thing is going
> to affect actual performance.

Yes, Android Dalvik is quite slow. I
observe 100x slower code execution
compared to Windoze.

But this is partly due to the lower CPU
frequencies on Android devices and partly to
the absence of heavy JITing, which is avoided
to conserve battery.

> Didn't your original complaint come from the Scala group?  What
> does Scala have to do with Android?

I am not drawing any connection between where
I picked up the observation and my interest in
developing portably for Android and non-Android.

Bye




#21320

From: Jan Burse <janburse@fastmail.fm>
Date: 2013-01-11 09:23 +0100
Message-ID: <kcoi5q$dng$1@news.albasani.net>
In reply to: #21309
Roedy Green wrote:
> On Thu, 10 Jan 2013 14:38:36 +0100, Jan Burse <janburse@fastmail.fm>
> wrote, quoted or indirectly quoted someone who said :
>
>>
>> Did this affect some of your code?
>
> If this change happens, you would no longer consider using new String(
> String) to unencumber a substring.
>
> You no longer have to worry about a tiny substring holding a meg+
> sized base string around in memory.
>

I have to sift through my code and check,
for every line that uses substring(), whether
there is some better solution.

For example I trapped myself doing things like:

     int k = path.lastIndexOf('/');
     while (k!=-1) {
         String name = path.substring(k+1);
         /* do something with name */
         path = path.substring(0,k);
         k = path.lastIndexOf('/');
     }

I guess the compiler cannot eliminate the copying
in the last substring(0,k), since String does not
have a length field anymore.

It would need to introduce an extra variable in the
code. This is also how I would rewrite the code, using
the two-argument variant of lastIndexOf.

But I guess the JIT cannot do it automatically,
or will it? Ever seen a tool that shows the
JITed assembler?
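
A minimal sketch of that rewrite, assuming the names are simply collected into a list (the helper namesFromEnd is hypothetical and stands in for "do something with name"). The path itself is never shortened; only an end index moves:

```java
import java.util.ArrayList;
import java.util.List;

class PathSegments {
    /** Collects the components after each '/', from last to first, like the loop above. */
    static List<String> namesFromEnd(String path) {
        List<String> names = new ArrayList<String>();
        int end = path.length();                       // exclusive end of the part still to scan
        int k = path.lastIndexOf('/', end - 1);        // a negative fromIndex simply yields -1
        while (k != -1) {
            names.add(path.substring(k + 1, end));     // copies only the component itself
            end = k;                                   // replaces path = path.substring(0,k)
            k = path.lastIndexOf('/', end - 1);
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(namesFromEnd("usr/local/lib/java"));
    }
}
```

As in the original loop, the leading component before the first '/' is not reported as a name.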

Bye

P.S.: I also wonder how well java.io.File
performs now.




#21321

From: "Chris Uppal" <chris.uppal@metagnostic.REMOVE-THIS.org>
Date: 2013-01-11 08:29 +0000
Message-ID: <g6WdneAGWPt1UHLNnZ2dnUVZ8qSdnZ2d@bt.com>
In reply to: #21320
Jan Burse wrote:

> I have to sift through my code and check,
> for every line that uses substring(), whether
> there is some better solution.
>
> For example I trapped myself doing things like:
>
>     int k = path.lastIndexOf('/');
>     while (k!=-1) {
>         String name = path.substring(k+1);
>         /* do something with name */
>         path = path.substring(0,k);
>         k = path.lastIndexOf('/');
>     }
>

And what's wrong with that ?  Seems a sensible approach to me.

If you mean that it's suddenly /significantly/ slower, then I don't believe 
you.  (Though I freely admit that there will be a tiny few cases where it 
/does/ matter -- in which cases I will be wrong.)

    -- chris 



#21323

From: Jan Burse <janburse@fastmail.fm>
Date: 2013-01-11 10:56 +0100
Message-ID: <kconk8$q09$1@news.albasani.net>
In reply to: #21321
Chris Uppal wrote:
>> For example I trapped myself doing things like:
>>
>>     int k = path.lastIndexOf('/');
>>     while (k!=-1) {
>>         String name = path.substring(k+1);
>>         /* do something with name */
>>         path = path.substring(0,k);
>>         k = path.lastIndexOf('/');
>>     }
>>
> And what's wrong with that ?  Seems a sensible approach to me.
>
> If you mean that it's suddenly /significantly/ slower, then I don't believe
> you.  (Though I freely admit that there will be a tiny few cases where it
> /does/ matter -- in which cases I will be wrong.)
>
>     -- chris
>
>

With the sharing semantics, its complexity is O(n+m), where
n is the length of the string and m is the number of
slashes. The m term counts the creation of the
shared String shells.

Without the sharing semantics, when substring copies, its
complexity is O(n*m), which approaches O(n^2) when m is not
too small. In each of the m iterations you no longer create
just a String shell; instead, the following statement

          path = path.substring(0,k);

copies a fair amount of path. JDK 1.7 Update 10 no longer has
the sharing semantics. So when my m is not too small,
it's probably a good idea to rewrite the code.
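
To put rough numbers on this, here is a small sketch that counts the characters copied instead of timing anything. It assumes every substring() call copies its result, as in the JDK 1.7 Update 10 source quoted earlier; the class and method names are made up for illustration.

```java
class CopyCost {
    /** Characters copied by the original loop when every substring() call copies. */
    static long copyingLoop(String path) {
        long copied = 0;
        int k = path.lastIndexOf('/');
        while (k != -1) {
            copied += path.length() - (k + 1);   // name = path.substring(k+1)
            copied += k;                         // path = path.substring(0,k)
            path = path.substring(0, k);
            k = path.lastIndexOf('/');
        }
        return copied;
    }

    /** Characters copied when only the names are extracted and an end index is moved. */
    static long indexLoop(String path) {
        long copied = 0;
        int end = path.length();
        int k = path.lastIndexOf('/', end - 1);
        while (k != -1) {
            copied += end - (k + 1);             // name = path.substring(k+1, end)
            end = k;
            k = path.lastIndexOf('/', end - 1);
        }
        return copied;
    }

    public static void main(String[] args) {
        // Worst-case shape: many one-character components
        StringBuilder sb = new StringBuilder("x");
        for (int i = 0; i < 1000; i++) {
            sb.append("/x");
        }
        String path = sb.toString();
        System.out.println(copyingLoop(path) + " vs " + indexLoop(path));
    }
}
```

For "a/b/c/d" the first count is 12 and the second is 3; the gap grows quadratically as the number of short components grows.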

Bye


