Re: StringBuilder

From Jan Burse <janburse@fastmail.fm>
Newsgroups comp.lang.java.programmer
Subject Re: StringBuilder
Date 2011-09-17 20:35 +0200
Organization albasani.net
Message-ID <j52pa6$ptl$1@news.albasani.net> (permalink)
References <96f358c8-a024-40db-b60b-300186c2f813@o10g2000vby.googlegroups.com> <j41fik$3qb$1@news.albasani.net> <j52jgd$iij$1@dont-email.me>


Stanimir Stamenkov wrote:
> Mon, 05 Sep 2011 05:27:15 +0200, /Jan Burse/:
>
>> If you then explicitly use StringBuilder you are
>> faster, because you save the new StringBuilder() and toString().
>>
>> So this is faster, since it uses 1 new and 1 toString():
>
> The StringBuilder.toString() is really fast - that's the point, and I
> don't think it is worth mentioning it.
>

I am not sure I can agree outright.

StringBuilder is a mutable object; String is an
immutable object. Therefore the obvious fast implementation,
which would share the buffer between the StringBuilder and
the String, does not work, because the following code would
break the immutability of String:

    StringBuilder buf=new StringBuilder();

    buf.append("Hello World!");

    String str=buf.toString();

    buf.replace(6,11,"Java");

    System.out.println("str="+str);

Through the side effect of buf.replace(), the value of the
string str would change. Therefore we find the following
slow implementation of toString() in the reference
implementation. Please note the comment:

    public String toString() {
        // Create a copy, don't share the array
        return new String(value, 0, count);
    }

http://kickjava.com/src/java/lang/StringBuilder.java.htm
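
The effect of that copy can be checked with a small self-contained
program. This is only a sketch built around the snippet above; the
class name ToStringCopyDemo and the helper snapshotThenMutate() are
made up for illustration. It shows that a String obtained from
toString() keeps its value when the builder is modified afterwards:

```java
// Hypothetical demo class; only StringBuilder and String come from the JDK.
final class ToStringCopyDemo {

    // Builds a string, takes a snapshot, then mutates the builder.
    static String[] snapshotThenMutate() {
        StringBuilder buf = new StringBuilder();
        buf.append("Hello World!");
        String str = buf.toString();     // snapshot taken here
        buf.replace(6, 11, "Java");      // later mutation of the builder
        return new String[] { str, buf.toString() };
    }

    public static void main(String[] args) {
        String[] result = snapshotThenMutate();
        System.out.println("str=" + result[0]); // str=Hello World!
        System.out.println("buf=" + result[1]); // buf=Hello Java!
    }
}
```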

And if we look at the constructor used, it really does
make a copy. There is a non-public constructor
in String that allows sharing, and it is, for
example, used to implement substring. But here
a constructor is used that does not share:

    public String(char value[], int offset, int count) {
        if (offset < 0) {
            throw new StringIndexOutOfBoundsException(offset);
        }
        if (count < 0) {
            throw new StringIndexOutOfBoundsException(count);
        }
        // Note: offset or count might be near -1>>>1.
        if (offset > value.length - count) {
            throw new StringIndexOutOfBoundsException(offset + count);
        }
        char[] v = new char[count];
        System.arraycopy(value, offset, v, 0, count);
        this.offset = 0;
        this.count = count;
        this.value = v;
    }

http://kickjava.com/src/java/lang/String.java.htm
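
That this public constructor copies can also be observed from the
outside. A minimal sketch, with the class name StringCtorCopyDemo
and the helper copyThenScribble() invented for the example: the
char array is overwritten after the String was built, and the
String does not change.

```java
// Hypothetical demo class; it uses only the public String(char[], int, int) constructor.
final class StringCtorCopyDemo {

    // Creates a String from an array, then overwrites the array.
    static String copyThenScribble() {
        char[] chars = { 'J', 'a', 'v', 'a' };
        String s = new String(chars, 0, chars.length); // copies the chars
        chars[0] = 'L';                                // does not reach s
        return s;
    }

    public static void main(String[] args) {
        System.out.println(copyThenScribble()); // Java
    }
}
```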

Possibly some program analysis would allow sharing.
But the copying also has a positive effect. When
the StringBuilder has, through manipulation, gained a much
greater capacity than necessary, the copy gets a smaller
char array, so that less space is used
as soon as the StringBuilder is reclaimed.

But maybe you are right that toString() is nevertheless
fast, since a) allocating objects is usually fast and
b) System.arraycopy can also be fast. Together
with the capacity-reducing effect, this could all add up
to only a small overhead.

BTW: OpenJDK uses the same code. In Harmony we find
a shared flag in the AbstractStringBuilder, and a
heuristic that decides whether to share or not. The
non-public String constructor is used for sharing:

    public String toString() {
        if (count == 0) {
            return ""; //$NON-NLS-1$
        }
        // Optimize String sharing for more performance
        int wasted = value.length - count;
        if (wasted >= 256
                || (wasted >= INITIAL_CAPACITY &&
                        wasted >= (count >> 1))) {
            return new String(value, 0, count);
        }
        shared = true;
        return new String(0, count, value);
    }

http://www.java2s.com/Open-Source/Java-Document/Apache-Harmony-Java-SE/java-package/java/lang/AbstractStringBuilder.java.htm

There is then a little overhead in the basic operations
of StringBuilder to check for sharing, and if the array
is shared, a copy is made:

    final void replace0(int start, int end, String string) {
      [...]
      if (!shared) {
          // index == count case is no-op
          System.arraycopy(value, end, value, start
                   + stringLength, count - end);
      } else {
          char[] newData = new char[value.length];
          System.arraycopy(value, 0, newData, 0, start);
          // index == count case is no-op
          System.arraycopy(value, end, newData, start
                   + stringLength, count - end);
          value = newData;
          shared = false;
      }
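
The idea behind the shared flag can be tried out in isolation.
What follows is a toy sketch, not Harmony's code: the names
CowBuffer, View, share() and ensureWritable() are all hypothetical,
and View only stands in for the String that would share the array.
Every write first checks the flag and copies the array if a view
may still be looking at it:

```java
import java.util.Arrays;

// Hypothetical toy class illustrating copy-on-write; not Harmony's implementation.
final class CowBuffer {
    private char[] value = new char[16];
    private int count;
    private boolean shared; // true while a View may still reference value

    // Read-only view over the first 'length' chars of an array.
    static final class View {
        private final char[] data;
        private final int length;

        View(char[] data, int length) {
            this.data = data;
            this.length = length;
        }

        @Override
        public String toString() {
            return new String(data, 0, length);
        }
    }

    // Called before every write: copy if shared, grow if too small.
    private void ensureWritable(int minCapacity) {
        if (shared || minCapacity > value.length) {
            int newCapacity = minCapacity > value.length
                    ? Math.max(minCapacity, value.length + (value.length >> 1) + 2)
                    : value.length;
            value = Arrays.copyOf(value, newCapacity);
            shared = false;
        }
    }

    CowBuffer append(String s) {
        ensureWritable(count + s.length());
        s.getChars(0, s.length(), value, count);
        count += s.length();
        return this;
    }

    void setCharAt(int index, char c) {
        if (index < 0 || index >= count) {
            throw new IndexOutOfBoundsException("index " + index);
        }
        ensureWritable(count);
        value[index] = c;
    }

    // Hands out the array without copying and remembers that it did so.
    View share() {
        shared = true;
        return new View(value, count);
    }

    boolean isShared() {
        return shared;
    }
}
```

Calling share() costs no copy at all; the price is paid only if
the buffer is written to again afterwards.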

The speed gained by sharing probably compensates for
the little extra check needed everywhere. So toString()
is probably relatively fast here, assuming that sharing
happens often enough. In the loop example
we can positively influence sharing by giving
a good initial capacity, because then the waste is small.

But giving an initial capacity for the whole loop is
probably non-trivial: how does the number of digits of the
squares develop? So assume our StringBuilder grows
according to its enlargeBuffer rule. In the case of
Harmony the capacity grows by a factor of 1.5 plus 2.

So right after an enlargement we have waste >= count/2;
because of the added two, waste = count/2 + 2. So no sharing
will happen (provided the waste also reaches INITIAL_CAPACITY,
as the heuristic requires).
When we have then added n characters, we have
waste' = count/2 + 2 - n and count' = count + n.
We have waste' < count'/2 only when 2 - n < n/2, that is, n > 4/3.
So only after adding 2 characters will sharing happen
again for sure.
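
The arithmetic can be replayed with a few lines of Java. This is
a sketch under assumptions: the class SharingHeuristic and its
methods are made up, INITIAL_CAPACITY = 16 is my assumption (the
value is not shown in the quoted code), and the count == 0 special
case of toString() is left out. wouldShare() repeats the condition
from the quoted toString(), and enlarge() applies the 1.5 plus 2
growth rule to a buffer that is exactly full:

```java
// Hypothetical helper class replaying the heuristic quoted from Harmony.
final class SharingHeuristic {
    // Assumption: the value 16 is not given in the quoted code.
    static final int INITIAL_CAPACITY = 16;

    // Same condition as in the quoted toString(); true means the array is shared.
    static boolean wouldShare(int capacity, int count) {
        int wasted = capacity - count;
        boolean copy = wasted >= 256
                || (wasted >= INITIAL_CAPACITY && wasted >= (count >> 1));
        return !copy;
    }

    // Growth rule as described in the post: factor 1.5 plus 2.
    static int enlarge(int capacity) {
        return capacity + (capacity >> 1) + 2;
    }

    public static void main(String[] args) {
        int count = 100;               // the buffer is exactly full
        int capacity = enlarge(count); // 152 after the enlargement
        for (int n = 0; n <= 3; n++) {
            System.out.println("n=" + n + " share="
                    + wouldShare(capacity, count + n));
        }
        // n=0 and n=1 give share=false, n=2 and n=3 give share=true
    }
}
```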

So the heuristic has a little glitch. But never mind.

Best Regards
