

Groups > comp.lang.java.programmer > #10041 > unrolled thread

alias for Integer

Started by jrobinss <julien.robinson2@gmail.com>
First post 2011-11-18 06:50 -0800
Last post 2011-11-21 15:38 -0500
Articles 14 — 9 participants



Contents

  alias for Integer jrobinss <julien.robinson2@gmail.com> - 2011-11-18 06:50 -0800
    Re: alias for Integer markspace <-@.> - 2011-11-18 07:11 -0800
    Re: alias for Integer Roedy Green <see_website@mindprod.com.invalid> - 2011-11-18 07:32 -0800
    Re: alias for Integer Lew <lewbloch@gmail.com> - 2011-11-18 07:34 -0800
      Re: alias for Integer jrobinss <julien.robinson2@gmail.com> - 2011-11-18 08:26 -0800
        Re: alias for Integer Andreas Leitgeb <avl@gamma.logic.tuwien.ac.at> - 2011-11-18 18:00 +0000
        Re: alias for Integer markspace <-@.> - 2011-11-18 10:17 -0800
        Re: alias for Integer Roedy Green <see_website@mindprod.com.invalid> - 2011-11-18 20:30 -0800
      Re: alias for Integer Roedy Green <see_website@mindprod.com.invalid> - 2011-11-18 20:23 -0800
    Re: alias for Integer Jim Janney <jjanney@shell.xmission.com> - 2011-11-18 13:14 -0700
    Re: alias for Integer Patricia Shanahan <pats@acm.org> - 2011-11-18 12:43 -0800
    Re: alias for Integer jrobinss <julien.robinson2@gmail.com> - 2011-11-21 01:50 -0800
    Re: alias for Integer Robert Klemme <shortcutter@googlemail.com> - 2011-11-21 20:58 +0100
    Re: alias for Integer David Lamb <dalamb@cs.queensu.ca> - 2011-11-21 15:38 -0500

#10041 — alias for Integer

From jrobinss <julien.robinson2@gmail.com>
Date 2011-11-18 06:50 -0800
Subject alias for Integer
Message-ID <21374513.220.1321627842805.JavaMail.geo-discussion-forums@yqmj32>
Hi all,

this is a simple question, so it may be silly...

I am processing structures that contain integers, structures such as matrixes used in statistical analysis. As I implement these as Maps of indexes, I use the class Integer, as in

  // Map<matrix index, DB index> <- I'd like to remove this comment!
  public static Map<Integer, Integer> myMap = ...;

Now what I'd like is to write
  public static Map<MatrixIndex, DbIndex> myMap = ...;

The advantage is that the code is auto-documented, but even better that the compiler will check that I never get mixed up in different indexes, by relying on strong typing.

Usually, I would do this by extending the relevant class. But here it's Integer, which is final.

Questions: 
1. is this a bad idea, and should I proceed with Integer?
2. if not, how would you implement this?
3. is there any performance issue in defining my own classes instead of Integer?

My current solution is to define my own classes for replacing Integer, such as:

public final class MatrixIndex {
  public final int value;
  public MatrixIndex(int i) {value = i;}
}

I won't benefit from autoboxing, then... :-(
(I'm just hoping it doesn't break too much of the code, because even though it's mine to break, I don't have infinite time)

Note that (exceptionally for me) performance *is* an issue here. I haven't yet narrowed it down, but the code executes very slowly and eats up much too much memory at the moment. I'm starting to optimize it, that's why I'm starting with strongly typing it to prevent errors.

thanks for any tips!
JRobinss



#10042

From markspace <-@.>
Date 2011-11-18 07:11 -0800
Message-ID <ja5sjp$d8p$1@dont-email.me>
In reply to #10041
On 11/18/2011 6:50 AM, jrobinss wrote:
>
> 3. is there any
> performance issue in defining my own classes instead of Integer?


To me yes, there is a performance issue, and you should definitely 
consider using something besides Integer and Map for this.


> Note that (exceptionally for me) performance *is* an issue here. I
> haven't yet narrowed it down, but the code executes very slowly and
> eats up much too much memory at the moment. I'm starting to optimize
> it, that's why I'm starting with strongly typing it to prevent
> errors.


Good, you're doing this the right way.  You should look into profiling 
your code.  NetBeans has an excellent profiler built in, and will handle 
most "user made" projects as well as its own format.  You should look 
into it.

The most important thing is to pinpoint where the code is slow and work 
on those bits.  I suspect that auto-boxing and unboxing is costing you 
too much time, but I couldn't prove it.  The profiler could.





#10044

From Roedy Green <see_website@mindprod.com.invalid>
Date 2011-11-18 07:32 -0800
Message-ID <3hucc79gq3disp9hlbrqb7ks7er4k7r4lc@4ax.com>
In reply to #10041
On Fri, 18 Nov 2011 06:50:42 -0800 (PST), jrobinss
<julien.robinson2@gmail.com> wrote, quoted or indirectly quoted
someone who said :

>Now what I'd like is to write
>  public static Map<MatrixIndex, DbIndex> myMap = ...;

one way to do it would be to write

public static MatrixMap myMap = new MatrixMap ( 2000 );
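
The post does not show what MatrixMap contains. A minimal sketch, assuming it simply wraps a HashMap<Integer, Integer>; the method names putDbIndex and dbIndexFor, the -1 "not found" value, and the reading of the constructor argument as an initial capacity are all guesses made for illustration:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical MatrixMap: one named class per kind of mapping, so the
// declaration documents itself without a comment.
final class MatrixMap {
    private final Map<Integer, Integer> dbIndexByMatrixIndex;

    // Assumption: the constructor argument is the HashMap initial capacity.
    MatrixMap(int initialCapacity) {
        dbIndexByMatrixIndex = new HashMap<Integer, Integer>(initialCapacity);
    }

    void putDbIndex(int matrixIndex, int dbIndex) {
        dbIndexByMatrixIndex.put(matrixIndex, dbIndex);
    }

    // Returns the DB index, or -1 when the matrix index is unmapped.
    int dbIndexFor(int matrixIndex) {
        Integer dbIndex = dbIndexByMatrixIndex.get(matrixIndex);
        return dbIndex == null ? -1 : dbIndex.intValue();
    }
}
```

This names the map and its operations, but both arguments of putDbIndex are still plain ints, so the compiler would not catch a caller who swaps them.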
-- 
Roedy Green Canadian Mind Products
http://mindprod.com
I can't come to bed just yet. Somebody is wrong on the Internet. 



#10045

From Lew <lewbloch@gmail.com>
Date 2011-11-18 07:34 -0800
Message-ID <12712433.869.1321630449979.JavaMail.geo-discussion-forums@prmf13>
In reply to #10041
jrobinss wrote:
> I am processing structures that contain integers, structures such as matrixes used in statistical analysis. As I implement these as Maps of indexes, I use the class Integer, as in
> 
>   // Map<matrix index, DB index> <- I'd like to remove this comment!
>   public static Map<Integer, Integer> myMap = ...;
> 
> Now what I'd like is to write
>   public static Map<MatrixIndex, DbIndex> myMap = ...;
> 
> The advantage is that the code is auto-documented, but even better that the compiler will check that I never get mixed up in different indexes, by relying on strong typing.
> 
> Usually, I would do this by extending the relevant class. But here it's Integer, which is final.

That is kind of an antipattern.  Joshua Bloch advises "Favor composition over inheritance" in /Effective Java/ (2nd ed.).  You could write a wrapper for 'Integer', but then 'Integer' already *is* a wrapper type.

> Questions: 
> 1. is this a bad idea, and should I proceed with Integer?

It's not a terrible idea, necessarily, but it seems unnecessary. 

> 2. if not, how would you implement this?

I would pick a type that actually helps.  I don't see what you object to in 'Integer', quite frankly.

> 3. is there any performance issue in defining my own classes instead of Integer?

How could anyone possibly know the answer to this question?

> My current solution is to define my own classes for replacing Integer, such as:
> 
> public final class MatrixIndex {
>   public final int value;
>   public MatrixIndex(int i) {value = i;}
> }

Congratulations, you just re-invented 'Integer', but without any of its features.

Now you have to go through all sorts of gyrations to translate your custom class to and from 'int'.  Your code complexity goes up, and you have to maintain your own custom substitute for a fundamental API type.  Thus your risk of bugs and code-maintenance costs skyrocket.

How again does this help you?

> I won't benefit from autoboxing, then... :-(

No one does.

> (I'm just hoping it doesn't break too much of the code, because even though it's mine to break, I don't have infinite time)

Why waste *any* time on this?  Just use 'Integer'.

> Note that (exceptionally for me) performance *is* an issue here. I haven't yet narrowed it down, but the code executes very slowly and eats up much too much memory at the moment. I'm starting to optimize it, that's why I'm starting with strongly typing it to prevent errors.

"Optimize" and "prevent errors" are, at best, orthogonal, and at worst (and typically in this kind of premature action) the former interferes with the latter.

Yet you say them in the same breath as though they were the same thing.

They aren't.

Your "strong typing" isn't, really.  'Integer' already is a strong type.

As for what you choose to "optimize", what evidence (that is, factual data derived from actual tests whose protocols are publicized) do you have that you are attacking the slow parts?

IOW, what /actual/ tests have you performed, and how did you control the variables like system load, HotSpot warmup and application load?

You have a performance problem.  So you decide to obfuscate random parts of your code base with zero foundation for your actions.  Now you have two problems.

-- 
Lew



#10046

From jrobinss <julien.robinson2@gmail.com>
Date 2011-11-18 08:26 -0800
Message-ID <19404286.1644.1321633587044.JavaMail.geo-discussion-forums@yqhd1>
In reply to #10045
Aaah, it had been some time, but I'm not disappointed. :-)

I'm sorry Lew, but I feel that I haven't been quite clear.

This is a large piece of code, not extraordinary but still a reasonable size, and I didn't write it. I end up manipulating indexes all over the place, so that I have for example a table (or map or whatever) that associates database indexes to matrix indexes, or identifiers to indexes, etc. These are all ints or Integers, and take part in rigmaroles of loops, indexes of indexes of arrays of arrays and such joyous constructs. In short, it's a lot of ints and hard to understand.

That these indexes and identifiers are all integers is in fact nearly a coincidence (it *is* very handy to reuse matrix libraries); conceptually they are not the same thing. The ID of an object could be a string; it just happens to be an int. I am certainly not obfuscating my code by replacing a type such as Integer with DbIndex or MyObjId. On the contrary, I am stating to everyone, including the compiler, that this value should be used only where an index is expected, not where a matrix size, a number of chars in a string, or what-have-you is expected. The error I am trying to avoid is using a DB index in place of a matrix index in the middle of a large set of loops.
(of course a nice side effect is that I could replace some IDs with strings or objects or anything, benefiting from encapsulation, but here it's not the objective)

See this as the same as when you call a method that takes ten ints as entry params: the main risk of error is to get confused in the order of parameters, and nothing will warn you except some strange bug a year later. A way to avoid it is to type strongly parameters, so that the caller may call
  new Rect(new Rect.Length(x), new Rect.Width(y))
instead of
  new Rect(x, y)
which is kind of verbose overkill in this particular example of course, but it's just an illustration.
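
For the illustration to compile, Rect needs the two nested wrapper types. A minimal sketch, assuming both simply hold an int; the field names and the area() method are added here only to make the example complete:

```java
// Hypothetical Rect with tiny nested wrapper types, so that swapping the
// two arguments is a compile-time error instead of a silent bug.
final class Rect {
    static final class Length {
        final int value;
        Length(int value) { this.value = value; }
    }

    static final class Width {
        final int value;
        Width(int value) { this.value = value; }
    }

    private final int length;
    private final int width;

    Rect(Length length, Width width) {
        this.length = length.value;
        this.width = width.value;
    }

    int area() { return length * width; }
}
```

With this, new Rect(new Rect.Width(y), new Rect.Length(x)) is rejected by the compiler, whereas new Rect(y, x) on an (int, int) constructor would compile silently.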

Ok, so that's the reason why the typing.

As for the optimisation, I am perfectly aware that one of the most-cited mottos here is "premature optimization is the root of all evil" (with which I wholeheartedly agree, BTW). I am not prematurely optimizing: I am about to optimize some code that I didn't write, and it's certainly not premature because the code *is* slow. For me, this always starts with a bit of local rewriting in order to better understand the code and to guarantee that my changes won't break anything, in particular by letting the compiler help me out. The o-word, which acts as a kind of magnet for your reactions, was not the core subject of my post, which is probably why you feel you don't have any elements about it: I didn't provide any, because that was not my question. I merely stated that performance was potentially an issue, because that is generally one of the parameters to take into account when choosing a particular implementation.

I hope this clarifies.

Many thanks to you, Lew, and also to Roedy and Mark for answering. I'll keep your answers in mind while I progress.

For those who wonder about structures and matrixes etc, I'm starting by replacing types, but I may do another code writing iteration where I replace structures. So that it may go like this...
  original code: Blob<Integer>
  better: Blob<MyIndex>
  even better: MyBlob extends Blob<Integer>, or extends Blob<MyIndex>
  gettin' better: MyBlob with methods using ints, bye bye boxing :-)

JRobinss



#10049

From Andreas Leitgeb <avl@gamma.logic.tuwien.ac.at>
Date 2011-11-18 18:00 +0000
Message-ID <slrnjcd7a8.fvg.avl@gamma.logic.tuwien.ac.at>
In reply to #10046
jrobinss <julien.robinson2@gmail.com> wrote:
> This is a large piece of code, not extraordinary but still a 
> reasonable size, and I didn't write it. I end up manipulating
> indexes all over the place, so that I have for example a table
> (or map or whatever) that associates database indexes to matrix
> indexes, or identifiers to indexes, etc. These are all ints or
> Integers, and take part in rigmaroles of loops, indexes of
> indexes of arrays of arrays and such joyous constructs. In short,
> it's a lot of ints and hard to understand.

My comment will not be much of help for this part of your problem,
but I'll mention it anyway, for the other part of your problem.

There exist third-party libraries (I think from apache) that offer
variants of the usual standard-library collection classes, but for
primitive types. So, just in case it turns out that most of the
slowness comes from boxing and unboxing, a look at those libraries
might help.

I haven't used them, myself, though, so do not really know, if 
they are indeed faster...

just fwiw.
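
To show what such a collection buys, here is a hand-rolled sketch of an int-to-int hash map. It is not the API of any real library; the class name IntIntMap and its methods are invented for this example. Keys and values live in int[] arrays, so put() and get() never allocate an Integer:

```java
// Illustration only: open addressing with linear probing over int arrays.
final class IntIntMap {
    private int[] keys;
    private int[] values;
    private boolean[] used;
    private int size;

    IntIntMap(int expectedSize) {
        int capacity = 16;
        // Power-of-two capacity, kept at most half full.
        while (capacity < expectedSize * 2) capacity <<= 1;
        keys = new int[capacity];
        values = new int[capacity];
        used = new boolean[capacity];
    }

    // Finds the slot holding key, or the free slot where it would go.
    private int slotFor(int key) {
        int mask = keys.length - 1;
        int h = key * 0x9E3779B9;            // scramble the bits
        int i = (h ^ (h >>> 16)) & mask;
        while (used[i] && keys[i] != key) i = (i + 1) & mask;
        return i;
    }

    void put(int key, int value) {
        if ((size + 1) * 2 > keys.length) grow();
        int i = slotFor(key);
        if (!used[i]) { used[i] = true; keys[i] = key; size++; }
        values[i] = value;
    }

    // Returns missingValue when the key is absent.
    int get(int key, int missingValue) {
        int i = slotFor(key);
        return used[i] ? values[i] : missingValue;
    }

    int size() { return size; }

    // Doubles the table and reinserts every entry.
    private void grow() {
        int[] oldKeys = keys;
        int[] oldValues = values;
        boolean[] oldUsed = used;
        keys = new int[oldKeys.length * 2];
        values = new int[oldKeys.length * 2];
        used = new boolean[oldKeys.length * 2];
        size = 0;
        for (int i = 0; i < oldKeys.length; i++) {
            if (oldUsed[i]) put(oldKeys[i], oldValues[i]);
        }
    }
}
```

Whether this beats HashMap<Integer, Integer> in a given program is something only a profiler can say; the sketch has no removal and no iteration, which a real library would provide.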



#10052

From markspace <-@.>
Date 2011-11-18 10:17 -0800
Message-ID <ja67ff$nn8$1@dont-email.me>
In reply to #10046
On 11/18/2011 8:26 AM, jrobinss wrote:

> See this as the same as when you call a method that takes ten ints as
> entry params: the main risk of error is to get confused in the order
> of parameters,


This is actually a bit of an anti-pattern too.  It's hard for anyone to 
remember the order of more than about 4 parameters, according to 
Effective Java.


> and nothing will warn you except some strange bug a
> year later. A way to avoid it is to type strongly parameters, so that
> the caller may call new Rect(new Rect.Length(x), new Rect.Width(y))
> instead of new Rect(x, y) which is kind of verbose overkill in this
> particular example of course, but it's just an illustration.
>


This is also verbose, but one recommended pattern here is to use the 
builder pattern

Rect r = new RectBuilder().length( 10 ).width( 12 ).make();

It gains value as you have more and more parameters to remember, and 
also obviates the problem with remembering their order, because the 
builder will accept them in any order.  I personally would not use it 
for Rect here as it only has two parameters.  For a method or ctor that 
has 10+ parameters, I would consider it.
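
A minimal sketch of classes that would make the one-liner above compile. Only the names RectBuilder, length, width and make come from the post; the fields and the plain Rect class are assumptions:

```java
// Hypothetical Rect and RectBuilder matching the call in the post.
final class Rect {
    final int length;
    final int width;

    Rect(int length, int width) {
        this.length = length;
        this.width = width;
    }
}

final class RectBuilder {
    private int length;
    private int width;

    // Each setter returns this, so the calls chain in any order.
    RectBuilder length(int length) { this.length = length; return this; }
    RectBuilder width(int width) { this.width = width; return this; }

    Rect make() { return new Rect(length, width); }
}
```

Note that a setter the caller forgets leaves its field at 0 in this sketch; a fuller builder would validate in make().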

Other standard patterns:

1. Use an IDE.  A good IDE will show the names of the parameters when 
you type them in, so you don't have to remember their order.  This is a 
good form of reflection that costs you nothing at runtime, and doesn't 
add any lines of code either.

2. Take a cue from the IDE, get a lexical parser for Java, and build 
your own custom source code formatter.  Break up long lists of 
parameters automatically and comment them to include their names.  This 
again is a big win that has no runtime costs, but will keep your current 
and future code base formatted according to a standard.  This is a huge 
win, imo.

Try to think outside of the "code" box.  There's more to developing 
software than things that run inside your code.  Study the Unix 
operating system and try to learn from its "tool building" examples.





#10061

From Roedy Green <see_website@mindprod.com.invalid>
Date 2011-11-18 20:30 -0800
Message-ID <8sbec7l21u7pv5n633pkle9dav1nnao2un@4ax.com>
In reply to #10046
I don't know what you are trying to do, but in general you can sometimes
replace a map by sorting two sets then processing them batch-style
sequentially much like a  tape merge.
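
As a sketch of the tape-merge idea, assuming the task is to find the values two int collections have in common (the class and method names are invented for this example): sort both, then walk them in step with two cursors, with no map at all.

```java
import java.util.Arrays;

final class SortedMerge {
    // Returns the values present in both arrays, ascending, each once.
    // The inputs are copied before sorting, so the caller's arrays are
    // left untouched.
    static int[] common(int[] first, int[] second) {
        int[] a = first.clone();
        int[] b = second.clone();
        Arrays.sort(a);
        Arrays.sort(b);
        int[] out = new int[Math.min(a.length, b.length)];
        int n = 0, i = 0, j = 0;
        while (i < a.length && j < b.length) {
            if (a[i] < b[j]) {
                i++;                       // advance the side that is behind
            } else if (a[i] > b[j]) {
                j++;
            } else {
                int v = a[i];
                out[n++] = v;
                // Skip duplicates of the matched value on both sides.
                while (i < a.length && a[i] == v) i++;
                while (j < b.length && b[j] == v) j++;
            }
        }
        return Arrays.copyOf(out, n);
    }
}
```

The cost is two sorts plus one linear pass, and nothing is boxed along the way.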

See http://mindprod.com/jgloss/products1.html#SORTED
for classes that sort and process pairs of sets sequentially.

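Giving the HashMap more RAM (a larger initial capacity, or a lower load factor) will usually improve its performance, since there are fewer collisions and resizes.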
see http://mindprod.com/jgloss/hashmap.html
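
A small sketch of pre-sizing, assuming the number of entries is known roughly in advance; the helper names are invented for this example, while the two-argument HashMap constructor and the 0.75 default load factor are from the JDK documentation:

```java
import java.util.HashMap;
import java.util.Map;

final class PresizedMaps {
    // Smallest initial capacity that holds expectedEntries without a
    // resize at the given load factor.
    static int capacityFor(int expectedEntries, float loadFactor) {
        return (int) Math.ceil(expectedEntries / (double) loadFactor);
    }

    static Map<Integer, Integer> forExpectedEntries(int expectedEntries) {
        float loadFactor = 0.75f; // HashMap's documented default
        return new HashMap<Integer, Integer>(
                capacityFor(expectedEntries, loadFactor), loadFactor);
    }
}
```

This trades memory for fewer rehashes while the map fills up; it does nothing about the cost of boxing the keys and values.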

HashMaps are faster than Hashtables.
 
HashMaps are faster than TreeMaps.

Of course you want to prove that the HashMap lookup truly is the
bottleneck or anything you do may end up just slowing things down.

http://mindprod.com/jgloss/profiler.html

-- 
Roedy Green Canadian Mind Products
http://mindprod.com
I can't come to bed just yet. Somebody is wrong on the Internet. 



#10060

From Roedy Green <see_website@mindprod.com.invalid>
Date 2011-11-18 20:23 -0800
Message-ID <8pbec7dqfddrdsfqd84bavh9d4odmpfa73@4ax.com>
In reply to #10045
On Fri, 18 Nov 2011 07:34:09 -0800 (PST), Lew <lewbloch@gmail.com>
wrote, quoted or indirectly quoted someone who said :

>Congratulations, you just re-invented 'Integer', but without any of its features.

in particular autoboxing/unboxing.

see http://mindprod.com/jgloss/autoboxing.html
-- 
Roedy Green Canadian Mind Products
http://mindprod.com
I can't come to bed just yet. Somebody is wrong on the Internet. 



#10055

From Jim Janney <jjanney@shell.xmission.com>
Date 2011-11-18 13:14 -0700
Message-ID <2p4ny19oj8.fsf@shell.xmission.com>
In reply to #10041
jrobinss <julien.robinson2@gmail.com> writes:

> Hi all,
>
> this is a simple question, so it may be silly...
>
> I am processing structures that contain integers, structures such as matrixes used in statistical analysis. As I implement these as Maps of indexes, I use the class Integer, as in
>
>   // Map<matrix index, DB index> <- I'd like to remove this comment!
>   public static Map<Integer, Integer> myMap = ...;
>
> Now what I'd like is to write
>   public static Map<MatrixIndex, DbIndex> myMap = ...;
>
> The advantage is that the code is auto-documented, but even better that the compiler will check that I never get mixed up in different indexes, by relying on strong typing.
>
> Usually, I would do this by extending the relevant class. But here it's Integer, which is final.
>
> Questions: 
> 1. is this a bad idea, and should I proceed with Integer?
> 2. if not, how would you implement this?
> 3. is there any performance issue in defining my own classes instead of Integer?
>
> My current solution is to define my own classes for replacing Integer, such as:
>
> public final class MatrixIndex {
>   public final int value;
>   public MatrixIndex(int i) {value = i;}
> }
>
> I won't benefit from autoboxing, then... :-(
> (I'm just hoping it doesn't break too much of the code, because even though it's mine to break, I don't have infinite time)
>
> Note that (exceptionally for me) performance *is* an issue here. I haven't yet narrowed it down, but the code executes very slowly and eats up much too much memory at the moment. I'm starting to optimize it, that's why I'm starting with strongly typing it to prevent errors.
>
> thanks for any tips!
> JRobinss

Switch to Ada?  If that's not an option, you should probably just stick
with Integer.  I think I understand what you're trying to do, but Java
just isn't good at expressing those kinds of constraints.

As far as performance is concerned, I think there are some open-source
projects that implement map-like structures with primitive keys, but I
don't have any experience using them.

-- 
Jim Janney



#10057

From Patricia Shanahan <pats@acm.org>
Date 2011-11-18 12:43 -0800
Message-ID <cZKdnWXfz5bmXlvTnZ2dnUVZ_gydnZ2d@earthlink.com>
In reply to #10041
jrobinss wrote:
...
> I won't benefit from autoboxing, then... :-(

If performance is an issue, autoboxing is a hindrance, not a benefit. It
means a non-trivial operation happening without being immediately
obvious from the code. You may need to change algorithms and/or data
structures to reduce the frequency of creating new objects.
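
A sketch of how the hidden work arises, using an invented tallying task as the example. The first method reads like int arithmetic but boxes and unboxes on every iteration; the second keeps everything primitive, under the assumption that the samples lie in a known range:

```java
import java.util.HashMap;
import java.util.Map;

final class BoxingDemo {
    // Each iteration boxes the key, unboxes the old count, adds one,
    // and boxes the result again via Integer.valueOf.
    static Map<Integer, Integer> countBoxed(int[] samples) {
        Map<Integer, Integer> counts = new HashMap<Integer, Integer>();
        for (int s : samples) {
            Integer old = counts.get(s);               // boxes s
            counts.put(s, old == null ? 1 : old + 1);  // unboxes old, boxes the sum
        }
        return counts;
    }

    // Same tally in a primitive array; assumes samples lie in [0, range).
    static int[] countPrimitive(int[] samples, int range) {
        int[] counts = new int[range];
        for (int s : samples) counts[s]++;
        return counts;
    }
}
```

Integer.valueOf is required to cache values from -128 to 127, so small numbers do not allocate; larger indexes generally do.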

Patricia

> (I'm just hoping it doesn't break too much of the code, because even though it's mine to break, I don't have infinite time)
> 
> Note that (exceptionally for me) performance *is* an issue here. I haven't yet narrowed it down, but the code executes very slowly and eats up much too much memory at the moment. I'm starting to optimize it, that's why I'm starting with strongly typing it to prevent errors.
...



#10141

From jrobinss <julien.robinson2@gmail.com>
Date 2011-11-21 01:50 -0800
Message-ID <2301acca-a979-4c8d-972c-a4d3c729f08e@a16g2000yqk.googlegroups.com>
In reply to #10041
Thanks all for your answers. I didn't reply individually, but be sure
I read them with interest.

Back to code (or out of its box...)
JRobinss



#10163

From Robert Klemme <shortcutter@googlemail.com>
Date 2011-11-21 20:58 +0100
Message-ID <9ivorbFcn7U1@mid.individual.net>
In reply to #10041
On 18.11.2011 15:50, jrobinss wrote:

> public final class MatrixIndex {
>    public final int value;
>    public MatrixIndex(int i) {value = i;}
> }

I'd rather use Integer but if you want a specific type you could 
implement a class which inherits Number and works exactly the same way 
Integer does and has a generic type parameter for the container type. 
While that type parameter would be otherwise useless it would help in 
catching type errors.
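
One possible reading of this suggestion, as a sketch: the names TypedInt, Matrix and Db are invented here, and using empty marker classes as the type argument is an assumption about what "container type" means.

```java
// Hypothetical TypedInt: behaves like a small Integer, plus a type
// parameter that exists only so the compiler keeps index kinds apart.
final class TypedInt<T> extends Number {
    private static final long serialVersionUID = 1L;
    private final int value;

    TypedInt(int value) { this.value = value; }

    @Override public int intValue() { return value; }
    @Override public long longValue() { return value; }
    @Override public float floatValue() { return value; }
    @Override public double doubleValue() { return value; }

    // Needed for use as a HashMap key. T is erased at run time, so two
    // TypedInts with equal values are equal whatever their T.
    @Override public boolean equals(Object o) {
        return o instanceof TypedInt && ((TypedInt<?>) o).value == value;
    }
    @Override public int hashCode() { return value; }
    @Override public String toString() { return Integer.toString(value); }
}

// Marker types that are never instantiated.
final class Matrix {}
final class Db {}
```

The map from the original post would then be declared as Map<TypedInt<Matrix>, TypedInt<Db>>, and passing a TypedInt<Db> where a TypedInt<Matrix> is expected fails to compile. There is still no autoboxing for a user-defined type.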

Kind regards

	robert

-- 
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/



#10166

From David Lamb <dalamb@cs.queensu.ca>
Date 2011-11-21 15:38 -0500
Message-ID <EFyyq.45957$Ra6.20000@newsfe07.iad>
In reply to #10041
On 18/11/2011 9:50 AM, jrobinss wrote:
> Now what I'd like is to write
>    public static Map<MatrixIndex, DbIndex>  myMap = ...;
>
> The advantage is that the code is auto-documented, but even better that the compiler will check that I never get mixed up in different indexes, by relying on strong typing.
>
> Usually, I would do this by extending the relevant class. But here it's Integer, which is final.
>
> Questions:
> 1. is this a bad idea, and should I proceed with Integer?
> 2. if not, how would you implement this?
> 3. is there any performance issue in defining my own classes instead of Integer?
>
> My current solution is to define my own classes for replacing Integer, such as:
>
> public final class MatrixIndex {
>    public final int value;
>    public MatrixIndex(int i) {value = i;}
> }

To me the critical question is: in what way do each of these different 
kinds of "integer" differ from each other and from the language-defined 
"Integer"?  Do they differ in upper and lower bounds, for example? Is it 
important to keep track of where each number came from? Is one of them a 
mere identifier (such as a lot number in a city plan) rather than 
something you could add and subtract from each other? All of those 
suggest that you need a new type(s) that "has an int" instead of "is an 
Integer".
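
A sketch of two such "has an int" types, under assumptions that are mine and not from the thread: both kinds of index are non-negative, a matrix index can be stepped, and a DB index is a pure identifier with no arithmetic. Both define equals and hashCode, which any class used as a HashMap key needs.

```java
// A position: it has a lower bound and can be advanced.
final class MatrixIndex {
    private final int value;

    MatrixIndex(int value) {
        if (value < 0) throw new IllegalArgumentException("negative matrix index: " + value);
        this.value = value;
    }

    int value() { return value; }

    MatrixIndex next() { return new MatrixIndex(value + 1); }

    @Override public boolean equals(Object o) {
        return o instanceof MatrixIndex && ((MatrixIndex) o).value == value;
    }
    @Override public int hashCode() { return value; }
}

// A mere identifier: it can be compared and hashed, but not added to.
final class DbIndex {
    private final int value;

    DbIndex(int value) {
        if (value < 0) throw new IllegalArgumentException("negative DB index: " + value);
        this.value = value;
    }

    int value() { return value; }

    @Override public boolean equals(Object o) {
        return o instanceof DbIndex && ((DbIndex) o).value == value;
    }
    @Override public int hashCode() { return value; }
}
```

A Map<MatrixIndex, DbIndex> built on these rejects a DbIndex used as a key at compile time. Every key and value is still an object on the heap, much as with Integer but without its small-value cache.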




