Groups > comp.lang.java.programmer > #10041 > unrolled thread
| Started by | jrobinss <julien.robinson2@gmail.com> |
|---|---|
| First post | 2011-11-18 06:50 -0800 |
| Last post | 2011-11-21 15:38 -0500 |
| Articles | 14 — 9 participants |
alias for Integer jrobinss <julien.robinson2@gmail.com> - 2011-11-18 06:50 -0800
Re: alias for Integer markspace <-@.> - 2011-11-18 07:11 -0800
Re: alias for Integer Roedy Green <see_website@mindprod.com.invalid> - 2011-11-18 07:32 -0800
Re: alias for Integer Lew <lewbloch@gmail.com> - 2011-11-18 07:34 -0800
Re: alias for Integer jrobinss <julien.robinson2@gmail.com> - 2011-11-18 08:26 -0800
Re: alias for Integer Andreas Leitgeb <avl@gamma.logic.tuwien.ac.at> - 2011-11-18 18:00 +0000
Re: alias for Integer markspace <-@.> - 2011-11-18 10:17 -0800
Re: alias for Integer Roedy Green <see_website@mindprod.com.invalid> - 2011-11-18 20:30 -0800
Re: alias for Integer Roedy Green <see_website@mindprod.com.invalid> - 2011-11-18 20:23 -0800
Re: alias for Integer Jim Janney <jjanney@shell.xmission.com> - 2011-11-18 13:14 -0700
Re: alias for Integer Patricia Shanahan <pats@acm.org> - 2011-11-18 12:43 -0800
Re: alias for Integer jrobinss <julien.robinson2@gmail.com> - 2011-11-21 01:50 -0800
Re: alias for Integer Robert Klemme <shortcutter@googlemail.com> - 2011-11-21 20:58 +0100
Re: alias for Integer David Lamb <dalamb@cs.queensu.ca> - 2011-11-21 15:38 -0500
| From | jrobinss <julien.robinson2@gmail.com> |
|---|---|
| Date | 2011-11-18 06:50 -0800 |
| Subject | alias for Integer |
| Message-ID | <21374513.220.1321627842805.JavaMail.geo-discussion-forums@yqmj32> |
Hi all,
this is a simple question, so it may be silly...
I am processing structures that contain integers, structures such as matrixes used in statistical analysis. As I implement these as Maps of indexes, I use the class Integer, as in
// Map<matrix index, DB index> <- I'd like to remove this comment!
public static Map<Integer, Integer> myMap = ...;
Now what I'd like is to write
public static Map<MatrixIndex, DbIndex> myMap = ...;
The advantage is that the code is auto-documented, but even better that the compiler will check that I never get mixed up in different indexes, by relying on strong typing.
Usually, I would do this by extending the relevant class. But here it's Integer, which is final.
Questions:
1. is this a bad good idea, and I should proceed with Integer?
2. if not, how would you implement this?
3. is there any performance issue in defining my own classes instead of Integer?
My current solution is to define my own classes for replacing Integer, such as:
public final class MatrixIndex {
public final int value;
public MatrixIndex(int i) {value = i;}
}
I won't benefit from autoboxing, then... :-(
(I'm just hoping it doesn't break too much of the code, because even though it's mine to break, I don't have infinite time)
Note that (exceptionally for me) performance *is* an issue here. I haven't yet narrowed it down, but the code executes very slowly and eats up much too much memory at the moment. I'm starting to optimize it, that's why I'm starting with strongly typing it to prevent errors.
thanks for any tips!
JRobinss
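A minimal sketch of the wrapper approach described in this post, for readers who want to try it. The class names follow the post; the equals, hashCode and toString bodies and the TypedIndexDemo class are assumptions added for illustration. Without value-based equals and hashCode, a lookup with a freshly created key would not find an existing map entry.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical wrapper type for matrix indexes (name taken from the post).
final class MatrixIndex {
    final int value;

    MatrixIndex(int value) { this.value = value; }

    // Value-based equality so that two MatrixIndex objects holding the
    // same int behave as the same map key.
    @Override public boolean equals(Object o) {
        return o instanceof MatrixIndex && ((MatrixIndex) o).value == value;
    }

    @Override public int hashCode() { return value; }

    @Override public String toString() { return "MatrixIndex(" + value + ")"; }
}

// Hypothetical wrapper type for database indexes (name taken from the post).
final class DbIndex {
    final int value;

    DbIndex(int value) { this.value = value; }

    @Override public boolean equals(Object o) {
        return o instanceof DbIndex && ((DbIndex) o).value == value;
    }

    @Override public int hashCode() { return value; }

    @Override public String toString() { return "DbIndex(" + value + ")"; }
}

// Illustration only: shows the typed map in use.
class TypedIndexDemo {
    static Map<MatrixIndex, DbIndex> buildMap() {
        Map<MatrixIndex, DbIndex> map = new HashMap<MatrixIndex, DbIndex>();
        map.put(new MatrixIndex(0), new DbIndex(4711));
        // map.put(new DbIndex(4711), new MatrixIndex(0)); // would not compile
        return map;
    }

    public static void main(String[] args) {
        System.out.println(buildMap().get(new MatrixIndex(0)));
    }
}
```

The compiler now rejects a DbIndex where a MatrixIndex is expected, at the cost of one extra object per key and per value.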
| From | markspace <-@.> |
|---|---|
| Date | 2011-11-18 07:11 -0800 |
| Message-ID | <ja5sjp$d8p$1@dont-email.me> |
| In reply to | #10041 |
On 11/18/2011 6:50 AM, jrobinss wrote:
> 3. is there any performance issue in defining my own classes instead of Integer?
To me yes, there is a performance issue, and you should definitely consider using something besides Integer and Map for this.
> Note that (exceptionally for me) performance *is* an issue here. I haven't yet narrowed it down, but the code executes very slowly and eats up much too much memory at the moment. I'm starting to optimize it, that's why I'm starting with strongly typing it to prevent errors.
Good, you're doing this the right way. You should look into profiling your code. NetBeans has an excellent profiler built in, and will handle most "user made" projects as well as its own format.
The most important thing is to pinpoint where the code is slow and work on those bits. I suspect that auto-boxing and unboxing is costing you too much time, but I couldn't prove it. The profiler could.
| From | Roedy Green <see_website@mindprod.com.invalid> |
|---|---|
| Date | 2011-11-18 07:32 -0800 |
| Message-ID | <3hucc79gq3disp9hlbrqb7ks7er4k7r4lc@4ax.com> |
| In reply to | #10041 |
On Fri, 18 Nov 2011 06:50:42 -0800 (PST), jrobinss <julien.robinson2@gmail.com> wrote, quoted or indirectly quoted someone who said :
>Now what I'd like is to write
> public static Map<MatrixIndex, DbIndex> myMap = ...;
One way to do it would be to write
public static MatrixMap myMap = new MatrixMap ( 2000 );
--
Roedy Green Canadian Mind Products http://mindprod.com
I can't come to bed just yet. Somebody is wrong on the Internet.
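The post does not say what MatrixMap would look like. One possible reading, sketched here under loud assumptions: matrix indexes are dense in the range 0 to size-1, DB indexes are never negative, and the method names put, contains and dbIndexFor are invented for illustration. Under those assumptions a plain int[] can replace the Map and avoid boxing entirely.

```java
import java.util.Arrays;
import java.util.NoSuchElementException;

// Hypothetical MatrixMap: maps a matrix index to a DB index with a plain int[].
final class MatrixMap {
    // Assumption: DB indexes are never negative, so -1 can mark "no entry".
    private static final int UNSET = -1;

    private final int[] dbIndexByMatrixIndex;

    // size is the number of matrix indexes, assumed to run from 0 to size - 1.
    MatrixMap(int size) {
        dbIndexByMatrixIndex = new int[size];
        Arrays.fill(dbIndexByMatrixIndex, UNSET);
    }

    void put(int matrixIndex, int dbIndex) {
        if (dbIndex < 0) {
            throw new IllegalArgumentException("negative DB index: " + dbIndex);
        }
        dbIndexByMatrixIndex[matrixIndex] = dbIndex;
    }

    boolean contains(int matrixIndex) {
        return dbIndexByMatrixIndex[matrixIndex] != UNSET;
    }

    int dbIndexFor(int matrixIndex) {
        int dbIndex = dbIndexByMatrixIndex[matrixIndex];
        if (dbIndex == UNSET) {
            throw new NoSuchElementException("no DB index for " + matrixIndex);
        }
        return dbIndex;
    }
}
```

The named methods document which int is which, although the compiler still cannot catch swapped arguments to put.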
| From | Lew <lewbloch@gmail.com> |
|---|---|
| Date | 2011-11-18 07:34 -0800 |
| Message-ID | <12712433.869.1321630449979.JavaMail.geo-discussion-forums@prmf13> |
| In reply to | #10041 |
jrobinss wrote:
> I am processing structures that contain integers, structures such as matrixes used in statistical analysis. As I implement these as Maps of indexes, I use the class Integer, as in
>
> // Map<matrix index, DB index> <- I'd like to remove this comment!
> public static Map<Integer, Integer> myMap = ...;
>
> Now what I'd like is to write
> public static Map<MatrixIndex, DbIndex> myMap = ...;
>
> The advantage is that the code is auto-documented, but even better that the compiler will check that I never get mixed up in different indexes, by relying on strong typing.
>
> Usually, I would do this by extending the relevant class. But here it's Integer, which is final.
That is kind of an antipattern. Joshua Bloch advises "Favor composition over inheritance" in /Effective Java/ (2nd ed.). You could write a wrapper for 'Integer', but then 'Integer' already *is* a wrapper type.
> Questions:
> 1. is this a bad good idea, and I should proceed with Integer?
It's not a terrible idea, necessarily, but it seems unnecessary.
> 2. if not, how would you implement this?
I would pick a type that actually helps. I don't see what you object to in 'Integer', quite frankly.
> 3. is there any performance issue in defining my own classes instead of Integer?
How could anyone possibly know the answer to this question?
> My current solution is to define my own classes for replacing Integer, such as:
>
> public final class MatrixIndex {
> public final int value;
> public MatrixIndex(int i) {value = i;}
> }
Congratulations, you just re-invented 'Integer', but without any of its features.
Now you have to go through all sorts of gyrations to translate your custom class to and from 'int'. Your code complexity goes up, and you have to maintain your own custom substitute for a fundamental API type. Thus your risk of bugs and code-maintenance costs skyrocket.
How again does this help you?
> I won't benefit from autoboxing, then... :-(
No one does.
> (I'm just hoping it doesn't break too much of the code, because even though it's mine to break, I don't have infinite time)
Why waste *any* time on this? Just use 'Integer'.
> Note that (exceptionally for me) performance *is* an issue here. I haven't yet narrowed it down, but the code executes very slowly and eats up much too much memory at the moment. I'm starting to optimize it, that's why I'm starting with strongly typing it to prevent errors.
"Optimize" and "prevent errors" are, at best, orthogonal, and at worst (and typically in this kind of premature action) the former interferes with the latter.
Yet you say them in the same breath as though they were the same thing.
They aren't.
Your "strong typing" isn't, really. 'Integer' already is a strong type.
As for what you choose to "optimize", what evidence (that is, factual data derived from actual tests whose protocols are publicized) do you have that you are attacking the slow parts?
IOW, what /actual/ tests have you performed, and how did you control the variables like system load, HotSpot warmup and application load?
You have a performance problem. So you decide to obfuscate random parts of your code base with zero foundation for your actions. Now you have two problems.
--
Lew
| From | jrobinss <julien.robinson2@gmail.com> |
|---|---|
| Date | 2011-11-18 08:26 -0800 |
| Message-ID | <19404286.1644.1321633587044.JavaMail.geo-discussion-forums@yqhd1> |
| In reply to | #10045 |
Aaah, it had been some time, but I'm not disappointed. :-)
I'm sorry Lew, but I feel that I haven't been quite clear.
This is a large piece of code, not extraordinary but still a reasonable size, and I didn't write it. I end up manipulating indexes all over the place, so that I have for example a table (or map or whatever) that associates database indexes to matrix indexes, or identifiers to indexes, etc. These are all ints or Integers, and take part in rigmaroles of loops, indexes of indexes of arrays of arrays and such joyous constructs. In short, it's a lot of ints and hard to understand.
That these indexes and identifiers are all integers is in fact nearly a coincidence (it *is* very handy to reuse matrix libraries), conceptually they are not the same thing; the ID of an object could be a string, it just happens to be an int.
I am certainly not obfuscating my code by replacing a type such as Integer with DbIndex or MyObjId, on the contrary I am stating to everyone, including the compiler, that this should be congruent with indexes, not with matrix sizes or the number of chars in a string or what-have-we-nots. The error I am trying to avoid is to use a DB index in place of a matrix index in the middle of a large set of loops.
(of course a nice side effect is that I could replace some IDs with strings or objects or anything, benefiting from encapsulation, but here it's not the objective)
See this as the same as when you call a method that takes ten ints as entry params: the main risk of error is to get confused in the order of parameters, and nothing will warn you except some strange bug a year later. A way to avoid it is to strongly type the parameters, so that the caller may call
new Rect(new Rect.Length(x), new Rect.Width(y))
instead of
new Rect(x, y)
which is kind of verbose overkill in this particular example of course, but it's just an illustration.
Ok, so that's the reason why the typing.
As for the optimisation, I am perfectly aware that one of the most-cited mottos here is "premature optimization is the root of all evil" (with which I wholeheartedly agree, BTW). I am not prematurely optimizing: I am about to optimize some code that I didn't write, and it's certainly not premature because the code *is* slow. For me, this always starts with a bit of local rewriting in order to better understand the code and to guarantee that my changes won't break anything, in particular by letting the compiler help me out.
The o-word, which acts as a kind of magnet for your reactions, was not the core subject of my post, which is probably why you feel you don't have any elements about it: I didn't provide any, because that was not my question. I merely stated that performance was potentially an issue, because that is generally one of the parameters to take into account when choosing a particular implementation.
I hope this clarifies.
Many thanks to you, Lew, and also to Roedy and Mark for answering. I'll keep your answers in mind while I progress.
For those who wonder about structures and matrixes etc, I'm starting by replacing types, but I may do another code writing iteration where I replace structures. So that it may go like this...
original code: Blob<Integer>
better: Blob<MyIndex>
even better: MyBlob extends Blob<Integer>, or extends Blob<MyIndex>
gettin' better: MyBlob with methods using ints, bye bye boxing :-)
JRobinss
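The Rect illustration from this post can be sketched as follows. The nested class names Rect.Length and Rect.Width come from the post; the fields, the area method and the RectDemo class are assumptions added so the example compiles and runs.

```java
// Sketch of a constructor whose parameters are distinguished by type.
final class Rect {
    // Tiny wrapper types: each one "has an int" but they are not interchangeable.
    static final class Length {
        final int value;
        Length(int value) { this.value = value; }
    }

    static final class Width {
        final int value;
        Width(int value) { this.value = value; }
    }

    final int length;
    final int width;

    Rect(Length length, Width width) {
        this.length = length.value;
        this.width = width.value;
    }

    int area() { return length * width; }
}

// Illustration only.
class RectDemo {
    public static void main(String[] args) {
        Rect r = new Rect(new Rect.Length(10), new Rect.Width(12));
        // new Rect(new Rect.Width(12), new Rect.Length(10)); // would not compile
        System.out.println(r.area());
    }
}
```

Swapping the two arguments becomes a compile-time error instead of a silent bug, which is the property the post is after.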
| From | Andreas Leitgeb <avl@gamma.logic.tuwien.ac.at> |
|---|---|
| Date | 2011-11-18 18:00 +0000 |
| Message-ID | <slrnjcd7a8.fvg.avl@gamma.logic.tuwien.ac.at> |
| In reply to | #10046 |
jrobinss <julien.robinson2@gmail.com> wrote:
> This is a large piece of code, not extraordinary but still a reasonable size, and I didn't write it. I end up manipulating indexes all over the place, so that I have for example a table (or map or whatever) that associates database indexes to matrix indexes, or identifiers to indexes, etc. These are all ints or Integers, and take part in ringamaroles of loops, indexes of indexes of arrays of arrays and such joyous constructs. In short, it's a lot of ints and hard to understand.
My comment will not be of much help for this part of your problem, but I'll mention it anyway, for the other part of your problem.
There exist third-party libraries (I think from apache) that offer variants of the usual JSL collection-classes - but for primitive types. So, just in case it turns out that most of the slowness comes from boxing and unboxing, then having a look at those libraries might help.
I haven't used them myself, though, so I do not really know if they are indeed faster... just fwiw.
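To make the idea of a primitive-keyed collection concrete, here is a hand-rolled sketch of an int-to-int hash map using open addressing with linear probing. It is not the API of any real library; the class name IntIntMap and its methods are invented for illustration, and removal is left out to keep it short.

```java
// Hand-rolled illustration of a map from int to int that never boxes.
final class IntIntMap {
    private int[] keys;
    private int[] values;
    private boolean[] used;   // used[i] is true when slot i holds an entry
    private int size;

    IntIntMap(int expectedSize) {
        int capacity = 16;
        // Capacity is a power of two, kept at least twice the expected size.
        while (capacity < expectedSize * 2) capacity <<= 1;
        keys = new int[capacity];
        values = new int[capacity];
        used = new boolean[capacity];
    }

    // Finds the slot holding key, or the empty slot where it would go.
    private int slotFor(int key) {
        int mask = keys.length - 1;
        int h = key * 0x9E3779B9;       // multiplicative scrambling of the key
        h ^= h >>> 16;
        int slot = h & mask;
        while (used[slot] && keys[slot] != key) {
            slot = (slot + 1) & mask;   // linear probing
        }
        return slot;
    }

    void put(int key, int value) {
        if ((size + 1) * 2 > keys.length) grow();  // keep the table at most half full
        int slot = slotFor(key);
        if (!used[slot]) {
            used[slot] = true;
            keys[slot] = key;
            size++;
        }
        values[slot] = value;
    }

    int getOrDefault(int key, int defaultValue) {
        int slot = slotFor(key);
        return used[slot] ? values[slot] : defaultValue;
    }

    int size() { return size; }

    // Doubles the table and reinserts every entry.
    private void grow() {
        int[] oldKeys = keys;
        int[] oldValues = values;
        boolean[] oldUsed = used;
        keys = new int[oldKeys.length * 2];
        values = new int[oldKeys.length * 2];
        used = new boolean[oldKeys.length * 2];
        size = 0;
        for (int i = 0; i < oldKeys.length; i++) {
            if (oldUsed[i]) put(oldKeys[i], oldValues[i]);
        }
    }
}
```

Whether this beats HashMap<Integer, Integer> for a given workload is something only a profiler can tell, as others in the thread point out.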
| From | markspace <-@.> |
|---|---|
| Date | 2011-11-18 10:17 -0800 |
| Message-ID | <ja67ff$nn8$1@dont-email.me> |
| In reply to | #10046 |
On 11/18/2011 8:26 AM, jrobinss wrote:
> See this as the same as when you call a method that takes ten ints as entry params: the main risk of error is to get confused in the order of parameters,
This is actually a bit of an anti-pattern too. It's hard for anyone to remember the order of more than about 4 parameters, according to Effective Java.
> and nothing will warn you except some strange bug a year later. A way to avoid it is to type strongly parameters, so that the caller may call new Rect(new Rect.Length(x), new Rect.Width(y)) instead of new Rect(x, y) which is kind of verbose overkill in this particular example of course, but it's just an illustration.
This is also verbose, but one recommended pattern here is to use the builder pattern
Rect r = new RectBuilder().length( 10 ).width( 12 ).make();
It gains value as you have more and more parameters to remember, and also obviates the problem with remembering their order, because the builder will accept them in any order. I personally would not use it for Rect here as it only has two parameters. For a method or ctor that has 10+ parameters, I would consider it.
Other standard patterns:
1. Use an IDE. A good IDE will show the names of the parameters when you type them in, so you don't have to remember their order. This is a good form of reflection that costs you nothing at runtime, and doesn't add any lines of code either.
2. Take a cue from the IDE, get a lexical parser for Java, and build your own custom source code formatter. Break up long lists of parameters automatically and comment them to include their names. This again is a big win that has no runtime costs, but will keep your current and future code base formatted according to a standard. This is a huge win, imo.
Try to think outside of the "code" box. There's more to developing software than things that run inside your code. Study the Unix operating system and try to learn from its "tool building" examples.
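A minimal sketch of what could sit behind the builder call in this post. The names RectBuilder, length, width and make come from the post; the Rect fields and everything else are assumptions for illustration.

```java
// Simple value class built by the builder below.
final class Rect {
    final int length;
    final int width;

    Rect(int length, int width) {
        this.length = length;
        this.width = width;
    }
}

// Each setter names its parameter at the call site and returns the builder,
// so calls can be chained in any order.
final class RectBuilder {
    private int length;
    private int width;

    RectBuilder length(int length) {
        this.length = length;
        return this;
    }

    RectBuilder width(int width) {
        this.width = width;
        return this;
    }

    // Creates the Rect from whatever has been set so far (unset fields stay 0).
    Rect make() {
        return new Rect(length, width);
    }
}
```

Calling new RectBuilder().width( 12 ).length( 10 ).make() gives the same Rect as the order used in the post, which is the point of the pattern.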
| From | Roedy Green <see_website@mindprod.com.invalid> |
|---|---|
| Date | 2011-11-18 20:30 -0800 |
| Message-ID | <8sbec7l21u7pv5n633pkle9dav1nnao2un@4ax.com> |
| In reply to | #10046 |
I don't know what you are trying to do, but in general you can sometimes replace a map by sorting two sets then processing them batch-style sequentially much like a tape merge. See http://mindprod.com/jgloss/products1.html#SORTED for classes that sort and process pairs of sets sequentially.
Giving the HashMap more RAM will improve its performance. See http://mindprod.com/jgloss/hashmap.html
HashMaps are faster than Hashtables. HashMaps are faster than TreeMaps.
Of course you want to prove that the HashMap lookup truly is the bottleneck or anything you do may end up just slowing things down. http://mindprod.com/jgloss/profiler.html
--
Roedy Green Canadian Mind Products http://mindprod.com
I can't come to bed just yet. Somebody is wrong on the Internet.
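The tape-merge idea can be sketched like this: sort both sets of ints, then walk them in step and act on the matches. This is a generic illustration and not the code behind the link above; the class name SortedMerge and the method intersect are made up for the example.

```java
import java.util.Arrays;

// Illustration of processing two sorted int sequences in one sequential pass.
final class SortedMerge {
    // Returns the values that occur in both arrays, in ascending order.
    // The inputs are copied first, so the caller's arrays are not reordered.
    static int[] intersect(int[] left, int[] right) {
        int[] a = left.clone();
        int[] b = right.clone();
        Arrays.sort(a);
        Arrays.sort(b);

        int[] out = new int[Math.min(a.length, b.length)];
        int i = 0, j = 0, n = 0;
        while (i < a.length && j < b.length) {
            if (a[i] < b[j]) {
                i++;                 // a is behind, advance it
            } else if (a[i] > b[j]) {
                j++;                 // b is behind, advance it
            } else {
                out[n++] = a[i];     // match: record it and advance both
                i++;
                j++;
            }
        }
        return Arrays.copyOf(out, n);
    }
}
```

No hashing and no boxing are involved; the cost is dominated by the two sorts.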
| From | Roedy Green <see_website@mindprod.com.invalid> |
|---|---|
| Date | 2011-11-18 20:23 -0800 |
| Message-ID | <8pbec7dqfddrdsfqd84bavh9d4odmpfa73@4ax.com> |
| In reply to | #10045 |
On Fri, 18 Nov 2011 07:34:09 -0800 (PST), Lew <lewbloch@gmail.com> wrote, quoted or indirectly quoted someone who said :
>Congratulations, you just re-invented 'Integer', but without any of its features.
in particular autoboxing/unboxing. see http://mindprod.com/jgloss/autoboxing.html
--
Roedy Green Canadian Mind Products http://mindprod.com
I can't come to bed just yet. Somebody is wrong on the Internet.
| From | Jim Janney <jjanney@shell.xmission.com> |
|---|---|
| Date | 2011-11-18 13:14 -0700 |
| Message-ID | <2p4ny19oj8.fsf@shell.xmission.com> |
| In reply to | #10041 |
jrobinss <julien.robinson2@gmail.com> writes:
> Hi all,
>
> this is a simple question, so it may be silly...
>
> I am processing structures that contain integers, structures such as matrixes used in statistical analysis. As I implement these as Maps of indexes, I use the class Integer, as in
>
> // Map<matrix index, DB index> <- I'd like to remove this comment!
> public static Map<Integer, Integer> myMap = ...;
>
> Now what I'd like is to write
> public static Map<MatrixIndex, DbIndex> myMap = ...;
>
> The advantage is that the code is auto-documented, but even better that the compiler will check that I never get mixed up in different indexes, by relying on strong typing.
>
> Usually, I would do this by extending the relevant class. But here it's Integer, which is final.
>
> Questions:
> 1. is this a bad good idea, and I should proceed with Integer?
> 2. if not, how would you implement this?
> 3. is there any performance issue in defining my own classes instead of Integer?
>
> My current solution is to define my own classes for replacing Integer, such as:
>
> public final class MatrixIndex {
> public final int value;
> public MatrixIndex(int i) {value = i;}
> }
>
> I won't benefit from autoboxing, then... :-(
> (I'm just hoping it doesn't break too much of the code, because even though it's mine to break, I don't have infinite time)
>
> Note that (exceptionally for me) performance *is* an issue here. I haven't yet narrowed it down, but the code executes very slowly and eats up much too much memory at the moment. I'm starting to optimize it, that's why I'm starting with strongly typing it to prevent errors.
>
> thanks for any tips!
> JRobinss
Switch to Ada? If that's not an option, you should probably just stick
with Integer. I think I understand what you're trying to do, but Java
just isn't good at expressing those kinds of constraints.
As far as performance is concerned, I think there are some open-source
projects that implement map-like structures with primitive keys, but I
don't have any experience using them.
--
Jim Janney
| From | Patricia Shanahan <pats@acm.org> |
|---|---|
| Date | 2011-11-18 12:43 -0800 |
| Message-ID | <cZKdnWXfz5bmXlvTnZ2dnUVZ_gydnZ2d@earthlink.com> |
| In reply to | #10041 |
jrobinss wrote:
...
> I won't benefit from autoboxing, then... :-(
If performance is an issue, autoboxing is a hindrance, not a benefit. It means a non-trivial operation happening without being immediately obvious from the code.
You may need to change algorithms and/or data structures to reduce the frequency of creating new objects.
Patricia
> (I'm just hoping it doesn't break too much of the code, because even though it's mine to break, I don't have infinite time)
>
> Note that (exceptionally for me) performance *is* an issue here. I haven't yet narrowed it down, but the code executes very slowly and eats up much too much memory at the moment. I'm starting to optimize it, that's why I'm starting with strongly typing it to prevent errors.
...
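A small sketch of where the hidden operations sit. The class BoxingDemo and its two methods are invented for illustration; the comments describe how the compiler translates autoboxing into Integer.valueOf and intValue calls. Integer.valueOf reuses cached objects for values from -128 to 127 and typically allocates a new object for anything outside that range.

```java
import java.util.HashMap;
import java.util.Map;

// Illustration only: the same sum computed with boxed and with primitive ints.
class BoxingDemo {
    static int sumBoxed(int n) {
        Map<Integer, Integer> map = new HashMap<Integer, Integer>();
        for (int i = 0; i < n; i++) {
            // Compiled as map.put(Integer.valueOf(i), Integer.valueOf(i * 2)).
            map.put(i, i * 2);
        }
        int sum = 0;
        for (int i = 0; i < n; i++) {
            // The key is boxed with Integer.valueOf(i) and the result is
            // unboxed with intValue() before the addition.
            sum += map.get(i);
        }
        return sum;
    }

    static int sumPrimitive(int n) {
        int[] values = new int[n];
        for (int i = 0; i < n; i++) {
            values[i] = i * 2;   // plain store, no objects created
        }
        int sum = 0;
        for (int i = 0; i < n; i++) {
            sum += values[i];
        }
        return sum;
    }
}
```

Both methods return the same number; only the first one creates wrapper objects along the way, and nothing in its source makes that obvious.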
| From | jrobinss <julien.robinson2@gmail.com> |
|---|---|
| Date | 2011-11-21 01:50 -0800 |
| Message-ID | <2301acca-a979-4c8d-972c-a4d3c729f08e@a16g2000yqk.googlegroups.com> |
| In reply to | #10041 |
Thanks all for your answers. I didn't reply individually, but be sure I read them with interest. Back to code (or out of its box...) JRobinss
| From | Robert Klemme <shortcutter@googlemail.com> |
|---|---|
| Date | 2011-11-21 20:58 +0100 |
| Message-ID | <9ivorbFcn7U1@mid.individual.net> |
| In reply to | #10041 |
On 18.11.2011 15:50, jrobinss wrote:
> public final class MatrixIndex {
> public final int value;
> public MatrixIndex(int i) {value = i;}
> }
I'd rather use Integer but if you want a specific type you could
implement a class which inherits from Number and works exactly the same way
Integer does and has a generic type parameter for the container type.
While that type parameter would be otherwise useless it would help in
catching type errors.
Kind regards
robert
--
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/
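One way to read this suggestion, sketched under assumptions: a single class extending Number, with a type parameter that exists only to tell index kinds apart at compile time. The names Index, MatrixTag, DbTag and PhantomIndexDemo are invented for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Marker types: never instantiated, used only as type arguments.
final class MatrixTag { private MatrixTag() {} }
final class DbTag { private DbTag() {} }

// An int wrapper whose type parameter T says what kind of index it is.
final class Index<T> extends Number {
    private static final long serialVersionUID = 1L;

    private final int value;

    Index(int value) { this.value = value; }

    @Override public int intValue() { return value; }
    @Override public long longValue() { return value; }
    @Override public float floatValue() { return value; }
    @Override public double doubleValue() { return value; }

    // T is erased at runtime, so equality can only compare the int.
    @Override public boolean equals(Object o) {
        return o instanceof Index && ((Index<?>) o).value == value;
    }

    @Override public int hashCode() { return value; }
}

// Illustration only.
class PhantomIndexDemo {
    static int lookup(int matrixIndex) {
        Map<Index<MatrixTag>, Index<DbTag>> map =
                new HashMap<Index<MatrixTag>, Index<DbTag>>();
        map.put(new Index<MatrixTag>(3), new Index<DbTag>(42));
        // map.put(new Index<DbTag>(42), new Index<MatrixTag>(3)); // would not compile
        Index<DbTag> found = map.get(new Index<MatrixTag>(matrixIndex));
        return found == null ? -1 : found.intValue();
    }
}
```

Because of type erasure the check exists only at compile time: at runtime an Index<MatrixTag> and an Index<DbTag> holding the same int compare as equal.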
| From | David Lamb <dalamb@cs.queensu.ca> |
|---|---|
| Date | 2011-11-21 15:38 -0500 |
| Message-ID | <EFyyq.45957$Ra6.20000@newsfe07.iad> |
| In reply to | #10041 |
On 18/11/2011 9:50 AM, jrobinss wrote:
> Now what I'd like is to write
> public static Map<MatrixIndex, DbIndex> myMap = ...;
>
> The advantage is that the code is auto-documented, but even better that the compiler will check that I never get mixed up in different indexes,
> by relying on strong typing.
>
> Usually, I would do this by extending the relevant class. But here it's Integer, which is final.
>
> Questions:
> 1. is this a bad good idea, and I should proceed with Integer?
> 2. if not, how would you implement this?
> 3. is there any performance issue in defining my own classes instead of Integer?
>
> My current solution is to define my own classes for replacing Integer, such as:
>
> public final class MatrixIndex {
> public final int value;
> public MatrixIndex(int i) {value = i;}
> }
To me the critical question is: in what way do each of these different
kinds of "integer" differ from each other and from the language-defined
"Integer"? Do they differ in upper and lower bounds, for example? is it
important to keep track of where each number came from? is one of them a
mere identifier (such as a lot number in a city plan) rather than
something you could add and subtract from each other? All of those
suggest that you need a new type(s) that "has an int" instead of "is an
Integer".
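A sketch of a type that "has an int" and differs from Integer in its bounds, assuming a row index that must satisfy 0 <= value < rowCount. The class name RowIndex and its methods are hypothetical, chosen only to illustrate the questions raised here.

```java
// A row index that carries its own validity check and offers no arithmetic.
final class RowIndex {
    private final int value;

    // rowCount is the number of rows of the matrix the index refers to.
    RowIndex(int value, int rowCount) {
        if (value < 0 || value >= rowCount) {
            throw new IllegalArgumentException(
                    "row index " + value + " outside 0.." + (rowCount - 1));
        }
        this.value = value;
    }

    int value() { return value; }

    @Override public boolean equals(Object o) {
        return o instanceof RowIndex && ((RowIndex) o).value == value;
    }

    @Override public int hashCode() { return value; }
}
```

The constructor is the only place where an out-of-range int can enter, so code that receives a RowIndex does not have to check it again.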