Re: Sorting numeric strings

From Daniel Pitts <newsgroup.nospam@virtualinfinity.net>
Newsgroups comp.lang.java.programmer
Subject Re: Sorting numeric strings
References <2012043021273098748-no@waycom>
Message-ID <CqVnr.242$go4.98@newsfe14.iad> (permalink)
Date 2012-05-01 10:53 -0700


On 4/30/12 6:27 PM, Ben wrote:
> Given the following data:
>
> Col1, Col2, Col3
> 438.23, 991897664, ccc
> 22.12, 991897631, bbb
> 100.99, 881897631, aaa
> 50.12, 991884803, ddd
>
> The class below will sort the data based on the column specified, except
> Col1, which contains float values. If you set the SortCol variable below
> to 0, sorting does not work. If you set it to 1 or 2, sorting does work.
> How can I sort Col1 which is a column of numeric strings?
>
> import java.io.BufferedReader;
> import java.io.FileReader;
> import java.io.FileWriter;
> import java.util.LinkedList;
> import java.util.List;
> import java.util.Map;
> import java.util.TreeMap;
>
> public class SortColumn {
>
> public static void main(String[] args) throws Exception {
>
> BufferedReader reader = new BufferedReader(new FileReader("file.csv"));
> //BufferedReader reader = new BufferedReader(new FileReader("jtp-input2-test.csv"));
> Map<String, List<String>> map = new TreeMap<String, List<String>>();
> String line = reader.readLine(); //read header
> while ((line = reader.readLine()) != null) {
> String key = getField(line).toString();
> List<String> l = map.get(key);
> if (l == null) {
> l = new LinkedList<String>();
> map.put(key, l);
> System.out.println(key);
> }
> l.add(line);
>
> }
> reader.close();
>
> FileWriter writer = new FileWriter("sorted_numbers.txt");
> writer.write("Col1, Col2, Col3\n");
> // writer.write("billnumber, Copay, Discount, NonAllow, unknown\n");
> for (List<String> list : map.values()) {
> for (String val : list) {
> writer.write(val);
> writer.write("\n");
> }
> }
> writer.close();
> }
>
> private static String getField(String line) {
> // Column you want to sort on (Zero based)
> int SortCol = 0;
> return line.split(",")[SortCol];
> }
> }

To compare two strings as numbers, you need to pad them with zeros on
both sides of the "dot": on the left of the integer part and on the
right of the fractional part.

In other words, to compare "123" with "3.141", you'd need to
"normalize" them to "123.000" and "003.141".
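
As a rough illustration of that normalization, here is a minimal sketch.
The class and method names (PaddedDecimalSketch, normalize, compare) are
made up for this example, and it assumes plain non-negative decimals
with at most one dot, no sign and no exponent.

```java
import java.util.Arrays;

public class PaddedDecimalSketch {
    // Left-pads the integer part and right-pads the fractional part with zeros.
    // Assumption: s is a plain non-negative decimal such as "123" or "3.141".
    public static String normalize(String s, int intWidth, int fracWidth) {
        int dot = s.indexOf('.');
        String intPart = dot < 0 ? s : s.substring(0, dot);
        String fracPart = dot < 0 ? "" : s.substring(dot + 1);
        StringBuilder sb = new StringBuilder();
        for (int i = intPart.length(); i < intWidth; i++) {
            sb.append('0');
        }
        sb.append(intPart).append('.').append(fracPart);
        for (int i = fracPart.length(); i < fracWidth; i++) {
            sb.append('0');
        }
        return sb.toString();
    }

    // Pads both values to a common width, then compares them as ordinary strings.
    public static int compare(String a, String b) {
        int intWidth = Math.max(intLen(a), intLen(b));
        int fracWidth = Math.max(fracLen(a), fracLen(b));
        return normalize(a, intWidth, fracWidth)
                .compareTo(normalize(b, intWidth, fracWidth));
    }

    private static int intLen(String s) {
        int dot = s.indexOf('.');
        return dot < 0 ? s.length() : dot;
    }

    private static int fracLen(String s) {
        int dot = s.indexOf('.');
        return dot < 0 ? 0 : s.length() - dot - 1;
    }

    public static void main(String[] args) {
        System.out.println(normalize("123", 3, 3));   // prints 123.000
        System.out.println(normalize("3.141", 3, 3)); // prints 003.141

        // The Col1 values from the question.
        String[] col1 = {"438.23", "22.12", "100.99", "50.12"};
        Arrays.sort(col1, PaddedDecimalSketch::compare);
        System.out.println(Arrays.toString(col1));    // prints [22.12, 50.12, 100.99, 438.23]
    }
}
```

Once both strings have the same shape, plain String.compareTo gives the
numeric order, which is why the padding is needed at all.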

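I recently wrote something that does this and handles any number of
"." separators. It was designed for revision numbering, which can
contain several dots, so each dot-separated segment is compared as a
whole number. For decimal fractions of different lengths you would
still need to right-pad the fractional part first.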

import org.apache.commons.lang.StringUtils;
import java.util.Comparator;

public class StringAsNumberComparator implements Comparator<String> {
     // Must be public to implement Comparator<String>.
     @Override
     public int compare(String left, String right) {
         final String[] a = StringUtils.split(left, '.');
         final String[] b = StringUtils.split(right, '.');
         for (int i = 0; i < a.length; ++i) {
             if (i >= b.length) {
                 // left has more segments than right, e.g. "1.2.1" vs "1.2"
                 return 1;
             }
             // Compare the i-th segments, not the whole strings.
             final int compare = compareMaybeNumeric(a[i], b[i]);
             if (compare != 0) {
                 return compare;
             }
         }
         // All of left's segments matched; right may still have more.
         return a.length - b.length;
     }

     private static int compareMaybeNumeric(String a, String b) {
         if (StringUtils.isNumeric(a) && StringUtils.isNumeric(b)) {
             // Left-pad to equal length so string order matches numeric order.
             final int length = Math.max(a.length(), b.length());
             return StringUtils.leftPad(a, length, '0')
                     .compareTo(StringUtils.leftPad(b, length, '0'));
         } else {
             return a.compareTo(b);
         }
     }
}


Although, now that I'm looking at this, I see a few optimizations I can
make that don't involve padding.  If two numeric segments aren't the
same length (and neither has leading zeros), then the longer string has
the larger magnitude.
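
A minimal sketch of that padding-free comparison might look like the
following. The names (UnpaddedDigitsSketch, compareDigits,
stripLeadingZeros) are hypothetical, and it assumes both arguments are
non-empty strings of digits only. Leading zeros are stripped first so
that the length rule holds.

```java
public class UnpaddedDigitsSketch {
    // Compares two non-negative digit strings by magnitude without padding.
    // Assumption: both strings are non-empty and contain only '0'..'9'.
    public static int compareDigits(String a, String b) {
        a = stripLeadingZeros(a);
        b = stripLeadingZeros(b);
        if (a.length() != b.length()) {
            // With no leading zeros, more digits means a larger value.
            return a.length() < b.length() ? -1 : 1;
        }
        // Same length: lexicographic order equals numeric order.
        return Integer.signum(a.compareTo(b));
    }

    // Drops leading zeros but always keeps at least one character.
    public static String stripLeadingZeros(String s) {
        int i = 0;
        while (i < s.length() - 1 && s.charAt(i) == '0') {
            i++;
        }
        return s.substring(i);
    }

    public static void main(String[] args) {
        System.out.println(compareDigits("438", "22"));  // prints 1
        System.out.println(compareDigits("007", "10"));  // prints -1
        System.out.println(compareDigits("050", "50"));  // prints 0
    }
}
```

This avoids allocating padded copies when the lengths already differ,
which is the common case for values of mixed magnitude.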

Of course, this code doesn't consider negative values, but can be 
adjusted to do so.
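
One possible adjustment for negative values, as a sketch only: look at
the sign first, then compare magnitudes and reverse the result when both
values are negative. The names here (SignedDigitsSketch, compareSigned,
compareMagnitude) are hypothetical. It assumes whole numbers with an
optional leading "-" and no leading zeros, and it treats "-0" as smaller
than "0".

```java
public class SignedDigitsSketch {
    // Compares whole-number strings that may start with '-'.
    // Assumption: digits only after the optional sign, no leading zeros.
    public static int compareSigned(String a, String b) {
        boolean negA = a.startsWith("-");
        boolean negB = b.startsWith("-");
        if (negA != negB) {
            // A negative value always sorts before a non-negative one.
            return negA ? -1 : 1;
        }
        int magnitude = compareMagnitude(
                negA ? a.substring(1) : a,
                negB ? b.substring(1) : b);
        // For two negatives, the larger magnitude is the smaller number.
        return negA ? -magnitude : magnitude;
    }

    // Longer digit string wins; equal lengths fall back to string order.
    private static int compareMagnitude(String a, String b) {
        if (a.length() != b.length()) {
            return a.length() < b.length() ? -1 : 1;
        }
        return Integer.signum(a.compareTo(b));
    }

    public static void main(String[] args) {
        System.out.println(compareSigned("-10", "-9")); // prints -1
        System.out.println(compareSigned("-5", "3"));   // prints -1
        System.out.println(compareSigned("7", "-100")); // prints 1
    }
}
```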



