Home > Programming > What you don’t want to know about Java Collections

What you don’t want to know about Java Collections

In one of my previous posts I wrote (I deleted “very” later, after the post was already published, as it seemed to strong):

Prefer lists to arrays unless you have a very good reason not to.

This was mostly in reaction to Lists are overrated, which has since been edited. Before anything else, l want to state clearly that I still stick to that statement. Unfortunately, due deficiencies in the Java language design and standard library APIs, some of these “good reasons” are not obvious:

Lists containing lots of boxed primitives are huge memory wasters: This isn’t really surprising, if you just do the math. Imagine you have 1024 integers. If you store them in an int array, you should be able to get away with about 4kb of memory. But storing them in a list of boxed primitives will take at least 8kb on a 32bit system, and at least 12kb on a 64bit system, because we now also need an additional object reference for every int value we store. Last but not least, we also need an implementation dependent amount of storage for 1024 Java objects, that make matters even worse. Note that this reasoning isn’t limited to Collections, but to boxed primitives in general (see also Item 49 “Prefer primitive types to boxed primitives” in EffJava2).

Collections.sort(…) favors idiots over experts: Let me cite the relevant Javadocs:

This implementation dumps the specified list into an array, sorts the array, and iterates over the list resetting each element from the corresponding position in the array. This avoids the n2 log(n) performance that would result from attempting to sort a linked list in place.

So, yes, the implementors of the Java standard library favor programmers who are clueless enough to sort huge linked lists, over programmers that did their homework. In fact, this simple Qicksort implementation, that sorts the given List in place, outperforms Collections.sort(…) by about 20% for ArrayList<Integer> and doesn’t needlessly allocate possibly huge amounts of memory:

static <T extends Comparable<? super T>> void qsort(final List<T> list)
    qsort(list, 0, list.size());
private static <T extends Comparable<? super T>> void qsort(final List<T> list, final int start, final int end)
    // Handle simple cases:
    if(end - start < 2)
    if(end - start == 2)
        if(list.get(start).compareTo(list.get(end - 1)) > 0)
            swap(list, start, end - 1);
   // Select a new pivot:
   int i1 = start;
   int i2 =  start + (end - start)/2;
   int i3 = end - 1;
   T p1 = list.get(i1);
   T p2 = list.get(i2);
   T p3 = list.get(i3);
   int pivotInd = -1;
   if(p1.compareTo(p2) == -p1.compareTo(p3))
       pivotInd = i1;
   else if(p2.compareTo(p1) == -p2.compareTo(p3))
       pivotInd = i2;
       pivotInd = i3;
   T pivot = list.get(pivotInd);
   // Partition the list using the selected pivot:
   swap(list, pivotInd, end - 1);
   int storeInd = start;
   for(int i = start; i != end - 1; ++i)
       if(pivot.compareTo(list.get(i)) > 0)
           swap(list, i, storeInd++);
    swap(list, storeInd, end - 1);
    // Recurse:
    qsort(list, start, storeInd);
    qsort(list, storeInd + 1, end);

private static <T> void swap(final List<T> list, final int i1, final int i2)
    T tmp = list.get(i1);
    list.set(i1, list.get(i2));
    list.set(i2, tmp);

If you don’t believe me, try it yourself – but please don’t use this code in production without testing it thoroughly.

Categories: Programming Tags: ,

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )


Connecting to %s

%d bloggers like this: