Archive
IP SQUARE Commons Servlet 2.1.0 Released
I’ve just released IP SQUARE Commons Servlet 2.1.0 to the Central Maven Repository:
- Two new filters have been added: PerformanceLogFilter and RequestEncodingFilter.
- The library now comes with a manual and improved Javadocs.
- The implementation takes advantage of APIs added with IP SQUARE Commons Core 2.1.0.
IP SQUARE Commons Hibernate 2.0.1 Released
I’ve just released IP SQUARE Commons Hibernate 2.1.0 to the Central Maven Repository:
IP SQUARE Commons Core 2.1.0 Released
I’ve just released IP SQUARE Commons Core 2.1.0 to the Central Maven Repository:
- PerformanceLogger now supports custom formatters.
- A registry for class loaders was added and is used consistently, especially in Classes and LocalResources.
- The library now comes with a manual and improved Javadocs.
Series About Java Concurrency – Pt. 6
This is the solution to my previous post, so make sure that you read it before you continue. For your convenience, here is the program we are going to discuss once again:
public class ThreadNoMore extends Thread
{
private static final int N_THREADS = 16;
private static final AtomicInteger threadsStarted = new AtomicInteger();
private final long ctorThreadId = Thread.currentThread().getId();
@Override
public synchronized void start()
{
if(ctorThreadId != Thread.currentThread().getId())
threadsStarted.incrementAndGet();
}
public static void main(String[] args) throws InterruptedException
{
Thread[] threads = new Thread[N_THREADS];
for(int i = 0; i != N_THREADS; ++i)
threads[i] = new ThreadNoMore();
for(Thread th : threads)
th.start();
for(Thread th : threads)
th.join();
System.out.println("threadsStarted: " + threadsStarted);
}
}
As already pointed out by Ortwin Glück, the program always prints
threadsStarted: 0
because ThreadNoMore.start() is always executed in the thread that invoked it. If you think it over, this should not surprise you at all, as methods are always executed in the thread that invoked them. This even applies to Thread.run(), which is the method we should have overridden instead; but unlike Thread.start(), Thread.run() is normally invoked by the Java Virtual Machine, as the Javadocs for Thread.start() tell us:
Causes this thread to begin execution; the Java Virtual Machine calls the run method of this thread.
So apart from not doing what we might have intended, carelessly overriding Thread.start(), that is, without calling super.start() so that the JVM can do its magic, leaves us with a crippled class that no longer has anything to do with a thread at all. Considering these facts, it should not take you by surprise that
public class Thrap extends Thread
{
private static final int N_THREADS = 42;
private static final AtomicInteger
started = new AtomicInteger(),
ran = new AtomicInteger();
@Override
public synchronized void start()
{
started.incrementAndGet();
}
@Override
public void run()
{
ran.incrementAndGet();
}
public static void main(String[] args) throws InterruptedException
{
Thread[] threads = new Thread[N_THREADS];
for(int i = 0; i != N_THREADS; ++i)
threads[i] = new Thrap();
for(Thread thread : threads)
thread.start();
for(Thread thread : threads)
thread.join();
System.out.println("started: " + started);
System.out.println("ran: " + ran);
}
}
results in
started: 42
ran: 0
being written to your terminal. As you can clearly see, Thrap.run() is not executed at all, neither from a newly created thread nor from the main thread. To fix the code above, you have to call super.start() in Thrap.start() like so:
@Override
public synchronized void start()
{
super.start();
started.incrementAndGet();
}
After this modification you get
started: 42
ran: 42
as expected.
So what can we learn from all of this? First, there are two things to remember about the Java threads API:
- If you want to start a new thread, do so by clearly documented standard procedures, or use the high-level concurrency APIs from java.util.concurrent.
- Only override Thread.start() if you know exactly what you are doing.
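As a rough illustration of the first point, here is a minimal sketch that runs the same 42 increments through an ExecutorService instead of subclassing Thread. The class name ExecutorCount, the pool size of 4 and the 10 second timeout are assumptions made for this example, and the lambda syntax requires Java 8 or later:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class ExecutorCount
{
    // Runs nTasks trivial tasks on a small pool and returns how many actually ran.
    static int runTasks(int nTasks)
    {
        AtomicInteger ran = new AtomicInteger();
        // Pool size is an arbitrary choice for this sketch.
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try
        {
            for(int i = 0; i != nTasks; ++i)
                pool.execute(() -> ran.incrementAndGet());
        }
        finally
        {
            // Already submitted tasks still run after shutdown().
            pool.shutdown();
        }

        try
        {
            if(!pool.awaitTermination(10, TimeUnit.SECONDS))
                throw new IllegalStateException("Tasks did not finish in time.");
        }
        catch(InterruptedException e)
        {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return ran.get();
    }

    public static void main(String[] args)
    {
        System.out.println("ran: " + runTasks(42));
    }
}
```

Since there is no start() method to override here, the mistake discussed above cannot happen in the first place.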
Equally important are the consequences for API design: Interfaces should be easy to use correctly and hard to use incorrectly. Unfortunately, the Thread class violates this principle, as it is quite easy to misuse, as we have just seen. More generally, requiring clients to call the super version of a method they are overriding is considered to be an anti-pattern for this very reason.
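One common way to avoid the "call super" requirement is to make the public method final and let subclasses override a separate hook instead. The following is only a sketch of that idea, not something the Thread class offers; HookDemo, Task, execute() and doWork() are hypothetical names invented for this example:

```java
public class HookDemo
{
    abstract static class Task
    {
        // final: subclasses cannot accidentally skip setup or cleanup.
        public final String execute()
        {
            StringBuilder log = new StringBuilder();
            log.append("setup;");
            doWork(log);
            log.append("cleanup");
            return log.toString();
        }

        // The only extension point; no super call is required.
        protected abstract void doWork(StringBuilder log);
    }

    static String runSample()
    {
        Task task = new Task()
        {
            @Override
            protected void doWork(StringBuilder log)
            {
                log.append("work;");
            }
        };
        return task.execute();
    }

    public static void main(String[] args)
    {
        System.out.println(runSample());
    }
}
```

With this shape, a subclass author can forget nothing that matters: the mandatory steps live in the final method, and the compiler forces the hook to be implemented.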
Series About Java Concurrency – Pt. 5
After quite some time, here is a concurrency related puzzle once again: Take a look at the following program and try to predict its output:
package com.wordpress.mlangc.concurrent;
import java.util.concurrent.atomic.AtomicInteger;
public class ThreadNoMore extends Thread
{
private static final int N_THREADS = 16;
private static final AtomicInteger threadsStarted = new AtomicInteger();
private final long ctorThreadId = Thread.currentThread().getId();
@Override
public synchronized void start()
{
if(ctorThreadId != Thread.currentThread().getId())
threadsStarted.incrementAndGet();
}
public static void main(String[] args) throws InterruptedException
{
Thread[] threads = new Thread[N_THREADS];
for(int i = 0; i != N_THREADS; ++i)
threads[i] = new ThreadNoMore();
for(Thread th : threads)
th.start();
for(Thread th : threads)
th.join();
System.out.println("threadsStarted: " + threadsStarted);
}
}
The solution, together with a detailed explanation will be available soon.
The IP SQUARE Commons Java Libraries
The purpose of this post is to introduce the ipsquare-commons project, a small collection of reusable Java classes I’ve put together while working at IP SQUARE. As there are tons of useful Java libraries already out there, and I prefer to use these whenever reasonably possible, you shouldn’t expect anything too fancy, as the exciting problems the everyday Java developer has to deal with have already been solved elsewhere.
As of today, the ipsquare-commons project consists of the following modules, which are separate artifacts and are introduced in the manuals linked below:
- ipsquare-commons-core: APIs that are likely to be useful in almost any Java project.
- ipsquare-commons-hibernate: Useful APIs for working with Hibernate.
- ipsquare-commons-servlet: APIs related to Java servlets and filters.
All related artifacts can be found in the Maven Central Repository.
Series About Java Concurrency – Pt. 4
This post is the solution to Series About Java Concurrency – Pt. 3, so be sure to read my last post before this one.
So, what does the code from Series About Java Concurrency – Pt. 3 do? As it turns out, there isn’t a single answer: If you are using a recent Oracle VM, the behavior of the program seems to depend on whether your VM runs in server or client mode. In client mode the program will most likely stop after roughly 2.5s, as one would naively expect. In server mode, however, the program just loops forever. The update to stop in line 24 never becomes visible to the main thread. This isn’t a bug, but perfectly legal behavior according to the Java Memory Model, which allows optimizing
while(!stop)
into
while(true)
Synchronization is not just about avoiding data races; it is also needed to avoid reading stale data. This may sound weird, but it allows the VM as well as the CPU to apply powerful optimizations (for example out-of-order execution) that would otherwise be impossible or very hard to implement.
One way to fix our shiny little program is by using the synchronized keyword as demonstrated below:
package com.wordpress.mlangc.concurrent;
public class PleaseWait
{
private static final long MILLIS_TO_WAIT = 2500;
private static boolean stop = false;
private static synchronized boolean isStop()
{
return stop;
}
private static synchronized void setStop(boolean stop)
{
PleaseWait.stop = stop;
}
public static void main(String[] args)
{
Thread timer = new Thread(new Runnable()
{
public void run()
{
try
{
Thread.sleep(MILLIS_TO_WAIT);
}
catch(InterruptedException e)
{
// We are already exiting the thread.
}
finally
{
setStop(true);
System.out.println("Stop requested.");
}
}
});
timer.start();
long start = System.currentTimeMillis();
while(!isStop())
; // <-- Do nothing.
long stoppedAfter = System.currentTimeMillis() - start;
System.out.printf("Stopped after %dms.\n", stoppedAfter);
}
}
It is important to note that both the setter and the getter are synchronized using the same lock. Synchronizing only the setter method is not enough!
While the code above works reasonably well, we can still do better, because we don’t need mutual exclusion, but just want to ensure that the main thread sees what the timer thread does. This can be accomplished quite easily by declaring stop to be volatile. The volatile keyword makes sure that the main thread always sees the most recent value of stop without any additional uses of synchronized. Following this advice, we end up with something like this:
package com.wordpress.mlangc.concurrent;
public class PleaseWait
{
private static final long MILLIS_TO_WAIT = 2500;
private static volatile boolean stop = false;
public static void main(String[] args)
{
Thread timer = new Thread(new Runnable()
{
public void run()
{
try
{
Thread.sleep(MILLIS_TO_WAIT);
}
catch(InterruptedException e)
{
// We are already exiting the thread.
}
finally
{
stop = true;
System.out.println("Stop requested.");
}
}
});
timer.start();
long start = System.currentTimeMillis();
while(!stop)
; // <-- Do nothing.
long stoppedAfter = System.currentTimeMillis() - start;
System.out.printf("Stopped after %dms.\n", stoppedAfter);
}
}
The only difference from the broken version from Series About Java Concurrency – Pt. 3 is the volatile keyword in line 6.
Last but not least I want to state clearly that this article is not about the proper way to implement task cancellation, which is a nontrivial topic of its own. If parts of this post are new to you, I strongly suggest that you grab yourself a copy of the excellent book Java Concurrency in Practice and read at least the first part, called Fundamentals, thoroughly. By doing so you are almost certainly saving yourself from nasty surprises or frustrating debugging sessions in the future.
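For comparison, the same visibility guarantee can also be had from java.util.concurrent.atomic.AtomicBoolean, whose get() and set() have the memory effects of reading and writing a volatile variable. The sketch below is an assumption-laden variant of the program above, not part of the original puzzle: the class name PleaseWaitAtomic and the method waitForStop are made up, and the lambda requires Java 8 or later.

```java
import java.util.concurrent.atomic.AtomicBoolean;

public class PleaseWaitAtomic
{
    // Busy-waits until a timer thread sets the flag; returns the elapsed milliseconds.
    static long waitForStop(final long millisToWait)
    {
        final AtomicBoolean stop = new AtomicBoolean(false);

        Thread timer = new Thread(() ->
        {
            try
            {
                Thread.sleep(millisToWait);
            }
            catch(InterruptedException e)
            {
                // We are already exiting the thread.
            }
            finally
            {
                stop.set(true); // Same memory effects as a volatile write.
            }
        });
        timer.start();

        long start = System.currentTimeMillis();
        while(!stop.get()) // Same memory effects as a volatile read.
            ; // <-- Do nothing.
        return System.currentTimeMillis() - start;
    }

    public static void main(String[] args)
    {
        System.out.printf("Stopped after %dms.\n", waitForStop(2500));
    }
}
```

For a single flag like this one, a plain volatile field is arguably the simpler choice; AtomicBoolean mainly pays off once atomic compound operations such as compareAndSet are needed.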
Series About Java Concurrency – Pt. 3
As it seems that even experienced Java programmers might have serious misconceptions regarding the Java memory model, I’ve decided to publish a concurrency related puzzle once again. Consider the following simple program and try to predict its output:
package com.wordpress.mlangc.concurrent;
public class PleaseWait
{
private static final long MILLIS_TO_WAIT = 2500;
private static boolean stop = false;
public static void main(String[] args)
{
Thread timer = new Thread(new Runnable()
{
public void run()
{
try
{
Thread.sleep(MILLIS_TO_WAIT);
}
catch(InterruptedException e)
{
// We are already exiting the thread.
}
finally
{
stop = true;
System.out.println("Stop requested.");
}
}
});
timer.start();
long start = System.currentTimeMillis();
while(!stop)
; // <-- Do nothing.
long stoppedAfter = System.currentTimeMillis() - start;
System.out.printf("Stopped after %dms.\n", stoppedAfter);
}
}
The solution, together with explanations will be published in the near future.
Be Careful When Converting Java Arrays to Lists
Unfortunately, not everything that should be trivial actually is. One example is converting Java arrays to lists. Of course, there is Arrays.asList, but using this method carelessly will almost certainly lead to nasty surprises. To see what I mean, consider the following program and try to predict its output:
package com.wordpress.mlangc.arrays;
import java.util.Arrays;
public class ArraysToList
{
public static void main(final String[] args)
{
System.out.println(
Arrays.asList(new String[] { "a", "b" }));
System.out.println(
Arrays.asList(new Integer[] { 1, 2 }));
System.out.println(
Arrays.asList(new int[] { 1, 2 }));
System.out.println(
Arrays.asList(new String[] { "a", "b" }, "c"));
}
}
As the Javadocs for Arrays.asList are quite vague, I can’t blame you for having some difficulties coming to a conclusion, so here is the answer step by step:
- Line 9 prints “[a, b]” to our shiny little console, which is pretty much what one would expect from a sane API, so we are happy.
- The same is true for line 12, which results in “[1, 2]”.
- Line 15 however is different, not only because 15 is not 12, but also because an int is not an Integer, and therefore prints “[[I@39172e08]” to our console, which is not shiny anymore. Instead of a list containing two Integer objects, we got a list containing the array as its sole element.
- After what we have seen above, it should not take you by surprise that line 18 results in another mess that looks like “[[Ljava.lang.String;@20cf2c80, c]”.
So, what happened? The first two print statements worked as expected, because the Java Language Specification states that calling a method with signature foo(T… t) like foo(new T[] { bar, baz }) is semantically equivalent to foo(bar, baz). In Arrays.asList T is a type parameter, so it has to be a reference type; this is not true for int, but it is for int[]. That’s why the statement in line 15 is equivalent to
Arrays.asList(new Object[] { new int[] { 1, 2 } })
Last but not least, the statement in line 18 called for trouble from the very beginning. We told the compiler that we want a list containing an array of strings and a string, which is exactly what we got.
So much for the explanation, but there is something else we can learn from that: The real source of confusion is not that the varargs feature is badly designed; I would rather say that the opposite is true. The problem is that Arrays.asList violates EffJava2 Item 42, which explains clearly, in fact giving Arrays.asList as a bad example, why you should be quite careful when designing APIs that use Java varargs. I won’t repeat the reasoning of the book here, as you really should read it yourself, but for the sake of completeness I have to point out that the problematic statements from above would have been rejected by the compiler back in the old days of Java 1.4, which was a good thing. We can still use Arrays.asList today, but doing so safely requires us to be aware of the subtleties we are facing. Here are a few rules for converting arrays to lists that guarantee nothing unexpected happens:
- If you convert to a list just to convert to a string, use Arrays.toString instead. It does what you want all the time and also works for arrays of primitives.
- If you want to convert an array of primitives to a list of boxed primitives, take advantage of Apache Commons Lang, which most likely is a dependency of your project already anyway, and use ArrayUtils.toObject like this:
List<Integer> list = Arrays.asList(ArrayUtils.toObject(new int[] { 1, 2 }));
Note however that lists of boxed primitives should not generally be preferred over arrays containing primitives.
- If you want to convert an array of object references, use Arrays.asList directly:
List<String> list = Arrays.asList(new String[] { "a", "b" });
Don’t forget to make sure that the people you work with won’t emulate you carelessly.
Of course, you can also choose to just remember that Arrays.asList might behave unexpectedly and use plain for loops instead, but that clutters your code and comes with a performance penalty.
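The rules above can be tried out with a short sketch. Note that the boxed() helper goes beyond what the post describes: it assumes Java 8 or later and uses the streams API instead of Apache Commons Lang; the class name ArrayConversions is likewise made up for this example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class ArrayConversions
{
    // Assumes Java 8+: boxes each int and collects the results into a List<Integer>.
    static List<Integer> boxed(int[] values)
    {
        return Arrays.stream(values).boxed().collect(Collectors.toList());
    }

    public static void main(String[] args)
    {
        int[] primitives = { 1, 2 };

        // Arrays.toString also works for arrays of primitives.
        System.out.println(Arrays.toString(primitives));

        // The pitfall: this is a List<int[]> whose sole element is the array.
        System.out.println(Arrays.asList(primitives).size());

        // A real list of boxed values.
        System.out.println(boxed(primitives));
    }
}
```

The second statement prints 1 rather than 2, which is the same surprise as in the puzzle, just made visible through size() instead of toString().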
What you don’t want to know about Java boxed primitives
In my last post I used a simple calculation to argue that a List containing 1024 Integer objects will take at least 8kb on a 32bit system, and at least 12kb on a 64bit system. I then also argued that additional storage for 1024 Java objects is needed (some of them might be cached, but let’s talk about that later). Unfortunately however, I did not investigate how much that would be until Ortwin Glück, who runs the Java Anti-Patterns page, wrote me an email with some very interesting numbers. So I started doing some research:
How can you reliably obtain the size of a random Java object? Well, there are basically two answers: You can use a hack that is outlined here, or you can use Instrumentation.getObjectSize(Object obj). Still, obtaining an implementation of Instrumentation and traversing object graphs is far more work than I want to invest. Luckily there is java.sizeOf, which does exactly that for us. java.sizeOf basically is a Java agent and a library to access its functionality, packaged into a single class called SizeOf. To see what it can do, I wrote this small program:
package com.wordpress.mlangc.sizeof;
import java.lang.reflect.Array;
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import net.sourceforge.sizeof.SizeOf;
public class TestSizeOf
{
private static final String WTF = ":-O";
private static final int SIZE = 1024;
public static void main(final String[] args)
{
SizeOf.skipStaticField(true);
printSize(new Object());
printSize(Integer.valueOf(0));
printSize(new int[SIZE]);
List<Integer> intList = new ArrayList<Integer>(SIZE);
for(int i = 0; i != SIZE; ++i)
intList.add(i);
printSize(intList);
}
private static void printSize(final Object obj)
{
System.out.println("sizeof(" + describeObject(obj) + "): "
+ SizeOf.humanReadable(SizeOf.deepSizeOf((obj))));
}
private static String describeObject(final Object obj)
{
Class<?> clazz = obj.getClass();
StringBuilder sb = new StringBuilder(clazz.getCanonicalName());
if(clazz.isArray())
{
if(!sb.toString().endsWith("[]"))
throw new AssertionError(WTF);
sb.insert(sb.length() - 1, Array.getLength(obj));
}
else if(obj instanceof Collection<?>)
{
Collection<?> collection = (Collection<?>) obj;
sb.append("<")
.append(extractGenericTypeFromCollectionHack(collection))
.append(">")
.append("[")
.append(collection.size())
.append("]");
}
return sb.toString();
}
private static String extractGenericTypeFromCollectionHack(final Collection<?> collection)
{
//It is impossible to do this reliably because of type erasure;
//for our purposes however, this simple hack,
//that just inspects the first element, should do nicely.
if(collection.isEmpty())
return "?";
return collection.iterator().next().getClass().getCanonicalName();
}
}
Then I executed the code above like this (the -javaagent option is important):
$ java -javaagent:${LIB_PATH}/SizeOf.jar com.wordpress.mlangc.sizeof.TestSizeOf
JAVAGENT: call premain instrumentation for class SizeOf
sizeof(java.lang.Object): 16.0b
sizeof(java.lang.Integer): 24.0b
sizeof(int[1024]): 4.0234375Kb
sizeof(java.util.ArrayList<java.lang.Integer>[1024]): 32.0625Kb
So, yes, a plain and mostly useless Java Object consumes 16 bytes on my system (x86_64 with sun-jdk-1.6.0_17) – that’s already 4 int values! An Integer object swallows 24 bytes (I would have guessed 20 = 16 + 4 – maybe this has something to do with alignment?), which is 6 int values. An array of 1024 int values needs 4kb as it should, but an ArrayList<Integer> swallows 32kb = 24kb + 8kb and therefore consumes an unbelievable 8 times more memory than the array.
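The arithmetic behind these numbers can be written down as a tiny sketch. It deliberately ignores the headers of the ArrayList and of its backing array, and the sizes passed in (24 bytes per Integer, 8 bytes per reference, 4 bytes per int) are simply the values measured on the system above, not universal constants. BoxedListEstimate and its methods are names invented for this example.

```java
public class BoxedListEstimate
{
    // Rough estimate: one boxed object plus one reference per element.
    static long boxedListBytes(int n, int bytesPerInteger, int bytesPerReference)
    {
        return (long) n * bytesPerInteger + (long) n * bytesPerReference;
    }

    // Payload of an int[] with n elements, without the array header.
    static long intArrayPayloadBytes(int n)
    {
        return (long) n * 4;
    }

    public static void main(String[] args)
    {
        long list = boxedListBytes(1024, 24, 8);
        long array = intArrayPayloadBytes(1024);
        System.out.println("list:  " + list + " bytes");
        System.out.println("array: " + array + " bytes");
        System.out.println("ratio: " + (list / array));
    }
}
```

On a VM with different object layouts the inputs change, but the structure of the estimate stays the same.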
So what do I suggest considering the disturbing facts outlined above?
- Prefer primitive types to boxed primitives as suggested by EffJava2 Item 49. Not only are they more efficient, the fact that they cannot be null makes them more attractive for other reasons too (albeit not always).
- If you use wrapped primitives, prefer autoboxing or the Foo.valueOf(foo f) factory methods to direct constructor calls, as this is (citing the Javadocs) “likely to yield significantly better space and time performance by caching frequently requested values”. This is especially important for Boolean, where autoboxing or valueOf(…) never creates a new object.
- Be aware of the fact that Collections containing boxed primitives are huge memory wasters compared to arrays, even when caching is considered, but keep in mind that premature optimization is the root of all evil.
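The caching mentioned in the second point can be observed directly by comparing references. The sketch below only relies on what the Javadocs promise: Integer.valueOf always caches values from -128 to 127, and Boolean.valueOf returns Boolean.TRUE or Boolean.FALSE. The class and method names are made up for this example, and comparing boxed values with == is done here on purpose, not as a recommendation.

```java
public class ValueOfCaching
{
    // True if two valueOf calls hand out the very same Integer instance.
    static boolean sameInteger(int value)
    {
        return Integer.valueOf(value) == Integer.valueOf(value);
    }

    // True if valueOf hands out one of the two canonical Boolean instances.
    static boolean sameBoolean(boolean value)
    {
        return Boolean.valueOf(value) == (value ? Boolean.TRUE : Boolean.FALSE);
    }

    public static void main(String[] args)
    {
        // Guaranteed to be cached: -128 to 127.
        System.out.println("127 cached:  " + sameInteger(127));
        // Outside that range the result depends on the VM and its settings.
        System.out.println("1000 cached: " + sameInteger(1000));
        System.out.println("true cached: " + sameBoolean(true));
    }
}
```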