A Note About Sets and Maps

February 18, 2018

Recently I’ve repeatedly stumbled over code like this:

//...
Set<String> someSet = new HashSet<>(2);
someSet.add("foo");
someSet.add("bar");
doSomething(someSet);
//...

Obviously the intention here was to initialize a HashSet for 2 elements. What’s not so obvious is that this code actually allocates a HashSet that can hold at most a single element before it has to grow, and that is subsequently resized to hold up to 3 elements when "bar" is added. The explanation for this rather counterintuitive behavior can be found in the Javadocs, which state

public HashSet(int initialCapacity)
Constructs a new, empty set; the backing HashMap instance has the specified initial capacity and default load factor (0.75).

And while this is still slightly vague about the concrete meaning of initialCapacity, the parameter description is rather explicit:

initialCapacity – the initial capacity of the hash table

This means that rather than telling you how many elements you can put into your set before it might have to be resized, this parameter specifies the size of the internally used hash table, which, given the default load factor of 0.75, will be enlarged as soon as there are more than 0.75*TABLE_SIZE elements in the set (even if some or all of them end up in the same hash bucket). Thus, to correctly initialize a HashSet for 2 elements, one has to call

Set<String> someSet = new HashSet<>(3);

which does the same as

Set<String> someSet = new HashSet<>(4);

since the internal table size is always a power of 2.
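The arithmetic behind these numbers can be sketched in plain Java. The class below is not the JDK's code; it is a simplified model that assumes exactly the behavior described above (table size rounded up to a power of 2, resize as soon as the size exceeds 0.75 times the table size). The names HashCapacitySketch, tableSizeFor, maxElementsWithoutResize and capacityFor are made up for illustration.

```java
import java.util.HashSet;
import java.util.Set;

class HashCapacitySketch {
    // Default load factor quoted in the Javadocs.
    static final float LOAD_FACTOR = 0.75f;

    /** Modeled table size: the smallest power of 2 that is >= initialCapacity (at least 1). */
    static int tableSizeFor(int initialCapacity) {
        int size = 1;
        while (size < initialCapacity) {
            size <<= 1;
        }
        return size;
    }

    /** Number of elements the modeled table can hold before it has to grow. */
    static int maxElementsWithoutResize(int initialCapacity) {
        return (int) (tableSizeFor(initialCapacity) * LOAD_FACTOR);
    }

    /** An initialCapacity argument that is large enough for expectedSize elements. */
    static int capacityFor(int expectedSize) {
        return (int) Math.ceil(expectedSize / (double) LOAD_FACTOR);
    }

    public static void main(String[] args) {
        System.out.println(maxElementsWithoutResize(2)); // 1
        System.out.println(maxElementsWithoutResize(3)); // 3
        System.out.println(capacityFor(2));              // 3

        // Sized so that, under the model above, adding 2 elements needs no resize.
        Set<String> someSet = new HashSet<>(capacityFor(2));
        someSet.add("foo");
        someSet.add("bar");
        System.out.println(someSet.size());              // 2
    }
}
```

Dividing the expected size by the load factor and rounding up is only one possible way to pick the argument; any value that maps to the same power of 2 behaves identically.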

I’d recommend using com.google.common.collect.Sets#newHashSetWithExpectedSize instead, though, which addresses exactly the problem discussed here:

public static <E> HashSet<E> newHashSetWithExpectedSize(int expectedSize)
Returns a new hash set using the smallest initial table size that can hold expectedSize elements without resizing. Note that this is not what HashSet.HashSet(int) does, but it is what most users want and expect it to do.

If the set does not have to be modified afterwards, an even better alternative that also leads to more concise code is to use com.google.common.collect.ImmutableSet#of(E, E).

Last but not least, it should be mentioned that exactly the same reasoning applies to HashMap, which is actually used to implement HashSet internally.

Categories: Java, Programming

A Note on Netstat

April 15, 2014

I’ve just published another blog post for codecentric:
https://blog.codecentric.de/en/2014/04/note-netstat/

Categories: Linux

It’s About Time

January 27, 2014

I’ve just published a brief discussion about the new Java 8 Date-Time API for my new employer: https://blog.codecentric.de/en/2014/01/time/

Categories: Programming

Good Objects Breaking Bad

July 4, 2013

The purpose of this blog post is to look at two typical problems that may arise when different versions of the same class may be loaded by the JVM. Abstractly speaking, I’m going to examine the following two scenarios in detail:

  1. Two versions of the same class, but originating from different class loaders, come in contact with each other.
  2. There are two versions of the same class available on the class path. Only one gets loaded, but it might be hard or even impossible to tell which.

As the second scenario is far more common, I’m going to discuss it first. It often leads to seemingly impossible NoSuchMethodErrors. To understand why that happens, here is how this issue can be reproduced easily:

  1. Create a library, Lib1.jar, that contains exactly this class:

    package com.wordpress.mlangc;
    
    public class GoodObject {
       
    }
    
  2. Create another library, Lib2.jar, that contains an extended version of GoodObject:

    package com.wordpress.mlangc;
    
    public class GoodObject {
        public int getExcitingNewFeature() {
            return 42;
        }
    }
    
  3. Create a third library, Lib3.jar, that depends on Lib2.jar and contains this nifty piece of code:

    package com.wordpress.mlangc;
    
    public class BetterObject {
        private GoodObject goodObject = new GoodObject();
        
        public int doSomethingAwfullyExcitingAndNew() {
            return goodObject.getExcitingNewFeature();
        }
    }
    
  4. Now add the libraries you’ve just created, Lib1.jar, Lib2.jar and Lib3.jar, to your class path and run the following:

    package com.wordpress.mlangc;
    
    public class B0rkage {
        public static void main(String[] args) {
            System.out.println("Feel free to try this at home: " 
                + new BetterObject().doSomethingAwfullyExcitingAndNew());
        }
    }
    

    You should see an error message like

    Exception in thread "main" java.lang.NoSuchMethodError: com.wordpress.mlangc.GoodObject.getExcitingNewFeature()I
    	at com.wordpress.mlangc.BetterObject.doSomethingAwfullyExcitingAndNew(BetterObject.java:7)
    

    If not, make sure that you’ve added the libraries to the class path in the “correct” order. Indeed, if Lib2.jar is loaded before Lib1.jar, like here

    java -cp "${deps}/Lib2.jar:${deps}/Lib1.jar:${deps}/Lib3.jar:." com.wordpress.mlangc.B0rkage
    

    the program works just fine. Note that your IDE should allow you to specify the order in which JARs are loaded.

As I feel that the example above is rather self-explanatory, I won’t elaborate on it in detail. What matters is simply which version of GoodObject is found first, and this in turn depends on the order in which your JARs are scanned. While you can easily control the order in this example, things might be far trickier in a real application. Also note that, as far as I know, the order in which JARs in WEB-INF/lib are scanned is undefined.
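When it is unclear which copy of a class won, asking the class itself where it came from can help. The following is a minimal sketch, not part of the original example: the class WhereFrom and its method locationOf are hypothetical names, and the approach assumes that the class has a code source at all, which is usually not the case for classes that ship with the JDK.

```java
import java.security.CodeSource;

class WhereFrom {
    /** Describes the JAR or directory the JVM loaded the given class from. */
    static String locationOf(Class<?> type) {
        CodeSource source = type.getProtectionDomain().getCodeSource();
        if (source == null || source.getLocation() == null) {
            return "(no code source, typically a JDK class)";
        }
        return source.getLocation().toString();
    }

    public static void main(String[] args) {
        // Prints the location this class was loaded from.
        System.out.println(locationOf(WhereFrom.class));
        // Classes from the JDK itself usually have no code source.
        System.out.println(locationOf(String.class));
    }
}
```

Calling locationOf(GoodObject.class) from the B0rkage program above would then show whether Lib1.jar or Lib2.jar was picked.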

The second issue, namely that two versions of the same class originating from different class loaders come into contact with each other, is less common, but might lead to extremely surprising behavior. I’m going to demonstrate this with two examples. Both result in a ClassCastException with the seemingly nonsensical error message

java.lang.ClassCastException: com.wordpress.mlangc.GoodObject cannot be cast to com.wordpress.mlangc.GoodObject

The first example uses basic Java only. Compared to the second example it’s extremely artificial (you would normally never do such a thing intentionally or unintentionally), but it shows the issue at hand very clearly. It goes like this:

  1. Create a library, GoodLib.jar containing exactly this class:

    package com.wordpress.mlangc;
    
    public class GoodObject {
       
    }
    
  2. Execute

    public final class WhenGoodObjectsGoBad {
    
        public static void main(String[] args) {
            System.out.println("Best served cold: " + goodObjectFromOtherClassLoader());
        }
        
        private static GoodObject goodObjectFromOtherClassLoader() {
            try(URLClassLoader cl = new URLClassLoader(new URL[] { new URL(pathToJarWithGoodObject()) }, null)) {
                Class<?> goodObjectClass = cl.loadClass(GoodObject.class.getName());
                return (GoodObject) goodObjectClass.newInstance();
            } catch (IOException | ClassNotFoundException | InstantiationException | IllegalAccessException e) {
                throw new RuntimeException(e);
            }
        }
        
        private static String pathToJarWithGoodObject() {
            String relPath = "/" + GoodObject.class.getName().replace(".", "/") + ".class";
            String absPath = GoodObject.class.getResource(relPath).toExternalForm();
            String jarPath = absPath.substring(0, absPath.length() - relPath.length() + 1);
            return jarPath;
        }
    }
    

    with GoodLib.jar in your class path. You should see an error message like

    Exception in thread "main" java.lang.ClassCastException: com.wordpress.mlangc.GoodObject cannot be cast to com.wordpress.mlangc.GoodObject
    	at com.wordpress.mlangc.WhenGoodObjectsGoBad.goodObjectFromOtherClassLoader(WhenGoodObjectsGoBad.java:16)
    	at com.wordpress.mlangc.WhenGoodObjectsGoBad.main(WhenGoodObjectsGoBad.java:10)
    

Don’t pay too much attention to the code in pathToJarWithGoodObject(). What matters is that we find the path to the JAR that contains GoodObject in a portable manner. This path is then used to construct a second class loader in line 8. Note that I explicitly pass null for the parent class loader, so the new class loader does not delegate to the class loader that loaded our main class. The new class loader is employed to load GoodObject once again, which directly leads to the ClassCastException mentioned above. If you take a look at the relevant parts of The Java® Virtual Machine Specification, it shouldn’t be that surprising that the code above is asking for serious trouble. Here is a quote from chapter 5.3. Creation and Loading:

At run time, a class or interface is determined not by its name alone, but by a pair: its binary name (§4.2.1) and its defining class loader.

The cast in line 10 fails because it attempts to cast a GoodObject originating from a foreign class loader to a GoodObject tied to the class loader that loaded our main class. The error message would be clearer if it included the involved class loaders, but with the necessary background information it’s pretty easy to interpret as it is.
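To make the involved class loaders visible, a small helper can build the kind of message wished for above. This is only a sketch under the assumption that the name of a class together with its defining class loader is enough to tell the two GoodObject versions apart; LoaderAwareMessages, describe and castOrExplain are made-up names.

```java
class LoaderAwareMessages {
    /** Returns the class name together with its defining class loader. */
    static String describe(Class<?> type) {
        ClassLoader loader = type.getClassLoader();
        // getClassLoader() returns null for classes defined by the bootstrap class loader.
        String loaderName = (loader == null) ? "bootstrap class loader" : loader.toString();
        return type.getName() + " [" + loaderName + "]";
    }

    /** Casts value to target, naming both class loaders if the cast is not possible. */
    static <T> T castOrExplain(Object value, Class<T> target) {
        if (value != null && !target.isInstance(value)) {
            throw new ClassCastException(
                describe(value.getClass()) + " cannot be cast to " + describe(target));
        }
        return target.cast(value);
    }

    public static void main(String[] args) {
        System.out.println(describe(String.class)); // java.lang.String [bootstrap class loader]
        System.out.println(describe(LoaderAwareMessages.class));
    }
}
```

Used in place of the plain cast in goodObjectFromOtherClassLoader(), castOrExplain(goodObjectClass.newInstance(), GoodObject.class) would report the same class name twice, but with two different class loaders.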

Before closing this discussion I want to show you another, more realistic scenario that I’ve actually seen in the wild, where two identical classes are loaded by different class loaders. It involves a custom Tomcat Valve and a simple web application:

  1. Create a Java library, GoodValve.jar, containing

    package com.wordpress.mlangc;
    
    import java.io.IOException;
    
    import javax.servlet.ServletException;
    
    import org.apache.catalina.connector.Request;
    import org.apache.catalina.connector.Response;
    import org.apache.catalina.valves.ValveBase;
    
    public class GoodValve extends ValveBase {
        @Override
        public void invoke(Request request, Response response) throws IOException, ServletException {
            request.setAttribute("goodObject", new GoodObject());
            getNext().invoke(request, response);
        }
    }
    

    as well as the GoodObject class from above. You need some Tomcat APIs to compile this code. If you are using Maven, adding the following dependency should do the trick:

    <dependency>
      <groupId>org.apache.tomcat</groupId>
      <artifactId>tomcat-catalina</artifactId>
      <version>7.0.40</version>
      <scope>provided</scope>
    </dependency>
    
  2. Put the generated JAR file in ${catalina.home}/lib and add the following line to the appropriate Host section in your ${catalina.home}/conf/server.xml:

    <Valve className="com.wordpress.mlangc.GoodValve"/>
    
  3. Create a WAR file that contains the following servlet

    package com.wordpress.mlangc;
    
    import java.io.IOException;
    
    import javax.servlet.ServletException;
    import javax.servlet.annotation.WebServlet;
    import javax.servlet.http.HttpServlet;
    import javax.servlet.http.HttpServletRequest;
    import javax.servlet.http.HttpServletResponse;
    
    @WebServlet("/*")
    public class GoodServlet extends HttpServlet {
        
        @Override
        protected void doGet(HttpServletRequest req, HttpServletResponse resp) throws ServletException, IOException {
            doPost(req, resp);
        }
        
        @Override
        protected void doPost(HttpServletRequest req, HttpServletResponse resp) throws ServletException, IOException {
            resp.setContentType("text/plain;charset=utf-8");
            try {
                GoodObject goodObject = (GoodObject) req.getAttribute("goodObject");
                resp.getWriter().print("Mmmh, let's take a look: " + goodObject);
            } catch(ClassCastException e) {
                e.printStackTrace(resp.getWriter());
                resp.setStatus(HttpServletResponse.SC_INTERNAL_SERVER_ERROR);
            }
        }
    }
    

    and of course another copy of our GoodObject class.

  4. Deploy the WAR file you’ve just created into the Tomcat installation you’ve modified before, start the server and try to open a page from the deployed WAR file. Voilà, your browser should now tell you that

    java.lang.ClassCastException: com.wordpress.mlangc.GoodObject cannot be cast to com.wordpress.mlangc.GoodObject
    

Again, the key to this example is that the same class is loaded by different class loaders. The GoodObject in GoodValve is loaded from GoodValve.jar by the Tomcat Common Class Loader, while the GoodObject in GoodServlet is loaded by the Webapp Class Loader from the WAR file. Take a look at the Tomcat 7 Class Loader HOW-TO if you want to know the gory details.

Categories: Programming

Be Careful When Hacking Into A Java Class

June 3, 2013

I recently stumbled over a blog post that explains the usage of Java Anonymous Classes. Not being very happy with the example that was given to show the discussed language feature in action, I left a rather critical comment which I want to explain in more detail:

Tweaking existing classes on the fly is a legitimate use of Java Anonymous Classes. Still, as doing so involves inheritance, the usual rules apply:

  • The anonymous class should be in an is-a relationship to its parent. This is not true for the LinkedList in the example given here, as the tweaked add method clearly violates the specification of Collection.add(…).
  • Inheriting from classes that are not designed for inheritance is dangerous because, unlike composition and method invocation, inheritance violates encapsulation. By carelessly inheriting from a random class, you might create strong ties to the implementation of said class without even knowing it. Again referring to the queue from here, imagine what would happen if somebody called queue.addAll(…). As it turns out, this very much depends on whether LinkedList.addAll(…) is implemented using LinkedList.add(…) or not, which of course is an implementation detail that is not mentioned in the Javadocs. It might vary between different class library vendors and versions. I strongly recommend reading Items 16 and 17 from Effective Java in this context.
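The addAll(…) concern from the second point can be tried out directly. The following is only a sketch: the class AddAllSurprise and its two helper methods are made up for illustration, and the numbers in the comments are what current OpenJDK implementations produce, which is exactly the kind of detail the Javadocs do not promise.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.LinkedList;
import java.util.concurrent.atomic.AtomicInteger;

class AddAllSurprise {
    /** Counts how often the overridden add(...) runs while addAll(...) adds 3 elements to a LinkedList. */
    static int addCallsDuringLinkedListAddAll() {
        final AtomicInteger calls = new AtomicInteger();
        Collection<String> list = new LinkedList<String>() {
            @Override
            public boolean add(String element) {
                calls.incrementAndGet();
                return super.add(element);
            }
        };
        list.addAll(Arrays.asList("a", "b", "c"));
        return calls.get();
    }

    /** The same experiment with a HashSet as the parent class. */
    static int addCallsDuringHashSetAddAll() {
        final AtomicInteger calls = new AtomicInteger();
        Collection<String> set = new HashSet<String>() {
            @Override
            public boolean add(String element) {
                calls.incrementAndGet();
                return super.add(element);
            }
        };
        set.addAll(Arrays.asList("a", "b", "c"));
        return calls.get();
    }

    public static void main(String[] args) {
        // OpenJDK: 0, because LinkedList.addAll(...) links the new nodes without calling add(...).
        System.out.println(addCallsDuringLinkedListAddAll());
        // OpenJDK: 3, because the inherited addAll(...) calls add(...) for every element.
        System.out.println(addCallsDuringHashSetAddAll());
    }
}
```

The tweaked add method is silently bypassed in one case and honored in the other, although both anonymous classes look equally harmless.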

Having said that, I still have to explain why I would rather use

private static final ThreadLocal<Integer> someThreadLocalInt =
    new ThreadLocal<Integer>() {
      @Override 
      protected Integer initialValue() {
        return 42;
      }
    };

to illustrate a typical use of Java Anonymous Classes for tweaking an existing implementation. The reason for this is that ThreadLocal is designed for inheritance, while LinkedList is not. I know this, because the Javadocs for ThreadLocal contain information for clients who want to extend this class while the Javadocs for LinkedList don’t.

Categories: Programming

IP SQUARE Commons Servlet 2.1.0 Released

March 29, 2013

I’ve just released IP SQUARE Commons Servlet 2.1.0 to the Central Maven Repository:

Categories: Programming

IP SQUARE Commons Hibernate 2.0.1 Released

March 29, 2013

I’ve just released IP SQUARE Commons Hibernate 2.1.0 to the Central Maven Repository:

  • The library now comes with a manual and improved Javadocs.
  • The implementation takes advantage of APIs added with IP SQUARE Commons Core 2.1.0.

Categories: Programming

IP SQUARE Commons Core 2.1.0 Released

March 29, 2013

I’ve just released IP SQUARE Commons Core 2.1.0 to the Central Maven Repository:

Categories: Programming

Grundlagen der Darstellungstheorie von endlichen Gruppen

March 24, 2013

Since quite a bit of work went into it, I have decided to put my bachelor’s thesis in mathematics on the foundations of the representation theory of finite groups (Grundlagen der Darstellungstheorie von endlichen Gruppen) online under a Creative Commons license. Maybe someone will find it useful. If you find errors in it, and there are certainly plenty of them, please let me know, even if they are trivial.

Categories: Mathematics

Series About Java Concurrency – Pt. 6

February 15, 2013

This is the solution to my previous post, so make sure that you read it before you continue. For your convenience, here is the program we are going to discuss once again:

import java.util.concurrent.atomic.AtomicInteger;

public class ThreadNoMore extends Thread
{
    private static final int N_THREADS = 16;
    private static final AtomicInteger threadsStarted = new AtomicInteger();
    
    private final long ctorThreadId = Thread.currentThread().getId();
    
    @Override
    public synchronized void start()
    {
        if(ctorThreadId != Thread.currentThread().getId())
            threadsStarted.incrementAndGet();
    }
    
    public static void main(String[] args) throws InterruptedException
    {
        Thread[] threads = new Thread[N_THREADS];
        for(int i = 0; i != N_THREADS; ++i)
            threads[i] = new ThreadNoMore();
        
        for(Thread th : threads)
            th.start();
        
        for(Thread th : threads)
            th.join();
        
        System.out.println("threadsStarted: " + threadsStarted);
    }
}

As already pointed out by Ortwin Glück, the program always prints

threadsStarted: 0

because ThreadNoMore.start() is always executed in the thread that invoked it. And if you think about it twice, this should not surprise you at all, as methods are always executed in the thread that invoked them. This even applies to Thread.run(), which is the method we should have overridden instead, but unlike Thread.start(), Thread.run() is normally invoked by the Java Virtual Machine, as the Javadocs for Thread.start() tell us:

Causes this thread to begin execution; the Java Virtual Machine calls the run method of this thread.

So apart from not doing what we might have intended, carelessly overriding Thread.start(), that is, without calling super.start() so that the JVM can do its magic, leaves us with a crippled class that no longer has anything to do with a thread at all. Considering these facts, it should not take you by surprise that

import java.util.concurrent.atomic.AtomicInteger;

public class Thrap extends Thread
{
    private static final int N_THREADS = 42;
    
    private static final AtomicInteger 
        started = new AtomicInteger(),
        ran = new AtomicInteger();
    
    @Override
    public synchronized void start()
    {
        started.incrementAndGet();
    }
    
    @Override
    public void run()
    {
        ran.incrementAndGet();
    }
    
    public static void main(String[] args) throws InterruptedException
    {
        Thread[] threads = new Thread[N_THREADS];
        for(int i = 0; i != N_THREADS; ++i)
            threads[i] = new Thrap();
        
        for(Thread thread : threads)
            thread.start();
        
        for(Thread thread : threads)
            thread.join();
        
        System.out.println("started: " + started);
        System.out.println("ran: " + ran);
    }
}

results in

started: 42
ran: 0

being written to your terminal. As you can clearly see, Thrap.run() is not executed at all, neither from a newly created thread nor from the main thread. To fix the code above, you have to call super.start() in Thrap.start() like so:

    @Override
    public synchronized void start()
    {
        super.start();
        started.incrementAndGet();
    }

After this modification you get

started: 42
ran: 42

as expected.

So what can we learn from all of this? First of all, there are two things to remember about the Java threads API:

Equally important are the consequences for API design: Interfaces should be easy to use correctly and hard to use incorrectly. Unfortunately, the Thread class violates this principle, as it is quite easy to misuse, as we have just seen. More generally, requiring clients to call the super version of a method they are overriding is considered to be an anti-pattern for this very reason.
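One way to stay clear of this trap is to not subclass Thread at all and to hand a Runnable to the Thread constructor instead, so that there is no start() method left to override incorrectly. The following sketch mirrors the Thrap example under that assumption; the class RunnableInsteadOfSubclass and its method runAll are hypothetical names.

```java
import java.util.concurrent.atomic.AtomicInteger;

class RunnableInsteadOfSubclass
{
    /** Starts nThreads threads that each bump a counter once and returns the final count. */
    static int runAll(int nThreads)
    {
        final AtomicInteger ran = new AtomicInteger();

        Thread[] threads = new Thread[nThreads];
        for(int i = 0; i != nThreads; ++i)
            threads[i] = new Thread(() -> ran.incrementAndGet());

        for(Thread thread : threads)
            thread.start();

        try
        {
            // join() makes sure every thread has finished before the counter is read.
            for(Thread thread : threads)
                thread.join();
        }
        catch(InterruptedException e)
        {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for threads", e);
        }

        return ran.get();
    }

    public static void main(String[] args)
    {
        System.out.println("ran: " + runAll(42)); // ran: 42
    }
}
```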

Categories: Programming