I’ve been taking the algorithms class with Robert Sedgewick through Coursera. It’s been a rather humbling experience. However, it made me begin to question the algorithms and datastructures which I turn to everyday. How good does the JDK implementation perform? Are there superior drop-in replacements? If so, when should I use one versus another?
With so many questions and no answers, I decided to put them to the test along side some alternate and generally available collection libraries.
In this installment, I will focus on the list.
The Candidates
Notably missing from the list of contestants is Trove. I chose not to assess Trove because I was limiting the test to drop-in replacements to a generic List. Trove creates different implementations specific to the primitive type which it contains.
Java collections
In this benchmark I am using the default Java collections classes supplied with and running on the 1.7.0.06 version of the HotSpot JDK. The two list implementations we’ll be looking at here are:
- ArrayList
- LinkedList
Apache commons collections
We’ll also be taking a look at the default list collections supplied with the 3.2.1 version of the Apache Commons collections libraries. The list candidates we’ll be looking at are:
- TreeList
- GrowthList
- NodeCachingLinkedList
Javalution
We’ll also take a look at the Javalution FastList implementation.
Guava
Last, but not least, we’ll be benchmarking the Guava list implementation.
The Benchmark
The benchmark will consist of a series of tests on lists of various sizes with a list payload of type Integer.
First, the test will acquiesce the machine by calling an explicit System.gc and pausing for an interval before beginning the test. Then, the test will:
- add – Create a list of size N.
- get – Get each of the N items from the list.
- iterate – Iterate over each of the items in the list.
- randomInsert – Insert N items into the list at randomly generated locations.
- removeFromHead – Remove N items from the beginning of the list.
- removeFromTail – Remove N items from the end of the list.
- addToTail – Add N items to the end of the list.
- randomRemove – Remove N items from random locations within the list.
- memory – Record the deep memory usage of the list.
- clearMemory – Call clear(), then record the deep memory usage of the list.
Tests were conducted for the for N = [1000, 10000, 100000, 200000 ] though I stop at 100,000 in cases where the trends are evidient.
The Results
In this benchmark, we start with an empty list then iterate with I from zero to N-1 adding an unique integer value I to the list.
The TreeList has the most overhead in creating the list. FastList also has some overhead associated with it.
At larger values of N, the performance of Java Link List, FastList and NodeCachingLinkList begin to suffer. Linked list implementations tend to have poor get performance due to the fact that they must iterate over each element in the list until they find the location of the item in which they seek.
FastList appears to be link list based, however, with much worse performance at higher values of N.
All of the implementations are within an order of magnitude of each other with respect to iterating. ArrayList and GrowthList being the worst performers.
In this test, we insert N items into a list of N items with a result being a list of size 2N. In this benchmark we begin to see some drastic design tradeoffs in the various list implementations.
In order to avoid use of a logarithmic scale, I divided the chart into 2 separate charts to prevent larger values from obscuring smaller values.
Even with lists of 10k or smaller we see that TreeList is the fastest by far. FastList is the slowest. GuavaList performs admirably as well, beating the Java ArrayList substantially.
At even higher values of N, we can see which implementations should be avoided if the number of items we will store is large, and if random insertion is a common operation.
Random removal of N items from a list of size 2N reveals 3 poorly performing implementations:
Java ArrayList, GrowthList and GuavaList. It is my educated guess that these three implementations are backed with some sort of resizable array which must be shifted periodically in order to efficiently reclaim memory.
Removal from the end of a list appears to be the worst case for the FastList.
In this benchmark we start with a list of 2N items, then remove N of them from random locations within the list.
In this test, we see three clusters of behavior:
- TreeList which has superb performance.
- ArrayList, GrowthList and GuavaList which have good performance.
- LinkedList, FastList and NodeCachingLinkedList which have relatively poor performance.
In this benchmark we seem to find the catch in TreeList. It has overhead relative to creation time and performs worse than it’s counterparts.
Last, but not least, we measure the memory overhead of the different list structures. Guava List and Growth Lis really shine here. They appear to have very similar behaviors in most respects.
TreeList has additional memory overhead as does FastList.
In this benchmark we start with a list of N items, then call clear() and measure it’s memory usage. This is a good way to measure how efficient/aggressive the structure is with respect to cleaning up after itself.
As we expect, resizable array implementations leave some buffer stranded while linked list implementations clean up pretty effectively.
FastList does not seem to perform much cleanup after itself.
So Which List Is Right For You?
It depends.
Each implementation has tradeoffs, some of them quite severe and some of them really shine in certain situations.
If you are memory constrained, you need look no further than your java.util.ArrayList. It has great memory performance. GrowthList and GuavaList have comparable memory profiles. GrowthList and Guava Lists are superior to ArrayList for the case where there are many interstitial insertions into the list taking place and all three are poor performers if you need to remove items from the head of the list.
If memory is not a problem and you expect to perform many insertions and deletions upon the list, then the obvious choice is the Apache Commons TreeList.
While having a slightly larger insertion at the tail overhead, it more than makes up for this in the area of random insertions and deletions.
Javalution FastList was anything but, faring significantly worse than the rest. I suspect that it was once quite superior to the default JDK implementations, but has since been surpassed.
The Code
package com.javainc.collections;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedList;
import java.util.List;
import java.util.Random;
import javolution.util.FastList;
import org.apache.commons.collections.list.GrowthList;
import org.apache.commons.collections.list.NodeCachingLinkedList;
import org.apache.commons.collections.list.TreeList;
import com.google.common.collect.Lists;
import com.javamex.classmexer.MemoryUtil;
public class TestList
{
private static void testPerformance(String testName, int numItems,
int numRandom, List list)
{
long startTime, endTime;
int i;
Random rand = new Random();
int size = numItems;
cleanup();
// add:
System.out.print(testName + "," + numItems + "," + numRandom + ",");
startTime = System.currentTimeMillis();
for (i = 0; i < numItems; i++)
{
list.add(i);
}
endTime = System.currentTimeMillis();
System.out.print(endTime - startTime);
// Get:
startTime = System.currentTimeMillis();
for (i = 0; i < numItems; i++)
{
list.get(i);
}
endTime = System.currentTimeMillis();
System.out.print("," + (endTime - startTime));
// ITERATE
startTime = System.currentTimeMillis();
for (Object obj : list)
{
}
endTime = System.currentTimeMillis();
System.out.print("," + (endTime - startTime));
// RANDOM_INSERT
size = list.size();
startTime = System.currentTimeMillis();
for (i = 0; i < numRandom; i++)
{
list.add(rand.nextInt(size++), i);
}
endTime = System.currentTimeMillis();
System.out.print("," + (endTime - startTime));
// REMOVE_FROM_HEAD
startTime = System.currentTimeMillis();
Iterator it = list.iterator();
for (i = 0; i < numRandom; i++)
{
list.remove(0);
}
endTime = System.currentTimeMillis();
System.out.print("," + (endTime - startTime));
// Ensure the list is the original size.
for (i = 0; i < numRandom; i++)
{
list.add(i);
}
endTime = System.currentTimeMillis();
// REMOVE_FROM_TAIL
size = list.size();
startTime = System.currentTimeMillis();
for (i = 0; i < numRandom; i++)
{
list.remove(size - 1);
size--;
}
endTime = System.currentTimeMillis();
System.out.print("," + (endTime - startTime));
// Ensure the list is the original size.
for (i = 0; i < numRandom; i++)
{
list.add(i);
}
endTime = System.currentTimeMillis();
// ADD TO END
startTime = System.currentTimeMillis();
for (i = 0; i < numRandom && it.hasNext(); i++)
{
list.add(i);
}
endTime = System.currentTimeMillis();
System.out.print("," + (endTime - startTime));
// REMOVE_RANDOM
startTime = System.currentTimeMillis();
for (i = 0; i < numRandom && it.hasNext(); i++)
{
list.remove(rand.nextInt(size));
size--;
}
endTime = System.currentTimeMillis();
System.out.print("," + (endTime - startTime));
System.out.print("," + MemoryUtil.deepMemoryUsageOf(list));
list.clear();
System.out.println("," + MemoryUtil.deepMemoryUsageOf(list));
}
private static void cleanup()
{
// Clean up the machine.
System.gc();
try
{
Thread.sleep(2000);
}
catch(Exception ex)
{
ex.printStackTrace();
}
}
public static void main(String args[])
{
System.out
.println("CLASS,ITEMS,RANDOM,add,get,iterate,randomInsert,removeFromHead"
+ ",removeFromTail,addToTail,randomRemove,memory,clearMemory");
int testSet[][] = { { 100, 100 }, { 1000, 1000 }, { 10000, 10000 },
{ 100000, 100000 }, { 200000, 200000 } };
for (int i = 0; i < testSet.length; i++)
{
testPerformance("Java ArrayList", testSet[i][0], testSet[i][1],
new ArrayList());
testPerformance("Java LinkedList", testSet[i][0], testSet[i][1],
new LinkedList());
testPerformance("Apache GrowthList", testSet[i][0], testSet[i][1],
new GrowthList());
testPerformance("Apache TreeList", testSet[i][0], testSet[i][1],
new TreeList());
testPerformance("Apache NodeCachingLinkedList", testSet[i][0],
testSet[i][1], new NodeCachingLinkedList());
testPerformance("Javalution FastList", testSet[i][0], testSet[i][1],
new FastList());
testPerformance("Guava List", testSet[i][0], testSet[i][1],
Lists.newArrayList());
}
}
}

























