Time complexity of array/list operations [Java, Python]

yourbasic.org

To write fast code, you must know the difference between constant and linear time array operations.

Scooter overtaking taxi at high speed

Accidentally quadratic list code is very common and can be hard to spot, but when the list grows your code grinds to a halt.

This text takes a detailed look at the performance of basic array operations and discusses alternatives to a standard array. It also includes cheat sheets of expen­sive list operations in Java and Python.

Array basics

An array is the most fundamental collection data type. It consists of elements of a single type laid out sequentially in memory. You can access any element in constant time by integer indexing.

Arrays are available in all major languages. In Java you can either use []-notation, or the more expressive ArrayList class. In Python, the list data type is imple­mented as an array.

Performance

In general, arrays have excellent performance. To optimize array performance is a major goal of memory hardware design and OS memory management.

Dynamic array

In a dynamic array, elements are stored at the start of an underlying fixed array, and the remaining positions are unused.

Here’s a view of the memory when appending the elements 2, 7, 1, 3, 8, 4 to an initially empty dynamic array with capacity 2.

Dynamic array example

The time to append an element is linear in the worst case, since it involves allocating new memory and copying each element.

However, if we expand the array by a constant proportion, e.g. by doubling its size, the total time to insert n elements will be O(n), and we say that each insertion takes constant amortized time.

See Amortized time complexity for more on how to analyze data structures that have expensive operations that happen only rarely.

Expensive list operations

To add or remove an element at a specified index can be expensive, since all elements after the index must be shifted. The worst-case time complexity is linear.

Similarly, searching for an element for an element can be expensive, since you may need to scan the entire array.

In this Python code example, the linear-time pop(0) call, which deletes the first element of a list, leads to highly inefficient code:

Warning: This code has quadratic time complexity. It runs in time Θ(n2), where n is the initial length of the list a. This means that the program is useful only for short lists, with at most a few thousand elements.

while len(a) > 0:
    foo = a.pop(0)

To avoid this type of performance problems, you need to know the difference between constant and linear time list operations.

Expensive Java ArrayList methods

The following ArrayList methods operate on a subset of the elements, but still have time complexity that depends on the size n of the list.

Method Worst-case time
add(int i, E element) n - i
remove(int i) n - i
removeRange(int i, int j) n - i
remove(Object o) n
contains(Object o) n
indexOf(Object o) n
lastIndexOf(Object o) n

Note: add(E element) takes constant amortized time, even though the worst-case time is linear.

Expensive Python list operations

The following Python list operations operate on a subset of the elements, but still have time complexity that depends on n = len(a).

Code Worst-case time
a.insert(i, x) n - i
a.pop(i) n - i
del a[i] n - i
del a[i:j] n - i
a.remove(x) n
a.index(x) n

Note: a.append(x) takes constant amortized time, even though the worst-case time is linear.

Alternatives

No other data structure can compete with the efficiency of array indexing and array iteration. However, you may need to take a different approach if other operations are performance critical.

Maps and dictionaries

The hash table, often in the form of a map or a dictionary, is the most commonly used alternative to an array. It implements an unordered collection of key-value pairs, where each key is unique.

In Java, hash tables are part of the standard library (HashSet and HashMap). Many modern languages, such as Python and Go, have built-in dictionaries and maps implemented by hash tables.

Sorted arrays

If search is important for performance, you may want to use a sorted array.

The Java Arrays class contains implementations of binary search, Python offers a similar bisect algorithm, and Go also has several binary search methods.

Linked lists

Singly linked list example

If you need to repeatedly add or remove elements at the start or end of a list, you may want to consider a linked list.

The Java LinkedList class implements a doubly linked list, Python offers a deque, and Go also has a list package.

Binary search trees

Balanced binary search trees store items in sorted order and offer effi­cient lookup, addi­tion and remo­val of items.

In Java, search trees are part of the standard library (TreeSet and TreeMap), while Python and Go don’t support them out of the box.

Share this page: