
Data Structures


Linked Lists

Definition: A linear collection of nodes in which each node links to the next node.


Singly Linked List

Each node has a single pointer to the next node.

When to Use:

  • Need constant time insertion or deletion
    • Only at head (front of list)
    • Inserting elements is simpler in a singly linked list than in a doubly linked list
  • Unknown number of elements
  • No need for random access
  • Need linked list and memory is a concern
    • Requires less memory than doubly linked list

Time Complexity:

Average Worst
Access θ(n) O(n)
Search θ(n) O(n)
Insertion θ(1) O(1)
Deletion θ(1) O(1)
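
Below is a minimal sketch of head insertion and deletion in Python (the Node and SinglyLinkedList names are illustrative, not part of this wiki):

```python
class Node:
    """A single node holding a value and a pointer to the next node."""
    def __init__(self, value, next=None):
        self.value = value
        self.next = next


class SinglyLinkedList:
    def __init__(self):
        self.head = None

    def push_front(self, value):
        """O(1) insertion: the new node simply points at the old head."""
        self.head = Node(value, self.head)

    def pop_front(self):
        """O(1) deletion: advance the head pointer past the removed node."""
        if self.head is None:
            raise IndexError("pop from empty list")
        value = self.head.value
        self.head = self.head.next
        return value


lst = SinglyLinkedList()
lst.push_front(1)
lst.push_front(2)
print(lst.pop_front())  # 2 — the most recently inserted head comes off first
```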

Doubly Linked List

Each node has two pointers, one to the previous node and one to the next node.

When to Use:

  • Need constant time insertion or deletion
    • Only at head or tail (front or back of list)
    • Deleting a given node is faster in a doubly linked list than in a singly linked list (no traversal is needed to reach the previous node), except when deleting the head
  • Unknown number of elements
  • No need for random access

Time Complexity:

Average Worst
Access θ(n) O(n)
Search θ(n) O(n)
Insertion θ(1) O(1)
Deletion θ(1) O(1)
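
As a rough sketch (class names are illustrative), the extra prev pointer is what makes O(1) insertion at the tail and O(1) removal of an already-located node possible:

```python
class DNode:
    """A node with pointers to both its neighbours."""
    def __init__(self, value):
        self.value = value
        self.prev = None
        self.next = None


class DoublyLinkedList:
    def __init__(self):
        self.head = None
        self.tail = None

    def push_back(self, value):
        """O(1) insertion at the tail."""
        node = DNode(value)
        if self.tail is None:
            self.head = self.tail = node
        else:
            node.prev = self.tail
            self.tail.next = node
            self.tail = node
        return node

    def remove(self, node):
        """O(1) deletion of a node we already hold a reference to."""
        if node.prev:
            node.prev.next = node.next
        else:
            self.head = node.next
        if node.next:
            node.next.prev = node.prev
        else:
            self.tail = node.prev


dll = DoublyLinkedList()
dll.push_back("a")
b = dll.push_back("b")
dll.push_back("c")
dll.remove(b)               # unlink "b" without traversing the list
print(dll.head.next.value)  # c
```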


Stacks

Definition: An abstract data type that supports a last in, first out (LIFO) policy. A push method is used to add an element onto the stack and a pop method is used to remove the most recently added element from the stack.

When to Use:

  • List reversal
  • Parsing
  • Expression conversion and evaluation
    • Balancing delimiters
  • Tower of Hanoi

Time Complexity:

Average Worst
Access θ(n) O(n)
Search θ(n) O(n)
Insertion θ(1) O(1)
Deletion θ(1) O(1)
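
A small sketch of the delimiter-balancing use case, using a plain Python list as the stack (the function name is made up for illustration):

```python
def delimiters_balanced(text):
    """Check that (), [], {} are balanced by pushing openers and popping on closers."""
    pairs = {')': '(', ']': '[', '}': '{'}
    stack = []
    for ch in text:
        if ch in '([{':
            stack.append(ch)                              # push
        elif ch in pairs:
            if not stack or stack.pop() != pairs[ch]:     # pop and compare
                return False
    return not stack                                      # balanced only if nothing is left


print(delimiters_balanced("{[()()]}"))  # True
print(delimiters_balanced("{[(])}"))    # False
```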


Queues

Definition: Elements within this abstract data type are ordered by a specified principle. An enqueue method adds an element to the back of the queue, while a dequeue method removes an element from the front of the queue.


FIFO

The traditional queue policy is first in, first out (FIFO), just like a line for a service. Elements which are added to the queue earlier on are accessible sooner than elements added to the queue later on.

When to Use:

  • Need to maintain order
  • Packet transmission
  • Providing a service in which clients are all valued equally

Time Complexity:

Average Worst
Access θ(n) O(n)
Search θ(n) O(n)
Insertion θ(1) O(1)
Deletion θ(1) O(1)
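
A small sketch using Python's collections.deque as a FIFO queue (the packet names are placeholders):

```python
from collections import deque

queue = deque()
queue.append("packet-1")   # enqueue at the back: O(1)
queue.append("packet-2")
queue.append("packet-3")

print(queue.popleft())     # packet-1 — dequeue from the front: O(1)
print(queue.popleft())     # packet-2 — earlier arrivals are served first
```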

Deque

Elements can be inserted and removed from both the front and back of this type of queue.

When to Use:

  • Need access to old and new elements
  • Providing a service that is likely to require clients to return
    • A fast food cashier serves a customer their order but forgets to include ketchup packets
      • The customer is allowed to cut the line to get ketchup packets.

Time Complexity:

Average Worst
Access θ(n) O(n)
Search θ(n) O(n)
Insertion θ(1) O(1)
Deletion θ(1) O(1)
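
A sketch of the cashier example above with collections.deque, which supports O(1) insertion and removal at both ends:

```python
from collections import deque

line = deque(["alice", "bob", "carol"])
line.append("dave")          # new customer joins at the back: O(1)
served = line.popleft()      # cashier serves the front of the line: O(1)
line.appendleft(served)      # forgot the ketchup — the customer rejoins at the front
print(line)                  # deque(['alice', 'bob', 'carol', 'dave'])
```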

Priority Queue

Similar to a standard FIFO queue, except elements are assigned priorities, and these priorities determine the element's position in the queue. A priority queue is typically implemented with a min or max heap.

When to Use:

  • Event-driven scheduling
  • Data compression
  • Graphing algorithms (Dijkstra, Prim)
  • Bandwidth management

Time Complexity:

Average Worst
Access θ(1) O(1)
Search θ(n) O(n)
Insertion θ(log(n)) O(log(n))
Deletion θ(log(n)) O(log(n))
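
A minimal sketch using Python's heapq module, which maintains a min-heap on a plain list (the event names are placeholders):

```python
import heapq

# Each entry is (priority, task); heapq keeps the smallest priority at index 0.
events = []
heapq.heappush(events, (30, "flush logs"))        # O(log n) insertion
heapq.heappush(events, (10, "handle interrupt"))
heapq.heappush(events, (20, "send packet"))

print(events[0])               # (10, 'handle interrupt') — O(1) access to the top
print(heapq.heappop(events))   # (10, 'handle interrupt') — O(log n) deletion
print(heapq.heappop(events))   # (20, 'send packet')
```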


Trees

Definition: A hierarchical structure of parent-child relationships between nodes. Each parent is one level higher in the structure than its children. Trees are acyclic but may be linear (a degenerate tree is effectively a linked list).


Binary Search Tree (BST)

Each node in this type of tree may have one, two, or no children. A child is either to the left or right of its parent node. A parent's left child must have a value less than the parent's value, and a parent's right child must have a value greater than the parent's value. It is common for repeated values to be disallowed; however, in a BST which allows repeated values, either the left-child condition or the right-child condition can be modified to handle equal values.

When to Use:

  • Want an average case of O(log(n)) for all operations
  • Need a memory-efficient way to store data
  • Expression evaluation
  • Data compression

Time Complexity:

Average Worst
Access θ(log(n)) O(n)
Search θ(log(n)) O(n)
Insertion θ(log(n)) O(n)
Deletion θ(log(n)) O(n)
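
A rough, unbalanced BST sketch in Python (names are illustrative); each comparison discards an entire subtree, which is what gives the θ(log(n)) average cases above:

```python
class BSTNode:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None


def insert(root, key):
    """Smaller keys go left, larger keys go right; duplicates are ignored."""
    if root is None:
        return BSTNode(key)
    if key < root.key:
        root.left = insert(root.left, key)
    elif key > root.key:
        root.right = insert(root.right, key)
    return root


def search(root, key):
    """Walk down the tree, discarding one subtree per comparison."""
    while root is not None and root.key != key:
        root = root.left if key < root.key else root.right
    return root


root = None
for k in [8, 3, 10, 1, 6]:
    root = insert(root, k)
print(search(root, 6) is not None)   # True
print(search(root, 7) is not None)   # False
```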

Adelson-Velskii and Landis (AVL) Tree

Self-balancing BST in which the difference in height between the left and right subtrees of every node cannot exceed one. If an insertion or deletion violates this height condition, an AVL tree rebalances itself (via rotations) to restore both the height condition and the standard BST ordering.

When to Use:

  • Lookup time is the priority
    • Fastest of the trees covered here in terms of lookup time
  • Insertion and deletion times are not a concern
    • Insertions and deletions are slower due to the strict height requirement
    • Frequent rebalancing is required

Time Complexity:

Average Worst
Access θ(log(n)) O(log(n))
Search θ(log(n)) O(log(n))
Insertion θ(log(n)) O(log(n))
Deletion θ(log(n)) O(log(n))
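
A hedged sketch that only checks the AVL height invariant on a simple node structure (names are illustrative); a real AVL tree restores this invariant with rotations during insertion and deletion rather than checking it after the fact:

```python
class TreeNode:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def height(node):
    """Height of a subtree; an empty subtree has height -1."""
    if node is None:
        return -1
    return 1 + max(height(node.left), height(node.right))


def is_avl(node):
    """True when every node's left and right subtree heights differ by at most one."""
    if node is None:
        return True
    if abs(height(node.left) - height(node.right)) > 1:
        return False
    return is_avl(node.left) and is_avl(node.right)


balanced = TreeNode(8, TreeNode(3, TreeNode(1), TreeNode(6)), TreeNode(10))
degenerate = TreeNode(1, None, TreeNode(2, None, TreeNode(3)))
print(is_avl(balanced))    # True
print(is_avl(degenerate))  # False — subtree heights differ by 2 at the root
```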

Red-Black Tree

Self-balancing BST in which each node is assigned a color, either red or black. It is similar to an AVL tree; however, balancing is not as rigid (the farthest leaf can be up to twice as far from the root as the nearest leaf), and reorganizing the tree is done more efficiently. The following are the requirements for a red-black tree:

  1. Nodes are either red or black
  2. Root node is black
  3. Leaves are black
  4. Red nodes have black children
  5. Every path from a node to any of its descendant leaves contains the same number of black nodes

When to Use:

  • Guaranteed O(log(n)) for all operations
  • Balance between fast lookups and efficient insertions and deletions

Time Complexity:

Average Worst
Access θ(log(n)) O(log(n))
Search θ(log(n)) O(log(n))
Insertion θ(log(n)) O(log(n))
Deletion θ(log(n)) O(log(n))
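
A rough validator sketch (class and function names are illustrative) that checks rules 2, 4, and 5 on a hand-built tree; it does not perform the recoloring and rotations a real red-black tree uses to maintain these rules:

```python
class RBNode:
    def __init__(self, key, color, left=None, right=None):
        self.key = key
        self.color = color      # "red" or "black"
        self.left = left
        self.right = right


def black_height(node):
    """Return the black-height of a subtree, or None if a rule is violated."""
    if node is None:
        return 1                               # empty (NIL) leaves count as black
    if node.color == "red":
        for child in (node.left, node.right):  # rule 4: red nodes have black children
            if child is not None and child.color == "red":
                return None
    left = black_height(node.left)
    right = black_height(node.right)
    if left is None or right is None or left != right:
        return None                            # rule 5: black counts must match
    return left + (1 if node.color == "black" else 0)


def is_red_black(root):
    """Rule 2 (black root) plus the recursive checks above."""
    return root is None or (root.color == "black" and black_height(root) is not None)


valid = RBNode(10, "black", RBNode(5, "red"), RBNode(15, "red"))
invalid = RBNode(10, "black", RBNode(5, "red", RBNode(2, "red")), RBNode(15, "black"))
print(is_red_black(valid))    # True
print(is_red_black(invalid))  # False — a red node has a red child
```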


Hash Tables

Definition: An associative structure with mappings between keys and values. A hash function is used on a key to compute an index for its value. Furthermore, once a hash table has been filled past a predefined load factor, which is the ratio of filled positions to total positions, the table is resized. Ideally, hash tables have the following properties:

  • Size of hash table is prime at initialization and upon resizing
  • Values are evenly distributed
  • Hash function minimizes the number of collisions (keys hashing to same index)

In the event of a collision, either open addressing or separate chaining can be used to resolve the collision.


Open Addressing

Resolves collisions by probing alternative locations for availability. There are three main types of probing:

  1. Linear probing
  2. Quadratic probing
  3. Double hashing

In linear probing, the interval between probes is fixed at a constant value, so if the constant is 1 and a collision occurs at position i, the probe follows the sequence i+1, i+2, i+3, ..., i+n until an open position is found.

In quadratic probing, the interval grows as the square of the probe number, so if a collision occurs at position i, the probe follows the sequence i+1², i+2², i+3², ..., i+n² until an open position is found.

In double hashing, the interval is determined by a second hash function H2 applied to the key, so if a collision occurs at position i for key k, the probe follows the sequence i+H2(k), i+2·H2(k), i+3·H2(k), ..., i+n·H2(k) until an open position is found.

When to Use:

  • Fast lookups
    • Typically faster than separate chaining
  • Small load factor
    • Outperforms separate chaining when load factor is small
  • Ordering is not relevant

Time Complexity:

Average Worst
Access N/A N/A
Search θ(1) O(n)
Insertion θ(1) O(n)
Deletion θ(1) O(n)
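
A toy linear-probing table as a sketch (class name and table size are arbitrary); deletion is omitted because open addressing typically needs tombstone markers so that later probes are not cut short:

```python
class LinearProbingTable:
    """Fixed-size hash table that resolves collisions by linear probing."""
    def __init__(self, size=11):          # a prime table size helps spread keys
        self.slots = [None] * size

    def _probe(self, key):
        """Yield indices i, i+1, i+2, ... (mod table size) starting at hash(key)."""
        start = hash(key) % len(self.slots)
        for offset in range(len(self.slots)):
            yield (start + offset) % len(self.slots)

    def put(self, key, value):
        for i in self._probe(key):
            if self.slots[i] is None or self.slots[i][0] == key:
                self.slots[i] = (key, value)   # first open slot (or same key) wins
                return
        raise RuntimeError("table is full")

    def get(self, key):
        for i in self._probe(key):
            if self.slots[i] is None:          # an empty slot ends the probe sequence
                raise KeyError(key)
            if self.slots[i][0] == key:
                return self.slots[i][1]
        raise KeyError(key)


t = LinearProbingTable()
t.put("apple", 3)
t.put("pear", 5)
print(t.get("pear"))   # 5
```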

Separate Chaining

Each position in the hash table contains a bucket of some sort (linked list, binary tree). Whenever a collision occurs, the colliding value is inserted into the corresponding bucket. If a bucket contains multiple values, a lookup that hashes to that bucket must then search within the bucket for the matching key. Typically, each bucket should hold only one value, with around three as an upper limit; otherwise, the average lookup time degrades toward θ(n).

When to Use:

  • Fast lookups
  • Large load factor
    • Outperforms open addressing as load factor increases
  • Ordering is not relevant
  • Need simplicity
    • Easier to implement than open addressing hash tables

Time Complexity:

Average Worst
Access N/A N/A
Search θ(1) O(n)
Insertion θ(1) O(n)
Deletion θ(1) O(n)
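
A toy chaining table as a sketch (names are arbitrary), with each slot holding a Python list as its bucket:

```python
class ChainingTable:
    """Hash table where each slot holds a bucket (a list) of (key, value) pairs."""
    def __init__(self, size=11):
        self.buckets = [[] for _ in range(size)]

    def put(self, key, value):
        bucket = self.buckets[hash(key) % len(self.buckets)]
        for i, (k, _) in enumerate(bucket):
            if k == key:
                bucket[i] = (key, value)   # overwrite an existing key
                return
        bucket.append((key, value))        # otherwise chain onto the bucket

    def get(self, key):
        bucket = self.buckets[hash(key) % len(self.buckets)]
        for k, v in bucket:                # scan the bucket: O(1) on average,
            if k == key:                   # O(n) if everything collides
                return v
        raise KeyError(key)


t = ChainingTable()
t.put("apple", 3)
t.put("pear", 5)
print(t.get("apple"))  # 3
```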


Graphs

Definition: A set of vertices (nodes) connected by edges. Edges may be directed or undirected and may carry weights; unlike trees, graphs may contain cycles.


Adjacency List

When to Use:

  • Graph is sparse (few edges relative to the number of vertex pairs)
  • Need to iterate efficiently over a vertex's neighbors
  • Memory usage is a concern

Complexity:

Worst
Add Vertex O(1)
Add Edge O(1)
Remove Vertex O(|V|+|E|)
Remove Edge O(|E|)
Search O(|V|)
Space O(|V|+|E|)
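
A minimal adjacency-list sketch using a dictionary of neighbor lists (vertex labels are placeholders):

```python
from collections import defaultdict

graph = defaultdict(list)   # each vertex maps to a list of its neighbours

def add_edge(u, v):
    """Undirected edge: O(1) append per direction."""
    graph[u].append(v)
    graph[v].append(u)

add_edge("A", "B")
add_edge("A", "C")
add_edge("B", "C")

print(graph["A"])         # ['B', 'C'] — iterating neighbours is cheap
print("C" in graph["B"])  # True — but an edge check scans the list: O(|V|) worst case
```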

Adjacency Matrix

When to Use:

  • Graph is dense
  • Need constant time edge lookup, insertion, and removal
  • The set of vertices is fixed or known in advance

Complexity:

Worst
Add Vertex O(|V|²)
Add Edge O(1)
Remove Vertex O(|V|²)
Remove Edge O(1)
Search O(1)
Space O(|V|²)
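
A minimal adjacency-matrix sketch (vertex labels are placeholders); the matrix costs O(|V|²) space no matter how few edges exist, but edge lookup is O(1):

```python
vertices = ["A", "B", "C"]
index = {v: i for i, v in enumerate(vertices)}

# |V| x |V| grid of 0/1 flags: O(|V|²) space regardless of how many edges exist.
matrix = [[0] * len(vertices) for _ in range(len(vertices))]

def add_edge(u, v):
    """Undirected edge: O(1) to set both flags."""
    matrix[index[u]][index[v]] = 1
    matrix[index[v]][index[u]] = 1

add_edge("A", "B")
add_edge("B", "C")

print(matrix[index["A"]][index["B"]])   # 1 — edge lookup is O(1)
print(matrix[index["A"]][index["C"]])   # 0 — no edge between A and C
```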