Saturday, 9 March 2024

Skip Lists: A Probabilistic Approach to Ordered Data

SkipList is a specialized form of a Linked List optimized for rapid search and insertion operations.

 

In SkipLists, both insertion and search tasks are accomplished with a time complexity of O(log n), where 'n' represents the number of elements contained within the SkipList.

 

What are the steps involved in constructing a SkipList?

Let's consider a scenario where we have a linked list containing elements arranged in ascending order.

 


 

In SkipLists, even though elements are sorted, traditional binary search isn't applicable because the middle element's address isn't known.

 

Sentinel Node

Within SkipLists, a sentinel node serves as a special marker at either the beginning or end of the list. These nodes offer a consistent starting point for traversing the SkipList, irrespective of its size or the number of levels it encompasses. They simplify the implementation of operations like searching, insertion, and deletion by ensuring that nodes containing actual data always have predecessors and successors.

 

By incorporating a sentinel node with predictable values (typically positive or negative infinity), the need for handling special cases in the code is alleviated when dealing with scenarios such as locating the first or last element, or inserting at the list's extremities.

 

The presence of a sentinel node guarantees consistent behavior regardless of whether the list is empty or populated. Consequently, code interacting with the SkipList can uniformly treat the sentinel node as a reference point for traversals or comparisons.

 


 

The skips in SkipLists are determined as follows:

In SkipLists, the addition of a layer above the base layer, often referred to as L0, introduces skips in the structure. While L0 is a conventional linked list where all nodes are connected sequentially, L1 incorporates skips.

 

Upon insertion of a new element into the skip list, it initially resides on the lowest level with just one level. However, as the skip list expands, certain nodes may be elevated to higher levels with a certain probability. This promotion process is typically stochastic, with nodes ascending to higher levels based on a predefined probability distribution. 


For instance, a random number generator might be employed to determine whether a node should be elevated to a higher level. If the generated random number falls within a specific range or satisfies particular conditions, the node is promoted to the next level. This process continues until a level is reached where further promotion is not possible. Through this random promotion mechanism, SkipLists achieve a balanced distribution of nodes across various levels, facilitating efficient search, insertion, and deletion operations.

 

In simpler terms, as the skip list expands, nodes have a chance to "skip" to higher levels with a certain probability, thus enhancing the overall structure's balance and operational efficiency.

 

For example, let's say we're generating random numbers within the range of 1 to 10. If a generated number is less than 3, the corresponding node is promoted to the next level; otherwise, it remains at its current level.

 

Suppose nodes containing elements 5, 20, 31, and 65 are promoted to the next level based on the aforementioned random number selection logic. The skip list will appear as follows after the addition of the new level, L1.

 


 

At each level of the SkipList, there exists a linked list comprised of nodes representing elements in the list. Each node holds data and pointers to subsequent node(s) within the same level, enabling horizontal traversal within that level. Moreover, each node might possess pointers to nodes in the level(s) beneath it, constructing a staircase-like structure within the SkipList. This hierarchical arrangement facilitates efficient navigation and search operations across multiple levels, contributing to the skip list's overall performance. 


Construction of L2 level linked list

To construct the L2 level linked list on top of the L1 level linked list, we iterate through the elements of the L1 linked list one by one. For each element, we apply the same random number logic as we did during the construction of the L1 level linked list.

 

Let's assume that the nodes containing elements 20 and 65 are promoted to the next level based on the random number selection logic. After adding the new level L2, the skip list will resemble the following structure.

 


Now the skip list has 3 layers.

 

How can I decide number of levels in skip list?

Determining the number of levels in a skip list involves a randomized process during insertion rather than a direct decision. Here's a general approach to guide the determination of skip list levels:

 

Set a maximum level: Establish the maximum number of levels desired for the skip list. This maximum level serves as an upper limit on the skip list's height.

 

Randomly assign levels: When inserting a new node, employ a random process to determine the number of levels for that node. Typically, this involves generating a random number to determine the node's level.

 

Limit level assignment: Ensure that the assigned number of levels for each node does not surpass the maximum level set in step 1. If the random process yields a level count exceeding the maximum level, cap the node's level at the maximum.

 

Adjust the maximum level: Monitor the skip list's growth and the number of nodes over time. Periodically evaluate the skip list's height and adjust the maximum level as necessary to maintain a balanced structure and optimize performance.

 

By implementing a maximum level and employing a randomized process for level assignment during insertion, you can manage the skip list's height effectively while still facilitating efficient operations such as search, insertion, and deletion.


How to perform search in a Skip List?

In SkipLists, the search operation initiates from the topmost level, commencing at the sentinel node of that level. Here's how the search for an element, let's say 33, is conducted:

 

Start at the top level: Begin the search from the highest level of the skip list, commencing at the sentinel node. The sentinel node serves as the entry point into the skip list.


 


Move horizontally: Traverse horizontally across the current level of the skip list, progressing from one node to another based on the value being searched for. Compare the search value with the values stored in the nodes at the current level.

If the search value is greater than the value of the current node but less than the value of the next node, proceed downwards to the next level of the skip list and continue the horizontal search at that level. Repeat this process until reaching the bottom level of the skip list.

By following this procedure, it becomes evident that the element 33 lies between the nodes containing values 20 and 65 at Level 2. 




By applying the same steps in L1, we can identify the element 33 is in between 31 and 65 of L1 elements.

 


Finally we can find the element 33 at level L0.

 


Once you reach the bottom level of the skip list, continue searching horizontally until finding the target value or reaching the end of the skip list. If the target value is found, return the node containing the value. If the target value is not found, return a "not found" indication.

 

How to perform Insertion?

Let’s insert the key 11.

 

To insert an element 11, first we need to find the position for the new key 11. It can be determined by search operation that we discussed above. From the search operation, we can identify the element 11 is inserted between the nodes 9 and 13.

 

 


Once the insertion position in the lowest level (level 0) is determined, a random process is simulated to decide the number of levels the new element will occupy in the skip list. Subsequently, a new node with the desired value is created and inserted into the appropriate position at each level of the skip list. Pointers are adjusted to connect the new node both horizontally and vertically within the skip list.

 

Assuming for the element 11, the random level generation process determines that it will occupy 3 levels, the final skip list will appear as follows:

 

 


When to use SkipList?

Skip lists are an excellent choice in various scenarios requiring efficient search, insertion, and deletion operations, particularly for large and dynamic datasets. Here are some situations where skip lists are particularly advantageous:

 

Balanced performance: Skip lists offer an average-case time complexity of O(log n) for search, insertion, and deletion operations, making them efficient for datasets of various sizes.

 

Dynamic datasets: Skip lists dynamically adjust their structure as elements are added or removed, making them ideal for scenarios with frequent dataset modifications.

 

Ordered traversal: Skip lists maintain elements in sorted order, enabling efficient ordered traversal, which is beneficial for processing data sequentially without the need for sorting.

 

Concurrency: Concurrent skip lists support multiple threads concurrently accessing and modifying the structure, providing efficient concurrent operations without requiring explicit synchronization.

 

No strict balancing required: Skip lists achieve balance through probabilistic methods, eliminating the need for strict balancing mechanisms found in other data structures like AVL trees or red-black trees, leading to simpler implementation and maintenance.

 

Memory efficiency: Skip lists typically require less memory overhead compared to other balanced data structures, especially when additional balance information is not necessary.

 

In summary, skip lists are versatile data structures suitable for a wide range of applications, including databases, indexing, caching, and concurrent data structures, where efficient operations on dynamic datasets are essential.


Sample implementation of SkipList.

 

SkipListNode.java

package com.sample.app.ds;

import java.util.ArrayList;
import java.util.List;

/**
 * Represents a node in a SkipList data structure.
 *
 * @param <E> The type of value stored in the node.
 */
public class SkipListNode<E> {
	private E value;
	public List<SkipListNode<E>> nextNodes;

	/**
	 * Constructs a SkipListNode with the specified value.
	 *
	 * @param value The value to be stored in the node.
	 */
	public SkipListNode(E value) {
		this.value = value;
		nextNodes = new ArrayList<>();
	}

	/**
	 * Retrieves the value stored in the node.
	 *
	 * @return The value stored in the node.
	 */
	public E getValue() {
		return value;
	}

	/**
	 * Retrieves the level of the node in the SkipList. The level is determined by
	 * the number of nextNodes.
	 *
	 * @return The level of the node.
	 */
	public int level() {
		return nextNodes.size() - 1;
	}

	/**
	 * Returns a string representation of the SkipListNode.
	 *
	 * @return A string representation of the SkipListNode.
	 */
	@Override
	public String toString() {
		return "" + value;
	}
}

SkipListIterator.java

package com.sample.app.ds;

import java.util.Iterator;

/**
 * Iterator for traversing through elements of a SkipList.
 *
 * @param <E> The type of elements stored in the SkipList.
 */
public class SkipListIterator<E extends Comparable<E>> implements Iterator<E> {
	private SkipList<E> list;
	private SkipListNode<E> current;

	/**
	 * Constructs a SkipListIterator for the given SkipList.
	 *
	 * @param list The SkipList to be iterated over.
	 */
	public SkipListIterator(SkipList<E> list) {
		this.list = list;
		this.current = list.getHead();
	}

	/**
	 * Checks if there is a next element in the SkipList.
	 *
	 * @return true if there is a next element, otherwise false.
	 */
	@Override
	public boolean hasNext() {
		return current.nextNodes.get(0) != null;
	}

	/**
	 * Retrieves the next element from the SkipList.
	 *
	 * @return The next element in the iteration.
	 */
	@Override
	public E next() {
		current = current.nextNodes.get(0);
		return current.getValue();
	}

	/**
	 * Unsupported operation. This method throws UnsupportedOperationException.
	 *
	 * @throws UnsupportedOperationException if remove operation is not supported.
	 */
	@Override
	public void remove() throws UnsupportedOperationException {
		throw new UnsupportedOperationException();
	}
}

SkipList.java

package com.sample.app.ds;

import java.util.Iterator;

/**
 * Represents a SkipList data structure.
 *
 * @param <E> The type of elements stored in the SkipList, must be comparable.
 */
public class SkipList<E extends Comparable<E>> {
	private SkipListNode<E> head;
	private int maxLevel;
	private int size;

	private static final double PROBABILITY_TO_DECIDE_LEVEL = 0.5;

	private SkipListNode<E> find(E e) {
		return find(e, head, maxLevel);
	}

	private SkipListNode<E> find(E e, SkipListNode<E> current, int level) {
		do {
			current = findNext(e, current, level);
		} while (level-- > 0);
		return current;
	}

	private SkipListNode<E> findNext(E e, SkipListNode<E> current, int level) {
		SkipListNode<E> next = current.nextNodes.get(level);
		while (next != null) {
			E value = next.getValue();
			if (lessThan(e, value))
				break;
			current = next;
			next = current.nextNodes.get(level);
		}
		return current;
	}

	private boolean lessThan(E a, E b) {
		return a.compareTo(b) < 0;
	}

	private boolean equalTo(E a, E b) {
		return a.compareTo(b) == 0;
	}

	/**
	 * Constructs an empty SkipList.
	 */
	public SkipList() {
		size = 0;
		maxLevel = 0;
		head = new SkipListNode<>(null);
		head.nextNodes.add(null);
	}

	/**
	 * Returns the head of the SkipList.
	 *
	 * @return The head node of the SkipList.
	 */
	public SkipListNode<E> getHead() {
		return head;
	}

	/**
	 * Adds an element to the SkipList.
	 *
	 * @param e The element to be added.
	 * @return true if the element was added successfully, false if it already
	 *         exists in the SkipList.
	 */
	public boolean add(E e) {
		if (contains(e))
			return false;

		size++;

		// Decide maximum levels to this element
		int level = 0;
		while (Math.random() < PROBABILITY_TO_DECIDE_LEVEL)
			level++;

		while (level > maxLevel) {
			head.nextNodes.add(null);
			maxLevel++;
		}

		SkipListNode<E> newNode = new SkipListNode<>(e);
		SkipListNode<E> current = head;
		do {
			current = findNext(e, current, level);
			newNode.nextNodes.add(0, current.nextNodes.get(level));
			current.nextNodes.set(level, newNode);
		} while (level-- > 0);

		return true;
	}

	/**
	 * Returns the size of the SkipList.
	 *
	 * @return The number of elements in the SkipList.
	 */
	public int size() {
		return size;
	}

	/**
	 * Checks if the SkipList contains the specified element.
	 *
	 * @param o The element to be checked for containment.
	 * @return true if the SkipList contains the element, otherwise false.
	 */
	public boolean contains(Object o) {
		if (o == null)
			return false;
		@SuppressWarnings("unchecked")
		E e = (E) o;
		SkipListNode<E> node = find(e);
		return node != null && node.getValue() != null && equalTo(node.getValue(), e);
	}

	/**
	 * Deletes an element from the SkipList.
	 *
	 * @param e The element to be deleted.
	 * @return true if the element was deleted successfully, false if it does not
	 *         exist in the SkipList.
	 */
	public boolean delete(E e) {
		if (!contains(e))
			return false;

		SkipListNode<E> current = head;
		for (int i = maxLevel; i >= 0; i--) {
			while (current.nextNodes.get(i) != null && lessThan(current.nextNodes.get(i).getValue(), e))
				current = current.nextNodes.get(i);

			if (current.nextNodes.get(i) != null && equalTo(current.nextNodes.get(i).getValue(), e))
				current.nextNodes.set(i, current.nextNodes.get(i).nextNodes.get(i));
		}
		size--;
		return true;
	}

	/**
	 * Returns an iterator over the elements in the SkipList.
	 *
	 * @return An iterator over the elements in the SkipList.
	 */
	public Iterator<E> iterator() {
		return new SkipListIterator<>(this);
	}

	/**
	 * Returns a string representation of the SkipList.
	 *
	 * @return A string representation of the SkipList.
	 */
	@Override
	public String toString() {
		StringBuilder builder = new StringBuilder();
		builder.append("SkipList [head=").append(head).append(", maxLevel=").append(maxLevel).append(", size=")
				.append(size).append("]").append("\n");

		Iterator<E> iterator = this.iterator();

		while (iterator.hasNext()) {
			builder.append(iterator.next()).append("->");
		}
		String result = builder.toString();

		return result.substring(0, result.length() - 2);
	}
}

SkipListDemo.java

package com.sample.app;

import com.sample.app.ds.SkipList;

public class SkipListDemo {
	public static void main(String[] args) {
		SkipList<Integer> skipList = new SkipList<>();
		System.out.println(skipList + "\n");

		// Add elements to the skip list

		skipList.add(5);
		skipList.add(65);
		skipList.add(9);
		skipList.add(33);
		skipList.add(13);
		skipList.add(31);
		skipList.add(27);
		skipList.add(71);
		skipList.add(3);
		skipList.add(20);

		System.out.println(skipList + "\n");

		System.out.println("Delete 31 and 3 from the skip list");
		skipList.delete(31);
		skipList.delete(3);
		System.out.println(skipList + "\n");
	}
}

Output

SkipList [head=null, maxLevel=0, size=0

SkipList [head=null, maxLevel=5, size=10]
3->5->9->13->20->27->31->33->65->71

Delete 31 and 3 from the skip list
SkipList [head=null, maxLevel=5, size=8]
5->9->13->20->27->33->65->71


                                                  System Design Questions

No comments:

Post a Comment