Overview Persistent Array Fat Nodes Path Copying Persistent Segment Tree Resources Solution Application 1 - Static 2D Range Sums on Large Grids Application 2 - Largest Interval Completely Inside a Range Problems Persistent Heap

Rare

0/10

Persistent Data Structures

Authors: Andi Qu, Benjamin Qi

What if data structures could time travel?

Edit This Page

Prerequisites

Platinum - Sparse Segment Trees

Overview

A persistent data structure is a data structure that preserves the previous versions of itself when modified, allowing access to any historical version. In other words, once a change is made to the structure, both the original and modified versions remain accessible. This is particularly useful in scenarios where you need to keep track of the history of updates or backtrack to previous states of the data structure.

Persistent Array

Persistent arrays are one of the simplest persistent data structures. A persistent array should be able to access and update its elements at given times.

Fat Nodes

C++

In C++, one can implement this to run in $\mathcal O(\log N)$ time per query and $\mathcal O(1)$ time per update by using an array of vectors.

vector<pair<int, int>> arr[100001];  // The persistent array

int get_item(int index, int time) {
	// Gets the array item at a given index and time
	auto ub =
	    upper_bound(arr[index].begin(), arr[index].end(), make_pair(time, INT_MAX));
	return prev(ub)->second;
}

void update_item(int index, int value, int time) {

This approach (i.e. storing multiple values at each index without erasing old values) is known as fat nodes.

Although easy to implement, fat nodes are only partially persistent, meaning that only the latest version of the data structure can be modified.

For most competitive programming problems involving persistent data structures, we use path copying instead.

Path Copying

One can implement path copying to run in $\mathcal O(\log N)$ time per query and update by using a binary-tree-like structure where array elements are the leaves.

This is very similar to a sparse segment tree. The key differences are that we have multiple roots and every time we "update" a node, we actually create a new node in its place (hence persistence).

C++

struct Node {
	int val;
	Node *l, *r;

	Node(ll x) : val(x), l(nullptr), r(nullptr) {}
	Node(Node *ll, Node *rr) : val(0), l(ll), r(rr) {}
};

int n, a[100001];     // The initial array and its size
Node *roots[100001];  // The persistent array's roots

Path copying is fully persistent.

Persistent Segment Tree

Since persistent arrays with path copying are so similar to sparse segment trees, it's relatively straightforward to convert one into a persistent segment tree - just add range queries!

Range Queries and Copies

CSES - Easy

Focus Problem – try your best to solve this problem before continuing!

Resources

Resources
	oml1111	PSeg Slides
	Anudeep2011	PSegs w/ SPOJ	not great formatting

This section is not complete.

Any help would be appreciated! Just submit a Pull Request on GitHub.

Solution

Since this problem involves range queries, we'll use some type of segment tree to solve it. (We can also use a Fenwick tree, but that's much harder to make persistent.)

When dealing with problems involving multiple dimensions, it's often helpful to view one of those dimensions as time. In this problem, we'll view the index of each array as its time dimension.

Using a persistent segment tree, we can then turn the problem into the following:

Type 1 queries involve a point update on the segment tree at some time.
Type 2 queries involve a range query on the segment tree at some time.
Type 3 queries involve copying the root of the segment tree at some time and appending it to the array of segment tree roots.

C++

#include <bits/stdc++.h>
typedef long long ll;
using namespace std;

struct Node {
	ll val;
	Node *l, *r;

	Node(ll x) : val(x), l(nullptr), r(nullptr) {}
	Node(Node *ll, Node *rr) {

Application 1 - Static 2D Range Sums on Large Grids

Persistent segment trees can be used for online 2D static range sum queries in $\mathcal O(\log N)$ time (think of it like prefix sums).

Note that 2D Fenwick trees with coordinate compression often also work for this (and are easier to implement), but it's still good to know this application.

Application 2 - Largest Interval Completely Inside a Range

Consider the following problem:

Given $N$ intervals on the number line, answer $Q$ queries of the form "what is the largest interval completely contained inside the range $[x, y]$ ?"

$N, Q \leq 10^5$ .

Since each interval has two dimensions (i.e. left and right endpoints $l_i$ and $r_i$ ), we can view it as a point on the number line at $l_i$ with "value" $r_i - l_i$ that was inserted at time $r_i$ .

Now, each query becomes "what is the most valuable point in the range $[x, y]$ that was inserted at or before time $y$ ?" This is much easier to handle, so we can solve this problem in $\mathcal O(Q \log N)$ time.

Problems

Source	Problem Name	Difficulty	Tags
CF	Closest Equals	Easy	Show Tags Persistent Segtree
COCI	2021 - Index	Normal	Show Tags Persistent Segtree
NOI	2010 - Super Piano	Normal	Show Tags Persistent Segtree
CEOI	2020 - The Potion of Great Power	Hard	Show Tags Persistent Segtree, Sqrt
APIO	2017 - Land of the Rainbow Gold	Hard	Show Tags 2DRQ, Euler's Formula, Persistent Segtree
COCI	2020 - Specijacija	Very Hard	Show Tags Persistent Segtree
IOI	2015 - Teams	Very Hard	Show Tags 2DRQ, Persistent Segtree

Persistent Heap

Resources
	Benq	Leftist Heap

Status	Source	Problem Name	Difficulty	Tags
	Wesley's Anger Contest	Time Travelling Squirrels	Very Hard
	YS	K-th Shortest Walk	Very Hard

Module Progress:

Join the USACO Forum!

Stuck on a problem, or don't understand a module? Join the USACO Forum and get help from other competitive programmers!

Join Forum

Table of Contents

Persistent Data Structures

Prerequisites

Table of Contents

Overview

Persistent Array

Fat Nodes

Path Copying

Persistent Segment Tree

Resources

This section is not complete.

Solution

Application 1 - Static 2D Range Sums on Large Grids

Application 2 - Largest Interval Completely Inside a Range

Problems

Persistent Heap

Module Progress:

Join the USACO Forum!

Table of Contents

Persistent Data Structures

Prerequisites

Table of Contents

Overview

Persistent Array

Fat Nodes

Path Copying

Persistent Segment Tree

Resources

This section is not complete.

Solution

Application 1 - Static 2D Range Sums on Large Grids

Application 2 - Largest Interval Completely Inside a Range

Problems

Persistent Heap

Module Progress:Not Started

Join the USACO Forum!

Module Progress: