Home/dsa/Trees/Find Duplicate Subtrees

Find Duplicate Subtrees

Master this topic with zero to advance depth.

Find Duplicate Subtrees

List duplicates.

Hard
Approach 1

Level I: Brute Force (Recursive Comparison)

Intuition

For every node in the tree, we can compare its subtree with the subtrees of all other nodes. This involves a nested traversal: for each node, we start a secondary traversal that checks for equality with every other node.

Logic

  1. dfs(node) visits every node.
  2. For each node, isSame(node, otherNode) checks if they are identical.
  3. Very slow (O(N3)O(N^3) or O(N2)O(N^2) depending on implementation).
O(N^3)💾 O(H)

Detailed Dry Run

Compare Node(2) with every other node in the tree. If equal and not seen before, add to result.

java
public class Main {
    public List<TreeNode> findDuplicateSubtrees(TreeNode root) {
        List<TreeNode> res = new ArrayList<>();
        // Brute force comparison omitted for brevity as it is highly inefficient.
        return res;
    }
}
Approach 2

Level II: Serialization with Map

Intuition

How do we know if two subtrees are identical? We can serialize each subtree (convert to string) and store the frequency of these strings in a HashMap. If a string appears for the second time, we've found a duplicate.

Logic

  1. Use Post-order DFS.
  2. For each node, string = val + "," + left + "," + right.
  3. Add string to Map.
  4. If map.get(string) == 2, add current node to results.
O(N^2) (Building strings takes O(N) in worst case for every node)💾 O(N^2)

Detailed Dry Run

NodeLeft SerialRight SerialNode SerialCountDuplicate?
Leaf 4##"4,#,#"1NO
Leaf 4##"4,#,#"2YES!
Node 2"4,#,#"#"2,4,#,#,#"1NO
java
public class Main {
    Map<String, Integer> map = new HashMap<>();
    List<TreeNode> res = new ArrayList<>();
    public List<TreeNode> findDuplicateSubtrees(TreeNode root) {
        serialize(root);
        return res;
    }
    private String serialize(TreeNode node) {
        if (node == null) return "#";
        String s = node.val + "," + serialize(node.left) + "," + serialize(node.right);
        map.put(s, map.getOrDefault(s, 0) + 1);
        if (map.get(s) == 2) res.add(node);
        return s;
    }
}
Approach 3

Level III: Optimal (Integer ID Mapping)

Intuition

Thought Process

The O(N2)O(N^2) bottleneck in Level II is caused by string concatenation. Instead of strings, we can assign a unique integer ID to each subtree structure. Two subtrees are identical if they have the same (node.val, left_child_id, right_child_id) triplet.

Logic

  1. Use a Map tripletToID to map triplets of integers to a unique ID.
  2. Use a Map idCount to track how many times each ID has appeared.
  3. This reduces the problem to O(N)O(N) because triplet comparison is O(1)O(1).
O(N)💾 O(N)

Detailed Dry Run

NodeLeft IDRight IDTripletNew IDCountDuplicate?
400(4,0,0)11NO
400(4,0,0)12YES!
210(2,1,0)21NO
java
public class Main {
    Map<String, Integer> tripletToId = new HashMap<>();
    Map<Integer, Integer> idCount = new HashMap<>();
    List<TreeNode> res = new ArrayList<>();
    int nextId = 1;

    public List<TreeNode> findDuplicateSubtrees(TreeNode root) {
        getId(root);
        return res;
    }

    private int getId(TreeNode node) {
        if (node == null) return 0;
        String triplet = node.val + "," + getId(node.left) + "," + getId(node.right);
        int id = tripletToId.computeIfAbsent(triplet, k -> nextId++);
        idCount.put(id, idCount.getOrDefault(id, 0) + 1);
        if (idCount.get(id) == 2) res.add(node);
        return id;
    }
}