Sets

Introduction: The Guardian of Uniqueness

While Arrays store order and Maps store associations, Sets store existence.

Sets are the most under-utilized data structure by junior developers, who often abuse Arrays to check for duplicates (array.includes()). This is a performance killer. A Set is an abstract data type that can store certain values, without any particular order, and no repeated values.

The Golden Rule of Sets: An item either exists in the Set, or it doesn't. There is no count, and there are no duplicates.

The Mechanics

How does a Set achieve uniqueness instantly?

Most modern languages (JavaScript, Python, Java HashSet) implement Sets using a Hash Table with dummy values.

  1. Input: You add "Apple".

  2. Hash: The system calculates Hash("Apple").

  3. Storage: It goes to that bucket index. If something is already there, it checks equality. If it's the same, it does nothing (no duplicate added). If it's empty, it stores "Apple".

Note: Some languages offer TreeSets (backed by Red-Black Trees) which keep elements sorted but have O(\log n) operations. JavaScript's standard Set is a Hash Set.

Operations & Complexity Analysis

Operation

Average Case

Worst Case

Notes

Add

O(1)

O(n)

Hashes the value and places it. Ignored if exists.

Delete

O(1)

O(n)

Hashes the value, finds the bucket, removes it.

Has (Contains)

O(1)

O(n)

The Superpower. Instantly tells you if an item exists. Using Array.includes() is O(n), making Sets exponentially faster for lookups.

Search

N/A

N/A

You don't "search" a set for index i. You only ask if it contains value X.

JavaScript Implementation

JavaScript introduced the Set object in ES6. It handles any data type and maintains insertion order (a unique feature of JS Sets compared to other languages).

1. Basic Usage

2. Expert Usage: Set Algebra

The true power of Sets comes from mathematical operations: Union, Intersection, and Difference. Since JS doesn't have built-in methods for these (yet), we implement them efficiently.

Expert Insight: When to Use Sets vs Arrays

If heavyList has 10,000 items, the loop performs 100,000,000 operations.

The Expert Fix:

Object vs Primitive Uniqueness

In JavaScript, Sets check for uniqueness by reference for Objects, and by value for Primitives.

Conclusion

Sets are the definitive tool for de-duplication and membership testing.

  1. If you ever need to ask "Have I seen this before?", use a Set.

  2. If you need to store a list where duplicates are forbidden, use a Set.

  3. Stop using Array.includes() inside loops; it is a hidden performance trap.

Last updated