Sets
Sets are collections that store unique elements with fast membership testing and set theory operations. Harneet's Set implementation uses hash-based storage for O(1) average-case performance on core operations.
Overview
A Set automatically ensures all elements are unique and provides efficient operations for: - Adding and removing elements - Testing membership - Set theory operations (union, intersection, difference) - Iteration over unique elements
Creating a Set
| Creating a Set |
|---|
| import collections
var set = collections.new_set()
|
Basic Operations
Adding Elements
| Adding Elements |
|---|
| var fruits = collections.new_set()
fruits.add("apple")
fruits.add("banana")
fruits.add("orange")
fruits.add("apple") // Duplicate - ignored
fmt.Println("Set size:", fruits.size()) // 3
fmt.Println("Set:", fruits.inspect()) // {"apple", "banana", "orange"}
|
Removing Elements
| Removing Elements |
|---|
| // Remove specific element
var removed = fruits.remove("banana")
fmt.Println("Removed banana:", removed) // true
// Try to remove non-existent element
var not_found = fruits.remove("grape")
fmt.Println("Removed grape:", not_found) // false
fmt.Println("Set after removal:", fruits.inspect()) // {"apple", "orange"}
|
Membership Testing
| Membership Testing |
|---|
| // Check if element exists
if fruits.contains("apple") {
fmt.Println("Set contains apple")
}
if !fruits.contains("banana") {
fmt.Println("Set does not contain banana")
}
// Batch membership testing
var items_to_check = ["apple", "grape", "orange"]
for item in items_to_check {
var exists = fruits.contains(item)
fmt.Println(item, "exists:", exists)
}
|
Size and Emptiness
| Size and Emptiness |
|---|
| fmt.Println("Set size:", fruits.size())
fmt.Println("Is empty:", fruits.is_empty())
// Clear all elements
fruits.clear()
fmt.Println("After clear - size:", fruits.size()) // 0
fmt.Println("After clear - empty:", fruits.is_empty()) // true
|
Set Theory Operations
Union (Combine Sets)
| Union |
|---|
| var set1 = collections.new_set()
set1.add("a")
set1.add("b")
set1.add("c")
var set2 = collections.new_set()
set2.add("c")
set2.add("d")
set2.add("e")
// Union contains all elements from both sets
var union_set = set1.union(set2)
fmt.Println("Union:", union_set.inspect()) // {"a", "b", "c", "d", "e"}
|
Intersection (Common Elements)
| Intersection |
|---|
| // Intersection contains only elements present in both sets
var intersection_set = set1.intersection(set2)
fmt.Println("Intersection:", intersection_set.inspect()) // {"c"}
|
Difference (Elements in First Set Only)
| Difference |
|---|
| // Difference contains elements in set1 but not in set2
var difference_set = set1.difference(set2)
fmt.Println("Difference:", difference_set.inspect()) // {"a", "b"}
// Symmetric difference (elements in either set, but not both)
var sym_diff = set1.symmetric_difference(set2)
fmt.Println("Symmetric difference:", sym_diff.inspect()) // {"a", "b", "d", "e"}
|
Subset and Superset Testing
| Subset and Superset |
|---|
| var small_set = collections.new_set()
small_set.add("a")
small_set.add("b")
var large_set = collections.new_set()
large_set.add("a")
large_set.add("b")
large_set.add("c")
large_set.add("d")
// Check subset relationship
fmt.Println("small_set is subset of large_set:", small_set.is_subset(large_set)) // true
fmt.Println("large_set is superset of small_set:", large_set.is_superset(small_set)) // true
// Check proper subset (subset but not equal)
fmt.Println("small_set is proper subset:", small_set.is_proper_subset(large_set)) // true
|
Advanced Features
Conversion and Iteration
| Conversion and Iteration |
|---|
| var numbers = collections.new_set()
for i in range(1, 6) {
numbers.add(i)
}
// Convert to array
var array = numbers.to_array()
fmt.Println("As array:", array)
// Iterate using iterator
var iter = numbers.iterator()
fmt.Println("Iterating:")
for iter.has_next() {
var item = iter.next()
fmt.Println(" Item:", item)
}
// Iterate using for-in loop (if supported)
for item in numbers {
fmt.Println(" Item:", item)
}
|
Functional Operations
| Functional Operations |
|---|
| var numbers = collections.new_set()
for i in range(1, 11) {
numbers.add(i)
}
// Filter elements
var evens = numbers.filter(((n) =>) n % 2 == 0)
fmt.Println("Even numbers:", evens.inspect())
// Map elements (creates new set)
var squares = numbers.map(((n) =>) n * n)
fmt.Println("Squares:", squares.inspect())
// Reduce to single value
var sum = numbers.reduce(|acc, n| acc + n, 0)
fmt.Println("Sum:", sum)
|
Batch Operations
| Batch Operations |
|---|
| var set = collections.new_set()
// Add multiple elements at once
var items = ["red", "green", "blue", "yellow"]
set.add_all(items)
// Remove multiple elements
var to_remove = ["red", "blue"]
set.remove_all(to_remove)
fmt.Println("After batch operations:", set.inspect()) // {"green", "yellow"}
|
Use Cases and Examples
1. Duplicate Removal
| Duplicate Removal |
|---|
| function remove_duplicates(arr array) array {
var set = collections.new_set()
for item in arr {
set.add(item)
}
return set.to_array()
}
var numbers = [1, 2, 2, 3, 3, 3, 4, 5, 5]
var unique = remove_duplicates(numbers)
fmt.Println("Unique numbers:", unique) // [1, 2, 3, 4, 5]
|
2. Membership Testing for Large Collections
| Fast Membership Testing |
|---|
| // Fast lookup for large datasets
var valid_ids = collections.new_set()
for i in range(1, 1000000) {
valid_ids.add("user_" + i.to_string())
}
function is_valid_user(user_id string) bool {
return valid_ids.contains(user_id) // O(1) average case
}
// Usage
fmt.Println("Valid user:", is_valid_user("user_12345")) // true
fmt.Println("Valid user:", is_valid_user("user_999999")) // false
|
3. Tag System
| Tag System |
|---|
| struct Article {
title: string
content: string
tags: Set
}
function new_article(title string, content string) Article {
return Article{
title: title,
content: content,
tags: collections.new_set()
}
}
function (article *Article) add_tag(tag) {
article.tags.add(tag)
}
function (article *Article) has_tag(tag) {
return article.tags.contains(tag)
}
function (article *Article) has_any_tags(tag_list) {
for tag in tag_list {
if article.tags.contains(tag) {
return true
}
}
return false
}
// Usage
var article = new_article("Harneet Guide", "A comprehensive guide...")
article.add_tag("programming")
article.add_tag("tutorial")
article.add_tag("harneet")
fmt.Println("Has programming tag:", article.has_tag("programming")) // true
fmt.Println("Has any tech tags:", article.has_any_tags(["tech", "programming"])) // true
|
4. Set-based Permissions
| Set-based Permissions |
|---|
| struct User {
name: string
permissions: Set
}
struct Role {
name: string
permissions: Set
}
function new_user(name string) User {
return User{
name: name,
permissions: collections.new_set()
}
}
function new_role(name string, permissions array) Role {
var role = Role{name: name, permissions: collections.new_set()}
for perm in permissions {
role.permissions.add(perm)
}
return role
}
function (user *User) assign_role(role) {
user.permissions = user.permissions.union(role.permissions)
}
function (user *User) has_permission(permission) {
return user.permissions.contains(permission)
}
function (user *User) has_all_permissions(required_permissions) {
for perm in required_permissions {
if !user.permissions.contains(perm) {
return false
}
}
return true
}
// Usage
var admin_role = new_role("admin", ["read", "write", "delete", "manage_users"])
var editor_role = new_role("editor", ["read", "write"])
var user = new_user("alice")
user.assign_role(editor_role)
fmt.Println("Can read:", user.has_permission("read")) // true
fmt.Println("Can delete:", user.has_permission("delete")) // false
fmt.Println("Can edit:", user.has_all_permissions(["read", "write"])) // true
|
5. Graph Algorithms with Sets
| Graph Algorithms |
|---|
| struct Graph {
vertices: Set
edges: Map // vertex -> Set of neighbors
}
function new_graph() Graph {
return Graph{
vertices: collections.new_set(),
edges: collections.new_map()
}
}
function (g *Graph) add_vertex(vertex) {
g.vertices.add(vertex)
if !g.edges.contains_key(vertex) {
g.edges.set(vertex, collections.new_set())
}
}
function (g *Graph) add_edge(from, to) {
g.add_vertex(from)
g.add_vertex(to)
g.edges.get(from).add(to)
g.edges.get(to).add(from) // Undirected graph
}
function (g *Graph) get_neighbors(vertex) {
return g.edges.get(vertex)
}
function (g *Graph) has_path(start, end) {
if start == end {
return true
}
var visited = collections.new_set()
var queue = collections.new_queue()
queue.enqueue(start)
visited.add(start)
for !queue.is_empty() {
var current = queue.dequeue()
var neighbors = g.get_neighbors(current)
for neighbor in neighbors {
if neighbor == end {
return true
}
if !visited.contains(neighbor) {
visited.add(neighbor)
queue.enqueue(neighbor)
}
}
}
return false
}
// Usage
var graph = new_graph()
graph.add_edge("A", "B")
graph.add_edge("B", "C")
graph.add_edge("C", "D")
fmt.Println("Path A to D:", graph.has_path("A", "D")) // true
fmt.Println("Path A to E:", graph.has_path("A", "E")) // false
|
Time Complexity
| Operation | Average Case | Worst Case | Notes |
| Add | O(1) | O(n) | Hash collision dependent |
| Remove | O(1) | O(n) | Hash collision dependent |
| Contains | O(1) | O(n) | Hash collision dependent |
| Union | O(n + m) | O(n + m) | n, m are set sizes |
| Intersection | O(min(n, m)) | O(n * m) | Optimized for smaller set |
| Difference | O(n) | O(n * m) | n is first set size |
| Size | O(1) | O(1) | Cached counter |
| Is Empty | O(1) | O(1) | Simple size check |
Space Complexity
- Storage: O(n) where n is the number of unique elements
- Load Factor: Typically maintains 0.75 load factor for optimal performance
- Hash Table Overhead: Additional 25-50% memory for hash buckets
The Set implementation uses efficient hash functions optimized for different data types:
| Hash Function Performance |
|---|
| // Different types have optimized hashing
var mixed_set = collections.new_set()
mixed_set.add(42) // Integer hash
mixed_set.add("hello") // String hash
mixed_set.add([1, 2, 3]) // Array hash
mixed_set.add({"key": "value"}) // Map hash
// Custom hashable types can implement hash() method
struct Point {
x: int
y: int
}
function (p Point) hash() {
return p.x * 31 + p.y // Simple hash combination
}
var points = collections.new_set()
points.add(Point{x: 1, y: 2})
points.add(Point{x: 3, y: 4})
|
Error Handling
Sets handle errors gracefully:
| Error Handling |
|---|
| var set = collections.new_set()
// Adding non-hashable types
var result = set.add(some_unhashable_object)
if result.type() == "ERROR" {
fmt.Println("Cannot add non-hashable type to set")
}
// Operations on empty sets
var empty_set = collections.new_set()
var union_result = empty_set.union(set) // Works fine, returns copy of set
// Null handling
set.add(null) // null is hashable and can be stored
fmt.Println("Contains null:", set.contains(null))
|
Best Practices
1. Choose Sets for Uniqueness
| Choose Sets for Uniqueness |
|---|
| // Use sets when uniqueness is important
var unique_visitors = collections.new_set()
function track_visitor(user_id string) {
unique_visitors.add(user_id) // Automatically handles duplicates
}
|
2. Pre-size for Known Capacity
| Pre-size Sets |
|---|
| // If you know approximate size, reserve space
var large_set = collections.new_set()
large_set.reserve(10000) // Reduces hash table resizing
|
3. Use Set Operations Instead of Loops
| Use Set Operations |
|---|
| // Instead of manual loops
var common_elements = collections.new_set()
for item in set1 {
if set2.contains(item) {
common_elements.add(item)
}
}
// Use built-in intersection
var common_elements = set1.intersection(set2) // More efficient
|
4. Consider Memory Usage for Large Sets
| Monitor Memory Usage |
|---|
| // Monitor memory usage for large sets
var large_set = collections.new_set()
for i in range(1000000) {
large_set.add(i)
}
// Periodically check size and clear if needed
if large_set.size() > threshold {
large_set.clear()
}
|
5. Use Concurrent Sets for Multi-threading
| Concurrent Sets |
|---|
| // For concurrent access
var shared_set = collections.new_concurrent_set()
// Thread-safe operations across goroutines
|
Migration from Arrays
If you're currently using arrays to maintain unique elements:
| Migration from Arrays |
|---|
| // Old approach with arrays
var unique_items = []
function add_unique(item any) {
for existing in unique_items {
if existing == item {
return // Already exists
}
}
unique_items.append(item) // O(n) check + O(1) append
}
// New approach with sets
var unique_items = collections.new_set()
function add_unique(item any) {
unique_items.add(item) // O(1) average case
}
|
Sets provide a more efficient and cleaner solution for managing unique collections with powerful set theory operations built-in.