Python Sets for busy people
- No index, no order: set items don't have a serial number (index), so you can't refer to an item like `mySet[2]`
- No duplicate items
- No lists/dictionaries as items
- Can add new items:
  - `mySet.add(4)`: Adds a single element to the set.
  - `mySet.update([4, 5])`: Adds multiple elements to the set.
- Can remove items:
  - `mySet.remove(element)`: Removes a specific element. Raises a `KeyError` if not found.
  - `mySet.discard(element)`: Removes a specific element. No issues if not found.
  - `mySet.pop()`: Removes and returns an arbitrary element. Raises a `KeyError` if the set is empty.
  - `mySet.clear()`: Removes all elements. Slate clean.
- Check if an element exists:
  - `element in mySet`: `True` if the item is present, else `False`.
- Get the number of elements:
  - `len(mySet)`: Total number of items in the set.
- Copy a set:
  - `mySet.copy()`: Creates a shallow copy of the set.
- Union of sets:
  - `mySet.union(other_set)` or `mySet | other_set`: Combines all elements from both sets, without duplicates.
- Intersection of sets:
  - `mySet.intersection(other_set)` or `mySet & other_set`: Returns elements common to both sets.
- Difference of sets:
  - `mySet.difference(other_set)` or `mySet - other_set`: Returns elements in the first set but not in the second.
- Symmetric difference of sets:
  - `mySet.symmetric_difference(other_set)` or `mySet ^ other_set`: Returns elements in either set, but not in both.
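The add and remove methods listed above can be tried together in one short session. This is only a minimal sketch; the variable name `basket` and its contents are made up for illustration.

```python
basket = {"apple", "banana"}

basket.add("cherry")               # add one item
basket.update(["mango", "apple"])  # add several; "apple" is already there, so it is ignored
print(len(basket))                 # 4

basket.discard("kiwi")             # not present: no error
try:
    basket.remove("kiwi")          # not present: raises KeyError
except KeyError as err:
    print("KeyError:", err)

removed = basket.pop()             # removes and returns an arbitrary item
print(removed in basket)           # False

basket.clear()                     # empty the set
print(basket)                      # set()
```

Because `pop()` gives back an arbitrary item, the sketch checks only that the item is gone and does not assume which one it was.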
Note:
Changeable items (also called mutable) are items that can be modified after they are created. For example:
- Lists: You can add, remove, or change elements.
- Dictionaries: You can add, remove, or change key-value pairs.
These items cannot be added to a set because a set stores each item by its hash, and an item that can change has no stable hash to store it under.
Unchangeable items (also called immutable) are items that cannot be modified after they are created. For example:
- Numbers: Once created, their values cannot be changed.
- Strings: Any modification creates a new string.
- Tuples: Their elements cannot be changed once created.
These items can be added to a set because their values stay the same.
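One way to see the difference is to ask Python for an item's hash, which is what a set needs in order to store it. The sketch below uses a hypothetical helper, `is_hashable`, written only for this illustration.

```python
def is_hashable(item):
    """Return True if `item` can be stored in a set (hypothetical helper)."""
    try:
        hash(item)
    except TypeError:
        return False
    return True

print(is_hashable(42))          # True  (number)
print(is_hashable("hello"))     # True  (string)
print(is_hashable((1, 2)))      # True  (tuple of numbers)
print(is_hashable([1, 2]))      # False (list)
print(is_hashable({"a": 1}))    # False (dictionary)
```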
Python Sets
A Python set is like your travel kit: a collection of unique items. The items can be of different kinds, but each one must be unique. Set items don't have serial numbers (indexes). Without a serial number, you can't do something like `mySet[2] = "Guava"`. All items in a set must be different. Otherwise, how would you tell them apart? If your set has two apples, which one is which? You can still remove items from a set and add new ones: you can take out an apple and add a guava. Don't bother removing an apple and adding another apple; you end up with the same set. Sets can't contain a list or a dictionary. Period. They can contain tuples, but these tuples can't have lists or dictionaries inside them. (Adding such a tuple raises a `TypeError`, just like adding a list directly.)
Python Sets Properties
So, here are some properties of Python Sets:
Items have No Index:
Python stores Set items but does not keep track of their order. This means there is no first item, second item, etc. For example, if you input `apple`, `orange`, `banana`, you might get `banana`, `apple`, `orange` as the output.
```python
mySet = {1, 2, 3}
print(mySet) # Output could be {1, 2, 3} or {3, 1, 2} or any permutation
# mySet[0]  # THIS IS AN ERROR (TypeError). No one is sitting at 0. There is no order, no index.
```
No Duplicates:
Since items in a set do not have serial numbers, duplicates are not allowed. If you try to add two apples, how would you distinguish between them? Therefore, when you add duplicates to a set, Python automatically removes the duplicates.
```python
mySet = {1, 2, 2, 3}
print(mySet) # Output: {1, 2, 3}
```
No In-Place Replace. Add/remove instead.
You can add or remove items, but you can't change an item's value directly, so there is no in-place replace. Instead, remove the old item and add the new one.
```python
mySet = {1, 2, 3}
mySet.remove(2) # OK
mySet.add(4) # OK
# mySet[0] = 5  # ERROR (TypeError): sets don't support item assignment
```
No Lists/Dictionaries, Tuples Are OK.
Sets use hashing, so you can't store lists or dictionaries in them. However, you can store tuples. Just make sure these tuples don't contain lists or dictionaries inside them.
```python
# Valid elements
mySet = {1, "hello", (1, 2)} # TUPLES OK
# Invalid elements
mySet = {[1, 2], {"key": "value"}} # ERROR, NO LISTS, NO DICTS
```
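The tuple rule can be checked directly. In this small sketch (the variable name `points` is illustrative), a tuple of plain values goes in without trouble, while a tuple that holds a list is rejected with a `TypeError`, just as a bare list would be.

```python
points = set()
points.add((1, 2))                 # fine: the tuple holds only numbers

try:
    points.add((1, [2, 3]))        # the inner list makes the whole tuple unhashable
except TypeError as err:
    print("TypeError:", err)

print(points)                      # {(1, 2)}
```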
When to use sets
Sets in Python are very useful when you need to keep unique items and do quick membership checks. Here are some scenarios where sets are frequently used:
Removing Duplicates
- Use Case: When you need to ensure that a collection of elements contains no duplicates.
- Example: Removing duplicates from a list.
```python
items = [1, 2, 2, 3, 4, 4, 5]
unique_items = list(set(items))  # e.g. [1, 2, 3, 4, 5]; the order is not guaranteed
```
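Converting to a set does not promise to keep the original order of the list. If order matters, one common alternative (a dictionary rather than a set, but built on the same hashing idea) is `dict.fromkeys`, since dictionaries keep insertion order in Python 3.7 and later. A minimal sketch with made-up data:

```python
items = [3, 1, 3, 2, 1]

# Dictionary keys are unique and keep insertion order (Python 3.7+),
# so this drops duplicates while keeping first occurrences in place.
unique_in_order = list(dict.fromkeys(items))
print(unique_in_order)  # [3, 1, 2]
```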
Membership Testing
- Use Case: When you need to check if an element exists in a collection. Sets provide average O(1) time complexity for membership tests.
- Example: Checking if an item exists in a collection.
```python
allowed_items = {"apple", "banana", "cherry"}
if "banana" in allowed_items:
    print("Banana is allowed")
```
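When the same collection is checked many times, it can pay to build the set once and reuse it. The sketch below assumes a hypothetical list of requested items, `requested`, and filters it against the allowed set.

```python
allowed_items = {"apple", "banana", "cherry"}
requested = ["banana", "kiwi", "apple", "kiwi"]  # hypothetical input

# Each `in` check against the set is an average O(1) hash lookup.
accepted = [item for item in requested if item in allowed_items]
rejected = [item for item in requested if item not in allowed_items]

print(accepted)  # ['banana', 'apple']
print(rejected)  # ['kiwi', 'kiwi']
```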
Set Operations
- Use Case: When you need to perform operations like union, intersection, difference, and symmetric difference between collections.
- Example: Finding common elements between two sets.
```python
set1 = {1, 2, 3}
set2 = {3, 4, 5}
common_items = set1 & set2  # {3}
```
Data Validation
- Use Case: When validating data to ensure uniqueness, such as checking for duplicate entries in a dataset.
- Example: Validating unique user IDs.
```python
user_ids = [101, 102, 103, 101]
unique_user_ids = set(user_ids)
if len(user_ids) != len(unique_user_ids):
    print("Duplicate user IDs found")
```
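The length comparison says that duplicates exist, but not which ones. A possible follow-up, sketched here with made-up IDs, is to walk the list once while keeping two sets: one for IDs already seen and one for IDs seen more than once.

```python
user_ids = [101, 102, 103, 101, 102, 101]  # illustrative data

seen = set()
duplicates = set()
for uid in user_ids:
    if uid in seen:
        duplicates.add(uid)   # second or later occurrence
    else:
        seen.add(uid)         # first occurrence

print(sorted(duplicates))  # [101, 102]
```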
Tracking Unique Elements
- Use Case: When you need to keep track of unique items encountered during processing.
- Example: Tracking unique words in a text.
```python
text = "hello world hello"
words = text.split()
unique_words = set(words)  # {"hello", "world"}
```
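Real text usually carries capital letters and punctuation, which would make `"Hello"` and `"hello,"` count as different words. A small sketch, assuming that lowercasing and stripping surrounding punctuation is enough normalization for the task:

```python
import string

text = "Hello world, hello again. World!"

# Lowercase each word and strip surrounding punctuation before collecting it,
# so "Hello"/"hello" and "world,"/"World!" collapse to the same entry.
unique_words = {word.strip(string.punctuation).lower() for word in text.split()}
print(sorted(unique_words))  # ['again', 'hello', 'world']
```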
Efficient Data Lookups
- Use Case: When you need a data structure that allows for fast lookups, insertions, and deletions.
- Example: Keeping track of visited URLs in a web crawler.
```python
visited_urls = set()
visited_urls.add("https://example.com")
if "https://example.com" in visited_urls:
    print("URL already visited")
```
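The visited-URL idea extends naturally to a full crawl loop. The sketch below makes no network requests: `links` is a hypothetical in-memory link graph standing in for real pages, and `crawl` is an illustrative function name.

```python
from collections import deque

# Hypothetical link graph standing in for real pages; no network access is made.
links = {
    "https://example.com": ["https://example.com/a", "https://example.com/b"],
    "https://example.com/a": ["https://example.com", "https://example.com/b"],
    "https://example.com/b": ["https://example.com/a"],
}

def crawl(start):
    """Visit every URL reachable from `start` once, in breadth-first order."""
    visited = {start}
    queue = deque([start])
    order = []
    while queue:
        url = queue.popleft()
        order.append(url)
        for nxt in links.get(url, []):
            if nxt not in visited:   # the set keeps this check cheap
                visited.add(nxt)
                queue.append(nxt)
    return order

print(crawl("https://example.com"))
```

Even though the pages link back to each other, the `visited` set ensures each URL is processed exactly once.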
Test your knowledge
Question - duplicate values
What will be the output of the following statement?
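```python
thisset = {"apple", "banana", "cherry", False, True, 0}
print(thisset)
```
Answer: {'apple', 'banana', 'cherry', False, True} (item order may vary). `0` is dropped because `0 == False`, so Python treats it as a duplicate.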
Question - set.add()
What will be the output of the following statement?
```python
thisset = {"apple", "banana", "cherry"}
thisset.add("apple")
print(thisset)
```
Answer: {'apple', 'banana', 'cherry'} (item order may vary)
Question - set.discard()
What will be the output of the following statement?
```python
thisset = {1, 2, 3, 4, 5}
thisset.discard(6)
print(thisset)
```
Answer: {1, 2, 3, 4, 5}
Question - set.remove()
What will be the output of the following statement?
```python
thisset = {1, 2, 3, 4, 5}
thisset.remove(6)
print(thisset)
```
Answer: Raises a `KeyError` because 6 is not in the set, so nothing is printed.
Question - set.update()
What will be the output of the following statement?
```python
thisset = {"apple", "banana", "cherry"}
thisset.update(["orange", "mango"])
print(thisset)
```
Answer: {'apple', 'banana', 'cherry', 'orange', 'mango'} (item order may vary)
Question - set.copy()
What will be the output of the following statement?
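```python
thisset = {"apple", "banana", "cherry"}
newset = thisset.copy()
thisset.add("orange")
print(newset)
```
Answer: {'apple', 'banana', 'cherry'} (item order may vary). The copy is a separate set, so adding to `thisset` afterwards does not affect it.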
Question - set membership
What will be the output of the following statement?
```python
thisset = {1, 2, 3, 4, 5}
result = 3 in thisset
print(result)
```
Answer: True
Question - set intersection
What will be the output of the following statement?
```python
thisset1 = {1, 2, 3}
thisset2 = {3, 4, 5}
result = thisset1 & thisset2
print(result)
```
Answer: {3}
Question - set union
What will be the output of the following statement?
```python
thisset1 = {1, 2, 3}
thisset2 = {3, 4, 5}
result = thisset1 | thisset2
print(result)
```
Answer: {1, 2, 3, 4, 5}
Question - set difference
What will be the output of the following statement?
```python
thisset1 = {1, 2, 3}
thisset2 = {3, 4, 5}
result = thisset1 - thisset2
print(result)
```
Answer: {1, 2}
Set Operations and Properties
Operation | Syntax | Description | Example |
---|---|---|---|
Union | `x1.union(x2)` or `x1 \| x2` | Combines all elements from both sets, without duplicates. | `x1 = {1, 2, 3}; x2 = {3, 4, 5}; x1.union(x2)` Output: `{1, 2, 3, 4, 5}` |
Intersection | `x1.intersection(x2)` or `x1 & x2` | Returns elements common to both sets. | `x1 = {1, 2, 3}; x2 = {3, 4, 5}; x1 & x2` Output: `{3}` |
Difference | `x1.difference(x2)` or `x1 - x2` | Returns elements in the first set but not in the second. | `x1 = {1, 2, 3}; x2 = {3, 4, 5}; x1 - x2` Output: `{1, 2}` |
Symmetric Difference | `x1.symmetric_difference(x2)` or `x1 ^ x2` | Returns elements in either set, but not both. | `x1 = {1, 2, 3}; x2 = {3, 4, 5}; x1 ^ x2` Output: `{1, 2, 4, 5}` |
Subset | `x1.issubset(x2)` or `x1 <= x2` | Checks if all elements of one set are in another. | `x1 = {1, 2}; x2 = {1, 2, 3}; x1 <= x2` Output: `True` |
Superset | `x1.issuperset(x2)` or `x1 >= x2` | Checks if one set contains all elements of another. | `x1 = {1, 2, 3}; x2 = {1, 2}; x1 >= x2` Output: `True` |
Disjoint | `x1.isdisjoint(x2)` | Checks if two sets have no elements in common. | `x1 = {1, 2, 3}; x2 = {4, 5, 6}; x1.isdisjoint(x2)` Output: `True` |
Add Element | `x1.add(element)` | Adds a single element to the set. | `x1 = {1, 2, 3}; x1.add(4)` Output: `{1, 2, 3, 4}` |
Remove Element | `x1.remove(element)` | Removes a specific element from the set. Raises a `KeyError` if it is missing. | `x1 = {1, 2, 3}; x1.remove(2)` Output: `{1, 3}` |
Discard Element | `x1.discard(element)` | Removes a specific element if it is present. | `x1 = {1, 2, 3}; x1.discard(2)` Output: `{1, 3}` |
Clear Set | `x1.clear()` | Removes all elements from the set. | `x1 = {1, 2, 3}; x1.clear()` Output: `set()` |
Copy Set | `x1.copy()` | Creates a shallow copy of the set. | `x1 = {1, 2, 3}; x2 = x1.copy()` Output: `x2 = {1, 2, 3}` |
Update Set | `x1.update(x2)` | Adds elements from another set. | `x1 = {1, 2}; x2 = {3, 4}; x1.update(x2)` Output: `{1, 2, 3, 4}` |
Intersection Update | `x1.intersection_update(x2)` | Updates the set, keeping only elements found in it and another set. | `x1 = {1, 2, 3}; x2 = {2, 3, 4}; x1.intersection_update(x2)` Output: `{2, 3}` |
Difference Update | `x1.difference_update(x2)` | Updates the set, removing elements found in another set. | `x1 = {1, 2, 3}; x2 = {2, 3, 4}; x1.difference_update(x2)` Output: `{1}` |
Symmetric Difference Update | `x1.symmetric_difference_update(x2)` | Updates the set, keeping only elements found in either set, but not both. | `x1 = {1, 2, 3}; x2 = {2, 3, 4}; x1.symmetric_difference_update(x2)` Output: `{1, 4}` |
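One practical difference between the two syntaxes in the table: the named methods accept any iterable, while the operators expect a set on both sides. A short sketch:

```python
x1 = {1, 2, 3}

print(x1.union([3, 4]))         # {1, 2, 3, 4}  (a list is accepted)
print(x1.intersection((2, 5)))  # {2}           (so is a tuple)

try:
    x1 | [3, 4]                 # the operator needs a set on both sides
except TypeError as err:
    print("TypeError:", err)

print(x1 | set([3, 4]))         # {1, 2, 3, 4}  (convert first, then use the operator)
```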