Python Sets — A Powerful Collection Every Developer Should Use
- Valeria Aynbinder
- Coding
- 10 Sep, 2024
Introduction
Python has several built-in collections, and three of the most basic are the list, the dictionary, and the set. While lists and dictionaries are used in almost every piece of Python code, the set sometimes feels underrated, though it shouldn’t be!
The Python set is an incredibly useful collection that lets you write efficient and elegant code thanks to its special properties and functionality.
Mastering sets will significantly improve your code and shorten development time. So, let’s dive deep into Python sets without further ado!
Feel free to watch my video on Python sets with code examples for better understanding.
Let’s start with a motivating task: how do you get unique values from a list that might contain duplicates?
Using a Python set, you can achieve this in a single line of code by creating a set from the given list:
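As a minimal sketch of that one-liner, assuming a hypothetical names list (the values are illustrative only):

```python
# Hypothetical sample data: a list that contains repeated names.
names = ["Anna", "Ben", "Anna", "Dana", "Ben"]

# Creating a set from the list keeps one copy of each value.
unique_names = set(names)

print(len(names))         # 5
print(len(unique_names))  # 3
```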
Note that our unique_names set now contains no duplicates. This demonstrates one of the two main properties of a set:
Set elements are unique.
Let’s discuss these two main set properties in more detail.
Set Properties
Sets implement two main properties that significantly distinguish them from lists:
- Set elements are unique
- Set elements are unordered
At first glance, it might seem odd — why use a collection that doesn’t support ordering and doesn’t allow duplicates when we have lists that don’t have these limitations?
The answer is that these are not limitations, but important properties that can prevent potential bugs and enable unique functionalities. Let’s look at a couple of examples.
Example 1: Cities That Hosted the Olympics
Suppose you want to store all the cities that have ever hosted the Olympic Games. It makes perfect sense to use a set instead of a list:
- You want each city to appear only once, even if it has hosted multiple times.
- Ordering isn’t necessary, as there’s no inherent ranking among the cities.
Example 2: Fruits That Grow in Israel
You want to store all the fruits that grow in Israel. The same principles apply:
- You don’t want duplicates in your collection of fruits.
- There’s no need for any particular order.
Now let’s see how these properties are implemented in sets.
Uniqueness
In the following code snippet, we have a list of rainy months in Israel from the past three years. Since there are some winter months that often repeat, there are duplicates in the rainy_months_list.
If we want to get a unique set of rainy months in Israel, all we need to do is create a set from rainy_months_list:
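A sketch of what this could look like. The month values in rainy_months_list are assumed sample data, not real rainfall records:

```python
# Assumed sample data: rainy months noted over three years, with repeats.
rainy_months_list = [
    "Dec", "Jan", "Feb",         # year 1
    "Nov", "Dec", "Jan",         # year 2
    "Jan", "Feb", "Mar",         # year 3
]

# Creating a set from the list keeps each month once.
rainy_months = set(rainy_months_list)

print(len(rainy_months_list))  # 9
print(sorted(rainy_months))    # ['Dec', 'Feb', 'Jan', 'Mar', 'Nov']
```

The output is passed through sorted() only to make the printed result predictable; the set itself has no order.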
The set automatically removes all duplicates from the initial collection. Additionally, sets maintain uniqueness by preventing duplicates from being added. In the following code snippet, you can see an attempt to add a duplicate element to the set, but it remains unchanged:
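For illustration, here is a small sketch of adding a duplicate with the add method, using an assumed rainy_months set:

```python
# Assumed sample set of rainy months.
rainy_months = {"Nov", "Dec", "Jan", "Feb", "Mar"}
size_before = len(rainy_months)

# "Jan" is already in the set, so add() does nothing and raises no error.
rainy_months.add("Jan")
size_after = len(rainy_months)

print(size_before, size_after)  # 5 5

# A value that is not in the set yet is added as usual.
rainy_months.add("Apr")
print(len(rainy_months))        # 6
```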
Absence of Order
Since sets don’t support ordering, attempting to access an element at a specific index will raise an exception:
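A quick way to see this, again with an assumed rainy_months set, is to catch the TypeError that indexing raises:

```python
# Assumed sample set of rainy months.
rainy_months = {"Dec", "Jan", "Feb"}

try:
    first_month = rainy_months[0]  # sets do not support indexing
except TypeError as error:
    message = str(error)
    print(message)  # 'set' object is not subscriptable
```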
However, we can iterate over set elements like any other collection, though the order of elements during iteration isn’t guaranteed and may change:
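The loop below is a minimal sketch with an assumed rainy_months set. It shows no expected output for the loop itself, because the order may differ from run to run:

```python
# Assumed sample set of rainy months.
rainy_months = {"Dec", "Jan", "Feb"}

# Every element is visited exactly once, in no guaranteed order.
visited = []
for month in rainy_months:
    visited.append(month)
    print(month)

# When a predictable order matters, sort the elements into a list first.
ordered_months = sorted(rainy_months)
print(ordered_months)  # ['Dec', 'Feb', 'Jan']
```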
Set Special Functionality
Now that we’ve covered set properties, let’s take a look at some powerful functionalities Python sets implement:
- Intersection
- Union
- Difference
If you’re familiar with Set Theory, these are exactly the operations defined there. If not, this diagram will help you visualize them:
Let’s see these operations in action with some code examples. First, we’ll create three sets to work with:
- weekdays contains all seven days of the week.
- sport_days represents the days when I work out.
- lecture_days are the days I teach.
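The three sets might be created like this. The specific days in sport_days and lecture_days are assumptions chosen for illustration:

```python
# All seven days of the week.
weekdays = {"Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"}

# Assumed schedule: workout days and teaching days.
sport_days = {"Mon", "Wed", "Fri"}
lecture_days = {"Mon", "Tue", "Wed", "Thu"}

print(len(weekdays))  # 7
```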
We’re going to use set operations to get some insights from these sets.
Intersection
What are the days when I give lectures in the morning and work out in the evening?
To answer this, we need to find the elements common to both sport_days and lecture_days, i.e., the intersection of these sets. We can do this in one line using the intersection method:
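A sketch of the call, with the same assumed schedule repeated so the snippet runs on its own:

```python
# Assumed schedule, for illustration only.
sport_days = {"Mon", "Wed", "Fri"}
lecture_days = {"Mon", "Tue", "Wed", "Thu"}

# Days that appear in both sets.
sport_and_lecture_days = sport_days.intersection(lecture_days)

print(sorted(sport_and_lecture_days))  # ['Mon', 'Wed']
```

The & operator gives the same result: sport_days & lecture_days.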
Union
On which days do I have something on my schedule (i.e., what are my busy days)?
These are the days present in either sport_days or lecture_days, which matches the definition of a union. Let’s store these as busy_days:
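Using the union method, this could be sketched as follows (the schedule values are again assumed):

```python
# Assumed schedule, for illustration only.
sport_days = {"Mon", "Wed", "Fri"}
lecture_days = {"Mon", "Tue", "Wed", "Thu"}

# Days that appear in at least one of the two sets.
busy_days = sport_days.union(lecture_days)

print(sorted(busy_days))  # ['Fri', 'Mon', 'Thu', 'Tue', 'Wed']
```

The | operator is an equivalent shorthand: sport_days | lecture_days.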
Difference
I’m planning a one-day hike with my friends. Which days am I completely free?
The free days are the difference between weekdays and busy_days, that is, the days present in weekdays but not in busy_days:
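A sketch using the difference method, with an assumed busy_days set so the snippet is self-contained:

```python
# All seven days of the week.
weekdays = {"Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"}

# Assumed busy days, for illustration only.
busy_days = {"Mon", "Tue", "Wed", "Thu", "Fri"}

# Days in weekdays that are not in busy_days.
free_days = weekdays.difference(busy_days)

print(sorted(free_days))  # ['Sat', 'Sun']
```

Unlike intersection and union, difference depends on the order of the operands: busy_days.difference(weekdays) is an empty set here. The - operator is equivalent: weekdays - busy_days.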
Conclusion
Congrats! You’ve added a powerful tool to your Python toolbox!
Now it’s time to apply what you’ve learned. Consider your recent Python projects. Was there any data that had set properties but was implemented using lists or other collections? If so, try rewriting that code to use sets where appropriate.
You can find all the code presented here on my GitHub.
Thanks for reading!