Why Sets Are My Favorite Data Structure (And Should Be Yours Too)

Aaron Lee Maxwell
2 min read · Apr 10, 2023


Nearly every programming language has an array or list type. It is probably the most widely used collection type in the history of computing, and for good reason: it’s useful.

But I suggest you consider a different collection type:

The set.

We use a list when the order of elements matters, or when we need to keep track of duplicates.

But many algorithms need neither. You need to process every unique element in the collection, or you only want to “visit” each element once, so you carefully write the code so that duplicates are never added to the list in the first place. And you do not care about the order, so long as all elements get processed.
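As a minimal sketch of that "visit each element once" situation, here are both approaches side by side. Python is my choice for illustration, and the function names and the sample `pages` data are hypothetical:

```python
def unique_with_list(items):
    """List-based version: check for duplicates by hand before appending."""
    seen = []
    for item in items:
        if item not in seen:  # scans the list built so far on every check
            seen.append(item)
    return seen


def unique_with_set(items):
    """Set-based version: the set discards duplicates for us."""
    return set(items)


# Hypothetical input: page paths collected by a crawler, with repeats.
pages = ["/home", "/about", "/home", "/pricing", "/about"]

for page in unique_with_set(pages):
    print("visiting", page)  # each page appears once, in no particular order
```

The list version has to guard against duplicates explicitly. With the set, that guard is part of the data structure, so there is nothing to get wrong.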

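And when operating on a collection that does NOT need ordering or duplicates, many algorithms can be expressed more simply with sets than with lists, and often with better time complexity: a membership test on a hash-based set takes constant time on average, while the same test on a list may scan every element. Storing each element only once can also save space when the input has many duplicates.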
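Here is a small Python sketch of what "expressed more simply" can look like, assuming two hypothetical collections of user names. The function names and sample data are made up for illustration:

```python
def shared_with_lists(first, second):
    """Find items present in both lists, using only list operations."""
    shared = []
    for item in first:
        # Both membership checks scan a list, so this is roughly quadratic.
        if item in second and item not in shared:
            shared.append(item)
    return shared


def shared_with_sets(first, second):
    """Same question, asked with set intersection."""
    return set(first) & set(second)


# Hypothetical data: newsletter subscribers and paying customers.
newsletter = ["ana", "bo", "cy", "bo"]
customers = ["bo", "cy", "dee"]

both = shared_with_sets(newsletter, customers)      # in both groups
only_newsletter = set(newsletter) - set(customers)  # subscribers who never bought
either = set(newsletter) | set(customers)           # everyone we know about
```

The set version states the question directly (intersection, difference, union) instead of spelling out loops and duplicate checks.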

Code operating on sets is simpler to write, once you invest a bit of time getting used to them. It is also simpler to read and easier to update, all because the calculation requires fewer, simpler lines of code.

Most people “default” to lists, and only use sets when they can think of a specific reason to use a set.

I rewired my thinking to the opposite: I use a set unless I have a specific reason to use a different collection type.

For a small collection, which is most cases, it really does not matter. You usually cannot even measure the performance difference until the collection holds thousands of elements, and often many more.

But sometimes it DOES matter. And defaulting to sets naturally gives all the code I write slightly better performance on average over time.

What coding habits can you adopt that will give you leverage over time? Something to think about.
