#5

"Setting the Scene with Miserable Iterables"

14 September 2024

  • iterables

If 1 programmer can do it in 1 week, then 2 programmers can do it in 2 weeks.

[ If you’re new here, this is pycobytes – a weekly series delving into cool stuff in Python. Whether you’re just starting out, or have already gone insane from programming, I’m sure you’ll learn something new! – Dawei, U6C1 ]

Hey pips!

How does a for loop work?

>>> stuff = [0, 3, 6, 9]

>>> for item in stuff:
...     print(item)
0
3
6
9

The exact details are a little involved, but it’s essentially fetching items 1 by 1 from stuff, until there are no more. It’s like you taking desserts at lunch until none are left (we’ve reached the last object in stuff) or you’re told off (the loop hits a break, or an exception is raised).
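If you’re curious about those details, here’s a rough sketch of what the loop above is doing, written out by hand with the built-in iter() and next() functions. The seen list is just something added here for illustration so we can check what came out; a real for loop doesn’t keep one.

```python
stuff = [0, 3, 6, 9]

# Roughly what `for item in stuff: print(item)` does behind the scenes.
iterator = iter(stuff)         # ask the iterable for an iterator
seen = []                      # illustration only: record what we fetched

while True:
    try:
        item = next(iterator)  # fetch the next item
    except StopIteration:      # raised when there are no more items
        break
    print(item)
    seen.append(item)
```

This is a simplification, but it captures the idea: anything you can pass to iter() can be used in a for loop.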

Naturally, any object which can be iterated over in this way is known as an iterable.

Python comes with many built-in iterable types, useful in different situations. Besides list and str, we also have tuple, set, and dict.

There’s a lot we could cover here, so we’ll start with the set. You might already know them from maths. Compared to lists, sets differ in 2 ways: they can only contain unique items, and they don’t keep their items in any particular order.

We create sets using curly braces {}, like so:

>>> a_set = {4, 0, 1}
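Here’s a small sketch of those 2 differences in action. The name same_items is made up for this example: it’s written in a different order with some repeats, yet Python treats it as the same set.

```python
a_set = {4, 0, 1}
same_items = {1, 4, 0, 4, 4}  # different order, and 4 is repeated

print(a_set == same_items)  # True: order doesn't matter
print(len(same_items))      # 3: the duplicate 4s were dropped
```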

If you want to create an empty set, you’ve got to use the set() constructor with no arguments. {} has already been claimed by dicts!

>>> a_dict = {}
>>> a_set = set()
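If you ever forget which is which, type() will tell you. A quick check along those lines:

```python
a_dict = {}
a_set = set()

print(type(a_dict))  # <class 'dict'>
print(type(a_set))   # <class 'set'>
print(a_set)         # set(), since {} would look like an empty dict
```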

That constructor can also be used to convert other iterables to sets. It’ll add the objects in that iterable to the set, and because we can’t have more than 1 of the same item, any duplicates will be removed.

>>> a_list = [2, 2, 6, 9]
>>> a_set = set(a_list)
>>> a_set
{2, 6, 9}

This shows us a really useful feature of sets: they can be used to filter iterables for only unique items! Say we’ve got a list of social media posts, and want to produce a list of the users who’ve posted.

>>> posts = ...
>>> users = [post.user for post in posts]
>>> print(users)
['Dawei', 'Oliver', 'Oliver', 'Alex', 'Alex', 'Alex', 'Alex', 'Alex']

We can remove the duplicate users just by converting the list to a set, and then back to a list:

>>> unique_users = list(set(users))
>>> print(unique_users)
['Oliver', 'Dawei', 'Alex']

Ah, the order changed! Remember, sets don’t have any particular order, so there’s no guarantee the items will stay in the order they started.
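If you do need to keep the original order, one common alternative (a different technique from the set trick above) is to lean on dicts instead. Dict keys are unique too, but since Python 3.7 they remember the order they were inserted in. A minimal sketch, with the users list written out by hand rather than built from posts:

```python
users = ["Dawei", "Oliver", "Oliver", "Alex", "Alex", "Alex", "Alex", "Alex"]

# dict.fromkeys() makes a dict with each user as a key (values are None).
# Repeated keys are ignored, and the first-seen order is kept.
unique_in_order = list(dict.fromkeys(users))

print(unique_in_order)  # ['Dawei', 'Oliver', 'Alex']
```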

Sets are more niche than lists, and most of the time a list will suffice. One really cool thing, though, is that checking whether a set contains an object is really, really fast, no matter how big the set gets!

>>> loads_of_stuff = set(range(10 ** 7))  # building this might not be fast...
>>> 9999999 in loads_of_stuff  # ...but this is super quick!
True

This is because sets store objects in a hash table. Instead of comparing against every item 1 by 1 like a list would, Python essentially just runs the hashing algorithm on 9999999, jumps to the corresponding slot in the table, and checks whether an equal object is stored there!
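To see the difference for yourself, here’s a rough sketch that times the same membership check on a list and on a set using the standard timeit module. The sizes and names are arbitrary choices for this example, and the exact timings will vary from machine to machine, so no numbers are promised here.

```python
import timeit

numbers = range(10 ** 5)
as_list = list(numbers)
as_set = set(numbers)
target = 10 ** 5 - 1  # the last item: the worst case for the list

# Run each membership check 20 times and measure the total time taken.
list_time = timeit.timeit(lambda: target in as_list, number=20)
set_time = timeit.timeit(lambda: target in as_set, number=20)

print(f"list: {list_time:.6f}s")
print(f"set:  {set_time:.6f}s")
```

The list has to walk through its items until it finds a match, while the set goes more or less straight to the right spot.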

Question? Bug needs fixing? Or just want to nerd out over programming?
Drop a message in the GitHub discussion for this issue.