Sam Wonders

When to Stop Digging

Unlimited Rabbit Hole Entrances

One of the best/worst parts of software engineering is that every day, you encounter endless Rabbit Hole Entrances.

You’re constantly coming across things that seem interesting, things that don’t behave the way you’d expect, things that seem worth investigating.

A very important skill is the ability to determine which Rabbit Holes to go down, and once you do start the journey, at what point to turn back.

Big books of data

Here’s an example of a Rabbit Hole Entrance I recently came across. I’m using an Imperfect Analogy so that a technical background isn’t required to follow along.

Let’s say we have a big book of Movies.

Imagine this book is simply a giant list, where each item has some basic info, like the movie name, release year, runtime, etc.

For example:

title=Minari; release_year=2020; runtime_minutes=150; movie_id=123
title=Dune; release_year=2021; runtime_minutes=155; movie_id=124
title=CODA; release_year=2021; runtime_minutes=111; movie_id=125
...
(many, many more list items)

Now let’s say we have a big book of Awards.

Imagine this book is simply a giant list, where each item has some basic info, like the granting authority (“Academy Awards”), the award name (“Best Picture”), the Movie connected to that award, whether the Movie won the award, etc.

For example:

movie_id=123; authority=Academy Awards; award=Best Picture; won=no; award_id=4567
movie_id=124; authority=Academy Awards; award=Best Picture; won=no; award_id=4568
movie_id=125; authority=Academy Awards; award=Best Picture; won=yes; award_id=4569
...
(many, many more list items)

(Note: at the beginning of the Awards book is a Special Page that tells the reader how the list items in the Awards book are connected to the list items in the Movies book. The Fancy Word for this is “foreign key relationship”, but the Fancy Word isn’t important for this story.)
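
For readers who do have a technical background, here is a minimal sketch of the two books as database tables, using Python's built-in sqlite3 module. The table and column names simply mirror the example rows above, and the REFERENCES clause plays the role of the Special Page. It is an illustration under those assumptions, not the schema of any real project.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when asked to.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE movie (
        movie_id INTEGER PRIMARY KEY,
        title TEXT,
        release_year INTEGER,
        runtime_minutes INTEGER
    );
    CREATE TABLE award (
        award_id INTEGER PRIMARY KEY,
        -- The "Special Page": every award must point at a real movie.
        movie_id INTEGER REFERENCES movie(movie_id),
        authority TEXT,
        award TEXT,
        won TEXT
    );
""")

conn.execute("INSERT INTO movie VALUES (125, 'CODA', 2021, 111)")
conn.execute(
    "INSERT INTO award VALUES (4569, 125, 'Academy Awards', 'Best Picture', 'yes')"
)

# An award that points at a movie missing from the Movies book is rejected.
try:
    conn.execute(
        "INSERT INTO award VALUES (9999, 999, 'Academy Awards', 'Best Picture', 'no')"
    )
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

print(orphan_rejected)
```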

Our helpful robot, Django

Let’s say we have a robot that has memorized the contents of the big book of Movies and the big book of Awards. (Plus a lot of other fancy stuff that isn’t important for this story.)

Assuming we know how to talk in this robot’s language, we can give the robot instructions on what we want to know, and it will give us an output.

Let’s call this robot Django.

Our First Question (Success!)

Let’s say we want a list of movies that were released in 2021 and have a runtime of 111 minutes. (Note that we only need the big book of Movies for this question.)

Here’s one way to write instructions to give to Django:

results = (Movie.objects
    .filter(release_year=2021)
    .filter(runtime_minutes=111)
)

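The above instructions basically say: start with every movie in the big book of Movies, keep only the ones released in 2021, then keep only the ones with a runtime of 111 minutes.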

Django will give us a nice final output list of movies based on our instructions.

Our Second Question (Confusion…)

Now let’s say we want a list of movies that were nominated for the Academy Award for Best Picture. (Note that we need both big books to answer this question.)

Here are our instructions to Django:

results = (Movie.objects
    .filter(award__authority="Academy Awards")
    .filter(award__award="Best Picture")
)

Django has given us an output list based on our instructions again… but this list has duplicate items!

In other words, our list has “Minari” twice, “Dune” twice, and “CODA” twice.

Um. What the heck, Django!

Django made me facepalm

The big book of Movies only has “Minari” once, so why does an instruction that is “filtering” that list into a SMALLER, MORE LIMITED list give us “Minari” twice??

Why does the same approach of “chaining filters” have different behavior when asking these two questions??

Congratulations. We’ve discovered the Entrance to a Rabbit Hole.

A Peek into Sam’s Brain

(Please feel free to skip to the next section…)

For the project I was working on, I needed to figure out how to give Django instructions so that the output list didn’t have duplicates. I’m also trying to learn more about how Django works. Not to mention, I’m an extremely Curious person.

My personal experience with this Rabbit Hole went something like this:

Should we go down the Rabbit Hole?

If you spend much time doing software engineering, this sort of thing happens ALL THE TIME. It can be both exhilarating and exhausting.

But here’s the thing about Rabbit Holes: you never know how deep they go, you never know how many diverging paths they may have, and you never know whether there will be anything satisfying at the end.

I’m constantly struggling with deciding how deep to go down Rabbit Holes… It’s so hard to know When To Stop Digging.


Footnote: Solution to Question 2

For other Curious persons, you can remove duplicates by passing multiple arguments to the same filter, rather than chaining filters. Like so:

Movie.objects.filter(
    award__authority="Academy Awards",
    award__award="Best Picture"
)
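
For the extra Curious, here is a rough sketch of why this works. When a filter reaches into the Awards book, each chained .filter() call gets its own pass through that book (a separate JOIN), while conditions inside one .filter() call share a single pass. The example below imitates that with hand-written SQL and Python's built-in sqlite3 module. The SQL is an approximation of the shape of the two queries, not Django's literal output, and the extra Best Director row is a hypothetical addition so that the duplication shows up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE movie (movie_id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE award (
        award_id INTEGER PRIMARY KEY,
        movie_id INTEGER REFERENCES movie(movie_id),
        authority TEXT,
        award TEXT
    );
    INSERT INTO movie VALUES (123, 'Minari');
    INSERT INTO award VALUES (4567, 123, 'Academy Awards', 'Best Picture');
    -- Hypothetical second row, added only to make the duplication visible.
    INSERT INTO award VALUES (4570, 123, 'Academy Awards', 'Best Director');
""")

# Chained filters: two separate passes through the Awards book.
# a1 matches both Academy Awards rows, a2 matches the Best Picture row,
# so Minari comes back once per (a1, a2) pairing.
chained_sql = """
    SELECT m.title
    FROM movie m
    JOIN award a1 ON a1.movie_id = m.movie_id
    JOIN award a2 ON a2.movie_id = m.movie_id
    WHERE a1.authority = 'Academy Awards'
      AND a2.award = 'Best Picture'
"""

# One filter with two arguments: a single pass, and both conditions
# must hold for the same award row.
single_sql = """
    SELECT m.title
    FROM movie m
    JOIN award a ON a.movie_id = m.movie_id
    WHERE a.authority = 'Academy Awards'
      AND a.award = 'Best Picture'
"""

chained_titles = [row[0] for row in conn.execute(chained_sql)]
single_titles = [row[0] for row in conn.execute(single_sql)]

print(chained_titles)
print(single_titles)
```

With this sample data, the chained version lists Minari twice and the single-filter version lists it once. A single filter can still repeat a movie if several of its award rows match every condition; Django's .distinct() queryset method is the usual way to handle that case.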