I recently encountered a bug in production. Frankly, it had no impact, but it was pretty weird, and its origins were complex, so I decided to share it. This is also a follow-up of sorts to my previous post on namedtuples.

I’ll walk you through the code we had written, how the problem manifested, and the shenanigans that caused it.

An enum of instances

We use Airflow to run our data pipelines. Some of the tasks we run require large machines, others can make do with very small ones. We wrote a simple interface to manage k8s resource requests.

First, we define a class to hold the memory and CPU we want to request:

from typing import NamedTuple

class PodRequest(NamedTuple):
    memory: int # [MiB]
    cpu: int    # [mCPU]

We make it a NamedTuple for … reasons. It’s worth noting that inheriting from NamedTuple really is just equivalent to creating a new class with

PodRequest = collections.namedtuple("PodRequest", ["memory", "cpu"])

but it’s nicer, because it lets us define methods. This is useful to us: we have some common requests which we don’t want to spell out manually each time, but sometimes we want to customize a request a bit. We defined an Enum to hold the requests we plan on reusing.

from enum import Enum

class CommonPodRequest(PodRequest, Enum):
    SMALL = (512, 1000)
    MEDIUM = (2048, 2000)
    LARGE = (65535, 3000)
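To see what the Enum machinery does with those tuples, here’s a quick self-contained sketch of the definitions so far:

```python
from enum import Enum
from typing import NamedTuple

class PodRequest(NamedTuple):
    memory: int  # [MiB]
    cpu: int     # [mCPU]

class CommonPodRequest(PodRequest, Enum):
    SMALL = (512, 1000)
    MEDIUM = (2048, 2000)
    LARGE = (65535, 3000)

# Each member is a real PodRequest: the tuple assigned to the constant
# is passed to PodRequest's constructor (see footnote 1).
print(CommonPodRequest.SMALL.value)   # PodRequest(memory=512, cpu=1000)
print(CommonPodRequest.SMALL.memory)  # 512
```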

And we add some helper methods to PodRequest to help us customize a CommonPodRequest.

class PodRequest(NamedTuple):
    # ...

    def with_memory(self, new_memory):
        return self._replace(memory=new_memory)

    def with_cpu(self, new_cpu):
        return self._replace(cpu=new_cpu)

request = CommonPodRequest.SMALL.with_memory(1024)

Finally, we have some code that puts together the executor_config dict, an Airflow concept, that will be passed into the k8s API:

def get_executor_config(request: PodRequest):
    return {
        "KubernetesExecutor": {
            "request_cpu": f"{request.cpu / 1000}",
            "request_memory": f"{request.memory}Mi"
            # ...
        }
    }
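For concreteness, here’s roughly what that produces for the SMALL request (assuming the definitions above; the real dict has more keys, elided here):

```python
from typing import NamedTuple

class PodRequest(NamedTuple):
    memory: int  # [MiB]
    cpu: int     # [mCPU]

def get_executor_config(request: PodRequest):
    return {
        "KubernetesExecutor": {
            "request_cpu": f"{request.cpu / 1000}",   # mCPU -> CPUs
            "request_memory": f"{request.memory}Mi",  # MiB
        }
    }

config = get_executor_config(PodRequest(memory=512, cpu=1000))
print(config)
# {'KubernetesExecutor': {'request_cpu': '1.0', 'request_memory': '512Mi'}}
```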

Other than the interface with Airflow, this should all be pretty straightforward. Admittedly, the Enum is more of a container of constants than an exhaustive set of alternatives1, so we’re abusing notation a bit for that sweet, sweet autocomplete.

In hindsight, the interesting thing is that it works exactly as expected! The tasks spin up and the correct memory requests are made to k8s. However, it is flawed. You may want to pause here and see if you can figure out why, especially if you’re a Python programmer.

Picture of a grumpy cat with subtitle "This cat will judge you if you just keep scrolling"

Errors in the scheduler

The problem manifests in the following error message, which is logged in each worker pod2 after the task has already completed as a success3:

AttributeError: 'CommonPodRequest' object has no attribute '_name_'

This message is quite puzzling. Why was the worker trying to access the _name_ attribute? And, given that CommonPodRequest is an Enum, and all Enum members have a _name_ attribute, why do we get an AttributeError here?

The first question is easily answered, though it requires context I’ve not given you, and it’s probably only interesting if you’re into Airflow: the full stack trace shows that the error gets raised when the scheduler tries to serialize the DAG, probably in order to do ✨ XCOM things ✨? We are using a custom subclass of BashOperator to define tasks, which has a PodRequest as one of its attributes. In serializing the DAG, the scheduler had to serialize that PodRequest, which, apparently, was a CommonPodRequest, and so should have had a _name_ attribute. If you know why exactly serialization was happening here, please tell me. Anyways, since this happens after the task is marked as success, the error had no impact on operations; it just pollutes the logs and distracts devs who haven’t encountered it before.

Last chance to figure out what’s going on on your own!

Picture of a grumpy cat with subtitle "This cat will judge you if you just keep scrolling"

How? Why?

The key here is the pair of helpers we defined:

    def with_memory(self, new_memory):
        return self._replace(memory=new_memory)

    def with_cpu(self, new_cpu):
        return self._replace(cpu=new_cpu)

If we try inspecting one of the requests obtained through a helper, we get the same error:

In [16]: CommonPodRequest.SMALL.with_cpu(2000)

AttributeError: 'CommonPodRequest' object has no attribute '_name_'

The problem lies in the _replace method, which is defined inside the namedtuple factory function in the standard library’s collections module.

def namedtuple(...):
    # ...

    tuple_new = tuple.__new__

    @classmethod
    def _make(cls, iterable):
        result = tuple_new(cls, iterable)
        if _len(result) != num_fields:
            raise TypeError(f'Expected {num_fields} arguments, got {_len(result)}')
        return result

    def _replace(self, /, **kwds):
        result = self._make(_map(kwds.pop, field_names, self))  # This is pretty intense hackery, worth meditating on for a moment.
        if kwds:
            raise TypeError(f'Got unexpected field names: {list(kwds)!r}')
        return result

Here’s a breakdown of what’s going on:

  1. field_names is the list ["memory", "cpu"].
  2. CommonPodRequest.SMALL.with_memory(1024) therefore calls self._make(...) on an iterator yielding 1024 and 1000: kwds.pop("memory", 512) returns the new value, while kwds.pop("cpu", 1000) falls back to the old one, because "cpu" isn’t in kwds.
  3. _make is a classmethod, and cls is CommonPodRequest.
  4. So in effect, _replace returns tuple.__new__(CommonPodRequest, [1024, 1000]), which creates an instance of CommonPodRequest without calling Enum.__new__!

This is why the usual Enum attributes, like _name_, aren’t set.
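The whole affair fits in a dozen lines. In this trimmed-down sketch of the classes above, the instance returned by the helper claims to be a CommonPodRequest but has none of the attributes Enum normally sets on its members (even repr-ing it raises, which is why the IPython snippet above blows up):

```python
from enum import Enum
from typing import NamedTuple

class PodRequest(NamedTuple):
    memory: int
    cpu: int

    def with_cpu(self, new_cpu):
        return self._replace(cpu=new_cpu)

class CommonPodRequest(PodRequest, Enum):
    SMALL = (512, 1000)

broken = CommonPodRequest.SMALL.with_cpu(2000)

# tuple.__new__ produced a CommonPodRequest instance...
print(type(broken).__name__)      # CommonPodRequest
# ...but Enum.__new__ never ran, so the member attributes are missing.
print(hasattr(broken, "_name_"))  # False
print(hasattr(CommonPodRequest.SMALL, "_name_"))  # True
```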

Summing up

The lessons from this are straightforward:

  1. namedtuples are evil.
  2. Multiple inheritance is evil…
  3. …especially if the parents don’t share the same metaclass!
  4. Abuse of notation will come back to haunt you. An enum that’s not really an enum, mixed with a tuple that’s not actually a tuple, is just asking for it.
  5. We should have just used module-scoped constants.
  6. Honestly, why namedtuple?

The fix was easy too. Instead of calling _replace on a CommonPodRequest, we turn it into a PodRequest first.

    def with_memory(self, new_memory):
        return PodRequest(*self)._replace(memory=new_memory)

    def with_cpu(self, new_cpu):
        return PodRequest(*self)._replace(cpu=new_cpu)
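With the downcast in place, _make sees cls == PodRequest, so the helpers hand back a plain, fully-initialized PodRequest. A self-contained sketch of the fixed classes, trimmed down:

```python
from enum import Enum
from typing import NamedTuple

class PodRequest(NamedTuple):
    memory: int
    cpu: int

    def with_memory(self, new_memory):
        # Downcast to a plain PodRequest before _replace, so _make
        # constructs a PodRequest instead of a half-built enum member.
        return PodRequest(*self)._replace(memory=new_memory)

class CommonPodRequest(PodRequest, Enum):
    SMALL = (512, 1000)

fixed = CommonPodRequest.SMALL.with_memory(1024)
print(fixed)                 # PodRequest(memory=1024, cpu=1000)
print(type(fixed).__name__)  # PodRequest
```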

CommonPodRequest is a tuple, after all. Hope you enjoyed!


  1. In case you need a reminder on how Python Enums mix with inheritance: The tuple assigned to the enum constant gets passed as *args to the constructor of the first parent class (Enum must always come last). So in the example, CommonPodRequest.SMALL will have a value of PodRequest(memory=512, cpu=1000).

  2. A worker pod is a k8s pod spun up by Airflow’s KubernetesExecutor in order to do actual useful compute. 

  3. Full error message omitted because spinning up Airflow is a faff, actually.