Python’s operator.itemgetter is quite versatile. It works on pretty much any iterables and
map-like objects and allows you to fetch elements from them. The following snippet shows how
you can use it to sort a list of tuples by the first element of the tuple:
In [2]: from operator import itemgetter
...:
...: l = [(10, 9), (1, 3), (4, 8), (0, 55), (6, 7)]
...: l_sorted = sorted(l, key=itemgetter(0))
In [3]: l_sorted
Out[3]: [(0, 55), (1, 3), (4, 8), (6, 7), (10, 9)]
Here, the itemgetter callable is doing the work of selecting the first element of every
tuple inside the list and then the sorted function is using those values to sort the
elements. Also, this is faster than using a lambda function and passing that to the key
parameter to do the sorting:
In [6]: from operator import itemgetter
In [7]: l = [(10, 9), (1, 3), (4, 8), (0, 55), (6, 7)]
In [8]: %timeit sorted(l, key=itemgetter(0))
386 ns ± 4.2 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)
In [9]: %timeit sorted(l, key=lambda x: x[0])
498 ns ± 0.444 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)
You can also use itemgetter to extract multiple values from a dictionary in a single pass.
Consider this example:
In [13]: from operator import itemgetter
In [14]: d = {'foo': 31, 'bar': 12, 'baz': 42, 'chez': 83, 'moi': 24}
In [15]: foo, bar, bazz = itemgetter('foo', 'bar', 'baz')(d)
In [16]: foo, bar, bazz
Out[16]: (31, 12, 42)
So, instead of extracting the key-value pairs with d['foo'], d['bar'], ..., itemgetter
allows us to make it DRY. The source code of the callable is freakishly simple. Here’s the
entire thing:
# operator.py
class itemgetter:
"""
Return a callable object that fetches the given item(s) from
its operand.
After f = itemgetter(2), the call f(r) returns r[2].
After g = itemgetter(2, 5, 3), the call g(r)
returns (r[2], r[5], r[3])
"""
__slots__ = ("_items", "_call")
def __init__(self, item, *items):
if not items:
self._items = (item,)
def func(obj):
return obj[item]
self._call = func
else:
self._items = items = (item,) + items
def func(obj):
return tuple(obj[i] for i in items)
self._call = func
def __call__(self, obj):
return self._call(obj)
def __repr__(self):
return "%s.%s(%s)" % (
self.__class__.__module__,
self.__class__.__name__,
", ".join(map(repr, self._items)),
)
def __reduce__(self):
return self.__class__, self._items
While this is all good and dandy, itemgetter will raise a KeyError if it can’t find the
corresponding value against a key in a map or an IndexError if the provided index is
outside of the range of the sequence. This is how it looks in a dict:
In [1]: from operator import itemgetter
In [2]: d = {'foo': 31, 'bar': 12, 'baz': 42, 'chez': 83, 'moi': 24}
In [3]: # These keys don't exist.
In [4]: fizz, up = itemgetter('fiz', 'up')(d)
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
Input In [4], in <cell line: 1>()
----> 1 fizz, up = itemgetter('fiz', 'up')(d)
KeyError: 'fiz'
In the above snippet, itemgetter can’t find the key fiz in the dict d and it complains
when we try to fetch the value against it. In a sequence, the error looks like this:
In [5]: from operator import itemgetter
In [6]: l = [(10, 9), (1, 3), (4, 8), (0, 55), (6, 7)]
In [7]: # These indices don't exist.
In [8]: item_42, item_50 = itemgetter(42, 50)(l)
---------------------------------------------------------------------------
IndexError Traceback (most recent call last)
Input In [8], in <cell line: 1>()
----> 1 item_42, item_50 = itemgetter(42, 50)(l)
IndexError: list index out of range
A more tolerant version of ‘operator.itemgetter’
I wanted something that works similar to itemgetter but doesn’t raise these exceptions
when it can’t find the key in a dict or the index of a sequence is out of range. Instead,
it’d return a default value when these exceptions occur. So, to avoid KeyError in a map,
it’d use d.get(key, default) instead of d[key] to fetch the value. Similarly, in a
sequence, it’d first compare the length of the sequence with the index and return a default
value if the index is out of range.
Since operator.itemgetter is a class, we could inherit it and overwrite the __init__
method. However, your type-checker will complain if you do so. That’s because, in the stub
file, the itemgetter class is decorated with the typing.final decorator and isn’t meant
to be subclassed. So, our only option is to rewrite it. The good news is that this
implementation is quite terse just like the original. Here it goes:
# src.py
from collections.abc import Mapping
class _Nothing:
"""Works as a sentinel value."""
def __repr__(self):
return "<NOTHING>"
_NOTHING = _Nothing()
class safe_itemgetter:
"""
Return a callable object that fetches the given item(s)
from its operand.
"""
__slots__ = ("_items", "_call")
def __init__(self, item, *items, default=_NOTHING):
if not items:
self._items = (item,)
def func(obj):
if isinstance(obj, Mapping):
return obj.get(item, default)
if (item > 0 and len(obj) <= item) or (
item < 0 and len(obj) < abs(item)
):
return default
return obj[item]
self._call = func
else:
self._items = items = (item,) + items
def func(obj):
if isinstance(obj, Mapping):
get = obj.get # Reduce attibute search call.
return tuple(get(i, default) for i in items)
return tuple(
default
if (i > 0 and len(obj) <= i)
or (i < 0 and len(obj) < abs(i))
else obj[i]
for i in items
)
self._call = func
# ----------------- same as operator.itemgetter --------------#
def __call__(self, obj):
return self._call(obj)
def __repr__(self):
return "%s.%s(%s)" % (
self.__class__.__module__,
self.__class__.__name__,
", ".join(map(repr, self._items)),
)
def __reduce__(self):
return self.__class__, self._items
This class behaves almost the same way as the original itemgetter function. The only
difference is that you can pass a default value to return instead of raising
KeyError/IndexError depending on the type of the container. Let’s try it out with a dict:
In [12]: d = {'foo': 31, 'bar': 12, 'baz': 42, 'chez': 83, 'moi': 24}
In [13]: safe_itemgetter(-5, -3, -33, 'baz', 1)(d)
Out[13]: (<NOTHING>, <NOTHING>, <NOTHING>, 42, <NOTHING>)
Here, we’re trying to access a bunch of keys that don’t exist in the dict d and we want to
do this without raising any exceptions. You can see that instead of raising an exception,
safe_itemgetter returns a tuple containing the value(s) that it can find and the rest of
the positions are filled with the default value; in this case, the <NOTHING> sentinel.
We can pass any default value there:
In[14]: safe_itemgetter(-5, -3, -33, "baz", 1, default="default")(d)
Out[14]: ("default", "default", "default", 42, "default")
This works similarly when a sequence is passed:
In[18]: l = [(10, 9), (1, 3), (4, 8), (0, 55), (6, 7)]
In[19]: safe_itemgetter(-11, default=())(l)
Out[19]: ()
This returns an empty tuple when the sequence index is out of range. It works with multiple indices as well:
In [28]: l = [(10, 9), (1, 3), (4, 8), (0, 55), (6, 7)]
In [29]: safe_itemgetter(-1, -3, -7, 1)(l)
Out[29]: ((6, 7), (4, 8), <NOTHING>, (1, 3))
Recent posts
- Revisiting interface segregation in Go
- Avoiding collisions in Go context keys
- Organizing Go tests
- Subtest grouping in Go
- Let the domain guide your application structure
- Test state, not interactions
- Early return and goroutine leak
- Lifecycle management in Go tests
- Gateway pattern for external service calls
- Flags for discoverable test config in Go