How to merge two dictionaries in a single expression?

Carl Meyer Source

I have two Python dictionaries, and I want to write a single expression that returns these two dictionaries, merged. The update() method would be what I need, if it returned its result instead of modifying a dict in-place.

>>> x = {'a':1, 'b': 2}
>>> y = {'b':10, 'c': 11}
>>> z = x.update(y)
>>> print(z)
None
>>> x
{'a': 1, 'b': 10, 'c': 11}

How can I get that final merged dict in z, not x?

(To be extra-clear, the last-one-wins conflict-handling of dict.update() is what I'm looking for as well.)

pythondictionarymerge

Answers

answered 10 years ago Greg Hewgill #1

x = {'a':1, 'b': 2}
y = {'b':10, 'c': 11}
z = dict(x.items() + y.items())
print z

For items with keys in both dictionaries ('b'), you can control which one ends up in the output by putting that one last.

answered 10 years ago Thomas Vander Stichele #2

In your case, what you can do is:

z = dict(x.items() + y.items())

This will, as you want it, put the final dict in z, and make the value for key b be properly overridden by the second (y) dict's value:

>>> x = {'a':1, 'b': 2}
>>> y = {'b':10, 'c': 11}
>>> z = dict(x.items() + y.items())
>>> z
{'a': 1, 'c': 11, 'b': 10}

If you use Python 3, it is only a little more complicated. To create z:

>>> z = dict(list(x.items()) + list(y.items()))
>>> z
{'a': 1, 'c': 11, 'b': 10}

answered 10 years ago Matthew Schinckel #3

An alternative:

z = x.copy()
z.update(y)

answered 10 years ago Carl Meyer #4

Another, more concise, option:

z = dict(x, **y)

Note: this has become a popular answer, but it is important to point out that if y has any non-string keys, the fact that this works at all is an abuse of a CPython implementation detail, and it does not work in Python 3, or in PyPy, IronPython, or Jython. Also, Guido is not a fan. So I can't recommend this technique for forward-compatible or cross-implementation portable code, which really means it should be avoided entirely.

answered 10 years ago rcreswick #5

I wanted something similar, but with the ability to specify how the values on duplicate keys were merged, so I hacked this out (but did not heavily test it). Obviously this is not a single expression, but it is a single function call.

def merge(d1, d2, merge_fn=lambda x,y:y):
    """
    Merges two dictionaries, non-destructively, combining 
    values on duplicate keys as defined by the optional merge
    function.  The default behavior replaces the values in d1
    with corresponding values in d2.  (There is no other generally
    applicable merge strategy, but often you'll have homogeneous 
    types in your dicts, so specifying a merge technique can be 
    valuable.)

    Examples:

    >>> d1
    {'a': 1, 'c': 3, 'b': 2}
    >>> merge(d1, d1)
    {'a': 1, 'c': 3, 'b': 2}
    >>> merge(d1, d1, lambda x,y: x+y)
    {'a': 2, 'c': 6, 'b': 4}

    """
    result = dict(d1)
    for k,v in d2.iteritems():
        if k in result:
            result[k] = merge_fn(result[k], v)
        else:
            result[k] = v
    return result

answered 10 years ago Tony Meyer #6

This probably won't be a popular answer, but you almost certainly do not want to do this. If you want a copy that's a merge, then use copy (or deepcopy, depending on what you want) and then update. The two lines of code are much more readable - more Pythonic - than the single line creation with .items() + .items(). Explicit is better than implicit.

In addition, when you use .items() (pre Python 3.0), you're creating a new list that contains the items from the dict. If your dictionaries are large, then that is quite a lot of overhead (two large lists that will be thrown away as soon as the merged dict is created). update() can work more efficiently, because it can run through the second dict item-by-item.

In terms of time:

>>> timeit.Timer("dict(x, **y)", "x = dict(zip(range(1000), range(1000)))\ny=dict(zip(range(1000,2000), range(1000,2000)))").timeit(100000)
15.52571702003479
>>> timeit.Timer("temp = x.copy()\ntemp.update(y)", "x = dict(zip(range(1000), range(1000)))\ny=dict(zip(range(1000,2000), range(1000,2000)))").timeit(100000)
15.694622993469238
>>> timeit.Timer("dict(x.items() + y.items())", "x = dict(zip(range(1000), range(1000)))\ny=dict(zip(range(1000,2000), range(1000,2000)))").timeit(100000)
41.484580039978027

IMO the tiny slowdown between the first two is worth it for the readability. In addition, keyword arguments for dictionary creation was only added in Python 2.3, whereas copy() and update() will work in older versions.

answered 10 years ago zaphod #7

In a follow-up answer, you asked about the relative performance of these two alternatives:

z1 = dict(x.items() + y.items())
z2 = dict(x, **y)

On my machine, at least (a fairly ordinary x86_64 running Python 2.5.2), alternative z2 is not only shorter and simpler but also significantly faster. You can verify this for yourself using the timeit module that comes with Python.

Example 1: identical dictionaries mapping 20 consecutive integers to themselves:

% python -m timeit -s 'x=y=dict((i,i) for i in range(20))' 'z1=dict(x.items() + y.items())'
100000 loops, best of 3: 5.67 usec per loop
% python -m timeit -s 'x=y=dict((i,i) for i in range(20))' 'z2=dict(x, **y)' 
100000 loops, best of 3: 1.53 usec per loop

z2 wins by a factor of 3.5 or so. Different dictionaries seem to yield quite different results, but z2 always seems to come out ahead. (If you get inconsistent results for the same test, try passing in -r with a number larger than the default 3.)

Example 2: non-overlapping dictionaries mapping 252 short strings to integers and vice versa:

% python -m timeit -s 'from htmlentitydefs import codepoint2name as x, name2codepoint as y' 'z1=dict(x.items() + y.items())'
1000 loops, best of 3: 260 usec per loop
% python -m timeit -s 'from htmlentitydefs import codepoint2name as x, name2codepoint as y' 'z2=dict(x, **y)'               
10000 loops, best of 3: 26.9 usec per loop

z2 wins by about a factor of 10. That's a pretty big win in my book!

After comparing those two, I wondered if z1's poor performance could be attributed to the overhead of constructing the two item lists, which in turn led me to wonder if this variation might work better:

from itertools import chain
z3 = dict(chain(x.iteritems(), y.iteritems()))

A few quick tests, e.g.

% python -m timeit -s 'from itertools import chain; from htmlentitydefs import codepoint2name as x, name2codepoint as y' 'z3=dict(chain(x.iteritems(), y.iteritems()))'
10000 loops, best of 3: 66 usec per loop

lead me to conclude that z3 is somewhat faster than z1, but not nearly as fast as z2. Definitely not worth all the extra typing.

This discussion is still missing something important, which is a performance comparison of these alternatives with the "obvious" way of merging two lists: using the update method. To try to keep things on an equal footing with the expressions, none of which modify x or y, I'm going to make a copy of x instead of modifying it in-place, as follows:

z0 = dict(x)
z0.update(y)

A typical result:

% python -m timeit -s 'from htmlentitydefs import codepoint2name as x, name2codepoint as y' 'z0=dict(x); z0.update(y)'
10000 loops, best of 3: 26.9 usec per loop

In other words, z0 and z2 seem to have essentially identical performance. Do you think this might be a coincidence? I don't....

In fact, I'd go so far as to claim that it's impossible for pure Python code to do any better than this. And if you can do significantly better in a C extension module, I imagine the Python folks might well be interested in incorporating your code (or a variation on your approach) into the Python core. Python uses dict in lots of places; optimizing its operations is a big deal.

You could also write this as

z0 = x.copy()
z0.update(y)

as Tony does, but (not surprisingly) the difference in notation turns out not to have any measurable effect on performance. Use whichever looks right to you. Of course, he's absolutely correct to point out that the two-statement version is much easier to understand.

answered 8 years ago driax #8

The best version I could think while not using copy would be:

from itertools import chain
x = {'a':1, 'b': 2}
y = {'b':10, 'c': 11}
dict(chain(x.iteritems(), y.iteritems()))

It's faster than dict(x.items() + y.items()) but not as fast as n = copy(a); n.update(b), at least on CPython. This version also works in Python 3 if you change iteritems() to items(), which is automatically done by the 2to3 tool.

Personally I like this version best because it describes fairly good what I want in a single functional syntax. The only minor problem is that it doesn't make completely obvious that values from y takes precedence over values from x, but I don't believe it's difficult to figure that out.

answered 7 years ago phobie #9

While the question has already been answered several times, this simple solution to the problem has not been listed yet.

x = {'a':1, 'b': 2}
y = {'b':10, 'c': 11}
z4 = {}
z4.update(x)
z4.update(y)

It is as fast as z0 and the evil z2 mentioned above, but easy to understand and change.

answered 7 years ago EMS #10

If you think lambdas are evil then read no further. As requested, you can write the fast and memory-efficient solution with one expression:

x = {'a':1, 'b':2}
y = {'b':10, 'c':11}
z = (lambda a, b: (lambda a_copy: a_copy.update(b) or a_copy)(a.copy()))(x, y)
print z
{'a': 1, 'c': 11, 'b': 10}
print x
{'a': 1, 'b': 2}

As suggested above, using two lines or writing a function is probably a better way to go.

answered 7 years ago Stan #11

Recursively/deep update a dict

def deepupdate(original, update):
    """
    Recursively update a dict.
    Subdict's won't be overwritten but also updated.
    """
    for key, value in original.iteritems(): 
        if key not in update:
            update[key] = value
        elif isinstance(value, dict):
            deepupdate(value, update[key]) 
    return update

Demonstration:

pluto_original = {
    'name': 'Pluto',
    'details': {
        'tail': True,
        'color': 'orange'
    }
}

pluto_update = {
    'name': 'Pluutoo',
    'details': {
        'color': 'blue'
    }
}

print deepupdate(pluto_original, pluto_update)

Outputs:

{
    'name': 'Pluutoo',
    'details': {
        'color': 'blue',
        'tail': True
    }
}

Thanks rednaw for edits.

answered 6 years ago Thanh Lim #12

Even though the answers were good for this shallow dictionary, none of the methods defined here actually do a deep dictionary merge.

Examples follow:

a = { 'one': { 'depth_2': True }, 'two': True }
b = { 'one': { 'extra': False } }
print dict(a.items() + b.items())

One would expect a result of something like this:

{ 'one': { 'extra': False', 'depth_2': True }, 'two': True }

Instead, we get this:

{'two': True, 'one': {'extra': False}}

The 'one' entry should have had 'depth_2' and 'extra' as items inside its dictionary if it truly was a merge.

Using chain also, does not work:

from itertools import chain
print dict(chain(a.iteritems(), b.iteritems()))

Results in:

{'two': True, 'one': {'extra': False}}

The deep merge that rcwesick gave also creates the same result.

Yes, it will work to merge the sample dictionaries, but none of them are a generic mechanism to merge. I'll update this later once I write a method that does a true merge.

answered 6 years ago Sam Watkins #13

def dict_merge(a, b):
  c = a.copy()
  c.update(b)
  return c

new = dict_merge(old, extras)

Among such shady and dubious answers, this shining example is the one and only good way to merge dicts in Python, endorsed by dictator for life Guido van Rossum himself! Someone else suggested half of this, but did not put it in a function.

print dict_merge(
      {'color':'red', 'model':'Mini'},
      {'model':'Ferrari', 'owner':'Carl'})

gives:

{'color': 'red', 'owner': 'Carl', 'model': 'Ferrari'}

answered 6 years ago Mathieu Larose #14

Two dictionaries

def union2(dict1, dict2):
    return dict(list(dict1.items()) + list(dict2.items()))

n dictionaries

def union(*dicts):
    return dict(itertools.chain.from_iterable(dct.items() for dct in dicts))

sum has bad performance. See https://mathieularose.com/how-not-to-flatten-a-list-of-lists-in-python/

answered 5 years ago Raymond Hettinger #15

In Python 3, you can use collections.ChainMap which groups multiple dicts or other mappings together to create a single, updateable view:

>>> from collections import ChainMap
>>> x = {'a':1, 'b': 2}
>>> y = {'b':10, 'c': 11}
>>> z = ChainMap({}, y, x)
>>> for k, v in z.items():
        print(k, '-->', v)

a --> 1
b --> 10
c --> 11

answered 5 years ago octoback #16

Using a dict comprehension, you may

x = {'a':1, 'b': 2}
y = {'b':10, 'c': 11}

dc = {xi:(x[xi] if xi not in list(y.keys()) 
           else y[xi]) for xi in list(x.keys())+(list(y.keys()))}

gives

>>> dc
{'a': 1, 'c': 11, 'b': 10}

Note the syntax for if else in comprehension

{ (some_key if condition else default_key):(something_if_true if condition 
          else something_if_false) for key, value in dict_.items() }

answered 5 years ago user2571462 #17

Here is some code, it seems to work ok:

def merge(d1, d2, mode=0):
    if not type(d2) is dict:
        raise Exception("d2 is not a dict")

    if not type(d1) is dict:
        if mode == 0:
            raise Exception("d1 is not a dict")
        return d2

    result = dict(d1)

    for k, v in d2.iteritems():
        if k in result and type(v) is dict:
            result[k] = merge(result[k], v, 1)
        else:
            if mode == 1:
                result.update(d2)
            else:
                result[k] = v
    return result

answered 5 years ago Bijou Trouvaille #18

Drawing on ideas here and elsewhere I've comprehended a function:

def merge(*dicts, **kv): 
      return { k:v for d in list(dicts) + [kv] for k,v in d.items() }

Usage (tested in python 3):

assert (merge({1:11,'a':'aaa'},{1:99, 'b':'bbb'},foo='bar')==\
    {1: 99, 'foo': 'bar', 'b': 'bbb', 'a': 'aaa'})

assert (merge(foo='bar')=={'foo': 'bar'})

assert (merge({1:11},{1:99},foo='bar',baz='quux')==\
    {1: 99, 'foo': 'bar', 'baz':'quux'})

assert (merge({1:11},{1:99})=={1: 99})

You could use a lambda instead.

answered 5 years ago Claudiu #19

Abuse leading to a one-expression solution for Matthew's answer:

>>> x = {'a':1, 'b': 2}
>>> y = {'b':10, 'c': 11}
>>> z = (lambda f=x.copy(): (f.update(y), f)[1])()
>>> z
{'a': 1, 'c': 11, 'b': 10}

You said you wanted one expression, so I abused lambda to bind a name, and tuples to override lambda's one-expression limit. Feel free to cringe.

You could also do this of course if you don't care about copying it:

>>> x = {'a':1, 'b': 2}
>>> y = {'b':10, 'c': 11}
>>> z = (x.update(y), x)[1]
>>> z
{'a': 1, 'b': 10, 'c': 11}

answered 5 years ago quodlibetor #20

** creates an intermediary dict, which means that the total number of copies is actually higher doing the dict(one, **two) form, but all that happens in C so it's still generally faster than going to itertools, unless there are a huge number of copies (or, probably, if the copies are very expensive). As always if you actually care about speed you should time your use case.

Timing on Python 2.7.3 with an empty dict:

$ python -m timeit "dict({}, **{})"
1000000 loops, best of 3: 0.405 usec per loop

$ python -m timeit -s "from itertools import chain" \
    "dict(chain({}.iteritems(), {}.iteritems()))"
1000000 loops, best of 3: 1.18 usec per loop

With 10,000 (tiny) items:

$ python -m timeit -s 'd = {i: str(i) for i in xrange(10000)}' \
    "dict(d, **d)"
1000 loops, best of 3: 550 usec per loop

$ python -m timeit -s "from itertools import chain" -s 'd = {i: str(i) for i in xrange(10000)}' \
    "dict(chain(d.iteritems(), d.iteritems()))"
1000 loops, best of 3: 1.11 msec per loop

With 100,000 items:

$ python -m timeit -s 'd = {i: str(i) for i in xrange(100000)}' \
    "dict(d, **d)"
10 loops, best of 3: 19.6 msec per loop

$ python -m timeit -s "from itertools import chain" -s 'd = {i: str(i) for i in xrange(100000)}' \
    "dict(chain(d.iteritems(), d.iteritems()))"
10 loops, best of 3: 20.1 msec per loop

With 1,000,000 items:

$ python -m timeit -s 'd = {i: str(i) for i in xrange(1000000)}' \
    "dict(d, **d)"
10 loops, best of 3: 273 msec per loop

$ python -m timeit -s "from itertools import chain" -s 'd = {i: str(i) for i in xrange(1000000)}' \
    "dict(chain(d.iteritems(), d.iteritems()))"
10 loops, best of 3: 233 msec per loop

answered 5 years ago beardc #21

In python3, the items method no longer returns a list, but rather a view, which acts like a set. In this case you'll need to take the set union since concatenating with + won't work:

dict(x.items() | y.items())

For python3-like behavior in version 2.7, the viewitems method should work in place of items:

dict(x.viewitems() | y.viewitems())

I prefer this notation anyways since it seems more natural to think of it as a set union operation rather than concatenation (as the title shows).

Edit:

A couple more points for python 3. First, note that the dict(x, **y) trick won't work in python 3 unless the keys in y are strings.

Also, Raymond Hettinger's Chainmap answer is pretty elegant, since it can take an arbitrary number of dicts as arguments, but from the docs it looks like it sequentially looks through a list of all the dicts for each lookup:

Lookups search the underlying mappings successively until a key is found.

This can slow you down if you have a lot of lookups in your application:

In [1]: from collections import ChainMap
In [2]: from string import ascii_uppercase as up, ascii_lowercase as lo; x = dict(zip(lo, up)); y = dict(zip(up, lo))
In [3]: chainmap_dict = ChainMap(y, x)
In [4]: union_dict = dict(x.items() | y.items())
In [5]: timeit for k in union_dict: union_dict[k]
100000 loops, best of 3: 2.15 µs per loop
In [6]: timeit for k in chainmap_dict: chainmap_dict[k]
10000 loops, best of 3: 27.1 µs per loop

So about an order of magnitude slower for lookups. I'm a fan of Chainmap, but looks less practical where there may be many lookups.

answered 5 years ago John La Rooy #22

>>> x = {'a':1, 'b': 2}
>>> y = {'b':10, 'c': 11}
>>> x, z = dict(x), x.update(y) or x
>>> x
{'a': 1, 'b': 2}
>>> y
{'c': 11, 'b': 10}
>>> z
{'a': 1, 'c': 11, 'b': 10}

answered 5 years ago upandacross #23

The problem I have with solutions listed to date is that, in the merged dictionary, the value for key "b" is 10 but, to my way of thinking, it should be 12. In that light, I present the following:

import timeit

n=100000
su = """
x = {'a':1, 'b': 2}
y = {'b':10, 'c': 11}
"""

def timeMerge(f,su,niter):
    print "{:4f} sec for: {:30s}".format(timeit.Timer(f,setup=su).timeit(n),f)

timeMerge("dict(x, **y)",su,n)
timeMerge("x.update(y)",su,n)
timeMerge("dict(x.items() + y.items())",su,n)
timeMerge("for k in y.keys(): x[k] = k in x and x[k]+y[k] or y[k] ",su,n)

#confirm for loop adds b entries together
x = {'a':1, 'b': 2}
y = {'b':10, 'c': 11}
for k in y.keys(): x[k] = k in x and x[k]+y[k] or y[k]
print "confirm b elements are added:",x

Results:

0.049465 sec for: dict(x, **y)
0.033729 sec for: x.update(y)                   
0.150380 sec for: dict(x.items() + y.items())   
0.083120 sec for: for k in y.keys(): x[k] = k in x and x[k]+y[k] or y[k]

confirm b elements are added: {'a': 1, 'c': 11, 'b': 12}

answered 5 years ago thiruvenkadam #24

I have a solution which is not specified here(Man I LOVE python) :-)

z = {}
z.update(x) or z.update(y)

This will not update x as well as y. Performance? I don't think it will be terribly slow :-)

NOTE: It is supposed to be 'or' operation and not 'and' operation. Edited to correct the code.

answered 5 years ago GetFree #25

It's so silly that .update returns nothing.
I just use a simple helper function to solve the problem:

def merge(dict1,*dicts):
    for dict2 in dicts:
        dict1.update(dict2)
    return dict1

Examples:

merge(dict1,dict2)
merge(dict1,dict2,dict3)
merge(dict1,dict2,dict3,dict4)
merge({},dict1,dict2)  # this one returns a new copy

answered 4 years ago bassounds #26

A union of the OP's two dictionaries would be something like:

{'a': 1, 'b': 2, 10, 'c': 11}

Specifically, the union of two entities(x and y) contains all the elements of x and/or y. Unfortunately, what the OP asks for is not a union, despite the title of the post.

My code below is neither elegant nor a one-liner, but I believe it is consistent with the meaning of union.

From the OP's example:

x = {'a':1, 'b': 2}
y = {'b':10, 'c': 11}

z = {}
for k, v in x.items():
    if not k in z:
        z[k] = [(v)]
    else:
        z[k].append((v))
for k, v in y.items():
    if not k in z:
        z[k] = [(v)]
    else:
        z[k].append((v))

{'a': [1], 'b': [2, 10], 'c': [11]}

Whether one wants lists could be changed, but the above will work if a dictionary contains lists (and nested lists) as values in either dictionary.

answered 4 years ago Aaron Hall #27

How can I merge two Python dictionaries in a single expression?

For dictionaries x and y, z becomes a merged dictionary with values from y replacing those from x.

  • In Python 3.5 or greater, :

    z = {**x, **y}
    w = {'foo': 'bar', 'baz': 'qux', **y}  # merge a dict with literal values
    
  • In Python 2, (or 3.4 or lower) write a function:

    def merge_two_dicts(x, y):
        z = x.copy()   # start with x's keys and values
        z.update(y)    # modifies z with y's keys and values & returns None
        return z
    

    and

    z = merge_two_dicts(x, y)
    

Explanation

Say you have two dicts and you want to merge them into a new dict without altering the original dicts:

x = {'a': 1, 'b': 2}
y = {'b': 3, 'c': 4}

The desired result is to get a new dictionary (z) with the values merged, and the second dict's values overwriting those from the first.

>>> z
{'a': 1, 'b': 3, 'c': 4}

A new syntax for this, proposed in PEP 448 and available as of Python 3.5, is

z = {**x, **y}

And it is indeed a single expression. It is now showing as implemented in the release schedule for 3.5, PEP 478, and it has now made its way into What's New in Python 3.5 document.

However, since many organizations are still on Python 2, you may wish to do this in a backwards compatible way. The classically Pythonic way, available in Python 2 and Python 3.0-3.4, is to do this as a two-step process:

z = x.copy()
z.update(y) # which returns None since it mutates z

In both approaches, y will come second and its values will replace x's values, thus 'b' will point to 3 in our final result.

Not yet on Python 3.5, but want a single expression

If you are not yet on Python 3.5, or need to write backward-compatible code, and you want this in a single expression, the most performant while correct approach is to put it in a function:

def merge_two_dicts(x, y):
    """Given two dicts, merge them into a new dict as a shallow copy."""
    z = x.copy()
    z.update(y)
    return z

and then you have a single expression:

z = merge_two_dicts(x, y)

You can also make a function to merge an undefined number of dicts, from zero to a very large number:

def merge_dicts(*dict_args):
    """
    Given any number of dicts, shallow copy and merge into a new dict,
    precedence goes to key value pairs in latter dicts.
    """
    result = {}
    for dictionary in dict_args:
        result.update(dictionary)
    return result

This function will work in Python 2 and 3 for all dicts. e.g. given dicts a to g:

z = merge_dicts(a, b, c, d, e, f, g) 

and key value pairs in g will take precedence over dicts a to f, and so on.

Critiques of Other Answers

Don't use what you see in the formerly accepted answer:

z = dict(x.items() + y.items())

In Python 2, you create two lists in memory for each dict, create a third list in memory with length equal to the length of the first two put together, and then discard all three lists to create the dict. In Python 3, this will fail because you're adding two dict_items objects together, not two lists -

>>> c = dict(a.items() + b.items())
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unsupported operand type(s) for +: 'dict_items' and 'dict_items'

and you would have to explicitly create them as lists, e.g. z = dict(list(x.items()) + list(y.items())). This is a waste of resources and computation power.

Similarly, taking the union of items() in Python 3 (viewitems() in Python 2.7) will also fail when values are unhashable objects (like lists, for example). Even if your values are hashable, since sets are semantically unordered, the behavior is undefined in regards to precedence. So don't do this:

>>> c = dict(a.items() | b.items())

This example demonstrates what happens when values are unhashable:

>>> x = {'a': []}
>>> y = {'b': []}
>>> dict(x.items() | y.items())
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unhashable type: 'list'

Here's an example where y should have precedence, but instead the value from x is retained due to the arbitrary order of sets:

>>> x = {'a': 2}
>>> y = {'a': 1}
>>> dict(x.items() | y.items())
{'a': 2}

Another hack you should not use:

z = dict(x, **y)

This uses the dict constructor, and is very fast and memory efficient (even slightly more-so than our two-step process) but unless you know precisely what is happening here (that is, the second dict is being passed as keyword arguments to the dict constructor), it's difficult to read, it's not the intended usage, and so it is not Pythonic.

Here's an example of the usage being remediated in django.

Dicts are intended to take hashable keys (e.g. frozensets or tuples), but this method fails in Python 3 when keys are not strings.

>>> c = dict(a, **b)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: keyword arguments must be strings

From the mailing list, Guido van Rossum, the creator of the language, wrote:

I am fine with declaring dict({}, **{1:3}) illegal, since after all it is abuse of the ** mechanism.

and

Apparently dict(x, **y) is going around as "cool hack" for "call x.update(y) and return x". Personally I find it more despicable than cool.

It is my understanding (as well as the understanding of the creator of the language) that the intended usage for dict(**y) is for creating dicts for readability purposes, e.g.:

dict(a=1, b=10, c=11)

instead of

{'a': 1, 'b': 10, 'c': 11}

Response to comments

Despite what Guido says, dict(x, **y) is in line with the dict specification, which btw. works for both Python 2 and 3. The fact that this only works for string keys is a direct consequence of how keyword parameters work and not a short-comming of dict. Nor is using the ** operator in this place an abuse of the mechanism, in fact ** was designed precisely to pass dicts as keywords.

Again, it doesn't work for 3 when keys are non-strings. The implicit calling contract is that namespaces take ordinary dicts, while users must only pass keyword arguments that are strings. All other callables enforced it. dict broke this consistency in Python 2:

>>> foo(**{('a', 'b'): None})
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: foo() keywords must be strings
>>> dict(**{('a', 'b'): None})
{('a', 'b'): None}

This inconsistency was bad given other implementations of Python (Pypy, Jython, IronPython). Thus it was fixed in Python 3, as this usage could be a breaking change.

I submit to you that it is malicious incompetence to intentionally write code that only works in one version of a language or that only works given certain arbitrary constraints.

Another comment:

dict(x.items() + y.items()) is still the most readable solution for Python 2. Readability counts.

My response: merge_two_dicts(x, y) actually seems much clearer to me, if we're actually concerned about readability. And it is not forward compatible, as Python 2 is increasingly deprecated.

Less Performant But Correct Ad-hocs

These approaches are less performant, but they will provide correct behavior. They will be much less performant than copy and update or the new unpacking because they iterate through each key-value pair at a higher level of abstraction, but they do respect the order of precedence (latter dicts have precedence)

You can also chain the dicts manually inside a dict comprehension:

{k: v for d in dicts for k, v in d.items()} # iteritems in Python 2.7

or in python 2.6 (and perhaps as early as 2.4 when generator expressions were introduced):

dict((k, v) for d in dicts for k, v in d.items())

itertools.chain will chain the iterators over the key-value pairs in the correct order:

import itertools
z = dict(itertools.chain(x.iteritems(), y.iteritems()))

Performance Analysis

I'm only going to do the performance analysis of the usages known to behave correctly.

import timeit

The following is done on Ubuntu 14.04

In Python 2.7 (system Python):

>>> min(timeit.repeat(lambda: merge_two_dicts(x, y)))
0.5726828575134277
>>> min(timeit.repeat(lambda: {k: v for d in (x, y) for k, v in d.items()} ))
1.163769006729126
>>> min(timeit.repeat(lambda: dict(itertools.chain(x.iteritems(), y.iteritems()))))
1.1614501476287842
>>> min(timeit.repeat(lambda: dict((k, v) for d in (x, y) for k, v in d.items())))
2.2345519065856934

In Python 3.5 (deadsnakes PPA):

>>> min(timeit.repeat(lambda: {**x, **y}))
0.4094954460160807
>>> min(timeit.repeat(lambda: merge_two_dicts(x, y)))
0.7881555100320838
>>> min(timeit.repeat(lambda: {k: v for d in (x, y) for k, v in d.items()} ))
1.4525277839857154
>>> min(timeit.repeat(lambda: dict(itertools.chain(x.items(), y.items()))))
2.3143140770262107
>>> min(timeit.repeat(lambda: dict((k, v) for d in (x, y) for k, v in d.items())))
3.2069112799945287

Resources on Dictionaries

answered 4 years ago Bilal Syed Hussain #28

Python 3.5 (PEP 448) allows a nicer syntax option:

x = {'a': 1, 'b': 1}
y = {'a': 2, 'c': 2}
final = {**x, **y} 
final
# {'a': 2, 'b': 1, 'c': 2}

Or even

final = {'a': 1, 'b': 1, **x, **y}

answered 4 years ago Saksham Varma #29

a = {1: 2, 3: 4, 5: 6}
b = {7:8, 1:2}
combined = dict(a.items() + b.items())
print combined

answered 3 years ago RemcoGerlich #30

This can be done with a single dict comprehension:

>>> x = {'a':1, 'b': 2}
>>> y = {'b':10, 'c': 11}
>>> { key: y[key] if key in y else x[key]
      for key in set(x) + set(y)
    }

In my view the best answer for the 'single expression' part as no extra functions are needed, and it is short.

answered 3 years ago reubano #31

Simple solution using itertools that preserves order (latter dicts have precedence)

import itertools as it
merge = lambda *args: dict(it.chain.from_iterable(it.imap(dict.iteritems, args)))

And it's usage:

>>> x = {'a':1, 'b': 2}
>>> y = {'b':10, 'c': 11}
>>> merge(x, y)
{'a': 1, 'b': 10, 'c': 11}

>>> z = {'c': 3, 'd': 4}
>>> merge(x, y, z)
{'a': 1, 'b': 10, 'c': 3, 'd': 4}

answered 3 years ago reetesh11 #32

from collections import Counter
dict1 = {'a':1, 'b': 2}
dict2 = {'b':10, 'c': 11}
result = dict(Counter(dict1) + Counter(dict2))

This should solve your problem.

answered 3 years ago Robino #33

Be pythonic. Use a comprehension:

z={i:d[i] for d in [x,y] for i in d}

>>> print z
{'a': 1, 'c': 11, 'b': 10}

answered 2 years ago kjo #34

(For Python2.7* only; there are simpler solutions for Python3*.)

If you're not averse to importing a standard library module, you can do

from functools import reduce

def merge_dicts(*dicts):
    return reduce(lambda a, d: a.update(d) or a, dicts, {})

(The or a bit in the lambda is necessary because dict.update always returns None on success.)

answered 2 years ago Alfe #35

I know this does not really fit the specifics of the questions ("one liner"), but since none of the answers above went into this direction while lots and lots of answers addressed the performance issue, I felt I should contribute my thoughts.

Depending on the use case it might not be necessary to create a "real" merged dictionary of the given input dictionaries. A view which does this might be sufficient in many cases, i. e. an object which acts like the merged dictionary would without computing it completely. A lazy version of the merged dictionary, so to speak.

In Python, this is rather simple and can be done with the code shown at the end of my post. This given, the answer to the original question would be:

z = MergeDict(x, y)

When using this new object, it will behave like a merged dictionary but it will have constant creation time and constant memory footprint while leaving the original dictionaries untouched. Creating it is way cheaper than in the other solutions proposed.

Of course, if you use the result a lot, then you will at some point reach the limit where creating a real merged dictionary would have been the faster solution. As I said, it depends on your use case.

If you ever felt you would prefer to have a real merged dict, then calling dict(z) would produce it (but way more costly than the other solutions of course, so this is just worth mentioning).

You can also use this class to make a kind of copy-on-write dictionary:

a = { 'x': 3, 'y': 4 }
b = MergeDict(a)  # we merge just one dict
b['x'] = 5
print b  # will print {'x': 5, 'y': 4}
print a  # will print {'y': 4, 'x': 3}

Here's the straight-forward code of MergeDict:

class MergeDict(object):
  def __init__(self, *originals):
    self.originals = ({},) + originals[::-1]  # reversed

  def __getitem__(self, key):
    for original in self.originals:
      try:
        return original[key]
      except KeyError:
        pass
    raise KeyError(key)

  def __setitem__(self, key, value):
    self.originals[0][key] = value

  def __iter__(self):
    return iter(self.keys())

  def __repr__(self):
    return '%s(%s)' % (
      self.__class__.__name__,
      ', '.join(repr(original)
          for original in reversed(self.originals)))

  def __str__(self):
    return '{%s}' % ', '.join(
        '%r: %r' % i for i in self.iteritems())

  def iteritems(self):
    found = set()
    for original in self.originals:
      for k, v in original.iteritems():
        if k not in found:
          yield k, v
          found.add(k)

  def items(self):
    return list(self.iteritems())

  def keys(self):
    return list(k for k, _ in self.iteritems())

  def values(self):
    return list(v for _, v in self.iteritems())

answered 2 years ago Kalpesh Dusane #36

For Python 2 :

x = {'a':1, 'b': 2}
y = {'b':10, 'c': 11}
z = dict(x.items()+y.items())
print(z)

For Python 3:

x = {'a':1, 'b': 2}
y = {'b':10, 'c': 11}
z = dict(x.items()|y.items())
print(z)

It gives output:{'a': 1, 'c': 11, 'b': 10}

answered 2 years ago levi #37

In Python 3.5 you can use unpack ** in order to create new dictionary. This method has no been showed in past answers. Also, it's better to use {} instead of dict(). Because {} is a python literal and dict() involves a function call.

dict1 = {'a':1}
dict2 = {'b':2}
new_dict = {**dict1, **dict2}
>>>new_dict
{'a':1, 'a':2}

answered 2 years ago Mike Graham #38

You can use toolz.merge([x, y]) for this.

answered 1 year ago Skyduy #39

In python 3:

import collections
a = {1: 1, 2: 2}
b = {2: 3, 3: 4}
c = {3: 5}

r = dict(collections.ChainMap(a, b, c))
print(r)

Out:

{1: 1, 2: 2, 3: 4}

Docs: https://docs.python.org/3/library/collections.html#collections.ChainMap:

answered 1 year ago gboffi #40

The question is tagged python-3x but, taking into account that it's a relatively recent addition and that the most voted, accepted answer deals extensively with a Python 2.x solution, I dare add a one liner that draws on an irritating feature of Python 2.x list comprehension, that is name leaking...

$ python2
Python 2.7.13 (default, Jan 19 2017, 14:48:08) 
[GCC 6.3.0 20170118] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> x = {'a':1, 'b': 2}
>>> y = {'b':10, 'c': 11}
>>> [z.update(d) for z in [{}] for d in (x, y)]
[None, None]
>>> z
{'a': 1, 'c': 11, 'b': 10}
>>> ...

I'm happy to say that the above doesn't work any more on any version of Python 3.

answered 1 year ago gilch #41

If you don't mind mutating x,

x.update(y) or x

Simple, readable, performant. You know update() always returns None, which is a false value. So it will always evaluate to x.

Mutating methods in the standard library, like update, return None by convention, so this trick will work on those too.

If you're using a library that doesn't follow this convention, you can use a tuple display and index to make it a single expression, instead of or, but it's not as readable.

(x.update(y), x)[-1]

If you don't have x in a variable yet, you can use lambda to make a local without using an assignment statement. This amounts to using lambda as a let expression, which is a common technique in functional languages, but rather unpythonic.

(lambda x: x.update(y) or x)({'a':1, 'b': 2})

If you do want a copy, PEP 448 is best {**x, **y}. But if that's not available, let works here too.

(lambda z: z.update(y) or z)(x.copy())

answered 10 months ago Haziq Nordin #42

dictionaries = [{'body': 'text'},
{'correctAnswer': 'text'},
{'ans': 'text'},
{'ans': 'text'}]

final_dictionary = {}
for dictionary in dictionaries:
    for key in dictionary:
        final_dictionary[key] = dictionary[key]

print(final_dictionary)

It sounds like you are very new to Python, so the solution I would suggest is the one above which I think is most readable to someone who only knows about for-loops, and iterating over the keys of a dictionary. The pseudo-code is as follows:

for each of the dictionaries in the list: 
    add each of the key-value pairs in that dictionary to our final dictionary.

answered 8 months ago Goran B. #43

code snippet -- latest

python 3.6.*

d1 = {"a":1,"b":2,"c":3}
d2 = {"d":4,"e":5,"f":6}
d3 = {"g":7,"h":8,"j":9}
d4 = {'wtf':'yes'}

d1a = {"a":1,"b":2,"c":3}
d1b = {"a":2,"b":3,"c":4}
d1c = {"a":3,"b":4,"c":5}
d1d = {"a":"wtf"}

def dics_combine(*dics):
    dic_out = {}
    for d in dics:
        if isinstance(d, dict):
            dic_out = {**dic_out, **d}
        else:
            pass
    return dic_out

inp = (d1,d2,d3,d4)
combined_dics = dics_combine(*inp)
print('\n')
print('IN-ORDER: {0}'.format(inp))
print('OUT: {0}'.format(combined_dics))

inp = (d1a,d1b,d1c,d1d)
combined_dics = dics_combine(*inp)
print('\n')
print('IN-ORDER: {0}'.format(inp))
print('OUT: {0}'.format(combined_dics))

inp = (d1d,d1c,d1b,d1a)
combined_dics = dics_combine(*inp)
print('\n')
print('IN-ORDER: {0}'.format(inp))
print('OUT: {0}'.format(combined_dics)))

IN-ORDER: ({'a': 1, 'b': 2, 'c': 3}, {'d': 4, 'e': 5, 'f': 6}, {'g': 7, 'h': 8, 'j': 9}, {'wtf': 'yes'})

OUT: {'a': 1, 'b': 2, 'c': 3, 'd': 4, 'e': 5, 'f': 6, 'g': 7, 'h': 8, 'j': 9, 'wtf': 'yes'}

IN-ORDER: ({'a': 1, 'b': 2, 'c': 3}, {'a': 2, 'b': 3, 'c': 4}, {'a': 3, 'b': 4, 'c': 5}, {'a': 'wtf'})

OUT: {'a': 'wtf', 'b': 4, 'c': 5}

IN-ORDER: ({'a': 'wtf'}, {'a': 3, 'b': 4, 'c': 5}, {'a': 2, 'b': 3, 'c': 4}, {'a': 1, 'b': 2, 'c': 3})

OUT: {'a': 1, 'b': 2, 'c': 3}

answered 6 months ago litepresence #44

I was curious if I could beat the accepted answer's time with a one line stringify approach:

I tried 5 methods, none previously mentioned - all one liner - all producing correct answers - and I couldn't come close.

So... to save you the trouble and perhaps fulfill curiosity:

import json
import yaml
import time
from ast import literal_eval as literal

def merge_two_dicts(x, y):
    z = x.copy()   # start with x's keys and values
    z.update(y)    # modifies z with y's keys and values & returns None
    return z

x = {'a':1, 'b': 2}
y = {'b':10, 'c': 11}

start = time.time()
for i in range(10000):
    z = yaml.load((str(x)+str(y)).replace('}{',', '))
elapsed = (time.time()-start)
print (elapsed, z, 'stringify yaml')

start = time.time()
for i in range(10000):
    z = literal((str(x)+str(y)).replace('}{',', '))
elapsed = (time.time()-start)
print (elapsed, z, 'stringify literal')

start = time.time()
for i in range(10000):
    z = eval((str(x)+str(y)).replace('}{',', '))
elapsed = (time.time()-start)
print (elapsed, z, 'stringify eval')

start = time.time()
for i in range(10000):
    z = {k:int(v) for k,v in (dict(zip(
            ((str(x)+str(y))
            .replace('}',' ')
            .replace('{',' ')
            .replace(':',' ')
            .replace(',',' ')
            .replace("'",'')
            .strip()
            .split('  '))[::2], 
            ((str(x)+str(y))
            .replace('}',' ')
            .replace('{',' ').replace(':',' ')
            .replace(',',' ')
            .replace("'",'')
            .strip()
            .split('  '))[1::2]
             ))).items()}
elapsed = (time.time()-start)
print (elapsed, z, 'stringify replace')

start = time.time()
for i in range(10000):
    z = json.loads(str((str(x)+str(y)).replace('}{',', ').replace("'",'"')))
elapsed = (time.time()-start)
print (elapsed, z, 'stringify json')

start = time.time()
for i in range(10000):
    z = merge_two_dicts(x, y)
elapsed = (time.time()-start)
print (elapsed, z, 'accepted')

results:

7.693928956985474 {'c': 11, 'b': 10, 'a': 1} stringify yaml
0.29134678840637207 {'c': 11, 'b': 10, 'a': 1} stringify literal
0.2208399772644043 {'c': 11, 'b': 10, 'a': 1} stringify eval
0.1106564998626709 {'c': 11, 'b': 10, 'a': 1} stringify replace
0.07989692687988281 {'c': 11, 'b': 10, 'a': 1} stringify json
0.005082368850708008 {'c': 11, 'b': 10, 'a': 1} accepted

what I did learn from this is that json approach is the fastest way (of those attempted) to return a dictionary from string-of-dictionary; much faster (about 1/4th of the time) of what I considered to be the normal method using ast. I also learned that, the yaml approach should be avoided at all cost.

Yes, I get that this is not the best/correct way so please don't downvote to negative oblivion, zero is just fine. I was curious if it was faster, which it isn't; I posted to prove it so.

answered 5 months ago Josh Bode #45

This is an expression for Python 3.5 or greater that merges dictionaries using reduce:

>>> from functools import reduce
>>> l = [{'a': 1}, {'b': 2}, {'a': 100, 'c': 3}]
>>> reduce(lambda x, y: {**x, **y}, l, {})
{'a': 100, 'b': 2, 'c': 3}

Note: this works even if the dictionary list is empty or contains only one element.

answered 5 months ago hygull #46

Where is the problem in your code?

In Python, dictionary is defined as an unordered collection of key-value paired items where keys are unique and immutable objects.

The update() method updates the calling dictionary object i.e. x with the contents of passing dictionary object i.e. y.

If it finds any duplicated key in the calling object then it will override its value with the value corresponding to the matched key of passing object.

x = {'a':1, 'b': 2}
y = {'b':10, 'c': 11}

So you will get {'a': 1, 'b': 10, 'c': 11} in place of getting {'a': 1, 'b': 10, 'c': 11, 'b': 2}

Use dictionary comprehension to solve

The professional way to create dictionaries using existing dictionaries or other iterator objects.

>>> x = {'a':1, 'b': 2}
>>> y = {'b':10, 'c': 11}
>>>
>>> my_dicts = (x, y);  # Gathering all dictionaries in a tuple/list
>>> z = {key: value for my_dict in my_dicts for key, value in my_dict.iteritems()}; # Single line of code that creates new dictionary object from existing dictionaries
>>> z
{'a': 1, 'c': 11, 'b': 10}

References

https://www.datacamp.com/community/tutorials/python-dictionary-comprehension

Thanks.

answered 4 months ago Tigran Saluev #47

I think my ugly one-liners are just necessary here.

z = next(z.update(y) or z for z in [x.copy()])
# or
z = (lambda z: z.update(y) or z)(x.copy())
  1. Dicts are merged.
  2. Single expression.
  3. Don't ever dare to use it.

answered 3 months ago crook #48

In python2

dict(mydict, **other)

or

In [11]: dict({1:2}.items() + {2:3}.items() + {1:5}.items() )
Out[11]: {1: 5, 2: 3}

python3

{ **mydict, **otherdict}

answered 2 weeks ago gourav sb #49

The below method joins a list of dictionaries passed to it. And it takes the least time in terms of performance.

# Function to merge a list of dictionaries into one dict
def dict_merger_fn(dict_list):
    z = dict_list[0].copy()
    for y in dict_list[1:]:
        z.update(y) # modifies z with y's keys and values & returns None
    return z

Usage:

merged_dict = dict_merger_fn([dict1,dict2,dict3...])

comments powered by Disqus