
Configuring ruamel.yaml to allow duplicate keys

I'm trying to use the ruamel.yaml library to process a YAML document that contains duplicate keys. In this case the duplicate key happens to be the merge key <<.

This is the YAML file, dupe.yml:

foo: &ref1
  a: 1

bar: &ref2
  b: 2

baz:
  <<: *ref1
  <<: *ref2
  c: 3

This is my script:

import ruamel.yaml

yml = ruamel.yaml.YAML()
yml.allow_duplicate_keys = True
doc = yml.load(open('dupe.yml'))

assert doc['baz']['a'] == 1
assert doc['baz']['b'] == 2
assert doc['baz']['c'] == 3

When run, it raises this error:

Traceback (most recent call last):
  File "rua.py", line 5, in <module>
    yml.load(open('dupe.yml'))
  File "/usr/local/lib/python3.7/site-packages/ruamel/yaml/main.py", line 331, in load
    return constructor.get_single_data()
  File "/usr/local/lib/python3.7/site-packages/ruamel/yaml/constructor.py", line 111, in get_single_data
    return self.construct_document(node)
  File "/usr/local/lib/python3.7/site-packages/ruamel/yaml/constructor.py", line 121, in construct_document
    for _dummy in generator:
  File "/usr/local/lib/python3.7/site-packages/ruamel/yaml/constructor.py", line 1543, in construct_yaml_map
    self.construct_mapping(node, data, deep=True)
  File "/usr/local/lib/python3.7/site-packages/ruamel/yaml/constructor.py", line 1448, in construct_mapping
    value = self.construct_object(value_node, deep=deep)
  File "/usr/local/lib/python3.7/site-packages/ruamel/yaml/constructor.py", line 174, in construct_object
    for _dummy in generator:
  File "/usr/local/lib/python3.7/site-packages/ruamel/yaml/constructor.py", line 1543, in construct_yaml_map
    self.construct_mapping(node, data, deep=True)
  File "/usr/local/lib/python3.7/site-packages/ruamel/yaml/constructor.py", line 1399, in construct_mapping
    merge_map = self.flatten_mapping(node)
  File "/usr/local/lib/python3.7/site-packages/ruamel/yaml/constructor.py", line 1350, in flatten_mapping
    raise DuplicateKeyError(*args)
ruamel.yaml.constructor.DuplicateKeyError: while constructing a mapping
  in "dupe.yml", line 8, column 3
found duplicate key "<<"
  in "dupe.yml", line 9, column 3

To suppress this check see:
   http://yaml.readthedocs.io/en/latest/api.html#duplicate-keys

Duplicate keys will become an error in future releases, and are errors
by default when using the new API.

How can I make ruamel read this file without errors? The documentation says that allow_duplicate_keys = True will make the loader tolerate duplicated keys, but it doesn't seem to work.

I'm using Python 3.7 and ruamel.yaml 0.15.90.

asked Oct 27 '25 17:10 by mamacdon


2 Answers

In versions before 0.15.91, setting

yaml.allow_duplicate_keys = True

only works for non-merge keys.

From 0.15.91 on it also works for merge keys, and the merge key assumes the value of the first instantiation of the key (as with non-merge keys). That means the document loads as if you had written:

baz:
  <<: *ref1
  c: 3

and not as if you had written:

baz:
  <<: [*ref1, *ref2]
  c: 3
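
The difference between the two forms can be modelled with plain dicts. This is only a sketch of the merge-key precedence rules (keys written directly in the mapping win, then earlier merge sources win over later ones); `apply_merges` is a hypothetical helper for illustration and is not how ruamel.yaml implements merging internally.

```python
def apply_merges(explicit, merge_sources):
    """Model merge-key resolution with plain dicts (illustration only)."""
    result = {}
    # Earlier merge sources take precedence over later ones,
    # so only fill in keys that have not been set yet.
    for source in merge_sources:
        for key, value in source.items():
            result.setdefault(key, value)
    # Keys written directly in the mapping override merged keys.
    result.update(explicit)
    return result


ref1 = {"a": 1}
ref2 = {"b": 2}
explicit = {"c": 3}

# Duplicate "<<" keys where only the first one is kept.
first_only = apply_merges(explicit, [ref1])
# "<<: [*ref1, *ref2]": every listed mapping is merged.
merged_all = apply_merges(explicit, [ref1, ref2])

print(first_only)  # {'a': 1, 'c': 3}
print(merged_all)  # {'a': 1, 'b': 2, 'c': 3}
```

Under the first-only behaviour the key `b` is missing, which is why the assertion on `doc['baz']['b']` in the question would still fail there.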

If you need the latter behaviour, where both mappings are merged, you have to monkey-patch the flatten routine that handles the merge keys (this affects the loading of all following YAML files with double merge keys):

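import warnings

import ruamel.yaml
# names used unqualified in the function below
from ruamel.yaml.constructor import (
    ConstructorError,
    DuplicateKeyError,
    DuplicateKeyFutureWarning,
)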

yaml_str = """\
foo: &ref1
  a: 1

bar: &ref2
  b: 2

baz:
  <<: *ref1
  <<: *ref2
  c: 3

"""

def my_flatten_mapping(self, node):

    def constructed(value_node):
        # type: (Any) -> Any
        # If the contents of a merge are defined within the
        # merge marker, then they won't have been constructed
        # yet. But if they were already constructed, we need to use
        # the existing object.
        if value_node in self.constructed_objects:
            value = self.constructed_objects[value_node]
        else:
            value = self.construct_object(value_node, deep=False)
        return value

    merge_map_list = []
    index = 0
    while index < len(node.value):
        key_node, value_node = node.value[index]
        if key_node.tag == u'tag:yaml.org,2002:merge':
            if merge_map_list and not self.allow_duplicate_keys:  # double << key
                args = [
                    'while constructing a mapping',
                    node.start_mark,
                    'found duplicate key "{}"'.format(key_node.value),
                    key_node.start_mark,
                    """
                    To suppress this check see:
                       http://yaml.readthedocs.io/en/latest/api.html#duplicate-keys
                    """,
                    """\
                    Duplicate keys will become an error in future releases, and are errors
                    by default when using the new API.
                    """,
                ]
                if self.allow_duplicate_keys is None:
                    warnings.warn(DuplicateKeyFutureWarning(*args))
                else:
                    raise DuplicateKeyError(*args)
            del node.value[index]
            # if key/values from later merge keys have preference you need
            # to insert value_node(s) at the beginning of merge_map_list
            # instead of appending
            if isinstance(value_node, ruamel.yaml.nodes.MappingNode):
                merge_map_list.append((index, constructed(value_node)))
            elif isinstance(value_node, ruamel.yaml.nodes.SequenceNode):
                for subnode in value_node.value:
                    if not isinstance(subnode, ruamel.yaml.nodes.MappingNode):
                        raise ruamel.yaml.constructor.ConstructorError(
                            'while constructing a mapping',
                            node.start_mark,
                            'expected a mapping for merging, but found %s' % subnode.id,
                            subnode.start_mark,
                        )
                    merge_map_list.append((index, constructed(subnode)))
            else:
                raise ConstructorError(
                    'while constructing a mapping',
                    node.start_mark,
                    'expected a mapping or list of mappings for merging, '
                    'but found %s' % value_node.id,
                    value_node.start_mark,
                )
        elif key_node.tag == u'tag:yaml.org,2002:value':
            key_node.tag = u'tag:yaml.org,2002:str'
            index += 1
        else:
            index += 1
    return merge_map_list

ruamel.yaml.constructor.RoundTripConstructor.flatten_mapping = my_flatten_mapping

yaml = ruamel.yaml.YAML()
yaml.allow_duplicate_keys = True
data = yaml.load(yaml_str)
for k in data['baz']:
    print(k, '>', data['baz'][k])

The above gives:

c > 3
a > 1
b > 2
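
Because the assignment to `flatten_mapping` is global, one way to limit its effect is to apply the patch only for the duration of a single load. Below is a minimal sketch using `unittest.mock.patch.object` from the standard library. `StandInConstructor` and `patched_flatten_mapping` are hypothetical stand-ins so the snippet runs without ruamel.yaml; in real use the target would presumably be `ruamel.yaml.constructor.RoundTripConstructor` and the replacement would be the function defined above.

```python
from unittest import mock


class StandInConstructor:
    """Hypothetical stand-in for the constructor class being patched."""

    def flatten_mapping(self, node):
        return "original"


def patched_flatten_mapping(self, node):
    # Hypothetical replacement; the real one would merge all "<<" keys.
    return "patched"


# The replacement is active only inside the with-block.
with mock.patch.object(StandInConstructor, "flatten_mapping", patched_flatten_mapping):
    inside = StandInConstructor().flatten_mapping(None)

# On exit the original method is restored automatically.
outside = StandInConstructor().flatten_mapping(None)

print(inside)   # patched
print(outside)  # original
```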

answered Oct 29 '25 05:10 by Anthon


After reading the library source code, I found a workaround. Setting the option to None prevents the error.

yml.allow_duplicate_keys = None

A warning is still printed to the console, but it's not fatal and the program will continue.
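
If the warning output is unwanted, Python's standard `warnings` module can filter it for a single call. The sketch below uses stand-ins so that it runs without ruamel.yaml: `FakeDuplicateKeyWarning` and `fake_load` are hypothetical. In real use you would pass the loader's `load` method and the warning class the library actually emits (the patched code in the other answer refers to it as `DuplicateKeyFutureWarning`).

```python
import warnings


def call_ignoring(category, func, *args, **kwargs):
    """Call func while ignoring warnings of the given category."""
    with warnings.catch_warnings():
        warnings.simplefilter("ignore", category)
        return func(*args, **kwargs)


class FakeDuplicateKeyWarning(Warning):
    """Hypothetical stand-in for the library's duplicate-key warning."""


def fake_load(text):
    # Hypothetical stand-in for a loader that warns but still returns data.
    warnings.warn(FakeDuplicateKeyWarning('found duplicate key "<<"'))
    return {"loaded": text}


# Record warnings to confirm that none escape the helper.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = call_ignoring(FakeDuplicateKeyWarning, fake_load, "baz: {}")

print(result)       # {'loaded': 'baz: {}'}
print(len(caught))  # 0
```

The filter is scoped to the one call, so duplicate-key warnings raised elsewhere in the program remain visible.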

answered Oct 29 '25 05:10 by mamacdon