
Ansible: How to get duplicate items from list?

Tags: ansible

How can I get the duplicated items from a list in Ansible?

Input:

- vars: 
    list1: 
      - a
      - b
      - c
      - d
      - d
      - e
      - e
      - e

Expected output:

list1: 
  - d
  - e
Theppasin Kongsakda asked Nov 26 '25 05:11



2 Answers

Looping through large lists in Ansible can be very slow. It is much faster to create a custom filter plugin.

In your project root, create a folder named filter_plugins. Inside that folder, create a Python file, for example custom_filter.py.

Then add the following code:

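#!/usr/bin/python

class FilterModule(object):
    """Custom Jinja2 filters; Ansible loads them from filter_plugins/."""

    def filters(self):
        # Map the filter name used in templates to its implementation.
        return {'duplicates': self.duplicates}

    def duplicates(self, items):
        """Return each item that occurs more than once, listed one time."""
        counts = {}
        result = []

        for item in items:
            counts[item] = counts.get(item, 0) + 1
            # Record the item only when its second occurrence is seen.
            if counts[item] == 2:
                result.append(item)
        return result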

Then call the custom filter in your playbook:

    - name: "debug"
      debug:
        msg: "{{ [1, 2, 2, 4, 5, 1] | duplicates }}"

gives

ok: [localhost] => {
    "msg": [
        2,
        1
    ]
}

When you are processing data, a custom filter is usually a better choice than looping in tasks.
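The filter is plain Python, so it can be checked outside Ansible before it is wired into a playbook. A minimal sketch, with the plugin class repeated so the snippet runs on its own; the second sample list is the one from the question:

```python
class FilterModule(object):
    """Same behavior as the plugin above, repeated to keep this self-contained."""

    def filters(self):
        return {'duplicates': self.duplicates}

    def duplicates(self, items):
        counts = {}
        result = []
        for item in items:
            counts[item] = counts.get(item, 0) + 1
            # Report an item once, at its second occurrence.
            if counts[item] == 2:
                result.append(item)
        return result


# Look the filter up by name, as a template would refer to it.
duplicates = FilterModule().filters()['duplicates']

numbers = duplicates([1, 2, 2, 4, 5, 1])
letters = duplicates(['a', 'b', 'c', 'd', 'd', 'e', 'e', 'e'])
print(numbers)  # [2, 1]
print(letters)  # ['d', 'e']
```

Results come back in the order in which each second occurrence is seen, which is why the playbook example prints 2 before 1. Items must be hashable, so a list of dictionaries would need a different approach.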

NFJ25 answered Nov 28 '25 16:11



  • Count the frequencies in a loop. For example,
    - set_fact:
        list2: "{{ list2 + [{'key': item,
                             'freq': list1|
                                     select('regex', myregex)|
                                     length}] }}"
      loop: "{{ list1|unique|sort }}"
      vars:
        list2: []
        myregex: "^{{ item }}$"

gives

  list2:
    - {freq: 1, key: a}
    - {freq: 1, key: b}
    - {freq: 1, key: c}
    - {freq: 2, key: d}
    - {freq: 3, key: e}

Then select the items. For example, use json_query

  list3: "{{ list2|json_query('[?freq > `1`].key') }}"

or combine the filters selectattr and map

  list4: "{{ list2|selectattr('freq', '>', 1)|map(attribute='key') }}"

Both options give the list [d, e].
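The two steps above (count each unique item, then keep the items whose count exceeds 1) can also be read as plain Python. This is only an illustration of the logic, not how Ansible evaluates the templates. Note that re.escape is a safeguard added here as an assumption; the playbook inserts the raw item into the pattern, so an item containing regex metacharacters such as a dot could match more than itself.

```python
import re

list1 = ['a', 'b', 'c', 'd', 'd', 'e', 'e', 'e']

# Step 1: one record per unique item, counting anchored regex matches,
# as the set_fact loop does with select('regex', myregex).
list2 = []
for item in sorted(set(list1)):
    # re.escape is an assumption added here; the playbook uses the raw item.
    myregex = '^' + re.escape(item) + '$'
    freq = len([x for x in list1 if re.search(myregex, x)])
    list2.append({'key': item, 'freq': freq})

# Step 2: keep the keys whose frequency is greater than 1.
list3 = [rec['key'] for rec in list2 if rec['freq'] > 1]

print(list2)
print(list3)  # ['d', 'e']
```

In a playbook, the analogous safeguard would probably be the regex_escape filter applied to item when building myregex.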


  • The next option is the comparison of items by Extended loop variables. For example,
    - set_fact:
        list5: "{{ list5|default([]) + [item] }}"
      loop: "{{ list1|sort }}"
      loop_control:
        extended: true
      when: item == ansible_loop.nextitem|default('')

Then, list5|unique gives the same result [d, e]
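This option relies on equal items being adjacent in a sorted list, so an item is a duplicate when it equals its successor. A rough Python equivalent, where zip plays the role of ansible_loop.nextitem and the final de-duplication stands in for the unique filter (the variable names mirror the playbook; result is a name chosen for this sketch):

```python
list1 = ['a', 'b', 'c', 'd', 'd', 'e', 'e', 'e']

ordered = sorted(list1)

# Pair every item with its successor; the last item has no successor
# and is skipped, much like the default('') fallback in the playbook.
list5 = [item for item, nextitem in zip(ordered, ordered[1:]) if item == nextitem]

# An item occurring n times is collected n - 1 times, so de-duplicate
# while keeping order (dict keys preserve insertion order).
result = list(dict.fromkeys(list5))

print(list5)   # ['d', 'e', 'e']
print(result)  # ['d', 'e']
```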


  • The simplest option is the filter community.general.counter
  freq6: "{{ list1|community.general.counter }}"

gives the dictionary

  freq6: {a: 1, b: 1, c: 1, d: 2, e: 3}

Select the duplicates

  list7: "{{ freq6|dict2items|selectattr('value', '>', 1)|map(attribute='key') }}"

gives the same result

  list7: [d, e]
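In Python terms, this option amounts to building a frequency dictionary and then filtering it. A short sketch that uses collections.Counter from the standard library as a stand-in for the community.general.counter filter (an analogy for illustration; the names freq6 and list7 are taken from the example above):

```python
from collections import Counter

list1 = ['a', 'b', 'c', 'd', 'd', 'e', 'e', 'e']

# Frequency dictionary: item -> number of occurrences.
freq6 = dict(Counter(list1))

# Same selection as dict2items | selectattr('value', '>', 1) | map(attribute='key').
list7 = [key for key, value in freq6.items() if value > 1]

print(freq6)  # {'a': 1, 'b': 1, 'c': 1, 'd': 2, 'e': 3}
print(list7)  # ['d', 'e']
```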

Example of a complete playbook for testing:

- hosts: localhost

  vars:

    list1: [a, b, c, d, d, e, e, e]

    list3: "{{ list2|json_query('[?freq > `1`].key') }}"
    list4: "{{ list2|selectattr('freq', '>', 1)|map(attribute='key') }}"

    freq6: "{{ list1|community.general.counter }}"
    list7: "{{ freq6|dict2items|selectattr('value', '>', 1)|map(attribute='key') }}"

  tasks:

    - set_fact:
        list2: "{{ list2 + [{'key': item,
                             'freq': list1|
                                     select('regex', myregex)|
                                     length}] }}"
      loop: "{{ list1|unique|sort }}"
      vars:
        list2: []
        myregex: "^{{ item }}$"
    - debug:
        var: list2|to_yaml
    - debug:
        var: list3|to_yaml
    - debug:
        var: list4|to_yaml

    - set_fact:
        list5: "{{ list5|default([]) + [item] }}"
      loop: "{{ list1|sort }}"
      loop_control:
        extended: true
      when: item == ansible_loop.nextitem|default('')
    - debug:
        var: list5|unique|to_yaml

    - debug:
        var: freq6|to_yaml
    - debug:
        var: list7|to_yaml
Vladimir Botka answered Nov 28 '25 15:11
