I want to generate many different permutations of JSON structures as a representation of the same data set, preferably without having to hard code the implementation. For example, given the following JSON:
{"name": "smith", "occupation": "agent", "enemy": "humanity", "nemesis": "neo"}`
Many different permutations should be produced, such as:
{"name":"smith"}- > {"last_name":"smith"}{"name":"...","occupation":"..."} -> {"occupation":"...", "name":"..."}{"name":"...","occupation":"..."} -> "smith":{"occupation":"..."}{"name":"...","occupation":"..."} -> "status": 200, "data":{"name":"...","occupation":"..."}Currently, the implementation is as follows:
I am using itertools.permutations and OrderedDict() to range through the possible key and respective value combinations as well as the order in which they are returned.
key_permutations = SchemaLike(...).permutate()
all_simulacrums = []
for key_permutation in key_permutations:
simulacrums = OrderedDict(key_permutation)
all_simulacrums.append(simulacrums)
for x in itertools.permutations(all_simulacrums.items()):
test_data = json.dumps(OrderedDict(p))
print(test_data)
assert json.loads(test_data) == data, 'Oops! {} != {}'.format(test_data, data)
My problem occurs when I try to implement the permutations of arrangement and template. I don't know how best to implement this functionality, any suggestions?
For ordering, just use ordered dicts:
>>> data = OrderedDict(foo='bar', bacon='eggs', bar='foo', eggs='bacon')
>>> for p in itertools.permutations(data.items()):
... test_data = json.dumps(OrderedDict(p))
... print(test_data)
... assert json.loads(test_data) == data, 'Oops! {} != {}'.format(test_data, data)
{"foo": "bar", "bacon": "eggs", "bar": "foo", "eggs": "bacon"}
{"foo": "bar", "bacon": "eggs", "eggs": "bacon", "bar": "foo"}
{"foo": "bar", "bar": "foo", "bacon": "eggs", "eggs": "bacon"}
{"foo": "bar", "bar": "foo", "eggs": "bacon", "bacon": "eggs"}
{"foo": "bar", "eggs": "bacon", "bacon": "eggs", "bar": "foo"}
{"foo": "bar", "eggs": "bacon", "bar": "foo", "bacon": "eggs"}
{"bacon": "eggs", "foo": "bar", "bar": "foo", "eggs": "bacon"}
{"bacon": "eggs", "foo": "bar", "eggs": "bacon", "bar": "foo"}
{"bacon": "eggs", "bar": "foo", "foo": "bar", "eggs": "bacon"}
{"bacon": "eggs", "bar": "foo", "eggs": "bacon", "foo": "bar"}
{"bacon": "eggs", "eggs": "bacon", "foo": "bar", "bar": "foo"}
{"bacon": "eggs", "eggs": "bacon", "bar": "foo", "foo": "bar"}
{"bar": "foo", "foo": "bar", "bacon": "eggs", "eggs": "bacon"}
{"bar": "foo", "foo": "bar", "eggs": "bacon", "bacon": "eggs"}
{"bar": "foo", "bacon": "eggs", "foo": "bar", "eggs": "bacon"}
{"bar": "foo", "bacon": "eggs", "eggs": "bacon", "foo": "bar"}
{"bar": "foo", "eggs": "bacon", "foo": "bar", "bacon": "eggs"}
{"bar": "foo", "eggs": "bacon", "bacon": "eggs", "foo": "bar"}
{"eggs": "bacon", "foo": "bar", "bacon": "eggs", "bar": "foo"}
{"eggs": "bacon", "foo": "bar", "bar": "foo", "bacon": "eggs"}
{"eggs": "bacon", "bacon": "eggs", "foo": "bar", "bar": "foo"}
{"eggs": "bacon", "bacon": "eggs", "bar": "foo", "foo": "bar"}
{"eggs": "bacon", "bar": "foo", "foo": "bar", "bacon": "eggs"}
{"eggs": "bacon", "bar": "foo", "bacon": "eggs", "foo": "bar"}
The same principle can be applied for key/value permutations:
>>> for p in itertools.permutations(data.keys()):
...: test_data = json.dumps(OrderedDict(zip(p, data.values())))
...: print(test_data)
...:
{"foo": "bar", "bacon": "eggs", "bar": "foo", "eggs": "bacon"}
{"foo": "bar", "bacon": "eggs", "eggs": "foo", "bar": "bacon"}
{"foo": "bar", "bar": "eggs", "bacon": "foo", "eggs": "bacon"}
{"foo": "bar", "bar": "eggs", "eggs": "foo", "bacon": "bacon"}
{"foo": "bar", "eggs": "eggs", "bacon": "foo", "bar": "bacon"}
{"foo": "bar", "eggs": "eggs", "bar": "foo", "bacon": "bacon"}
{"bacon": "bar", "foo": "eggs", "bar": "foo", "eggs": "bacon"}
{"bacon": "bar", "foo": "eggs", "eggs": "foo", "bar": "bacon"}
{"bacon": "bar", "bar": "eggs", "foo": "foo", "eggs": "bacon"}
{"bacon": "bar", "bar": "eggs", "eggs": "foo", "foo": "bacon"}
{"bacon": "bar", "eggs": "eggs", "foo": "foo", "bar": "bacon"}
{"bacon": "bar", "eggs": "eggs", "bar": "foo", "foo": "bacon"}
{"bar": "bar", "foo": "eggs", "bacon": "foo", "eggs": "bacon"}
{"bar": "bar", "foo": "eggs", "eggs": "foo", "bacon": "bacon"}
{"bar": "bar", "bacon": "eggs", "foo": "foo", "eggs": "bacon"}
{"bar": "bar", "bacon": "eggs", "eggs": "foo", "foo": "bacon"}
{"bar": "bar", "eggs": "eggs", "foo": "foo", "bacon": "bacon"}
{"bar": "bar", "eggs": "eggs", "bacon": "foo", "foo": "bacon"}
{"eggs": "bar", "foo": "eggs", "bacon": "foo", "bar": "bacon"}
{"eggs": "bar", "foo": "eggs", "bar": "foo", "bacon": "bacon"}
{"eggs": "bar", "bacon": "eggs", "foo": "foo", "bar": "bacon"}
{"eggs": "bar", "bacon": "eggs", "bar": "foo", "foo": "bacon"}
{"eggs": "bar", "bar": "eggs", "foo": "foo", "bacon": "bacon"}
{"eggs": "bar", "bar": "eggs", "bacon": "foo", "foo": "bacon"}
And so on... You can just use a predefined set of keys/values if you don't need all combinations. You can also use a for loop with random.choice to flip a coin in order to skip some combinations or use random.shuffle at the risk of repeating combinations.
For the template thing I guess you must create a list (or a list of lists if you want nested structures) of different templates and then iterate over it in order to create your data. In order to give a better suggestion we need a more constrained specification of what you want.
Note that there are several libraries that generate test data in Python:
>>> from faker import Faker
>>> faker = Faker()
>>> faker.credit_card_full().strip().split('\n')
['VISA 13 digit', 'Jerry Gutierrez', '4885274641760 04/24', 'CVC: 583']
Faker has several schemas and it is easy to create your own custom fake data providers.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With