Context: An API I'm integrating returns pretty-printed JSON like the sample below. I have an MSSQL parsing proc that requires the input JSON to be flattened onto a single line (line breaks and tabs removed).
Question: I've found a couple of semi-related questions, but they don't seem to address my needs because I don't want to operate at the file level.
Can anyone recommend specific methods for transforming the JSON text into a single line in a more granular fashion? Perhaps regular expressions or some string manipulation methods?
Current JSON form:
{
"data": {
"first_name": "Eric",
"last_name": "B",
"email": null,
"score": null,
"domain": "@datashiftlabs.io",
"position": null,
"twitter": null,
"linkedin_url": null,
"phone_number": null,
"company": null,
"sources": []
},
"meta": {
"params": {
"first_name": "Eric",
"last_name": "B",
"full_name": null,
"domain": "@datashiftlabs.io",
"company": null
}
}
}
Desired form:
{"data": {"first_name": "Eric","last_name": "B","email": null,"score": null,"domain": "datashiftlabs.io","position": null,"twitter": null,"linkedin_url": null,"phone_number": null,"company": null,"sources": []},"meta": {"params": {"first_name": "Eric","last_name": "B","full_name": null,"domain": "datashiftlabs.io","company": null}}}
I'm not sure if that's actually what you want, but you could parse your JSON string into an object with the json library and then convert it back to a string. By default, json.dumps writes everything on a single line.
The example would look like this:
import json
json_str = """{
"data": {
"first_name": "Eric",
"last_name": "B",
"email": null,
"score": null,
"domain": "@datashiftlabs.io",
"position": null,
"twitter": null,
"linkedin_url": null,
"phone_number": null,
"company": null,
"sources": []
},
"meta": {
"params": {
"first_name": "Eric",
"last_name": "B",
"full_name": null,
"domain": "@datashiftlabs.io",
"company": null
}
}
}"""
obj = json.loads(json_str)
flatten_str = json.dumps(obj)
print(flatten_str)
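If the parsing proc is also sensitive to the spaces that json.dumps puts after commas and colons by default, the separators argument can drop them. A minimal sketch of that variation; the helper name to_single_line and the short sample input are assumptions made up for illustration, not part of the original answer:

```python
import json


def to_single_line(json_text, compact=True):
    """Parse JSON text and re-serialize it on a single line.

    Hypothetical helper for illustration only.
    """
    obj = json.loads(json_text)
    # (",", ":") removes the space json.dumps normally adds after each separator
    separators = (",", ":") if compact else (", ", ": ")
    return json.dumps(obj, separators=separators)


pretty = '{\n\t"data": {\n\t\t"first_name": "Eric",\n\t\t"sources": []\n\t}\n}'
print(to_single_line(pretty))
# {"data":{"first_name":"Eric","sources":[]}}
```

Because the text goes through a real parser, whitespace inside string values is left alone and invalid JSON raises json.JSONDecodeError instead of being passed on silently.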
An alternative would be to use string replacement and a regex substitution to remove unnecessary characters such as line breaks, tabs and runs of spaces. A quick draft of such a function would look like this. Note: The current regex does not work flawlessly and still has some unintended behavior in certain edge cases, e.g. multiple spaces at the end of a string value get collapsed into one.
import re

def flatten_json(string):
    # Remove line breaks
    string = string.replace("\n", "")
    # Collapse tabs and runs of spaces before a quote or brace into one space
    string = re.sub(r'[\t ]+("|{|})', r' \1', string)
    # Return result
    return string
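The edge cases noted above come from the regex not knowing whether it is currently inside a string value. One way around that without fully parsing the JSON is a small character scanner that tracks string state and only drops whitespace outside of strings. This is a rough sketch under that assumption; the function name strip_json_whitespace is hypothetical, and it does not validate the input:

```python
def strip_json_whitespace(text):
    """Remove whitespace that sits outside JSON string literals.

    Hypothetical helper for illustration; it does not check that
    the input is valid JSON.
    """
    out = []
    in_string = False  # True while between an opening and closing quote
    escaped = False    # True right after a backslash inside a string
    for ch in text:
        if in_string:
            out.append(ch)  # keep everything inside strings untouched
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
            out.append(ch)
        elif ch not in " \t\r\n":
            out.append(ch)  # structural characters, numbers, literals
    return "".join(out)


sample = '{\n\t"name": "Eric  B ",\n\t"sources": [1, 2]\n}'
print(strip_json_whitespace(sample))
# {"name":"Eric  B ","sources":[1,2]}
```

Unlike the regex draft, the spaces inside "Eric  B " survive, while line breaks, tabs and spaces between tokens are removed.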