
How to read and manipulate a JSON file with Apache Beam in Python

I have a .txt file whose contents are in JSON format. I want to read, manipulate, and restructure the file (for example, change the field names). How can I do this in Python with Apache Beam?

Rim asked Nov 21 '25 10:11


1 Answer

To read a JSON file with Apache Beam in Python, you can write a custom coder:

See: https://beam.apache.org/documentation/programming-guide/#specifying-coders

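import json

import apache_beam as beam


class JsonCoder(beam.coders.Coder):
    """A JSON coder interpreting each line as a JSON string."""

    def encode(self, x):
        # Beam coders must produce bytes.
        return json.dumps(x).encode('utf-8')

    def decode(self, x):
        # x is one line of the input file, as bytes.
        return json.loads(x)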

Then specify it when you read or write your data, for instance:

lines = p | 'read_data' >> ReadFromText(known_args.input, coder=JsonCoder())
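
For the "change the field names" part of the question, here is a minimal sketch of a full pipeline. It assumes the file holds one JSON object per line, because ReadFromText splits its input by line. It swaps the custom coder for a plain json.loads step in a beam.Map, which is an alternative and not the approach described above. FIELD_MAP, rename_fields, and run are hypothetical names chosen for illustration, and the field names in the mapping are made up.

```python
import json

# Hypothetical mapping from old field names to new ones (assumption).
FIELD_MAP = {"firstName": "first_name", "lastName": "last_name"}


def rename_fields(record, mapping=FIELD_MAP):
    """Return a copy of record with its keys renamed according to mapping."""
    return {mapping.get(key, key): value for key, value in record.items()}


def run(input_path, output_path):
    # Imported here so rename_fields can be tried without Beam installed.
    import apache_beam as beam
    from apache_beam.io import ReadFromText, WriteToText

    with beam.Pipeline() as p:
        (
            p
            | "read_data" >> ReadFromText(input_path)      # one string per line
            | "parse_json" >> beam.Map(json.loads)         # string -> dict
            | "rename_fields" >> beam.Map(rename_fields)   # restructure the dict
            | "to_json" >> beam.Map(json.dumps)            # dict -> string
            | "write_data" >> WriteToText(output_path)
        )
```

Calling run("input.txt", "output") would write the renamed records as one JSON object per line. Keys that are not in the mapping pass through unchanged.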

Best regards, work well ;)

Alexis C answered Nov 23 '25 23:11


