
Error serializing numpy.int64 into JSON with Python

Tags: python, json, numpy

I’m developing a Python script that collects “snapshots” of my data at different points in time and saves them into JSON files for later analysis. I want to store each snapshot as a single line in a JSONL file.

I run this script using Python 3.10/3.11 inside Streamlit on Git Bash, and some snapshots are not written correctly because numpy data types like int64 are not recognized by json.dumps. This causes the JSON file to become invalid, and prevents me from loading the data later. Here’s a simplified snippet of my code:

import json
import numpy as np
import logging

logger = logging.getLogger(__name__)

def is_json_serializable(obj):
    try:
        json.dumps(obj)
        return True
    except:
        return False

try:
    with open('het_q2_snapshots.json', 'w', encoding='utf-8') as f:
        for i, snapshot in enumerate(snapshots):
            if is_json_serializable(snapshot):
                json_line = json.dumps(snapshot, ensure_ascii=False, separators=(',', ':'))
                f.write(json_line + '\n')
            else:
                logger.error(f"Snapshot {i} is not serializable: {snapshot}")
                # conversion attempt
                safe_snapshot = {k: int(v) if isinstance(v, np.int64) else v 
                                 for k, v in snapshot.items()}
                json_line = json.dumps(safe_snapshot, ensure_ascii=False, separators=(',', ':'))
                f.write(json_line + '\n')
except Exception as e:
    logger.error(f"Error saving snapshots: {e}")

When I run this, I get the following errors:

ERROR - Snapshot loading error: Expecting value: line 4 column 14 (char 57)
ERROR - Unexpected error: Object of type int64 is not JSON serializable

I already tried converting all numpy.int64 values to Python int, using default=str in json.dumps, and checking for non-serializable fields, but the problem persists.

Question: What’s the best way to ensure that all numpy.int64 (or any non-native types) are properly converted before serializing, especially when the data can be nested in dictionaries/lists?

Thanks a lot!

asked Feb 17 '26 11:02 by berny a


1 Answer

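A custom json.JSONEncoder subclass along these lines should cover this case. json.dumps calls the encoder's default method for every object it cannot serialize on its own, including values nested inside dictionaries and lists: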

import json

import numpy as np


class NumpyEncoder(json.JSONEncoder):
    """Custom encoder for numpy data types"""

    def default(self, obj):
        if isinstance(
            obj,
            (
                np.int_,
                np.intc,
                np.intp,
                np.int8,
                np.int16,
                np.int32,
                np.int64,
                np.uint8,
                np.uint16,
                np.uint32,
                np.uint64,
            ),
        ):
            return int(obj)

        elif isinstance(obj, (np.float16, np.float32, np.float64)):
            return float(obj)

        elif isinstance(obj, (np.complex64, np.complex128)):  # types must be passed as one tuple
            return {"real": obj.real, "imag": obj.imag}

        elif isinstance(obj, (np.ndarray,)):
            return obj.tolist()

        elif isinstance(obj, (np.bool_)):
            return bool(obj)

        elif isinstance(obj, (np.void)):
            return None

        return json.JSONEncoder.default(self, obj)

To use it, pass the class to json.dumps through the cls argument:

json.dumps(variable, cls=NumpyEncoder)
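
As a sketch of how this applies to the nested snapshots in the question: the example below writes one snapshot per line and reads the lines back. It is self-contained, so it uses a shortened encoder built on NumPy's abstract base classes np.integer and np.floating instead of listing every type. The name CompactNumpyEncoder and the sample data are made up for illustration, and an in-memory buffer stands in for the JSONL file.

```python
import io
import json

import numpy as np


class CompactNumpyEncoder(json.JSONEncoder):
    """Shortened variant of the encoder above (hypothetical name)."""

    def default(self, obj):
        if isinstance(obj, np.integer):
            return int(obj)
        if isinstance(obj, np.floating):
            return float(obj)
        if isinstance(obj, np.bool_):
            return bool(obj)
        if isinstance(obj, np.ndarray):
            return obj.tolist()
        # Anything else still raises TypeError, as in the standard encoder.
        return super().default(obj)


# Made-up sample data with NumPy values nested inside dicts and lists.
snapshots = [
    {"step": np.int64(1), "stats": {"mean": np.float64(0.5), "ids": [np.int64(7)]}},
    {"step": np.int64(2), "values": np.array([1, 2, 3]), "ok": np.bool_(True)},
]

# An in-memory buffer stands in for the JSONL file here.
buffer = io.StringIO()
for snapshot in snapshots:
    line = json.dumps(snapshot, cls=CompactNumpyEncoder, separators=(",", ":"))
    buffer.write(line + "\n")

# JSONL is parsed one line at a time, not with a single json.load() call.
loaded = [json.loads(line) for line in buffer.getvalue().splitlines()]
```

With a real file, the same loop works on the handle returned by open(path, 'w', encoding='utf-8'), and reading back iterates over the file's lines in the same way.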

Credits to hmallen:
https://github.com/hmallen/numpyencoder/blob/master/numpyencoder/numpyencoder.py
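
If the data should be converted before serializing, as the question asks, a recursive helper is one option. The dict comprehension in the question only inspects top-level values, so an np.int64 inside a nested dict or list stays unconverted. The sketch below is an assumption and not part of the linked project: to_native is a hypothetical helper name, and it relies on np.generic.item() to turn a NumPy scalar into the matching Python scalar.

```python
import json

import numpy as np


def to_native(obj):
    """Recursively replace NumPy values with plain Python ones (hypothetical helper)."""
    if isinstance(obj, dict):
        # Keys are converted too, because json.dumps rejects NumPy scalars as keys.
        return {to_native(k): to_native(v) for k, v in obj.items()}
    if isinstance(obj, (list, tuple)):
        return [to_native(v) for v in obj]
    if isinstance(obj, np.ndarray):
        return obj.tolist()
    if isinstance(obj, np.generic):
        # .item() returns the matching Python scalar (int, float, bool, ...).
        return obj.item()
    return obj


# Made-up sample snapshot, including a NumPy integer used as a dict key.
snapshot = {
    "step": np.int64(3),
    "nested": {"counts": [np.int64(1), np.int64(2)]},
    np.int64(5): "key",
}
json_line = json.dumps(to_native(snapshot), separators=(",", ":"))
```

One difference from the encoder approach: json.dumps does not call default for dictionary keys, so a NumPy integer used as a key raises a TypeError even with a custom encoder, while converting up front handles it. Complex NumPy values become Python complex numbers here, which json.dumps still cannot serialize, so they would need separate handling.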

answered Feb 20 '26 00:02 by vallops


