Coco Json file to CSV format (path/to/image.jpg,x1,y1,x2,y2,class_name)

Question

I would like to convert my COCO JSON file to a CSV file in the following format:

The CSV file with annotations should contain one annotation per line. Images with multiple bounding boxes should use one row per bounding box. Note that indexing for pixel values starts at 0. The expected format of each line is:

path/to/image.jpg,x1,y1,x2,y2,class_name

A full example:

/data/imgs/img_001.jpg,837,346,981,456,cow
/data/imgs/img_002.jpg,215,312,279,391,cat
/data/imgs/img_002.jpg,22,5,89,84,bird
/data/imgs/img_003.jpg,,,,,

This defines a dataset with 3 images: img_001.jpg contains a cow, img_002.jpg contains a cat and a bird, and img_003.jpg contains no interesting objects/animals.

How could I do that?

ZFTurbo · Accepted Answer

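I have a function for this. Note that it writes the numeric image id and category id rather than the file path and class name.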

def convert_coco_json_to_csv(filename):
    import pandas as pd
    import json
    
    # COCO2017/annotations/instances_val2017.json
    s = json.load(open(filename, 'r'))
    out_file = filename[:-5] + '.csv'
    out = open(out_file, 'w')
    out.write('id,x1,y1,x2,y2,label\n')

    all_ids = []
    for im in s['images']:
        all_ids.append(im['id'])

    all_ids_ann = []
    for ann in s['annotations']:
        image_id = ann['image_id']
        all_ids_ann.append(image_id)
        x1 = ann['bbox'][0]
        x2 = ann['bbox'][0] + ann['bbox'][2]
        y1 = ann['bbox'][1]
        y2 = ann['bbox'][1] + ann['bbox'][3]
        label = ann['category_id']
        out.write('{},{},{},{},{},{}\n'.format(image_id, x1, y1, x2, y2, label))

    all_ids = set(all_ids)
    all_ids_ann = set(all_ids_ann)
    no_annotations = list(all_ids - all_ids_ann)
    # Output images without any annotations
    for image_id in no_annotations:
        out.write('{},{},{},{},{},{}\n'.format(image_id, -1, -1, -1, -1, -1))
    out.close()

    # Sort file by image id
    s1 = pd.read_csv(out_file)
    s1.sort_values('id', inplace=True)
    s1.to_csv(out_file, index=False)
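
The function above writes ids, while the question asks for file paths and class names. A minimal sketch of that variant follows, using only the standard library. The function name `coco_to_retinanet_csv` and the `image_dir` prefix are hypothetical, rounding coordinates to whole pixels is an assumption, and so is the choice to write images without annotations as a row with empty box and class fields.

```python
import csv
import json


def coco_to_retinanet_csv(json_path, csv_path, image_dir=""):
    """Write one 'path,x1,y1,x2,y2,class_name' row per COCO annotation."""
    with open(json_path, "r") as f:
        coco = json.load(f)

    # Look up file names and class names by COCO id, not by list position.
    file_names = {im["id"]: im["file_name"] for im in coco["images"]}
    class_names = {cat["id"]: cat["name"] for cat in coco["categories"]}

    annotated = set()
    with open(csv_path, "w", newline="") as f:
        writer = csv.writer(f, lineterminator="\n")
        for ann in coco["annotations"]:
            x, y, w, h = ann["bbox"]  # COCO stores [x, y, width, height]
            path = image_dir + file_names[ann["image_id"]]
            # Assumption: whole-pixel coordinates are wanted; drop round() otherwise.
            writer.writerow([path, round(x), round(y), round(x + w), round(y + h),
                             class_names[ann["category_id"]]])
            annotated.add(ann["image_id"])

        # Assumption: images without annotations get empty box and class fields.
        for image_id, name in file_names.items():
            if image_id not in annotated:
                writer.writerow([image_dir + name, "", "", "", "", ""])
```

It could be called as coco_to_retinanet_csv("instances_val2017.json", "annotations.csv", image_dir="/data/imgs/"), where both paths and the prefix are placeholders.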

Ahmed Roshdy · Answer

Here is a function I use to convert COCO format to the AutoML CSV format for annotated object-detection image data:

def convert_coco_json_to_csv(filename,bucket):
    import pandas as pd
    import json
    
    s = json.load(open(filename, 'r'))
    out_file = filename[:-5] + '.csv'

    with open(out_file, 'w') as out:
      out.write('GCS_FILE_PATH,label,X_MIN,Y_MIN,,,X_MAX,Y_MAX,,\n')
      # Look up images and categories by their COCO id; ids are not guaranteed
      # to match list positions.
      images = {image['id']: image for image in s['images']}
      categories = {cat['id']: cat['name'] for cat in s['categories']}
      for label in s['annotations']:
        # The COCO bounding box format is [top left x position, top left y position, width, height].
        # AutoML uses coordinates relative to the image size. For example, a bounding box for the entire image is expressed as (0.0,0.0,,,1.0,1.0,,), or (0.0,0.0,1.0,0.0,1.0,1.0,0.0,1.0).
        image = images[label['image_id']]
        HEIGHT = image['height']
        WIDTH = image['width']
        X_MIN = label['bbox'][0] / WIDTH
        X_MAX = (label['bbox'][0] + label['bbox'][2]) / WIDTH
        Y_MIN = label['bbox'][1] / HEIGHT
        Y_MAX = (label['bbox'][1] + label['bbox'][3]) / HEIGHT
        out.write(f"{bucket}/{image['file_name']},{categories[label['category_id']]},{X_MIN},{Y_MIN},,,{X_MAX},{Y_MAX},,\n")

You can use it by calling the function with the annotation file name and the Google Cloud Storage bucket where the images were uploaded:

convert_coco_json_to_csv("/content/train_annotations.coco.json", "gs://[bucket name]")
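
To check the coordinate conversion in isolation, the normalization step can be pulled out into a small helper. This is only a sketch: the name `coco_bbox_to_normalized` is hypothetical, and clamping to the range 0 to 1 is an added assumption for boxes that slightly overshoot the image edge, not something the function above does.

```python
def coco_bbox_to_normalized(bbox, width, height):
    """Convert a COCO [x, y, w, h] pixel box to (x_min, y_min, x_max, y_max) in 0..1."""
    x, y, w, h = bbox

    def clamp(value):
        # Assumption: out-of-range values are clipped rather than rejected.
        return min(max(value, 0.0), 1.0)

    return (
        clamp(x / width),
        clamp(y / height),
        clamp((x + w) / width),
        clamp((y + h) / height),
    )


# A 30x40 box at (10, 20) in a 100x200 image.
print(coco_bbox_to_normalized([10, 20, 30, 40], 100, 200))
```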

Tags:

json

object-detection

coco

jvl
