
How to add an enum value to an AVRO schema in a FULL compatible way?

I have an enum in an AVRO schema like this:

{
    "type": "record",
    "name": "MySchema",
    "namespace": "com.company",
    "fields": [
        {
            "name": "color",
            "type": {
                "type": "enum",
                "name": "Color",
                "symbols": [
                    "UNKNOWN",
                    "GREEN",
                    "RED"
                ]
            },
            "default": "UNKNOWN"
        }
    ]
}

When using FULL compatibility mode (which means both BACKWARD and FORWARD), how am I supposed to add a new symbol to the enum? Is this impossible?

I read "Avro schema: is adding an enum value to existing schema backward compatible?", but it doesn't help.

Whenever I try to add a new value to the symbols, it fails the compatibility check in the schema registry, even though I have a default value on the enum. After some testing, it seems that adding a new value is BACKWARD compatible but not FORWARD compatible. However, because of the default value I set, I expected it to be FORWARD compatible as well: the old reader schema should be able to read a value written with the new schema and fall back to the "UNKNOWN" enum value when it doesn't know the new symbol.

singe3 asked Sep 15 '25 13:09


2 Answers

There appears to be a bug in Avro that affects versions 1.9.0, 1.9.1, 1.9.2, 1.10.0, 1.10.1, 1.10.2, 1.11.0, and later versions until it is fixed.

The bug is in Avro's handling of the enum default value.

According to the documentation, a reader using the old schema should be able to deserialize a payload containing an enum value produced by a writer using the new schema. Since the value is unknown to the reader, it should be deserialized as the default value. The documentation describes the enum default as:

"A default value for this enumeration, used during resolution when the reader encounters a symbol from the writer that isn't defined in the reader's schema"
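
The rule quoted above can be sketched in a few lines. This is only an illustration of the documented behaviour in plain Python, not Avro's actual implementation: the function `resolve_enum_symbol`, the exception `NoMatchError`, and the new symbol `BLUE` are all hypothetical names chosen for the example.

```python
from typing import Optional, Sequence


class NoMatchError(Exception):
    """Hypothetical stand-in for the error a reader raises on an unknown symbol."""


def resolve_enum_symbol(
    writer_symbol: str,
    reader_symbols: Sequence[str],
    reader_enum_default: Optional[str] = None,
) -> str:
    """Resolve a symbol written with the writer's schema against the reader's enum."""
    # A symbol the reader knows is returned unchanged.
    if writer_symbol in reader_symbols:
        return writer_symbol
    # An unknown symbol falls back to the reader's enum default, if one is declared.
    if reader_enum_default is not None:
        return reader_enum_default
    # Without an enum default, resolution fails.
    raise NoMatchError(f"No match for {writer_symbol}")


# The old reader schema from the question only knows three symbols.
old_reader_symbols = ["UNKNOWN", "GREEN", "RED"]

print(resolve_enum_symbol("GREEN", old_reader_symbols, "UNKNOWN"))  # GREEN
print(resolve_enum_symbol("BLUE", old_reader_symbols, "UNKNOWN"))   # UNKNOWN
```

With `reader_enum_default=None`, the second call would raise `NoMatchError` instead, which mirrors the "No match" failure reported in this thread.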

However, that is not what happens: the deserializer on the reader side fails with the exception org.apache.avro.AvroTypeException: No match for C.

I have reported the bug here and pushed a reproduction test here.

Hope it attracts some attention from the maintainers :)

singe3 answered Sep 17 '25 03:09


We can use the enum-level default to achieve this, by moving "default" from the field into the enum type definition. Hope this helps.

{
    "type": "record",
    "name": "MySchema",
    "namespace": "com.company",
    "fields": [
        {
            "name": "color",
            "type": {
                "type": "enum",
                "name": "Color",
                "symbols": [
                    "UNKNOWN",
                    "GREEN",
                    "RED"
                ],
                "default": "UNKNOWN"
            }
        }
    ]
}
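
To make the difference between the two placements concrete, here is a minimal sketch that inspects a field definition and reports whether an enum-level default is present. It uses only Python's standard library and does not call Avro; the helper name `enum_default` is an assumption made for this example. As far as the Avro specification goes, the field-level "default" applies when the field itself is missing from the written data, while the "default" inside the enum type is the one consulted for unknown symbols.

```python
import json
from typing import Optional


def enum_default(field: dict) -> Optional[str]:
    """Return the enum-level default of a record field, or None if it has none."""
    field_type = field["type"]
    if isinstance(field_type, dict) and field_type.get("type") == "enum":
        return field_type.get("default")
    return None


# Field as written in the question: "default" sits next to "type" (field level).
field_level = json.loads("""
{
    "name": "color",
    "type": {"type": "enum", "name": "Color", "symbols": ["UNKNOWN", "GREEN", "RED"]},
    "default": "UNKNOWN"
}
""")

# Field as written in this answer: "default" sits inside the enum type.
enum_level = json.loads("""
{
    "name": "color",
    "type": {
        "type": "enum",
        "name": "Color",
        "symbols": ["UNKNOWN", "GREEN", "RED"],
        "default": "UNKNOWN"
    }
}
""")

print(enum_default(field_level))  # None
print(enum_default(enum_level))   # UNKNOWN
```

A check like this could run before registering a schema, to catch an enum that declares a fallback only at the field level.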

Chamara Liyanage answered Sep 17 '25 02:09