
Validate Pydantic dynamic float enum by name with OpenAPI description

Following on from this question and this discussion I am now trying to create a Pydantic BaseModel that has a field with a float Enum that is created dynamically and is validated by name. (Down the track I will probably want to use Decimal but for now I'm dealing with float.)

The discussion provides a solution to convert all Enums to validate by name, but I'm looking for how to do this for one or more individual fields, not a universal change to all Enums.

I consider this to be a common use case. The model uses an Enum which hides implementation details from the caller. The valid field values that a caller can supply are a limited list of names. These names are associated with internal values (in this case float) that the back-end wants to operate on, without requiring the caller to know them.

The Enum valid names and values do change dynamically and are loaded at run time but for the sake of clarity this would result in an Enum something like the following. Note that the Sex enum needs to be treated normally and validated and encoded by value, but the Factor enum needs to be validated by name:

from enum import Enum
from pydantic import BaseModel

class Sex(str, Enum):
    MALE = "M"
    FEMALE = "F"

class Factor(Enum):
    single = 1.0
    half = 0.4
    quarter = 0.1

class Model(BaseModel):
    sex: Sex
    factor: Factor
    class Config:
        json_encoders = {Factor: lambda field: field.name}

model = Model(sex="M", factor="half")
# Error: only accepts e.g. Model(sex="M", factor=0.4)

This is what I want, but it doesn't work: normal Pydantic Enum behaviour requires Model(factor=0.4), and my caller doesn't know which float is currently in use for this factor. It can and should only provide "half". The code that manipulates the model internally always wants the float, so I expect it to use model.factor.value.
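To make the gap concrete, here is a minimal sketch using only the standard library (no Pydantic involved). It shows that Enum itself already distinguishes lookup by value from lookup by name; the validation Pydantic applies by default corresponds to the first form only.

```python
from enum import Enum


class Factor(Enum):
    single = 1.0
    half = 0.4
    quarter = 0.1


# Calling the class looks a member up by its value ...
by_value = Factor(0.4)
# ... while subscripting the class looks it up by its name.
by_name = Factor["half"]

# Both routes reach the same member object.
print(by_value is by_name)

# Passing a name to the value lookup fails with ValueError,
# which is what the default validation runs into for factor="half".
try:
    Factor("half")
except ValueError as exc:
    print("value lookup failed:", exc)
```

So the task is essentially to make validation of this one field use the subscript form instead of the call form.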

It's fairly simple to create the Enum dynamically, but that doesn't provide any Pydantic support for validating on name. It's all automatically validated by value. So I think this is where most of the work is:

Factor = Enum("Factor", {"single": 1.0, "half": 0.4, "quarter": 0.1})

The standard way for Pydantic to customise serialization is with the json_encoders Config attribute. I've included that in the sample static Enum. That doesn't seem to be problematic.

Finally, there needs to be support to provide the right description to the OpenAPI schema.

Actually, in my use-case I only need the Enum name/values to be dynamically established. So an implementation that modifies a declared Enum would work, as well as an implementation that creates the Enum type.

asked Dec 06 '25 15:12 by NeilG


1 Answer

Update (2023-03-03)

Class decorator solution

A convenient way to solve this is by creating a reusable decorator that adds both a __get_validators__ method and a __modify_schema__ method to any given Enum class. Both of these methods are part of the Pydantic v1 API and are covered in its documentation on custom data types.

We can define a custom validator function that will be called for our decorated Enum classes. It accepts only two kinds of input: member names, which it turns into the corresponding members, and actual members, which pass through unchanged. Anything else, including raw values, fails validation.

The schema modifier will ensure that the JSON schema only shows the names as enum options.

from collections.abc import Callable, Iterator
from enum import EnumMeta
from typing import Any, Optional, TypeVar, cast

from pydantic.fields import ModelField

E = TypeVar("E", bound=EnumMeta)

def __modify_enum_schema__(
    field_schema: dict[str, Any],
    field: Optional[ModelField],
) -> None:
    if field is None:
        return
    field_schema["enum"] = list(cast(EnumMeta, field.type_).__members__.keys())

def __enum_name_validator__(v: Any, field: ModelField) -> Any:
    assert isinstance(field.type_, EnumMeta)
    if isinstance(v, field.type_):
        return v  # value is already an enum member
    try:
        return field.type_[v]  # get enum member by name
    except KeyError:
        raise ValueError(f"Invalid {field.type_.__name__} `{v}`")

def __get_enum_validators__() -> Iterator[Callable[..., Any]]:
    yield __enum_name_validator__

def validate_by_name(cls: E) -> E:
    setattr(cls, "__modify_schema__", __modify_enum_schema__)
    setattr(cls, "__get_validators__", __get_enum_validators__)
    return cls

Usage

from enum import Enum
from random import choices, random
from string import ascii_lowercase

from pydantic import BaseModel

# ... import validate_by_name


# Randomly generate an enum of floats:
_members = {
    name: round(random(), 1)
    for name in choices(ascii_lowercase, k=3)
}
Factor = Enum("Factor", _members)  # type: ignore[misc]
validate_by_name(Factor)
first_member = next(iter(Factor))
print("`Factor` members:", dict(Factor.__members__))  # `__members__` is a mappingproxy; `dict` gives a plain repr
print("First `Factor` member:", first_member)


class Foo(Enum):
    member_a = "a"
    member_b = "b"


@validate_by_name
class Bar(int, Enum):
    x = 1
    y = 2


class Model(BaseModel):
    factor: Factor
    foo: Foo
    bar: Bar

    class Config:
        json_encoders = {Factor: lambda field: field.name}


obj = Model.parse_obj({
    "factor": first_member.name,
    "foo": "a",
    "bar": "x",
})
print(obj.json(indent=4))
print(Model.schema_json(indent=4))

Example output:

`Factor` members: {'r': <Factor.r: 0.1>, 'j': <Factor.j: 0.9>, 'z': <Factor.z: 0.6>}
First `Factor` member: Factor.r
{
    "factor": "r",
    "foo": "a",
    "bar": 1
}
{
    "title": "Model",
    "type": "object",
    "properties": {
        "factor": {
            "$ref": "#/definitions/Factor"
        },
        "foo": {
            "$ref": "#/definitions/Foo"
        },
        "bar": {
            "$ref": "#/definitions/Bar"
        }
    },
    "required": [
        "factor",
        "foo",
        "bar"
    ],
    "definitions": {
        "Factor": {
            "title": "Factor",
            "description": "An enumeration.",
            "enum": [
                "r",
                "j",
                "z"
            ]
        },
        "Foo": {
            "title": "Foo",
            "description": "An enumeration.",
            "enum": [
                "a",
                "b"
            ]
        },
        "Bar": {
            "title": "Bar",
            "description": "An enumeration.",
            "enum": [
                "x",
                "y"
            ],
            "type": "integer"
        }
    }
}

This just demonstrates a few variations for this approach. As you can see, the Factor and Bar enums are both validated by name, whereas Foo is validated by value (as a regular Enum).

Since we defined a custom JSON Encoder for Factor, the factor value is exported/encoded as the name string, while both Foo and Bar are exported by value (as a regular Enum).

Both Factor and Bar display the enum names in their JSON schema, while Foo shows the enum values.

Note that the "type": "integer" in the JSON Schema of Bar is only present because I specified int as an explicit base class of Bar; it disappears if we remove that. To further ensure consistency, we could of course also simply add "type": "string" inside our __modify_enum_schema__ function.
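As a rough sketch of that last suggestion, the modifier could overwrite the type alongside the enum list. To keep the snippet runnable without Pydantic, a SimpleNamespace with a type_ attribute stands in for the real ModelField, and the starting schema dict is written by hand; both are assumptions made purely for illustration.

```python
from enum import Enum
from types import SimpleNamespace
from typing import Any


def modify_enum_schema(field_schema: dict[str, Any], field: Any) -> None:
    """Show member names and force the schema type to string."""
    if field is None:
        return
    field_schema["enum"] = list(field.type_.__members__.keys())
    # Names are always strings, whatever the member values are.
    field_schema["type"] = "string"


class Bar(int, Enum):
    x = 1
    y = 2


# Hand-written stand-in for the schema fragment generated for an int enum.
schema: dict[str, Any] = {"title": "Bar", "enum": [1, 2], "type": "integer"}
# SimpleNamespace is a hypothetical substitute for pydantic's ModelField.
modify_enum_schema(schema, SimpleNamespace(type_=Bar))
print(schema)
```

With this in place the "integer" type contributed by the int base class no longer contradicts the string names listed under "enum".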

The only thing that is seemingly impossible right now is to also somehow register our custom way of encoding those enums inside our decorator, so that we do not need to set it in the Config or pass the encoder argument to json explicitly. That may be possible with a few changes to the BaseModel logic, but I think this would be overkill.
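For reference, the explicit-encoder route boils down to a fallback function of the kind json.dumps accepts through its default argument. The sketch below uses only the standard library; encode_enum_by_name is a made-up name, and handing such a function to the model's json method via its encoder argument is assumed to behave the same way.

```python
import json
from enum import Enum
from typing import Any

Factor = Enum("Factor", {"single": 1.0, "half": 0.4, "quarter": 0.1})


def encode_enum_by_name(obj: Any) -> Any:
    """Fallback encoder: emit the member name for any Enum member."""
    if isinstance(obj, Enum):
        return obj.name
    raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")


payload = {"factor": Factor.half}
# `default` is only consulted for objects json cannot serialize natively.
encoded = json.dumps(payload, default=encode_enum_by_name)
print(encoded)
```

One caveat: an enum with an int or str mixin, such as Bar above, is serialized natively by json.dumps, so the fallback is never consulted for it.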


Original answer

Validating Enum by name

The parsing part of your problem can be solved fairly easily with a custom validator.

Since a validator method can take the ModelField as an argument and that has the type_ attribute pointing to the type of the field, we can use that to try to coerce any value to a member of the corresponding Enum.

We can actually write a more or less generalized implementation that applies to any arbitrary Enum subtype fields. If we use the "*" argument for the validator, it will apply to all fields, but we also need to set pre=True to perform our checks before the default validators kick in:

from enum import Enum
from typing import Any

from pydantic import BaseModel, validator
from pydantic.fields import ModelField


class CustomBaseModel(BaseModel):
    @validator("*", pre=True)
    def coerce_to_enum_member(cls, v: Any, field: ModelField) -> Any:
        """For any `Enum` typed field, attempt to coerce the value to a member."""
        type_ = field.type_
        if not (isinstance(type_, type) and issubclass(type_, Enum)):
            return v  # field is not an enum type
        if isinstance(v, type_):
            return v  # value is already an enum member
        try:
            return type_(v)  # get enum member by value
        except ValueError:
            try:
                return type_[v]  # get enum member by name
            except KeyError:
                raise ValueError(f"Invalid {type_.__name__} `{v}`")

That validator is agnostic of the specific Enum subtype and should work for all of them, because it relies only on the common EnumType API (EnumType is the Python 3.11+ name for EnumMeta), such as EnumType.__getitem__ to get a member by name.

The nice thing about this approach is that while valid Enum names will be turned into the correct Enum members, passing a valid Enum value still works as it did before. As does passing the member directly.
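The lookup order described above can be tried out in isolation. This is a standalone sketch of the same value-then-name logic with the Pydantic parts stripped away; coerce_to_member is a hypothetical helper name, not something from the library.

```python
from enum import Enum
from typing import Any, Type


def coerce_to_member(enum_cls: Type[Enum], v: Any) -> Enum:
    """Return the member for `v`, trying member, then value, then name."""
    if isinstance(v, enum_cls):
        return v  # already a member
    try:
        return enum_cls(v)  # lookup by value
    except ValueError:
        try:
            return enum_cls[v]  # lookup by name
        except KeyError:
            raise ValueError(f"Invalid {enum_cls.__name__} `{v}`") from None


Factor = Enum("Factor", {"single": 1.0, "half": 0.4, "quarter": 0.1})

# All three spellings resolve to the same member.
results = [
    coerce_to_member(Factor, "half"),
    coerce_to_member(Factor, 0.4),
    coerce_to_member(Factor, Factor.half),
]
print(all(member is Factor.half for member in results))
```

Because the value lookup runs first, a string that happens to be both a value of one member and the name of another resolves to the value match.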

Enum names in the JSON Schema

This is a bit more hacky, but not too bad.

Pydantic actually allows us to easily customize schema generation for specific fields. This is done by adding the __modify_schema__ classmethod to the type in question.

For Enum this turns out to be tricky, especially since you want it to be created dynamically (via the Functional API). We cannot simply subclass Enum and add our modifier method there due to some magic around the EnumType. What we can do is simply monkey-patch it into Enum (or alternatively do that to our specific Enum subclasses).

Either way, this method again gives us all we need to replace the default "enum" schema section with an array of names instead of values:

from enum import Enum
from typing import Any, Optional

from pydantic.fields import ModelField


def __modify_enum_schema__(
    field_schema: dict[str, Any],
    field: Optional[ModelField],
) -> None:
    if field is None:
        return
    enum_cls = field.type_
    assert isinstance(enum_cls, type) and issubclass(enum_cls, Enum)
    field_schema["enum"] = list(enum_cls.__members__.keys())


# Monkey-patch `Enum` to customize schema modification:
Enum.__modify_schema__ = __modify_enum_schema__  # type: ignore[attr-defined]

And that is all we need. (Mypy will complain about the monkey-patching of course.)

Full demo

from enum import Enum
from random import choices, random
from string import ascii_lowercase
from typing import Any, Optional

from pydantic import BaseModel, validator
from pydantic.fields import ModelField


def __modify_enum_schema__(
    field_schema: dict[str, Any],
    field: Optional[ModelField],
) -> None:
    if field is None:
        return
    enum_cls = field.type_
    assert isinstance(enum_cls, type) and issubclass(enum_cls, Enum)
    field_schema["enum"] = list(enum_cls.__members__.keys())


# Monkey-patch `Enum` to customize schema modification:
Enum.__modify_schema__ = __modify_enum_schema__  # type: ignore[attr-defined]


class CustomBaseModel(BaseModel):
    @validator("*", pre=True)
    def coerce_to_enum_member(cls, v: Any, field: ModelField) -> Any:
        """For any `Enum` typed field, attempt to coerce the value to a member."""
        type_ = field.type_
        if not (isinstance(type_, type) and issubclass(type_, Enum)):
            return v  # field is not an enum type
        if isinstance(v, type_):
            return v  # value is already an enum member
        try:
            return type_(v)  # get enum member by value
        except ValueError:
            try:
                return type_[v]  # get enum member by name
            except KeyError:
                raise ValueError(f"Invalid {type_.__name__} `{v}`")


# Randomly generate an enum of floats:
_members = {
    name: round(random(), 1)
    for name in choices(ascii_lowercase, k=3)
}
Factor = Enum("Factor", _members)  # type: ignore[misc]
first_member_name = next(iter(Factor)).name
print("Random `Factor` members:", dict(Factor.__members__))  # `__members__` is a mappingproxy; `dict` gives a plain repr
print("First member:", first_member_name)


class Model(CustomBaseModel):
    factor: Factor
    foo: str
    bar: int

    class Config:
        json_encoders = {Factor: lambda field: field.name}


obj = Model.parse_obj({
    "factor": first_member_name,
    "foo": "spam",
    "bar": -1,
})
print(obj.json(indent=4))
print(Model.schema_json(indent=4))

Output:

Random `Factor` members: {'a': <Factor.a: 0.9>, 'q': <Factor.q: 0.6>, 'e': <Factor.e: 0.8>}
First member: a
{
    "factor": "a",
    "foo": "spam",
    "bar": -1
}
{
    "title": "Model",
    "type": "object",
    "properties": {
        "factor": {
            "$ref": "#/definitions/Factor"
        },
        "foo": {
            "title": "Foo",
            "type": "string"
        },
        "bar": {
            "title": "Bar",
            "type": "integer"
        }
    },
    "required": [
        "factor",
        "foo",
        "bar"
    ],
    "definitions": {
        "Factor": {
            "title": "Factor",
            "description": "An enumeration.",
            "enum": [
                "a",
                "q",
                "e"
            ]
        }
    }
}

Notes

I chose this super weird way of randomly generating an Enum just for illustrative purposes. I wanted to show that both validation and schema generation still work fine in that case. But in practice I would assume that the names actually don't change that drastically every time the program is run. (At least I hope they don't for the sake of your users.)

The value of factor is still a regular Enum member, so obj.factor.value will still give us 0.9 (for this random example).

The validator will obviously prevent invalid names/values from being passed. You can make it more specific if you like, or restrict it to only deal with str arguments, treating them as Enum member names and delegating everything else to Pydantic's default validator. As it is written right now, it essentially replaces that default Enum validator.
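A possible shape for that stricter variant, again as a plain function outside of Pydantic (coerce_name_only is a hypothetical name): strings are treated strictly as names, and anything else is returned untouched so that a later validation step can deal with it.

```python
from enum import Enum
from typing import Any, Type


def coerce_name_only(enum_cls: Type[Enum], v: Any) -> Any:
    """Resolve strings as member names; pass everything else through."""
    if not isinstance(v, str):
        return v  # not a name: leave it for the default validation
    try:
        return enum_cls[v]
    except KeyError:
        raise ValueError(f"Invalid {enum_cls.__name__} name `{v}`") from None


Factor = Enum("Factor", {"single": 1.0, "half": 0.4, "quarter": 0.1})

member = coerce_name_only(Factor, "quarter")  # resolved by name
untouched = coerce_name_only(Factor, 0.1)     # returned unchanged
print(member, untouched)
```

Note that this variant would not suit an enum like Sex from the question, whose values are themselves strings: "M" would be rejected as an unknown name instead of being matched by value.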

Any other schema modifications (such as the description) can be done according to the docs I linked as well.
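For instance, a description could be filled in by the same kind of modifier. The sketch below is an assumption about how one might do it, not the documented recipe: the wording of the description is invented, and a SimpleNamespace again stands in for ModelField so that the snippet runs without Pydantic.

```python
from enum import Enum
from types import SimpleNamespace
from typing import Any


def describe_enum_names(field_schema: dict[str, Any], field: Any) -> None:
    """Replace the generic description with a list of the allowed names."""
    if field is None:
        return
    names = list(field.type_.__members__.keys())
    field_schema["enum"] = names
    # Hypothetical wording; any text can be placed here.
    field_schema["description"] = "Allowed names: " + ", ".join(names)


Factor = Enum("Factor", {"single": 1.0, "half": 0.4, "quarter": 0.1})

# Hand-written stand-in for the default schema fragment of the enum.
schema: dict[str, Any] = {
    "title": "Factor",
    "description": "An enumeration.",
    "enum": [1.0, 0.4, 0.1],
}
describe_enum_names(schema, SimpleNamespace(type_=Factor))
print(schema["description"])
```

Since the names are read from the enum at schema-generation time, the description stays in step with whatever members were loaded at run time.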

answered Dec 08 '25 09:12 by Daniil Fajnberg


