Unicode constructor will accept a unicode object, but ONLY if no kwargs are passed

Question

Example:

>>> uni = u'some text'
>>> print unicode(uni)
some text
>>> print unicode(uni, errors='ignore')
TypeError                                 Traceback (most recent call last)
----> 1 print unicode(uni, errors='ignore')
TypeError: decoding Unicode is not supported

Why does this blow up only if I pass additional parameters to the constructor?

unutbu · Accepted Answer

Looking at the CPython source code for the constructor (unicode_new in Objects/unicodeobject.c):

static PyObject *
unicode_new(PyTypeObject *type, PyObject *args, PyObject *kwds)
{
    PyObject *x = NULL;
    static char *kwlist[] = {"object", "encoding", "errors", 0};
    char *encoding = NULL;
    char *errors = NULL;

    if (type != &PyUnicode_Type)
        return unicode_subtype_new(type, args, kwds);
    if (!PyArg_ParseTupleAndKeywords(args, kwds, "|Oss:str",
                                     kwlist, &x, &encoding, &errors))
        return NULL;
    if (x == NULL)
        _Py_RETURN_UNICODE_EMPTY();
    if (encoding == NULL && errors == NULL)
        return PyObject_Str(x);
    else
        return PyUnicode_FromEncodedObject(x, encoding, errors);
}

Notice that near the bottom:

if (encoding == NULL && errors == NULL)
    return PyObject_Str(x);
else
    return PyUnicode_FromEncodedObject(x, encoding, errors);

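So when the constructor is called with neither encoding nor errors, PyObject_Str(x) is called, which accepts a unicode object and raises no TypeError. But when errors and/or encoding is supplied (as a keyword or positionally), PyUnicode_FromEncodedObject is called instead, and x must then be an encoded byte string, not a unicode object. Passing a unicode object on that path is what raises "TypeError: decoding Unicode is not supported".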
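A minimal sketch of one way to work around this, assuming the goal is simply to end up with text whether or not the input is already unicode. The helper name to_text and the utf-8 default are illustrative assumptions, not part of the answer above. text_type resolves to unicode on Python 2 and to str on Python 3, where the same error appears as "decoding str is not supported".

```python
import sys

# Text type: `unicode` on Python 2, `str` on Python 3.
text_type = unicode if sys.version_info[0] == 2 else str  # noqa: F821


def to_text(value, encoding="utf-8", errors="strict"):
    """Hypothetical helper: return text unchanged, decode byte strings."""
    if isinstance(value, text_type):
        # Already decoded; passing encoding/errors here would raise TypeError.
        return value
    # Byte strings go down the decoding path with both arguments supplied.
    return text_type(value, encoding, errors)


# The failing call from the question, reproduced and caught.
try:
    text_type(u"some text", errors="ignore")
    decoding_text_failed = False
except TypeError:
    decoding_text_failed = True

print(decoding_text_failed)
print(to_text(u"some text", errors="ignore"))
print(to_text(b"some text", errors="ignore"))
```

Checking the type first keeps already-decoded text off the decoding branch entirely, so errors can be passed uniformly by the caller. The sketch only handles text and byte strings; other objects would need their own handling.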

Tags:

python
