
Django - Url pattern regex not matching string parameter with accents

Tags: python, django

I'm having some trouble passing string arguments with accents to my Django application. I have the following url pattern:

url(r'^galeria/(?P<page>\d+)/(?P<order>\w+)/(?P<query>[\w|\W]+)', 'possible_brastemp.views.gallery_with_page_and_query'),

When I try a url like:

 http://127.0.0.1:8000/galeria/1/ultimos/Julian%20Andr%E9s

the pattern is not matched. I have isolated the problem to the '%E9' character (the '%20' doesn't break the match).

How can I change the regex to match parameters with encoded characters?

Thank you

Raphael asked Oct 26 '25 06:10


1 Answer

Use %c3%a9 instead of %e9 in the URL. The regex isn't failing... Django isn't even getting to the urlconf. Check the logs, you're probably getting 400 errors.

URI paths should contain UTF-8-encoded characters only. Any UTF-8 character that cannot be represented as a normal, printable ASCII character (and is not on the reserved list) should be percent-encoded.

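é (U+00E9) is a multibyte character in UTF-8: the two bytes 0xC3 0xA9. The percent-encoded form is therefore %C3%A9. The single byte 0xE9 on its own is NOT a valid UTF-8 sequence; it is how é is encoded in Latin-1 (ISO-8859-1), which is where %E9 comes from.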
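To see the difference between the two encodings, here is a minimal sketch using only the Python 3 standard library. The `is_valid_utf8` helper is a hypothetical name introduced just for this illustration; it is not part of Django.

```python
from urllib.parse import quote, unquote_to_bytes

name = "Julian Andrés"

# quote() percent-encodes using UTF-8 by default.
utf8_encoded = quote(name)
# Forcing Latin-1 reproduces the encoding used in the failing URL.
latin1_encoded = quote(name, encoding="latin-1")


def is_valid_utf8(raw: bytes) -> bool:
    """Hypothetical helper: True if the raw bytes decode cleanly as UTF-8."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


print(utf8_encoded)    # Julian%20Andr%C3%A9s
print(latin1_encoded)  # Julian%20Andr%E9s
print(is_valid_utf8(unquote_to_bytes(utf8_encoded)))    # True
print(is_valid_utf8(unquote_to_bytes(latin1_encoded)))  # False
```

If the links are generated by your own code, building them with `quote()` and its default encoding should produce the %C3%A9 form.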

See RFC 3986.

[\w|\W]+ successfully matches URLs containing %C3%A9. Django appears to percent-decode the URL byte string into a Unicode string, then converts it to UTF-8 for urlconf matching.
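You can check the regex outside Django with a rough sketch like the one below. It assumes the path is percent-decoded as UTF-8 and matched without its leading slash; this only approximates what Django does and is not its actual resolver code.

```python
import re
from urllib.parse import unquote

# Same pattern as in the question's urlconf.
pattern = re.compile(
    r'^galeria/(?P<page>\d+)/(?P<order>\w+)/(?P<query>[\w|\W]+)'
)

# Decode the percent-encoded path (UTF-8 is unquote()'s default).
path = unquote("galeria/1/ultimos/Julian%20Andr%C3%A9s")

match = pattern.match(path)
groups = match.groupdict() if match else None
print(groups)
# {'page': '1', 'order': 'ultimos', 'query': 'Julian Andrés'}
```

Note that the `|` inside the character class is a literal pipe, not an alternation, so `[\w\W]+` would match the same strings.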

Colin Dunklau answered Oct 27 '25 20:10


