
Python UTF-8 encoding with Arabic

I have an encoding problem when I try to crawl YouTube (an Arabic channel):

#!/usr/bin/python
# -*- coding: utf8 -*- 
from django.core.management.base import BaseCommand, CommandError
import requests, lxml, re
from lxml import html

class Command(BaseCommand):
    def handle(self, *args, **options):
        r = requests.get("https://www.youtube.com/user/aljazeerachannel/videos?view=0")
        root = lxml.html.fromstring(r.content)

        for data in root.xpath('.//*[@id="branded-page-body"]/div/div/div[1]/div/div[2]/ul/li[1]/span/span/a'):
            print data.text

The result is:

[root@vmi9105 buzzbal]# python manage.py youtube

        Ø§ÙØªØ®Ø§Ø¨Ø§Øª اÙÙØ¬Ø§Ùس Ø§ÙØ¨ÙØ¯ÙØ© ÙÙ Ø³ÙØ·ÙØ© عÙÙØ§Ù
Benabra asked Feb 04 '26 03:02


1 Answer

Try this; it solved my problem in Python:

f"{yourString}".encode('latin-1').decode("utf-8")
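
Why this can work: output like the one in the question is typically what you get when UTF-8 bytes are decoded as Latin-1. Encoding the garbled string back to Latin-1 recovers the original bytes, and decoding those as UTF-8 gives the Arabic text again. Here is a minimal sketch of that round trip, assuming Python 3 and that no bytes were lost along the way. The helper name `fix_mojibake` and the sample string are only for illustration and are not from the original post.

```python
def fix_mojibake(text: str) -> str:
    """Undo a wrong Latin-1 decode of UTF-8 bytes (hypothetical helper).

    Returns the input unchanged if it cannot be round-tripped, for example
    when the text is already correct Arabic or when bytes were lost.
    """
    try:
        # Latin-1 maps code points 0-255 one-to-one onto bytes, so this
        # recovers the raw bytes that were mis-decoded in the first place.
        raw = text.encode("latin-1")
        # Reinterpret those bytes with the encoding they were written in.
        return raw.decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        return text


# Illustrative sample: simulate the bug, then repair it.
original = "انتخابات"
garbled = original.encode("utf-8").decode("latin-1")
repaired = fix_mojibake(garbled)
print(repaired == original)
```

This only repairs text after the fact. In the scraper from the question, it may be cleaner to decode the response bytes explicitly before parsing, e.g. passing `r.content.decode("utf-8")` to `lxml.html.fromstring`, assuming the page really is served as UTF-8.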
fddfdffd answered Feb 07 '26 01:02
