
How to install module for BeautifulSoup XML parsing?

In this answer, I was told to not use BeautifulSoup(xmlData, 'html.parser') for XML parsing but to use BeautifulSoup(xmlData, 'xml'). That parser, however, does not come with BeautifulSoup.

As per one of the comments, I tried:

python -m pip install lxml

But got:

Collecting lxml
  Using cached lxml-3.6.4.tar.gz
Installing collected packages: lxml
  Running setup.py install for lxml ... error
    Complete output from command D:\SOFT\Python3\python.exe -u -c "import setuptools, tokenize;__file__='C:\\U
sers\\myuser\\AppData\\Local\\Temp\\pip-build-hl9fxzny\\lxml\\setup.py';f=getattr(tokenize, 'open', open)(__fi
le__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record C:
\Users\myuser\AppData\Local\Temp\pip-ivemv19a-record\install-record.txt --single-version-externally-managed --
compile:
    Building lxml version 3.6.4.
    Building without Cython.
    ERROR: b"'xslt-config' is not recognized as an internal or external command,\r\noperable program or batch
file.\r\n"
    ** make sure the development packages of libxml2 and libxslt are installed **

    Using build configuration of libxslt
    running install
    running build
    running build_py
    creating build
    creating build\lib.win32-3.5
    creating build\lib.win32-3.5\lxml
    copying src\lxml\builder.py -> build\lib.win32-3.5\lxml
    copying src\lxml\cssselect.py -> build\lib.win32-3.5\lxml
    copying src\lxml\doctestcompare.py -> build\lib.win32-3.5\lxml
    copying src\lxml\ElementInclude.py -> build\lib.win32-3.5\lxml
    copying src\lxml\pyclasslookup.py -> build\lib.win32-3.5\lxml
    copying src\lxml\sax.py -> build\lib.win32-3.5\lxml
    copying src\lxml\usedoctest.py -> build\lib.win32-3.5\lxml
    copying src\lxml\_elementpath.py -> build\lib.win32-3.5\lxml
    copying src\lxml\__init__.py -> build\lib.win32-3.5\lxml
    creating build\lib.win32-3.5\lxml\includes
    copying src\lxml\includes\__init__.py -> build\lib.win32-3.5\lxml\includes
    creating build\lib.win32-3.5\lxml\html
    copying src\lxml\html\builder.py -> build\lib.win32-3.5\lxml\html
    copying src\lxml\html\clean.py -> build\lib.win32-3.5\lxml\html
    copying src\lxml\html\defs.py -> build\lib.win32-3.5\lxml\html
    copying src\lxml\html\diff.py -> build\lib.win32-3.5\lxml\html
    copying src\lxml\html\ElementSoup.py -> build\lib.win32-3.5\lxml\html
    copying src\lxml\html\formfill.py -> build\lib.win32-3.5\lxml\html
    copying src\lxml\html\html5parser.py -> build\lib.win32-3.5\lxml\html
    copying src\lxml\html\soupparser.py -> build\lib.win32-3.5\lxml\html
    copying src\lxml\html\usedoctest.py -> build\lib.win32-3.5\lxml\html
    copying src\lxml\html\_diffcommand.py -> build\lib.win32-3.5\lxml\html
    copying src\lxml\html\_html5builder.py -> build\lib.win32-3.5\lxml\html
    copying src\lxml\html\_setmixin.py -> build\lib.win32-3.5\lxml\html
    copying src\lxml\html\__init__.py -> build\lib.win32-3.5\lxml\html
    creating build\lib.win32-3.5\lxml\isoschematron
    copying src\lxml\isoschematron\__init__.py -> build\lib.win32-3.5\lxml\isoschematron
    copying src\lxml\lxml.etree.h -> build\lib.win32-3.5\lxml
    copying src\lxml\lxml.etree_api.h -> build\lib.win32-3.5\lxml
    copying src\lxml\includes\c14n.pxd -> build\lib.win32-3.5\lxml\includes
    copying src\lxml\includes\config.pxd -> build\lib.win32-3.5\lxml\includes
    copying src\lxml\includes\dtdvalid.pxd -> build\lib.win32-3.5\lxml\includes
    copying src\lxml\includes\etreepublic.pxd -> build\lib.win32-3.5\lxml\includes
    copying src\lxml\includes\htmlparser.pxd -> build\lib.win32-3.5\lxml\includes
    copying src\lxml\includes\relaxng.pxd -> build\lib.win32-3.5\lxml\includes
    copying src\lxml\includes\schematron.pxd -> build\lib.win32-3.5\lxml\includes
    copying src\lxml\includes\tree.pxd -> build\lib.win32-3.5\lxml\includes
    copying src\lxml\includes\uri.pxd -> build\lib.win32-3.5\lxml\includes
    copying src\lxml\includes\xinclude.pxd -> build\lib.win32-3.5\lxml\includes
    copying src\lxml\includes\xmlerror.pxd -> build\lib.win32-3.5\lxml\includes
    copying src\lxml\includes\xmlparser.pxd -> build\lib.win32-3.5\lxml\includes
    copying src\lxml\includes\xmlschema.pxd -> build\lib.win32-3.5\lxml\includes
    copying src\lxml\includes\xpath.pxd -> build\lib.win32-3.5\lxml\includes
    copying src\lxml\includes\xslt.pxd -> build\lib.win32-3.5\lxml\includes
    copying src\lxml\includes\etree_defs.h -> build\lib.win32-3.5\lxml\includes
    copying src\lxml\includes\lxml-version.h -> build\lib.win32-3.5\lxml\includes
    creating build\lib.win32-3.5\lxml\isoschematron\resources
    creating build\lib.win32-3.5\lxml\isoschematron\resources\rng
    copying src\lxml\isoschematron\resources\rng\iso-schematron.rng -> build\lib.win32-3.5\lxml\isoschematron\
resources\rng
    creating build\lib.win32-3.5\lxml\isoschematron\resources\xsl
    copying src\lxml\isoschematron\resources\xsl\RNG2Schtrn.xsl -> build\lib.win32-3.5\lxml\isoschematron\reso
urces\xsl
    copying src\lxml\isoschematron\resources\xsl\XSD2Schtrn.xsl -> build\lib.win32-3.5\lxml\isoschematron\reso
urces\xsl
    creating build\lib.win32-3.5\lxml\isoschematron\resources\xsl\iso-schematron-xslt1
    copying src\lxml\isoschematron\resources\xsl\iso-schematron-xslt1\iso_abstract_expand.xsl -> build\lib.win
32-3.5\lxml\isoschematron\resources\xsl\iso-schematron-xslt1
    copying src\lxml\isoschematron\resources\xsl\iso-schematron-xslt1\iso_dsdl_include.xsl -> build\lib.win32-
3.5\lxml\isoschematron\resources\xsl\iso-schematron-xslt1
    copying src\lxml\isoschematron\resources\xsl\iso-schematron-xslt1\iso_schematron_message.xsl -> build\lib.
win32-3.5\lxml\isoschematron\resources\xsl\iso-schematron-xslt1
    copying src\lxml\isoschematron\resources\xsl\iso-schematron-xslt1\iso_schematron_skeleton_for_xslt1.xsl ->
 build\lib.win32-3.5\lxml\isoschematron\resources\xsl\iso-schematron-xslt1
    copying src\lxml\isoschematron\resources\xsl\iso-schematron-xslt1\iso_svrl_for_xslt1.xsl -> build\lib.win3
2-3.5\lxml\isoschematron\resources\xsl\iso-schematron-xslt1
    copying src\lxml\isoschematron\resources\xsl\iso-schematron-xslt1\readme.txt -> build\lib.win32-3.5\lxml\i
soschematron\resources\xsl\iso-schematron-xslt1
    running build_ext
    building 'lxml.etree' extension
    error: Unable to find vcvarsall.bat

    ----------------------------------------
Command "D:\SOFT\Python3\python.exe -u -c "import setuptools, tokenize;__file__='C:\\Users\\myuser\\AppData\\L
ocal\\Temp\\pip-build-hl9fxzny\\lxml\\setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().repl
ace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record C:\Users\myuser\AppData\Lo
cal\Temp\pip-ivemv19a-record\install-record.txt --single-version-externally-managed --compile" failed with err
or code 1 in C:\Users\myuser\AppData\Local\Temp\pip-build-hl9fxzny\lxml\

I am using Python 3.5.2 and would like something that works straight out of pip, meaning it doesn't need to be compiled separately.

amphibient asked Oct 22 '25 08:10



1 Answer

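pip did not find a prebuilt wheel for your setup, so it fell back to the source archive (lxml-3.6.4.tar.gz) and tried to compile it. On Windows that requires a C compiler, which is what the "Unable to find vcvarsall.bat" error is telling you.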
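Before working around the build, it may be worth confirming which modules are actually missing from the interpreter you run pip with. A minimal sketch using only the standard library; the helper name `missing_modules` is made up for illustration, and it only checks top-level module names:

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found by this interpreter."""
    # find_spec returns None when a top-level module is not installed.
    return [name for name in names if importlib.util.find_spec(name) is None]


# bs4 is the import name of BeautifulSoup; lxml backs its 'xml' parser.
print("Missing:", missing_modules(["bs4", "lxml"]))
```

If `lxml` shows up in the list, `BeautifulSoup(xmlData, 'xml')` has no parser to use yet.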

Some unofficial builds are available here: http://www.lfd.uci.edu/~gohlke/pythonlibs/#lxml

Find the URL of the wheel that matches your Python version and architecture (32-bit or 64-bit); then this should work:

pip install http://url_to_wheel
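To pick the right file, the tags in the wheel's filename have to match your interpreter. The `build\lib.win32-3.5` paths in your log suggest a 32-bit Python 3.5, which would mean a name containing `cp35` and `win32`. Here is a rough sketch for checking this on your own machine; the function name `wheel_tags` is hypothetical, it assumes CPython on Windows, and it only covers the two common platform tags:

```python
import struct
import sys


def wheel_tags(major, minor, pointer_bits):
    """Return the (python_tag, platform_tag) to look for in a Windows wheel filename."""
    python_tag = "cp{}{}".format(major, minor)  # e.g. cp35 for CPython 3.5
    platform_tag = "win_amd64" if pointer_bits == 64 else "win32"
    return python_tag, platform_tag


# The pointer size reflects the interpreter's bitness, not the operating system's.
bits = struct.calcsize("P") * 8
py_tag, plat_tag = wheel_tags(sys.version_info[0], sys.version_info[1], bits)
print("Look for a wheel whose name contains", py_tag, "and", plat_tag)
```

Run it with the same `python` you use for `python -m pip`, since a 32-bit Python on 64-bit Windows still needs the `win32` wheel.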
Stephane Martin answered Oct 23 '25 23:10



