
Why not use fewer lines of code in this function (urllib.urlopen)?

I am working through a beginner-level Python tutorial, and I found that I can get the same result with one fewer line of code. I don't know why the instructor (a professional with 15+ years of experience) chose to do it with an additional line and variable. My assumption is that it has to do with low-level use of the urllib library, with programming best practices, or both, but I'm hoping someone can tell me why.

The function in question is:

from urllib.request import urlopen

def load_page(url: str):
    response = urlopen(url)
    html = response.read().decode('utf')
    return html

As I've learned, this returns the actual HTML of the request. But while I was debugging and just inspecting what each piece of that function does/returns, I found that I can get the same result (the HTML of the request) by removing the "response" variable completely:

def load_page(url: str):
    html = urlopen(url).read().decode('utf')
    return html

Is there a reason you would want to first assign urlopen(url) to response, instead of just running read() and decode() directly on urlopen(url)?


triplethreatguy asked Dec 07 '25 19:12


1 Answer

Just as functions should "do one thing", the same can be said of each line of code. There is a similar question on Software Engineering about the broader concept of reducing line count: "Is fewer lines of code always better?"

Here there are two operations: one to make the request, and one to get the contents of the response.

from urllib.request import urlopen

def load_page(url: str):
    response = urlopen(url)               # make request
    html = response.read().decode('utf')  # get contents
    return html

This has a number of advantages:

  • Debugging: If it fails, the line number in the traceback tells you where. Did it fail in urlopen, or in read()/decode()?
  • Extending the code: Do you also want to inspect the status of the response or its headers? Easy, just access response.status or response.headers.
  • Readability: The more operations/calls there are in one line, the harder it is to immediately understand what the line is doing.
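
To illustrate the "extending" and "debugging" points, here is a minimal sketch of what keeping the response around makes possible. The function name load_page_checked, the fallback to UTF-8, and the sample data: URL are assumptions made for this example; they are not part of the tutorial's code.

```python
from urllib.request import urlopen


def load_page_checked(url: str) -> str:
    """Hypothetical variant of load_page that inspects the response first."""
    response = urlopen(url)  # make request; a failure here points at this line
    # Ask the headers for the declared charset. Falling back to UTF-8 is an
    # assumption of this sketch, not something the original tutorial does.
    charset = response.headers.get_content_charset() or "utf-8"
    raw = response.read()  # get contents as bytes
    return raw.decode(charset)  # a decoding failure points at this line


# A data: URL keeps the demo offline (assumed sample input).
sample_url = "data:text/html;charset=utf-8,%3Ch1%3Ehi%3C%2Fh1%3E"
page = load_page_checked(sample_url)
```

Because each step has its own line and its own name, a traceback or a debugger breakpoint can single out the request, the read, or the decode. With the one-liner, all three share a line number.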

However, in this case it would be "acceptable" to do the read().decode(...) in the return line:

from urllib.request import urlopen

def load_page(url: str):
    response = urlopen(url)               # make request
    return response.read().decode('utf')  # get contents and return

MSeifert answered Dec 11 '25 23:12