I am using Python to create a "favorites" section of a website. Part of what I want to do is grab an image to put next to each link: the user enters a URL, and I grab a screenshot of that page and display it next to the link. Easy enough?
I downloaded pywebshot and it works great from the terminal on my local box. However, when I run it on the server, I get a segmentation fault with the following output:
/usr/lib/pymodules/python2.6/gtk-2.0/gtk/__init__.py:57: GtkWarning: could not open display
warnings.warn(str(e), _gtk.Warning)
./pywebshot.py:16: Warning: invalid (NULL) pointer instance
self.parent = gtk.Window(gtk.WINDOW_TOPLEVEL)
./pywebshot.py:16: Warning: g_signal_connect_data: assertion `G_TYPE_CHECK_INSTANCE (instance)' failed
self.parent = gtk.Window(gtk.WINDOW_TOPLEVEL)
./pywebshot.py:49: GtkWarning: Screen for GtkWindow not set; you must always set
a screen for a GtkWindow before using the window
self.parent.show_all()
./pywebshot.py:49: GtkWarning: gdk_screen_get_default_colormap: assertion `GDK_IS_SCREEN (screen)' failed
self.parent.show_all()
./pywebshot.py:49: GtkWarning: gdk_colormap_get_visual: assertion `GDK_IS_COLORMAP (colormap)' failed
self.parent.show_all()
./pywebshot.py:49: GtkWarning: gdk_screen_get_root_window: assertion `GDK_IS_SCREEN (screen)' failed
self.parent.show_all()
./pywebshot.py:49: GtkWarning: gdk_window_new: assertion `GDK_IS_WINDOW (parent)' failed
self.parent.show_all()
Segmentation fault
I know that some things can't run in a pts environment, but honestly that's a little beyond me right now. If I need to somehow pretend that my pts connection is tty, I can try it. But at this point I'm not even sure what's going on and I admit it's a bit over my head. Any help would be greatly appreciated.
Also, if there's a web service that I can pass a url and receive an image, that would work just as well. I am NOT married to the idea of pywebshot.
I do know that the server I'm on is running X and has all the necessary python modules installed.
Thanks in advance.
This is the code I used to get the screenshot of the whole scrolled webpage:
from PIL import Image
from io import BytesIO
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.chrome.service import Service
from webdriver_manager.chrome import ChromeDriverManager
import logging
import os
import time

# Set default download folder for ChromeDriver
videos_folder = r"./download"
if not os.path.exists(videos_folder):
    os.makedirs(videos_folder)
prefs = {"download.default_directory": videos_folder}


def open_url(address):
    # SELENIUM SETUP
    logging.getLogger('WDM').setLevel(logging.WARNING)  # hide less relevant webdriver-manager messages
    chrome_options = Options()
    # Recent Selenium 4 releases no longer support "chrome_options.headless = True"
    chrome_options.add_argument("--headless=new")
    chrome_options.add_experimental_option("prefs", prefs)
    driver = webdriver.Chrome(service=Service(ChromeDriverManager().install()), options=chrome_options)
    driver.implicitly_wait(1)
    driver.maximize_window()
    driver.get(address)
    driver.set_window_size(1920, 1080)  # to set the screenshot width
    save_screenshot(driver, '{}/Screenshot.png'.format(videos_folder))
    driver.quit()


def save_screenshot(driver, file_name):
    # Resize the (headless) window to the full page size, then capture it
    height, width = scroll_down(driver)
    driver.set_window_size(width, height)
    img_binary = driver.get_screenshot_as_png()
    img = Image.open(BytesIO(img_binary))
    img.save(file_name)
    # print(file_name)
    print("Screenshot saved!")


def scroll_down(driver):
    total_width = driver.execute_script("return document.body.offsetWidth")
    total_height = driver.execute_script("return document.body.parentNode.scrollHeight")
    viewport_width = driver.execute_script("return document.body.clientWidth")
    viewport_height = driver.execute_script("return window.innerHeight")
    # Split the page into viewport-sized rectangles
    rectangles = []
    i = 0
    while i < total_height:
        ii = 0
        top_height = i + viewport_height
        if top_height > total_height:
            top_height = total_height
        while ii < total_width:
            top_width = ii + viewport_width
            if top_width > total_width:
                top_width = total_width
            rectangles.append((ii, i, top_width, top_height))
            ii = ii + viewport_width
        i = i + viewport_height
    # Scroll through every rectangle so lazily loaded content gets rendered
    previous = None
    part = 0
    for rectangle in rectangles:
        if previous is not None:
            driver.execute_script("window.scrollTo({0}, {1})".format(rectangle[0], rectangle[1]))
            time.sleep(0.5)
        # time.sleep(0.2)
        if rectangle[1] + viewport_height > total_height:
            offset = (rectangle[0], total_height - viewport_height)
        else:
            offset = (rectangle[0], rectangle[1])
        previous = rectangle
    return total_height, total_width


open_url("https://stackoverflow.com/questions/4091940/how-to-save-web-page-as-image-using-python")
Here is the screenshot obtained:

The current stable release of ChromeDriver is 114.0.5735.90, which is not compatible with the current version (as of 2024.06.04) of Chrome (125.0.6422.141), so the script, as above, would not work.
For now the fix is unfortunately manual: download the ChromeDriver build that matches your current stable version of Chrome from here, as shown in the image below (for Chrome 125.0.6422.141):

Once the chromedriver-linux64.zip archive has been saved, extract it, rename the extracted folder to the relevant Chrome version (125.0.6422.141), and move it to ~/.wdm/drivers/chromedriver/linux64/, so that the binary ends up at ~/.wdm/drivers/chromedriver/linux64/125.0.6422.141/chromedriver. Then modify the script by replacing driver = webdriver.Chrome(service=Service(ChromeDriverManager().install()), options=chrome_options) with driver = webdriver.Chrome(service=Service(executable_path=os.path.expanduser("~/.wdm/drivers/chromedriver/linux64/125.0.6422.141/chromedriver")), options=chrome_options). Recent Selenium 4 releases no longer accept executable_path as a direct argument of webdriver.Chrome, and the ~ has to be expanded explicitly because it is not expanded inside a Python string.
That's all!
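As a rough sketch of the manual setup described above, assuming Selenium 4 (where the driver path goes through Service) and the ~/.wdm directory layout quoted above. The helper names local_chromedriver_path and make_driver are made up for illustration and are not part of Selenium or webdriver-manager:

```python
import os


def local_chromedriver_path(chrome_version, root="~/.wdm/drivers/chromedriver/linux64"):
    """Build the path of a manually installed chromedriver (hypothetical helper)."""
    # "~" is not expanded inside a Python string, so expand it explicitly.
    return os.path.join(os.path.expanduser(root), chrome_version, "chromedriver")


def make_driver(chrome_version):
    """Start headless Chrome with the manually installed driver (sketch)."""
    # Imported here so the path helper can be used without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options
    from selenium.webdriver.chrome.service import Service

    driver_path = local_chromedriver_path(chrome_version)
    if not os.path.isfile(driver_path):
        raise FileNotFoundError(f"chromedriver not found at {driver_path}")

    options = Options()
    options.add_argument("--headless=new")
    return webdriver.Chrome(service=Service(executable_path=driver_path), options=options)


if __name__ == "__main__":
    print(local_chromedriver_path("125.0.6422.141"))
```

Checking that the file exists before starting Chrome gives a clearer error than the one Selenium raises when the driver binary is missing.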
Steps for the following configuration:
My OS: Linux Fedora 40
My Browser: Chrome 138.0.7204.183 (latest official 64-bit build for my OS)
Visit the previously mentioned page, filter the JSON by your Chrome version (138.0.7204.183 in my case), and download the chromedriver build for your OS.
Extract chromedriver-linux64.zip, rename the extracted folder to 138.0.7204.183, and move it to ~/.wdm/drivers/chromedriver/linux64/.
Based on what was reported here, create a modified webpage_screenshot.py file as follows:
import argparse
import logging
import os
import time
from io import BytesIO

from PIL import Image
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.chrome.service import Service
from webdriver_manager.chrome import ChromeDriverManager


def parse_args():
    parser = argparse.ArgumentParser(description="Capture a screenshot of a web page.")
    parser.add_argument("-d", "--directory", type=str, default="./download",
                        help="Path of the directory in which to save the screenshot.")
    parser.add_argument("-u", "--url", type=str, required=True,
                        help="URL of the page to capture.")
    return parser.parse_args()


def open_url(address, download_folder):
    # Download folder settings
    download_folder = os.path.expanduser(download_folder)  # <<< expand ~
    if not os.path.exists(download_folder):
        os.makedirs(download_folder)
    prefs = {"download.default_directory": os.path.abspath(download_folder)}

    # SELENIUM SETUP
    logging.getLogger('WDM').setLevel(logging.WARNING)
    chrome_options = Options()
    # Recent Selenium 4 releases no longer support "chrome_options.headless = True"
    chrome_options.add_argument("--headless=new")
    chrome_options.add_experimental_option("prefs", prefs)
    service = Service(ChromeDriverManager().install())
    driver = webdriver.Chrome(service=service, options=chrome_options)
    driver.implicitly_wait(1)
    driver.set_window_size(1920, 1080)
    driver.get(address)
    output_path = os.path.join(download_folder, "Screenshot.png")
    save_screenshot(driver, output_path)
    driver.quit()


def save_screenshot(driver, file_name):
    # Resize the (headless) window to the full page size, then capture it
    height, width = scroll_down(driver)
    driver.set_window_size(width, height)
    img_binary = driver.get_screenshot_as_png()
    img = Image.open(BytesIO(img_binary))
    img.save(file_name)
    print(f"Screenshot saved to: {file_name}")


def scroll_down(driver):
    total_width = driver.execute_script("return document.body.offsetWidth")
    total_height = driver.execute_script("return document.body.parentNode.scrollHeight")
    viewport_width = driver.execute_script("return document.body.clientWidth")
    viewport_height = driver.execute_script("return window.innerHeight")
    # Split the page into viewport-sized rectangles
    rectangles = []
    i = 0
    while i < total_height:
        ii = 0
        top_height = min(i + viewport_height, total_height)
        while ii < total_width:
            top_width = min(ii + viewport_width, total_width)
            rectangles.append((ii, i, top_width, top_height))
            ii += viewport_width
        i += viewport_height
    # Scroll through every rectangle so lazily loaded content gets rendered
    previous = None
    for rectangle in rectangles:
        if previous is not None:
            driver.execute_script("window.scrollTo({0}, {1})".format(rectangle[0], rectangle[1]))
            time.sleep(0.5)
        previous = rectangle
    return total_height, total_width


if __name__ == "__main__":
    args = parse_args()
    open_url(args.url, args.directory)
The script now supports launching with two options (-u, --url; -d, --directory), instead of having to write them statically in the code.
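One detail worth illustrating is why the script calls os.path.expanduser on the directory: when ~/Desktop/Screenshot is passed inside quotes, the shell does not expand the tilde, so the script receives it literally. A small sketch that reuses the same two options and parses an explicit argument list (passing a list to parse_args is only done here for demonstration):

```python
import argparse
import os


def parse_args(argv=None):
    parser = argparse.ArgumentParser(description="Capture a screenshot of a web page.")
    parser.add_argument("-d", "--directory", type=str, default="./download")
    parser.add_argument("-u", "--url", type=str, required=True)
    return parser.parse_args(argv)


# Simulates: python webpage_screenshot.py -u https://example.com -d "~/Desktop/Screenshot"
args = parse_args(["-u", "https://example.com", "-d", "~/Desktop/Screenshot"])
print(args.directory)                      # the literal string, tilde included
print(os.path.expanduser(args.directory))  # tilde replaced by the home directory
```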
Run the script by entering the following commands:
python -m venv .venv
source .venv/bin/activate
pip install --upgrade pip
pip install selenium Pillow webdriver-manager
python webpage_screenshot.py -u "https://stackoverflow.com/questions/4091940/how-to-save-web-page-as-image-using-python" -d "~/Desktop/Screenshot"
The directory (~/Desktop/Screenshot) will be created with the screenshot inside it! Cheers
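The first two steps above (finding the chromedriver download that matches your Chrome version) could probably be scripted. The sketch below assumes that the page referred to is the Chrome for Testing dashboard and that its known-good-versions-with-downloads.json endpoint has the layout shown in the sample data; both the URL and the layout are assumptions and should be checked before relying on them. The function names are made up for illustration:

```python
import json
from urllib.request import urlopen

# Assumed endpoint (not given in the text above); verify before relying on it.
VERSIONS_URL = (
    "https://googlechromelabs.github.io/chrome-for-testing/"
    "known-good-versions-with-downloads.json"
)


def find_chromedriver_url(data, chrome_version, platform="linux64"):
    """Return the chromedriver download URL for an exact Chrome version, or None."""
    for entry in data.get("versions", []):
        if entry.get("version") != chrome_version:
            continue
        for download in entry.get("downloads", {}).get("chromedriver", []):
            if download.get("platform") == platform:
                return download.get("url")
    return None


def fetch_versions(url=VERSIONS_URL):
    """Download and decode the version list (requires network access)."""
    with urlopen(url, timeout=30) as response:
        return json.load(response)


# Offline example with made-up data in the assumed layout.
sample = {
    "versions": [
        {
            "version": "138.0.7204.183",
            "downloads": {
                "chromedriver": [
                    {"platform": "linux64", "url": "https://example.invalid/chromedriver-linux64.zip"},
                    {"platform": "win64", "url": "https://example.invalid/chromedriver-win64.zip"},
                ]
            },
        }
    ]
}
print(find_chromedriver_url(sample, "138.0.7204.183"))
```

With network access, find_chromedriver_url(fetch_versions(), "138.0.7204.183") would perform the same lookup against the live list, provided the assumptions above hold.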
from selenium import webdriver
from xvfbwrapper import Xvfb

# Start a virtual X display so the browser can run on a server without a screen
d = Xvfb(width=400, height=400)
d.start()
browser = webdriver.Firefox()
url = "http://stackoverflow.com/questions/4091940/how-to-save-web-page-as-image-using-python"
browser.get(url)
destination = "screenshot_filename.png"  # Selenium writes PNG data, so use a .png name
if browser.save_screenshot(destination):
    print("File saved in the destination filename")
browser.quit()
d.stop()  # shut down the virtual display
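The snippet above relies on Xvfb to provide a virtual X display, which is what the "could not open display" warning in the question suggests is missing. A minimal sketch of deciding whether a virtual display is needed, under the assumption that the usual cause is an unset DISPLAY variable (common for SSH sessions and web server processes, even when X is running on the machine). The function names needs_virtual_display and start_virtual_display are made up for illustration:

```python
import os


def needs_virtual_display(environ=None):
    """Return True when no X display is advertised via the DISPLAY variable."""
    env = os.environ if environ is None else environ
    return not env.get("DISPLAY", "").strip()


def start_virtual_display(width=1280, height=1024):
    """Start Xvfb via xvfbwrapper; the caller must call .stop() when done."""
    # Imported lazily so the check above works without xvfbwrapper installed.
    from xvfbwrapper import Xvfb

    display = Xvfb(width=width, height=height)
    display.start()
    return display


if __name__ == "__main__":
    print("virtual display needed:", needs_virtual_display())
```

A set DISPLAY variable does not guarantee that the display is reachable or authorized, so this check is only a first hint.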