I was using Cheerio to find the largest image inside a webpage. Here is the code I used:
const { src } = $('img')
  .map((i, el) => ({
    src: el.attribs.src,
    width: el.attribs.width ? Number(el.attribs.width.match(/\d+/)[0]) : -1,
  }))
  .toArray()
  .reduce((prev, current) => (prev.width > current.width ? prev : current));
However, it works only if the width is set inline on the img element. If there is no width attribute, I have to set the width to -1 and account for that when comparing.
Is there any way to find the largest image on a webpage without these hacks, using Puppeteer? Since the browser renders all of them, it should be able to figure out which one is the largest.
You can use page.evaluate() to execute JavaScript within the Page DOM context, and return the src attribute of the largest image back to Node/Puppeteer:
const largest_image = await page.evaluate(() => {
  return [...document.getElementsByTagName('img')]
    .sort((a, b) => b.naturalWidth * b.naturalHeight - a.naturalWidth * a.naturalHeight)[0].src;
});
console.log(largest_image);
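Note that the one-liner above throws when the page contains no img elements, because `[0]` is then undefined. A minimal sketch of one way around that, assuming you are happy to collect the dimensions in the page and do the comparison in Node: `pickLargest` and `findLargestImage` are hypothetical names, not part of Puppeteer, and `page` is assumed to be a Puppeteer Page that has already navigated.

```javascript
// Hypothetical helper: returns the entry with the largest pixel area,
// or null when the list is empty.
function pickLargest(images) {
  let best = null;
  for (const img of images) {
    const area = img.width * img.height;
    if (best === null || area > best.area) {
      best = { src: img.src, area };
    }
  }
  return best;
}

// Hypothetical wrapper around page.evaluate(); only plain, serializable
// data is returned from the page context.
async function findLargestImage(page) {
  const images = await page.evaluate(() =>
    Array.from(document.images, (img) => ({
      src: img.currentSrc || img.src,
      width: img.naturalWidth,
      height: img.naturalHeight,
    }))
  );
  return pickLargest(images);
}
```

Usage would look like `const largest = await findLargestImage(page);`, with `largest` being null when the page has no images.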
You should use the naturalWidth and naturalHeight properties, which report the image's intrinsic size regardless of any width or height attributes.
const image = await page.evaluate(() => {
  // Intrinsic pixel area of an image; 0 when there is no image.
  function size(img) {
    if (!img) {
      return 0;
    }
    return img.naturalWidth * img.naturalHeight;
  }
  // Plain, serializable summary that can be returned to Node.
  function info(img) {
    if (!img) {
      return null;
    }
    return {
      src: img.src,
      size: size(img)
    };
  }
  // Scan all img elements and keep the one with the largest area.
  function largest() {
    let best = null;
    let images = document.getElementsByTagName("img");
    for (let img of images) {
      if (size(img) > size(best)) {
        best = img;
      }
    }
    return best;
  }
  return info(largest());
});
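One caveat: naturalWidth and naturalHeight are both 0 until an image has loaded, so images that are still loading when you measure will be ignored. A minimal sketch of waiting for them first, assuming the images are already in the DOM: `waitForImages` and `waitForPageImages` are hypothetical helpers, and the string form of page.evaluate() is used here only to get the helper's source into the page.

```javascript
// Hypothetical helper: resolves once every image-like object has either
// finished loading or failed. Images already complete are skipped.
async function waitForImages(images) {
  const pending = Array.from(images)
    .filter((img) => !img.complete)
    .map((img) => new Promise((resolve) => {
      img.addEventListener('load', resolve, { once: true });
      img.addEventListener('error', resolve, { once: true });
    }));
  await Promise.all(pending);
}

// Hypothetical wrapper: `page` is assumed to be a Puppeteer Page.
// The helper is serialized to source text and invoked in the page
// against document.images.
async function waitForPageImages(page) {
  await page.evaluate(`(${waitForImages.toString()})(document.images)`);
}
```

Calling `await waitForPageImages(page);` before the measurement above should make the sizes more reliable. It does not trigger lazy-loaded images that have not started loading, so those may still need scrolling into view.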