I am using the Playwright library for web scraping, and the URLs are stored in a CSV file. I am trying to read the CSV file and collect the URLs into an array to use in the scraping code.
Here is the code I wrote.
// Dependencies
const csv = require('csv-parser');
const fs = require('fs');

// Array to store the URLs.
const urls = [];

// This prints an empty array.
console.log(urls);

fs.createReadStream('sample.csv')
  .pipe(csv())
  .on('data', (row) => {
    // Trying to push the URL into the array
    urls.push(row);
    // This prints the values of the URLs
    console.log(urls);
  })
  .on('end', () => {
    console.log('CSV file successfully processed');
  });

// Here I don't see the URLs but an empty array.
console.log("URLS:" + urls);
Inside the `.on('data', ...)` handler the values get pushed to the array and the console prints them; however, when I try to read the URLs from the array after the stream setup, it returns an empty array.
This answer is written assuming that links are the only thing in your CSV file.
const { test } = require('@playwright/test');
const fs = require('fs');

// Read the CSV file and build an array of links
const links = fs.readFileSync('Path/to/csv')
  .toString()                    // convert Buffer to string
  .split('\n')                   // split the string into lines
  .map((e) => e.trim())          // remove whitespace from each line
  .filter((e) => e.length > 0);  // skip empty lines (e.g. from a trailing newline)

// Loop through the links from the CSV file
for (const link of links) {
  // Normal Playwright test setup; the link is appended to the
  // title to avoid a duplicate-test-name error
  test('test for ' + link, async ({ page }) => {
    // Log the current CSV entry
    console.log(link);
    // Navigate to that link
    await page.goto(link);
    // Do whatever else you need
  });
}