Sorry if the title is a bit confusing; I wasn't sure how to word it in just a few words.
I'm currently dealing with a situation where the user uploads a .csv or Excel file, and the data must be mapped properly to prepare for a batch upload. It will make more sense as you read the code below!
First step: the user uploads the .csv/Excel file and it's transformed into an array of arrays, where the first array is generally the headers.
The data will look like the example below (including headers), and can be anywhere from 100 items up to ~100,000 items:
const DUMMY_DATA = [
['First Name', 'Last Name', 'company', 'email', 'phone', 'Address', 'City', 'State', 'Zip Code'],
['Lambert', 'Beckhouse', 'StackOverflow', '[email protected]', '512-555-1738', '316 Arapahoe Way', 'Austin', 'TX', '78721'],
['Maryanna', 'Vassman', 'CDBABY', '[email protected]', '479-204-8976', '1126 Troy Way', 'Fort Smith', 'AR', '72916']
]
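For illustration only, here is a naive sketch of how raw CSV text could be turned into that array-of-arrays shape; in practice a proper CSV/Excel parsing library would be used, since a plain split doesn't handle quoted fields or embedded commas:
// Naive sketch: turn raw CSV text into the array-of-arrays shape shown above.
// Does not handle quoted fields or embedded commas; a real parser library would.
function csvToRows(csvText) {
  return csvText
    .trim()
    .split('\n')
    .map(line => line.split(',').map(cell => cell.trim()));
}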
Once this is uploaded, the user will map out each field to the proper schema. This could either be all of the fields or only a select few.
For instance, say the user wants to exclude the address fields except for the zip code. We would get back the 'mapped fields' array, renamed to the proper schema names (e.g. First Name => firstName):
const MAPPED_FIELDS = [firstName, lastName, company, email, phone, <empty>, <empty>, <empty>, zipCode]
I've made it so the indexes of the mapped fields will always match the 'headers', so any unmapped headers will have an empty value at their index.
So in this scenario we know to only upload the data (from DUMMY_DATA) at indexes [0, 1, 2, 3, 4, 8].
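For illustration, one way those indexes could be derived from the mapped fields array (assuming the unmapped slots come back as null):
// Sketch: collect the indexes of the mapped columns, assuming unmapped slots are null
const mappedIndexes = MAPPED_FIELDS
  .map((field, idx) => (field !== null ? idx : -1))
  .filter(idx => idx !== -1);
console.log(mappedIndexes); // [0, 1, 2, 3, 4, 8]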
We then get to the final part, where we want to upload the proper fields for all the data: the mapped schema keys from MAPPED_FIELDS paired with the corresponding values from DUMMY_DATA...
const firstObjectToBeUploaded = {
firstName: 'Lambert',
lastName: 'Beckhouse',
company: 'StackOverflow',
email: '[email protected]',
phone: '512-555-1738',
zipCode: '78721'
}
try {
await uploadData(firstObjectToBeUploaded)
} catch (err) {
console.log(err)
}
All the data will be sent to an AWS Lambda function written in Node.js to handle the upload / logic.
I'm struggling a bit with how to implement this efficiently, as the data can get quite large.
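For context, a minimal sketch of sending the mapped objects to the Lambda in chunks rather than one request per row; the chunk size and the uploadBatch function are assumptions, not part of the existing code:
// Sketch only: upload in chunks instead of one request per row.
// CHUNK_SIZE and uploadBatch() are hypothetical; adjust to whatever the Lambda accepts.
const CHUNK_SIZE = 25;
async function uploadInChunks(objectsToUpload) {
  for (let i = 0; i < objectsToUpload.length; i += CHUNK_SIZE) {
    const chunk = objectsToUpload.slice(i, i + CHUNK_SIZE);
    try {
      await uploadBatch(chunk); // hypothetical call that accepts an array of records
    } catch (err) {
      console.log(err);
    }
  }
}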
If you're looking for some performance gains at larger array sizes, you can apply the same logic as Nick's answer but implement it with standard for loops.
const DUMMY_DATA = [
['First Name', 'Last Name', 'company', 'email', 'phone', 'Address', 'City', 'State', 'Zip Code'],
['Lambert', 'Beckhouse', 'StackOverflow', '[email protected]', '512-555-1738', '316 Arapahoe Way', 'Austin', 'TX', '78721'],
['Maryanna', 'Vassman', 'CDBABY', '[email protected]', '479-204-8976', '1126 Troy Way', 'Fort Smith', 'AR', '72916']
];
const MAPPED_FIELDS = ['firstName', 'lastName', 'company', 'email', 'phone', null, null, null, 'zipCode'];
const fieldLength = MAPPED_FIELDS.length;
const dataLength = DUMMY_DATA.length;
const objectsToUpload = [];
for (let i = 1; i < dataLength; i++) { // start at 1 to skip the header row
const obj = {};
for (let j = 0; j < fieldLength; j++) {
if (MAPPED_FIELDS[j] !== null) { // skip unmapped (null) columns
obj[MAPPED_FIELDS[j]] = DUMMY_DATA[i][j];
}
}
objectsToUpload.push(obj);
}
console.log(objectsToUpload);
Here the entries() of the MAPPED_FIELDS array are captured once before the loop to avoid regenerating the entries iterator for every row, and null keys are simply skipped rather than filtered out later. The destructuring, and possibly the iterator creation/spread, seems to put this below Nick's answer for small arrays, but faster on larger arrays (tested in a Chrome-based browser).
const DUMMY_DATA = [
['First Name', 'Last Name', 'company', 'email', 'phone', 'Address', 'City', 'State', 'Zip Code'],
['Lambert', 'Beckhouse', 'StackOverflow', '[email protected]', '512-555-1738', '316 Arapahoe Way', 'Austin', 'TX', '78721'],
['Maryanna', 'Vassman', 'CDBABY', '[email protected]', '479-204-8976', '1126 Troy Way', 'Fort Smith', 'AR', '72916']
];
const MAPPED_FIELDS = ['firstName', 'lastName', 'company', 'email', 'phone', null, null, null, 'zipCode'];
const MAPPED_FIELDS_ENTRIES = [...MAPPED_FIELDS.entries()];
const objectsToUpload = [];
for (const datum of DUMMY_DATA.slice(1)) { // slice(1) skips the header row
const obj = {};
for (const [idx, key] of MAPPED_FIELDS_ENTRIES) {
if (key !== null) {
obj[key] = datum[idx];
}
}
objectsToUpload.push(obj);
}
console.log(objectsToUpload);
A rough benchmark is below, with results as follows on my machine:
for 1,000: 0.400ms
for...of 1,000: 2.900ms
entries 1,000: 1.700ms
for 10,000: 4.100ms
for...of 10,000: 11.700ms
entries 10,000: 13.900ms
for 100,000: 30.200ms
for...of 100,000: 56.500ms
entries 100,000: 100.200ms
const DUMMY_DATA = [
['First Name', 'Last Name', 'company', 'email', 'phone', 'Address', 'City', 'State', 'Zip Code'],
['Lambert', 'Beckhouse', 'StackOverflow', '[email protected]', '512-555-1738', '316 Arapahoe Way', 'Austin', 'TX', '78721'],
['Maryanna', 'Vassman', 'CDBABY', '[email protected]', '479-204-8976', '1126 Troy Way', 'Fort Smith', 'AR', '72916']
];
const MAPPED_FIELDS = ['firstName', 'lastName', 'company', 'email', 'phone', null, null, null, 'zipCode'];
function makeBigData(size) {
const [header, ...data] = DUMMY_DATA;
const r = [header];
for (let l = 0; l < size; l += 1) {
r.push([...data[Math.round(Math.random())]]);
}
return r;
}
let data = makeBigData(1000);
console.time('for 1,000');
let objectsToUpload = [];
let fieldLength = MAPPED_FIELDS.length, dataLength = data.length;
for (let i = 1; i < dataLength; i++) {
const obj = {};
for (let j = 0; j < fieldLength; j++) {
if (MAPPED_FIELDS[j] !== null) {
obj[MAPPED_FIELDS[j]] = data[i][j];
}
}
objectsToUpload.push(obj);
}
console.timeEnd('for 1,000');
data = makeBigData(1000);
console.time('for...of 1,000');
objectsToUpload = [];
let MAPPED_FIELDS_ENTRIES = [...MAPPED_FIELDS.entries()];
for (const datum of data.slice(1)) {
const obj = {};
for (const [i, key] of MAPPED_FIELDS_ENTRIES) {
if (key !== null) {
obj[key] = datum[i];
}
}
objectsToUpload.push(obj);
}
console.timeEnd('for...of 1,000');
data = makeBigData(1000);
console.time('entries 1,000');
objectsToUpload = data.slice(1).map(data =>
Object.fromEntries(MAPPED_FIELDS
.map((key, idx) => [key, data[idx]])
.filter(a => a[0])
)
)
console.timeEnd('entries 1,000');
console.log();
data = makeBigData(10000);
console.time('for 10,000');
objectsToUpload = [];
fieldLength = MAPPED_FIELDS.length, dataLength = data.length;
for (let i = 1; i < dataLength; i++) {
const obj = {};
for (let j = 0; j < fieldLength; j++) {
if (MAPPED_FIELDS[j] !== null) {
obj[MAPPED_FIELDS[j]] = data[i][j];
}
}
objectsToUpload.push(obj);
}
console.timeEnd('for 10,000');
data = makeBigData(10000);
console.time('for...of 10,000');
objectsToUpload = [];
MAPPED_FIELDS_ENTRIES = [...MAPPED_FIELDS.entries()];
for (const datum of data.slice(1)) {
const obj = {};
for (const [i, key] of MAPPED_FIELDS_ENTRIES) {
if (key !== null) {
obj[key] = datum[i];
}
}
objectsToUpload.push(obj);
}
console.timeEnd('for...of 10,000');
data = makeBigData(10000);
console.time('entries 10,000');
objectsToUpload = data.slice(1).map(data =>
Object.fromEntries(MAPPED_FIELDS
.map((key, idx) => [key, data[idx]])
.filter(a => a[0])
)
)
console.timeEnd('entries 10,000');
console.log();
data = makeBigData(100000);
console.time('for 100,000');
objectsToUpload = [];
fieldLength = MAPPED_FIELDS.length, dataLength = data.length;
for (let i = 1; i < dataLength; i++) {
const obj = {};
for (let j = 0; j < fieldLength; j++) {
if (MAPPED_FIELDS[j] !== null) {
obj[MAPPED_FIELDS[j]] = data[i][j];
}
}
objectsToUpload.push(obj);
}
console.timeEnd('for 100,000');
data = makeBigData(100000);
console.time('for...of 100,000');
objectsToUpload = [];
MAPPED_FIELDS_ENTRIES = [...MAPPED_FIELDS.entries()];
for (const datum of data.slice(1)) {
const obj = {};
for (const [i, key] of MAPPED_FIELDS_ENTRIES) {
if (key !== null) {
obj[key] = datum[i];
}
}
objectsToUpload.push(obj);
}
console.timeEnd('for...of 100,000');
data = makeBigData(100000);
console.time('entries 100,000');
objectsToUpload = data.slice(1).map(data =>
Object.fromEntries(MAPPED_FIELDS
.map((key, idx) => [key, data[idx]])
.filter(a => a[0])
)
)
console.timeEnd('entries 100,000');
You can map the DUMMY_DATA array (minus the headers) into a set of [key, value] pairs, pairing each MAPPED_FIELDS entry with the DUMMY_DATA value at the same index. You can then filter those pairs to remove null keys and turn them into objects using Object.fromEntries:
const DUMMY_DATA = [
['First Name', 'Last Name', 'company', 'email', 'phone', 'Address', 'City', 'State', 'Zip Code'],
['Lambert', 'Beckhouse', 'StackOverflow', '[email protected]', '512-555-1738', '316 Arapahoe Way', 'Austin', 'TX', '78721'],
['Maryanna', 'Vassman', 'CDBABY', '[email protected]', '479-204-8976', '1126 Troy Way', 'Fort Smith', 'AR', '72916']
]
const MAPPED_FIELDS = ['firstName', 'lastName', 'company', 'email', 'phone', null, null, null, 'zipCode']
const objectsToUpload = DUMMY_DATA.slice(1).map(data =>
Object.fromEntries(MAPPED_FIELDS
.map((key, idx) => [key, data[idx]])
.filter(a => a[0]) // drop entries whose key is null
)
)
console.log(objectsToUpload)