Efficiently parsing an email body in JavaScript

Tags: javascript

I need to parse multiple email bodies that look like:

Name: Bob smith
Email: [email protected]
Phone Number: 4243331212

As part of a larger program I have the following function to parse the page:

function parseBody(i, body) {

  var result = []
  result[0] = i

  var name = body.match(/\*Name:\*(.*) /)
  if (name) {
    result[1] = name[1]
  }

  ......

  return result;

}

Rather than writing a separate if statement for each of potentially 10 fields, is there a more efficient way in JavaScript to parse this body and load the array?

user1592380 asked Dec 18 '25 11:12



2 Answers

This version handles numeric values (converted to numbers) and empty values (stored as undefined):

var mailBody = `
Name: Bob smith
Email: [email protected]
Phone Number: 4243331212
key4:value4
key5:value 5
key6:
  key7: value7
`;

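var obj = {};
mailBody.split('\n').forEach(line => {
  // Non-greedy key, so a colon inside the value (e.g. a URL) stays in the value
  var m = line.match(/^\s*(.*?)\s*:\s*(.*?)\s*$/);
  if (!m) return; // skip lines without a colon
  var val = m[2];
  // empty value -> undefined, numeric string -> Number, anything else stays a string
  obj[m[1]] = val === '' ? undefined : isNaN(val) ? val : Number(val);
});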

console.log( obj );
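
One thing to keep in mind with this approach: Number() drops leading zeros, which matters for values such as phone numbers or postal codes. Below is a minimal sketch of one way around that. The names `TEXT_KEYS` and `coerce` are hypothetical, and the list of keys is only an example:

```javascript
// Hypothetical set of keys whose values must stay text even if they look numeric.
const TEXT_KEYS = new Set(['Phone Number']);

// Same conversion rules as above, except that keys listed in TEXT_KEYS are never converted.
function coerce(key, val) {
  if (val === '') return undefined;                  // empty value
  if (TEXT_KEYS.has(key) || isNaN(val)) return val;  // keep as a string
  return Number(val);                                // numeric string -> number
}

console.log(coerce('Phone Number', '0243331212')); // stays a string
console.log(coerce('Age', '42'));                  // becomes a number
```

Inside the loop above, this would replace the conversion expression with `coerce(key, val)`.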
joopmicroop answered Dec 21 '25 01:12



I would suggest splitting the body and identifying the keys and values to build a result object.

  • To split the body, use a regular expression that matches the structure of a key. Write it as a regex literal: inside a string, '\w' would collapse to a plain 'w'.

    let delimiter = /([\w ]+): /  // [\w ]+ lets a key contain spaces, e.g. "Phone Number"
    
  • Then call the split method on the body with this regexp. Because the key is in a capture group, the result alternates keys and values after a leading element (the text before the first key, usually an empty string):

    let split = body.split(delimiter)
    
  • Finally, pair each key with its value in a loop:

    let res = {}
    for(let i = 1; i < split.length; i += 2)
      res[ split[i] ] = split[ i+1 ]  // odd indexes are keys, even ones are values
    
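
Put together, the three steps can be sketched as one function. This is only a minimal sketch: the name `splitToObject` and the sample body (with a placeholder address) are assumptions made for illustration, and it assumes every field has the form "Key: value" on its own line.

```javascript
// Hypothetical helper name; assumes each field looks like "Key: value" on its own line.
function splitToObject(body) {
  // Regex literal; the capture group makes split() keep the keys in its output.
  const delimiter = /([\w ]+): /;
  const split = body.split(delimiter);
  const res = {};
  // split[0] is whatever precedes the first key, so keys sit at odd indexes.
  for (let i = 1; i + 1 < split.length; i += 2) {
    // trim() drops the line breaks left at the end of each value
    res[split[i].trim()] = split[i + 1].trim();
  }
  return res;
}

// Illustrative sample only (the address is a placeholder).
const sample = 'Name: Bob smith\nEmail: bob@example.com\nPhone Number: 4243331212';
console.log(splitToObject(sample));
```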

Going further, a more advanced regexp can also capture each value and strip its trailing line breaks. Here is a possible implementation:

    function parseBody (body) {
      // A regex literal is needed: inside a string, '\w' collapses to a plain 'w'.
      // [\w ]+ allows keys that contain spaces, such as "Phone Number".
      let split = body
        .split(/([\w ]+): ([^\t\r\n]*)[\t\r\n]*/) // the tail swallows trailing line feeds
      let res = {}
      // split yields [text, key, value, text, key, value, ...], so step by 3
      for(let i = 1; i + 1 < split.length; i += 3)
        res[ split[i].trim() ] = split[ i+1 ]
      return res
    }

This returns a plain object (an associative array), but the same idea applies if you want a simple array.
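
If you do need the positional array from the question, one option is to look the fields up by name in a fixed order. A minimal sketch, where `FIELDS` and `toRow` are hypothetical names and the field list is only an example:

```javascript
// Hypothetical field order; adjust it to the fields the emails actually contain.
const FIELDS = ['Name', 'Email', 'Phone Number'];

// Builds [i, name, email, phone, ...] from an already-parsed object.
// Missing fields become undefined, so the positions stay stable.
function toRow(i, parsed) {
  return [i, ...FIELDS.map(field => parsed[field])];
}

console.log(toRow(0, { Name: 'Bob smith', 'Phone Number': '4243331212' }));
```

Adding an eleventh field then means adding one entry to the list rather than another if statement.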

Maxime answered Dec 21 '25 01:12
