
Separating email strings by a delimiter

Tags: email, ruby

I have an array of more than 50,000 email addresses, and I want to count how often each email domain occurs. For example, if I had

emails = [
  '[email protected]',
  '[email protected]', 
  '[email protected]',
  '[email protected]',
  '[email protected]'
]

and I wanted to know which email domain appears most often, I would want to return 'gmail' with a frequency of 2.

To do this, I thought it would be a good idea to go through the array and discard everything occurring before the @ and just keep the domains as a new array, which I could then iterate over. How would I do this?

Chumbawoo asked Dec 08 '25 07:12


2 Answers

Assuming your emails are strings, you can do something like this:

emails = ["[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]"]
counts = Hash.new(0)
emails.each { |t| counts[t.partition("@").last] += 1 }
counts # => {"gmail.com"=>2, "yahoo.com"=>1, "aol.com"=>1, "someuni.xyz.com"=>1}
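As a side note, on Ruby 2.7 or newer the same count could probably be written with `Enumerable#tally`, which builds the frequency hash in one step. This is only a sketch: the sample addresses below are made up for illustration (they are not the ones from the question), and the variable names are hypothetical.

```ruby
# Hypothetical sample data; the local parts are invented for illustration.
emails = ["alice@gmail.com", "bob@yahoo.com", "carol@gmail.com", "dave@aol.com"]

# Keep only the part after the "@", then count occurrences of each domain.
# Enumerable#tally requires Ruby 2.7 or newer.
domain_counts = emails.map { |address| address.partition("@").last }.tally

# Pick the [domain, count] pair with the highest count.
top_domain, top_count = domain_counts.max_by { |_domain, count| count }
```

Here `domain_counts` holds one entry per domain, and `top_domain` and `top_count` hold the most frequent domain and how often it appears.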
pyfl88 answered Dec 09 '25 23:12


Similar to mudasobwa's answer.

emails
  .group_by { |s| s.partition("@").last }
  .map { |k, v| [k, v.length] }
  .max_by(&:last)
# => ["gmail.com", 2]
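With 50,000 real-world addresses, some entries may differ only in letter case or may be missing the `@` altogether. The sketch below shows one way that could be handled. The helper name `domain_frequencies` and its normalization rules (strip whitespace, downcase the domain, skip malformed entries) are assumptions, not part of the answers above. It also uses `rpartition` rather than `partition`, so that it splits on the last `@` in the string.

```ruby
# Hypothetical helper: counts domains while skipping malformed entries.
# The name and the normalization rules are assumptions for illustration.
def domain_frequencies(emails)
  counts = Hash.new(0)
  emails.each do |address|
    # rpartition splits on the LAST "@"; the separator is "" if none is found.
    _local, separator, domain = address.strip.rpartition("@")
    next if separator.empty? || domain.empty?

    # Domain names are case-insensitive, so fold them to lowercase.
    counts[domain.downcase] += 1
  end
  counts
end

# Made-up sample input, including two malformed entries that are skipped.
sample = ["a@Gmail.com", " b@gmail.com ", "broken", "c@", "d@yahoo.com"]
frequencies = domain_frequencies(sample)
most_common = frequencies.max_by(&:last)
```

Whether skipping malformed entries is the right choice depends on the data; counting them under a separate key would be an equally reasonable design.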
sawa answered Dec 09 '25 23:12