Recover email address from special application of MD5 hash function

Question

First, we segment the email address into 2-character strings.
Then, for every segment s, we compute the following hash J:

md5(md5(s) + s + md5(s))  [where + is the string concatenation operator].

Finally, we concatenate all hash strings J to form the long hash below.

For example: for an input of [email protected], we would compute:

md5(md5('he') + 'he' + md5('he')) +
md5(md5('ll') + 'll' + md5('ll')) +
md5(md5('ow') + 'ow' + md5('ow')) +
...
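
For reference, the scheme described above could be sketched in Python roughly as follows. This is an illustration only, not the original generator: the function names are made up here, and it assumes UTF-8 input, lowercase hex digests, and that an odd-length address simply ends in a one-character segment.

```python
import hashlib


def md5_hex(text: str) -> str:
    """Return the MD5 digest of text as a 32-character hex string (assumes UTF-8)."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def segment_hash(segment: str) -> str:
    """Compute J = md5(md5(s) + s + md5(s)) for one segment s."""
    inner = md5_hex(segment)
    return md5_hex(inner + segment + inner)


def long_hash(address: str) -> str:
    """Split address into 2-character segments and concatenate their hashes."""
    segments = [address[i:i + 2] for i in range(0, len(address), 2)]
    return "".join(segment_hash(s) for s in segments)
```

Each segment contributes exactly 32 hex characters, so an address of n characters yields a long hash of 32 * ceil(n / 2) characters.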

Long Hash:

f894e71e1551d1833a977df952d0cc9de44a1f9669fbf97d51309a2c6574d5eaa746cdeb9ee1a5df
c771d280d33e5672bf024973657c99bf80cb242d493d5bacc771b3b0b422d5c13595cf3e73cfb1df
91caedee7a6c5f3ce2c283564a39c52d3306d60cbc0e3e33d7ed01e780acb1ccd9174cfea4704eb2
33b0f06e52f6d5aba5a5a89e6122dd55f8efcf024961c1003d116007775d60a0d5781d2e35d747b5
dece2e0e3d79d272e40c8c66555f5525

How can I recover the email address from the hash? As I understand it, a hash is a one-way function: I can only generate the hash of some original text, or compare two hashes to see whether they match.

savanto · Accepted Answer

While it may be true in general that it is impractical to extract the original message from a hash, this clearly looks like an exercise with conditions carefully crafted to make it possible to break the "encryption".

Consider that the email address is broken up into two-character segments. If you limit yourself to just lowercase letters and two symbols (26 letters + @ and . = 28 characters), there are only 28 * 28 = 784 possible two-character combinations. Even if the emails have lowercase and uppercase letters, digits, and those two symbols (26 + 26 + 10 + 2 = 64 characters), there are only 64 * 64 = 4096 combinations -- well within computational limits.
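
As a quick sanity check of those counts, here is a small Python sketch. The two alphabets are assumptions taken from the paragraph above, and the helper name `segment_count` is hypothetical; a real address could also contain characters such as `_`, `-`, or `+`, which would enlarge the search space somewhat.

```python
import string

# Assumed alphabets, matching the two cases discussed above
EMAIL_LOWER = string.ascii_lowercase + "@."                 # 26 + 2 characters
EMAIL_MIXED = string.ascii_letters + string.digits + "@."   # 52 + 10 + 2 characters


def segment_count(alphabet: str) -> int:
    """Number of distinct two-character segments over the given alphabet."""
    return len(alphabet) ** 2


if __name__ == "__main__":
    for name, alphabet in (("lowercase", EMAIL_LOWER), ("mixed", EMAIL_MIXED)):
        print(name, len(alphabet), segment_count(alphabet))
```

Either count is small enough that hashing every candidate segment takes a negligible amount of time on ordinary hardware.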

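The thing to do is to pre-compute a lookup table of all possible hash values in your search space (often loosely called a rainbow table, although a true rainbow table uses hash chains to trade time for space). You could do this with a matrix: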

 +-----------------------------------+-----------------------------------+-----------------------------------+-----------------------------+
 |                 a                 |                 b                 |                 c                 |             ...             |
 +-----------------------------------+-----------------------------------+-----------------------------------+-----------------------------+
a| md5(md5('aa') + 'aa' + md5('aa')) | md5(md5('ba') + 'ba' + md5('ba')) | md5(md5('ca') + 'ca' + md5('ca')) |             ...             |
 +-----------------------------------+-----------------------------------+-----------------------------------+-----------------------------+
b| md5(md5('ab') + 'ab' + md5('ab')) | md5(md5('bb') + 'bb' + md5('bb')) | md5(md5('cb') + 'cb' + md5('cb')) |             ...             |
 +-----------------------------------+-----------------------------------+-----------------------------------+-----------------------------+
c| md5(md5('ac') + 'ac' + md5('ac')) | md5(md5('bc') + 'bc' + md5('bc')) | md5(md5('cc') + 'cc' + md5('cc')) |             ...             |
 +-----------------------------------+-----------------------------------+-----------------------------------+-----------------------------+
 |                ...                |                ...                |                ...                |             ...             |
 +-----------------------------------+-----------------------------------+-----------------------------------+-----------------------------+

but then you would have to traverse the matrix each time to find a match -- slow!

An alternative is to use a dictionary with the key being the hash, and the value being the 'decoded' letters:

{ 
   md5(md5('aa') + 'aa' + md5('aa')): 'aa',
   md5(md5('ab') + 'ab' + md5('ab')): 'ab',
   md5(md5('ac') + 'ac' + md5('ac')): 'ac',
  ...
}

Either way, you will now have the hashes for all possible two-letter combinations. Now you process the input string. Since MD5 produces 32-character long hashes, break the input up into 32-character strings, and perform lookups against your table:

'f894e71e1551d1833a977df952d0cc9d' => 'he'
'e44a1f9669fbf97d51309a2c6574d5ea' => 'll'
...
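
Putting the dictionary approach and the 32-character split together, a minimal Python sketch might look like this. The function names and the `??` placeholder for unmatched chunks are choices made for this illustration, and the alphabet passed to `build_table` is an assumption you would adjust to your search space.

```python
import hashlib
import string


def md5_hex(text: str) -> str:
    """MD5 digest of text as a 32-character hex string (assumes UTF-8)."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def segment_hash(segment: str) -> str:
    """md5(md5(s) + s + md5(s)) for one segment s."""
    inner = md5_hex(segment)
    return md5_hex(inner + segment + inner)


def build_table(alphabet: str) -> dict:
    """Map the hash of every two-character combination back to its plaintext."""
    table = {}
    for first in alphabet:
        for second in alphabet:
            pair = first + second
            table[segment_hash(pair)] = pair
    return table


def decode(long_hash: str, table: dict, unknown: str = "??") -> str:
    """Split long_hash into 32-character chunks and look each one up."""
    chunks = [long_hash[i:i + 32] for i in range(0, len(long_hash), 32)]
    return "".join(table.get(chunk, unknown) for chunk in chunks)


if __name__ == "__main__":
    table = build_table(string.ascii_lowercase + "@.")
    sample = segment_hash("he") + segment_hash("ll")
    print(decode(sample, table))
```

A chunk that decodes to `??` means the segment contains a character outside the chosen alphabet, so the alphabet needs widening. An address with an odd number of characters would end in a one-character segment, which this table does not cover; adding single-character entries to the table would handle that case.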

NsaNinja · Answer

Here is an implementation of this approach in Python.

My Code:

import hashlib, string

# MD5 of a string, returned as a 32-character hex digest
def md5hashFunction(data):
    return hashlib.md5(data.encode()).hexdigest()


# md5(md5(data) + data + md5(data)) for a single segment
def finalHash(data):
    inner = md5hashFunction(data)
    return md5hashFunction(inner + data + inner)


# Every MD5 hex digest is 32 characters long, so the long hash is split into 32-character chunks
hashes = [
    "f894e71e1551d1833a977df952d0cc9d",
    "e44a1f9669fbf97d51309a2c6574d5ea",
    "a746cdeb9ee1a5dfc771d280d33e5672",
    "bf024973657c99bf80cb242d493d5bac",
    "c771b3b0b422d5c13595cf3e73cfb1df",
    "91caedee7a6c5f3ce2c283564a39c52d",
    "3306d60cbc0e3e33d7ed01e780acb1cc",
    "d9174cfea4704eb233b0f06e52f6d5ab",
    "a5a5a89e6122dd55f8efcf024961c100",
    "3d116007775d60a0d5781d2e35d747b5",
    "dece2e0e3d79d272e40c8c66555f5525",
]


# Candidate characters: all letters and digits, plus the extra symbols "_+.@"
alphabet = list(
    string.ascii_lowercase + string.ascii_uppercase + string.digits + "_+.@"
)

# Build a dictionary that maps each hash back to its two-character string
rainbowTable = {finalHash(x + y): x + y for x in alphabet for y in alphabet}
"""
rainbowTable

'31453dd786a8c6f6c7c8860d5fcea4be': 'aa',
 '857dce5bcf6b6b32bec281207b2dba80': 'ab',
 'e90d94b4b65ac19188fdae82acf7fbbc': 'ac',
 '67299b8cedc5eafea7dda1daf9356b54': 'ad',
 '40fca4e80bfc6e1faa2c4e2b7e0929f0': 'ae',
 'de48fc1bd98f5508c513f9947a514ce8': 'af',
 '4852089b1b43b45204907df0066c0edf': 'ag',
 'e1b82a5fe4fdcf73d034a0d5063ffe3f': 'ah',
         ...... Continues....

"""

# Look up each chunk and join the results into a single string
# (a KeyError here means a segment uses a character outside the alphabet)
print("".join(rainbowTable[h] for h in hashes))


"""
f894e71e1551d1833a977df952d0cc9de44a1f9669fbf97d51309a2c6574d5eaa746cdeb9ee1a5df
c771d280d33e5672bf024973657c99bf80cb242d493d5bacc771b3b0b422d5c13595cf3e73cfb1df
91caedee7a6c5f3ce2c283564a39c52d3306d60cbc0e3e33d7ed01e780acb1ccd9174cfea4704eb2
33b0f06e52f6d5aba5a5a89e6122dd55f8efcf024961c1003d116007775d60a0d5781d2e35d747b5
dece2e0e3d79d272e40c8c66555f5525

"""


"""
Output ==> [email protected]
"""

Tags: hash, md5, encryption

user3701097
