Meaningful Collisions (for SHA-1)
This website is supposed to illustrate progress in the cryptanalysis of commonly used hash functions in an informal way. We do this by considering the amount of control an attacker has over the content of a pair of colliding messages. The focus is on SHA-1.
We classify the amount of control with increasing difficulty for an attacker as follows:
- One Commonly Chosen Prefix
- One Commonly Chosen Prefix + Partial Control over Colliding Blocks
- Two Arbitrary Different Chosen Prefixes
- Two Arbitrary Different Chosen Prefixes + Partial Control over Colliding Blocks
Note that for colliding SHA-1 message pairs (as for all other hash functions following a similar design principle) it is always possible to append suffixes to both messages as long as they are the same. To simplify matters here, this is not mentioned in the classification. Subsequently we only list those examples where there is reason to belief that the methods scale from step-reduced to full SHA-1. Attacks of type 3 and type 4 have so far not been shown on members of the MD4 family including SHA-1. We are currently preparing a paper where we analyze and demonstrate the feasibility of such attacks.
Color Code
Under control, attacker can freely choose the blue part of the message without affecting the running time of the collision search algorithm. |
1. One Commonly Chosen Prefix
This is certainly the simplest and easiest setting for an attacker. No constraints on the messages. The example given by Wang et al. for 58-step SHA-1 or our early example for 64-step SHA-1 are of that kind.
The attacker gets to choose an (almost arbitrary long) prefix, the starting point for the collision search attack changes. For some attacks this does not matter. The attacks by Wang et al. on MD4 and MD5 are of that type. The differential path / characteristics used for the attack on SHA-1 by Wang et al. (here or here) do however not allow all prefixes, but require an additional prefix block do adjust the starting point for the actual attack accordingly.
![]() |
Common prefix, arbitrary length, under attacker’s control |
| Colliding message pair, length is usually one or two blocks, not under attacker’s control | |
In (paper, slides) we develop a new view on this problem and describe a practical computer implementable method for finding a collision search method for SHA-1 for any starting points.
Applications, demonstrations and some of their limitations:
- Colliding binary executables (examples for MD5: here, here)
- Colliding postscript (here, here) and other file formats (here). A common property of these examples is as follows. They exploit indexing or scripting features of the format. Hence in the files that hash to the same value, both versions of the content have to be included. To overcome this limitation, a more powerful collision search attack is needed. See type 3: Two Arbitrary Different Chosen Prefixes
- Colliding X.509 certificates. X.509 certificates have the same signature if their ToBeSigned part hashes to the same value. In (here), a method is demonstrated which uses type-1 meaningful collisions for MD5 to replace a key in an X.509 certificate without changing its signature. However, it is not possible to change other fields like SubjectName which would cause more confusion. See type 3: Two Arbitrary Different Chosen Prefixes on how this limitation can be removed.
up
2. One Commonly Chosen Prefix + Partial Control over Colliding Blocks
In addition to choosing a common prefix for both messages, the attacker also gets to influence parts of the different but colliding message blocks.
![]() |
Common prefix, arbitrary length, under attacker’s control |
| Colliding message pair, length is usually one or two blocks, partial influence by the attacker | |
Our method announced at the CRYPTO 2006 rump session (slides) allows this kind of influence by the attacker. Up to a certain level of influence this can be done without seriously affecting the effort required to find a colliding pair of messages. According to our experiments, more than 25% are achievable for 64-step SHA-1.
Applications:
All previous applicataions can get easier to do in some settings, especially if format restrictions apply.
up3. Two Arbitrary Different Chosen Prefixes
If we allow two different prefixes before searching for a colliding message pair, dedicated collision search attacks get more complex. The reason is that an uncontrolled difference in the internal state of the hash function needs to be cancelled out as well.
![]() |
Different prefixes, arbitrary length, both under attacker’s control. |
| Colliding message pair, not under attacker’s control | |
Possible impact of such an attack:
- Data files (Postscript, PDF, Images, etc...): The big difference to weaker attacks as outlined above is that it is no longer necessary to have both meaningful contents in each file. Hence, given one of the files, it gets harder for an expert to judge if a colliding file exists. It remains to hide the randomness (red part of the illustration above). Interlaced GIF or Progressive JPEG might offer suitable places to do so. Partial control over the colliding blocks (see type 4) can also help to deal with this issue.
- Certificates (e.g. according to X.509): The ability to have different prefixes under control allows to have completely different X.509 certificates that have the same signature. Randomness can be hidden in the keys or in other places.
- Faster Herding Attacks: Herding attacks (preprint) are a special kind of 2nd preimage attack on all iterated hash functions. Taking SHA-1 as an example, the complexity of such an attack is 2^109 instead of 2^160. With the ability to perform type 3 collision attacks, this complexity can be reduced even further. Commitment schemes as e.g. used in some lottery systems fall into this catergory.
- Assumptions for security proofs for certain signature schemes are affected.
- ... some more? Send us a mail.
In parallel to our work on this type of meaningful collisons for SHA-1, Marc Stevens and Benne de Weger work on a similar attack on MD5. Make sure to check our their project HashClash.
up4. Two Arbitrary Different Chosen Prefixes + Partial Control over Colliding Blocks
In addition to choosing prefixes for both messages independently, the attacker also gets to influence parts of the different but colliding message blocks.
![]() |
Common prefix, arbitrary length, under attacker’s control. |
| Colliding message pair, partial influence by the attacker | |
Possible impact of such an attack:
Same as type 3, but hiding randomness made easier.
up
Responsible for the content:




