All HashMaps use the same default salt to hash keys #265

sjakobi · 2020-06-04T19:42:54Z

The Hashable class that this package relies on has a hashWithSalt method that can be used to hash values with a custom salt. Currently this package doesn't use it, relying on the hash method instead, which uses a hard-coded default salt.

This means that it is currently rather easy to artificially produce hash collisions that trigger the performance degration reported in #121.

The text was updated successfully, but these errors were encountered:

sjakobi · 2020-06-04T19:51:19Z

In order to allow users to specify custom salts, HashMap would need to be extended with an Int field to store the salt. We could then add constructors like

emptyWithSalt :: Int -> HashMap k v

…to allow users to pass in a custom salt.

sjakobi · 2020-06-04T19:59:07Z

@tibbe mentioned in an email:

a new salt is unfortunately not enough if you don't have a strong hash function. You can generate collisions without knowing the salt. SipHash with a per-process random salt is the only solution I know for strings that don't perform terribly. Bryan O'Sullivan and I tried that but we couldn't get it to work with OK performance (i.e. not worse than Data.Map). Other language ecosystems seem to have come to the same conclusion about SipHash, at least when I looked years ago.

To me it seems that custom salts might not make it impossible to artificially produce collisions, but they should make it quite a bit harder.

Other implementations also do this. For example Rust's std::collections::HashMap:

By default, HashMap uses a hashing algorithm selected to provide resistance against HashDoS attacks. The algorithm is randomly seeded, and a reasonable best-effort is made to generate this seed from a high quality, secure source of randomness provided by the host without blocking the program.

tibbe · 2020-06-05T06:35:55Z

The "randomly seeded" part here is important. There are really two things that go together:

Random (i.e. unknown to the attacker) salt. If it's known to the attacker there's no difference between using a salt and not. Note that a weak hash function doesn't really benefit from a salt.
Strong enough hash function. From my understanding of SipHash the random salt is required for this strength. Perhaps because otherwise one can do an offline attack against SipHash,

infinity0 · 2020-06-05T11:53:23Z

Yes, the seed has to be unpredictable to an attacker, and the hash function itself (taking in the seed) has to be strong enough so the attacker can't just brute-force the unpredictability. So Hashable probably won't cut it, but also using SipHash itself with a predictable seed won't be sufficient either. What I wrote in another ticket:

[..] since unordered-containers is pure, to support a protection against this type of attack, the container constructor would have to take in an additional key parameter for seeding the hash function with, which could be generated by the caller via IO. For convenience, there could be a default seed in a top-level binding that is generated from runtime entropy via unsafePerformIO, but IMO the API should retain the possibility for the user to pass an explicit seed of their choice.

sjakobi · 2020-06-06T11:26:32Z

the hash function itself (taking in the seed) has to be strong enough so the attacker can't just brute-force the unpredictability

Does this imply that enabling HashMap to use different/random seeds would be useless, unless we also switch to a stronger hash function than what Hashable currently provides? (Which hash function does hashable currently use BTW? – I can't find any documentation on this)

My impression was that susceptibility to HashDoS is a somewhat gradual affair where some attacks could already be prevented by just using an unpredictable seed. Is that wrong or possibly just a bad line of thought?

tibbe · 2020-06-06T16:20:08Z

Yes, the random seed is useless, from a security perspective, without a stronger hash function. It's still needed to chain hashes to e.g. hash a record.

We currently use FNV1a for byte string and text I think. One could imagine having a newtype to use another one (but this doesn't give security by default for users).

A more random seed really doesn't help without a strong hash function.

sjakobi · 2021-09-27T17:47:21Z

I currently see two main approaches for making the hash salt less predictable for an attacker:

Allow setting a different salt per map, as proposed in Allow the hash salt to be set when constructing a map/set #45. Add HashMapT salt, which allows setting of salt with Nat. #321 is one possible implementation of this. A different variant would look like somewhat like this:
```
data Hashmap = HashMap
  { hm :: !HashMap' -- the old HashMap type
  , salt :: !Int
  }
```
Use a global salt that is randomly generated on start-up, similar to hashable's random-init-seed flag.

For (1.) I see two big disadvantages:

It requires enhancing the API with functions that allow passing in a salt.
Combining maps seems to cause issues:
1. If we disallow combining maps with different salts, that seems like a hole in the API to me.
2. If the API doesn't distinguish between same-salt and different-salt cases, performance becomes less predictable.
3. If we add a second set of functions for combining maps with different salts, that would bloat the API even further.

Therefore (2.) currently seems more attractive to me. However, might (2.) be less secure than (1.)? How does leaking a salt work, and will (2.) be more prone to that than (1.)?

sjakobi added the performance label Jun 4, 2020

sjakobi mentioned this issue Jun 4, 2020

Cuckoo loses data when inserting 65th element via mutate gregorycollins/hashtables#55

Closed

sjakobi mentioned this issue May 29, 2021

Allow the hash salt to be set when constructing a map/set #45

Open

sjakobi mentioned this issue Sep 13, 2021

Vulnerability to collision attacks #319

Open

sjakobi added the collision-attacks label Mar 21, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

All HashMaps use the same default salt to hash keys #265

All HashMaps use the same default salt to hash keys #265

sjakobi commented Jun 4, 2020 •

edited

Loading

sjakobi commented Jun 4, 2020

sjakobi commented Jun 4, 2020

tibbe commented Jun 5, 2020

infinity0 commented Jun 5, 2020 •

edited

Loading

sjakobi commented Jun 6, 2020

tibbe commented Jun 6, 2020

sjakobi commented Sep 27, 2021 •

edited

Loading

All HashMaps use the same default salt to hash keys #265

All HashMaps use the same default salt to hash keys #265

Comments

sjakobi commented Jun 4, 2020 • edited Loading

sjakobi commented Jun 4, 2020

sjakobi commented Jun 4, 2020

tibbe commented Jun 5, 2020

infinity0 commented Jun 5, 2020 • edited Loading

sjakobi commented Jun 6, 2020

tibbe commented Jun 6, 2020

sjakobi commented Sep 27, 2021 • edited Loading

sjakobi commented Jun 4, 2020 •

edited

Loading

infinity0 commented Jun 5, 2020 •

edited

Loading

sjakobi commented Sep 27, 2021 •

edited

Loading