Skip to content

Commit

Permalink
Add HashMapT salt, which allows creation of salt with Nat.
Browse files Browse the repository at this point in the history
This allows clients to create custom salted hashmaps.
For backwards compatibility we use

```haskell
 -- backwards compatibility
type HashMap = HashMapT DefaultSalt
```

Then modify the functions to be free of salt if they can, for example insert:

```haskell
insert :: forall k v salt . (Eq k, Hashable k) => k -> v -> HashMapT salt k v -> HashMapT salt k v
insert k v m = insert' (hash salt k) k v m
  where
    salt = natVal (Proxy :: Proxy salt)
```

This allows the default HashMap with backwards compatibility,
but also any other HashMapT
I think this solves the issue with having different salts in
an intersect:

```haskell
intersection :: (Eq k, Hashable k) => HashMapT salt k v -> HashMapT salt k w -> HashMapT salt k v
```

Because salt is the same type variable for all arguments,
it's enforced to be the same.
Then you can also provide a function to resalt if the user ever ends
up with different salts and still wants to do an intersect.
(which results in a reconstruction of the hashmap).

See thread: #319

Fix the defaultHash issues

Be more verbose about default value

Add comma

Fix CI maybe

link to source of magick number

Fix default hash assertion
  • Loading branch information
jappeace committed Sep 22, 2021
1 parent 6588174 commit 73fa02e
Show file tree
Hide file tree
Showing 10 changed files with 437 additions and 229 deletions.
407 changes: 237 additions & 170 deletions Data/HashMap/Internal.hs

Large diffs are not rendered by default.

123 changes: 77 additions & 46 deletions Data/HashMap/Internal/Strict.hs

Large diffs are not rendered by default.

6 changes: 6 additions & 0 deletions Data/HashMap/Lazy.hs
Original file line number Diff line number Diff line change
Expand Up @@ -28,10 +28,13 @@ module Data.HashMap.Lazy
-- $strictness

HashMap
, HashMapT

-- * Construction
, empty
, singleton
, empty'
, singleton'

-- * Basic interface
, null
Expand Down Expand Up @@ -101,6 +104,9 @@ module Data.HashMap.Lazy
, fromList
, fromListWith
, fromListWithKey
, fromList'
, fromListWith'
, fromListWithKey'

-- ** HashSets
, HS.keysSet
Expand Down
6 changes: 5 additions & 1 deletion Data/HashMap/Strict.hs
Original file line number Diff line number Diff line change
Expand Up @@ -26,7 +26,8 @@ module Data.HashMap.Strict
-- * Strictness properties
-- $strictness

HashMap
HashMapT
, HashMap

-- * Construction
, empty
Expand Down Expand Up @@ -100,6 +101,9 @@ module Data.HashMap.Strict
, fromList
, fromListWith
, fromListWithKey
, fromList'
, fromListWith'
, fromListWithKey'

-- ** HashSets
, HS.keysSet
Expand Down
32 changes: 31 additions & 1 deletion benchmarks/Benchmarks.hs
Original file line number Diff line number Diff line change
@@ -1,4 +1,8 @@
{-# LANGUAGE CPP, DeriveAnyClass, DeriveGeneric, GADTs, PackageImports, RecordWildCards #-}
{-# LANGUAGE CPP, DeriveAnyClass, DataKinds, DeriveGeneric, GADTs, PackageImports, RecordWildCards #-}

#if __GLASGOW_HASKELL__ >= 802
{-# LANGUAGE TypeApplications #-}
#endif

module Main where

Expand Down Expand Up @@ -342,11 +346,30 @@ main = do
, bench "ByteString" $ whnf HM.fromList elemsBS
, bench "Int" $ whnf HM.fromList elemsI
]
#if __GLASGOW_HASKELL__ >= 802
, bgroup "long custom salt" -- 18446744073710551615 = fromInteger ((toInteger (maxBound :: Word64)) + 1000000)
[ bench "String" $ whnf (HM.fromList' @_ @_ @18446744073710551615) elems
, bench "ByteString" $ whnf (HM.fromList' @_ @_ @18446744073710551615) elemsBS
, bench "Int" $ whnf (HM.fromList' @_ @_ @18446744073710551615) elemsI
]
#endif
, bgroup "short"
[ bench "String" $ whnf HM.fromList elemsDup
, bench "ByteString" $ whnf HM.fromList elemsDupBS
, bench "Int" $ whnf HM.fromList elemsDupI
]
#if __GLASGOW_HASKELL__ >= 802
, bgroup "short custom salt" -- 18446744073710551615 * 10
[ bench "String" $ whnf (HM.fromList' @_ @_ @184467440737105516150) elemsDup
, bench "ByteString" $ whnf (HM.fromList' @_ @_ @184467440737105516150) elemsDupBS
, bench "Int" $ whnf (HM.fromList' @_ @_ @184467440737105516150) elemsDupI
]
, bgroup "short custom salt 42" -- 18446744073710551615 * 10
[ bench "String" $ whnf (HM.fromList' @_ @_ @42) elemsDup
, bench "ByteString" $ whnf (HM.fromList' @_ @_ @42) elemsDupBS
, bench "Int" $ whnf (HM.fromList' @_ @_ @42) elemsDupI
]
#endif
]
-- fromListWith
, bgroup "fromListWith"
Expand All @@ -360,6 +383,13 @@ main = do
, bench "ByteString" $ whnf (HM.fromListWith (+)) elemsDupBS
, bench "Int" $ whnf (HM.fromListWith (+)) elemsDupI
]
#if __GLASGOW_HASKELL__ >= 802
, bgroup "short custom salt"
[ bench "String" $ whnf ((HM.fromListWith' @_ @_ @10) (+)) elemsDup
, bench "ByteString" $ whnf ((HM.fromListWith' @_ @_ @10) (+)) elemsDupBS
, bench "Int" $ whnf ((HM.fromListWith' @_ @_ @10) (+)) elemsDupI
]
#endif
]
]
]
Expand Down
33 changes: 33 additions & 0 deletions docs/migration-salt.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,33 @@
The salt changes are backwards compatible.

However, if you want to let a client make use of
custom salts some effort is required,
running commands like these should get you somewhere:

```ed
:%s/HashMap k/HashMapT salt k/g
:%s/Hashable k)/Hashable k, KnownNat salt)
```

HashMap is now an alias to HashMapT with a hardcoded DefaultSalt.
These changes allow salt to be anything.
semigroup operations (a -> a -> a) can use the same salt to guarantee
not having to rebuild the hahsmap.


If you encounter this error:
```
• Illegal instance declaration for ‘SomeTypeClass (HashMap k v)’
(All instance types must be of the form (T t1 ... tn)
where T is not a synonym.
Use TypeSynonymInstances if you want to disable this.)
```
usually it's good enough to provide the instance with a free salt:

```haskell
instance SomeTypeClass (HashMap salt k v) where
...

```
If it it starts complaining about not salt not matching DefaultSalt,
use the `'` constructors such as `empty'` and `singleton'`
10 changes: 6 additions & 4 deletions tests/HashMapProperties.hs
Original file line number Diff line number Diff line change
Expand Up @@ -8,6 +8,7 @@ module Main (main) where

import Control.Monad ( guard )
import qualified Data.Foldable as Foldable
import GHC.TypeLits(KnownNat)
#if MIN_VERSION_base(4,10,0)
import Data.Bifoldable
#endif
Expand All @@ -16,11 +17,11 @@ import Data.Hashable (Hashable(hashWithSalt))
import qualified Data.List as L
import Data.Ord (comparing)
#if defined(STRICT)
import Data.HashMap.Strict (HashMap)
import Data.HashMap.Strict (HashMap, HashMapT)
import qualified Data.HashMap.Strict as HM
import qualified Data.Map.Strict as M
#else
import Data.HashMap.Lazy (HashMap)
import Data.HashMap.Lazy (HashMap, HashMapT)
import qualified Data.HashMap.Lazy as HM
import qualified Data.Map.Lazy as M
#endif
Expand All @@ -41,8 +42,9 @@ newtype Key = K { unK :: Int }
instance Hashable Key where
hashWithSalt salt k = hashWithSalt salt (unK k) `mod` 20

instance (Eq k, Hashable k, Arbitrary k, Arbitrary v) => Arbitrary (HashMap k v) where
arbitrary = fmap (HM.fromList) arbitrary
instance (Eq k, Hashable k, Arbitrary k, Arbitrary v, KnownNat salt)
=> Arbitrary (HashMapT salt k v) where
arbitrary = fmap (HM.fromList') arbitrary

------------------------------------------------------------------------
-- * Properties
Expand Down
32 changes: 31 additions & 1 deletion tests/Regressions.hs
Original file line number Diff line number Diff line change
@@ -1,12 +1,15 @@
{-# LANGUAGE ScopedTypeVariables #-}
{-# LANGUAGE MagicHash #-}
{-# LANGUAGE UnboxedTuples #-}
{-# LANGUAGE CPP #-}

module Main where

import Control.Applicative ((<$>))
import Control.Exception (evaluate)
import Control.Monad (replicateM)
import Data.Hashable (Hashable(..))
import qualified Data.Hashable as HAS
import qualified Data.HashMap.Strict as HM
import qualified Data.HashMap.Lazy as HML
import Data.List (delete)
Expand All @@ -16,11 +19,14 @@ import GHC.IO (IO (..))
import System.Mem (performGC)
import System.Mem.Weak (mkWeakPtr, deRefWeak)
import System.Random (randomIO)
import Test.HUnit (Assertion, assert)
import Test.HUnit (Assertion, assert, (@=?))
import Test.Framework (Test, defaultMain)
import Test.Framework.Providers.HUnit (testCase)
import Test.Framework.Providers.QuickCheck2 (testProperty)
import Test.QuickCheck
import Data.Proxy
import qualified Data.HashMap.Internal as Internal
import Data.Text(Text, pack)

issue32 :: Assertion
issue32 = assert $ isJust $ HM.lookup 7 m'
Expand Down Expand Up @@ -124,6 +130,28 @@ issue254Strict = do
touch mp
assert $ isNothing res

goldenHash :: Assertion
goldenHash = do
HAS.hash someString @=?
fromIntegral (toInteger (Internal.hash (Proxy :: Proxy Internal.DefaultSalt) someString))

someString :: Text
someString = pack "hello world"

assertDefaultHash :: Assertion
assertDefaultHash = do
-- I only found this by testing, the WORD_SIZE_IN_BITS doesn't
-- appear to be set, we get the warning, yet the
-- larger value in upstream is still being used.
-- https://github.com/haskell-unordered-containers/hashable/blob/master/src/Data/Hashable/Class.hs#L221
#if MIN_VERSION_hashable(1,3,1)
HAS.hash someString @=?
HAS.hashWithSalt 14695981039346656037 someString
#else
HAS.hash someString @=?
HAS.hashWithSalt 0xdc36d1615b7400a4 someString
#endif

------------------------------------------------------------------------
-- * Test list

Expand All @@ -135,6 +163,8 @@ tests =
, testProperty "issue39b" propEqAfterDelete
, testCase "issue254 lazy" issue254Lazy
, testCase "issue254 strict" issue254Strict
, testCase "make sure default hash remains the same for backwards compatbility" goldenHash
, testCase "asserts the default hash in case they change it" assertDefaultHash
]

------------------------------------------------------------------------
Expand Down
13 changes: 8 additions & 5 deletions tests/Strictness.hs
Original file line number Diff line number Diff line change
@@ -1,4 +1,5 @@
{-# LANGUAGE CPP, FlexibleInstances, GeneralizedNewtypeDeriving #-}
{-# LANGUAGE ScopedTypeVariables #-}
{-# OPTIONS_GHC -fno-warn-orphans #-}

module Main (main) where
Expand All @@ -14,13 +15,15 @@ import Data.Maybe (fromMaybe, isJust)
import Control.Arrow (second)
import Control.Monad (guard)
import Data.Foldable (foldl')
import GHC.TypeLits(KnownNat)
#if !MIN_VERSION_base(4,8,0)
import Data.Functor ((<$))
import Data.Foldable (all)
import Prelude hiding (all)
#endif

import Data.HashMap.Strict (HashMap)

import Data.HashMap.Strict (HashMapT, HashMap)
import qualified Data.HashMap.Strict as HM

-- Key type that generates more hash collisions.
Expand All @@ -30,9 +33,9 @@ newtype Key = K { unK :: Int }
instance Hashable Key where
hashWithSalt salt k = hashWithSalt salt (unK k) `mod` 20

instance (Arbitrary k, Arbitrary v, Eq k, Hashable k) =>
Arbitrary (HashMap k v) where
arbitrary = HM.fromList `fmap` arbitrary
instance (Arbitrary k, Arbitrary v, Eq k, Hashable k, KnownNat salt) =>
Arbitrary (HashMapT salt k v) where
arbitrary = HM.fromList' `fmap` arbitrary

instance Show (Int -> Int) where
show _ = "<function>"
Expand Down Expand Up @@ -100,7 +103,7 @@ pFromListWithKeyStrict f =
-- could be lazy in the "new" value. fromListWith must, however,
-- be strict in whatever value is actually inserted into the map.
-- Getting all these properties specified efficiently seems tricky.
-- Since it's not hard, we verify that the converted HashMap has
-- Since it's not hard, we verify that the converted HashMapT salt has
-- no unforced values. Rather than trying to go into detail for the
-- rest, this test compares the strictness behavior of fromListWith
-- to that of insertWith. The latter should be easier to specify
Expand Down
4 changes: 3 additions & 1 deletion unordered-containers.cabal
Original file line number Diff line number Diff line change
Expand Up @@ -158,7 +158,9 @@ test-suite regressions
test-framework >= 0.3.3,
test-framework-hunit,
test-framework-quickcheck2,
unordered-containers
unordered-containers,
text,
hashable

default-language: Haskell2010
ghc-options: -Wall
Expand Down

0 comments on commit 73fa02e

Please sign in to comment.