boost hash returning the same value for different inputs

advertisements

I have two objects, Account and Transaction where Transaction is the unique pair of Account and an incrementing id number. I want to use boost::hash to get unique values for these and have overloaded the hash_value method per the instructions: http://www.boost.org/doc/libs/1_53_0/doc/html/hash/custom.html

class Account {
  ...
};

class Transaction
{
    Account account;
    unsigned int id;
};

Account's hash_value method works correctly, and the value returned is always unique for a given account, however to make the unique pair, Transaction's method needs to use hash _combine (per boost's instructions ):

inline std::size_t hash_value( const Account& acct )
{
    boost::hash<int> hasher;
    size_t rval = hasher( acct.id() ); //just an int. guaranteed to be unique
    return rval;
}

inline std::size_t hash_value( const Transaction& t )
{
    std::size_t seed = 0;
    boost::hash_combine( seed, t.account );
    boost::hash_combine( seed, t.id );

    return seed;
}

This returns the same values for different inputs sometimes. Why?? I only have a few thousand accounts, and the id number only goes up to a few hundred thousand. this doesn't seem like an upper bound issue.

Does anyone know if this is a bug, or if I need to seed boost hash?

Thanks


Look up perfect hashing, and the birthday paradox, and for completeness's sake the pigeonhole principle.

What it boils down to is hash functions generally do produce collisions,unless what you're hashing has very specific properties you've taken advantage of. Your chances of seeing a hash collision for any given set of keys is going to be counterintuitively high just because that's one of the mathematical realities we're not wired for: with a 1/365 chance of getting any particular hash, your odds of a collision are 50/50 given just 23 keys.