c++
What would be a good hash function for this list of English words?
Here's my code on GitHub if anyone would like to see it: https://github.com/daicain/Hashing-Project

Currently I am using a table size of 80, since I have about 73 words in the file. My current hashing method is pretty basic and generic: I convert the word to lowercase, add up the ASCII values of its letters, then mod (%) the sum by the table size (currently 80). I am getting a lot of collisions and a lot of unused buckets/indexes. Since I know exactly which words I need to hash and how many there are, is there a better method that gives the fewest possible collisions? My goal is to get 6 or fewer.

Also, a side question. Once the words are in the hash table, if I want to look up a certain word but type it incorrectly or scrambled, how would I find it in the hash table? For example, if I have "apple" in the hash table and I search for "leppa", which is "apple" spelled backward, what is a good way to unscramble "leppa" so that "apple" comes out?

Please ask if you're unsure about what I am asking, and sorry if I'm not clear!
MurmurHash is considered fast and will probably give a good distribution: http://en.wikipedia.org/wiki/MurmurHash

To look up a "scrambled" word through the hash itself, you would need a hash function that is agnostic to letter order. That is a pretty bad idea for the main table, since all permutations of the same letters end up in the same hash bucket.
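Try MD5: the full digests of the words in your dictionary will not collide in practice, although reducing a digest modulo a table size of 80 can still put different words in the same bucket.

You may also simply use std::hash:

    #include <functional>
    #include <iostream>
    #include <string>

    int main() {
        std::string str = "air conditioner";
        // std::hash<std::string> returns an implementation-defined size_t value.
        std::size_t h = std::hash<std::string>()(str);
        std::cout << "hash of \"" << str << "\" is " << h << std::endl;
    }

The algorithm behind std::hash is implementation-defined; it is commonly an FNV-1 variant. Another good hash function is MurmurHash. Check the related questions on Stack Exchange for other common hash functions.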