2014-11-17 146 views
1

我对C++还比较陌生,并且在实现实际工作的add方法时遇到问题。我已经在java中做了一个哈希映射,但是将它转换为C++已被证明是困难的。最重要的是,我必须使用约束(比如不改变头文件中的任何内容),并且不能使用除std :: string和std :: cout/cin之外的任何std库函数。将值添加到单独链接的散列表C++

基本上,我必须创建一个哈希映射,最终将存储用户名(作为一个键)和密码(作为一个值)。在这一点上,用户名/密码组合并不重要,因为实施非常普遍的课程是练习的要点。

只是试图用默认构造函数测试我的HashMap并添加一个值最终给我一个分段错误。当我尝试实现这个哈希表时,我100%确定自己正在做一些可怕的错误。要么我没有正确地将节点索引与节点连接起来,要么不正确地初始化某些内容。

这里是我一起工作的头文件:

#ifndef HASHMAP_HPP 
#define HASHMAP_HPP 

#include <functional> 
#include <string> 

class HashMap 
{ 
public: 
    // Hash functions must conform to these properties: 
    // 
    // (1) Given a particular string s repeatedly, they must always 
    //  return the same hash value. 
    // (2) They do not take the number of buckets into account (as they 
    //  do not receive a parameter that tells them how many buckets 
    //  there are). Any unsigned int value is fair game as a result. 
    //  It will be the job of the HashMap class to reduce the results 
    //  to the range of available bucket indices (e.g., by using the 
    //  % operator). 
    typedef std::function<unsigned int(const std::string&)> HashFunction; 

    // This constant specifies the number of buckets that a HashMap will 
    // have when it is initially constructed. 
    static constexpr unsigned int initialBucketCount = 10; 


public: 
    // This constructor initializes the HashMap to use whatever default 
    // hash function you'd like it to use. A little research online will 
    // yield some good ideas about how to write a good hash function for 
    // strings; don't just return zero or, say, the length of the string. 
    HashMap(); 

    // This constructor instead initializes the HashMap to use a particular 
    // hash function instead of the default. (We'll use this in our unit 
    // tests to control the scenarios more carefully.) 
    HashMap(HashFunction hasher); 

    // The "Big Three" need to be implemented appropriately, so that HashMaps 
    // can be created, destroyed, copied, and assigned without leaking 
    // resources, interfering with one another, or causing crashes or 
    // undefined behavior. 
    HashMap(const HashMap& hm); 
    ~HashMap(); 
    HashMap& operator=(const HashMap& hm); 

    // add() takes a key and a value. If the key is not already stored in 
    // this HashMap, the key/value pair is added; if the key is already 
    // stored, the function has no effect. 
    // 
    // If adding the new key/value pair will cause the load factor of this 
    // HashMap to exceed 0.8, the following must happen: 
    // 
    // (1) The number of buckets should be increased by doubling it and 
    //  adding 1 (i.e., if there were 10 buckets, increase it to 
    //  2 * 10 + 1 = 21). 
    // (2) All key/value pairs should be rehashed into their new buckets, 
    //  important because changing the number of buckets will likely 
    //  change which bucket a particular key hashes to (especialy if 
    //  you're using % to determine the index of that bucket). 
    void add(const std::string& key, const std::string& value); 

    // remove() takes a key and removes it (and its associated value) from 
    // this HashMap if it is already present; if not, the function has no 
    // effect. 
    void remove(const std::string& key); 

    // contains() returns true if the given key is in this HashMap, false 
    // if not. 
    bool contains(const std::string& key) const; 

    // value() returns the value associated with the given key in this HashMap 
    // if the key is stored in this HashMap; if not, the empty string is 
    // returned. (Going forward, we'll discover that throwing an exception 
    // is a better way to handle the scenario where the key is not present, 
    // but we'll conquer that at a later date.) 
    std::string value(const std::string& key) const; 

    // size() returns the number of key/value pairs stored in this HashMap. 
    unsigned int size() const; 

    // bucketCount() returns the number of buckets currently allocated in 
    // this HashMap. 
    unsigned int bucketCount() const; 

    // loadFactor() returns the proportion of the number of key/value pairs 
    // to the number of buckets, a measurement of how "full" the HashMap is. 
    // For example, if there are 20 key/value pairs and 50 buckets, we would 
    // say that the load factor is 20/50 = 0.4. 
    double loadFactor() const; 

    // maxBucketSize() returns the number of key/value pairs stored in this 
    // HashMap's largest bucket. 
    unsigned int maxBucketSize() const; 


private: 
    // This structure describes the nodes that make up the linked lists in 
    // each of this HashMap's buckets. 
    struct Node 
    { 
     std::string key; 
     std::string value; 
     Node* next; 
    }; 


    // Store the hash function (either the default hash function or the one 
    // passed to the constructor as a parameter) in this member variable. 
    // When you want to hash a key, call this member variable (i.e., follow 
    // it with parentheses and a parameter) just like you would any other 
    // function. 
    HashFunction hasher; 


    // You will no doubt need to add at least a few more private members 

    public: 
    // our hash function 
    unsigned int hashFunc(const std::string& key) const; 

private: 
    Node** hashTable; 

    // We need a variable that will always let us know what the current amount 
    // of buckets is. bucketCount will use this and return this variable. 
    unsigned int amountOfBuckets; 
    // we also need the number of keys currently in the hash map. This is stored here 
    unsigned int sz; 
}; 

#endif // HASHMAP_HPP 

这里是我如何实现我的课(HashMap.cpp):

#include "HashMap.hpp" 

// default constructor will initialize Node to default values 
// Create a new hash table with the initial bucket count 
// Set the amount of buckets to the initial bucket count 
// Set the current amount of key/value pairs to zero. 
HashMap::HashMap() 
    : hashTable{new Node*[initialBucketCount]}, amountOfBuckets{initialBucketCount}, sz{0} 
{ 
} 


// constructor that initializes HashMap to use a different hash function other 
// than the default 
HashMap::HashMap(HashFunction hashFunc) 
    : hasher{hashFunc}, hashTable{new Node*[initialBucketCount]}, amountOfBuckets{initialBucketCount}, sz{0} 
{ 
} 

// copy constructor, initializes a new HashMap to be a copy of an existing one 
HashMap::HashMap(const HashMap& hm) 
// commented out for now : 
{ 
} 

// destructor: deallocate the HashMap 
HashMap::~HashMap() 
{ 
    // delete something here 
} 

// Assignment operator that overloads equals 
HashMap& HashMap::operator=(const HashMap& hm) 
{ 
    // FIX COMPILER WARNINGS, DELETE 
    return *this; 
} 

// our hash function, this is for our type def HashFunction 
// pass this through the constructor 
unsigned int HashMap::hashFunc(const std::string& key) const 
{ 
    unsigned int hashValue = 0; // what we end up returning 
    for(int i = 0; i < key.length(); i++) { // iterate through string 
     int letterIndex = key.at(i) - 96; // create an index for each ASCII char 
     // first multiply our current hashValue by a prime number 
     // add to the letter index, to maintain a stable result 
     // mod by the current amount of buckets on each iteration to prevent overflow 
     hashValue = (hashValue * 27 + letterIndex) % bucketCount(); 
    } 
    return hashValue; 
} 

// add function 
void HashMap::add(const std::string& key, const std::string& value) 
{ 
    // Check if key being stored matches key already in hashmap 
    /* BASIC ADD FUNCTION, JUST TO IMPLEMENT UNIT TESTS */ 
    unsigned int hashVal = hashFunc(key); 
    //Node* prev = nullptr; // keeps track of where we are 
    Node* current = hashTable[hashVal]; // the place we store a data item into 
    while(current != nullptr) { 
     //prev = current; // update previous node to point to current 
     // this lets us move current without losing our place 
     current = current->next; // move current to the next node 
    } // stop once we find an empty node 
    // current should equal a nullptr 
    current->key = key; // set key (user) 
    current->value = value; // set password 
    current->next = nullptr; // set the next ptr to be null 

} 

// takes in a key (username), removes it and the value (password) associated 
// with it, otherwise, it has no effect 
void HashMap::remove(const std::string& key) 
{ 
} 

// returns true if given key is in hash map, otherwise returns false 
// this acts as a find method 
bool HashMap::contains(const std::string& key) const 
{ 
    unsigned int hashedValue = hashFunc(key); // hash the key given to get an index 
    if(hashTable[hashedValue] == nullptr) { // if there are no nodes at given index 
     return false; 
    } else { // there are some nodes in the hash table 
     // iterate through each node in the linked list 
     // Node* current = hashTable[hashedValue]; 
     // start at first node (this is current) 
     Node* current = hashTable[hashedValue]; 
     while(current != nullptr && current->key == key) { 
      current = current->next; 
     } // end while 
     if(current == nullptr) { // we reached the end of our linked list 
      return false; // couldn't find a value 
     } else { // we found the key provided 
      return true; 
     } 
    } // end if-else 
} 

// value() returns the value associated with the given key in this HashMap 
// if the key is stored in this HashMap; if not, the empty string is returned. 
std::string HashMap::value(const std::string& key) const 
{ 
    // HANDLES COMPILER WARNINGS, DELETE LATER 
    return ""; 
} 

// size() returns the number of key/value pairs stored in this HashMap. 
unsigned int HashMap::size() const 
{ 
    return sz; 
} 

// bucketCount() returns the number of buckets currently allocated in this HashMap. 
// each bucket is an index for the array, we do not include the linked lists. 
unsigned int HashMap::bucketCount() const 
{ 
    return amountOfBuckets; 
} 

// loadFactor() returns the proportion of the number of key/value pairs 
// to the number of buckets, a measurement of how "full" the HashMap is. 
// For example, if there are 20 key/value pairs and 50 buckets, we would 
// say that the load factor is 20/50 = 0.4. 
double HashMap::loadFactor() const 
{ 
    return sz/amountOfBuckets; 
} 

// maxBucketSize() returns the number of key/value pairs stored in this 
// HashMap's largest bucket. 
unsigned int HashMap::maxBucketSize() const 
{ 
    // HANDLE COMPILER WARNINGS, DELETE LATER 
    return 0; 
} 

主要的方法,我实现现在是添加。目前,我只是试图用一个主函数来测试这个类,以便看看我是否可以向地图添加值,然后测试它是否能够识别地图中是否包含东西。我意识到班级中很少有人是完整的,而且这些功能本身并不完整,但我只是试图测试最基本的案例。

最后,这里是我的主:

#include <iostream> 
#include <string> 
#include "HashMap.hpp" 

int main() 
{ 
    // initialize test 
    HashMap test1; 
    std::cout << "TEST 1 HASHMAP OBJECT CREATED" << std::endl; 

    // add some values 
    // at this point (11/16/2014), I only have contains, add, and hashFunc 
    // test these methods below 
    // constructor doesn't quite work right 
    std::string key1 = "Alex"; 
    std::string value1 = "password1"; 
    std::string key2 = "Danielle"; 
    std::string value2 = "password2"; 

    std::cout << "strings have been created" << std::endl; 

    // add to hash map 
    test1.add(key1, value1); 
    test1.add(key2, value2); 

    std::cout << "Keys and values have been added to hash map" << std::endl; 

    // does key1 contain the word "hi"? no, should return false 
    std::cout << "Hash map contains word hi?: " << test1.contains("hi") << std::endl; 
    // does key2 contain word "Danielle"? yes, should return true 
    std::cout << "Hash map contains word Danielle?: " << test1.contains("Danielle") << std::endl; 
    return 0; 
} 

我使用的是预建的脚本后,我建立它来运行程序。当运行时,我得到这个输出:

TEST 1 HASHMAP OBJECT CREATED 
strings have been created 
./run: line 43: 10085 Segmentation fault  (core dumped) $SCRIPT_DIR/out/bin/a.out.$WHAT_TO_RUN 

基本上,分段故障发生在add函数期间。那么实际上添加了什么?我怎样才能理解散列图应该做得更好?

回答

1

你自己的意见,即目前应等于nullptr正确地预示着失败的下一行:

// current should equal a nullptr 
current->key = key; // set key (user) 

假设一个新分配的数组将充满nullptr通常是不好的做法。 您需要将它设置为构造函数中的所有nullptr

您的add()函数还有其他问题。

这里是在使其至少达到目的的刺:

void HashMap::add(const std::string& key, const std::string& value) 
{ 
// Check if key being stored matches key already in hashmap 
/* BASIC ADD FUNCTION, JUST TO IMPLEMENT UNIT TESTS */ 
unsigned int hashVal = hashFunc(key); 
Node* head=hashTable[hashVal];  
Node* current = head; 
if(current==nullptr){ 
    //Nothing in this bucket so it's definitely new. 
    current=new Node(); 
    current->key = key; // set key (user) 
    current->value = value; // set password 
    current->next = nullptr; // set the next ptr to be nullptr. 
    hashTable[hashVal]=current;  
    return; 
} 
do { //It's a do-while because the if statement above has handled current==nullptr. 
    if(current->key==key){ 
     //We've found a match for the key. 
     //Common hash-table behavior is to overwrite the value. So let's do that. 
     current->value=value; 
     return; 
    } 
    current = current->next; // move current to the next node 
} while(current != nullptr) // stop once we go past the last node. 

//Finally, we found hash collisions but no match on key. 
//So we add a new node and chain it to the node(s) already there. 
// 
//Sometimes it's a good idea to put the new one at the end or if it's likely to get looked up 
//it's also a good idea to put it at the start. 
//It might be a good idea to keep the collision chain sorted and insert into it accordingly. 
//If we sort we can dive out of the loop above when we pass the point the key would be. 
// 
//However for a little example like this let's put the new node at the head. 

current=new Node(); 
current->key = key; // set key (user) 
current->value = value; // set password 
current->next = head; // set the next pointer to be the old head. 
hashTable[hashVal]=current;  

} 

PS:还有你的哈希函数的一个问题。 你正在模主循环内的哈希表大小。 这只会产生限制分配并导致可怕传播的效果。 至少它移动到该函数的最后一行:

return hashValue % bucketCount() ; 

不过我建议将它移入add功能:

unsigned int hashVal = hashFunc(key); //Assume modified to not use bucketCount(). 
unsigned int tableIndex=hashVal % bucketCount();//Reduce hash to a valid index. 
Node* head=hashTable[tableIndex]; //TODO: Do same in the other accesses to hashTable... 

然后你可以在全哈希值存储在Node结构并在current->key==key之前将其用作较强的预对比。如果散列表可能非常满,那么可以获得很大的性能提升。这取决于如果你想腾出一个unsigned intNode字节。 如果你这样做,你把尖端排序的冲突链,则可以通过将哈希码这样做,可能会通常会避免在add()新的密钥或get()未出现,通常只有一次成功get()比较任意字符串或覆盖add()

0

在你的HashMap :: add中,你解引用了一个空指针。您可以在构造函数中创建一个大小为10个元素的节点指针数组,但在您的代码中,您从不创建任何节点对象并将其分配给数组中的指针。所以,当你这样做:

current->key = key; 

你访问一个节点指针为0

+0

我怎么会有在某个数组值点我节点指针以一个新的节点对象?我认为Node * current = hashTable [hashVal];会这样做。你说的话是有道理的,我只是不能想办法把它工作:( – Alex

+0

你的表是一个指针数组,指针都是NULL。你顺利地拿到了一个空指针你的表,但不帮忙写回表您应该将哈希值转换成桶的索引,然后存储'hashTable中[斗] =新节点(...);' – Useless