2
\$\begingroup\$

I am trying to implement a basic hash table using a list of tuples. The code produce the desired output. Please help me improve my code as I feel there are a lot of while loops.

#global variables to keep track of the hash table count = 0 size = 1000 def hash_table(size): return [None] * size def hash_item(key, value): return (key,value) def hash_func(key): #simple hash function index = hash(key) % 1000 return index def insert_hash(table,key,value): index = hash_func(key) item = hash_item(key,value) global count global size if count == size: raise Exception("Hash table limit reached") if table[index] == None: table[index] = item else: collision_handler(table, index, item) count = count + 1 def print_hash(table,key): index = hash_func(key) while table[index][0] != key: #in case there is a collision: index = index + 1 print(table[index]) def delete_hash(table,key): index = hash_func(key) while table[index][0] != key: index = index + 1 table.pop(index) def collision_handler(table,index,item): #handling collisions using open addressing while table[index] != None: index = index + 1 table[index] = item a = hash_table(size) b = hash_func('1') insert_hash(a, '1','john') print_hash(a,'1') delete_hash(a,'1') insert_hash(a,'1','edward') print_hash(a,'1') 
\$\endgroup\$

    1 Answer 1

    1
    \$\begingroup\$

    Despite the code produce the desired output in your simple test case, your implementation is quite flawed.

    a = hash_table(size) insert_hash(a, 1, 'john') insert_hash(a, 2, 'edward') delete_hash(a, 1) print_hash(a, 2) 

    Output:

    Traceback (most recent call last): File "C:/Code Review/hash_table.py", line 53, in <module> print_hash(a, 2) File "C:/Code Review/hash_table.py", line 30, in print_hash while table[index][0] != key: #in case there is a collision: TypeError: 'NoneType' object is not subscriptable >>> 

    The issue: table.pop(index) will move items after the one being delete forward by one position in the table. Given that they are indexed by the hash values, this would mean that they can't be found anymore.

    a = hash_table(size) for _ in range(2000): insert_hash(a, 1, 'john') delete_hash(a, 1) 

    Output:

    Traceback (most recent call last): File "C:/Code Review/hash_table.py", line 51, in <module> insert_hash(a, 1, 'john') File "C:/Code Review/hash_table.py", line 22, in insert_hash if table[index] == None: IndexError: list index out of range >>> 

    After each insert/delete, the table size has shrunk by one item. After a thousand insert/deletes, the table can't hold any values!

    a = hash_table(size) insert_hash(a, 999, 'john') insert_hash(a, 999, 'edward') 

    Output:

    Traceback (most recent call last): File "C:/Code Review/hash_table.py", line 51, in <module> insert_hash(a, 999, 'edward') File "C:/Code Review/hash_table.py", line 25, in insert_hash collision_handler(table, index, item) File "C:/Code Review/hash_table.py", line 42, in collision_handler while table[index] != None: IndexError: list index out of range >>> 

    A collision at the last slot in the hash table doesn't have room to handle the collision, despite the table only having a 999 empty slots. You need to use circular addressing.

    You need to add more test cases, and rework the code to handle these issues. I generated these issues quickly by exploiting the fact that hash(int) == int for sufficiently small integers, but they would arise using strings for keys as well. It is just be harder to generate cases which deterministicly fail using string keys.

    \$\endgroup\$

      Start asking to get answers

      Find the answer to your question by asking.

      Ask question

      Explore related questions

      See similar questions with these tags.