The temporal complexity of the Search operation in the TRIE data structure
I’m trying to implement a dictionary-based trie data structure. Please find the Python code below.
class Trie:
def __init__(self):
self.word_end = '_end_'
self.root = {}
def add_word(self, word):
self.add_words([word])
def add_words(self, words):
for word in words:
current_dict = self.root
for letter in word:
current_dict = current_dict.setdefault(letter, {})
current_dict[self.word_end] = self.word_end
def check_word(self, word):
current_dict = self.root
for letter in word:
if letter in current_dict:
current_dict = current_dict[letter]
else:
return False
else:
if self.word_end in current_dict:
return True
else:
return False
tree = Trie()
tree.add_words(['foo', 'bar', 'baz', 'barz'])
print tree
"""
{'b': {'a': {'r': {'_end_': '_end_', 'z': {'_end_': '_end_'}},
'z': {'_end_': '_end_'}}},
'f': {'o': {'o': {'_end_': '_end_'}}}}
"""
print check_word('baz')
# True
print check_word('barz')
# True
print check_worf('barzz')
# False
I see that the complexity of searching for a word is O(m), where m is the length of the searched term. Also, the complexity of adding a word is very similar – O(m), where m is the length of the word to be added.
Question:
These complexities are too good. Can someone confirm these complexities? Is there a problem with my Trie tree implementation?
Solution
The
time complexity of searching in TRIE is indeed O(k), where k is the length of the string to be searched. This is a true and proven fact. However, storage requirements are a disadvantage. Each node of Trie consists of multiple branches. Each branch represents one possible key character. We need to mark the last node of each key as the end of the word node. A Trie node field, isEndOfWord, is
used to distinguish the node as a end-of-word node.
In general, the cost of inserts and searches is O(length_of_string), but the memory requirement for Trie is O(ALPHABET_SIZE * length_of_string * N), where N is the number of keys in Trie. There are efficient trie node representations (e.g. compressed trie, ternary search tree, etc.) to minimize the memory requirements of trie.