Python – The temporal complexity of the Search operation in the TRIE data structure

The temporal complexity of the Search operation in the TRIE data structure… here is a solution to the problem.

The temporal complexity of the Search operation in the TRIE data structure

I’m trying to implement a dictionary-based trie data structure. Please find the Python code below.

class Trie:
  def __init__(self):
     self.word_end = '_end_'
     self.root = {}

def add_word(self, word):
     self.add_words([word])

def add_words(self, words):
     for word in words:
         current_dict = self.root
         for letter in word:
             current_dict = current_dict.setdefault(letter, {})
         current_dict[self.word_end] = self.word_end

def check_word(self, word):
     current_dict = self.root
     for letter in word:
         if letter in current_dict:
             current_dict = current_dict[letter]
         else:
             return False
     else:
         if self.word_end in current_dict:
             return True
         else:
             return False

tree = Trie()
tree.add_words(['foo', 'bar', 'baz', 'barz'])
print tree
"""
{'b': {'a': {'r': {'_end_': '_end_', 'z': {'_end_': '_end_'}}, 
             'z': {'_end_': '_end_'}}}, 
 'f': {'o': {'o': {'_end_': '_end_'}}}}
 """

print check_word('baz')
# True
print check_word('barz')
# True
print check_worf('barzz')
# False

I see that the complexity of searching for a word is O(m), where m is the length of the searched term. Also, the complexity of adding a word is very similar – O(m), where m is the length of the word to be added.

Question:
These complexities are too good. Can someone confirm these complexities? Is there a problem with my Trie tree implementation?

Solution

The

time complexity of searching in TRIE is indeed O(k), where k is the length of the string to be searched. This is a true and proven fact. However, storage requirements are a disadvantage. Each node of Trie consists of multiple branches. Each branch represents one possible key character. We need to mark the last node of each key as the end of the word node. A Trie node field, isEndOfWord, is used to distinguish the node as a end-of-word node.

In general, the cost of inserts and searches is O(length_of_string), but the memory requirement for Trie is O(ALPHABET_SIZE * length_of_string * N), where N is the number of keys in Trie. There are efficient trie node representations (e.g. compressed trie, ternary search tree, etc.) to minimize the memory requirements of trie.

Related Problems and Solutions