Ontology

Quora has many questions on different topics, and a common product use-case for our @mention selectors and search service is to look-up questions under a certain topic as quickly as possible.

For this problem, imagine a simplified version of Quora where each question has only one topic associated with it. In turn, the topics form a simplified ontology where each topic has a list of children, and all topics are descendants of a single root topic.

Design a system that allows for fast searches of questions under topics. There are NN topics, MM questions, and KK queries, given in this order. Each query has a desired topic as well as a desired string prefix. For each query, return the number of questions that fall under the queried topic and begin with the desired string. When considering topics, we want to include all descendants of the queried topic as well as the queried topic itself. In other words, each query searches over the subtree of the topic.

The topic ontology is given in the form of a flattened tree of topic names, where each topic may optionally have children. If a topic has children, they are listed after it within parentheses, and those topics may have children of their own, etc. See the sample for the exact input format. The tree is guaranteed to have a single root topic.

For ease of parsing, each topic name will be composed of English alphabetical characters, and each question and query text will be composed of English alphabetical characters, spaces, and question marks. Each question and query text will be well behaved: there will be no consecutive spaces or leading/trailing spaces. All queries, however, are case sensitive.

Constraints For 100% of the test data, 1≤N,M,K≤1051≤N,M,K≤105 and the input file is smaller than 5MB For 50% of the test data, 1≤N,M,K≤2⋅1041≤N,M,K≤2⋅104 and the input file is smaller than 1MB

Input Format Line 1: One integer NN Line 2: NN topics arranged in a flat tree (see sample) Line 3: One integer MM Line 4...M+3: Each line contains a topic name, followed by a colon and a space, and then the question text. Line M+4: One integer KK Line M+5...M+K+4: Each line contains a topic name, followed by a space, and then the query text.

Output Format Line 1...K: Line ii should contain the answer for the iith query.

Sample Input

6
Animals ( Reptiles Birds ( Eagles Pigeons Crows ) )
5
Reptiles: Why are many reptiles green?
Birds: How do birds fly?
Eagles: How endangered are eagles?
Pigeons: Where in the world are pigeons most densely populated?
Eagles: Where do most eagles live?
4
Eagles How en
Birds Where
Reptiles Why do
Animals Wh

Sample Output

1
2
0
3

Solution

from collections import deque


class TrieNode:
    def __init__(self):
        self.count = 0
        self.children = {}


class Trie:
    def __init__(self):
        self.head = TrieNode()

    def insert(self, sentence):
        """
        Insert sentence into Trie.

        Args:
            sentence: sentence to be inserted into Trie
        Returns:
            None
        """
        current = self.head
        for char in sentence:
            if char in current.children:
                current.children[char].count += 1
            else:
                current.children[char] = TrieNode()
                current.children[char].count = 1
            current = current.children[char]

    def return_query_count(self, query):
        """
        Return Query Count.
        Args:
            query: query entered by user.
        Returns:
            number of relevant queries
        """
        current = self.head
        try:
            for char in query:
                current = current.children[char]
            return current.count
        except KeyError:
            return 0


class TreeNode:
    def __init__(self):
        self.trie = Trie()
        self.children = []


class Tree:
    def __init__(self):
        self.map = {}

    def insert_node(self, root, children):
        """Insert the tree node."""
        if root not in self.map:
            self.map[root] = TreeNode()
        self.map[root].children += children

    def insert_sentence(self, node, sentence):
        """Insert sentence into Trie."""
        self.map[node].trie.insert(sentence)

    def query(self, node, query):
        """Get query count."""
        return self.map[node].trie.return_query_count(query)


class Ontology:
    def __init__(self):
        """Constructor"""
        self.tree = Tree()

    def build_query(self, query):
        """
        Save query into tree.
        Args:
            query: string representing the query to save
        Returns:
            None
        Raises:
            Value Error in case the query is malformed.
        """
        try:
            node, sentence = query.split(":")
            self.tree.insert_sentence(node.strip(), sentence.strip())
        except ValueError:
            return ValueError("Not a valid query")

    def parse_flat_tree(self, ps):
        """
        Parse the flat tree.
        Args:
            parse_string: string containing the flat tree
        Returns:
            Ontology Structure
        """
        words = deque(ps.split(" "))
        try:
            root = words.popleft()
            if words.popleft() == "(":
                self.parse_recurse(root, words)
        except IndexError:
            return IndexError("No ")

    def parse_recurse(self, root, words):
        """
        Recursively build the Tree.
        Args:
            root: current parent to add children too.
            words: queue containing tokens of query
        Returns:
            None
        Raises:
            IndexError
        """
        child, children, previous_root = words.popleft(), [], None
        while child:
            if child == "(":
                self.parse_recurse(previous_root, words)
                child = words.popleft()
            elif child == ")":
                if len(children) > 0:
                    self.tree.insert_node(root, children)
                break
            else:
                children.append(child)
                self.tree.insert_node(child, [])
                previous_root = child
                try:
                    child = words.popleft()
                except IndexError:
                    child = None

    def get_query_count(self, query):
        """
        Get query count.
        Args:
            query: query to get count of matching questions
        Returns:
            count of related topics to query
        """
        try:
            head, sep, query = query.partition(" ")
        except ValueError:
            return ValueError("Not a valid query to match")
        queue, count = deque(), 0
        queue.append(head)
        while queue:
            current = queue.popleft()
            try:
                queue.extend(self.tree.map[current].children)
            except KeyError:
                return 0
            count += self.tree.query(current, query)
        return count


def main():
    o = Ontology()
    n_topics = int(raw_input())
    flat_tree = raw_input()
    o.parse_flat_tree(flat_tree)

    n_query = int(raw_input())
    for i in xrange(n_query):
        o.build_query(raw_input())

    n_match_query = int(raw_input())
    for i in xrange(n_match_query):
        print o.get_query_count(raw_input())

if __name__ == '__main__':
    main()

results matching ""

    No results matching ""