Re-tree: an efficient index structure for regular expressions
01 January 2002
Due to their flexibility, expressive power, and ability to capture structure, Regular Expressions (REs) are quickly becoming an integral part of language specifications for several important application scenarios. Examples include the XPath pattern language for XML documents and the policy language of the Border Gateway Protocol (BGP) for exchanging routing information across the Internet domains. Many of these applications have to manage huge databases of RE specifications and need to provide an effective matching mechanism that, given an input string, quickly identifies the REs in the database that match it. This "RE retrieval" problem is important for a number of Internet-infrastructure components, including XML routing, XML filtering, BGP routing, information dissemination, and event notification. In this paper, we propose the RE-tree, a novel index structure for large databases of RE specification.