Life science research labs today manage increasing volumes of sequence data. Much of the data management and querying is accomplished procedurally using Perl, Python, or Java programs that integrate data from different sources and query tools. The dangers of this procedural approach are well known to the database community: (a) severe limitations on the ability to rapidly express queries and (b) inefficient query plans due to the lack of sophisticated optimization tools. This situation is likely to get worse with advances in high-throughput technologies that make it easier to quickly produce vast amounts of sequence data. The need for a declarative and efficient system to manage and query biological sequence data is urgent. To address this need, we designed the Periscope/SQ system. Periscope/SQ extends current relational systems to enable sophisticated queries on sequence data and can optimize and execute these queries efficiently.

This thesis describes the problems that need to be solved to make it possible to build the Periscope/SQ system. First, we describe the algebraic framework which forms the backbone of Periscope/SQ. Second, we describe algorithms to construct large-scale suffix tree indexes for efficiently answering sequence queries. Third, we describe techniques for selectivity estimation and optimization in the context of queries over biological sequences. Next, we demonstrate how some of the techniques developed for Periscope/SQ can be applied to produce a powerful mining algorithm that we call FLAME. Finally, we describe GeneFinder, a biological application built on top of Periscope/SQ. GeneFinder is currently being used to predict the targets of transcription factors.

Today, genomic and proteomic sequences are the most abundantly available source of high-quality biological data. By making it possible to declaratively and efficiently query vast amounts of sequence data, Periscope/SQ opens the door to vast improvements in the pace of bioinformatics research.