帳號:guest(44.222.242.27)          離開系統
字體大小: 字級放大   字級縮小   預設字形  

詳目顯示

以作者查詢圖書館館藏以作者查詢臺灣博碩士論文系統以作者查詢全國書目
作者(中文):游智强
作者(外文):Yu, Chih-Chiang
論文名稱(中文):Improved Algorithms for Some Pattern Matching and Vertex-Disjoint Paths Problems
論文名稱(外文):針對字串索引及點互斥路徑問題之改進演算法
指導教授(中文):王炳豐
指導教授(外文):Wang, Biing-Feng
學位類別:博士
校院名稱:國立清華大學
系所名稱:資訊工程學系
學號:924324
出版年(民國):99
畢業學年度:98
語文別:英文
論文頁數:111
中文關鍵詞:演算法資料結構字串索引點互斥路徑完全多項式時間近似方案
外文關鍵詞:algorithmsdata structurestext indexingvertex-disjoint pathsfully polynomial-time approximation schemes
相關次數:
  • 推薦推薦:0
  • 點閱點閱:54
  • 評分評分:*****
  • 下載下載:2
  • 收藏收藏:0
This dissertation is composed of the following two fundamental research topics in computer science: string matching and finding disjoint paths.
In the first part, we study the following three indexing problems: the positional text indexing problem, the position-restricted text indexing problem, and the variable- length don't care text indexing problem. Previous solutions for these indexing problems heavily rely on efficient indexes to the range successor problem. Thus, any improvement on the range successor problem immediately leads to improved results for these indexing problems. In this dissertation, we present three new indexes for the range successor problem. More specifically, we give (1) a succinct n + o(n)-word index with O(log n / loglog n) query time, whose query time improves previous O(n)-word index by an O(loglog n) factor; (2) an O(n loglog n)-word index with O((loglog n)^2) query time, whose space-time product is better than all previous solutions; and (3) an O(n^{1+\epsilon})-word index with O(1) query time, which is simpler than the previous O(1)-time index. In addition, our index in (2) can be extended to solve the well-known orthogonal range successor problem in R^3. The extended index needs O(n log^{1+\epsilon} n) words and supports O(log n loglog n) query time, improving a long-standing result which uses O(n log^2 n) words with the same query time.
Our results on the range successor problem immediately lead to improved results for the above three indexing problems over general alphabets. For real world applications, the alphabet size is usually small. In this dissertation, we also consider these three indexing problems over alphabets of size O(polylog(n)). For the first and third problems, we present optimal O(n)-word indexes with O(p) query time. For the second problem, we show that a query can be answered in O(n log^\epsilon n) space and O(p + occ) time, or in O(n) space and O(p + occ log^\epsilon n) time. When |Σ| = O(polylog(n)), our indexes are better than all the previous results.
In the second part of this dissertation, we consider the following two problems: the problem of finding length-bounded two vertex-disjoint paths in an undirected planar graph, and the problem of finding minmax k vertex-disjoint paths in a directed acyclic graph. For the first problem, an improved algorithm is presented, which requires O(n^3 b_min) time and O(n^2 b_min) space, where b_min is the smaller of the two given length bounds. For the second problem, an improved algorithm and a faster fully polynomial-time approximation scheme are proposed. The proposed algorithm requires O(n^{k+1} M^{k-1}) time and O(n^k M^{k-1}) space, and the proposed approximation scheme requires O((1/\epsilon)^{k-1} n^{2k} log^{k-1} M) time and O((1/\epsilon)^{k-1} n^{2k-1} log^{k-1} M) space, where \epsilon is the given approximation parameter and M is the length of the longest path in an optimal solution.
本篇論文探討了資訊科學領域中相當基礎且重要的兩個研究議題: 字串索引問題及找尋互斥路徑問題。
本論文的第一部分探討下列三個有位置限制的字串索引問題: 特定位置之後的字串索引問題、區間範圍內的字串索引問題、以及包含 variable-length don't care 符號的字串索引問題。過去的結果主要是依賴解決 the range successor problem 的資料結構作為工具,所以只要 range successor 這個問題的資料結構有任何的改進,就可以得到這三個問題的改進結果。在這篇論文中,我們首先針對 range successor 這個問題提出三個新的索引資料結構。更明確地說,我們提出了 (1) 一個使用 n + o(n) 空間且支援 O(log n / loglog n) 查詢時間的資料結構,此結果的查詢時間比先前使用相同空間的資料結構快 O(loglog n) 倍;(2) 一個使用 O(n loglog n) 空間且支援 O((loglog n)^2) 查詢時間的資料結構,此結果的空間時間乘積優於過去所有的結果;以及 (3) 一個使用 O(n^{1+\epsilon}) 空間且支援 O(1) 查詢時間的資料結構,此結果比先前有相同複雜度的資料結構簡單許多。此外,第二個資料結構也可以用來解決在 R^3 空間中一個稱為 orthogonal range successor problem 的重要問題,使用 O(n log^{1+\epsilon} n) 空間且支援 O(log n loglog n) 查詢時間;這個結果改進了過去已經存在很久的最好結果。
利用我們在 range successor 這個問題上的改進結果,上述三個有位置限制的字串索引問題在任意大小的字符集下都可以被立刻改進。在現實生活的應用中,字符集通常很小,因此在本論文中,我們也探討了在小字符集下的上述三個字串索引問題。當字符集大小是 O(polylog(n)) 時,我們為這些問題提出了更有效率的改進演算法。針對第一個與第三個問題,我們提出了使用 O(n) 空間且支援 O(p) 查詢時間的最佳資料結構。針對第二個問題,我們提出了一個使用 O(n log^\epsilon n) 空間且支援 O(p) 查詢時間的資料結構;與一個使用 O(n) 空間且支援 O(p + occ log^\epsilon n) 查詢時間的資料結構。當字符集大小是 O(polylog(n)) 時,我們的演算法改進了過去所有的結果。
這篇論文的第二部分探討以下兩個問題: 在一個無向平面圖中找尋兩條有長度限制的點互斥路徑問題、以及在一個有向無循環圖中找尋 k 條點互斥路徑且讓最長路徑最小化的問題。針對第一個問題,我們提出一個改進演算法,時間與空間的複雜度分別為 O(n^3 b_min) 與 O(n^2 b_min),其中 b_min 為兩個給定的長度限制中較小的一個。針對第二個問題,我們提出了一個改進的演算法與一個更快的完全多項式時間近似方案 (fully polynomial-time approximation scheme)。提出的改進演算法使用了 O(n^{k+1} M^{k-1}) 時間與 O(n^k M^{k-1}) 空間;而提出的近似方案使用了 O((1/\epsilon)^{k-1} n^{2k} log^{k-1} M) 時間與 O((1/\epsilon)^{k-1} n^{2k-1} log^{k-1} M) 空間,其中 \epsilon 為給定的近似參數、M 為最佳解中最長路徑的長度。
Abstract i
Acknowledgement iv
Contents v
List of Figures viii
List of Tables x
Preface xi

Part I: String Matching with Position Restrictions

Chapter 1. Introduction 1
1.1 Related Work 5
1.2 Summary of Results 8
1.3 Organization of Part I 13
Chapter 2. Preliminaries 14
2.1 Common Notation and Definitions 14
2.2 Suffix Trees 14
2.3 Segment Trees 16
2.4 Wavelet Trees 16
2.5 Rank and Select Queries 17
2.6 Predecessor and Successor Queries 18
2.7 Presistent Data Structures 19
Chapter 3. The Orthogonal Range Successor Problem 20
3.1 Notation and Preliminaries 20
3.2 A Succinct Index for Range Successor Queries 22
3.2.1 The Core RS Index: Handling Ranges of Size O(lg n / (lglg n)^2) 22
3.2.2 The Core rankc Index: Handling Ranges of Size O(lg n / (lglg n)^2) 27
3.2.3 The General Index: Handling Ranges of Size n 31
3.3 A Simple Index with O(1) Query Time 36
3.3.1 An O(1)-time 2-Dimensional Range Successor Index 37
3.3.2 Extension to Higher Dimensions 39
3.4 An Index with Efficient Space-Time Product 40
3.4.1 Preliminaries: Two-Sided RS Queries 41
3.4.2 Preliminaries: Reduction to Rank Space 42
3.4.3 An O(n loglog n)-word RS Index 43
3.5 Efficient RS Indexes for Higher Dimensions 50
Chapter 4. Pattern Matching Problems with Position Restrictions 54
4.1 Notation and Definitions 54
4.2 Positional Text Indexing 55
4.2.1 An Index for Finite Alphabets 55
4.2.1.1 An Index for Short Patterns 56
4.2.1.2 An Index for Long Patterns 60
4.2.2 An Index for |Σ| = O(polylog(n)) 63
4.3 Position-Restricted Text Indexing 65
4.4 Variable-Length Don't Care Text Indexing 68
Chapter 5. Conclusion and Future Work 71

Part II: Finding Vertex-Disjoint Paths with Length Constraints

Chapter 6. Introduction 74
6.1 Summary of Results 77
6.2 Organization of Part II 79
Chapter 7. Length-Bounded Vertex-Disjoint Paths Problem on a Planar Graph 80
7.1 The Multi-Bounded Path Problem 80
7.2 Length-Bounded Two Vertex-Disjoint Paths 82
7.2.1 Notation and Definitions 83
7.2.2 Properties 84
7.2.3 An O(n3bmin)-time algorithm 87
Chapter 8. Minmax k Vertex-Disjoint Paths Problem on a Directed Acyclic Graph 93
8.1 An Improved Algorithm 93
8.2 A Faster Approximation Scheme 97
Chapter 9. Conclusion and Future Work 103

Bibliography 105
[1] P. K. Agarwal and J. Erickson, "Geometric range searching and its relatives," Advances in Discrete and Computational Geometry, pp. 1-56, 1999.
[2] T. Akutsu, "Approximate string matching with variable-length don't care characters," IEICE Transactions on Information and Systems, vol. E79-D, no. 9, pp. 1353-1354, 1996.
[3] S. Alstrup, G. S. Brodal, and T. Rauhe, "New data structures for orthogonal range searching," in Proceedings of the 41st Annual Symposium on Foundations of Computer Science, 2000, pp. 198-207.
[4] M. A. Bender and M. Farach-Colton, "The LCA problem revisited," in Proceedings of the 4th Latin American Theoretical Informatics Symposium, 2000, pp. 88-94.
[5] J. L. Bentley, "Solutions to Klee’s rectangle problems," Department of Computer Science, Carnegie Mellon University, Manuscript, 1977.
[6] J. L. Bentley, "Multidimensional divide-and-conquer," Communications of the ACM, vol. 23, no. 4, pp. 214-229, 1980.
[7] A. A. Bertossi and E. Lodi, "Parallel string matching with variable length don't cares," Journal of Parallel and Distributed Computing, vol. 22, no. 2, pp. 229-234, 1994.
[8] P. Bose, M. He, A. Maheshwari, and P. Morin, "Succinct orthogonal range search structures on a grid with applications to text indexing," in Proceedings of 11th International Symposium on Algorithms and Data Structures, 2009, pp. 98-109.
[9] R. S. Boyer and J. S. Moore, "A fast string searching algorithm," Communications of the ACM, vol. 20, no. 10, pp. 762-772, 1977.
[10] G. S. Brodal, R. Fagerberg, M. Greve, and A. López-Ortiz, "Online sorted range reporting," in Proceedings of the 20th International Symposium on Algorithms and Computation, 2009, pp. 173-182.
[11] G. S. Brodal and A. G. Jørgensen, "Data structures for range median queries," in Proceedings of the 20th International Symposium on Algorithms and Computation, 2009, pp. 822-831.
[12] B. Chazelle, "A functional approach to data structures and its use in multidimensional searching," SIAM Journal on Computing, vol. 17, no. 3, pp. 427-462, 1988.
[13] D. Clark, "Compact pat trees," PhD Thesis, University of Waterloo, 1996.
[14] D. R. Clark and J. I. Munro, "Efficient suffix trees on secondary storage," in Proceedings of the 7th Annual ACM-SIAM Symposium on Discrete Algorithms, 1996, 383-391.
[15] R. Cole, L.-A. Gottlieb, and M. Lewenstein, "Dictionary matching and indexing with errors and don't cares," in Proceedings of the 36th Annual ACM Symposium on Theory of Computing, 2004, pp. 91-100.
[16] T. H. Cormen, C. E. Leiserson, R. L. Rivest, and C. Stein, Introduction to Algorithms, second ed. McGraw-Hill, 2001.
[17] M. Crochemore, C. S. Iliopoulos, M. Kubica, M. S. Rahman, and T. Walen, "Improved algorithms for the range next value problem and applications," in Proceedings of the 25th Annual Symposium on Theoretical Aspects of Computer Science, 2008, pp. 205-216.
[18] M. Crochemore, C. S. Iliopoulos, and M. S. Rahman, "Finding patterns in given intervals," in Proceedings of the 32nd International Symposium on Mathematical Foundations of Computer Science, 2007, pp. 645-656.
[19] P. F. Dietz and R. Raman, "Persistence, amortization and randomization," in Proceedings of the 2nd Annual ACM-SIAM Symposium on Discrete Algorithms, 1991, pp. 78-88.
[20] J. R. Driscoll, N. Sarnak, D. D. Sleator, and R. E. Tarjan, "Making data structures persistent," Journal of Computer and System Sciences, vol. 38, no. 1, pp. 86-124, 1989.
[21] P. van Emde Boas, "Preserving order in a forest in less than logarithmic time and linear space," Information Processing Letters, vol. 6, no. 3, pp. 80-82, 1977.
[22] P. Ferragina, G. Manzini, V. Mäkinen, and G. Navarro, "Compressed representations of sequences and full-text indexes," ACM Transactions on Algorithms, vol. 3, no. 2, article 20, 2007.
[23] J. Fischer and V. Heun, "A new succinct representation of RMQ-information and improvements in the enhanced suffix array," in Proceedings of the first International Symposium on Combinatorics, Algorithms, Probabilistic and Experimental Methodologies, 2007, pp. 459-470.
[24] R. Fleischer, Q. Ge, J. Li, and H. Zhu, "Efficient algorithms for k-disjoint paths problems on DAGs," in Proceedings of the 3rd International Conference on Algorithmic Aspects in Information and Management, 2007, pp. 134-143.
[25] S. Fortune, J. Hopcroft, and J. Wyllie, "The directed subgraph homeomorphism problem," Theoretical Computer Science, vol. 10, pp. 111-121, 1980.
[26] M. L. Fredman and D. E. Willard, "Trans-dichotomous algorithms for minimum spanning trees and shortest paths," Journal of Computer and System Sciences, vol. 48, no. 3, pp. 533-551, 1994.
[27] H. N. Gabow, J. L. Bentley, and R. E. Tarjan, "Scaling and related techniques for geometry problems," in Proceedings of the 16th Annual ACM Symposium on Theory of Computing, 1984, pp. 135-143.
[28] R. Grossi, A. Gupta, and J. S. Vitter, "High-order entropy-compressed text indexes," in Proceedings of the 14th Annual ACM-SIAM Symposium on Discrete Algorithms, 2003, pp. 841-850.
[29] Q.-P. Gu and S. Peng, "Algorithms for node disjoint paths in incomplete star networks," in Proceedings of the 1994 International Conference on Parallel and Distributed Systems, 1994, pp. 296-303.
[30] Q.-P. Gu and S. Peng, "An efficient algorithm for k-pairwise disjoint paths in star graphs," Information Processing Letters, vol. 67, no. 6, pp. 283-287, 1998.
[31] Q.-P. Gu and S. Peng, "An efficient algorithm for the k-pairwise disjoint paths problem in hypercubes," Journal of Parallel Distributed Computing, vol. 60, no. 6, pp. 764-774.
[32] T. Hagerup, P. B. Miltersen, and R. Pagh, "Deterministic dictionaries," Journal of Algorithms, vol. 41, no. 1, pp. 69-85, 2001.
[33] R. Hassin, "Approximation schemes for the restricted shortest path problem," Mathematics of Operations Research, vol. 17, no. 1, pp. 36-42, 1992.
[34] H. van der Holst, "An algorithm for finding length-bounded three vertex-disjoint paths in a planar graph," Manuscript, Department of Mathematics and Computer Science, Eindhoven University of Technology, Netherlands.
[35] H. van der Holst and J. C. de Pina, "Length-bounded disjoint paths in planar graphs," Discrete Applied Mathematics, vol. 120, no. 1-3, pp. 251-261, 2002.
[36] C. S. Iliopoulos and M. S. Rahman, "Indexing factors with gaps," Algorithmica, vol. 55, no. 1, pp. 60-70, 2009.
[37] S. Inenaga, M. Takeda, A. Shinohara, H. Hoshino, and S. Arikawa, "The minimum DAWG for all suffixes of a string and its applications," in Proceedings of the 13th Annual Symposium on Combinatorial Pattern Matching, 2002, pp. 153-167.
[38] A. Itai, Y. Perl, and Y. Shiloach, "The complexity of finding maximum disjoint paths with length constraints," Networks, vol. 12, no. 3, pp. 277-286, 1982.
[39] G. Jacobson, "Space-efficient static trees and graphs," in Proceedings of the 30th Annual Symposium on Foundations of Computer Science, 1989, pp. 549-554.
[40] J. JáJá, C. W. Mortensen, and Q. Shi, "Space-efficient and fast algorithms for multidimensional dominance reporting and counting," in Proceedings of the 15th International Symposium on Algorithms and Computation, 2004, pp. 558-568.
[41] R. M. Karp, "On the computational complexity of combinatorial problems," Networks, pp. 45-48, 1975.
[42] O. Keller, T. Kopelowitz, and M. Lewenstein, "Range non-overlapping indexing and successive list indexing," in Proceedings of the 10th Workshop on Algorithms and Data Structures, 2007, pp. 625-636.
[43] D. E. Knuth, J. H. Morris, and V. R. Pratt, "Fast pattern matching in strings," SIAM Journal on Computing, vol. 6, no. 2, pp. 323-350, 1977.
[44] E. Korach and A. Tal, "General vertex disjoint paths in series-parallel graphs," Discrete Applied Mathematics, vol. 41, no. 2, pp. 147-164, pp. 1993.
[45] A. Kosowski, "The maximum edge-disjoint paths problem in complete graphs," Theoretical Computer Science, vol. 399, no. 1-2, pp. 128-140, 2008.
[46] D. Krizanc, P. Morin, and M. H. M. Smid, "Range mode and range median queries on lists and trees," Nordic Journal of Computing, vol. 12, no. 1, pp. 1-17, 2005.
[47] G. Kucherov and M. Rusinowitch, "Matching a set of strings with variable length don't cares," Theoretical Computer Science, vol. 178, no. 1-2, pp. 129-154, 1997.
[48] T. W. Lam, W.-K. Sung, S.-L. Tam, and S.-M. Yiu, "Space efficient indexes for string matching with don't cares," in Proceedings of the 18th International Symposium on Algorithms and Computation, 2007, pp. 846-857.
[49] R. C. T. Lee, Exact and Approximate String Matching Algorithms, Manuscript, National Tsing Hua University.
[50] H.-P. Lenhof and M. H. M. Smid, "Using persistent data structures for adding range restrictions to searching problems," RAIRO Theoretical Informatics and Applications, vol. 28, no. 1, pp. 25-49, 1994.
[51] C.-L. Li, S. T. McCormick, and D. Simchi-Levi, "The complexity of finding two disjoint paths with min-max objective function," Discrete Applied Mathematics, vol. 26, no. 1, pp. 105-115, 1990.
[52] J. F. Lynch, "The equivalence of theorem proving and the interconnection problem," ACM SIGDA Newsletter, vol. 5, no. 3, pp. 31-36, 1975.
[53] V. Mäkinen and G. Navarro, "Rank and select revisited and extended," Theoretical Computer Science, vol. 387, no. 3, pp. 332-347, 2007.
[54] U. Manber and E. W. Myers, "Suffix arrays: a new method for on-line string searches," SIAM Journal on Computing, vol. 22, no. 5, pp. 935-948, 1993.
[55] E. M. McCreight, "A space-economical suffix tree construction algorithm," Journal of the ACM, vol. 23, no. 2, pp. 262-272, 1976.
[56] J. I. Munro, "Tables," in Proceedings of the 16th Conference on Foundations of Software Technology and Theoretical Computer Science, 1996, pp. 37-42.
[57] Y. Nekrich, "A data structure for multi-dimensional range reporting," in Proceedings of the 23rd ACM Symposium on Computational Geometry, 2007, pp. 344-353.
[58] Y. Nekrich, "Space efficient dynamic orthogonal range reporting," Algorithmica, vol. 49, pp. 94-108, 2007.
[59] Y. Nekrich, "Orthogonal range searching in linear and almost-linear space," Computational Geometry, vol. 42, no. 4, pp. 342-351, 2009.
[60] Y. Perl and Y. Shiloach, "Finding two disjoint paths between two pairs of vertices in a graph," Journal of the ACM, vol. 25, no. 1, pp. 1-9, 1978.
[61] P. Peterlongo, J. Allali, and M.-F. Sagot, "Indexing gapped-factors using a tree," International Journal of Foundations of Computer Science, vol. 19, no. 1, pp. 71-87, 2008.
[62] H. Petersen, "Improved bounds for range mode and range median queries," in Proceedings of the 34th Conference on Current Trends in Theory and Practice of Computer Science, 2008, pp. 418-423.
[63] R. Y. Pinter, "Efficient string matching with don't-cares," Combinatorial Algorithms on Words, vol. 12, pp. 11-29, 1985.
[64] F. P. Preparata and M. I. Shamos, Computational Geometry: An Introduction. Springer, Heidelberg, 1985.
[65] M. S. Rahman and C. S. Iliopoulos, "Pattern matching algorithms with don't cares," in Proceedings of the 33rd Conference on Current Trends in Theory and Practice of Computer Science, 2007, pp. 116-126.
[66] N. Robertson and P. D. Seymour, "Graph Minors. XIII. The disjoint paths problem," Journal of Combinatorial Theory, Series B, vol. 63, no. 2, pp. 65-110, 1995.
[67] L. Salmela, J. Tarhio, and P. Kalsi, "Approximate Boyer-Moore string matching for small alphabets," Algorithmica, to appear.
[68] M. A. Sustik and J. S. Moore, "String searching over small alphabets," Technical Report TR-07-62, Department of Computer Sciences, University of Texas at Austin, 2007.
[69] R. Thathoo, A. Virmani, S. S. Lakshmi, N. Balakrishnan, and K. Sekar, "TVSBS: A fast exact pattern matching algorithm for biological sequences," Current Sciences, vol. 91, no. 1, pp. 47-53, 2006.
[70] T. Tholey, "Solving the 2-disjoint paths problem in nearly linear time," Theory of Computing Systems, vol. 39, pp. 55-78, 2006.
[71] D. Wagner and K. Weihe, "A linear-time algorithm for edge-disjoint paths in planar graphs," Combinatorica, vol. 15, no. 1, pp. 135-150, 1995.
[72] J.-J. Wang, "An improved algorithm for finding two length-bounded vertex-disjoint paths in planar graphs," Master's Thesis, National Tsing Hua University, 2004.
[73] P. Weiner, "Linear pattern matching algorithms," in Proceedings of the 14th Annual Symposium on Switching and Automata Theory, 1973, pp. 1-11.
[74] D. B. West, Introduction to Graph Theory, second ed. Prentice Hall, 2001.
[75] D. E. Willard, "Examining computational geometry, van Emde Boas trees, and hashing from the perspective of the fusion tree," SIAM Journal on Computing, vol. 29, no. 3, pp. 1030-1049, 2000.
 
 
 
 
第一頁 上一頁 下一頁 最後一頁 top
* *