/*!
A lower level API for packed multiple substring search, principally for a small
number of patterns.

This sub-module provides vectorized routines for quickly finding matches of a
small number of patterns. In general, users of this crate shouldn't need to
interface with this module directly, as the primary
[`AhoCorasick`](../struct.AhoCorasick.html)
searcher will use these routines automatically as a prefilter when applicable.
However, in some cases, callers may want to bypass the Aho-Corasick machinery
entirely and use this vectorized searcher directly.

# Overview

The primary types in this sub-module are:

* [`Searcher`](struct.Searcher.html) executes the actual search algorithm to
  report matches in a haystack.
* [`Builder`](struct.Builder.html) accumulates patterns incrementally and can
  construct a `Searcher`.
* [`Config`](struct.Config.html) permits tuning the searcher, and itself will
  produce a `Builder` (which can then be used to build a `Searcher`).
  Currently, the only tunable knob is the match semantics, but this may be
  expanded in the future.

# Examples

This example shows how to create a searcher from an iterator of patterns.
By default, leftmost-first match semantics are used. (See the top-level
[`MatchKind`](../enum.MatchKind.html) type for more details about match
semantics, which apply similarly to packed substring search.)

```
use aho_corasick::packed::Searcher;

# fn example() -> Option<()> {
let searcher = Searcher::new(["foobar", "foo"].iter().cloned())?;
let matches: Vec<usize> = searcher
    .find_iter("foobar")
    .map(|mat| mat.pattern())
    .collect();
assert_eq!(vec![0], matches);
# Some(()) }
# if cfg!(target_arch = "x86_64") {
#     example().unwrap()
# } else {
#     assert!(example().is_none());
# }
```

This example shows how to use [`Config`](struct.Config.html) to change the
match semantics to leftmost-longest:

```
use aho_corasick::packed::{Config, MatchKind};

# fn example() -> Option<()> {
let searcher = Config::new()
    .match_kind(MatchKind::LeftmostLongest)
    .builder()
    .add("foo")
    .add("foobar")
    .build()?;
let matches: Vec<usize> = searcher
    .find_iter("foobar")
    .map(|mat| mat.pattern())
    .collect();
assert_eq!(vec![1], matches);
# Some(()) }
# if cfg!(target_arch = "x86_64") {
#     example().unwrap()
# } else {
#     assert!(example().is_none());
# }
```

# Packed substring searching

Packed substring searching refers to the use of SIMD (Single Instruction,
Multiple Data) to accelerate the detection of matches in a haystack. Unlike
conventional algorithms, such as Aho-Corasick, SIMD algorithms for substring
search tend to do better with a small number of patterns, whereas Aho-Corasick
generally maintains reasonably consistent performance regardless of the number
of patterns you give it. Because of this, the vectorized searcher in this
sub-module cannot be used as a general purpose searcher, since building the
searcher may fail. However, in exchange, when searching for a small number of
patterns, searching can be quite a bit faster than Aho-Corasick (sometimes by
an order of magnitude).

The key takeaway here is that constructing a searcher from a list of patterns
is a fallible operation.
While the precise conditions under which building a
searcher can fail are an implementation detail, here are some
common reasons:

* Too many patterns were given. Typically, the limit is on the order of 100 or
  so, but this limit may fluctuate based on available CPU features.
* The available packed algorithms require CPU features that aren't available.
  For example, currently, this crate only provides packed algorithms for
  `x86_64`. Therefore, constructing a packed searcher on any other target
  (e.g., ARM) will always fail.
* Zero patterns were given, or one of the patterns given was empty. Packed
  searchers require at least one pattern and that all patterns are non-empty.
* Something else about the nature of the patterns (typically based on
  heuristics) suggests that a packed searcher would perform very poorly, so
  no searcher is built.
*/

pub use packed::api::{Builder, Config, FindIter, MatchKind, Searcher};

mod api;
mod pattern;
mod rabinkarp;
mod teddy;
#[cfg(test)]
mod tests;
#[cfg(target_arch = "x86_64")]
mod vector;