.net - Boyer-Moore Practical in C#? -

- July 15, 2010

boyer-moore fastest non-indexed text-search algorithm known. i'm implementing in c# black belt coder website.

i had working , showed expected performance improvements compared string.indexof(). however, when added stringcomparison.ordinal argument indexof, started outperforming boyer-moore implementation. sometimes, considerable amount.

i wonder if can me figure out why. understand why stringcomparision.ordinal might speed things up, how faster boyer-moore? because of the overhead of .net platform itself, perhaps because array indexes must validated ensure they're in range, or else altogether. algorithms not practical in c#.net?

below key code.

// base search classes abstract class searchbase {     public const int invalidindex = -1;     protected string _pattern;     public searchbase(string pattern) { _pattern = pattern; }     public abstract int search(string text, int startindex);     public int search(string text) { return search(text, 0); } }  /// <summary> /// simplified boyer-moore implementation. ///  /// note: uses single skip array, uses more memory needed , /// may not large enough. replaced multi-stage table. /// </summary> class boyermoore2 : searchbase {     private byte[] _skiparray;      public boyermoore2(string pattern)         : base(pattern)     {         // todo: replaced multi-stage table         _skiparray = new byte[0x10000];          (int = 0; < _skiparray.length; i++)             _skiparray[i] = (byte)_pattern.length;         (int = 0; < _pattern.length - 1; i++)             _skiparray[_pattern[i]] = (byte)(_pattern.length - - 1);     }      public override int search(string text, int startindex)     {         int = startindex;          // loop while there's still room search term         while (i <= (text.length - _pattern.length))         {             // if have match @ position             int j = _pattern.length - 1;             while (j >= 0 && _pattern[j] == text[i + j])                 j--;              if (j < 0)             {                 // match found                 return i;             }              // advance next comparision             += math.max(_skiparray[text[i + j]] - _pattern.length + 1 + j, 1);         }         // no match found         return invalidindex;     } }

edit: i've posted test code , conclusions on matter @ http://www.blackbeltcoder.com/articles/algorithms/fast-text-search-with-boyer-moore.

based on own tests , comments made here, i've concluded reason string.indexof() performs stringcomparision.ordinal because method calls unmanaged code employs hand-optimized assembly language.

i have run number of different tests , string.indexof() seems faster can implement using managed c# code.

if anyone's interested, i've written i've discovered , posted several variations of boyer-moore algorithm in c# @ http://www.blackbeltcoder.com/articles/algorithms/fast-text-search-with-boyer-moore.

Search This Blog

Manage

.net - Boyer-Moore Practical in C#? -

Comments

Post a Comment

Popular posts from this blog

How do .net 4.0 [named] tuples work under the hood? -

javascript - Enclosure Memory Copies -

php - Replacing tags in braces, even nested tags, with regex -