.net - Boyer-Moore Practical in C#? -
boyer-moore fastest non-indexed text-search algorithm known. i'm implementing in c# black belt coder website.
i had working , showed expected performance improvements compared string.indexof()
. however, when added stringcomparison.ordinal
argument indexof
, started outperforming boyer-moore implementation. sometimes, considerable amount.
i wonder if can me figure out why. understand why stringcomparision.ordinal
might speed things up, how faster boyer-moore? because of the overhead of .net platform itself, perhaps because array indexes must validated ensure they're in range, or else altogether. algorithms not practical in c#.net?
below key code.
// base search classes abstract class searchbase { public const int invalidindex = -1; protected string _pattern; public searchbase(string pattern) { _pattern = pattern; } public abstract int search(string text, int startindex); public int search(string text) { return search(text, 0); } } /// <summary> /// simplified boyer-moore implementation. /// /// note: uses single skip array, uses more memory needed , /// may not large enough. replaced multi-stage table. /// </summary> class boyermoore2 : searchbase { private byte[] _skiparray; public boyermoore2(string pattern) : base(pattern) { // todo: replaced multi-stage table _skiparray = new byte[0x10000]; (int = 0; < _skiparray.length; i++) _skiparray[i] = (byte)_pattern.length; (int = 0; < _pattern.length - 1; i++) _skiparray[_pattern[i]] = (byte)(_pattern.length - - 1); } public override int search(string text, int startindex) { int = startindex; // loop while there's still room search term while (i <= (text.length - _pattern.length)) { // if have match @ position int j = _pattern.length - 1; while (j >= 0 && _pattern[j] == text[i + j]) j--; if (j < 0) { // match found return i; } // advance next comparision += math.max(_skiparray[text[i + j]] - _pattern.length + 1 + j, 1); } // no match found return invalidindex; } }
edit: i've posted test code , conclusions on matter @ http://www.blackbeltcoder.com/articles/algorithms/fast-text-search-with-boyer-moore.
based on own tests , comments made here, i've concluded reason string.indexof()
performs stringcomparision.ordinal
because method calls unmanaged code employs hand-optimized assembly language.
i have run number of different tests , string.indexof()
seems faster can implement using managed c# code.
if anyone's interested, i've written i've discovered , posted several variations of boyer-moore algorithm in c# @ http://www.blackbeltcoder.com/articles/algorithms/fast-text-search-with-boyer-moore.
Comments
Post a Comment