Newb question on Google Search implementation

Apr 9, 2010 at 5:22 AM

I've implemented the Google search grammar for full-text searching in SQL and noticed some inputs that throw exceptions. I'm trying to figure out the best way to fix them. I'm pretty new to this, and I think an explanation of how to solve these would help me understand how to go forward.

For example:

test+    - causes an error. For me this is an actual term that should be searchable, the same as "test+"

"test     - is there a way to just ignore an unmatched quote? What would be the way to handle this instead of just failing? Catch the parse error and try to re-process the input?

foo/bar   - this should just result in an OR clause


I think if I see how you would fix these then it may start clicking in terms of how to do more.




Apr 11, 2010 at 3:42 PM

The last one is the simplest, I think: just add "/" to the expression for the binary operator:

BinaryOp.Rule = ImpliedAnd | "and" | "&" | "-" | "or" | "|" | "/";

and add handling for it in the converter.

For the unmatched quote, you can add a double quote as a standalone term to the Expression rule. The current rule is:

    Expression.Rule = PrimaryExpression | BinaryExpression;

Change it to:

    var unmatchedQuote = ToTerm("\"");

    Expression.Rule = PrimaryExpression | BinaryExpression | unmatchedQuote;

Also add it to the MarkPunctuation(...) list so that it is eliminated after parsing. By default it is assigned the lowest priority, so if the double-quoted term fails to scan a token (because the quote is unmatched), this terminal will take it and produce a token, which is then eliminated.

For test+: you can try the following. Hook the ValidateToken event of the Term element, and in the event handler check the source stream to see whether the term is followed by "+" and nothing else. If so, add "+" to the token content and move the source stream one symbol ahead.



Jun 19, 2010 at 3:17 PM

I'm trying to implement these too.  I got the "/" as an OR operator working.  I'm having trouble with the other two.  Can you provide a little more detail?

Thanks so much for your time!

Jun 21, 2010 at 4:38 PM

1. The case with "+" as part of a term, for example: test+

The problem here is that SearchGrammar always treats "+" as an operator, whereas what is wanted is to treat it as part of the search term when there is no space in between.

Note that this MIGHT be confusing for the user; the following two expressions will have different meanings:

one+ two   //search for text with "one+" AND "two"

one + two  // search for "one" AND "two"


Now, how to do this. Note that I take back the recommendation from my previous response; disregard it.

To implement this, just add the "+" sign to the character lists in the term's declaration, in the call to the IdentifierTerminal constructor in the CreateTerm method:

          var term = new IdentifierTerminal(name,   "!@#$%^*_'.?-+", "!@#$%^*_'.?0123456789+");

Notice the "+" at the end of both lists.

I think that should do it.
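
To see why adding "+" to the term's character lists changes the result, here is a minimal standalone sketch. It is NOT Irony code: ToyScanner and its token labels are hypothetical names made up for illustration, and the rule that a run consisting only of "+" is still reported as the operator is an assumption chosen to reproduce the behavior described above.

```csharp
using System;
using System.Collections.Generic;

// Toy scanner, not Irony's IdentifierTerminal. It only illustrates how
// greedy term scanning reacts to "+" being an allowed term character.
public static class ToyScanner
{
    // Splits input into "TERM:x" and "OP:x" tokens.
    // extraTermChars lists the non-alphanumeric characters allowed in a term.
    public static List<string> Scan(string input, string extraTermChars)
    {
        var tokens = new List<string>();
        int i = 0;
        while (i < input.Length)
        {
            if (char.IsWhiteSpace(input[i])) { i++; continue; }

            // Greedily consume the longest run of term characters.
            int start = i;
            while (i < input.Length && IsTermChar(input[i], extraTermChars)) i++;

            if (i == start)
            {
                // Not a term character: emit it as a one-character operator.
                tokens.Add("OP:" + input[i]);
                i++;
                continue;
            }

            string text = input.Substring(start, i - start);
            // Assumption: a run that is exactly "+" still counts as the operator.
            tokens.Add(text == "+" ? "OP:+" : "TERM:" + text);
        }
        return tokens;
    }

    private static bool IsTermChar(char c, string extra)
    {
        return char.IsLetterOrDigit(c) || extra.IndexOf(c) >= 0;
    }
}
```

With an empty extraTermChars, Scan("test+", "") splits the input into the term "test" and the operator "+". With "+" allowed, Scan("one+ two", "+") gives the terms "one+" and "two", while Scan("one + two", "+") still sees "+" as an operator between "one" and "two".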


2. The unmatched quote might be tricky. Again, disregard my previous recipe: there is currently a bug that prevents you from setting precedence on key terms, so it would not work as described. Instead, define a lone double quote as a term (unmatchedQuote) and add it to the PrimaryExpression.Rule expression:

PrimaryExpression.Rule = unmatchedQuote | Term | ...;

Also register unmatchedQuote as punctuation with MarkPunctuation(...), so that it is eliminated after parsing.


Then set the priority of Phrase (the quoted string) to the lowest:

Phrase.Priority = Terminal.LowestPriority;

That should work then.
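
If the grammar change proves hard to get working, a different and much cruder option is to clean up the query string before it reaches the parser. The sketch below is not part of Irony or SearchGrammar; QuoteBalancer and DropUnmatchedQuote are hypothetical names, and dropping the last quote (rather than the first) is an arbitrary choice made for illustration.

```csharp
using System;

// Pre-parse cleanup, independent of Irony: if the query contains an odd
// number of double quotes, remove the last one so the rest can be parsed.
public static class QuoteBalancer
{
    public static string DropUnmatchedQuote(string query)
    {
        int count = 0;
        int last = -1;
        for (int i = 0; i < query.Length; i++)
        {
            if (query[i] == '"') { count++; last = i; }
        }

        // An even count (including zero) means every quote has a partner.
        if (count % 2 == 0) return query;

        // Odd count: treat the final quote as the unmatched one and drop it.
        return query.Remove(last, 1);
    }
}
```

For the input "test (with only the opening quote) this returns test, and a balanced phrase such as "foo bar" is returned unchanged. Unlike the grammar-based approach, this never sees the parser's view of the input, so it cannot tell which quote the user actually meant to close.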