Struggling to recognise a date literal

Oct 18, 2011 at 9:21 AM
Edited Oct 18, 2011 at 3:54 PM

Hi

My grammar successfully parses the expressions

1/1/1970 and '1/1/1970'

as <dateExpression>

1) When I try to parse

mydata <= 01/01/1971

I would like it to be recognised as a binary expression of <string, binaryoperator, datetime>. My grammar is parsing it as <string, binaryoperator, binaryexpression>, evidently recognising the "/" as a binary operator before recognising the text as a date. Here is my grammar:

'Terminals
        Dim number = New NumberLiteral("number")
        Dim identifier = New IdentifierTerminal("identifier")
        Dim quotedLiteral = New StringLiteral("quoted", "'", StringOptions.AllowsDoubledQuote)
        Dim dateLiteral = New DateTimeLiteral("datetime")

        Dim comma = ToTerm(",", "comma")
        Dim dataitem = New RegexBasedTerminal("dataitem", "M[1-9][0-9]{0,2}E[1-9][0-9]{0,2}I[1-9][0-9]{0,2}\b")
        Dim [NOT] = ToTerm("NOT")
        Dim [NULL] = ToTerm("NULL")

        'Non-terminals
        Dim rootExpression = New NonTerminal("root")
        Dim expression = New NonTerminal("expression")
        Dim expressionList = New NonTerminal("expressionList", GetType(ListExpressionNode))
        Dim term = New NonTerminal("term")
        Dim literalExpression = New NonTerminal("literalExpression", GetType(LiteralExpressionNode))
        Dim dateExpression = New NonTerminal("dateExpression", GetType(DateExpressionNode))
        Dim unaryExpression = New NonTerminal("unaryExpression", GetType(UnaryExpressionNode))
        Dim binaryExpression = New NonTerminal("binaryExpression", GetType(BinaryExpressionNode))
        Dim booleanExpression = New NonTerminal("booleanExpression", GetType(BooleanExpressionNode))
        Dim bracketedExpression = New NonTerminal("bracketedExpression", GetType(BracketedExpressionNode))
        Dim functionExpression = New NonTerminal("functionExpression", GetType(FunctionExpressionNode))
        Dim unaryOperator = New NonTerminal("unaryOperator", GetType(UnaryOperatorNode))
        Dim binaryOperator = New NonTerminal("binaryOperator", GetType(BinaryOperatorNode))
        Dim booleanOperator = New NonTerminal("booleanOperator", GetType(BooleanOperatorNode))
        Dim dataitemExpression = New NonTerminal("dataitemExpression", GetType(DataItemExpressionNode))
        Dim quotedExpression = New NonTerminal("quotedExpression", GetType(QuotedExpressionNode))
        Dim functionNameExpression = New NonTerminal("functionNameExpression", GetType(FunctionNameExpressionNode))
        Dim equalityExpression = New NonTerminal("equalityExpression", GetType(EqualityExpressionNode))
        Dim equalityOperator = New NonTerminal("equalityOperator", GetType(EqualityOperatorNode))
        Dim nullOperator = New NonTerminal("nullOperator", GetType(NullOperatorNode))
        Dim isOperator = New NonTerminal("isOperator", GetType(IsOperatorNode))
        Dim isExpression = New NonTerminal("isExpression", GetType(IsExpressionNode))

        'BNF Rules
        rootExpression.Rule = expressionList
        expressionList.Rule = MakePlusRule(expressionList, comma, expression)
        expression.Rule = term Or
                            binaryExpression Or
                            booleanExpression Or
                            equalityExpression Or
                            unaryExpression Or
                            isExpression
        bracketedExpression.Rule = "(" + expressionList + ")"
        term.Rule = dataitemExpression Or
                    literalExpression Or
                    dateExpression Or
                    quotedExpression Or
                    functionExpression Or
                    bracketedExpression

        literalExpression.Rule = number Or identifier
        binaryOperator.Rule = ToTerm("+") Or "-" Or "*" Or "/" Or "\" Or "^"
        booleanOperator.Rule = ToTerm("AND") Or "OR"
        equalityOperator.Rule = ToTerm("=") Or "!=" Or "<>" Or
                                    ">=" Or "<=" Or ">" Or "!>" Or "<" Or "!<" Or
                                    "LIKE" Or [NOT] + "LIKE" Or "IN" Or [NOT] + "IN"
        binaryExpression.Rule = expression + binaryOperator + expression
        booleanExpression.Rule = expression + booleanOperator + expression
        equalityExpression.Rule = expression + equalityOperator + expression
        functionExpression.Rule = functionNameExpression + "(" + expressionList + ")"
        dataitemExpression.Rule = "[" + dataitem + "]" Or dataitem
        dateExpression.Rule = dateLiteral
        quotedExpression.Rule = quotedLiteral
        functionNameExpression.Rule = GetFunctionNames()   'identifier
        unaryOperator.Rule = ToTerm("+") Or "-" Or [NOT]
        unaryExpression.Rule = unaryOperator + expression
        nullOperator.Rule = [NULL]
        isOperator.Rule = ToTerm("IS") Or "IS" + [NOT]
        isExpression.Rule = expression + isOperator + nullOperator

        'Terminal priority
        identifier.Priority = 10
        dataitem.Priority = 20
        quotedLiteral.Priority = 30
        dateLiteral.Priority = 40

        'Operator precedence            
        RegisterOperators(10, "*", "/", "\", "%")
        RegisterOperators(9, "+", "-")
        RegisterOperators(8, "=", ">", "<", ">=", "<=", "<>", "!=", "!<", "!>")
        RegisterOperators(7, "^", "&", "|")
        RegisterOperators(6, "NOT", "IS")
        RegisterOperators(5, "AND")
        RegisterOperators(4, "OR", "LIKE", "IN")

        MarkPunctuation("(", ")", ".")
        MarkTransient(term, expression, rootExpression)

        Me.Root = rootExpression
        Me.LanguageFlags = LanguageFlags.CreateAst

And here is the DateTimeLiteral:

Public Class DateTimeLiteral
        Inherits Terminal

        Public Sub New(name As String)
            MyBase.New(name)
        End Sub

        Public Overrides Function TryMatch(context As Irony.Parsing.ParsingContext, source As Irony.Parsing.ISourceStream) As Irony.Parsing.Token

            If Not IsValidDate(StripQuotes(source.Text)) Then Return Nothing

            source.PreviewPosition += source.Text.Length

            Return source.CreateToken(Me.OutputTerminal)

        End Function

        Protected Overrides Sub InvokeValidateToken(context As Irony.Parsing.ParsingContext)        

            Dim dateValue = StripQuotes(Convert.ToString(context.CurrentToken.Value))
            Dim result As DateTime

            If IsValidDate(dateValue, result) Then
                context.CurrentToken.Value = Convert.ToString(result).Substring(0, dateValue.Length)
            Else
                context.CurrentToken = context.Source.CreateErrorToken("{0} is not a valid date-time value", dateValue)
            End If

        End Sub

        Private Overloads Function IsValidDate(ByVal value As String) As Boolean
            Return IsValidDate(value, Nothing)
        End Function

        Private Overloads Function IsValidDate(ByVal value As String, ByRef result As DateTime) As Boolean
            Return DateTime.TryParse(value, result)
        End Function

        Private Function StripQuotes(ByVal value As String) As String
            Return value.Replace("'", String.Empty)
        End Function

    End Class

2) If I try and parse

myfield = '1/1/1971'

it identifies it as a binary expression of <string, binaryoperator, quotedExpression>.

I am struggling to understand why, since when the date element is parsed on its own it is successfully recognised as a <dateExpression>.

Any pointers would be gratefully received.

Many thx again

Simon

Coordinator
Oct 18, 2011 at 4:37 PM

You don't need a custom terminal; use QuotedValueLiteral instead, with TypeCode = DateTime. You may also need to set DateTimeFormat properly.

On another subject: literalExpression is not valid. Get rid of it and merge number and identifier into term.
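
A minimal sketch of both suggestions, assuming the Irony.Parsing API of that period. The grammar is a hypothetical trimmed-down one (`QuotedDateGrammar` and its three operators are illustrative, not the poster's full grammar), and the `d/M/yyyy` format is an assumption about the expected input. As far as I recall, the value is converted with the exact format given in DateTimeFormat, so it has to agree with what users type. The poster's grammar also has a StringLiteral starting with the same quote character, so the two terminals would compete; that interaction is not covered here.

```vb
Imports System
Imports Irony.Parsing

' Hypothetical, trimmed-down grammar; not the poster's full grammar.
Public Class QuotedDateGrammar
    Inherits Grammar

    Public Sub New()
        Dim number = New NumberLiteral("number")
        Dim identifier = New IdentifierTerminal("identifier")

        ' Built-in literal for a value wrapped in a start/end symbol,
        ' converted to DateTime by the terminal itself.
        Dim dateLiteral = New QuotedValueLiteral("datetime", "'", TypeCode.DateTime)
        ' Assumption: day/month/year input such as '1/1/1971'.
        dateLiteral.DateTimeFormat = "d/M/yyyy"

        Dim term = New NonTerminal("term")
        Dim equalityOperator = New NonTerminal("equalityOperator")
        Dim equalityExpression = New NonTerminal("equalityExpression")

        ' number and identifier sit directly in term; no literalExpression wrapper.
        term.Rule = number Or identifier Or dateLiteral
        equalityOperator.Rule = ToTerm("=") Or "<=" Or ">="
        equalityExpression.Rule = term + equalityOperator + term

        Me.Root = equalityExpression
    End Sub
End Class
```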

Oct 19, 2011 at 6:27 AM

OK. Thanks very much.

The only problem with using QuotedValueLiteral is that it converts 1/1/1971 to 1/1/1971 00:00:00, and I was hoping to retain the format the user had entered while still being able to identify the text as a date. Having said that, I now wonder whether it would be better to sanitize the input so that dates must be enclosed in single quotes, or not to bother identifying dates at all and just identify quoted values, which can then be handled as dates or something else when processing the AST.

Wrt literalExpression, I am probably doing it wrong, but I have a custom AST node (admittedly not doing very much) that maps onto it, which I use elsewhere when processing the AST. Should I not be doing that, or is it perhaps unnecessary, i.e. should I be dealing with a different object in the AST?

Coordinator
Oct 19, 2011 at 4:32 PM

About retaining original text - it is still there, inside Token.Text. 
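
To illustrate the point about Token.Text, here is a small sketch that prints both the raw text and the converted value for tokens of one terminal. `TokenTextDemo` and `DumpTokens` are hypothetical helper names; the grammar passed in is whatever grammar defines a terminal with the given name.

```vb
Imports System
Imports Irony.Parsing

Public Module TokenTextDemo
    ' Prints the raw text and the converted value of every token
    ' produced by the terminal called terminalName.
    Public Sub DumpTokens(grammar As Grammar, input As String, terminalName As String)
        Dim parser = New Parser(grammar)
        Dim tree As ParseTree = parser.Parse(input)

        For Each tok As Token In tree.Tokens
            If tok.Terminal.Name = terminalName Then
                ' Text is what the user typed; Value is what the terminal converted it to.
                Console.WriteLine("{0} -> {1}", tok.Text, tok.Value)
            End If
        Next
    End Sub
End Module
```

An AST node can read the same two properties from its token, so the user's original formatting does not have to be reconstructed from the DateTime.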

About when to actually parse dates: it's up to you. Both approaches would work well: either doing it immediately at parsing/scanning (with QuotedValueLiteral), or later at the AST analysis stage. There was an earlier discussion thread about parsing dates, where the problem was allowing users to enter dates in localized formats. Find it; it might be interesting for you.

About literalExpression - hard to say, but it looks like you are not doing it correctly. LiteralValueNode is supposed to be attached to Literals (terminals like Number or String), not to NonTerminals.

Roman

Oct 19, 2011 at 5:12 PM
Edited Oct 19, 2011 at 5:24 PM

Ok. Thx very much again, Roman

Although I was thinking of sanitizing the data before parsing, you seem to suggest that it can be parsed without sanitization. So can I just confirm again: to parse

a = 1/1/1971

so that 1/1/1971 is recognised as a date, surely I shouldn't be using QuotedValueLiteral, given that there is no quote? I presumably need to structure the grammar in a particular way. I'm struggling to see how you would do that such that 1/1/1971 is parsed as a date before it is recognised as a binary expression.

Sorry if I'm a bit slow on the uptake!

Thx again

Simon

Coordinator
Oct 19, 2011 at 6:46 PM

You can try FixedLengthLiteral, or create a custom terminal. In both cases you should set a higher Priority value on the terminal, so that it gets to scan the input first, before the Number literal.
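
A hedged sketch of the custom-terminal route follows. It differs from the DateTimeLiteral in the first post in one respect: as far as I can tell, source.Text holds the whole input rather than the text remaining at the scan position, so validating source.Text only succeeds when the date is the entire input. The version below matches at source.PreviewPosition and consumes only the matched characters. `UnquotedDateLiteral`, the date pattern, and the priority value 40 are assumptions for illustration.

```vb
Imports System
Imports System.Text.RegularExpressions
Imports Irony.Parsing

Public Class UnquotedDateLiteral
    Inherits Terminal

    ' Assumption: unquoted dates look like 1/1/1971 or 01/01/1971.
    ' \G anchors the match at the start position passed to Match.
    Private Shared ReadOnly DatePattern As New Regex("\G\d{1,2}/\d{1,2}/\d{4}(?!\d)")

    Public Sub New(name As String)
        MyBase.New(name)
        ' Higher than the other terminals so this one scans before NumberLiteral.
        Me.Priority = 40
    End Sub

    Public Overrides Function TryMatch(context As ParsingContext, source As ISourceStream) As Token
        ' Match only at the current scan position, not against the whole input.
        Dim m = DatePattern.Match(source.Text, source.PreviewPosition)
        If Not m.Success OrElse m.Index <> source.PreviewPosition Then Return Nothing

        ' Reject text that has the right shape but is not a real date.
        Dim parsed As DateTime
        If Not DateTime.TryParse(m.Value, parsed) Then Return Nothing

        ' Consume just the matched characters and attach the parsed value.
        source.PreviewPosition += m.Length
        Return source.CreateToken(Me.OutputTerminal, parsed)
    End Function
End Class
```

Returning Nothing lets the scanner fall through to lower-priority terminals, so 1/2 with no year would still scan as number, operator, number.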

Oct 20, 2011 at 7:47 AM

Thx once again

Apr 27, 2012 at 3:24 AM
Edited Apr 27, 2012 at 3:45 AM

I have an almost identical requirement: I need 2012-01-23T09:00:00-08:00 (without quotes) to be recognized as a datetime literal.

I have tried FixedLengthLiteral and also looked at creating a custom terminal, but failed to make either work because of my limited parsing knowledge.

Is there a sample around that can accomplish this?

Thanks very much.
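
One possible starting point for the ISO 8601 case, offered as an untested sketch rather than a confirmed answer: a RegexBasedTerminal (the same terminal type the first post uses for dataitem) given a higher Priority than the number literal. `IsoDateGrammar`, `ToTimestamp`, the pattern, and the priority value are all assumptions; the pattern requires seconds plus either Z or an explicit offset.

```vb
Imports System
Imports System.Globalization
Imports Irony.Parsing

' Hypothetical minimal grammar: term = term, where a term may be a timestamp.
Public Class IsoDateGrammar
    Inherits Grammar

    Public Sub New()
        Dim number = New NumberLiteral("number")
        Dim identifier = New IdentifierTerminal("identifier")

        ' Assumption: timestamps always carry seconds and Z or a +hh:mm / -hh:mm offset.
        Dim isoDate = New RegexBasedTerminal("datetime",
            "\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(Z|[+-]\d{2}:\d{2})")
        ' Must outrank NumberLiteral, which would otherwise take the leading "2012".
        isoDate.Priority = 40

        Dim term = New NonTerminal("term")
        Dim comparison = New NonTerminal("comparison")
        term.Rule = number Or identifier Or isoDate
        comparison.Rule = term + ToTerm("=") + term

        Me.Root = comparison
    End Sub

    ' Converts the raw token text when walking the tree.
    Public Shared Function ToTimestamp(tok As Token) As DateTimeOffset
        Return DateTimeOffset.Parse(tok.Text, CultureInfo.InvariantCulture)
    End Function
End Class
```

Keeping the conversion in a separate function leaves the scanner to do shape matching only, which fits the earlier advice that dates can be parsed either at scan time or later during AST analysis.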