Talk:Regex Posix


Just a quick note that a lot of the other-implementations-are-not-compliant examples appear to be about empty patterns (e.g., issues with the matching of () in (()|.)(b)).

If you read the linked POSIX standard, however, it seems that such empty expressions are not actually valid regexes. For example, the extended regex grammar it defines is

extended_reg_exp   :                      ERE_branch
                   | extended_reg_exp '|' ERE_branch
                   ;
ERE_branch         :            ERE_expression
                   | ERE_branch ERE_expression
                   ;
ERE_expression     : one_char_or_coll_elem_ERE
                   | '^'
                   | '$'
                   | '(' extended_reg_exp ')'
                   | ERE_expression ERE_dupl_symbol
                   ;
one_char_or_coll_elem_ERE  : ORD_CHAR
                   | QUOTED_CHAR
                   | '.'
                   | bracket_expression
                   ;
ERE_dupl_symbol    : '*'
                   | '+'
                   | '?'
                   | '{' DUP_COUNT               '}'
                   | '{' DUP_COUNT ','           '}'
                   | '{' DUP_COUNT ',' DUP_COUNT '}'
                   ;

From this I don't see how you can form (): the parentheses must contain an extended_reg_exp, which has to consist of at least one ERE_branch, which must consist of at least one ERE_expression, which in turn must contain at least one character of some sort.
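
To make that reading of the grammar concrete, here is a rough recognizer sketch in Haskell that follows the productions quoted above. It is only an illustration of the argument, not how regex-posix or any other library parses patterns. The names (isValidERE, ere, branch, expression, atom, specials) are made up for this note, and it simplifies things: bracket expressions and the {m,n} interval forms are left out, and ')' is always treated as special.

```haskell
-- Characters treated as special by this sketch (a simplification of the
-- POSIX rules for ORD_CHAR; ')' is always special here).
specials :: String
specials = "^.[$()|*+?{\\"

-- Each parser consumes a prefix of the input and returns the remainder,
-- or Nothing if the production cannot match at this position.
type Parser = String -> Maybe String

-- extended_reg_exp : ERE_branch | extended_reg_exp '|' ERE_branch
ere :: Parser
ere s = branch s >>= alternatives
  where
    alternatives ('|' : rest) = branch rest >>= alternatives
    alternatives rest         = Just rest

-- ERE_branch : ERE_expression | ERE_branch ERE_expression
-- (one or more expressions, so an empty branch is rejected)
branch :: Parser
branch s = expression s >>= more
  where
    more rest = maybe (Just rest) more (expression rest)

-- ERE_expression followed by zero or more of '*', '+', '?'
-- (the '{' DUP_COUNT '}' forms are omitted in this sketch)
expression :: Parser
expression s = atom s >>= dups
  where
    dups (c : rest) | c `elem` "*+?" = dups rest
    dups rest                        = Just rest

-- The non-recursive alternatives of ERE_expression, plus the group form.
atom :: Parser
atom ('(' : rest) = case ere rest of
  Just (')' : rest') -> Just rest'
  _                  -> Nothing
atom ('\\' : c : rest) | c `elem` specials = Just rest   -- QUOTED_CHAR
atom (c : rest) | c `elem` "^$." || c `notElem` specials = Just rest
atom _ = Nothing

-- A pattern is accepted only if the whole string is consumed.
isValidERE :: String -> Bool
isValidERE s = ere s == Just ""
```

Under this sketch, isValidERE "(a|.)(b)" is True, while isValidERE "()" and isValidERE "(()|.)(b)" are both False, because the group production needs a full extended_reg_exp between the parentheses and every branch needs at least one expression.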