
Regular Expressions: Escaping Meta Characters within Character Classes

Many of us have written applications that use regular expressions for tasks like validation, parsing and other related work. Regular expressions are quite a powerful tool and are available in most of the programming languages used today, either natively or through libraries.

Usually in a regular expression ‘\’ is used as the escape character to escape meta characters (it may differ between languages; I am using the C# convention). Regular expressions also support a construct called ‘character classes’, which can be roughly taken as a set of characters. There are some predefined character classes like \d, \w and \s, and if you need a more customized version you can define your own using the square bracket notation ‘[]’.
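As a small illustration of predefined versus custom classes, here is a sketch in Python rather than C# (the choice of language and the sample text are my own assumptions, not part of the original post). Note that \d and [0-9] only agree on ASCII input: in both Python and .NET, \d can also match other Unicode digits by default.

```python
import re

# Hypothetical sample text, used only for illustration.
text = "Room 42, floor 7b"

# Predefined class: \d matches a digit.
digits_predefined = re.findall(r"\d", text)

# Custom class: the same set spelled out as a range in square brackets.
digits_custom = re.findall(r"[0-9]", text)

# Custom class with no predefined shorthand: lowercase vowels.
vowels = re.findall(r"[aeiou]", text)

print(digits_predefined)
print(digits_custom)
print(vowels)
```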

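The world inside the square brackets is quite different from the one outside. Inside the brackets only a few characters are special: the caret ‘^’ (negation, when it comes first), the hyphen ‘-’ (ranges) and the closing bracket ‘]’ (which ends the class). Even an opening bracket ‘[’, an asterisk ‘*’ or a plus sign ‘+’ is normally treated as a literal inside []. Furthermore, in POSIX bracket expressions there is no escape character inside the brackets at all. (In C#/.NET and other Perl-style flavors the backslash still works there, so \^, \- and \] are valid, but the placement trick below works in those flavors too.) So what do you do if you want ‘^’, ‘-’ or ‘]’ inside a character class without an escape?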

Well, these characters are made literal by placing them at particular positions within the brackets:

One key syntactic difference is that the backslash is NOT a metacharacter in a POSIX bracket expression. So in POSIX, the regular expression [\d] matches a \ or a d. To match a ], put it as the first character after the opening [ or the negating ^. To match a -, put it right before the closing ]. To match a ^, put it before the final literal - or the closing ]. Put together, []\d^-] matches ], \, d, ^ or -.

Source: http://www.regular-expressions.info/posixbrackets.html
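The placement rules quoted above can be tried out with a minimal sketch. It uses Python's re module, which is my assumption for illustration (the post itself follows C# conventions). Like .NET, Python is a Perl-style flavor in which the backslash remains an escape inside brackets, so the POSIX set []\d^-] has to be written with a doubled backslash to mean the same thing. The helper name matched_chars and the candidate list are hypothetical.

```python
import re

def matched_chars(pattern, candidates):
    """Return the candidates that the one-character pattern matches in full."""
    return [c for c in candidates if pattern.fullmatch(c)]

candidates = ["]", "\\", "d", "^", "-", "7", "["]

# Placement only: ']' comes first, '^' is not first, '-' comes last.
placement = re.compile(r"[]^-]")

# Same meaning as POSIX []\d^-]: the backslash is doubled so it is a literal.
posix_like = re.compile(r"[]\\d^-]")

# Same text as the POSIX example, but here \d is the digit class,
# so this set is: ']', any digit, '^', '-'.
perl_style = re.compile(r"[]\d^-]")

placement_hits = matched_chars(placement, candidates)    # ], ^ and -
posix_like_hits = matched_chars(posix_like, candidates)  # ], \, d, ^ and -
perl_style_hits = matched_chars(perl_style, candidates)  # ], ^, - and 7

print(placement_hits)
print(posix_like_hits)
print(perl_style_hits)
```

The last pattern is the one to watch out for: copying a POSIX bracket expression verbatim into a Perl-style engine can silently change which characters it matches.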

Problem solved. 😀

By the way, this website gives a good introduction to regular expressions.

regards

Faraz
