Regular expressions are very useful for validating emails, phone numbers for different countries, postal or ZIP codes and so on, and for searching strings, file names and more. Useful as they are, writing a new regular expression can feel quite difficult for a fresher like me. In fact, regex is easy to understand, and it is easy to create a new one whenever you need it. Here are some tips and tricks for cracking regex that I have used and suggested to my friends.
Let us look into some points. So what is a regular expression?
“Regular expression or regex is a sequence of symbols and characters expressing a string or pattern to be searched for within a longer piece of text.”
That’s the simple answer I got after googling it, and it sounds very simple. If I want to find the string “fox” in “the quick brown fox jumps over the lazy dog”, I can use a simple regex that matches the word “fox”.
Then how are regular expressions processed?
There is a piece of software called a regular expression engine that processes the regex. It tries to match the pattern against the given strings: it checks that the pattern is valid and then finds the strings that match. Many regular expression engines are available, and each one differs a little in how it works and in the pattern syntax it supports. Some commonly used engines and flavors are Perl, PCRE, PHP, POSIX etc.
Then let’s look at the structure of a regex. The primary attention goes to characters, meaning the symbols we use to build a regex. Commonly we use ASCII characters, including letters, numbers and special characters. Unicode is also used to match text in other languages.
Now let’s crack regular expressions. We can search for a string directly, that is, search for the exact string, like the find option of text editors and word processors.
For example, searching for “abc” finds every place where “abc” appears in the string. We can also provide a number or a special character as the search pattern.
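Here is a small sketch of that exact-match search. I am using Python's built-in re module only as an example engine, and the sample strings are my own, so treat it as an illustration rather than the only way to do it.

```python
import re

text = "the quick brown fox jumps over the lazy dog"

# A pattern with no special characters simply matches itself.
match = re.search(r"fox", text)
found = match.group(0) if match else None    # the text that matched
position = match.start() if match else -1    # index where the match begins

# findall returns every non-overlapping occurrence of the pattern.
all_abc = re.findall(r"abc", "xyzabc123abc")

print(found, position, all_abc)
```

re.search stops at the first hit, while re.findall collects all of them, which is closer to the "highlight every result" behaviour of a text editor.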
Simple, huh? Next, let’s go a little deeper and create a simple pattern. First, let’s find a pattern that matches any digit. ‘\d’ is the shorthand used to match a digit from 0 to 9. The ‘\’ distinguishes it from the plain letter ‘d’. Similarly, we can match any non-digit character with ‘\D’.
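To see ‘\d’ and ‘\D’ side by side, here is a minimal sketch in Python's re module (my choice of engine, with made-up sample strings).

```python
import re

sample = "Order 66 shipped on day 7"

# \d matches one digit at a time.
digits = re.findall(r"\d", sample)

# \D matches one character that is NOT a digit.
non_digits = re.findall(r"\D", "a1b2")

# Without the backslash, "d" is just the letter d.
plain_d = re.findall(r"d", sample)

print(digits, non_digits, plain_d)
```

The pattern strings are written as raw strings (the r prefix) so that Python passes the backslash through to the regex engine untouched.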
Got it? Then let’s move to the next important thing: the wildcard, which may sound familiar from card games. Yes, this is a character that can stand in for any other character, and it is denoted by ‘.’. That is, we can represent any digit, letter, special character or whitespace with a ‘.’ (in most engines it does not match a line break by default). As we learned earlier with ‘\’, if we want to find a literal ‘.’ in our string, we use ‘\.’.
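A quick sketch of the wildcard and the escaped dot, again assuming Python's re module and sample strings I made up for the purpose:

```python
import re

# "." stands in for any single character: letter, digit, space...
wild = re.findall(r"c.t", "cat cot c9t c t cut")

# An unescaped dot also matches the "x" here, which is usually not wanted.
loose = re.findall(r"3.14", "3.14 3x14")

# "\." matches only a real dot.
strict = re.findall(r"3\.14", "3.14 3x14")

print(wild, loose, strict)
```

Forgetting to escape the dot is one of the most common slips, because the loose pattern still matches the string you were testing with.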
Let’s move on to the next section: matching a particular list of characters. For this purpose, we use [ ] (brackets). The characters we need to find are enclosed in the [ ], written one after another with no separators. e.g. [abc] matches a, b or c. (Writing [a,b,c] would also match a comma.) Here we find another interesting symbol, ^ (hat). It’s used to exclude the characters inside the [ ]. e.g. [^abc] means that any character except a, b or c will match this regex.
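Here is a small sketch of bracket sets in Python's re module (my assumption for the engine, with my own sample strings). It also shows that the characters inside the brackets are listed without separators, since a comma inside the brackets is itself treated as a character to match.

```python
import re

# [abc] matches a single character that is a, b or c.
in_set = re.findall(r"[abc]", "a,b,d,c")

# [^abc] matches a single character that is anything except a, b or c.
not_in_set = re.findall(r"[^abc]", "abcdef")

# A comma inside the brackets becomes part of the set.
with_comma = re.findall(r"[a,b,c]", "a,b")

print(in_set, not_in_set, with_comma)
```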
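Easy, isn’t it? We can also specify a range of characters instead of a set of characters in the [ ], for example [a-z] for any lowercase letter or [0-9] for any digit. This makes our regular expression much shorter.
Tip: [A-Za-z0-9_] is commonly written as ‘\w’ (note that it includes the underscore). It can be used to check whether the entered string contains only English letters, digits and underscores, although some engines also let ‘\w’ match letters from other languages.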
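The sketch below tries out ranges and ‘\w’ in Python's re module. The engine and the sample strings are my own choices, and the behaviour of ‘\w’ on non-English letters shown here is specific to Python 3, so other engines may differ.

```python
import re

# A range gives the first and last character with a hyphen in between.
digits = re.findall(r"[0-9]", "a1b2c3")

# The explicit class and the \w shorthand accept the same ASCII input.
explicit = bool(re.fullmatch(r"[A-Za-z0-9_]+", "user_42"))
shorthand = bool(re.fullmatch(r"\w+", "user_42"))

# In Python 3, \w on text strings also accepts letters such as "é"...
unicode_ok = bool(re.fullmatch(r"\w+", "caf\u00e9"))

# ...unless the re.ASCII flag restricts it to [A-Za-z0-9_].
ascii_only = bool(re.fullmatch(r"\w+", "caf\u00e9", flags=re.ASCII))

print(digits, explicit, shorthand, unicode_ok, ascii_only)
```

So if the goal is really "English letters only", writing the range out by hand, or using the ASCII flag, is the safer choice in this engine.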
Let’s learn how to avoid repetition in a regular expression. For example, say I want to match the six zeros in 1000000 (that is, 1,000,000 written without the commas). It would be clumsy to write ‘000000’. Instead, we can use 0{6}. The { } denotes how many times the preceding character or pattern repeats.
Here we can also specify limits, meaning a minimum and a maximum count. To do that, we give the lower and upper limit of the count, like {2,6}. This means a minimum of 2 repetitions and a maximum of 6.
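To check the counting rules, here is a minimal sketch using Python's re module (my choice of engine). It tests the exact count on the number written without commas, then tries the {2,6} limits on runs of one to seven zeros.

```python
import re

# 0{6} means exactly six zeros in a row.
six = re.search(r"0{6}", "1000000")
six_zeros = six.group(0) if six else None

# 0{2,6} accepts between two and six zeros.
# Try whole strings made of 1, 2, ... 7 zeros.
within_limits = [bool(re.fullmatch(r"0{2,6}", "0" * n)) for n in range(1, 8)]

print(six_zeros, within_limits)
```

I used re.fullmatch for the second check so that the whole string has to fit the pattern. With re.search, seven zeros would still "match" because six of them fit inside the longer run.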
Is that OK? Then let’s move to another simple thing: the Kleene star and the Kleene plus. Don’t worry, it’s easy! The Kleene star is denoted by ‘*’ and the Kleene plus by ‘+’. The difference is that ‘*’ means zero or more repetitions of the preceding character or pattern, while ‘+’ means one or more.
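The difference between the two only shows up when there are zero repetitions, so here is a small sketch of that case in Python's re module (the engine and the sample strings are my own assumptions).

```python
import re

candidates = ["ac", "abc", "abbbc"]

# b* allows zero or more b's between a and c.
star = [bool(re.fullmatch(r"ab*c", s)) for s in candidates]

# b+ needs at least one b between a and c.
plus = [bool(re.fullmatch(r"ab+c", s)) for s in candidates]

print(star, plus)
```

Only "ac", the string with no b at all, is treated differently by the two patterns.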
Now let’s learn a new thing: optionality. It’s denoted by ‘?’. It makes the preceding character optional, so it matches either zero or one occurrence of it. If we want to match a literal ‘?’, we use ‘\?’.
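A classic way to try ‘?’ is a word with two spellings. The sketch below assumes Python's re module and uses sample strings of my own.

```python
import re

# u? means the "u" may appear once or not at all.
spellings = re.findall(r"colou?r", "color colour colouur")

# \? matches a real question mark.
question_marks = re.findall(r"\?", "really? yes")

print(spellings, question_marks)
```

The third word, "colouur", is not matched because ‘?’ allows at most one "u".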
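Next is “starts with” and “ends with”. This is a common kind of operation in regular expressions. “Starts with” is denoted by ‘^’ placed at the beginning of the pattern, and “ends with” by ‘$’ placed at the end. (Outside of [ ], ‘^’ is an anchor, not an exclusion.)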
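Here is a minimal sketch of both anchors on the fox sentence from earlier, assuming Python's re module as the engine:

```python
import re

text = "the quick brown fox jumps over the lazy dog"

# ^the matches only if the string starts with "the".
starts_with_the = bool(re.search(r"^the", text))

# dog$ matches only if the string ends with "dog".
ends_with_dog = bool(re.search(r"dog$", text))

# "fox" is in the middle, so anchoring it to the start fails.
starts_with_fox = bool(re.search(r"^fox", text))

print(starts_with_the, ends_with_dog, starts_with_fox)
```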
These are the things we have to understand properly to write a regex. These rules can be grouped and combined in a regex by using ( ). Try it yourself. Hope you enjoy it.
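As a starting point for trying it yourself, here is a small sketch of grouping in Python's re module. The engine, the phone-like sample and the patterns are my own assumptions, put together from the rules covered above.

```python
import re

# A quantifier after ( ) repeats the whole group, not just one character.
repeated_group = bool(re.fullmatch(r"(ab)+", "ababab"))

# Group ",000" and repeat it twice to match the number with commas.
million = bool(re.fullmatch(r"1(,0{3}){2}", "1,000,000"))

# Groups also let us pull out the parts that matched.
parts_match = re.fullmatch(r"(\d{3})-(\d{4})", "555-1234")
parts = parts_match.groups() if parts_match else ()

print(repeated_group, million, parts)
```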