Understanding regex basics
A regex can be used to look for literal text, text patterns, or a combination of both. To find a match with a PCRE, use the PHP function preg_match(), which normally takes two arguments: a string containing the pattern being searched for and the string to be searched. Although you can pass the pattern directly to the function, it’s often more convenient to store it first in a variable.
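As a rough sketch of the basic call (the sample strings are made up for illustration), preg_match() returns 1 when the pattern is found, 0 when it is not, and false if an error occurs:

```php
<?php
// Store the pattern in a variable first, then pass it to preg_match().
$pattern = '/cat/';                               // literal text to look for

$found   = preg_match($pattern, 'concatenate');   // 1: "cat" occurs in the string
$missing = preg_match($pattern, 'dog');           // 0: no match

// Compare with === 1 so that an error (false) is not mistaken for "no match".
if ($found === 1) {
    echo "Pattern found\n";
}
```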


PCREs are always enclosed in a pair of delimiters. Traditionally, forward slashes are used (for example, /pattern/), but you can use any character that is not alphanumeric, a backslash, or whitespace. You can also use a matching pair of parentheses (()), curly braces ({}), square brackets ([]), or angle brackets (<>) as delimiters. This can be useful when the pattern includes slashes, for example, when you’re searching for a path name or URL, because the slashes then don't need to be escaped.
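To illustrate why an alternative delimiter helps, here is a small sketch that matches the same URL prefix three ways. The URL and variable names are invented for the example:

```php
<?php
$url = 'https://example.com/docs/index.php';   // made-up URL for illustration

// With slash delimiters, every slash inside the pattern must be escaped.
$withSlashes = '/^https:\/\/example\.com\/docs\//';

// With # or a pair of curly braces as delimiters, the slashes can stay as they are.
$withHashes = '#^https://example\.com/docs/#';
$withBraces = '{^https://example\.com/docs/}';

$a = preg_match($withSlashes, $url);
$b = preg_match($withHashes, $url);
$c = preg_match($withBraces, $url);
// All three patterns are equivalent, so $a, $b and $c are all 1.
```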

Placing an i after the final delimiter, as in the email example below, makes the search case insensitive.
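A quick sketch of the difference the i modifier makes (the sample string is only an illustration):

```php
<?php
$subject = 'Hello World';

// Without the modifier, the lowercase pattern does not match the capital "H".
$sensitive = preg_match('/hello/', $subject);     // 0

// With i after the closing delimiter, case is ignored.
$insensitive = preg_match('/hello/i', $subject);  // 1
```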

This is how you use the regex to check whether an email address is likely to be genuine (note that the pattern is enclosed in quotes, as it must be a string):

Quote:
$email = 'user@example.com'; // placeholder address for illustration
$pattern = '/^\w[-.\w]*@([-a-z0-9]+\.)+[a-z]{2,4}$/i';
$ok = preg_match($pattern, $email);
if (!$ok) {
    // code to handle spurious address
}
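To get a feel for what this pattern accepts and rejects, you could wrap it in a small helper and try a few inputs. This is only a sketch: looksLikeEmail() is a hypothetical function name, not part of PHP, and the addresses are made up. Note that the {2,4} limit means a longer top-level domain such as .museum is rejected, which is one reason the pattern only shows that an address is *likely* to be genuine.

```php
<?php
// Hypothetical helper (not a PHP built-in) wrapping the pattern from the post.
function looksLikeEmail(string $email): bool
{
    $pattern = '/^\w[-.\w]*@([-a-z0-9]+\.)+[a-z]{2,4}$/i';
    // preg_match() returns 1 on a match, 0 on no match, and false on error.
    return preg_match($pattern, $email) === 1;
}

$results = [
    'user@example.com'              => looksLikeEmail('user@example.com'),              // true
    'first.last@mail.example.co.uk' => looksLikeEmail('first.last@mail.example.co.uk'), // true
    'no-at-sign.example.com'        => looksLikeEmail('no-at-sign.example.com'),        // false: no @
    'user@example'                  => looksLikeEmail('user@example'),                  // false: no dot in domain
    'user@example.museum'           => looksLikeEmail('user@example.museum'),           // false: ending longer than 4 letters
];

var_dump($results);
```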

Setting the limits of the pattern

If you look at the email regex, you will notice that the first character after the opening delimiter is ^, and the final character before the closing delimiter is $. These are known as anchors:

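^ as the first character in a regex indicates the following pattern must come at the beginning of the string (or at the beginning of a line, if the m modifier is used).
$ as the final character in a regex indicates the preceding pattern must come at the end of the string, or just before a final newline (or at the end of a line, if the m modifier is used).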

Quote:
$pattern = '/^ment/'; // must come at start
preg_match($pattern, 'mentor'); // match
preg_match($pattern, 'mentality'); // match
preg_match($pattern, 'cement'); // no match
$pattern = '/ment$/'; // now must come at end
preg_match($pattern, 'mentor'); // no match
preg_match($pattern, 'cement'); // match

By using both anchors in the email regex, you are specifying that the whole string must match the pattern. Consequently, the regex will match an email address on its own, as you would expect it to be entered in a form. It will not match the same address in the middle of a sentence. This degree of precision is often very important when validating user input.
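The effect of the two anchors can be seen by comparing the email regex with a copy that has them removed. This is a sketch with made-up sample strings; the unanchored variant is included only for contrast and is not suggested for validation:

```php
<?php
$anchored   = '/^\w[-.\w]*@([-a-z0-9]+\.)+[a-z]{2,4}$/i';
$unanchored = '/\w[-.\w]*@([-a-z0-9]+\.)+[a-z]{2,4}/i';   // same pattern without ^ and $

$alone    = 'user@example.com';                           // placeholder address
$sentence = 'Please write to user@example.com today.';

$aloneOk      = preg_match($anchored, $alone);      // 1: the whole string is an address
$sentenceOk   = preg_match($anchored, $sentence);   // 0: extra text before and after
$sentenceFind = preg_match($unanchored, $sentence, $matches);
// Without anchors the address is found inside the sentence:
// $sentenceFind is 1 and $matches[0] is 'user@example.com'.
```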