First off, regular expressions are great.
They are a handy, quick way to validate or parse data, and you can use them in almost every language. But of all things, I keep forgetting the neat regexes, and in fairness they are a pain to recall because the syntax is plain nutty.
This is where it ends. I am going to collect all the neat regexes in this blog as I come across them rather than rely on mother Google.
Starting with a couple of simple functions I use to validate email addresses (thanks WordPress core!) and URLs.
[source language="php"]
function is_url( $url ) {
    // Scheme, then a host with at least one dot, then an optional port and slash
    return preg_match( '/^(http|https|ftp):\/\/([A-Z0-9][A-Z0-9_-]*(?:\.[A-Z0-9][A-Z0-9_-]*)+):?(\d+)?\/?/i', $url ) === 1;
}
function is_email( $email ) {
    // Test for the minimum length the email can be
    if ( strlen( $email ) < 3 ) {
        return false;
    }
    // Test for an @ character after the first position
    if ( strpos( $email, '@', 1 ) === false ) {
        return false;
    }
    // Split out the local and domain parts
    list( $local, $domain ) = explode( '@', $email, 2 );
    // LOCAL PART
    // Test for invalid characters
    if ( ! preg_match( '/^[a-zA-Z0-9!#$%&\'*+\/=?^_`{|}~\.-]+$/', $local ) ) {
        return false;
    }
    // DOMAIN PART
    // Test for sequences of periods
    if ( preg_match( '/\.{2,}/', $domain ) ) {
        return false;
    }
    // Test for leading and trailing periods and whitespace
    if ( trim( $domain, " \t\n\r\0\x0B." ) !== $domain ) {
        return false;
    }
    // Split the domain into subs
    $subs = explode( '.', $domain );
    // Assume the domain will have at least two subs
    if ( 2 > count( $subs ) ) {
        return false;
    }
    // Loop through each sub
    foreach ( $subs as $sub ) {
        // Test for leading and trailing hyphens and whitespace
        if ( trim( $sub, " \t\n\r\0\x0B-" ) !== $sub ) {
            return false;
        }
        // Test for invalid characters
        if ( ! preg_match( '/^[a-z0-9-]+$/i', $sub ) ) {
            return false;
        }
    }
    // Congratulations, your email made it!
    return true;
}
[/source]
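The URL pattern above already contains capture groups for the scheme, host and port, so the same regex can parse as well as validate. Here is a minimal sketch of that idea. The helper name `parse_url_parts()` and the shape of the returned array are my own assumptions for illustration, not something from WordPress core.

```php
<?php
// Hypothetical helper: reuses the is_url() pattern but keeps the captured groups.
function parse_url_parts( $url ) {
    $pattern = '/^(http|https|ftp):\/\/([A-Z0-9][A-Z0-9_-]*(?:\.[A-Z0-9][A-Z0-9_-]*)+):?(\d+)?\/?/i';
    if ( ! preg_match( $pattern, $url, $matches ) ) {
        return false;
    }
    return array(
        'scheme' => $matches[1],
        'host'   => $matches[2],
        // Group 3 is missing from $matches when no port was given
        'port'   => ( isset( $matches[3] ) && $matches[3] !== '' ) ? (int) $matches[3] : null,
    );
}

var_dump( parse_url_parts( 'http://example.com:8080/path' ) ); // scheme, host and port 8080
var_dump( parse_url_parts( 'example.com' ) );                  // bool(false), no scheme
```

One thing to keep in mind: the pattern insists on at least one dot in the host, so something like `http://localhost` is rejected.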
Lastly, and most recently, I wanted to parse a string of code to find the value assigned to a variable.
I came across a neat regex, built around a named capture group, that pulls those values out.
[source language="php"]
$code = "var string_variable = Superduper;
var digit_variable = 123456;";

function get_assignment_value( $needle, $haystack, $type = 'string' ) {
    // Digits only for the 'digit' type, any word characters otherwise
    $value_pattern = ( $type == 'digit' ) ? '\d+' : '\w+';
    // preg_quote() makes any regex metacharacters in the name match literally.
    // The leading . requires one character before the name (the space after "var" here).
    $pattern = '/.' . preg_quote( $needle, '/' ) . ' = (?P<value>' . $value_pattern . ')/';
    // isset() rather than empty(), so a value of 0 is not mistaken for a failed match
    if ( ! preg_match( $pattern, $haystack, $matches ) || ! isset( $matches['value'] ) ) {
        return false;
    }
    if ( $type == 'digit' ) {
        return (int) $matches['value'];
    }
    return $matches['value'];
}
var_dump( get_assignment_value( 'string_variable', $code ) ); // string(10) "Superduper"
var_dump( get_assignment_value( 'digit_variable', $code, 'digit' ) ); // int(123456)
var_dump( get_assignment_value( 'digit_variable', $code ) ); // string(6) "123456"
[/source]
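The same named-group trick can also grab every assignment in one pass. This is only a sketch: it assumes each line looks exactly like `var name = value;` as in the sample string, and the `name` group and the `$assignments` array are names I made up for the example.

```php
<?php
$code = "var string_variable = Superduper;
var digit_variable = 123456;";

// (?P<name>...) and (?P<value>...) are named capture groups.
// PREG_SET_ORDER gives one array per match instead of one array per group.
preg_match_all( '/var (?P<name>\w+) = (?P<value>\w+);/', $code, $matches, PREG_SET_ORDER );

$assignments = array();
foreach ( $matches as $match ) {
    $assignments[ $match['name'] ] = $match['value'];
}

var_dump( $assignments ); // both variables, values still as strings
```

Every value comes back as a string here, so cast to `(int)` where a number is expected.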
My main source of guidance on this voodoo is here, along with their sometimes hard-to-find but useful reference.
Also, here is a good starter on building a regex.
![Wait, forgot to escape a space. Wheeeeee[taptaptap]eeeeee. I know regular expressions](https://i0.wp.com/imgs.xkcd.com/comics/regular_expressions.png?resize=600%2C607)
