I needed a way to normalize nearly any input given for a configuration parameter on a dashboard. The server-side code is in PHP, so I wrote a test script to experiment with preg_replace and regex. By "normalize" I mean that the resulting value must satisfy the following restrictions:
- Take any cased value and return all lower case. For example, given "ValiDVoicE", after normalizing, "validvoice" would be returned.
- Strip away all special characters and white space, keeping only letters, digits, and underscores. For example, given "@@Valid _voice", return only "valid_voice".
- Trim all nonessential white space from front and end of string. For example, given " (&& (*Paxus-Demo " return "paxus_demo".
Solution (my test script, copied below):
    $myvar = [
        "Paxus-Demo", "Paxus Demo", "paxus_demo", "Paxus Demo", "paxus-Demo",
        "paxus_ Demo", "Paxus _Demo", "*(&*& (*Paxus-Demo ", "@@Valid _voice",
        "Valid-Voice", "PortaL-Demo", "gui_demo", "Gui-Demo ", "VoiceInstance",
        "vAlid Voice", " vaLid_ _voiCe ",
    ];

    for ( $i = 0; $i < sizeof( $myvar ); ++$i ) {
        // Pass 1: turn everything that is not a letter, digit, or space into a space.
        // Pass 2: collapse each run of non-word characters into a single underscore.
        // Then lower-case the result and trim underscores from both ends.
        $modded = trim(
            strtolower(
                preg_replace(
                    array( '/[^a-zA-Z0-9 ]/i', '<\W+>' ),
                    array( ' ', '_' ),
                    $myvar[$i]
                )
            ),
            '_'
        );
        echo "modded2 = " . $modded . " [ " . $i . " ] = " . $myvar[$i] . "<br />";
    }
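To check the three requirements listed above against this expression, here is a minimal sketch. It wraps the same two-pattern `preg_replace` call in a helper; the name `normalize_param` is hypothetical (it is not in my script), and the inputs are the examples from the list.

```php
<?php
// Hypothetical helper: the name normalize_param is an assumption for
// illustration; the body is the same expression as in the test script.
function normalize_param(string $value): string
{
    $collapsed = preg_replace(
        ['/[^a-zA-Z0-9 ]/i', '<\W+>'], // pass 1: specials -> space, pass 2: runs -> "_"
        [' ', '_'],
        $value
    );

    // Lower-case, then strip the underscores left at either end.
    return trim(strtolower($collapsed), '_');
}

// Inputs and expected results taken from the requirement list.
$cases = [
    'ValiDVoicE'         => 'validvoice',
    '@@Valid _voice'     => 'valid_voice',
    ' (&& (*Paxus-Demo ' => 'paxus_demo',
];

foreach ($cases as $input => $expected) {
    $actual = normalize_param($input);
    $status = ($actual === $expected) ? 'ok  ' : 'FAIL';
    echo $status . ' ' . var_export($input, true) . ' => ' . $actual . PHP_EOL;
}
```

Each line should be marked `ok` if the expression meets the stated restrictions for that input.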
My question is not really about elegance: could I have done all of this in a single regex using lookarounds? Bear in mind that I never took the time to truly understand regex before today, so my grasp of lookarounds is still a bit wonky. That said, if someone can explain how to use lookarounds in simple terms, it'll be greatly appreciated.