i'm just a regular expression kind of guy

September 15, 2005 under Computers, Programming

Thus far in my young career, I’ve had to parse a lot of data. The data in question is usually a mish-mash of numeric and non-numeric data such as “N120 X3.243 Y0.2 Z0.1 A90 B0” (in this case, I’m talking about G-code). For most tasks, I have to write code to extract one or a few of the values, and optionally manipulate them or change them to something else. For the most part, I stuck with tokenizing and analyzing the data using whatever tools were available in the language I was using at the time. When worst came to worst, I wrote my own routines to look for and extract what I needed.

And then one day, I stumbled across regular expressions (aka regex or reg ex). The power that they offer when faced with an ugly parsing task is phenomenal. I’m still not 100% comfortable with the syntax, and it’s a tad more difficult because there are different regex engines, many with their own nuances and gotchas. The regex implementation in the .NET Framework is slightly different from JavaScript’s, which is slightly different from Python’s, and everybody wants to one-up Perl, but they’re all extremely similar on the whole. Up to this point, the regexes I’ve written have been pretty simple: I’m usually either testing for an occurrence or extracting a substring.
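To make the "extracting a substring" case concrete, here is a minimal sketch in Python (chosen only because it is easy to run; it is not the language used elsewhere in this post). The name parse_gcode_words and the pattern are assumptions made up for this example, and the sketch ignores comments, checksums and other details that real G-code dialects have.

```python
import re

# One G-code "word": a single address letter followed by a signed number.
# Assumption: no comments, checksums, or exponent notation in the block.
WORD_PATTERN = re.compile(r"([A-Za-z])\s*([-+]?(?:\d+\.?\d*|\.\d+))")


def parse_gcode_words(block):
    """Return a dict mapping each address letter to its numeric value."""
    words = {}
    for letter, number in WORD_PATTERN.findall(block):
        words[letter.upper()] = float(number)
    return words


if __name__ == "__main__":
    values = parse_gcode_words("N120 X3.243 Y0.2 Z0.1 A90 B0")
    print(values["X"], values["A"])
```

Each pair of parentheses captures one piece of the word, so a single pattern both finds the values and splits them into letter and number.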

Yesterday, I finally tried a search and replace, regex-style. I had to go through a lot of phone numbers in a format like “Phone: (xxx)xxx-xxxx” or “Tel: (xxx)xxx-xxxx” and so on. I needed to convert them to something like “xxx-xxx-xxxx”. Replacing them was easy, and here’s how I did it in VBScript:

' Purpose: Format phone numbers like xxx-xxx-xxxx
'       I: phone number combined with other text, formatted
'          like (xxx)xxx-xxxx
'       O: phone number formatted like xxx-xxx-xxxx
Function FormatPhoneNumber(strPhoneString)
    Dim objRegEx          ' WSH RegEx object
    Dim objMatches        ' Matches object
    Dim strOldPhoneNum    ' extracted phone# in old format
    Dim strNewPhoneNum    ' phone# in new format

    ' Init the return value to an empty string
    strNewPhoneNum = vbNullString
 
    ' Bail on this record if the phone number is blank
    If (strPhoneString = vbNullString) Then
        FormatPhoneNumber = strNewPhoneNum
        Exit Function
    End If
 
    ' Pull the phone number out from the format it's
    ' currently in (Ex: "Phone: (xxx)xxx-xxxx") and
    ' format it like this (Ex: "xxx-xxx-xxxx").
    Set objRegEx = New RegExp
 
    objRegEx.Pattern = "\((\d{3})\)(\d{3})-(\d{4})$"
    objRegEx.IgnoreCase = True
    objRegEx.Global = True
 
    Set objMatches = objRegEx.Execute(strPhoneString)
 
    If (objMatches.Count <> 0) Then
        For Each Match in objMatches
            strOldPhoneNum = Match.Value
        Next
 
        strNewPhoneNum = objRegEx.Replace(strOldPhoneNum, _
                                          "$1-$2-$3")
    End If
 
    ' Cleanup
    Set objRegEx = Nothing
    Set objMatches = Nothing
 
    ' Return the formatted phone number
    FormatPhoneNumber = strNewPhoneNum
End Function

The key is the unescaped parentheses in the pattern: they capture the digits of the original phone number as three groups, which I could then reuse as $1, $2 and $3 in the replacement string passed to the Replace() method. Handy indeed 🙂
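For comparison, the same capture-group idea can be sketched in Python, where the groups are referenced as \1, \2 and \3 instead of $1, $2 and $3. This is only an illustration under stated assumptions: the names format_phone_numbers and extract_phone_number are hypothetical, and the pattern drops the trailing "$" so it can match a number anywhere in the text.

```python
import re

# Three capture groups: area code, exchange, and line number.
# Assumption: unlike the VBScript pattern, this one is not anchored
# with "$", so it matches a number anywhere in the text.
PHONE_PATTERN = re.compile(r"\((\d{3})\)(\d{3})-(\d{4})")


def format_phone_numbers(text):
    """Rewrite every (xxx)xxx-xxxx in text as xxx-xxx-xxxx, in place."""
    return PHONE_PATTERN.sub(r"\1-\2-\3", text)


def extract_phone_number(text):
    """Return the first phone number as xxx-xxx-xxxx, or "" if none."""
    match = PHONE_PATTERN.search(text)
    return match.expand(r"\1-\2-\3") if match else ""


if __name__ == "__main__":
    print(format_phone_numbers("Phone: (555)123-4567"))
    print(extract_phone_number("Tel: (555)123-4567"))
```

The first function leaves the surrounding text (such as "Phone: ") alone, while the second mirrors the VBScript function above by returning only the reformatted number.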
