Sunday, March 8, 2009

Regular Expressions

The create table statement and all the code to produce this report can be found here. Scroll down to the end to find the attachment: RegExp.Sql

This is the data the queries run against:
Select * From Fam;

NAME PHONE DSCR
----- --------------- --------------------------------------------------------------------------------
Me 555-123-0987 Fun-loving aunt with a lot of nieces and nephews. 111-123-4567
Don 555-234-2323 alternate phone (555)234-2222. Father Father of mostly-grown kids.
Craig 555-888-8888 at-home {nuts} {nuts} +45 8 676 49 86
Glenn 555-232-8888 (123) 242-2424
Dee 232-388-3333 -farmer 787.222.9834


Description to Match: 3 of anything, a dash, then 4 of anything

NAME REGEXP_SUBSTR(DSCR,'...-....')
----- ------------------------------------------------------------------------------------------------
Me Fun-lovi
Don 234-2222
Glenn 242-2424

Description to Match: 3 of anything, a dash, then 4 of anything

NAME REGEXP_SUBSTR(DSCR,'.{3}-.{4}')
----- ------------------------------------------------------------------------------------------------
Me Fun-lovi
Don 234-2222
Glenn 242-2424

BRACKET EXPRESSIONS

Description to Match: 3#s, -, 4

NAME REGEXP_SUBSTR(DSCR,'[12345]{3}-.{4}')
----- ------------------------------------------------------------------------------------------------
Me 111-123-
Don 234-2222
Glenn 242-2424

Description to Match: 3 numbers(1-5), -, 4

NAME REGEXP_SUBSTR(DSCR,'[1-5]{3}-.{4}')
----- ------------------------------------------------------------------------------------------------
Me 111-123-
Don 234-2222
Glenn 242-2424

Description to Match: 3 numbers,-,4

NAME REGEXP_SUBSTR(DSCR,'[[:digit:]]{3}-.{4}')
----- ------------------------------------------------------------------------------------------------
Me 111-123-
Don 234-2222
Glenn 242-2424

Description to Match: Any 3 characters except for numbers,-,4

NAME REGEXP_SUBSTR(DSCR,'[^[:digit:]]{3}-.{4}')
----- ------------------------------------------------------------------------------------------------
Me Fun-lovi
Don tly-grow

Description to Match: 3 numbers, A-G or X-z or 'n' or 'u' ,a dash then 4 more characters.
Don's data doesn't show because he has an earlier digit value


NAME REGEXP_SUBSTR(DSCR,'[[:digit:]A-GX-Znu]{3}-.{4}')
----- ------------------------------------------------------------------------------------------------
Me Fun-lovi
Don 234-2222
Glenn 242-2424

Description to Match: 3 numbers, a literal period

NAME REGEXP_SUBSTR(DSCR,'[[:digit:]]{3}\.')
----- ------------------------------------------------------------------------------------------------
Don 222.
Dee 787.

SUBEXPRESSIONS

(find a "+" then 1-4 sections of 1-3 digits followed by a space. 
The last grouping will be 1 or more numbers)
i.e., ([0-9]{1,3} ) means 1-3 numbers followed by a space

Description to Match: International phone #


NAME REGEXP_SUBSTR(DSCR,'\+([0-9]{1,3}){1,4}([0-9]+)')
----- ------------------------------------------------------------------------------------------------
Craig +45 8 676 49 86

ALTERNATION

(use '|')  i.e., (-|\.)

Description to Match: 3 numbers, - or ., then 4 of anything


NAME REGEXP_SUBSTR(DSCR,'[[:digit:]]{3}(-|\.).{4}')
----- ------------------------------------------------------------------------------------------------
Me 111-123-
Don 234-2222
Glenn 242-2424
Dee 787.222.

Description to Match: 3 numbers, - or ., then 4 of anything

NAME REGEXP_SUBSTR(DSCR,'[[:digit:]]{3}[.-].{4}')
----- ------------------------------------------------------------------------------------------------
Me 111-123-
Don 234-2222
Glenn 242-2424
Dee 787.222.

Description to Match: Phone numbers with dashes, periods or parens

NAME REGEXP_SUBSTR(DSCR,'([0-9]{3}[.-]|\([0-9]{3}\)?)'||'[0-9]{3}[.-][0-9]{4}')
----- ------------------------------------------------------------------------------------------------
Me 111-123-4567
Don (555)234-2222
Glenn (123) 242-2424
Dee 787.222.9834

GREEDINESS

Description to Match: Get a word ending in a space (gets all it can find)

.+[[:space:]]
-------------
In the

Description to Match: To get 1st word only

[^[:space:]]
------------
In

BACKREFERENCES

Description to Match: Find double-words

REGEXP_SUBSTR(DSCR,'(^|[[:space:][:punct:]]+)([[:alpha:]]+)'||
NAME DSCR '[[:space:][:punct:]]+\2')
----- ------------------------------ -----------------------------------------------------------------
Don alternate phone (555)234-2222. Father . Father Father
Father Father of mostly-grown
kids.
Craig at-home {nuts} {nuts} +45 8 {nuts} {nuts
676 49 86

No comments:

Post a Comment