On the move:

I'm currently moving the following to their own blogs: Unix, HTML for Blogs, Perl & Java.
The links are on the left side.

Sunday, March 8, 2009

Regular Expressions

The create table statement and all the code to produce this report can be found here. Scroll down to the end to find the attachment: RegExp.Sql

This is the data the queries run against:
Select * From Fam;

NAME PHONE DSCR
----- --------------- --------------------------------------------------------------------------------
Me 555-123-0987 Fun-loving aunt with a lot of nieces and nephews. 111-123-4567
Don 555-234-2323 alternate phone (555)234-2222. Father Father of mostly-grown kids.
Craig 555-888-8888 at-home {nuts} {nuts} +45 8 676 49 86
Glenn 555-232-8888 (123) 242-2424
Dee 232-388-3333 -farmer 787.222.9834


Description to Match: 3 of anything, a dash, then 4 of anything

NAME REGEXP_SUBSTR(DSCR,'...-....')
----- ------------------------------------------------------------------------------------------------
Me Fun-lovi
Don 234-2222
Glenn 242-2424

Description to Match: 3 of anything, a dash, then 4 of anything

NAME REGEXP_SUBSTR(DSCR,'.{3}-.{4}')
----- ------------------------------------------------------------------------------------------------
Me Fun-lovi
Don 234-2222
Glenn 242-2424

BRACKET EXPRESSIONS

Description to Match: 3#s, -, 4

NAME REGEXP_SUBSTR(DSCR,'[12345]{3}-.{4}')
----- ------------------------------------------------------------------------------------------------
Me 111-123-
Don 234-2222
Glenn 242-2424

Description to Match: 3 numbers(1-5), -, 4

NAME REGEXP_SUBSTR(DSCR,'[1-5]{3}-.{4}')
----- ------------------------------------------------------------------------------------------------
Me 111-123-
Don 234-2222
Glenn 242-2424

Description to Match: 3 numbers,-,4

NAME REGEXP_SUBSTR(DSCR,'[[:digit:]]{3}-.{4}')
----- ------------------------------------------------------------------------------------------------
Me 111-123-
Don 234-2222
Glenn 242-2424

Description to Match: Any 3 characters except for numbers,-,4

NAME REGEXP_SUBSTR(DSCR,'[^[:digit:]]{3}-.{4}')
----- ------------------------------------------------------------------------------------------------
Me Fun-lovi
Don tly-grow

Description to Match: 3 numbers, A-G or X-z or 'n' or 'u' ,a dash then 4 more characters.
Don's data doesn't show because he has an earlier digit value


NAME REGEXP_SUBSTR(DSCR,'[[:digit:]A-GX-Znu]{3}-.{4}')
----- ------------------------------------------------------------------------------------------------
Me Fun-lovi
Don 234-2222
Glenn 242-2424

Description to Match: 3 numbers, a literal period

NAME REGEXP_SUBSTR(DSCR,'[[:digit:]]{3}\.')
----- ------------------------------------------------------------------------------------------------
Don 222.
Dee 787.

SUBEXPRESSIONS

(find a "+" then 1-4 sections of 1-3 digits followed by a space. 
The last grouping will be 1 or more numbers)
i.e., ([0-9]{1,3} ) means 1-3 numbers followed by a space

Description to Match: International phone #


NAME REGEXP_SUBSTR(DSCR,'\+([0-9]{1,3}){1,4}([0-9]+)')
----- ------------------------------------------------------------------------------------------------
Craig +45 8 676 49 86

ALTERNATION

(use '|')  i.e., (-|\.)

Description to Match: 3 numbers, - or ., then 4 of anything


NAME REGEXP_SUBSTR(DSCR,'[[:digit:]]{3}(-|\.).{4}')
----- ------------------------------------------------------------------------------------------------
Me 111-123-
Don 234-2222
Glenn 242-2424
Dee 787.222.

Description to Match: 3 numbers, - or ., then 4 of anything

NAME REGEXP_SUBSTR(DSCR,'[[:digit:]]{3}[.-].{4}')
----- ------------------------------------------------------------------------------------------------
Me 111-123-
Don 234-2222
Glenn 242-2424
Dee 787.222.

Description to Match: Phone numbers with dashes, periods or parens

NAME REGEXP_SUBSTR(DSCR,'([0-9]{3}[.-]|\([0-9]{3}\)?)'||'[0-9]{3}[.-][0-9]{4}')
----- ------------------------------------------------------------------------------------------------
Me 111-123-4567
Don (555)234-2222
Glenn (123) 242-2424
Dee 787.222.9834

GREEDINESS

Description to Match: Get a word ending in a space (gets all it can find)

.+[[:space:]]
-------------
In the

Description to Match: To get 1st word only

[^[:space:]]
------------
In

BACKREFERENCES

Description to Match: Find double-words

REGEXP_SUBSTR(DSCR,'(^|[[:space:][:punct:]]+)([[:alpha:]]+)'||
NAME DSCR '[[:space:][:punct:]]+\2')
----- ------------------------------ -----------------------------------------------------------------
Don alternate phone (555)234-2222. Father . Father Father
Father Father of mostly-grown
kids.
Craig at-home {nuts} {nuts} +45 8 {nuts} {nuts
676 49 86

No comments:

Post a Comment