On the move:

I'm currently moving the following to their own blogs: Unix, HTML for Blogs, Perl & Java.
The links are on the left side.

Tuesday, September 16, 2008

Regular Expressions

--Regular Expression Operators and Metasymbols
--
----------------------------------
-- Anchor to:
----------------------------------
-- ^ the beginning of a string ^A finds Alice, Alan
-- $ the end of a string g$ finds dog, lag
----------------------------------
-- Match # times
----------------------------------
-- * >= 0 12*3 finds 13 123 1223 12223
-- + >= 1 ar*gh finds argh arrgh arrrrrgh
-- ? =0-1 t?hat finds hat that
-- {m} = m times bo(3) finds booo
-- {m,} >= m times bo(3,) finds booo boooo booooooooo
-- {m,n} >= m times <= n times bo(2,3) finds bo boo
----------------------------------
-- Classes
----------------------------------
--[:alpha:] Alphabetic
--[:lower:] Lowercase
--[:upper:] Uppercase
--[:digit:] Numeric digits
--[:alnum:] Alphanumeric
--[:space:] Space (nonprinting), carriage return, newline, form feed
--[:punct:] Punctuation
--[:cntrl:] Control (nonprinting)
--[:print:] Printable
--[:xdigit:] Hexadecimal characters
----------------------------------
-- Groups of things
----------------------------------
-- x|y alternative matches
-- ( ) a group - treated as a single subexpression (known as captures) i.e., cat(s|alog)
-- [ ] character matching list (most things her are taken as literals i.e.,
-- [^ ] nonmatching list (most things her are taken as literals
----------------------------------
-- Backslash
----------------------------------
-- \1 - \9 is a backreference to the \#th (subexpression)
-- \n newline
-- \\ Backslash character
-- \ Treat the next character as not a special charater
----------------------------------
-- Wildcard
----------------------------------
-- . any character except NULL h.t finds hot hat hit hut
----------------------------------
-- Need example
----------------------------------
-- [..] one collation element, and can be a multicharacter element
-- [==] equivalence classes. For example, [=a=] matches all characters having base letter 'a'.

---------------------------------------------------------------------------------------------------------------------------------
--RegExp_Count -- in 11G
---------------------------------------------------------------------------------------------------------------------------------
--Regexp_Like (
--String To Be Searched,
--Pattern To Be Found,
--Optional Parameter
--)
---------------------------------------------------------------------------------------------------------------------------------
--Regexp_Instr (
--String to be searched ,
--Pattern to be found
-- [, Start position /* If omitted then column 1 of string*/
-- [, Occurrence /* If omitted than first occurrence */
-- [, Return_option /* 0=return position of 1st character; 1=return position of next character */
-- [, Match_parameter ]
-- ]
-- ]
-- ]
--)
------------------------------------------------------------------------------------------------------------------------
--Regexp_Substr(
--String To Be Searched ,
--Pattern To Be Found
-- [, Start position /* If omitted then column 1 of string*/
-- [, Occurrence /* If omitted than first occurrence */
-- [, Match_Parameter ]
-- ]
-- ]
-- )
-------------------------------------------------------------------------------------------------------------
--Regexp_Replace(
--String to be searched ,
--Pattern to be found ,
-- [, Replace_String [, Position
-- [,Occurrence,
-- [Match_Parameter]
-- ]
-- ]
-- ]
--);
---------------------------------------------------------------------------------------------------------------------------------
-- Match Parameters:
-- * 'i' Case-Insensitive
-- * 'c' Case-Sensitive
-- * 'n' Matches Newline Character
-- * 'm' Source Treated As Multiple Lines
-- * 'x' Ignores Whitespace Characters



spool reg.txt
Column Name Format A10
Column Range Heading 1980|1981|1982
Column Specific Heading 1980||1982
Column White_House Format A45 Heading -
' 1 2 3 4 |123456789012345678901234567890123456789012345'
Column Instr1 Format 99
Column Instr2 Format 99
Column Instr3 Format 99
Column Instr7 Format 99
Column Invalid Format 99
Column Valid Format A5
Column RE1 Format A6
Column RE2 Format A6
Column RE3 Format A6
Column RE4 Format A6
Set NewPage 2 echo on recsep off pagesize 50 linesize 120 Feedback off

SQL> ttitle left '+=======================================================================================================================' skip 1-
> '| Begin Example: ' pno skip 1-
> '+=======================================================================================================================' skip 0
SQL>
SQL> repfooter left '+=======================================================================================================================' skip 1-
> '| End Example: ' pno skip 1-
> '+=======================================================================================================================' skip 3
SQL>
SQL> --------------------------------------------------------------------
SQL> -- create table
SQL> --------------------------------------------------------------------
SQL> Create Table Name_Birth (Name Varchar2(10), DOB Date) ;
SQL> Insert into Name_Birth Values ('KING' , '17-NOV-83');
SQL> Insert into Name_Birth Values ('Smyth' , '03-DEC-81');
SQL> Insert into Name_Birth Values ('SMITH' , '17-DEC-80');
SQL> Insert into Name_Birth Values ('Feeney', '09-DEC-82');
SQL> Insert into Name_Birth Values ('smath' , '23-JAN-82');
SQL>
SQL> Clear Screen
SQL> --------------------------------------------------------------------
SQL> Define PNo = 1;
SQL> -- Find Smith or Smyth
SQL> -- ^ start at the beginning of string
SQL> -- Sm find these characters then
SQL> -- (i|y) find either an 'i' or 'y'
SQL> -- th then find a 'th'
SQL> -- $' there should be nothing after the 'th' -- end of string
SQL> -- ,'i' optional parameters says be insensitive to case
SQL> --[^(i|y)] says 'not' either 'i' or 'y'
SQL> --------------------------------------------------------------------
SQL>
SQL> Select '|' ,
2 Name,
3 Case When Regexp_Like (Name, '^Sm(i|y)th$','i')
4 then Name
5 Else ' ' End As
6 Smith,
7 Case When Not Regexp_Like (Name, '^Sm(i|y)th$','i')
8 then Name
9 Else ' ' End As
10 Not_Smith,
11 Case When Regexp_Like (Name, '^Sm[^(i|y)]th$','i')
12 then Name
13 Else ' ' End As
14 "Sm?th"
15 from Name_Birth;

+=======================================================================================================================
| Begin Example: 1
+=======================================================================================================================
' NAME SMITH NOT_SMITH Sm?th
- ---------- ---------- ---------- ----------
| KING KING
| Smyth Smyth
| SMITH SMITH
| Feeney Feeney
| smath smath smath
+=======================================================================================================================
| End Example: 1
+=======================================================================================================================

SQL> --------------------------------------------------------------------
SQL> Define PNo = 2;
SQL> -- Find date of birth from 1980 and 1982
SQL> -- ^ search from the beginning
SQL> -- 198 for these 3 specific characters and then
SQL> -- [0-2] choose anything between zero and two
SQL> -- $ this is the end of the string
SQL> -- OR
SQL> -- [02] choose specifically 0 or 2
SQL> --------------------------------------------------------------------
SQL>
SQL> Select '|' ,
2 Name,
3 To_Char(DOB,'yyyy') As
4 DOB,
5 Case When Regexp_Like (To_Char(DOB,'yyyy'), '^198[0-2]$')
6 Then To_Char(DOB,'yyyy')
7 Else ' ' End As
8 Range,
9 Case When Regexp_Like (To_Char(DOB,'yyyy'), '^198[02]$')
10 Then To_Char(DOB,'yyyy')
11 Else ' ' End As
12 Specific,
13 Case When Not Regexp_Like (To_Char(DOB,'yyyy'), '^198[012]$')
14 Then To_Char(DOB,'yyyy')
15 Else ' ' End As
16 Other
17 From Name_Birth
18 Order By DOB;

+=======================================================================================================================
| Begin Example: 2
+=======================================================================================================================
1980 1980
1981
' NAME DOB 1982 1982 OTHE
- ---------- ---- ---- ---- ----
| SMITH 1980 1980 1980
| Smyth 1981 1981
| smath 1982 1982 1982
| Feeney 1982 1982 1982
| KING 1983 1983
+=======================================================================================================================
| End Example: 2
+=======================================================================================================================

SQL> --------------------------------------------------------------------
SQL> Define PNo = 3;
SQL> -- Find one or more occurrences of non-blanks
SQL> -- (there is no 8th occurrence, and it returns 0)
SQL> --------------------------------------------------------------------
SQL>
SQL> Select '|' , '1600 Penn Ave NW Wash, DC 20500' White_House,
2 Regexp_Instr('1600 Penn Ave NW Wash, DC 20500','[^ ]+', 1, 1) Instr1,
3 Regexp_Instr('1600 Penn Ave NW Wash, DC 20500','[^ ]+', 1, 2) Instr2,
4 Regexp_Instr('1600 Penn Ave NW Wash, DC 20500','[^ ]+', 1, 3) Instr3,
5 Regexp_Instr('1600 Penn Ave NW Wash, DC 20500','[^ ]+', 1, 7) Instr7,
6 Regexp_Instr('1600 Penn Ave NW Wash, DC 20500','[^ ]+', 1, 8) Invalid
7 From Dual ;

+=======================================================================================================================
| Begin Example: 3
+=======================================================================================================================
1 2 3 4
' 123456789012345678901234567890123456789012345 INSTR1 INSTR2 INSTR3 INSTR7 INVALID
- --------------------------------------------- ------ ------ ------ ------ -------
| 1600 Penn Ave NW Wash, DC 20500 1 6 11 27 0
+=======================================================================================================================
| End Example: 3
+=======================================================================================================================

SQL> ----------------------------------------------------------------------
SQL> Define PNo = 4;
SQL> -- Find position of words that start with P or W and 4 characters long
SQL> ----------------------------------------------------------------------
SQL>
SQL>
SQL> Select '|' , '1600 Penn Ave NW Wash, DC 20500' White_House,
2 Regexp_Instr('1600 Penn Ave NW Wash, DC 20500','[P|W][[:alpha:]]{3}',1,1,0, 'i') Instr_1,
3 Regexp_Instr('1600 Penn Ave NW Wash, DC 20500','[P|W][[:alpha:]]{3}',1,2,0, 'i') Instr_2,
4 Regexp_Instr('1600 Penn Ave NW Wash, DC 20500','[P|W][[:alpha:]]{3}',1,3,0, 'i') Instr_3
5 From Dual ;

+=======================================================================================================================
| Begin Example: 4
+=======================================================================================================================
1 2 3 4
' 123456789012345678901234567890123456789012345 INSTR_1 INSTR_2 INSTR_3
- --------------------------------------------- ---------- ---------- ----------
| 1600 Penn Ave NW Wash, DC 20500 6 18 0
+=======================================================================================================================
| End Example: 4
+=======================================================================================================================

SQL> --------------------------------------------------------------------
SQL> Define PNo = 5;
SQL> -- Backreferences: the ability to store subexpressions for reuse later.
SQL> -- The matched part of the subexpression is stored in a temporary buffer.
SQL> -- The buffer is numbered left to right and accessed with \1 to \9
SQL> -- ( => creates a subexpression
SQL> -- . => match anything
SQL> -- *) => 0 or more times
SQL> -- The spaces between (.*) will also be matched
SQL> -- bring back in new order (3) then comma then (1) then (2)
SQL> --------------------------------------------------------------------
SQL>
SQL> Select '|' ,
2 Regexp_Replace(
3 'Little Jack Horner',
4 '(.*) (.*) (.*)', '\3, \1 \2'
5 )
6 From Dual;

+=======================================================================================================================
| Begin Example: 5
+=======================================================================================================================
' REGEXP_REPLACE('LIT
- -------------------
| Horner, Little Jack
+=======================================================================================================================
| End Example: 5
+=======================================================================================================================

SQL> --------------------------------------------------------------------
SQL> Define PNo = 6;
SQL> -- Find duplicate words
SQL> --------------------------------------------------------------------
SQL>
SQL> Select '|' ,
2 Regexp_Substr(
3 'Now is the time for all all good folks to come to the aid of their country',
4 '([:alnum:]+)([[:space:]]+)\1'
5 ) As Substr
6 From Dual;

+=======================================================================================================================
| Begin Example: 6
+=======================================================================================================================
' SUBSTR
- -------
| all all
+=======================================================================================================================
| End Example: 6
+=======================================================================================================================

SQL> --------------------------------------------------------------------
SQL> Define PNo = 7;
SQL> -- Look for a valid phone number:
SQL> --------------------------------------------------------------------
SQL>
SQL> Create Table Phonenums ( Phone_Number Varchar2(30) Constraint C_Phonenums_Pnf
2 Check (Regexp_Like (Phone_Number, '^\(\d{3}\) \d{3}-\d{4}$')) );
SQL>
SQL> Insert Into Phonenums (Phone_Number) Values( '(650) 555-5555' );
SQL> Insert Into Phonenums (Phone_Number) Values( '(215) 555-3427' );
SQL> Insert Into Phonenums (Phone_Number) Values( '650 555-5555' );
Insert Into Phonenums (Phone_Number) Values( '650 555-5555' )
*
ERROR at line 1:
ORA-02290: check constraint (SCOTT.C_PHONENUMS_PNF) violated


SQL> Insert Into Phonenums (Phone_Number) Values( '(650)555-5555' );
Insert Into Phonenums (Phone_Number) Values( '(650)555-5555' )
*
ERROR at line 1:
ORA-02290: check constraint (SCOTT.C_PHONENUMS_PNF) violated


SQL> Select '|',
2 Phone_number
3 From PhoneNums;

+=======================================================================================================================
| Begin Example: 7
+=======================================================================================================================
' PHONE_NUMBER
- ------------------------------
| (650) 555-5555
| (215) 555-3427
+=======================================================================================================================
| End Example: 7
+=======================================================================================================================

SQL> --------------------------------------------------------------------
SQL> Define PNo = 8;
SQL> -- Before 111.222.3333 After (111) 222-3333
SQL> --------------------------------------------------------------------
SQL>
SQL> Select '|' ,
2 Regexp_Replace('111-222-3333',
3 '([[:digit:]]{3})\-([[:digit:]]{3})\-([[:digit:]]{4})',
4 '(\1) \2-\3')
5 Result
6 From Dual;

+=======================================================================================================================
| Begin Example: 8
+=======================================================================================================================
' RESULT
- --------------
| (111) 222-3333
+=======================================================================================================================
| End Example: 8
+=======================================================================================================================

SQL> --------------------------------------------------------------------
SQL> Define PNo = 9;
SQL> -- Double space words
SQL> --------------------------------------------------------------------
SQL>
SQL> Select '|' ,
2 Regexp_Replace('OKLAHOMA!', '(.)', '\1 ') Result
3 From Dual;

+=======================================================================================================================
| Begin Example: 9
+=======================================================================================================================
' RESULT
- ------------------
| O K L A H O M A !
+=======================================================================================================================
| End Example: 9
+=======================================================================================================================

SQL> --------------------------------------------------------------------
SQL> Define PNo = 10;
SQL> -- Kill extra spaces
SQL> --------------------------------------------------------------------
SQL>
SQL> Select '|' ,
2 Regexp_Replace(
3 'Mary, Mary, quite contrary. How does your garden grow?'
4 , '( ){2,}', ' ') Result
5 From Dual;

+=======================================================================================================================
| Begin Example: 10
+=======================================================================================================================
' RESULT
- ------------------------------------------------------
| Mary, Mary, quite contrary. How does your garden grow?
+=======================================================================================================================
| End Example: 10
+=======================================================================================================================

SQL> --------------------------------------------------------------------
SQL> Define PNo = 11;
SQL> -- Check for alpha only
SQL> --------------------------------------------------------------------
SQL>
SQL> CREATE TABLE Alpha_Table (c1 VARCHAR2(20), CHECK (REGEXP_LIKE(c1, '^[[:alpha:]]+$')));
SQL> Insert into Alpha_Table values ('Lynn');
SQL> Insert into Alpha_Table values ('ABC');
SQL> Insert into Alpha_Table values ('123');
Insert into Alpha_Table values ('123')
*
ERROR at line 1:
ORA-02290: check constraint (SCOTT.SYS_C008660) violated

SQL> Insert into Alpha_Table values (123);
Insert into Alpha_Table values (123)
*
ERROR at line 1:
ORA-02290: check constraint (SCOTT.SYS_C008660) violated

SQL>
SQL> Select '|',
2 c1
3 from Alpha_Table;

+=======================================================================================================================
| Begin Example: 11
+=======================================================================================================================
' C1
- --------------------
| Lynn
| ABC
+=======================================================================================================================
| End Example: 11
+=======================================================================================================================

SQL> --------------------------------------------------------------------
SQL> Define PNo = 12;
SQL> -- SSN
SQL> --------------------------------------------------------------------
SQL> Create Table SocSec_Table ( SocSecNo Varchar2(20));
SQL> Insert Into SocSec_Table Values ( '444-77-9999' );
SQL> Insert Into SocSec_Table Values ( '494=77-9999' );
SQL> Insert Into SocSec_Table Values ( '4444-A7-9999');
SQL> Insert Into SocSec_Table Values ( '444-7-79999' );
SQL> Insert Into SocSec_Table Values ( '987-65-4974' );
SQL>
SQL> Select '|' ,
2 SocSecno,
3 Case When Regexp_Like(SocSecNo,'^[0-9]{3}-[0-9]{2}-[0-9]{4}$')
4 Then 'Good'
5 Else 'Bad'
6 End
7 Valid
8 From SocSec_Table;

+=======================================================================================================================
| Begin Example: 12
+=======================================================================================================================
' SOCSECNO VALID
- -------------------- -----
| 444-77-9999 Good
| 494=77-9999 Bad
| 4444-A7-9999 Bad
| 444-7-79999 Bad
| 987-65-4974 Good
+=======================================================================================================================
| End Example: 12
+=======================================================================================================================

SQL> --------------------------------------------------------------------
SQL> Define PNo = 13;
SQL> -- Alternative matches
SQL> --------------------------------------------------------------------
SQL>
SQL> Select '|' ,String,
2 Regexp_Substr(String,'fl(y(ing)?|(ew)|(own)|(ies))' ,1,1)Re1,
3 Regexp_Substr(String,'fl(y(ing)?|(ew)|(own)|(ies))' ,1,2)Re2,
4 Regexp_Substr(String,'fl(y(ing)?|(ew)|(own)|(ies))' ,1,3)Re3,
5 Regexp_Substr(String,'fl(y(ing)?|(ew)|(own)|(ies))' ,1,4)Re4
6 From (
7 Select '|' ,
8 'If you flew like I fly, then you would have flown with all the flying birds.'
9 As String
10 From Dual
11 );

+=======================================================================================================================
| Begin Example: 13
+=======================================================================================================================
' STRING RE1 RE2 RE3 RE4
- ---------------------------------------------------------------------------- ------ ------ ------ ------
| If you flew like I fly, then you would have flown with all the flying birds. flew fly flown flying
+=======================================================================================================================
| End Example: 13
+=======================================================================================================================

No comments:

Post a Comment