It is no surprise that programmers spend a large portion of their time wading through tons of data. Data and applications go hand in hand after all. It was in one of a recent customer requests that I started looking for certain missing pieces of information in the database and got into the whole data crunching game. It was like looking for a needle in a haystack. (Needles in a haystack would be more appropriate; I never knew how many I was looking for; My manager described it as counting 50,000 cats that kept moving around) The good part of this not so interesting request was that I learnt ways to optimize queries, some new tools and tricks. Getting familiar with Oracle Regular Expressions was one of them.

Oracle regular expression is a new feature in Oracle 10g that gives a lot of flexibility in textual searches. Oracle introduces the REGEXP_LIKE operator and the REGEXP_INSTR, REGEXP_SUBSTR, REGEXP_REPLACE functions which are similar to the LIKE operator and the INSTR, SUBSTR and REPLACE functions respectively.

REGEXP_LIKE: Searches for a expression in the specified column

e.g. SELECT * FROM names WHERE REGEXP_LIKE(first_name,’^(.{5})$’ )

This will return records that have exactly 5 characters (except a newline) in the first name

REGEXP_INSTR: Returns the starting position of a patters

REGEXP_SUBSTR: Extracts a part of the string based on the expression passed to it.

REGEXP_REPLACE: Similar to the REPLACE function except that it takes the expression as the replace string.

POSIX is supported by this implementation which means that we could use expressions like [[:digit:]]{5} or something like [[:upper:]]{4}. The first expression looks for 5 digits and the second looks for 4 uppercase characters.

When to use REGEXP_LIKE and when to use LIKE ?

Most of the articles that I read complained about the slow performance of the regular expressions in Oracle. (compared to the corresponding operator/ functions). The best bet would be to use LIKE / INSTR/ SUBSTR / REPLACE where possible. If things get too complex use REGEXP. Functions like TRANSLATE could give added flexibility too.

Performance or not, it definitely gives you a lot of room for performing quick and dirty searches.

Powered by ScribeFire.