ORA-01722 (invalid number) over and over again

This is a translation of my German article which was requested through “Google Translate” many times. It seems that throughout the years this error and its mostly simple causes haven’t lost their significance.

Here’s a sample scenario:

  • A table’s VARCHAR2 column shall be searched for numeric entries;
  • We use a query with a WHERE clause that constrains the scanned rows to only those containing numeric values.
  • Despite that, “ORA-01722″ is thrown.

So,  what happened?

  • The Oracle documentation states: “When comparing a character value with a numeric value, Oracle converts the character data to a numeric value.” Think of an implicit TO_NUMBER being applied to the column.
  • This implicit TO_NUMBER might be applied before the filter predicates of your WHERE clause.
  • This in turn leads to an error, when an alphanumeric entry is hit — even when this entry wouldn’t be part of the scanned set if the filters were applied first.

Example

Let’s create a simple table with a VARCHAR2 column that will hold numeric and alphanumeric values. To make it easier to distinguish between numeric and alphanumeric entries, a second column containing a type indicator is used:

CREATE TABLE t1( content_type VARCHAR2(1), mycontent VARCHAR2(10) )
/
INSERT INTO t1 VALUES( 'N', '12345' );
INSERT INTO t1 VALUES( 'T', 'a1234' );
COMMIT
/

Now, the following SQL will fail -- and it's pretty obvious, why:

SELECT count(*)
  FROM t1
 WHERE mycontent > 1;

FEHLER in Zeile 3:
ORA-01722: invalid number

The WHERE clause used the numeric value 1. This will lead to an implicit TO_NUMBER being applied to the column "mycontent". As soon as this implicit TO_NUMBER processes an  alphanumeric Value, the query errs out.

Let's try to avoid the problem by filtering out the alphanumeric entries using our type indicator:

SELECT count(*)
  FROM t1
 WHERE mycontent > 1
   AND content_type = 'N';

  COUNT(*)
----------
         1

It works! ... sometimes. Sometimes, not.

It is not guaranteed that Oracle will always apply our predicates in the given order. In more complex joins it's not uncommon that the optimizer decides to apply the predicates in an order that might be performing better. In our case, when mycontent > 1 is applied before content_type = 'N', this will lead to errors. We can force this behaviour using an optimizer hint to apply the predicates in the given order:

SELECT /*+ ORDERED_PREDICATES */ count(*)
  FROM t1
 WHERE mycontent > 1
   AND content_type = 'N'
;
ERROR:
ORA-01722: invalid number

Since Oracle 10g, this scenario will happen in many cases, because the "CPU costing" of the optimizer can lead to a change in predicate order. In his blog, Jonathan Lewis discusses this behaviour and recommends to change your data model if this happens (see approach 3 below).

Counter measures

Approach 1 -- smarter filtering: We filter for values containing digits only. This can be done with the SQL function translate. For that we need another symbol that we'll convert 1:1, because if we used an empty string for the conversion, everything would be converted to NULL. The following example assumed that there's no "#" in the scanned column, so we use this character as a "dummy":

SELECT ROWID, mycontent
  FROM t1
 WHERE translate( mycontent, '#0123456789','#' ) IS NULL;

Approach 1a: Starting from Oracle 10g, Regular Expressions can be used instead of translate. This allows for more elegant filtering, e.g. including a leading sign and a decimal separator:

SELECT ROWID, mycontent
  FROM t1
 WHERE REGEXP_LIKE( mycontent, '^[\+-]?\d+(\.\d+)?$' );

Approach 2 -- Data Cleansing: Use constraint validation to write the ROWIDs of the problematic rows into a separate table "exceptions" (which can be created using Oracle's Script "utlexcpt.sql").

ALTER TABLE t1 ADD
CONSTRAINT check_numeric
 CHECK ( translate( mycontent, '#0123456789','#' ) IS NULL ) ENABLE
 VALIDATE
 EXCEPTIONS INTO exceptions;

Step 2 would be to delete the problematic rows from the source table using the ROWIDs in "exceptions".

Approach 3 -- good design: After cleansing the data using one of the above approaches, either re-design your data model or change your code to use an appropriate data type. In a perfect world, this would have happened in the first place.

About these ads

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s