Oracle’s BINARY_DOUBLE Can Be Much Faster Than NUMBER

Using the right data type for some calculation sounds like some obvious advice.

There are many blogs about using temporal data types for temporal data, instead of strings. An obvious reason is data integrity and correctness. We don’t gain much in storing dates as 2019-09-10 in one record, and as Nov 10, 2019 in the next one. Performance is also important in this case – for example, because of statistics being more optimal (an often overlooked side effect of unfit data types).

But at a customer site, I’ve recently discovered a surprising (not surprising in hindsight) performance issue when using NUMBER instead of BINARY_DOUBLE in an Oracle database.

NUMBER for monetary amounts

The NUMBER type (or DECIMAL, NUMERIC in other RDBMS) is perfectly suited for all decimal numbers, which require the correct precision and rounding. I mean, if you ever encounter this kind of display in an invoice, such as when I purchase video games on steam:

… your trust level for the platform immediately goes down by factor 1.0e+1 (even if technically, it does not matter in this case). So, the default, e.g. in banking systems for monetary amounts is always to use NUMBER or an equivalent type, e.g. java.math.BigDecimal.

Calculations on numbers

But now, let’s assume we want to do some statistics on these amounts. Some aggregations of various numeric values, maybe even of their logarithms. We can establish that these two expressions are equal:

ln(a * b) = ln(a) + ln(b)

Or in other words, for positive numbers:

a * b = exp(ln(a) + ln(b))

We’ve already blogged about how this approach can be used to emulate a PRODUCT() aggregate function in SQL, which is very useful for some cases, but none of the SQL databases jOOQ supports has built-in support for this yet. Notice, the blog post also takes care of zero and negative numbers.

But which number type to choose?

Now, we might be tempted to just calculate the LN(SOME_NUMBER) value, and sum that up using SUM(LN(SOME_NUMBER)) for this use-case. This turns out to be terribly slow in Oracle. We were thinking of bad indexes, first, even doubted aggregation in general, until I suggested we try using BINARY_DOUBLE instead, in this case. In our case, we didn’t care about the exact numeric value of the amount. A IEEE 754 floating point number with double precision was going to be good enough.

The results were surprising. In a simple benchmark, we compared 10 approaches to calculating this sum of logarithms:

  1. Using NUMBER(20, 10) and SUM(LN(the_number))
  2. Using NUMBER(20, 10) and SUM(LN(CAST(the_number AS BINARY_DOUBLE)))
  3. Using NUMBER(20, 10) and SUM(LN(TO_BINARY_DOUBLE(the_number)))
  4. Using NUMBER(20, 10), containing a pre-calculated LN value
  5. Using DOUBLE PRECISION and SUM(LN(the_double))
  6. Using DOUBLE PRECISION and SUM(LN(CAST(the_double AS BINARY_DOUBLE)))
  7. Using DOUBLE PRECISION and SUM(LN(TO_BINARY_DOUBLE(the_double)))
  8. Using DOUBLE PRECISION, containing a pre-calculated LN value
  9. Using BINARY_DOUBLE and SUM(LN(the_binary))
  10. Using BINARY_DOUBLE, containing a pre-calculated LN value

These were the thoughts:

  • We tried the above 3 possible numeric data types, expecting BINARY_DOUBLE to be the fastest
  • We tried to pre-calculate the LN() value for the benchmark, to see how much effort goes into summing, and how much effort goes into the LN() calculation with each type. While in general, in this system, such precalculation is impractical, we still wanted to have a benchmark comparison, in case a materialised view or other technique would be feasible.
  • We tried casting and converting each type to BINARY_DOUBLE prior to passing the value to the LN() function. Instead of migrating all the data (with possible side effects), we wanted to see if we can solve this to a reasonable extent “on the fly”

The benchmark that I’m posting here uses this table and example data (full benchmark code at the end of the article):

CREATE TABLE data (
  n1 NUMBER(20, 10),
  n2 NUMBER(20, 10),
  d1 DOUBLE PRECISION,
  d2 DOUBLE PRECISION,
  b1 BINARY_DOUBLE,
  b2 BINARY_DOUBLE
);

INSERT INTO data
SELECT level, ln(level), level, ln(level), level, ln(level)
FROM dual
CONNECT BY level <= 100000;

So, we have 100000 records, whose SUM(LN(x)) we want to calculate in the above 10 different ways. N1 contains the raw numeric value, and N2 contains the pre-calculated LN(N1).

The benchmark technique is described here. Do note it has a lot of caveats and is only useful for a limited number of verifications. Please always be careful when running such benchmarks, they often do not test production-like situations – and never use such benchmarks to compare different RDBMS products directly. They are only useful to compare two approaches on the same RDBMS product.

The results as run on Oracle 18c XE in Docker on development hardware are below. Times are compared relative to the fastest run, as actual time spent in each execution is not interesting for comparison. We did have similar results on production-like hardware, though, and I’m sure you can produce similar results in other RDBMS:

NUMBER(20, 10)
-------------------------------------------------
Run 3, Statement 1 : 280.34143  (avg : 280.75347)
Run 3, Statement 2 : 7.99402    (avg : 8.03506)
Run 3, Statement 3 : 7.71383    (avg : 7.73017)
Run 3, Statement 4 : 1.05456    (avg : 1.11735)

DOUBLE PRECISION
------------------------------------------------
Run 3, Statement 5 : 278.89476 (avg : 279.72981)
Run 3, Statement 6 : 8.06512   (avg : 8.07033)
Run 3, Statement 7 : 7.81873   (avg : 7.80063)
Run 3, Statement 8 : 1.5315    (avg : 1.54347)

BINARY_DOUBLE
------------------------------------------------
Run 3, Statement 9 : 2.4963    (avg : 2.57184)
Run 3, Statement 10: 1         (avg : 1.02943)

How to read these results?

  • Statement 10 is the fastest one, unsurprisingly, as it aggregates pre-calculated LN(binary_double) values. The precalculation of the function means that all the work has been done already before the report, and the data type is the one we expected to perform best in general
  • Statements 4 and 8 are almost as fast (precalculated LN() values). Being only slight factors off, we can attribute the difference to the usual benchmark flaws, although it’s interesting to see that DOUBLE PRECISION seems 1.5x slower to sum than BINARY_DOUBLE and even NUMBER
  • Statements 1, 5, 9 are the ones where no data type conversion is applied and SUM(LN(the_value)) is being calculated. It is staggering how much slower both NUMBER and DOUBLE PRECISION are than BINARY_DOUBLE. Statements 1 and 5 are a factor of 112x slower than statement 9!
  • Statements 2-3, 6-7 prove that converting the NUMBER or DOUBLE PRECISION vales to BINARY_DOUBLE on the fly provides a sufficiently performant workaround, which made statements 2-3, 6-7 only 3x slower than statement 9
  • Statements 2-3, 6-7 show that casting and converting are about equivalent

Note, we found similar results with other functions, such as EXP()

Analysis

The order of magnitude may seem surprising at first, but thinking about it, it is not. We would never do CPU intensive computation with java.math.BigDecimal in Java. The BigDecimal type is there for numeric accuracy, e.g. when it really matters what the monetary amount is, exactly. When we run analytics on monetary amounts, using double is sufficient in Java as well.

If our data is BigDecimal, and we cannot reasonably change that, it might still be better to use the BigDecimal::doubleValue conversion prior to further processing using e.g. Math::log. So, this translates directly to SQL, whose LN() implementations are data type specific. IEEE 754 having been designed precisely for this purpose.

When doing CPU intensive computations both in Java, or in the database, we should always evaluate our various options of

  • Quick fixing our data sets for the report (ad-hoc conversion prior to calculation)
  • Thoroughly fixing our data sets in the schema (migration towards a better data type)
  • Preprocessing our data sets (precalculating some very commonly used computations)

Benchmark logic

Just run the below on your own hardware. I’m curious to see your results:

-- Copyright Data Geekery GmbH
--
-- Licensed under the Apache License, Version 2.0 (the "License");
-- you may not use this file except in compliance with the License.
-- You may obtain a copy of the License at
--
--     http://www.apache.org/licenses/LICENSE-2.0
--
-- Unless required by applicable law or agreed to in writing, software
-- distributed under the License is distributed on an "AS IS" BASIS,
-- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-- See the License for the specific language governing permissions and
-- limitations under the License.
--
-- This version displays relative execution times (fastest execution = 1)
SET SERVEROUTPUT ON

ALTER SYSTEM FLUSH SHARED_POOL;
ALTER SYSTEM FLUSH BUFFER_CACHE;

CREATE TABLE data (
  n1 NUMBER(20, 10),
  n2 NUMBER(20, 10),
  d1 DOUBLE PRECISION,
  d2 DOUBLE PRECISION,
  b1 BINARY_DOUBLE,
  b2 BINARY_DOUBLE
);

INSERT INTO data
SELECT level, ln(level), level, ln(level), level, ln(level)
FROM dual
CONNECT BY level <= 100000;

CREATE TABLE results (
  run     NUMBER(2),
  stmt    NUMBER(2),
  elapsed NUMBER
);

DECLARE
  v_ts TIMESTAMP WITH TIME ZONE;
  v_repeat CONSTANT NUMBER := 10;
  v_stmt NUMBER;
BEGIN

  -- Repeat the whole benchmark several times to avoid warmup penalty
  FOR r IN 1..5 LOOP
    v_ts := SYSTIMESTAMP;
    v_stmt := 1;
      
    FOR i IN 1..v_repeat LOOP
      FOR rec IN (
        SELECT sum(ln(n1)) FROM data
      ) LOOP
        NULL;
      END LOOP;
    END LOOP;
  
    INSERT INTO results VALUES (r, v_stmt, SYSDATE + ((SYSTIMESTAMP - v_ts) * 86400) - SYSDATE);
    v_ts := SYSTIMESTAMP;
    v_stmt := v_stmt + 1;
      
    FOR i IN 1..v_repeat LOOP
      FOR rec IN (
        SELECT sum(ln(cast(n1 as binary_double))) FROM data
      ) LOOP
        NULL;
      END LOOP;
    END LOOP;
  
    INSERT INTO results VALUES (r, v_stmt, SYSDATE + ((SYSTIMESTAMP - v_ts) * 86400) - SYSDATE);
    v_ts := SYSTIMESTAMP;
    v_stmt := v_stmt + 1;
      
    FOR i IN 1..v_repeat LOOP
      FOR rec IN (
        SELECT sum(ln(to_binary_double(n1))) FROM data
      ) LOOP
        NULL;
      END LOOP;
    END LOOP;
  
    INSERT INTO results VALUES (r, v_stmt, SYSDATE + ((SYSTIMESTAMP - v_ts) * 86400) - SYSDATE);
    v_ts := SYSTIMESTAMP;
    v_stmt := v_stmt + 1;
      
    FOR i IN 1..v_repeat LOOP
      FOR rec IN (
        SELECT sum(n2) FROM data
      ) LOOP
        NULL;
      END LOOP;
    END LOOP;
    
    INSERT INTO results VALUES (r, v_stmt, SYSDATE + ((SYSTIMESTAMP - v_ts) * 86400) - SYSDATE);
    v_ts := SYSTIMESTAMP;
    v_stmt := v_stmt + 1;
      
    FOR i IN 1..v_repeat LOOP
      FOR rec IN (
        SELECT sum(ln(d1)) FROM data
      ) LOOP
        NULL;
      END LOOP;
    END LOOP;
    
    INSERT INTO results VALUES (r, v_stmt, SYSDATE + ((SYSTIMESTAMP - v_ts) * 86400) - SYSDATE);
    v_ts := SYSTIMESTAMP;
    v_stmt := v_stmt + 1;
      
    FOR i IN 1..v_repeat LOOP
      FOR rec IN (
        SELECT sum(ln(cast(d1 as binary_double))) FROM data
      ) LOOP
        NULL;
      END LOOP;
    END LOOP;
    
    INSERT INTO results VALUES (r, v_stmt, SYSDATE + ((SYSTIMESTAMP - v_ts) * 86400) - SYSDATE);
    v_ts := SYSTIMESTAMP;
    v_stmt := v_stmt + 1;
      
    FOR i IN 1..v_repeat LOOP
      FOR rec IN (
        SELECT sum(ln(to_binary_double(d1))) FROM data
      ) LOOP
        NULL;
      END LOOP;
    END LOOP;
  
    INSERT INTO results VALUES (r, v_stmt, SYSDATE + ((SYSTIMESTAMP - v_ts) * 86400) - SYSDATE);
    v_ts := SYSTIMESTAMP;
    v_stmt := v_stmt + 1;
      
    FOR i IN 1..v_repeat LOOP
      FOR rec IN (
        SELECT sum(d2) FROM data
      ) LOOP
        NULL;
      END LOOP;
    END LOOP;
      
    INSERT INTO results VALUES (r, v_stmt, SYSDATE + ((SYSTIMESTAMP - v_ts) * 86400) - SYSDATE);
    v_ts := SYSTIMESTAMP;
    v_stmt := v_stmt + 1;
      
    FOR i IN 1..v_repeat LOOP
      FOR rec IN (
        SELECT sum(ln(b1)) FROM data
      ) LOOP
        NULL;
      END LOOP;
    END LOOP;
  
    INSERT INTO results VALUES (r, v_stmt, SYSDATE + ((SYSTIMESTAMP - v_ts) * 86400) - SYSDATE);
    v_ts := SYSTIMESTAMP;
    v_stmt := v_stmt + 1;
      
    FOR i IN 1..v_repeat LOOP
      FOR rec IN (
        SELECT sum(b2) FROM data
      ) LOOP
        NULL;
      END LOOP;
    END LOOP;
      
    INSERT INTO results VALUES (r, v_stmt, SYSDATE + ((SYSTIMESTAMP - v_ts) * 86400) - SYSDATE);
  END LOOP;
  
  FOR rec IN (
    SELECT 
      run, stmt, 
      CAST(elapsed / MIN(elapsed) OVER() AS NUMBER(10, 5)) ratio,
      CAST(AVG(elapsed) OVER (PARTITION BY stmt) / MIN(elapsed) OVER() AS NUMBER(10, 5)) avg_ratio
    FROM results
    ORDER BY run, stmt
  )
  LOOP
    dbms_output.put_line('Run ' || rec.run || 
      ', Statement ' || rec.stmt || 
      ' : ' || rec.ratio || ' (avg : ' || rec.avg_ratio || ')');
  END LOOP;
  
  dbms_output.put_line('');
  dbms_output.put_line('Copyright Data Geekery GmbH');
  dbms_output.put_line('https://www.jooq.org/benchmark');
END;
/

DROP TABLE data;
DROP TABLE results;

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.