Sunday, May 27, 2018

Database Support For Decimal Types That Don't Suffer Loss Of Precision

It's become clear to me that some developers naively trust using float/double types in programming languages and in databases, without even considering the loss of precision that will occur and any detrimental impact on their business applications. Maybe the loss of precision isn't an issue for some types of applications, but I believe it's important to assess that risk before choosing to accept it.

Here's a demonstration of what can happen in programming languages and why you need to consider it...

Python example:
$ python -c "print('%0.18f' % (0.1 * 0.2))"
0.020000000000000004
^^ WRONG!

Node (JavaScript) example:
$ node -e 'console.log(0.1 * 0.2)'
0.020000000000000004
^^ WRONG!

Mongo Shell (JavaScript) example:
$ mongo --nodb --quiet -eval '0.1 * 0.2'
0.020000000000000004
^^ WRONG!

Java example (using Java's newish JShell tool):
$ printf "System.out.println(0.1 * 0.2); \n/ex\n" | jshell -q
0.020000000000000004
^^ WRONG!

None of this is really a surprise given that most of these high-level programming languages are built using C (or C++), which invariably provides the fundamental building block types for floats and doubles.

C example:
printf "int main(void){ printf(\"%%.18lf\\\n\", (0.1 * 0.2)); return 0; }" | cc -w -x c -o multiply - && ./multiply
0.020000000000000004
^^ WRONG!

Of course, most modern programming languages have libraries for dealing with large decimals requiring exact representation and precision.

For example, Java has the BigDecimal library class for this reason:
$ printf "System.out.println((new BigDecimal(\"0.2\")).multiply(new BigDecimal(\"0.1\"))); \n/ex\n" | jshell -q
0.02
^^ CORRECT!

When it comes to databases, the same challenges exist when using fields with floating point values. There may be the need to store and retrieve such fields without loss of precision, whilst enabling arithmetic to be conducted on these fields.

Example of using a standard JavaScript/JSON float type for a field in a MongoDB database:
$ mongo
> db.records.drop()
> db.records.insert({val: 0.2})
> db.records.findOne()
{
    "_id" : ObjectId("5b0a7c24d3dac6c87c0d4a4b"),
    "val" : 0.2
}
> id = db.records.findOne()._id
> db.records.update({_id: id}, {$mul: {val: 0.1}})
> db.records.findOne()
{
"_id" : ObjectId("5b0a7c24d3dac6c87c0d4a4b"),
"val" : 0.020000000000000004
}
> db.records.findOne().val
0.020000000000000004
^^ WRONG!

Like programming languages, most traditional relational databases provide extra types and libraries for using floating point fields that don't suffer loss of precision. However, most of the so-called NoSQL database don't. MongoDB is one of the exceptions.

Example of using a BSON decimal128 type for a field in a MongoDB database:
$ mongo
> db.records.drop()
> db.records.insert({val: NumberDecimal("0.2")})
> db.records.findOne()
{
"_id" : ObjectId("5b0a7ce9d3dac6c87c0d4a4c"),
"val" : NumberDecimal("0.2")
}
> id = db.records.findOne()._id
> db.records.update({_id: id}, {$mul: {val: NumberDecimal("0.1")}})
> db.records.findOne()
{
"_id" : ObjectId("5b0a7ce9d3dac6c87c0d4a4c"),
"val" : NumberDecimal("0.02")
}
> db.records.findOne().val
NumberDecimal("0.02")
^^ CORRECT!

So as you can see, MongoDB can store fields as floating point numbers without precision loss and enable arithmetic and sorting to be performed across these fields. The MongoDB manual provides a lot more information on how to use this decimal field type.

It could be that the loss of precision does not have a major impact on a particular application and the business purpose it is used for. However, in some cases, especially for financial or scientific applications, the ability to store and process decimal fields without loss of precision is likely to be critical.


Song for today: Quit It by Strand of Oaks