Friday, December 11, 2015

Accessing MongoDB data from SQL / ODBC on Windows, using the new BI Connector

[EDIT 05-Apr-2018: MongoDB BI Connector version 2+ uses a different mechanism for connecting to (not using a PostgreSQL driver) - for more info see here]

The latest enterprise version of MongoDB (3.2) includes a new BI Connector to enable business intelligence, analytics and reporting tools, that only "speak" SQL, to access data in a MongoDB database, using ODBC. Most of the examples published so far show how to achieve this using rich graphical tools, like Tableau. Therefore, I thought it would be useful to show here that the data is accessible from any type of tool, that is capable of issuing SQL commands via an ODBC driver. Even from Microsoft's venerable Excel spreadsheet application. Believe it or not, I still come across organisations out there that are using Excel to report on the state of their business!

For my example, I loaded a MongoDB database with the anonymised MOT tests results data, that the UK government makes freely available to use. As an explanation for non-residents of the UK, MOT tests are the annual inspections that all UK road-going cars and other vehicles have to go through, to be legal and safe. There are millions of these car test records recorded every year, and they give a fascinating insight into the types and ages of cars people choose to drive in the UK.

First I loaded the CSV file based MOT data sets into a MongoDB 3.2 database, using a small Python script I wrote, with each document representing a test result for a specific owner's car for a specific year. Below is an example of what one test result document looks like in MongoDB:

I then followed the online MongoDB BI Connector documentation to configure a BI Connector server to listen for ODBC requests for the "mot" database and translate these to calls to the underlying MongoDB "testresults" collection. I just used the default DRDL ("Document Relational Definition Language") schema file that was automatically generated by the "mongodrdl" command line utility (a utility bundled with the BI Connector).

Then, on a separate desktop virtual machine running Windows 10, I downloaded the latest PostgreSQL ODBC driver installer for Windows and installed it.

With the ODBC driver installed, I then proceeded to define a Windows ODBC Data Source to reference the MOT database that I was exposing via by the BI Connector (running on a remote machine).

By default the BI Connector (running on a machine with IP address, in my case), listens on port 27032. Before hitting the Save button, I hit the Test button to ensure that the Windows client could make a successful ODBC connection to the BI Connector.

With the new ODBC Data Source now configured (shown in screenshot above), I then launched Microsoft Excel so that I could use this new data source to explore the MOT test data.

Excel's standard query wizard was able to use the ODBC data source to discover the MongoDB collection's "schema". I chose to include all the "fields" in the query.

I thought it would be useful to ask for the MOT test results to be ordered by Test Year, followed by Car Make, followed by Car Model.

Finally, upon pressing OK, Excel presented me with the results of the SQL/ODBC query, run directly against the MOT test data, sourced from the MongoDB collection.

Excel, then gave me many options, including settings to say whether to periodically refresh the data from source, and how often. I was also able to open Microsoft's built-in Query Builder tool to modify the query and execute again

That's pretty much it. It's straight forward to configure a Windows client to access data from MongoDB, via MongoDB's new BI Connector, using ODBC.

Song for today: Dust and Disquiet by Caspian


Unknown said...

Great post Paul

Operador said...

Great. Is it possible by vba?