Tuesday, November 24, 2020

Is Querying A MongoDB View Optimised?

Views in MongoDB appear to database users like read-only collections, ready to be queried in the same way normal collections are. A View is defined by an Aggregation pipeline and when a query is issued on a View, using find(), there is the potential for the execution of the View to be optimised by MongoDB in the same way as MongoDB would optimise any aggregation pipeline that is executed.

In reality, most applications will not issue a find() without specifying a query filter as an argument. This begs the question: When issuing a find() with a query filter against a View (backed by an aggregation pipeline), how is the combination optimised, and can indexes be leveraged effectively?

In the rest of this post, I will explore this further and answer this question.

Source Collection Data

The data I am using for the investigation is a music based data-set sourced from the Discogs website, imported from Discog's XML data dump, using an XML MongoDB import utility.

The resulting releases collection, representing the albums and singles released by all artists, has over 1.5 million documents in it. I've defined various obvious indexes for the collection in anticipation of wanting to run finds and aggregations efficiently against it. Below is a screenshot showing some of the data in this collection, illustrating each document's typical shape...

As you can see, the releases collection contains fields for the artist, the title of the release, the year of the release and the music genres & styles associated with the release.

Let's now look at using two different Views, with different degrees of complexity, against this same collection, to see if and how these Views are optimised at runtime, when a find() is issued...

Using A View Which Filters Out Some Records & Fields

So let's create a View which only shows music released since the start of the year 2000, concatenates the array of one or more styles into a new 'style' string field and then excludes the 'styles' and '_id' fields from the result.

 var pipeline = [

    {$match: {'year': {'$gte': 2000}}},

    {$set: {'style': {

        $reduce: {

            input: '$styles',

            initialValue: '',

            in: {$concat: ['$$value', '$$this', '. ']}



    {$unset: ['styles']},

    {$unset: ['_id']},


 db.createView('millennium_releases_view', 'releases', pipeline);

Below is an example of the shape of result documents, when the View is queried for a specific artist:

If I ask MongoDB to provide the explain plan for an 'empty' query on the View, using the following command...


...the resulting explain plan shows the database runs the following steps in the order shown:

  1. MATCH using INDEX SCAN hitting an index for the Year field
  2. SET new Style string field to concatenate values from existing Styles array field
  3. UNSET Styles array field
  4. UNSET _id field

It's good to see here that the 'year greater than or equal' clause in the aggregation pipeline defined for the View is being run as the first step and is targeting an index to avoid a 'full table scan'. However, what happens when I include a query filter when issuing a find() against the View, to only show releases for a specific artist?

 db.millennium_releases_view.find({'artist': 'Fugazi'}).explain();

This time the resulting explain plan shows the following steps executed:

  1. MATCH using INDEX SCAN hitting a compound index composed of both the Artist & Year fields
  2. SET new Style string field to concatenate values from existing Styles array field
  3. UNSET Styles array field
  4. UNSET _id field

This is great news, because when I am specifying a query filter for the find() on this View, the optimiser is converting the regular find() filter syntax into an aggregation match expression and pushing it to the existing $match stage at the start of the pipeline. As a result, the optimum compound index of (artist, year) is being used, to entirely satisfy the find's 'artist=Fugazi' expression combined with the View's 'year>=2000' expression,.

Does this mean a find() with a query filter will always be pushed to the top of the View's aggregation pipeline, at runtime?

Well actually, no. Let's see why, in this second example...

Using A View Which Rolls Up Some Data

This time let's create a View which groups releases (albums & singles) for each artist by the style associated with the release. For example, if an artist has five albums categorised with the style 'Stoner Rock' and 7 albums categorised by 'Post Rock', the resulting View will contain 2 documents for the artist, one for each of the two styles. This is the command for creating this View:

 var pipeline = [

    {$unwind: {path: '$styles'}}, 

    {$group: {

        _id: {artist: '$artist', style: '$styles'}, 

        titles: {'$push': '$title'},


    {$set: {'artist': '$_id.artist'}},

    {$set: {'style': '$_id.style'}}, 

    {$unset: ['_id']},


 db.createView('styles_view', 'releases', pipeline);

Below is an example of the shape of result documents from querying this new second View, for a specific artist:

If I ask MongoDB to provide the explain plan for an 'empty' query on this View, using the following command...


    ...the resulting explain plan shows the database runs the following steps in the order shown:

    1. COLLECTION_SCAN with PROJECTION of Artist, Styles & Title fields only
    2. UNWIND of Styles array field producing a record for each array element
    3. GROUP on Artist + Style fields, adding each associated release title to a new Titles array field
    4. SET Artist string field to the first of element of the group's id
    5. SET Style string field to the second of element of the group's id
    6. UNSET _id field which was created by the group stage

    As expected here, because the aggregation pipeline defined for the View does not contain a $match, the first step will result in a 'full table scan', where all the documents in the collection are inspected, and then the required fields only, are projected out.

    What happens this time when I include a query filter for the find() run against the View, to only show results for a specific artist, using the following command to explain?

     db.styles_view.find({'artist': 'Fugazi'}).explain();

      This time, the resulting explain plan shows the following ordered steps executed:

      1. COLLECTION_SCAN with PROJECTION of ArtistStyles & Title fields only
      2. UNWIND of Styles array field producing a record for each array element
      3. GROUP on Artist + Style fields, adding each associated release title to a new Titles array field
      4. SET Artist string field to the first of element of the group's id
      5. MATCH on Artist filed (no index used)
      6. SET Style string field to the second of element of the group's id
      7. UNSET _id field which was created by the group stage

      Here the new $match generated by MongoDB to capture the find() expression run against the View, is included in the executed aggregation pipeline, but the $match cannot be pushed all the way up to the first step of the pipeline. This is to be expected...

      Essentially what happens when a find() with filter is run on a View is as follows. The filter expression is initially placed in a new $match stage appended to the end of the aggregation pipeline. Then the normal aggregation pipeline runtime optimiser kicks in and attempts to move the newly added $match step as near to the top of the pipeline as possible. However, the $group stage (and related $set on artist, in this case), acts as a barrier. The $group operator stage completely changes the shape of documents and effectively drops any existing fields that preceded it. The optimiser has no way of knowing that a filter on an artist field, being applied to the outcome of a View, is definitively referring to a field called artist that existed in the original source collection used by the View. Instead, for all it knows, the expression on artist could be referring to some other intermediate aggregation pipeline field of similar name. In the example above, even if we don't use $set in the View's pipeline to set a new field called artist, the new $match expression is still blocked by $group and so is only executed straight after $group (a scenario which I also tested).

      So even though I only want to see the results for one artist which relates to only a few 10s of documents in the database, the find() which applies a filter on the View, will result in the total data set of 1.5 million documents being 'full table scanned', adding considerable latency to the response.

      If I wasn't querying the View and instead running my own hand-crafted aggregation pipeline directly against the source collection, to achieve the same functional outcome, my pipeline could be composed of the following stages where I explicitly include the match on artist as the first stage:

       var pipeline = [

          {$match: {'artist': 'Fugazi'}},

          {$unwind: {path: '$styles'}}, 

          {$group: {

              _id: {artist: '$artist', style: '$styles'}, 

              titles: {'$push': '$title'},


          {$set: {'artist': '$_id.artist'}},

          {$set: {'style': '$_id.style'}}, 

          {$unset: ['_id']},



      Then when I ask for the explain plan...


        ...I see that the following steps are executed:

        1. MATCH using INDEX_SCAN on the Artist field with PROJECTION of ArtistStyles & Title fields only
        2. UNWIND of Styles array field producing a record for each array element
        3. GROUP on Artist + Style fields, adding each release title to a new Titles array field
        4. SET Artist string field from the first of element in the group id
        5. SET Style string field from the second of element in the group id
        6. UNSET _id field which was created by the group stage

        This time an index will be leveraged so that only the few 10s of records, corresponding to the desired artist, are retrieved, ready for unwinding and grouping. The aggregation does not attempt to grab 1.5 million records. This is only possible because, as the developer of the aggregation pipeline logic, I have extra knowledge which the MongoDB runtime does not have. Specifically, I know that the $match on the artist field should actually be applied to the field named artist in the View's source collection and not to the result of the $group stage.

        Wrapping Up

        What these findings show for Views is that at runtime, when MongoDB receives a find() containing query filter expressions, these expressions are dynamically appended to the end of the View's aggregation pipeline, before the resulting composite pipeline is executed. Then, as is the case if you are just issuing a regular aggregation against a normal collection, MongoDB's aggregation pipeline runtime optimiser attempts to re-order the pipeline on the fly, without changing its functional behaviour, to be more efficient. These runtime optimisations include attempting to push any $match stages as near to the start of the pipeline as possible, to help promote maximum use of indexes when executed. However, stages like the $group stage, which completely transform the shape of documents, mean that the optimiser cannot move a $match ahead of such stages, without risking changing the functional behaviour and ultimately the resulting output. 

        In practice, where Views are used to filter a subset of records and/or a subset of fields. the system should be able to fully optimise the find() run against the collection, pushing query filter expressions to the first step of the executed aggregation pipeline, to best leverage indexes. Only in places where there is a loss of fidelity (e.g when using a $group stage), will it be the case that the find() query filter cannot be placed earlier in the pipeline being executed against the View. 

        Song for today: Runaway Return by Fugazi

        Sunday, October 4, 2020

        Rust & MongoDB - Perfect Bedfellows

        I've been learning Rust over the last month or so and I'm really enjoying it. It's a really elegant and flexible programming language despite being the most strongly typed and compile-time strict programming language I've ever used (bearing in mind I used to be a professional C & C++ developer way back in the day). 

        I'd recently read the really good and commonly referenced blog post Creating a REST API in Rust with warp, which shows how to create a simple example Groceries stock management REST API service, and which uses an in-memory HashMap as its backing store. As part of my learning I thought I'd have a go at porting this to use MongoDB as its data store instead, using the fairly new MongoDB Rust Driver.

        It turns out that this was really easy to do, also due to how well engineered the new MongoDB Rust Driver turned out to be, with its rich yet easy to use API. 

        You can see my resulting MongoDB version of this sample Groceries application, in the Github project rust-groceries-mongo-api I created. Check out that project link to view the source code showing how MongoDB was integrated with for the Groceries REST API and how to test the application using a REST client.

        What was even more surprising was how easy it was to integrate MongoDB's flexible data model with a programming language as strict as Rust, and I encountered no friction between the two at all. In fact, this was even easier to achieve by leveraging the option of using the driver team's additional contribution of BSON translation to the open source Rust Serde framework, which makes it easy to serialize/deserialize Rust data structures to/from other formats (e.g. JSON, Avro and now BSON).

        I plan to blog again in the future, in more detail, about how to combine Rust's strict typing and MongoDB's flexible schema, especially when the data model and consuming microservices inevitably change over time. [UPDATE 09-Dec-2020: I have now blogged on this at MongoDB DevHub, see: The Six Principles for Building Robust Yet Flexible Shared Data Applications]

        Song for today: Dissolution by Cloud Nothings

        Sunday, May 3, 2020

        Converting Gnarly Date Strings to Proper Date Types Using a MongoDB Aggregation Pipeline

        I recently received some example bank payments data in a CSV file which had been exported from a relational database with that database's default export settings. After using mongoimport to import this data 'as-is', into a MongoDB database, I noticed that there was a particularly gnarly date string field in each record. For example:
        • 23-NOV-20
        Why do I say gnarly? Well if you lived through Y2K you should be horrified by the 'year' field shown above. How would you know from the data, without any context, what century this applies to? Is it 1920? Is it 2020? Is it 2120? There's no way of knowing from just the exported data alone. Also, there is no indication of which time zone this applies to. Is it British Summer Time? Is it Eastern Daylight Time? Who knows? Also the month element appears to be an abbreviation of a month expressed in a specific spoken language. Which spoken language?

        I needed to get this into a proper Date type in MongoDB so I could then easily index it, perform date range queries natively, perform sort by date natively, etc.. My usual tool of choice for this is MongoDB's Aggregation pipeline to generate a new collection from the existing collection with the 'date' string fields converted to proper date type fields. To perform the string to date conversion, the usual operator of choice to use is $dateFromString (introduced in MongoDB 3.6). 

        However, $dateFromString [rightly] expects an input string which isn't missing crucial date related text, indicating things like the century or timezone. Also, the $dateFromString operator contains no format specifiers to indicate that the text 'NOV' maps to the 11th month of a year in a specific spoken language.

        Therefore, armed with the extra context of knowing this exported data refers to dates in the 21st century (the '2000s') with a UTC 'time zone' and in the English language (only inferred by asking the owner of the data), I had to perform some additional string manipulation in the aggregation pipeline before using $dateFromString to generate a true and accurate date type. The rest of this blog post shows how I achieved this for date strings like '23-NOV-20'.

        Converting Incomplete Date Strings to Date Types Example

        In the Mongo Shell targeting a running MongoDB test database, run the following code to insert 12 sample 'payment' records, with example 'bad date string' fields for testing each month of a sample year.

        use test;
          {'account_id': '010101', 'pymntdate': '01-JAN-20', 'amount': 1.01},
          {'account_id': '020202', 'pymntdate': '02-FEB-20', 'amount': 2.02},
          {'account_id': '030303', 'pymntdate': '03-MAR-20', 'amount': 3.03},
          {'account_id': '040404', 'pymntdate': '04-APR-20', 'amount': 4.04},
          {'account_id': '050505', 'pymntdate': '05-MAY-20', 'amount': 5.05},
          {'account_id': '060606', 'pymntdate': '06-JUN-20', 'amount': 6.06},
          {'account_id': '070707', 'pymntdate': '07-JUL-20', 'amount': 7.07},
          {'account_id': '080808', 'pymntdate': '08-AUG-20', 'amount': 8.08},
          {'account_id': '090909', 'pymntdate': '09-SEP-20', 'amount': 9.09},
          {'account_id': '101010', 'pymntdate': '10-OCT-20', 'amount': 10.10},
          {'account_id': '111111', 'pymntdate': '11-NOV-20', 'amount': 11.11},
          {'account_id': '121212', 'pymntdate': '12-DEC-20', 'amount': 12.12}

        Then execute the following Aggregation pipeline to copy the contents of the 'rawpayments' collection, populated above, into a new collection named 'payments', but with the 'pymntdate' field values converted from string types to date types.

          {$set: {
            pymntdate: {
              $dateFromString: {format: '%d-%m-%Y %H.%M.%S.%L', dateString:
                {$concat: [
                  {$substrCP: ['$pymntdate', 0, 3]},  // USE FIRST 3 CHARS IN DATE STRING
                  {$switch: {branches: [  // REPLACE MONTH 3 CHARS IN DATE STRING WITH 2 DIGIT MONTH
                    {case: {$eq: [{$substrCP: ['$pymntdate', 3, 3]}, 'JAN']}, then: '01'},
                    {case: {$eq: [{$substrCP: ['$pymntdate', 3, 3]}, 'FEB']}, then: '02'},
                    {case: {$eq: [{$substrCP: ['$pymntdate', 3, 3]}, 'MAR']}, then: '03'},
                    {case: {$eq: [{$substrCP: ['$pymntdate', 3, 3]}, 'APR']}, then: '04'},
                    {case: {$eq: [{$substrCP: ['$pymntdate', 3, 3]}, 'MAY']}, then: '05'},
                    {case: {$eq: [{$substrCP: ['$pymntdate', 3, 3]}, 'JUN']}, then: '06'},
                    {case: {$eq: [{$substrCP: ['$pymntdate', 3, 3]}, 'JUL']}, then: '07'},
                    {case: {$eq: [{$substrCP: ['$pymntdate', 3, 3]}, 'AUG']}, then: '08'},
                    {case: {$eq: [{$substrCP: ['$pymntdate', 3, 3]}, 'SEP']}, then: '09'},
                    {case: {$eq: [{$substrCP: ['$pymntdate', 3, 3]}, 'OCT']}, then: '10'},
                    {case: {$eq: [{$substrCP: ['$pymntdate', 3, 3]}, 'NOV']}, then: '11'},
                    {case: {$eq: [{$substrCP: ['$pymntdate', 3, 3]}, 'DEC']}, then: '12'},
                   ], default: 'ERROR'}},
                  '-20',  // ADD HYPEHN + HARDCODED CENTURY 2 DIGITS
          {$out: 'payments'}

        In this pipeline, the string '23-NOV-20' will be converted to 'ISODate("2020-11-23T22:57:36.827Z")' by concatenating the following four elements of text together before passing it to the $dateFromString operator to convert to a date:
        1. '23-' (from the input string)
        2. '11' (replacing 'NOV')
        3. '-20' (hard-coded hyphen + century)
        4. '20' (the rest of input string apart from last 6 nanosecond digits)
        Note: A $set stage is used in this pipeline, which is a type of stage first introduced in MongoDB 4.2. $set is an alias for $addFields, so if using an earlier version of MongoDB, replace $set with $addFields in the pipeline.

        To see what the converted records look like, containing new date types, query the new collection:

        db.payments.find({}, {_id:0});

        Which will show the following results:

        { "account_id" : "010101", "pymntdate" : ISODate("2020-01-01T01:01:01.123Z"), "amount" : 1.01 }
        { "account_id" : "020202", "pymntdate" : ISODate("2020-02-02T02:02:02.456Z"), "amount" : 2.02 }
        { "account_id" : "030303", "pymntdate" : ISODate("2020-03-03T03:03:03.789Z"), "amount" : 3.03 }
        { "account_id" : "040404", "pymntdate" : ISODate("2020-04-04T04:04:04.012Z"), "amount" : 4.04 }
        { "account_id" : "050505", "pymntdate" : ISODate("2020-05-05T05:05:05.345Z"), "amount" : 5.05 }
        { "account_id" : "060606", "pymntdate" : ISODate("2020-06-06T06:06:06.678Z"), "amount" : 6.06 }
        { "account_id" : "070707", "pymntdate" : ISODate("2020-07-07T07:07:07.901Z"), "amount" : 7.07 }
        { "account_id" : "080808", "pymntdate" : ISODate("2020-08-08T08:08:08.234Z"), "amount" : 8.08 }
        { "account_id" : "090909", "pymntdate" : ISODate("2020-09-09T09:09:09.567Z"), "amount" : 9.09 }
        { "account_id" : "101010", "pymntdate" : ISODate("2020-10-10T10:10:10.890Z"), "amount" : 10.1 }
        { "account_id" : "111111", "pymntdate" : ISODate("2020-11-11T11:11:11.111Z"), "amount" : 11.11 }
        { "account_id" : "121212", "pymntdate" : ISODate("2020-12-12T12:12:12.999Z"), "amount" : 12.12 }

        Song for today: For Everything by The Murder Capital

        Sunday, December 29, 2019

        Running MongoDB on ChromeOS (via Crostini)

        In my previous post I explored Linux application support in ChromeOS and Chromebooks (a.k.a. Crostini). Of course I was bound to try running MongoDB in this environment, which I found to work really well (for development purposes). Here's my notes on running a MongoDB database and tools on a Chromebook with Linux (beta) enabled:
        • In ChromeOS, launch the Terminal app (which opens a Shell inside the 'Penguin' Linux container inside the 'Termina' Linux VM)
        • Run the following commands which are documented in the MongoDB Manual page on installing MongoDB Enterprise on Debian (following the manual's tab instructions titled “Debian 9 "Stretch”):
        wget -qO - https://www.mongodb.org/static/pgp/server-4.2.asc | sudo apt-key add -
        echo "deb http://repo.mongodb.com/apt/debian stretch/mongodb-enterprise/4.2 main" | sudo tee /etc/apt/sources.list.d/mongodb-enterprise.list
        sudo apt-get update
        sudo apt-get install -y mongodb-enterprise
        • Start a MongoDB database instance running:
        mkdir ~/data
        mongod --dbpath ~/data
        • Launch a second Terminal window and then run the Mongo Shell against this database and perform a quick database insert and query test:

        • Install Python 3 and the PIP Python package manager (using Anaconda) and then install the MongoDB Python driver (PyMongo):
        wget https://repo.anaconda.com/archive/Anaconda3-2019.10-Linux-x86_64.sh
        bash Anaconda3-*-Linux-x86_64.sh
        source ~/.bashrc
        python --version
        pip --version
        pip install --user pymongo
        • Test PyMongo by running a small ‘payments data generator’ Python script pulled down from a GitHub repository (this should insert records into the MongoDB local database’s “fs.payments” collection; after letting it run for a minute, continuously inserting new records, press Ctrl-C to stop it):
        git clone https://github.com/pkdone/PaymentsWriteReadConcerns.git
        cd PaymentsWriteReadConcerns/
        ./payments-records-loader.py -p 1
        • Download MongoDB Compass (use the Ubuntu 64-bit 14.04+ version), install and run it against the 'localhost' MongoDB database and inspect the contents of the “fs.payments” collection:
        wget https://downloads.mongodb.com/compass/mongodb-compass_1.20.4_amd64.deb
        sudo apt install ./mongodb-compass_*_amd64.deb

        Song for today: Sun. Tears. Red by Jambinai

        My Notes on Linux Application Support in ChromeOS (a.k.a. Crostini)

        These are my own rough notes from spending a few days studying Chrome OS and its Linux app support on a HP Chromebook 14* I got for free (retails for about £150) when I recently purchased a Google Pixel 4 Android mobile phone. I thought I’d share the notes in case they are of use to others. I’m sure there needs to be some corrections, so feedback is welcome.
         * released: 2019, model: db0003na, codename: careena, board: grunt

        Some references to other articles that I used to bootstrap my knowledge:

        Below are some screenshots showing the ChromeOS Settings section where “Linux (beta)” (a.k.a. Crostini) can be enabled and the Linux apps that are then installed by default when (essentially just the GNOME Help application and the Terminal application, from which many other Linux apps can subsequently be installed):

        Here is a diagram I put together to attempt to capture the architecture of Crostini in ChromeOS as I understand it (the rest of this document digs into the details behind some of these layers):

        ChromeOS & Crostini

        • Under the covers, ChromeOS is based on Gentoo and the Portage package manager
        • crosh (ChromeOS Developer Shell) is the pluggable command line shell/terminal for ChromeOS (in the Chrome browser, enter Ctrl-Alt-T to launch crosh inside a browser tab)
        • Crostini is the term for Linux application support in ChromeOS which manages the specific Linux VM and then the specific Linux container inside it, managing the lifecycle of when to launch them, mounting the filesystem to show the container’s files in the ChromeOS Files app, etc.. Crostini provides easy to use Linux application support integrated directly into the running ChromeOS desktop, rather than, for example, needing to dual boot or having to run a separate Linux VM and needing to explicitly switch, via the desktop, between ChromeOS and the Linux VM.
        • ChromeOS also has a Developer mode (verification is disabled when the OS boots) which is a special mode built into all Chromebooks to allow users and developers to access the code behind the Chrome Operating System and load their own builds of ChromeOS. This mode also allows users to install and run another Linux system like Ubuntu instead of ChromeOS (i.e. dual boot), but still have ChromeOS available to boot into too
        • As an alternative to Crostini, in addition to the dual-boot option, developer mode can also be used for Crouton which is a set of scripts that bundle up a chroot generator/environment to run both ChromeOS and Ubuntu at the same time. Here a Linux OS runs alongside ChromeOS, so users can switch between the ChromeOS desktop and Linux desktops via a keyboard shortcut. This gives users the ability to take advantage of both environments without needing to reboot. Unlike with virtualisation, a second OS is not being booted and instead the guest OS is running using the Chromium OS system. As a result any performance penalty is reduced because everything is run natively, and RAM is not being wasted to boot two OSes at the same time. Note, Crostini is different than this Crouton capability, as it enables the Linux shell and apps to be brought into the platform in verified (non-developer) mode with seamless user interface desktop integration and multi-layered security, in a supported way.
        • To use Crostini, from the ChromeOS Settings select ‘Linux (Beta)’ and choose to enable it, which, behind the scenes, will download and configure a specific Linux VM containing a specific Linux Container (see the next sections for more details) and it adds a launcher group to the ChromeOS desktop called ‘Linux Apps’. This launcher group includes a launcher to run a Linux shell/terminal application, called Terminal, which is displayed in the ChromeOS desktop but is connected directly inside the container

        Crostini Linux VM Layer

        • crosvm (ChromeOS Virtual Machine Monitor) is a custom virtual machine manager written in Rust that runs guest VMs via Linux's KVM hypervisor virtualisation layer and manages the low-level virtual I/O device communication (Amazon’s Firecracker is a fork of crosvm)
        • A specific VM is used to run a container rather than ChromeOS running a container directly, for security reasons because containers do not provide sufficient security isolation on their own. With the two layers, an adversary has to exploit crosvm via its limited interactions with the guest, in addition to the container, and the VM itself is heavily sandboxed.
        • The VM (and its container) are tied to a ChromeOS login session and as soon as a user logs out, all programs are shut down/killed by design (all user data lives in the user’s encrypted home to ensure nothing is leaked when a user log out). The VM, container and their data are persisted across user sessions and are kept in the same per-user encrypted storage as the rest of the browser's data.
        • KVM generally (rather than Crostini specifically) can execute multiple virtual machines running unmodified Linux or Windows images. Each virtual machine has private virtualised hardware: a network card, disk, graphics adapter, etc. The kernel component of KVM is included in mainline Linux codebase and the userspace component of KVM is included in mainline QEMU codebase
        • Termina is the VM launched by crosvm and is based on a ChromeOS (CrOS) image with a stripped-down ChromeOS Linux kernel and userland tools. The main goal is to just boot up Termina as quickly as possible, as a secure sandbox, and start running containers.
        • Currently, other custom VMs (other Linux variants, Windows, etc) cannot be run and only instances of the Termina VM image can be booted, although multiple VM instances can be run simultaneously based on the Termina image
        • vmc is the crosh command line utility to manually manage custom VM instances via Concierge (the ChromeOS daemon that manages VM/container life cycles)
        • To view the registered VM(s) from crosh (Ctrl-Alt-T), which may or may not be running, run:
        vmc list
        • To launch the Termina VM as a VM instance called ‘termina’ and open a shell directly in the VM, run:
        vmc start termina
        • With the above command, the default container in the VM will not be started automatically. However, instead, if from the ChromeOS desktop, a Linux Shell (Terminal) or other Linux App is launched (or the ‘Linux files’ app, Files , is launched) the Termina VM is automatically launched and the default container it owns is also automatically started
        • If the Termina VM is already running, to connect to it via a shell, run:
        vsh termina
        • If the ‘vmc start’ command is run with a different VM name, a new VM of that name will be created, launched and its shell entered from the existing terminal command line. This will use the same Termina image, and when running, ‘vmc list’ with list both VMs (the new instance doesn’t have any containers defined in it by default, ready to run, unlike the main Termina VM)
        • To stop the main Termina VM, run:
        vmc stop termina

        Crostini Container Layer

        • The Termina VM only supports running containers using the “Linux Containers” (LXC) technology at the moment and doesn’t support Docker or other container technologies
        • The default container instance launched via Termina is called Penguin and is based on Debian 9 with some custom packages
        • Containers are run inside a VM rather than programs running directly in the VM to help keep VM startup times low, to help improve security sandboxing by providing a stateless immutable VM image and to allow the container, its applications and their dependencies to be maintained independently from the VM, which otherwise may have contradicting dependecy requirements
        • LXC, generally, works in the vanilla Linux kernel requiring no additional patches to be applied to the kernel source and uses various kernel features to contain processes including kernel namespaces (ipc, uts, mount, pid, network and user), Apparmor and SELinux profiles, Seccomp policies, chroots (using pivot_root), CGroups (control groups). LXCFS provides the userspace (FUSE) filesystem providing overlay files for cpuinfo, meminfo, stat and uptime plus a cgroupfs compatible tree allowing unprivileged writes.
        • LXD is a higher level container framework, which Crostini uses and LXD uses its own specific image formats and also provides the ability to manage containers remotely. Although LXD uses LXC under the covers, it is based on more than just LXC. The Termina VM is configured to run the LXD daemon. Confusingly, the command line tool for controlling LXD is called ‘lxc’ (the ‘LXD Client). If users are using LXD commands to manage containers, they should avoid using any commands that start with ‘lxc-’ as these are lower level LXC commands. Users should avoid mixing and matching the use of both sets of commands in the same system. Crostini uses LXD to launch the Penguin container and LXD is configured to only allow unprivileged containers to be run, for added security. Therefore with Crostini, users should not use the lower level ‘lxc-’ commands because these can’t manage the LXD derived containers that Crostini uses. By default, LXD comes with 3 remote repositories providing images: 1) ubuntu: (for stable Ubuntu images), 2) ubuntu-daily: (for daily Ubuntu images), and 3) images: (for other distros)
        • In the Termina VM, the full LXC/LXD capabilities are provided, and remote images for many types of distros can be used to spawn multiple containers, in addition to the main Penguin container (these are not tested or certified though so may or may not work correctly)
        • Sommelier (a Wayland proxy compositor provides seamless X forwarding integration for content, input events, clipboard data, etc... between Linux apps and the ChromeOS desktop) and Garcon (a daemon for passing requests between the container and ChromeOS) binaries are bind-mounted into the main Penguin container. The Penguin container’s systemd is automatically configured to start these daemons. The libraries for these daemons are already present in the Penguin container LXD image used for Penguin (‘google:debian/stretch’). Other LXD containers launched in the VM don't seem to be enabled for their X based GUI apps to be displayed in the ChromeOS desktop, even if they use the special ‘google:debian/stretch’ LXD container image as it seems Crostini won’t attempt to integrate with this at runtime. Note: Some online articles imply it may be possible to get X-forwarding working from multiple containers.
        • In the Penguin container (which users can access directly, via the Terminal app launcher in the ChomeOS desktop), users can query the IP address of the container which is accessible from ChromeOS and can then run crosh (Ctrl-Alt-T) in ChromeOS and ping the IP address of the container directly. Users can also SSH from the ChromeOS desktop to the Penguin container using Google’s official SSH client that can be installed in Chrome via Chrome Web Store
        • If other containers are launched and then Google’s official SSH client is installed in ChromeOS (install ‘Secure Shell Extension’ via the Chrome Web Store), users can then define SFTP mount-points to other non-Penguin containers and the files in these containers will automatically appear in the Files app too 
        • From the Termina VM, users can use the standard LXD lxc command line tool to list containers and then to see if the Penguin container is running, by running:
        lxc list
        lxc info penguin | grep "Status: "

        • To check the logs for the Penguin container, run:
        lxc info --show-log penguin
        • To open a command line shell as root in the running container (note, the Terminal app has a different identity for connecting to the Penguin container, which is a non-root user), run
        lxc exec penguin -- /bin/bash
        • Within the Penguin container you can run GUI apps which automatically display in the main ChromeOS user interface. For example to install the GEdit text editor Linux application run the following (which also adds a launcher for GEdit in the ChromeOS desktop ‘Linux Apps’ launcher group):
        sudo apt install gedit

        • It is even possible to install and run a new Google Chrome browser installation from the Linux container, by running the following (which also adds a launcher for this Linux version of Chrome in the ChromeOS desktop ‘Linux Apps’ launcher group):
        wget https://dl.google.com/linux/direct/google-chrome-stable_current_amd64.deb
        sudo apt install ./google-chrome-stable_current_amd64.deb

        • From crosh (Ctrl-Alt-T), it is also possible to start the main container in the main VM (if not already started) and then connect a shell directly to the main container in the main VM, by running
        vmc container termina penguin
        vsh termina penguin

        Playing with Custom Containers

        • First of all launch crosh (Ctrl-Alt-T), and connect a shell to the Termina VM:
        vsh termina
        • Import Google’s own image repository into LXD to include the special Debian image used by Penguin:
        lxc remote list
        lxc remote add google https://storage.googleapis.com/cros-containers --protocol=simplestreams
        lxc remote list
        lxc image list google:
        lxc image info google:debian/stretch

        • Launch and test a container using Google’s special Debian 9 image:
        lxc launch google:debian/stretch mycrosdebiancontainer
        lxc list
        lxc exec mycrosdebiancontainer -- /bin/bash
        cat /etc/*elease*
        apt update && apt upgrade -y

        • Launch and test a container using a standard Ubuntu 18.04 image:
        lxc launch ubuntu:18.04 myubuntucontainer
        lxc list
        lxc exec myubuntucontainer -- /bin/bash
        cat /etc/*elease*
        apt update && apt upgrade -y

        • Launch and test a container using a standard Centos 7 image:
        lxc launch images:centos/7 mycentoscontainer
        lxc list
        lxc exec mycentoscontainer -- /bin/bash
        cat /etc/*elease*
        yum -y update

        • If the Chromebook is rebooted and the Termina VM restarted, these 3 containers still exist as they are persisted, but they will be in a stopped state. When the containers are then manually restarted they will still have the same settings, files and modifications that were made before they were stopped. To start a stopped container run (example shown for one of the containers):
        lxc start myubuntucontainer
        • None of the containers launched above seem to enable GUI apps (e.g. GEdit) to be forwarded automatically to the ChromeOS desktop. Even though the ‘google:debian/stretch’ based container has the relevant X forwarding libraries bundled, it doesn't seem to be automatically integrated with at runtime by the Crostini framework to enable X forwarding
        • Another way to launch a new container is to use one of the following commands, although, again, neither seem to automatically configure X-forwarding, even though they use the ‘google:debian/stretch’ image. It seems that only the Penguin container specifically is beiung managed by Crostini and has X forwarding configured (the first command below should be launched from ChromeOS crosh, the second command which is deprecated performs the same action but should be run from inside the Termina VM:
        vmc container termina mycontainer
        run_container.sh --container_name=mycontainer --user=jdoe --shell

        • Note, this may throw a timeout error similar to below, but the containers do seem to be created ok:
        Error: routine at frontends/vmc.rs:397 `container_create(vm_name,user_id_hash,container_name,image_server,image_alias)` failed: timeout while waiting for signal

        Song for today: The Desert Song, No.2 - live by Sophia

        Thursday, December 19, 2019

        Some Tips for Diagnosing Client Connection Issues for MongoDB Atlas


           [UPDATE 07-Sep-2020: I've now written an executable binary tool you can run which performs the equivalent of the checks in this blog post to diagnose connectivity issues to Atas or any other type of MongoDB deployment, downloadable from here]

        By default, for recent MongoDB drivers and client tools, MongoDB Atlas advertises the exposed URL for a deployed database cluster using a service name which maps to a set of DNS SRV records to provide an initial connection seed list. This results in a much more 'human digestible' URL, but more importantly, increases deployment flexibility and the ability for underlying database server hosts to migrate over time, without needing to subsequently reconfigure clients.

        For example, an Atlas Cluster may be referenced in a connection string by:


        ...as an alternative to the full connection endpoint list:


        It is worth noting though, whichever approach is used (explicitly defining all endpoints in the connection string or having it discovered via the DNS SRV service name), the connection URL seed list is only ever used for bootstrapping a client application to the database cluster, when the client first starts or when it later needs to restart. On start-up, the client uses the connection seed list to attempt to attach to any member of the cluster, and in fact, all but one of the endpoints could be incorrect and a successful cluster connection will still be achieved. Once the initial connection is made, the true cluster member endpoint list is dynamically and continuously shared between the cluster and the client at runtime. This enables the client to continue operating against the database even if the members of the database cluster change locations or identities over time. For example, after a year of a database cluster and application continuously running, there could be the need to increase database capacity by dynamically rotating the database hosts to new higher processing capacity machines. This all happens dynamically and the already running client application automatically becomes aware and leverages the new hosts without downtime and without needing to consult the connection string again. If the client application restarts though, it will need to read the updated connection string to be able to bootstrap a connection back up to the database cluster.

        In the rest of this post we will explore some of the ways initial client connectivity issues can be diagnosed and resolved when using DNS SRV based connection URLs. For reference, Joe Drumgoole provides a great explanation about how DNS SRV records work more generally, and how MongoDB drivers and tools can leverage these.

        Naive Connectivity Diagnosis

        If you are having connection problems with Atlas when using the SRV service name based URL, be weary of drawing the wrong conclusions regarding the cause of the connection problem...

        For example, lets say you can't connect an application to a cluster with the Atlas advertised URL of 'mongodb+srv://testcluster-abcd.mongodb.net' from your laptop. You may be tempted to try to debug the connection problem by running some of the following commands from your laptop:

        $ ping testcluster-abcd.mongodb.net
        ping: testcluster-abcd.mongodb.net: Name or service not known

        $ nc -zv -w 5 testcluster-abcd.mongodb.net 27017
        nc: getaddrinfo for host "testcluster-abcd.mongodb.net" port 27017: Name or service not known

        Neither of these work even if you actually do have Atlas connectivity configured correctly. This is because "testcluster-abcd.mongodb.net" is not the DNS name of a specific host endpoint. It is actually used by the MongoDB drivers and tools to dynamically lookup the DNS SRV records which have been populated for a service called 'testcluster-abcd.mongodb.net'.

        Useful Connectivity Diagnosis

        As documented in the MongoDB Drivers specification document and the MongoDB Manual, a DNS SRV query is performed by the drivers/tools by prepending the text '_mongodb._tcp.' to the service name. Therefore, to lookup the list of real endpoints for the Atlas cluster from your laptop using the DNS nslookup tool, you should run:

        $ nslookup -q=SRV _mongodb._tcp.testcluster-abcd.mongodb.net

        Non-authoritative answer:
        _mongodb._tcp.testcluster-abcd.mongodb.net service = 0 0 27017 testcluster-shard-00-02-abcd.mongodb.net.
        _mongodb._tcp.testcluster-abcd.mongodb.net service = 0 0 27017 testcluster-shard-00-01-abcd.mongodb.net.
        _mongodb._tcp.testcluster-abcd.mongodb.net service = 0 0 27017 testcluster-shard-00-00-abcd.mongodb.net.

        You can see that in this case that the database service name maps to 3 endpoints (i.e. the hosts of the 3 replica set members). You can then lookup the actual IP address of any one of these endpoints if you desire:

        $ nslookup testcluster-shard-00-00-abcd.mongodb.net

        Non-authoritative answer:
        testcluster-shard-00-00-abcd.mongodb.net canonical name = ec2-35-178-15-240.eu-west-2.compute.amazonaws.com.
        Name: ec2-35-178-15-240.eu-west-2.compute.amazonaws.com

        So to now debug your connectivity issue further you can use ping but this time by specifying one of the underlying host server endpoints for the database cluster:

        $ ping -c 3  testcluster-shard-00-00-abcd.mongodb.net
        PING ec2-35-178-15-240.eu-west-2.compute.amazonaws.com ( 56(84) bytes of data.
        64 bytes from ec2-35-178-15-240.eu-west-2.compute.amazonaws.com ( icmp_seq=1 ttl=51 time=10.2 ms
        64 bytes from ec2-35-178-15-240.eu-west-2.compute.amazonaws.com ( icmp_seq=2 ttl=51 time=9.73 ms
        64 bytes from ec2-35-178-15-240.eu-west-2.compute.amazonaws.com ( icmp_seq=3 ttl=51 time=11.7 ms

        --- ec2-35-178-15-240.eu-west-2.compute.amazonaws.com ping statistics ---
        3 packets transmitted, 3 received, 0% packet loss, time 2002ms
        rtt min/avg/max/mdev = 9.739/10.586/11.735/0.850 ms

        If this is successful it still doesn't necessarily mean that you can connect to the database service. The next thing to try is to see if you can actually open a socket connection to the mongod (or mongos) daemon process running on one of the endpoints, which you can achieve from your laptop using the netcat utility:

        $ nc -zv -w 5 testcluster-shard-00-00-abcd.mongodb.net 27017
        nc: connect to testcluster-shard-00-00-abcd.mongodb.net port 27017 (tcp) timed out: Operation now in progress

        If this doesn't connect but you are able to ping the endpoint host (as is the case in this example), it probably indicates that the IP address of your client laptop has not been added to the Atlas project's access list, which is easy to remedy via the Atlas Console:

        Once your laptop has been added to the access list, running netcat again should demonstrate that a socket connection can now be successfully made:

        $ nc -zv -w 5 testcluster-shard-00-00-abcd.mongodb.net 27017
        Connection to testcluster-shard-00-00-abcd.mongodb.net 27017 port [tcp/*] succeeded!

        If this connects, then it is advisable to move on to trying to connect to the database via the Mongo Shell.

        In this example screenshot, the Atlas console suggests the following Mongo Shell command line to use to connect:

         mongo "mongodb+srv://testcluster-abcd.mongodb.net/test" --username main_user

        With this connection string, some of you may be thinking how does the Shell know to connect to Atlas over SSL/TLS, what replica-set name it should request and what authentication source database it should specify to locate the user's credentials?

        Well, in addition to querying the DNS SRV records for the service, when dynamically constructing the initial bootstrap URL for the cluster, the MongoDB drivers/tools also lookup a DNS TXT record for the service which Atlas also populates for the deployed cluster. This TXT record contains the set of connection options, to be added as parameters to the dynamically constructed connecting string (e.g. 'ssl=true&replicaSet=TestCluster-shard-0&authSource=admin'). You can view what these parameter settings are for a particular Atlas cluster, yourself, by running the following DNS query:

        $ nslookup -q=TXT testcluster-abcd.mongodb.net

        Non-authoritative answer:
        testcluster-abcd.mongodb.net  text = "authSource=admin&replicaSet=TestCluster-shard-0"

        Note, the default behaviour for MongoDB drivers/tools using a 'mongodb+srv' based URL is defined as to enable SSL/TLS for the connection. As a result, 'ssl=true' doesn't have to be included in the DNS TXT record, as shown in the example above, because the drivers/tools will automatically add this parameter to the connection string on the fly.


        There's other potential causes of MongoDB Atlas connectivity issues that aren't covered in this post, but hopefully the tips highlighted here will help some of you, especially if you are diagnosing problems when using DNS SRV based service names in the connection URLs you use.

        Song for today: Lose the Baby by Tropical Fuck Storm

        Saturday, May 11, 2019

        Running a Mongo Shell Script From Within A Larger Bash Script

        If you have a Bash script that amongst other things needs to execute a set of multiple Mongo Shell commands together, there are a number of approaches that can be taken. This blog post contains nothing revelatory, but hopefully at least captures examples of these approaches in one single place for easy future reference. There are many situations where this is required, for example:
        • From within a Docker container image’s Entrypoint, running a Bash script which includes a section of Mongo Shell JavaScript code to configure a MongoDB replica-set, using rs.initiate() and associated commands.
        • From within a Continuous Integration process, running a Bash script which installs a MongoDB environment in a host Operating System (OS) and then populates the new MongoDB database with some sample data, using a set of Mongo Shell CRUD commands
        • From within a host system’s monitoring Bash script, which, in addition to gathering some host OS metrics, invokes a set of MongoDB’s server status and statistics commands to also capture database metrics.
        The rest of this blog post shows some of the different approaches that can be taken to execute a block of Mongo Shell JavaScript code from within a larger Bash script. In these specific examples a trivial block of JavaScript code will insert 2 records into a ‘persons’ database collection, then query and print both the records belonging to the collection and then remove the 2 records from the collection.

        It is worth noting that there is a difference in some of Mongo Shell’s behaviour when running a block of JavaScript code in the Mongo Shell’s Scripted mode rather than its Interactive mode, including the inability to run the Shell Helper commands (e.g. unable to utilise use db, show collections, etc.).


        This option requires executing a separate file which contains the block of JavaScript code. First create a new JavaScript file called test.js with the following content:

        db = db.getSiblingDB('testdb');
        db.persons.insertOne({'firstname': 'Sarah', 'lastname': 'Smith'});
        db.persons.insertOne({'firstname': 'John', 'lastname': 'Jones'});
        db.persons.find({}, {'_id': 0, 'firstname': 1}).forEach(printjson);

        Then create, make executable, and run a new Bash .sh script file with the following content (this will run the Mongo Shell in Scripted mode):

        echo "Doing some Bash script work first"
        mongo --quiet ./test.js
        echo "Doing some more Bash script work afterwards"


        This option involves executing the Mongo Shell with its eval option, passing in a single line containing each of the JavaScript commands separated by a semicolon. Create, make executable, and run a new Bash .sh script file with the following content (this will run the Mongo Shell in Scripted mode):

        echo "Doing some Bash script work first"
        mongo --quiet --eval "db = db.getSiblingDB('testdb'); db.persons.insertOne({'firstname': 'Sarah', 'lastname': 'Smith'}); db.persons.insertOne({'firstname': 'John', 'lastname': 'Jones'}); db.persons.find({}, {'_id': 0, 'firstname': 1}).forEach(printjson); print(db.persons.remove({}));"
        echo "Doing some more Bash script work afterwards"

        Note: Depending on your desktop resolution, your browser may show the Mongo Shell command wrapping onto multiple lines. However, it is actually just a single line, which can be proved by copying the line into a text editor which has its ‘text wrapping’ feature disabled.


        This option involves executing the Mongo Shell with its eval option, passing in a block of multiple lines of JavaScript code, where the start and end of the code block are delimited by single or double quotes. Create, make executable, and run a new Bash .sh script file with the following content (this will run the Mongo Shell in Scripted mode):

        echo "Doing some Bash script work first"
        mongo --quiet --eval "
            db = db.getSiblingDB('testdb');
            db.persons.insertOne({'firstname': 'Sarah', 'lastname': 'Smith'});
            db.persons.insertOne({'firstname': 'John', 'lastname': 'Jones'});
            db.persons.find({}, {'_id': 0, 'firstname': 1}).forEach(printjson);
        echo "Doing some more Bash script work afterwards"

        Note: Care has to be taken to ensure that any quotes used within the JavaScript code block are single-quotes, if the Mongo Shell’s eval delimiters are double-quotes, or vice versa.


        This option involves redirecting the content of a block of JavaScript multi-line code into the standard input (‘stdin’) stream of the Mongo Shell program, using a Bash Here-Document. Create, make executable, and run a new Bash .sh script file with the following content (unlike the other approaches this will run the Mongo Shell in Interactive mode):

        echo "Doing some Bash script work first"
        mongo --quiet <<EOF
            show dbs;
            db = db.getSiblingDB("testdb");
            db.persons.insertOne({'firstname': 'Sarah', 'lastname': 'Smith'});
            db.persons.insertOne({'firstname': 'John', 'lastname': 'Jones'});
            db.persons.find({}, {'_id': 0, 'firstname': 1}).forEach(printjson);
        echo "Doing some more Bash script work afterwards"

        In this case, because the Mongo Shell is run in Interactive mode, the output of the script will be more verbose. Also, by virtue of running in Interactive mode, the Shell Helpers commands can now be used within the JavaScript code. The block of code above contains the additional line show dbs; as the first line, to illustrate this. However, don’t take this example as a recommendation to use Shell Helpers in your scripts. Generally you should avoid using Shell Helpers in any of your Mongo Shell scripts, regardless of which approach you use.

        Also, because the Mongo Shell eval option is not being used, the JavaScript code can contain a mix of both single and double quotes, as illustrated by the modified line of code db = db.getSiblingDB("testdb"); shown above, which utilises double-quotes.

        Another Observation

        It is worth noting that for all of these four methods, apart from the External Script File method, you can reference Bash environment variables inline within the Mongo Shell JavaScript code (as long as double-quotes deliminate the code for the eval methods, rather than single-quotes). For example, from a Bash terminal if you have set a variable with the name of the database to write to...

        export DBNAME=testdb

        ... you can then use the value of this environment variable from within the inline Mongo Shell JavaScript...

        db = db.getSiblingDB('${DBNAME}');

        ...to factor out the database name. At face value this may not seem particularly powerful until you realise that many build frameworks (e.g. Docker Compose, Ansible, etc.) allow you to declare environment variables within configuration settings before invoking Bash scripts, to factor out environment specific settings.

        One bit of caution though, if you are using the MongoDB query operators, they include an ampersand in the syntax (e.g. '&gt', '&exists') which will need to be escaped in these scripts (e.g. '\&gt', '\&exists'). Otherwise Bash will treat each ampersand as a special control character which, in this case, will likely result in being replaced with some empty text.


        The following table summarises the main differences between the four approaches to running a JavaScript block of code with the Mongo Shell, from within a larger Bash script:

        Song for today: D. Feathers by Bettie Serveert