Sunday, December 29, 2019

Running MongoDB on ChromeOS (via Crostini)

In my previous post I explored Linux application support in ChromeOS and Chromebooks (a.k.a. Crostini). Of course I was bound to try running MongoDB in this environment, which I found to work really well (for development purposes). Here are my notes on running a MongoDB database and tools on a Chromebook with Linux (beta) enabled:
  • In ChromeOS, launch the Terminal app (which opens a Shell inside the 'Penguin' Linux container inside the 'Termina' Linux VM)
  • Run the following commands, which are documented in the MongoDB Manual page on installing MongoDB Enterprise on Debian (following the manual's tab of instructions titled “Debian 9 ‘Stretch’”):
wget -qO - https://www.mongodb.org/static/pgp/server-4.2.asc | sudo apt-key add -
echo "deb http://repo.mongodb.com/apt/debian stretch/mongodb-enterprise/4.2 main" | sudo tee /etc/apt/sources.list.d/mongodb-enterprise.list
sudo apt-get update
sudo apt-get install -y mongodb-enterprise
  • Start a MongoDB database instance running:
mkdir ~/data
mongod --dbpath ~/data
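  • (Optional) Alternatively, if you would rather not dedicate this Terminal window to the database process, mongod can be started in the background instead (a minimal sketch; the --fork option requires a log file path, here placed inside the data directory):
mongod --dbpath ~/data --logpath ~/data/mongod.log --fork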
  • Launch a second Terminal window, then run the Mongo Shell against this database and perform a quick insert, query and clean-up test:
mongo
db.mycoll.insert({a:1})
db.mycoll.find()
db.mycoll.drop()
exit

  • Install Python 3 and the PIP Python package manager (using Anaconda) and then install the MongoDB Python driver (PyMongo):
wget https://repo.anaconda.com/archive/Anaconda3-2019.10-Linux-x86_64.sh
bash Anaconda3-*-Linux-x86_64.sh
source ~/.bashrc
python --version
pip --version
pip install --user pymongo
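  • (Optional) To quickly confirm that PyMongo can talk to the locally running mongod from the earlier step, a one-liner like the following can be used (a minimal sketch; assumes mongod is still running on the default port 27017):
python -c "import pymongo; print(pymongo.MongoClient().server_info()['version'])"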
  • Test PyMongo by running a small ‘payments data generator’ Python script pulled down from a GitHub repository (this should continuously insert new records into the local MongoDB database’s “fs.payments” collection; after letting it run for a minute, press Ctrl-C to stop it):
git clone https://github.com/pkdone/PaymentsWriteReadConcerns.git
cd PaymentsWriteReadConcerns/
./payments-records-loader.py -p 1
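  • (Optional) While the generator script is running, the growing record count can be checked from another Terminal window with the Mongo Shell (a sketch only; it assumes, as the “fs.payments” namespace above suggests, that ‘payments’ is a collection in the ‘fs’ database):
mongo --quiet --eval "print(db.getSiblingDB('fs').payments.countDocuments({}))"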
  • Download MongoDB Compass (use the Ubuntu 64-bit 14.04+ version), install and run it against the 'localhost' MongoDB database and inspect the contents of the “fs.payments” collection:
wget https://downloads.mongodb.com/compass/mongodb-compass_1.20.4_amd64.deb
sudo apt install ./mongodb-compass_*_amd64.deb
mongodb-compass
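  • In Compass's initial connection dialog, the following plain localhost connection string should be sufficient for the local database started earlier, since no authentication or TLS was configured (an assumption: mongod is still listening on the default port 27017):
mongodb://localhost:27017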




Song for today: Sun. Tears. Red by Jambinai

My Notes on Linux Application Support in ChromeOS (a.k.a. Crostini)

These are my own rough notes from spending a few days studying ChromeOS and its Linux app support on an HP Chromebook 14* I got for free (it retails for about £150) when I recently purchased a Google Pixel 4 Android mobile phone. I thought I’d share the notes in case they are of use to others. I’m sure some corrections are needed, so feedback is welcome.
 * released: 2019, model: db0003na, codename: careena, board: grunt

Some references to other articles that I used to bootstrap my knowledge:

Below are some screenshots showing the ChromeOS Settings section where “Linux (beta)” (a.k.a. Crostini) can be enabled, and the Linux apps that are then installed by default (essentially just the GNOME Help application and the Terminal application, from which many other Linux apps can subsequently be installed):




Here is a diagram I put together to attempt to capture the architecture of Crostini in ChromeOS as I understand it (the rest of this document digs into the details behind some of these layers):

ChromeOS & Crostini

  • Under the covers, ChromeOS is based on Gentoo and the Portage package manager
  • crosh (ChromeOS Developer Shell) is the pluggable command line shell/terminal for ChromeOS (in the Chrome browser, enter Ctrl-Alt-T to launch crosh inside a browser tab)
  • Crostini is the term for Linux application support in ChromeOS. It manages the specific Linux VM, and the specific Linux container inside it, handling their lifecycle (when to launch them), mounting the filesystem so the container’s files appear in the ChromeOS Files app, and so on. Crostini provides easy-to-use Linux application support integrated directly into the running ChromeOS desktop, rather than, for example, needing to dual boot or having to run a separate Linux VM and explicitly switch, via the desktop, between ChromeOS and the Linux VM.
  • ChromeOS also has a Developer mode (verification is disabled when the OS boots) which is a special mode built into all Chromebooks to allow users and developers to access the code behind the Chrome Operating System and load their own builds of ChromeOS. This mode also allows users to install and run another Linux system like Ubuntu instead of ChromeOS (i.e. dual boot), but still have ChromeOS available to boot into too
  • As an alternative to Crostini, and in addition to the dual-boot option, developer mode can also be used for Crouton, which is a set of scripts that bundle up a chroot generator/environment to run both ChromeOS and Ubuntu at the same time. Here a Linux OS runs alongside ChromeOS, so users can switch between the ChromeOS desktop and Linux desktops via a keyboard shortcut. This gives users the ability to take advantage of both environments without needing to reboot. Unlike with virtualisation, a second OS is not being booted; instead the guest OS runs using the Chromium OS system. As a result any performance penalty is reduced because everything runs natively, and RAM is not being wasted to boot two OSes at the same time. Note, Crostini is different from this Crouton capability, as it enables the Linux shell and apps to be brought into the platform in verified (non-developer) mode, with seamless user interface desktop integration and multi-layered security, in a supported way.
  • To use Crostini, from the ChromeOS Settings select ‘Linux (Beta)’ and choose to enable it. Behind the scenes, this downloads and configures a specific Linux VM containing a specific Linux container (see the next sections for more details) and adds a launcher group to the ChromeOS desktop called ‘Linux Apps’. This launcher group includes a launcher to run a Linux shell/terminal application, called Terminal, which is displayed in the ChromeOS desktop but connects directly to a shell inside the container

Crostini Linux VM Layer

  • crosvm (ChromeOS Virtual Machine Monitor) is a custom virtual machine manager written in Rust that runs guest VMs via Linux's KVM hypervisor virtualisation layer and manages the low-level virtual I/O device communication (Amazon’s Firecracker is a fork of crosvm)
  • A specific VM is used to run a container rather than ChromeOS running a container directly, for security reasons because containers do not provide sufficient security isolation on their own. With the two layers, an adversary has to exploit crosvm via its limited interactions with the guest, in addition to the container, and the VM itself is heavily sandboxed.
  • The VM (and its container) are tied to a ChromeOS login session and as soon as a user logs out, all programs are shut down/killed by design (all user data lives in the user’s encrypted home to ensure nothing is leaked when a user logs out). The VM, container and their data are persisted across user sessions and are kept in the same per-user encrypted storage as the rest of the browser's data.
  • KVM generally (rather than Crostini specifically) can execute multiple virtual machines running unmodified Linux or Windows images. Each virtual machine has private virtualised hardware: a network card, disk, graphics adapter, etc. The kernel component of KVM is included in mainline Linux codebase and the userspace component of KVM is included in mainline QEMU codebase
  • Termina is the VM launched by crosvm and is based on a ChromeOS (CrOS) image with a stripped-down ChromeOS Linux kernel and userland tools. The main goal is to just boot up Termina as quickly as possible, as a secure sandbox, and start running containers.
  • Currently, other custom VMs (other Linux variants, Windows, etc) cannot be run and only instances of the Termina VM image can be booted, although multiple VM instances can be run simultaneously based on the Termina image
  • vmc is the crosh command line utility to manually manage custom VM instances via Concierge (the ChromeOS daemon that manages VM/container life cycles)
  • To view the registered VM(s) (which may or may not be running) from crosh (Ctrl-Alt-T), run:
vmc list
  • To launch the Termina VM as a VM instance called ‘termina’ and open a shell directly in the VM, run:
vmc start termina
  • With the above command, the default container in the VM will not be started automatically. However, if a Linux shell (Terminal) or other Linux app is launched from the ChromeOS desktop (or the ‘Linux files’ section of the Files app is opened), the Termina VM is automatically launched and the default container it owns is also automatically started
  • If the Termina VM is already running, to connect to it via a shell, run:
vsh termina
  • If the ‘vmc start’ command is run with a different VM name, a new VM of that name will be created, launched and its shell entered from the existing terminal command line. This will use the same Termina image, and, once running, ‘vmc list’ will list both VMs (the new instance doesn’t have any containers defined in it by default, ready to run, unlike the main Termina VM); see the example below
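  • For example, to create and enter a second VM instance based on the same Termina image (‘sandbox’ is just a hypothetical name), then, after exiting its shell, confirm that both VM instances are registered and stop the extra one, run:
vmc start sandbox
exit
vmc list
vmc stop sandbox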
  • To stop the main Termina VM, run:
vmc stop termina


Crostini Container Layer

  • The Termina VM only supports running containers using the “Linux Containers” (LXC) technology at the moment and doesn’t support Docker or other container technologies
  • The default container instance launched via Termina is called Penguin and is based on Debian 9 with some custom packages
  • Containers are run inside a VM, rather than programs running directly in the VM, to help keep VM startup times low, to help improve security sandboxing by providing a stateless immutable VM image, and to allow the container, its applications and their dependencies to be maintained independently from the VM, which otherwise might have conflicting dependency requirements
  • LXC, generally, works in the vanilla Linux kernel requiring no additional patches to be applied to the kernel source and uses various kernel features to contain processes including kernel namespaces (ipc, uts, mount, pid, network and user), Apparmor and SELinux profiles, Seccomp policies, chroots (using pivot_root), CGroups (control groups). LXCFS provides the userspace (FUSE) filesystem providing overlay files for cpuinfo, meminfo, stat and uptime plus a cgroupfs compatible tree allowing unprivileged writes.
  • LXD is a higher-level container framework, which Crostini uses. LXD uses its own specific image formats and also provides the ability to manage containers remotely. Although LXD uses LXC under the covers, it is based on more than just LXC. The Termina VM is configured to run the LXD daemon. Confusingly, the command line tool for controlling LXD is called ‘lxc’ (the ‘LXD client’). If users are using LXD commands to manage containers, they should avoid any commands that start with ‘lxc-’, as these are lower-level LXC commands, and should avoid mixing and matching the use of both sets of commands in the same system. Crostini uses LXD to launch the Penguin container, and LXD is configured to only allow unprivileged containers to be run, for added security. Therefore, with Crostini, users should not use the lower-level ‘lxc-’ commands because these can’t manage the LXD derived containers that Crostini uses. By default, LXD comes with 3 remote repositories providing images: 1) ubuntu: (for stable Ubuntu images), 2) ubuntu-daily: (for daily Ubuntu images), and 3) images: (for other distros)
  • In the Termina VM, the full LXC/LXD capabilities are provided, and remote images for many types of distros can be used to spawn multiple containers, in addition to the main Penguin container (these are not tested or certified though so may or may not work correctly)
  • Sommelier (a Wayland proxy compositor that provides seamless X forwarding integration for content, input events, clipboard data, etc. between Linux apps and the ChromeOS desktop) and Garcon (a daemon for passing requests between the container and ChromeOS) binaries are bind-mounted into the main Penguin container. The Penguin container’s systemd is automatically configured to start these daemons. The libraries for these daemons are already present in the LXD image used for Penguin (‘google:debian/stretch’). Other LXD containers launched in the VM don't seem to have their X based GUI apps displayed in the ChromeOS desktop, even if they use the special ‘google:debian/stretch’ LXD container image, as it seems Crostini won’t attempt to integrate with them at runtime. Note: some online articles imply it may be possible to get X-forwarding working from multiple containers.
  • In the Penguin container (which users can access directly via the Terminal app launcher in the ChromeOS desktop), users can query the IP address of the container, which is accessible from ChromeOS, and can then run crosh (Ctrl-Alt-T) in ChromeOS and ping that IP address directly; see the example below. Users can also SSH from the ChromeOS desktop to the Penguin container using Google’s official SSH client, which can be installed in Chrome via the Chrome Web Store
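  • For example, from the Terminal app (inside Penguin), print the container's IP address, and then from crosh (Ctrl-Alt-T) ping that address (illustrative only; substitute the placeholder below with the actual address printed by the first command):
hostname -I
ping <container-ip-address>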
  • If other containers are launched and then Google’s official SSH client is installed in ChromeOS (install ‘Secure Shell Extension’ via the Chrome Web Store), users can then define SFTP mount-points to other non-Penguin containers and the files in these containers will automatically appear in the Files app too 
  • From the Termina VM, users can use the standard LXD lxc command line tool to list containers and then to see if the Penguin container is running, by running:
lxc list
lxc info penguin | grep "Status: "

  • To check the logs for the Penguin container, run:
lxc info --show-log penguin
  • To open a command line shell as root in the running container (note, the Terminal app has a different identity for connecting to the Penguin container, which is a non-root user), run:
lxc exec penguin -- /bin/bash
  • Within the Penguin container you can run GUI apps which automatically display in the main ChromeOS user interface. For example, to install the GEdit text editor Linux application, run the following (which also adds a launcher for GEdit in the ChromeOS desktop ‘Linux Apps’ launcher group):
sudo apt install gedit

  • It is even possible to install and run a new Google Chrome browser installation from the Linux container, by running the following (which also adds a launcher for this Linux version of Chrome in the ChromeOS desktop ‘Linux Apps’ launcher group):
wget https://dl.google.com/linux/direct/google-chrome-stable_current_amd64.deb
sudo apt install ./google-chrome-stable_current_amd64.deb

  • From crosh (Ctrl-Alt-T), it is also possible to start the main container in the main VM (if not already started) and then connect a shell directly to it, by running:
vmc container termina penguin
vsh termina penguin


Playing with Custom Containers

  • First of all launch crosh (Ctrl-Alt-T), and connect a shell to the Termina VM:
vsh termina
  • Add Google’s own image repository to LXD as a remote, to make available the special Debian image used by Penguin:
lxc remote list
lxc remote add google https://storage.googleapis.com/cros-containers --protocol=simplestreams
lxc remote list
lxc image list google:
lxc image info google:debian/stretch

  • Launch and test a container using Google’s special Debian 9 image:
lxc launch google:debian/stretch mycrosdebiancontainer
lxc list
lxc exec mycrosdebiancontainer -- /bin/bash
cat /etc/*elease*
apt update && apt upgrade -y
exit

  • Launch and test a container using a standard Ubuntu 18.04 image:
lxc launch ubuntu:18.04 myubuntucontainer
lxc list
lxc exec myubuntucontainer -- /bin/bash
cat /etc/*elease*
apt update && apt upgrade -y
exit

  • Launch and test a container using a standard CentOS 7 image:
lxc launch images:centos/7 mycentoscontainer
lxc list
lxc exec mycentoscontainer -- /bin/bash
cat /etc/*elease*
yum -y update
exit

  • If the Chromebook is rebooted and the Termina VM restarted, these 3 containers still exist as they are persisted, but they will be in a stopped state. When the containers are then manually restarted they will still have the same settings, files and modifications that were made before they were stopped. To start a stopped container run (example shown for one of the containers):
lxc start myubuntucontainer
  • None of the containers launched above seem to enable GUI apps (e.g. GEdit) to be forwarded automatically to the ChromeOS desktop. Even though the ‘google:debian/stretch’ based container has the relevant X forwarding libraries bundled, the Crostini framework doesn't seem to integrate with it automatically at runtime to enable X forwarding
  • Another way to launch a new container is to use one of the following commands, although, again, neither seems to automatically configure X-forwarding, even though they use the ‘google:debian/stretch’ image. It seems that only the Penguin container specifically is being managed by Crostini and has X forwarding configured (the first command below should be launched from ChromeOS crosh; the second command, which is deprecated, performs the same action but should be run from inside the Termina VM):
vmc container termina mycontainer
run_container.sh --container_name=mycontainer --user=jdoe --shell

  • Note, this may throw a timeout error similar to the one below, but the containers do seem to be created OK:
Error: routine at frontends/vmc.rs:397 `container_create(vm_name,user_id_hash,container_name,image_server,image_alias)` failed: timeout while waiting for signal


Song for today: The Desert Song, No.2 - live by Sophia

Thursday, December 19, 2019

Some Tips for Diagnosing Client Connection Issues for MongoDB Atlas

Introduction


   [UPDATE 07-Sep-2020: I've now written an executable binary tool you can run which performs the equivalent of the checks in this blog post to diagnose connectivity issues to Atlas or any other type of MongoDB deployment, downloadable from here]

By default, for recent MongoDB drivers and client tools, MongoDB Atlas advertises the exposed URL for a deployed database cluster using a service name which maps to a set of DNS SRV records to provide an initial connection seed list. This results in a much more 'human digestible' URL, but more importantly, increases deployment flexibility and the ability for underlying database server hosts to migrate over time, without needing to subsequently reconfigure clients.

For example, an Atlas Cluster may be referenced in a connection string by:

 testcluster-abcd.mongodb.net

...as an alternative to the full connection endpoint list:

 testcluster-shard-00-00-abcd.mongodb.net:27017,testcluster-shard-00-01-abcd.mongodb.net:27017,testcluster-shard-00-02-abcd.mongodb.net:27017/test?replicaSet=TestCluster-shard-0

It is worth noting though, whichever approach is used (explicitly defining all endpoints in the connection string or having them discovered via the DNS SRV service name), the connection URL seed list is only ever used for bootstrapping a client application to the database cluster, when the client first starts or when it later needs to restart. On start-up, the client uses the connection seed list to attempt to attach to any member of the cluster; in fact, all but one of the endpoints could be incorrect and a successful cluster connection would still be achieved.

Once the initial connection is made, the true cluster member endpoint list is dynamically and continuously shared between the cluster and the client at runtime. This enables the client to continue operating against the database even if the members of the database cluster change locations or identities over time. For example, after a year of a database cluster and application continuously running, there could be the need to increase database capacity by dynamically rotating the database hosts to new higher processing capacity machines. This all happens dynamically, and the already running client application automatically becomes aware of, and leverages, the new hosts without downtime and without needing to consult the connection string again. If the client application restarts though, it will need to read the updated connection string to be able to bootstrap a connection back up to the database cluster.
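For example, bootstrapping the Mongo Shell via just one of the three member endpoints shown above still results in a connection to the whole replica set, because the client discovers the remaining members at runtime (an illustration only; it reuses the example hostnames from above, typical Atlas connection options, and the example user shown later in this post):

 mongo "mongodb://testcluster-shard-00-00-abcd.mongodb.net:27017/test?ssl=true&replicaSet=TestCluster-shard-0&authSource=admin" --username main_user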

In the rest of this post we will explore some of the ways initial client connectivity issues can be diagnosed and resolved when using DNS SRV based connection URLs. For reference, Joe Drumgoole provides a great explanation about how DNS SRV records work more generally, and how MongoDB drivers and tools can leverage these.

Naive Connectivity Diagnosis


If you are having connection problems with Atlas when using the SRV service name based URL, be wary of drawing the wrong conclusions regarding the cause of the connection problem...

For example, let's say you can't connect an application to a cluster with the Atlas advertised URL of 'mongodb+srv://testcluster-abcd.mongodb.net' from your laptop. You may be tempted to try to debug the connection problem by running some of the following commands from your laptop:

$ ping testcluster-abcd.mongodb.net
ping: testcluster-abcd.mongodb.net: Name or service not known

$ nc -zv -w 5 testcluster-abcd.mongodb.net 27017
nc: getaddrinfo for host "testcluster-abcd.mongodb.net" port 27017: Name or service not known

Neither of these works, even if you actually do have Atlas connectivity configured correctly. This is because "testcluster-abcd.mongodb.net" is not the DNS name of a specific host endpoint. It is actually used by the MongoDB drivers and tools to dynamically look up the DNS SRV records which have been populated for a service called 'testcluster-abcd.mongodb.net'.

Useful Connectivity Diagnosis


As documented in the MongoDB Drivers specification document and the MongoDB Manual, a DNS SRV query is performed by the drivers/tools by prepending the text '_mongodb._tcp.' to the service name. Therefore, to look up the list of real endpoints for the Atlas cluster from your laptop using the DNS nslookup tool, you should run:

$ nslookup -q=SRV _mongodb._tcp.testcluster-abcd.mongodb.net
Server: 127.0.0.53
Address: 127.0.0.53#53

Non-authoritative answer:
_mongodb._tcp.testcluster-abcd.mongodb.net service = 0 0 27017 testcluster-shard-00-02-abcd.mongodb.net.
_mongodb._tcp.testcluster-abcd.mongodb.net service = 0 0 27017 testcluster-shard-00-01-abcd.mongodb.net.
_mongodb._tcp.testcluster-abcd.mongodb.net service = 0 0 27017 testcluster-shard-00-00-abcd.mongodb.net.

You can see in this case that the database service name maps to 3 endpoints (i.e. the hosts of the 3 replica set members). You can then look up the actual IP address of any one of these endpoints if you desire:

$ nslookup testcluster-shard-00-00-abcd.mongodb.net
Server: 127.0.0.53
Address: 127.0.0.53#53

Non-authoritative answer:
testcluster-shard-00-00-abcd.mongodb.net canonical name = ec2-35-178-15-240.eu-west-2.compute.amazonaws.com.
Name: ec2-35-178-15-240.eu-west-2.compute.amazonaws.com
Address: 35.178.14.238

To debug your connectivity issue further, you can now use ping again, but this time specifying one of the underlying host server endpoints for the database cluster:

$ ping -c 3  testcluster-shard-00-00-abcd.mongodb.net
PING ec2-35-178-15-240.eu-west-2.compute.amazonaws.com (35.178.14.238) 56(84) bytes of data.
64 bytes from ec2-35-178-15-240.eu-west-2.compute.amazonaws.com (35.178.14.238): icmp_seq=1 ttl=51 time=10.2 ms
64 bytes from ec2-35-178-15-240.eu-west-2.compute.amazonaws.com (35.178.14.238): icmp_seq=2 ttl=51 time=9.73 ms
64 bytes from ec2-35-178-15-240.eu-west-2.compute.amazonaws.com (35.178.14.238): icmp_seq=3 ttl=51 time=11.7 ms

--- ec2-35-178-15-240.eu-west-2.compute.amazonaws.com ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2002ms
rtt min/avg/max/mdev = 9.739/10.586/11.735/0.850 ms

If this is successful it still doesn't necessarily mean that you can connect to the database service. The next thing to try is to see if you can actually open a socket connection to the mongod (or mongos) daemon process running on one of the endpoints, which you can achieve from your laptop using the netcat utility:

$ nc -zv -w 5 testcluster-shard-00-00-abcd.mongodb.net 27017
nc: connect to testcluster-shard-00-00-abcd.mongodb.net port 27017 (tcp) timed out: Operation now in progress

If this doesn't connect but you are able to ping the endpoint host (as is the case in this example), it probably indicates that the IP address of your client laptop has not been added to the Atlas project's access list, which is easy to remedy via the Atlas Console:


Once your laptop has been added to the access list, running netcat again should demonstrate that a socket connection can now be successfully made:

$ nc -zv -w 5 testcluster-shard-00-00-abcd.mongodb.net 27017
Connection to testcluster-shard-00-00-abcd.mongodb.net 27017 port [tcp/*] succeeded!

If this connects, then it is advisable to move on to trying to connect to the database via the Mongo Shell.


In this example screenshot, the Atlas console suggests the following Mongo Shell command line to use to connect:

 mongo "mongodb+srv://testcluster-abcd.mongodb.net/test" --username main_user

With this connection string, some of you may be wondering how the Shell knows to connect to Atlas over SSL/TLS, what replica-set name it should request, and what authentication source database it should specify to locate the user's credentials.

Well, in addition to querying the DNS SRV records for the service, when dynamically constructing the initial bootstrap URL for the cluster, the MongoDB drivers/tools also look up a DNS TXT record for the service, which Atlas also populates for the deployed cluster. This TXT record contains the set of connection options to be added as parameters to the dynamically constructed connection string (e.g. 'ssl=true&replicaSet=TestCluster-shard-0&authSource=admin'). You can view what these parameter settings are for a particular Atlas cluster, yourself, by running the following DNS query:

$ nslookup -q=TXT testcluster-abcd.mongodb.net
Server: 127.0.0.53
Address: 127.0.0.53#53

Non-authoritative answer:
testcluster-abcd.mongodb.net  text = "authSource=admin&replicaSet=TestCluster-shard-0"

Note, the default behaviour for MongoDB drivers/tools using a 'mongodb+srv' based URL is to enable SSL/TLS for the connection. As a result, 'ssl=true' doesn't have to be included in the DNS TXT record, as shown in the example above, because the drivers/tools will automatically add this parameter to the connection string on the fly.
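Putting the SRV and TXT lookups together, the short 'mongodb+srv://' URL is effectively expanded by the drivers/tools into a traditional connection string along the lines of the following (an illustration only, based on the example SRV and TXT records shown above):

 mongodb://testcluster-shard-00-00-abcd.mongodb.net:27017,testcluster-shard-00-01-abcd.mongodb.net:27017,testcluster-shard-00-02-abcd.mongodb.net:27017/test?ssl=true&replicaSet=TestCluster-shard-0&authSource=admin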

Summary


There are other potential causes of MongoDB Atlas connectivity issues that aren't covered in this post, but hopefully the tips highlighted here will help some of you, especially if you are diagnosing problems when using DNS SRV based service names in the connection URLs you use.


Song for today: Lose the Baby by Tropical Fuck Storm

Saturday, May 11, 2019

Running a Mongo Shell Script From Within A Larger Bash Script

[EDIT May 2023: The post below was written for the legacy 'mongo' shell but has since been tested with the modern 'mongosh' shell, which behaves the same with no issues.]

If you have a Bash script that amongst other things needs to execute a set of multiple Mongo Shell commands together, there are a number of approaches that can be taken. This blog post contains nothing revelatory, but hopefully at least captures examples of these approaches in one single place for easy future reference. There are many situations where this is required, for example:
  • From within a Docker container image’s Entrypoint, running a Bash script which includes a section of Mongo Shell JavaScript code to configure a MongoDB replica-set, using rs.initiate() and associated commands.
  • From within a Continuous Integration process, running a Bash script which installs a MongoDB environment in a host Operating System (OS) and then populates the new MongoDB database with some sample data, using a set of Mongo Shell CRUD commands
  • From within a host system’s monitoring Bash script, which, in addition to gathering some host OS metrics, invokes a set of MongoDB’s server status and statistics commands to also capture database metrics.
The rest of this blog post shows some of the different approaches that can be taken to execute a block of Mongo Shell JavaScript code from within a larger Bash script. In these specific examples a trivial block of JavaScript code will insert 2 records into a ‘persons’ database collection, query and print both records, and then remove the 2 records from the collection.

It is worth noting that there is a difference in some of Mongo Shell’s behaviour when running a block of JavaScript code in the Mongo Shell’s Scripted mode rather than its Interactive mode, including the inability to run the Shell Helper commands (e.g. unable to utilise use db, show collections, etc.).
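For example, the first of the following one-liners fails in Scripted mode because 'show dbs' is a shell helper rather than real JavaScript, whereas the second, which uses the equivalent JavaScript method, works (a quick illustration, assuming a locally accessible MongoDB deployment):

mongo --quiet --eval "show dbs"
mongo --quiet --eval "printjson(db.getMongo().getDBNames())"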


1. EXTERNAL SCRIPT FILE


This option requires executing a separate file which contains the block of JavaScript code. First create a new JavaScript file called test.js with the following content:

db = db.getSiblingDB('testdb');
db.persons.insertOne({'firstname': 'Sarah', 'lastname': 'Smith'});
db.persons.insertOne({'firstname': 'John', 'lastname': 'Jones'});
db.persons.find({}, {'_id': 0, 'firstname': 1}).forEach(printjson);
print(db.persons.remove({}));

Then create, make executable, and run a new Bash .sh script file with the following content (this will run the Mongo Shell in Scripted mode):

#!/bin/bash
echo "Doing some Bash script work first"
mongo --quiet ./test.js
echo "Doing some more Bash script work afterwards"


2. SINGLE-LINE EVAL SCRIPT


This option involves executing the Mongo Shell with its eval option, passing in a single line containing each of the JavaScript commands separated by a semicolon. Create, make executable, and run a new Bash .sh script file with the following content (this will run the Mongo Shell in Scripted mode):

#!/bin/bash
echo "Doing some Bash script work first"
mongo --quiet --eval "db = db.getSiblingDB('testdb'); db.persons.insertOne({'firstname': 'Sarah', 'lastname': 'Smith'}); db.persons.insertOne({'firstname': 'John', 'lastname': 'Jones'}); db.persons.find({}, {'_id': 0, 'firstname': 1}).forEach(printjson); print(db.persons.remove({}));"
echo "Doing some more Bash script work afterwards"

Note: Depending on your desktop resolution, your browser may show the Mongo Shell command wrapping onto multiple lines. However, it is actually just a single line, which can be proved by copying the line into a text editor which has its ‘text wrapping’ feature disabled.


3. MULTI-LINE EVAL SCRIPT


This option involves executing the Mongo Shell with its eval option, passing in a block of multiple lines of JavaScript code, where the start and end of the code block are delimited by single or double quotes. Create, make executable, and run a new Bash .sh script file with the following content (this will run the Mongo Shell in Scripted mode):

#!/bin/bash
echo "Doing some Bash script work first"
mongo --quiet --eval "
    db = db.getSiblingDB('testdb');
    db.persons.insertOne({'firstname': 'Sarah', 'lastname': 'Smith'});
    db.persons.insertOne({'firstname': 'John', 'lastname': 'Jones'});
    db.persons.find({}, {'_id': 0, 'firstname': 1}).forEach(printjson);
    print(db.persons.remove({}));
"
echo "Doing some more Bash script work afterwards"

Note: Care has to be taken to ensure that any quotes used within the JavaScript code block are single-quotes, if the Mongo Shell’s eval delimiters are double-quotes, or vice versa.


4. MULTI-LINE SCRIPT WITH HERE-DOC


This option involves redirecting the content of a block of JavaScript multi-line code into the standard input (‘stdin’) stream of the Mongo Shell program, using a Bash Here-Document. Create, make executable, and run a new Bash .sh script file with the following content (unlike the other approaches this will run the Mongo Shell in Interactive mode):

#!/bin/bash
echo "Doing some Bash script work first"
mongo --quiet <<EOF
    show dbs;
    db = db.getSiblingDB("testdb");
    db.persons.insertOne({'firstname': 'Sarah', 'lastname': 'Smith'});
    db.persons.insertOne({'firstname': 'John', 'lastname': 'Jones'});
    db.persons.find({}, {'_id': 0, 'firstname': 1}).forEach(printjson);
    print(db.persons.remove({}));
EOF
echo "Doing some more Bash script work afterwards"

In this case, because the Mongo Shell is run in Interactive mode, the output of the script will be more verbose. Also, by virtue of running in Interactive mode, the Shell Helpers commands can now be used within the JavaScript code. The block of code above contains the additional line show dbs; as the first line, to illustrate this. However, don’t take this example as a recommendation to use Shell Helpers in your scripts. Generally you should avoid using Shell Helpers in any of your Mongo Shell scripts, regardless of which approach you use.

Also, because the Mongo Shell eval option is not being used, the JavaScript code can contain a mix of both single and double quotes, as illustrated by the modified line of code db = db.getSiblingDB("testdb"); shown above, which utilises double-quotes.


Another Observation


It is worth noting that for all of these four methods, apart from the External Script File method, you can reference Bash environment variables inline within the Mongo Shell JavaScript code (as long as double-quotes delimit the code for the eval methods, rather than single-quotes). For example, from a Bash terminal if you have set a variable with the name of the database to write to...

export DBNAME=testdb

... you can then use the value of this environment variable from within the inline Mongo Shell JavaScript...

db = db.getSiblingDB('${DBNAME}');

...to factor out the database name. At face value this may not seem particularly powerful until you realise that many build frameworks (e.g. Docker Compose, Ansible, etc.) allow you to declare environment variables within configuration settings before invoking Bash scripts, to factor out environment specific settings.
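For example, here is the earlier multi-line eval script modified to pick the database name up from the DBNAME environment variable (a minimal sketch; it assumes DBNAME has been exported as shown above):

#!/bin/bash
echo "Doing some Bash script work first"
mongo --quiet --eval "
    db = db.getSiblingDB('${DBNAME}');
    db.persons.insertOne({'firstname': 'Sarah', 'lastname': 'Smith'});
    db.persons.insertOne({'firstname': 'John', 'lastname': 'Jones'});
    db.persons.find({}, {'_id': 0, 'firstname': 1}).forEach(printjson);
    print(db.persons.remove({}));
"
echo "Doing some more Bash script work afterwards"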

One bit of caution though: if you are using the MongoDB query operators, they include a dollar sign in their syntax (e.g. '$gt', '$exists') which will need to be escaped in these scripts (e.g. '\$gt', '\$exists'). Otherwise Bash will treat each dollar sign as the start of a variable reference which, in this case, will likely result in it being replaced with some empty text.
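For example, a find() using the $exists operator needs the dollar sign escaping when the JavaScript block is delimited by double-quotes (illustrative only; it reuses the same ‘persons’ collection and DBNAME variable from above):

mongo --quiet --eval "
    db = db.getSiblingDB('${DBNAME}');
    db.persons.find({'lastname': {\$exists: true}}, {'_id': 0, 'firstname': 1}).forEach(printjson);
"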


Summary


The following table summarises the main differences between the four approaches to running a JavaScript block of code with the Mongo Shell, from within a larger Bash script:



Song for today: D. Feathers by Bettie Serveert