Like many growth companies in the software-as-a-service field, Duetto has scaled rapidly every year, not only to serve more customers but also to add more features to our core GameChanger app. MongoDB, both the product and the company, has been an important part of our growth story as a hotel technology company.
[Editor’s note: This is the second entry in a three-part “Building a Better RMS” series, in which Duetto CTO Craig Weissman explains the finer points of the cloud-based architecture that powers Duetto and its GameChanger application. Read Craig’s first post here, about why Duetto runs GameChanger exclusively on Amazon Web Services.]
We rely heavily on MongoDB as our database, not only for the metadata that controls our application but also for all of our structured customer “Big Data.” We also use MongoDB for other operational persistence, including queuing, logging, temporary storage and cache storage.
Many architects have recently discussed the topic of “polyglot” storage for modern web-scale applications. This refers to the idea of using multiple heterogeneous data stores in the back end of an application, with each data store tailored to the purpose it serves best. This idea has a lot of merit, and in fact Duetto uses AWS S3 where appropriate for its high reliability and low storage costs. But interestingly, we have been able to use MongoDB, at least at our current scale, for all of the purposes mentioned above and described in more detail below.
Multitasking With MongoDB
The flip side of the polyglot argument is that managing multiple data stores brings a fair bit of operational complexity. At Duetto we have stayed “lean and mean,” and MongoDB has been a big part of that. When used properly and for our use cases, it turns out to solve many problems reliably.
I like to describe MongoDB as “Goldilocks”: it’s neither too simple nor too complex. It scales reasonably well (for Duetto data volumes and needs) but not as well as some other NoSQL data stores, which are needed at global web-scale companies. MongoDB has an excellent API and application idioms but lacks full SQL and transactional semantics.
MongoDB has its detractors among “serious” database experts, but if one accepts its limitations and works around them, it makes for a highly productive and reliable environment.
Managing Big Data, While Getting Bigger
I should also mention that MongoDB (the company) has innovated quite a lot in the nearly five years we have been users (and customers of their support contract). Many useful application concepts have been rolled out in that timeframe, including Bulk DML verbs, a rich aggregation framework, and a more powerful query analyzer and indexing schemes.
But certainly the most important innovation for Duetto has been MongoDB’s incorporation of the WiredTiger storage engine, which compresses data on disk and allows Duetto to store nearly 10 times the customer data on the same physical hardware as the original memory-mapped (MMAPv1) engine. We use WiredTiger to store our Big (really “Medium”) Data: all of the bookings, blocks, folios and shopping events.
We also store pre-computations (materialized views) of our OLAP-style reporting results in MongoDB. Often our runtime application answers a request using only these cached materialized results, which amounts to a direct cache lookup. (We could use Memcached or Redis in the future, but so far MongoDB has sufficed for this purpose as well.)
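The precompute-then-look-up pattern can be sketched in a few lines of plain Python. This is only an illustration: the field names (`hotel_id`, `stay_date`, `rooms`, `revenue`) and the in-memory dictionary are assumptions, standing in for a MongoDB collection whose documents are keyed by hotel and stay day.

```python
from collections import defaultdict

# Hypothetical source records; the field names are illustrative only.
bookings = [
    {"hotel_id": "H1", "stay_date": "2016-07-15", "rooms": 2, "revenue": 400.0},
    {"hotel_id": "H1", "stay_date": "2016-07-15", "rooms": 1, "revenue": 180.0},
    {"hotel_id": "H1", "stay_date": "2016-07-16", "rooms": 3, "revenue": 610.0},
    {"hotel_id": "H2", "stay_date": "2016-07-15", "rooms": 5, "revenue": 950.0},
]


def build_materialized_view(records):
    """Pre-aggregate rooms and revenue per (hotel_id, stay_date)."""
    view = defaultdict(lambda: {"rooms": 0, "revenue": 0.0})
    for rec in records:
        key = (rec["hotel_id"], rec["stay_date"])
        view[key]["rooms"] += rec["rooms"]
        view[key]["revenue"] += rec["revenue"]
    return dict(view)


def read_report(view, hotel_id, stay_date):
    """Runtime read: a single key lookup, with no scan of the source data."""
    return view.get((hotel_id, stay_date))


view = build_materialized_view(bookings)
```

The expensive aggregation runs once when the view is built; each report read afterwards is a key lookup, which is why a general-purpose store can double as the cache.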
To build these query results from the source data, we perform index scans in MongoDB (queries usually lead with hotel ID and the target stay days of interest). This type of workload is a typical range scan, which could also be implemented in larger Big Data stores such as HBase or Cassandra. However, MongoDB has handled our workloads just fine so far and provides the indexing “out of the box,” so we don’t have to build it ourselves.
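To see why leading with hotel ID and then stay day turns the query into a range scan, here is a small pure-Python model of a sorted compound index. It is a sketch, not MongoDB's implementation, and the entry layout and names are assumptions. In MongoDB terms it corresponds roughly to a compound index on hotel ID followed by stay date, queried with an equality match on the hotel and `$gte`/`$lte` bounds on the date.

```python
import bisect
from datetime import date, timedelta

# Hypothetical index entries, kept sorted: (hotel_id, stay_date, document_id).
index = sorted([
    ("H1", date(2016, 7, 14), "b1"),
    ("H1", date(2016, 7, 15), "b2"),
    ("H1", date(2016, 7, 16), "b3"),
    ("H1", date(2016, 7, 20), "b4"),
    ("H2", date(2016, 7, 15), "b5"),
])


def range_scan(index, hotel_id, first_day, last_day):
    """Return document ids for one hotel with first_day <= stay_date <= last_day."""
    # A 2-tuple sorts before every 3-tuple sharing its prefix, so these two
    # binary searches bracket exactly the contiguous run of matching entries.
    lo = bisect.bisect_left(index, (hotel_id, first_day))
    hi = bisect.bisect_left(index, (hotel_id, last_day + timedelta(days=1)))
    return [doc_id for _, _, doc_id in index[lo:hi]]
```

Because all entries for one hotel sit next to each other in stay-date order, the scan touches only the matching slice rather than the whole dataset.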
Why Duetto Does Its Own Sharding
In terms of data scale-out, we have built our own sharding scheme on top of MongoDB. Sharding refers to spreading a large dataset across multiple machines in a cluster.
Duetto uses multiple MongoDB clusters and multiple databases within those clusters to scale our hardware and keep each of our databases at a reasonable size for movement and restore.
Duetto chose to write its own multi-tenant data migration tools, since the application layer often has more knowledge of the data being sharded than the underlying data store does. Also, to be frank, while MongoDB provides its own generic sharding solution, such solutions tend to be operationally complex and are among the areas more likely to have vendor bugs, as we have seen in MongoDB patch notifications. So although it might be seen as controversial, we decided to “roll our own” infrastructure for this important concept.
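One common way to do sharding at the application layer is a directory that maps each tenant to a cluster and database. The sketch below assumes that approach purely for illustration; the post does not say how Duetto's tools actually work, and every name here (`ShardLocation`, `TenantRouter`, `copy_data`) is hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ShardLocation:
    cluster: str   # hypothetical cluster name
    database: str  # hypothetical database name within that cluster


class TenantRouter:
    """Application-level directory mapping each tenant to one database."""

    def __init__(self):
        self._directory = {}

    def assign(self, tenant_id, location):
        self._directory[tenant_id] = location

    def locate(self, tenant_id):
        try:
            return self._directory[tenant_id]
        except KeyError:
            raise LookupError(f"no shard assigned for tenant {tenant_id!r}") from None

    def migrate(self, tenant_id, new_location, copy_data):
        """Copy the tenant's data, then switch the directory entry."""
        old_location = self.locate(tenant_id)
        copy_data(tenant_id, old_location, new_location)  # caller-supplied copy step
        self._directory[tenant_id] = new_location
        return old_location


router = TenantRouter()
router.assign("hotel-123", ShardLocation("cluster-a", "db-07"))
```

Since the application knows which tenant owns every record, a whole tenant can be moved between databases as a unit, which also keeps each database small enough to move and restore.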
In summary, MongoDB serves many purposes well and continues to grow and scale with our needs.
By Craig Weissman, CTO and Co-Founder. Published July 15, 2016.