
At SparkPost we send over 25% of all non-spam email, but how do we account for all of those messages? Humans! Our well-equipped team calculates over 40 metrics and gives you the ability to dissect them any way you like. Well, that was our first stab at this capability, but we quickly realized it was too much work for them to do. Joking aside, scalable reporting is a challenging problem because we have to process millions of events per minute and make sure they can be queried in a timely manner. I hope to explain in some detail how we’ve solved this problem.

Sending email is great, and we tell you we’re the best email service out there. However, if you’re like me, you need some hard evidence for that claim. You may not realize it, but a lot happens when you send an email: Momentum (our email engine) receives a message (injection), the ISP receives it (delivery), the recipient’s mailbox is full (soft bounce), and so on. Seeing this from a 10,000-foot view is great for reporting back to your boss, but you’ll also need to diagnose problems by inspecting the lifecycle of a single message. We have solutions for both, and the architectural decisions we made allow us to easily add new uses for this data. We implemented a strategy called ETL (extract, transform, and load). This pattern is what powers Metrics, Message Events, Webhooks, and Suppressions. I’ll walk you through this process, starting with the Event Hose.

Event Hose

The event hose is responsible for keeping track of all the different events that can occur during the course of sending messages (injection, delivery, bounce, rejection, etc.). As these events occur, it logs them as JSON to an exchange in RabbitMQ. By offloading the queuing of these events from Momentum, it provides a clean separation and allows our operations team to scale this cluster of servers independently. If you are familiar with the pub-sub model, the event hose acts as the publisher. However, without subscribers, these messages would float into the ether. What are those subscribers and how do they work, you ask? Read on!
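
To give a feel for the shape of this data, here is a simplified, hypothetical example of a single delivery event as it might be published to the exchange. The real events carry many more fields; the values below are placeholders:

```json
{
  "type": "delivery",
  "timestamp": "1468937220",
  "message_id": "000443ee14578172be22",
  "rcpt_to": "recipient@example.com",
  "campaign_id": "summer-sale",
  "ip_pool": "shared"
}
```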

Metrics ETL

The Metrics ETL is the first of a collection of subscribers in this stack. It is a clustered Node.js process that binds to a RabbitMQ queue. This process receives messages from the queue as they are emitted, transforms the data to fit the schema of a database called Vertica, and loads it in batches.
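
As a rough sketch of this pattern (not our production code), a consumer built with the amqplib package might bind to a queue, transform each event, and flush a batch once it reaches a size threshold. The queue name, batch size, and the transform/load functions below are illustrative:

```javascript
'use strict';

const amqp = require('amqplib');

const BATCH_SIZE = 1000; // illustrative; in practice this is tuned for Vertica load performance
let batch = [];

// Placeholder for the real transform step, which maps a raw event onto the Vertica schema
function transformToMetricsRow(event) {
  return {
    event_type: event.type,
    event_time: event.timestamp,
    domain: event.rcpt_to && event.rcpt_to.split('@')[1]
  };
}

// Placeholder for the real load step, which bulk-copies rows into Vertica
function loadBatch(rows) {
  console.log('loading %d rows', rows.length);
}

amqp.connect('amqp://localhost')
  .then((conn) => conn.createChannel())
  .then((channel) =>
    channel.assertQueue('metrics-etl').then(() =>
      channel.consume('metrics-etl', (msg) => {
        batch.push(transformToMetricsRow(JSON.parse(msg.content.toString())));
        channel.ack(msg);

        if (batch.length >= BATCH_SIZE) {
          loadBatch(batch);
          batch = [];
        }
      })
    )
  );
```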

Message Events ETL

Like the Metrics ETL, this is a clustered Node.js process. However, it binds to a different queue and has independent control over how it processes the data. It also loads into Vertica, but into a more flexible schema called a flex table. Think of it as MongoDB on steroids. As mentioned in the intro, there are other uses of this data that I won’t get into today: Webhooks and Suppressions, for example, are driven by their own processes and logic for handling the same events.

Vertica

We spent a great deal of time vetting analytic database solutions to fit our many different use cases. Vertica’s big advantage is projections, which are similar to materialized views: they let us store raw event data and model very complex queries for all the different ad-hoc drill-downs and groupings (domain, time series, sending pool, etc.). Lastly, Vertica is horizontally scalable, allowing us to easily add new nodes as our load and data set grow.

HTTP API Layer

As the processes described above load the data, users can retrieve it from several API endpoints. As explained in How SparkPost Built the Best Email API for Developers, we use RESTful web APIs. The Metrics API provides a variety of endpoints enabling you to retrieve a summary of the data, data grouped by a specific qualifier, or data by event type. This means I can see statistics about my emails at an aggregate level, within a time range, grouped by domain, subaccount, IP pool, sending domain, and so on. The capabilities are extensive, and our users love the different dissections of the data they can retrieve in near real-time. We retain this information for six months to allow for trending over time. If you need the data longer than that, we encourage you to set up a webhook and load the data into your own business intelligence tools.
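
For example, a deliverability summary grouped by recipient domain can be pulled with a single GET request. The date range and metric list below are only illustrative; see the Metrics API documentation for the full set of endpoints and parameters:

```javascript
'use strict';

const https = require('https');

// Deliverability metrics grouped by recipient domain over an example one-week window
const options = {
  hostname: 'api.sparkpost.com',
  path: '/api/v1/metrics/deliverability/domain' +
    '?from=2016-07-01T00:00&to=2016-07-07T23:59' +
    '&metrics=count_injected,count_delivered,count_bounce',
  headers: { Authorization: process.env.SPARKPOST_API_KEY }
};

https.get(options, (res) => {
  let body = '';
  res.on('data', (chunk) => (body += chunk));
  res.on('end', () => console.log(JSON.parse(body).results));
});
```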

The Message Events API allows a user to search the raw events logged by the event hose described above. These events are retained for 10 days and are intended for more immediate debugging of your messages (push, email, etc.).
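
Here is a quick sketch of what such a search can look like from Node.js using the node-sparkpost client. The recipient address is a placeholder, and the exact shape of the response is worth checking against the docs:

```javascript
'use strict';

const SparkPost = require('sparkpost');
const client = new SparkPost(process.env.SPARKPOST_API_KEY);

// Search recent raw events for a single recipient, filtered to bounces,
// which is handy when debugging why a particular message never arrived.
client.messageEvents.search({ events: 'bounce', recipients: 'someone@example.com' })
  .then((data) => console.log(data.results))
  .catch((err) => console.error(err));
```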

Web User Interface

We built SparkPost for the developer first, but we understand that not all of our users are technical. We provide a Reports UI that allows a user to drill down by many different facets, such as recipient domain, sending domain, IP pool, and campaign. It is built using the same APIs mentioned above.

Conclusion

I hope this sheds some light on how SparkPost processes a large amount of data and makes it available to you. We’re also currently working on re-architecting everything I just talked about. We’ve learned a lot over the first 18 months of SparkPost, especially about managing many different tiers of our own infrastructure. I’ve personally spent many hours triaging and fighting fires around RabbitMQ and Vertica. We have decided to leverage a service-based message queue, SQS, and are starting to investigate service-based alternatives to Vertica. I plan on writing a follow-up to this later in the year, so stay tuned!

Our knowledgeable staff also uses this data to ensure you’re making the best decisions when sending your messages with SparkPost. I encourage you to use these APIs and the web UI to start digging into how your messages are performing. It can also be crucial if you get stuck in one of those Lumbergh moments and have to provide an email report to your boss by 5pm on a Friday, or need to dig into why an ISP is bouncing your email. We’ve also seen great uses of our APIs in hackathon projects. So get creative and let us help you build something awesome.

–Bob Evans, Director of Engineering

Reporting with SparkPost Using Node.js

SparkPost is the world’s most powerful email platform. Our engineering team takes advantage of this fact by using SparkPost to power several pieces of functionality within SparkPost itself. One of those areas is our internal reporting. In this post I’ll demonstrate how we use SparkPost to build an email report.

We’re going to build what we refer to as our sending domains report. This report goes to our deliverability team, which ensures we follow best practices to help our customers succeed. We’ll use Node.js, demonstrate some queries in Cassandra, which is where our customer data lives, and pull it all together with the templates and transmissions capabilities of SparkPost.

The content of the email will be a simple heading and table with a row for each record in the data we retrieve from Cassandra. Here we create a template that makes use of the SparkPost template substitution data:
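
A minimal sketch of what that template can look like is below. It leans on SparkPost’s substitution syntax (if/each/loop_var/end); the markup and the field names on each domain record (domain, account_name, status) are illustrative rather than the exact template we run, so check the substitutions reference for the precise conditional form you need:

```html
<h2>Sending Domains Report</h2>

{{ if empty(domains) }}
  <p>No sending domains to report today.</p>
{{ end }}

{{ if not empty(domains) }}
<table>
  <tr>
    <th>Domain</th>
    <th>Account</th>
    <th>Status</th>
  </tr>
  {{ each domains }}
  <tr>
    <td>{{ loop_var.domain }}</td>
    <td>{{ loop_var.account_name }}</td>
    <td>{{ loop_var.status }}</td>
  </tr>
  {{ end }}
</table>
{{ end }}
```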

We’ll create a new template with the ID sending-domains-report by going to the SparkPost Templates UI and pasting the HTML content into our template editor:


Let’s break down what we’re doing in the template:

First, we have a heading and an if statement. We use this statement to see if the domains variable is empty (we’ll show you where that comes from later). If it is, the template will just show a friendly message.
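
In the sketch above, that first section is:

```html
<h2>Sending Domains Report</h2>

{{ if empty(domains) }}
  <p>No sending domains to report today.</p>
{{ end }}
```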

If the domains variable is not empty we create a table with a heading row:
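
From the same sketch, with illustrative column headings:

```html
{{ if not empty(domains) }}
<table>
  <tr>
    <th>Domain</th>
    <th>Account</th>
    <th>Status</th>
  </tr>
```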

And finally we loop through each item in the domains variable and render some data:
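
In the sketch, that loop uses SparkPost’s each and loop_var keywords, again with placeholder field names:

```html
  {{ each domains }}
  <tr>
    <td>{{ loop_var.domain }}</td>
    <td>{{ loop_var.account_name }}</td>
    <td>{{ loop_var.status }}</td>
  </tr>
  {{ end }}
</table>
{{ end }}
```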

Now that we have our template, we need to create a Node.js script that queries our Cassandra database and uses node-sparkpost to send the email.

First we have to require some libraries:
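
In sketch form (these are the obvious candidates; the real script may pull in more):

```javascript
'use strict';

const cassandra = require('cassandra-driver'); // Cassandra client used for the report queries
const config = require('config');              // environment-specific settings (API key, contact points, recipients)
const SparkPost = require('sparkpost');        // node-sparkpost client for sending the report
```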

Next we create the functions for retrieving data from our sending domains and accounts tables. Note that we created our own wrapper around cassandra-driver to promisify the batch/execute methods and add an executeByStream function that allowed us to stream back more than 5,000 results:
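
A simplified version of those functions, using the promise support built into cassandra-driver rather than our internal wrapper, might look like the following. The keyspace, table, and column names are hypothetical:

```javascript
// Hypothetical schema: a sending_domains table and an accounts table keyed by customer_id
const client = new cassandra.Client({
  contactPoints: config.get('cassandra.contactPoints'),
  localDataCenter: config.get('cassandra.localDataCenter'),
  keyspace: config.get('cassandra.keyspace')
});

function getSendingDomains() {
  // execute() returns a promise when no callback is supplied
  return client.execute('SELECT domain, customer_id, status FROM sending_domains')
    .then((result) => result.rows);
}

function getAccounts() {
  return client.execute('SELECT customer_id, company_name FROM accounts')
    .then((result) => result.rows);
}
```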

We have to do some processing on the data returned from the queries to merge them together:
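
One way to do that merge is to index the account rows by customer ID and then decorate each domain row with its account name (field names are still illustrative):

```javascript
function mergeData([domains, accounts]) {
  // Build a lookup of accounts keyed by customer_id, then join each domain row to it
  const accountsById = new Map(accounts.map((a) => [a.customer_id, a]));

  return domains.map((d) => ({
    domain: d.domain,
    status: d.status,
    account_name: (accountsById.get(d.customer_id) || {}).company_name || 'unknown'
  }));
}
```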

This function is responsible for sending our message through SparkPost using the node-sparkpost library. We use the config library to set up our connection to SparkPost, prepare the message metadata and substitution data, and deliver the message:
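
A sketch of that send function is below. The template ID matches the one created earlier, while the config keys and recipient list are placeholders:

```javascript
const sparkpost = new SparkPost(config.get('sparkpost.apiKey'));

function sendReport(domains) {
  return sparkpost.transmissions.send({
    campaign_id: 'sending-domains-report',
    content: { template_id: 'sending-domains-report' },
    substitution_data: { domains: domains },
    // e.g. [{ address: 'deliverability-team@example.com' }]
    recipients: config.get('report.recipients')
  });
}
```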

Finally we pull it all together into a promise chain and log out the results:
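
The chain runs the two queries in parallel, merges the results, sends the report, and then cleans up (error handling kept deliberately minimal):

```javascript
Promise.all([getSendingDomains(), getAccounts()])
  .then(mergeData)
  .then(sendReport)
  .then((result) => {
    console.log('Report sent:', JSON.stringify(result));
    return client.shutdown();
  })
  .catch((err) => {
    console.error('Failed to send report:', err);
    process.exit(1);
  });
```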

You can view the code in full as a Gist on GitHub. The end result is an email report that looks like this:

(Screenshot: the resulting sending domains report email)

Internal reporting is only a small sample of how we use SparkPost at SparkPost. In a future post we will dig into how we use SparkPost to power the various emails that are sent to users of SparkPost.

-Rich
