User engagement is critical to the growth of any product or service and email is a key channel for driving engagement. An increase in email engagement can boost monthly active users, conversion rates, and revenue. In this blog post, I explore whether optimizing when emails are sent to recipients results in a meaningful lift in engagement and outline an approach to optimizing these send times.
While this approach is fairly simple, the model performs quite well in backtesting, resulting in a significant 43.3% lift in email engagement.
If you’re a SparkPost customer, you can use our Events API to collect the data needed to run an analysis similar to the one below. If you do so, see the caveats section at the end of this post regarding data sparsity.
Sampling and Cleaning the Data
In this analysis, I have the benefit of being able to look at recipient engagement across SparkPost’s customer base. As SparkPost delivers email for many of the largest senders in the world, this offers a reasonably high resolution of recipient engagement but also means the dataset is extraordinarily large. As a result, this analysis uses a smaller sample of 2.2 million US-based recipients.
All of these recipients have exhibited some engagement behavior in the sample period from October 1, 2018, to November 30, 2018. We define engagement as opening an email or clicking on a link in an email.
As with all data science projects, the data needed to be cleaned and outliers noted or removed. In this dataset, system-generated emails (such as alerts sent by software for consumption by IT or development teams) were removed. So too were recipients who received high volumes of email (over 1,000 emails a day) and recipients whose opens came from more than 6 timezones (which I believe may be group or role-based email addresses such as sales@ or info@).
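The two recipient-level filters can be sketched in a few lines of plain Python. This is a minimal illustration, not the pipeline used for the analysis: the function name, the input shapes, and the reading of "over 1,000 emails a day" as "more than 1,000 on any single day" are all assumptions. Removing system-generated email is not covered here.

```python
from collections import defaultdict

MAX_EMAILS_PER_DAY = 1000  # threshold quoted in the text
MAX_TIMEZONES = 6          # threshold quoted in the text

def recipients_to_drop(deliveries, opens):
    """Return the set of recipients that fail either filter.

    deliveries: iterable of (recipient, date) pairs, one per delivered email.
    opens: iterable of (recipient, timezone) pairs, one per open event.
    (Both input shapes are hypothetical.)
    """
    per_day = defaultdict(int)
    for recipient, day in deliveries:
        per_day[(recipient, day)] += 1

    timezones = defaultdict(set)
    for recipient, tz in opens:
        timezones[recipient].add(tz)

    # Assumption: exceeding the volume threshold on any one day is enough.
    high_volume = {r for (r, _), n in per_day.items() if n > MAX_EMAILS_PER_DAY}
    many_zones = {r for r, zones in timezones.items() if len(zones) > MAX_TIMEZONES}
    return high_volume | many_zones
```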
Exploring Recipient Behavior
Before diving into building a model for optimizing send times, I wanted to understand the feasibility of building such a model. Getting a sense for recipient behavior and determining whether there were patterns we could potentially model was a useful exercise. For clarity, the exploratory data analysis below is limited to recipients located in the US EST timezone.
In the charts above, we’ve plotted the following for each hour of the week (in EST):
- Mean Time to First Engagement: The mean time, in hours, from delivery of an email to a recipient’s first engagement with that email. Lower is better.
- Mean Engagement Rate: The mean engagement rate, where the engagement rate is the count of emails engaged with over the total number of emails sent. Higher is better.
- Mean Email Deliveries Per Recipient: The mean number of emails sent to recipients.
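For readers who want to reproduce the first two metrics, here is a rough sketch in plain Python. It assumes each email is represented as a pair of local-time datetimes (delivery time, and first engagement time or None); the function names and the convention that hour 0 is Monday at midnight are illustrative choices, not details taken from the analysis. Mean deliveries per recipient is left out for brevity.

```python
from collections import defaultdict
from datetime import datetime

def hour_of_week(ts):
    """Map a datetime to 0..167, with 0 = Monday 00:00 (an arbitrary convention)."""
    return ts.weekday() * 24 + ts.hour

def hourly_metrics(emails):
    """Aggregate per delivery hour of the week.

    emails: iterable of (delivered_at, first_engaged_at or None).
    Returns {hour: (engagement_rate, mean_hours_to_first_engagement or None)}.
    """
    sent = defaultdict(int)
    delays = defaultdict(list)
    for delivered_at, engaged_at in emails:
        h = hour_of_week(delivered_at)
        sent[h] += 1
        if engaged_at is not None:
            delays[h].append((engaged_at - delivered_at).total_seconds() / 3600)

    metrics = {}
    for h, n in sent.items():
        d = delays[h]
        mean_delay = sum(d) / len(d) if d else None
        metrics[h] = (len(d) / n, mean_delay)
    return metrics
```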
It’s clear that there are some interesting patterns here, particularly late at night, in the early morning, and over weekends. The early-morning drop in mean time to first engagement and the accompanying increase in engagement rate are phenomena familiar to many marketers, as is the significant drop in engagement over weekends.
It’s also possible that the peaks in engagement are being driven by sender behavior, i.e., emails sent late at night to night owls are driving high late-night engagement rates, rather than this being an organic, recipient-driven behavior.
This is definitely worth digging into further.
Optimizing for Conversion
When we send emails to recipients, we’d often like the recipients to engage as soon as possible: buy something in a sale, attend an event, purchase a plane ticket, read a news article, or start using our app. However, the email time to engagement distribution is highly skewed, with a very long right tail. You can see this depicted in the histogram below, where the right tail is bounded by the sample period of 61 days.
Engagement data for recipients whose mean time to engagement falls in the long right tail are not ideal for optimizing engagement (and, in turn, conversion). It would be worth calculating the engagement rate for recipients as:
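One plausible way to write such a rate, assuming x is a cutoff in hours after delivery and r is a recipient (the symbols are illustrative notation, not taken from the analysis):

```latex
\text{engagement rate}_{x}(r) =
  \frac{\#\{\text{emails delivered to } r \text{ and first engaged with within } x \text{ hours of delivery}\}}
       {\#\{\text{emails delivered to } r\}}
```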
But what should we make x above? It’s clear from the histogram that the majority of recipients engage with an email within hours. In the table below, it looks like approximately 70% of emails are engaged with, and almost 60% of recipients engage with emails, within 12 hours. This seems like a reasonable number to use for determining an “immediate” engagement rate.
| TTF Engagement for Messages (percentiles, hours) | TTF Engagement for Recipients (percentiles, hours) |
I’ll use this “immediate” engagement rate for the rest of the analysis below.
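As a small sketch of what this rate looks like in code, assuming each delivered email is reduced to its time to first engagement in hours (None if it was never engaged with). Whether the 12-hour boundary itself counts as "within 12 hours" is an assumption; the function name is hypothetical.

```python
IMMEDIATE_CUTOFF_HOURS = 12  # the cutoff chosen in the text

def immediate_engagement_rate(hours_to_first_engagement, cutoff=IMMEDIATE_CUTOFF_HOURS):
    """Share of delivered emails first engaged with within `cutoff` hours.

    hours_to_first_engagement: one entry per delivered email, None if never engaged.
    Returns None when there were no deliveries.
    """
    if not hours_to_first_engagement:
        return None
    # Assumption: an engagement at exactly `cutoff` hours still counts.
    hits = sum(1 for h in hours_to_first_engagement if h is not None and h <= cutoff)
    return hits / len(hours_to_first_engagement)
```

For example, four deliveries with engagement after 0.5 hours, 13 hours, never, and 12 hours give a rate of 0.5.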
Per-Recipient Engagement “Probability Distribution” and Clustering
I’d like to understand a recipient’s propensity to engage per hour of the week in order to further explore the phenomena we saw above and to potentially optimize recipient send times.
To do this, I mapped every email delivered to a recipient over the two-month sample period to an hour of the week, tallied how many were “immediately” engaged with, and calculated the engagement rate for each hour of the week.
I did this for every recipient and ended up with a probability distribution of sorts similar to the table below (rows are recipients, columns the hour of the week). In hour 11, recipient 0 has an engagement rate of 0.67.
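A table like this can be built with a simple tally. The sketch below assumes each delivery has already been reduced to a (recipient, hour of week, immediately engaged) triple; the function name and the use of None for hours with no deliveries are illustrative choices.

```python
from collections import defaultdict

HOURS_IN_WEEK = 168

def engagement_matrix(events):
    """Build per-recipient, per-hour-of-week engagement rates.

    events: iterable of (recipient, hour_of_week, immediately_engaged) triples.
    Returns {recipient: list of 168 rates}, with None where nothing was delivered.
    """
    delivered = defaultdict(lambda: [0] * HOURS_IN_WEEK)
    engaged = defaultdict(lambda: [0] * HOURS_IN_WEEK)
    for recipient, hour, hit in events:
        delivered[recipient][hour] += 1
        engaged[recipient][hour] += int(hit)

    return {
        r: [engaged[r][h] / n if n else None for h, n in enumerate(counts)]
        for r, counts in delivered.items()
    }
```

With three deliveries to recipient 0 in hour 11, two of them immediately engaged with, the rate for that cell comes out at roughly 0.67.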
Clustering is a great exploratory tool. In the next step of the analysis, I used a k-means algorithm on the per-recipient “propensity to engage” dataset to cluster recipients, hoping to surface a higher definition view of recipient engagement behavior. Once I clustered the recipients, I calculated the mean engagement per hour of the week for each cluster and plotted this on the charts below.
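To make the clustering step concrete, here is a bare-bones version of k-means (Lloyd's algorithm) in plain Python. It is a teaching sketch rather than the implementation used for the analysis, which is not specified; in practice a library implementation would be the usual choice. The `init` parameter is an addition for reproducibility, and missing hourly rates are assumed to have been filled in (for example with 0) beforehand.

```python
import random

def kmeans(points, k, iterations=50, seed=0, init=None):
    """Plain Lloyd's algorithm.

    points: list of equal-length lists of floats (one row per recipient).
    init: optional list of k starting centroids; random rows are used otherwise.
    Returns (labels, centroids).
    """
    rng = random.Random(seed)
    start = init if init is not None else rng.sample(points, k)
    centroids = [list(p) for p in start]
    labels = None
    for _ in range(iterations):
        # Assignment step: nearest centroid by squared Euclidean distance.
        new_labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster empties out
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids
```

The returned centroids are the per-cluster mean engagement rates for each hour of the week, which is the quantity plotted per cluster. Note that k-means is sensitive to its starting centroids, so results can differ between runs with different seeds.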
The results are super interesting. Many of the clusters exhibit a clear engagement bias: Cluster 3 toward evenings ET, Cluster 5 toward late night and overnight (night owls or night workers?), and Clusters 1 and 6 toward mornings through mid-afternoon (with a bias to mornings).
You’ll note that Clusters 0 and 4 have significantly lower engagement rates than the other clusters. In the chart below, I’ve modified the y-axis to give us a better look.
Despite this being a very large dataset, I still have fairly sparse engagement data for many recipients. In particular, I don’t have much data for recipients in Cluster 0: engagement is low and there isn’t a clear pattern, other than a bias to daytime engagement.
I may have to exclude these recipients from our optimization or make a best guess based on some general patterns I’ve observed in the data.
Modeling Engagement Lift from Optimizing Send Time
Clearly, there are recipients that have a higher propensity to engage at a specific time or day of the week. What if I had used this information to modify the time that the emails were sent?
Modeling this is fairly straightforward. I used a window function (available in Apache Spark and most SQL engines) to look a number of hours before and after the real-world send time in order to determine if there was a better time within this window to send an email. I defined a better send time as an hour in this window during which the recipient had a higher hourly engagement rate than during the actual send time.
For my model, I used a fairly narrow 8-hour window (4 hours prior to the delivery hour and 4 hours ahead). I was also careful to exclude data points where only a single email was delivered and subsequently engaged with (resulting in a 100% hourly engagement rate).
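The search itself can be illustrated without Spark. The sketch below is a plain-Python stand-in for the window function, operating on one recipient's 168 hourly rates and delivery counts. The function name, the wrap-around at the week boundary, and skipping any candidate hour backed by fewer than two deliveries are assumptions made for the illustration.

```python
HOURS_IN_WEEK = 168

def best_hour_in_window(rates, deliveries, send_hour, before=4, after=4):
    """Find the highest-rate hour within the window around the actual send hour.

    rates: 168 hourly engagement rates for one recipient (None if no data).
    deliveries: 168 delivery counts for the same recipient.
    Returns (best_hour, best_rate), falling back to the actual send hour.
    """
    best_hour, best_rate = send_hour, rates[send_hour] or 0.0
    for offset in range(-before, after + 1):
        h = (send_hour + offset) % HOURS_IN_WEEK  # wrap around the week boundary
        rate, n = rates[h], deliveries[h]
        if rate is None or n < 2:
            continue  # skip hours with no data or a single delivered email
        if rate > best_rate:
            best_hour, best_rate = h, rate
    return best_hour, best_rate
```

Comparing the rate at the returned hour with the rate at the actual send hour, averaged over all sends, gives the kind of baseline versus optimized comparison reported below.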
I ran the entire dataset (all US timezones) through the model and the mean engagement rate results are below. The findings show that had SparkPost’s customers sent these emails to recipients during their optimized engagement hour, they would have seen a mean uplift of 43%. This is really compelling. Imagine what this might do for your monthly active users, conversion, and revenue numbers!
| Mean “Immediate” Engagement Rate | Mean “Optimized” Engagement Rate | Uplift |
Some Caveats to this Analysis and Model
This approach to send time optimization is exciting and compelling. Some caveats to the analysis and model should, however, be noted:
- Purely transactional email such as password resets, shipment notifications, and alerts was not removed from our analysis (our data privacy safeguards limit our ability to do this). In a real-world implementation, we’d probably want to exclude these emails from optimization.
- Teasing apart phenomena driven by sender behavior (i.e., when we send emails) from organic recipient behavior can be challenging and, where data is sparse, may not be possible at all. This was, however, somewhat mitigated by the number of senders included in this analysis.
Additionally, if you’d like to explore this for your own message streams, be mindful of data sparsity (mentioned in the second caveat above), i.e., not having enough engagement data to make meaningful adjustments to send time. Sparse data may not yield a performant model and may even result in a drop in engagement.