Charm your SparkPost Recipient and Suppression Lists with Python

May. 26, 2017 by Steve Tuck

When developing code that sends email to your customers, it’s smart to try things out on a test list first. Using our sink-domain service helps you avoid harming your sender reputation during tests. Next, you’ll want to check that your code works at scale on a realistically sized recipient list and flush out any performance issues… but how?

You could use Microsoft Excel to put together .csv recipient-list files for you, but there are practical limitations, and it’s slow. You’ll be using fields like “substitution_data” that require JSON encoding, and Excel doesn’t help you with those. Performance-wise, anything more than a few hundred rows in Excel is going to get cumbersome.

What we need is a tool that will generate realistic looking test lists; preferably of any size, up to hundreds of thousands of recipients, with safe sink-domain addresses. Excel would be really slow at doing that – we can do much better with some gentle programming.

The second requirement, perhaps less obvious, is testing your uploads to SparkPost’s built-in suppression-list functionality. It’s good practice to upload the suppressed addresses from your previous provider before mailing – see, for example, our Migration Guides. You might need to rehearse a migration without using your real suppression addresses. Perhaps you don’t have easy access to them right now, because your old provider is being awkward and doesn’t have a nice API. Luckily, with very little extra code, we can also make this tool generate “practice” suppression lists.

You’re on my list

CSV files have a “header” in line 1 of the file, giving the names of each field. Handy hint: you can get an example file, including the header line, directly from SparkPost, using the “Download a Recipient List CSV template” button right here:

The SparkPost recipient-list .csv format looks like this:
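
As a rough illustration only, here is a minimal Python sketch that writes a header line and one recipient row using the standard csv and json modules. The metadata, substitution_data, and tags columns come from this article; the return_path column, the column order, and all sample values are assumptions, so check them against the downloaded template.

```python
import csv
import io
import json

# Assumed column order; confirm it against the template downloaded from SparkPost.
HEADER = ["email", "name", "return_path", "metadata", "substitution_data", "tags"]

def recipient_row(email, name, return_path, metadata, substitution_data, tags):
    """Build one CSV row, JSON-encoding the three structured fields."""
    return [
        email,
        name,
        return_path,
        json.dumps(metadata),
        json.dumps(substitution_data),
        json.dumps(tags),
    ]

out = io.StringIO()
writer = csv.writer(out)  # the csv module doubles any embedded double-quotes
writer.writerow(HEADER)
writer.writerow(recipient_row(
    "anon12345678@demo.sink.sparkpostmail.com",   # illustrative sample values
    "Jane Doe",
    "bounce@demo.sink.sparkpostmail.com",
    {"custID": 3156295},
    {"memberType": "bronze", "state": "KY"},
    ["gwen", "bacon"],
))
csv_text = out.getvalue()
print(csv_text, end="")
```

The JSON fields contain double-quotes, so the csv writer wraps each one in quotes and doubles the inner ones, which is the part Excel makes awkward.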

The metadata, substitution_data, and tags fields can carry pretty much anything you want.
SparkPost suppression list .csv format is equally fun, and looks like this:

Let’s have an argument

Some command-line arguments would be nice, so we can change the lists we’re generating. Here are the arguments we’ll accept, which translate nicely into design goals for this project:

  • A flag to say whether we’re generating a recipient list or a suppression list
  • How many records we want (make it optional – say 10 as a default)
  • A recipient domain to generate records for (optional – default as something safe, such as demo.sink.sparkpostmail.com).
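
The three arguments above could be handled in several ways. The following is a minimal sketch using Python's standard argparse module. The argument names (list_type, count, domain) and the choice values recip and supp are hypothetical, and the actual tool may parse its command line differently.

```python
import argparse

def build_parser():
    """Positional arguments: list type (required), then count and domain (optional)."""
    parser = argparse.ArgumentParser(
        description="Generate SparkPost test recipient or suppression lists")
    parser.add_argument("list_type", choices=["recip", "supp"],
                        help="recip = recipient list, supp = suppression list")
    parser.add_argument("count", nargs="?", type=int, default=10,
                        help="number of records to generate (default 10)")
    parser.add_argument("domain", nargs="?", default="demo.sink.sparkpostmail.com",
                        help="recipient domain (defaults to a safe sink domain)")
    return parser

# Parse an explicit list here so the sketch does not depend on sys.argv
args = build_parser().parse_args(["recip"])
print(args.list_type, args.count, args.domain)
```

Making count and domain optional positionals keeps the common case short, while the default domain means a careless run can never mail real addresses.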

Downloading and using the tool

Firstly, you’ll need python, pip, and git installed. If you don’t already have them, there are some simple instructions in my previous blog post. Then we use git clone to download the project. The external names package is also needed; we can install it using pip3.

After that final command, you should see the list output to the screen. If you want to direct it into a file, just use >, like this:

That’s all there is to it!  If you run the tool with no arguments, it gives some guidance on usage:

Inside the code – special snowflakes

Here’s the kind of data we want to generate for our test recipient-lists.

The metadata, substitution data and tags are from our example company, Avocado Industries.  Let’s pick a line of that apart, and hide the double-quotes ””  so we can see it more clearly:

metadata:

substitution_data:

tags (these are types of avocado, by the way!):

We want each recipient email address to be unique, so that when imported into SparkPost, the list is exactly the asked-for length. Sounds easy – we can just use a random number generator to produce an ID like the ones shown above. The catch is that random functions can give the same ID during a run, and on a long run that is quite likely to happen. We need to prevent that, eliminating duplicate addresses as we go.

Python provides a nice set() datatype we can use that’s relatively efficient:

We’ve created a global set object, uniqFlags, which acts as a scratchpad for random numbers we’ve already used, and we pass it into the function randomRecip in the usual way, where it is received as the parameter ensureUnique.

Because a Python set is mutable and arguments are passed by object reference, changes made to ensureUnique inside the function using the .add() method show up in the global uniqFlags set. No copy is made, and nothing needs to be returned.
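
The approach described above can be sketched as follows. The names uniqFlags, randomRecip, and ensureUnique come from the text; the "anon" prefix, the digits parameter, and the exhaustion check are assumptions made for illustration.

```python
import random

def randomRecip(domain, digits, ensureUnique):
    """Return an address whose numeric local part has not been used before."""
    if len(ensureUnique) >= 10 ** digits:
        # Every possible number is taken; without this check the loop would never end
        raise ValueError("no unused IDs left for this many digits")
    while True:
        number = random.randrange(10 ** digits)
        if number not in ensureUnique:   # set membership tests are fast
            break
    ensureUnique.add(number)             # mutates the caller's set in place
    return "anon" + str(number).zfill(digits) + "@" + domain

uniqFlags = set()                        # scratchpad of numbers already used
addresses = [randomRecip("demo.sink.sparkpostmail.com", 8, uniqFlags)
             for _ in range(1000)]
print(len(addresses), len(set(addresses)))
```

Retrying on a collision is cheap as long as the number of requested addresses is small compared with the 10**digits possible IDs.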

For the other fields, picking random values from a small set of options is easy. For example:
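
A minimal sketch of this kind of pick, assuming a hypothetical memberType field with made-up membership levels (the real field names and options in the tool may differ):

```python
import random

# Hypothetical option list for an assumed "memberType" substitution field
MEMBER_TYPES = ["bronze", "silver", "gold", "platinum"]

def randomMemberType():
    """Pick one membership level; every option is equally likely."""
    return random.choice(MEMBER_TYPES)

substitution_data = {"memberType": randomMemberType()}
print(substitution_data)
```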

We can pick randomized US postal states in exactly the same way. The custID field is just a naive random number (so it might repeat). I’ve left that as an exercise for the reader to change, if you wish (hint: use another set).

For the tags field – we would like to assign somewhere between none and all of the possible Avocado varieties to each person; and for good measure we’ll randomize the order of those tags too. Here’s how we do that:
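
One way to sketch it, using random.sample from the standard library. The variety list and the function name randomTags are illustrative assumptions rather than the tool's exact code.

```python
import random

# Illustrative list of avocado varieties; the tool's own list may differ
VARIETIES = ["hass", "fuerte", "bacon", "gwen", "pinkerton", "reed", "zutano"]

def randomTags(varieties):
    """Return between zero and all of the varieties, in random order."""
    how_many = random.randint(0, len(varieties))   # randint includes both endpoints
    return random.sample(varieties, how_many)      # distinct items, randomly ordered

print(randomTags(VARIETIES))
```

random.sample never repeats an item and returns its picks in selection order, so a single call covers both the "how many" and the "shuffled" requirements.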

What’s in a name?

SparkPost recipient-list format supports a text name field, as well as an email address. It would be nice to have realistic-looking data for that. Fortunately, someone’s already built a package, based on the 1990 US Census data, that turns out to be easy to use. You’ll recall we installed the names package earlier.
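
A minimal sketch of building and using such a name pool is below. It uses the names package's get_first_name() and get_last_name() functions. The helper names buildNameList and randomName, the dictionary layout, and the small fallback lists (used only if the package is not installed) are assumptions added so the example runs on its own.

```python
import random

try:
    import names                       # third-party package: pip3 install names
    get_first, get_last = names.get_first_name, names.get_last_name
except ImportError:
    # Fallback so the sketch still runs without the package (illustrative values)
    def get_first():
        return random.choice(["Jane", "John", "Maria", "Ahmed"])
    def get_last():
        return random.choice(["Doe", "Smith", "Garcia", "Khan"])

def buildNameList(size=100):
    """Call the slow name generators once, up front, and keep the results."""
    return [{"first": get_first(), "last": get_last()} for _ in range(size)]

def randomName(nameList):
    """Combine a first name and a last name picked independently from the pool."""
    return random.choice(nameList)["first"] + " " + random.choice(nameList)["last"]

nameList = buildNameList()
print(randomName(nameList))
```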

The names library calls take a little while to run, which could really slow down our list creation. Rather than calling the function for every row, the above code builds a nameList of first and last names, that we can choose from later. For our purposes, it’s OK to have text names that might repeat (i.e. more than one Jane Doe) – only the email addresses need be strictly unique.

The choice of 100 in the above code is fairly arbitrary – it will give us “enough randomness” when picking a random first-name and separately picking a random last-name.

Full speed ahead

A quick local test shows the tool can create a 100,000 entry recipient list – about 20MB – in seven seconds, so you shouldn’t have to wait long even for large outputs.

The output of the tool is just a text stream, so you can redirect it into a file using >, like this:

You can also pipe it into other tools. CSVkit is great for this – you can choose which columns to filter on (with csvcut), display (with csvlook) etc.  For example, you could easily create a file with just email, name, and substitution_data, and view it:
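
If CSVkit is not to hand, the column selection can also be approximated in plain Python. This is a rough sketch, not a CSVkit replacement; the helper name select_columns and the sample row are assumptions for illustration.

```python
import csv
import io
import sys

def select_columns(csv_lines, wanted):
    """Yield only the wanted columns from each data row, in the order given."""
    for row in csv.DictReader(csv_lines):
        yield [row[column] for column in wanted]

# A tiny in-memory sample standing in for the generated list file
sample = io.StringIO(
    'email,name,return_path,metadata,substitution_data,tags\n'
    'anon1@demo.sink.sparkpostmail.com,Jane Doe,bounce@demo.sink.sparkpostmail.com,'
    '"{""custID"": 1}","{""memberType"": ""gold""}","[""hass""]"\n'
)
wanted = ["email", "name", "substitution_data"]
rows = list(select_columns(sample, wanted))

writer = csv.writer(sys.stdout)
writer.writerow(wanted)
writer.writerows(rows)
```

Reading with csv.DictReader means columns are selected by header name, so the sketch keeps working if the column order in the file changes.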

And finally …

Download the code and make your own test recipient and suppression lists. Leave a comment below to let me know how you’ve used it and what other hacks you’d like to see.

Steve Tuck
Senior Messaging Engineer

 
