Receiving Email with Rails

This article originally appeared in the first issue of Rails Magazine. It is reproduced here not quite verbatim with a couple of corrections and additions.

Photo from esparta on Flickr.

Introduction

Receiving email is a great way to add functionality to your application. This is one area, though, that is not very well documented with Rails. Sure, we have ActionMailer documentation, but how does something like this actually work in a production environment and what are the concerns? I had to tackle this problem recently and no solution that was “in the wild” would work with the requirements I had for this application. In this article, we will take a look at a couple of options and go into detail with one method that has not received much coverage.

John Nunemaker, on the Rails Tips blog, posted a solution to this problem in Using GMail with IMAP to Receive Email in Rails(1). He uses a script that connects to an IMAP server every minute or so and polls for new e-mails. The Rails environment is loaded and, if there are new messages waiting, they are processed using an ActiveRecord model. He uses the ‘Daemons’ library to keep the script running, give it start/stop commands and keep a pidfile.

This is a perfectly valid and functional way to process e-mail. The application I was working on, though, had to be able to handle and process e-mail in as little time as possible. People would likely be e-mailing things from their mobile phones and might want to check on them soon after an upload. At that point, polling every X number of minutes wasn’t a viable solution, so I had to come up with something else.

It’s also worth noting that polling for email should take into account the size of your user base. Let’s say that we have an app with 10,000 active users. Let’s also say that during a peak hour, they all decide to email our application. Finally, we’ll say that they send an average of 5 emails apiece. With these hypothetical numbers, that is 50,000 emails in an hour, or about 833 emails per minute. If your IMAP server is being polled every three minutes, that’s going to leave you about 2,500 e-mails to download and process each time.
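A quick sketch of that estimate in Ruby. The method name is made up for illustration, and it assumes the emails arrive evenly over a 60-minute peak window, which real traffic will not do exactly:

```ruby
# Back-of-the-envelope estimate of how many emails pile up between polls.
# Assumption: all emails arrive evenly over the peak window.
def polling_backlog(users, emails_per_user, peak_minutes, poll_interval_minutes)
  per_minute = (users * emails_per_user) / peak_minutes.to_f
  { :per_minute => per_minute.round,
    :per_poll   => (per_minute * poll_interval_minutes).round }
end

estimate = polling_backlog(10_000, 5, 60, 3)
puts "#{estimate[:per_minute]} emails per minute"        # 833 emails per minute
puts "#{estimate[:per_poll]} emails waiting at each poll" # 2500 emails waiting at each poll
```

Changing the last argument shows how quickly the backlog grows with a longer polling interval.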

Configuring Your Email Server

Email itself is a problem that has been largely solved. There is a wealth of email servers available, but this article will take a look at Postfix. Postfix is most easily installed using the package manager of your distribution and may already be installed if you have a VPS. I prefer Ubuntu on the server side, so this article will focus on that flavor. Just be aware that certain configuration file locations may vary depending on your distribution. So let’s get started.

First we need to add or change some DNS records. The instructions for how to do this will vary depending on how you have your DNS hosted. I personally use DNS Made Easy and recommend it to my clients as well, should they need DNS hosting. DNS Made Easy has very reasonable rates and quotas. Regardless of your host you need to create the following records:

  • An “A” record that has your domain name only.
  • An “A” record that is just "mail".
  • An “MX” record that is set to level 10 and points to “mail”
  • Optional: An SPF Record

OK, I was lying. The SPF record isn’t really optional. This is going to be a TXT Record and should read something like this:

	v=spf1 mx -all

There are several different variations you can use with SPF records but going through them would be beyond the scope of this article. Do some research and pick the combination that’s right for your setup.

Now, the first thing that you’re going to need to do is figure out what address you want to receive mail as. This is going to be your catch-all address. In this case we’re going to choose "killerrobot" because that just might keep spammers at bay(2). I mean, who’s going to spam a killer robot?

Reading tutorials around the web or even looking in some books will tell you that you can tell postfix to forward mail simply by piping it in /etc/aliases. You might be tempted to do something like pipe everything to a ruby script:

# /etc/aliases
...
killerrobot: "|/usr/bin/ruby /var/www/apps/myapp/current/lib/mail_receiver.rb"
*: killerrobot

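This, unfortunately, won’t work. If you do it this way, your script will not run as your application’s user: Postfix runs commands from /etc/aliases with the rights of the alias file’s owner, or as the default_privs user (usually nobody) when that owner is root. This is not what you want and can be a security concern. The proper way to do this is with a Postfix content filter. Open up /etc/postfix/master.cf. The first line after all of the comments should look like this: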

# /etc/postfix/master.cf
smtp      inet  n       -       -       -       -       smtpd

Add a line right below that to tell postfix that you’re using a filter:

# /etc/postfix/master.cf
smtp      inet  n       -       -       -       -       smtpd
 -o content_filter=myapp_filter:

Then go all the way down to the bottom of the file and add your filter:

# /etc/postfix/master.cf 
smtp      inet  n       -       -       -       -       smtpd
 -o content_filter=myapp_filter:
...
myapp_filter unix -     n       n       -       -       pipe
 flags=Xhq user=deploy argv=/usr/bin/ruby /var/www/apps/myapp/current/lib/mail_receiver.rb

The “X” flag in “flags=Xhq” tells Postfix that an external command performs final delivery. This changes the reported message status from “relayed” to “delivered”. The “h” flag folds the domain part of the recipient address to lowercase, and the “q” flag quotes whitespace and other special characters in the local part of addresses. The “user=deploy” attribute is what makes the script run as the deploy user instead of a system account. Now, reload Postfix by running sudo postfix reload. At this point, you should have a very basic mail server configured to receive email and pipe it into a mail_receiver.rb script.

Handling The Email

We’re going to put all of the mail that comes in into a message queue and parse it with the ‘mms2r’ gem. In this article I’m going to use beanstalkd(3), but you could substitute your favorite message queue for this part of the architecture. I’m going to assume that a message queue is already installed and running and that you have both the ‘tmail’ and ‘mms2r’ gems installed.

We want our mail_receiver script to be super lean. It’s only going to serve one function: put the incoming mail into a queue. We’ll process the queue later, but for now we just want to get it in there and handle any attachments. We want it to be super lean because if we’re receiving a lot of mail we don’t want this script to be memory intensive or take a long time to start up or run. It will look something like this:

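#!/usr/bin/env ruby
# Reads one email from standard input and queues a small job for the worker.
require 'rubygems'
require 'tmail'
require 'mms2r'
require 'beanstalk-client'

message = $stdin.read
mail = TMail::Mail.parse(message)

# Skip messages that have no recipient at all.
unless mail.to.nil?
  # MMS2R saves the attachments to temporary files on disk.
  mms = MMS2R::Media.new(mail)
  media = mms.default_media

  beanstalk = Beanstalk::Pool.new(['127.0.0.1:11300'])
  beanstalk.yput({:type => 'received_email',
    :to => mail.to.flatten.first.gsub('@mydomain.com', ''),
    :tags => mail.subject,
    # Guard against messages that carry no usable media.
    :attachment => (media && media.path)})
end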

What we’re doing here is taking the email message from standard input and parsing it into a TMail object. TMail is a great library that takes care of most of the formatting for us. It lets us do things like refer to mail messages as objects and use mail.to, mail.from, etc. If we have attachments, they’re going along as well.

MMS2R is an amazing piece of software. It works for both emails and, as the name implies, MMS messages as well. What makes it so amazing? There are dozens of different formats that an attachment can come in from both email and MMS. Different phone carriers each have their own way of doing MMS attachments, each of them slightly different. MMS2R alleviates the problem of trying to parse all of these different formats and does it all for you. In this way we can call MMS2R::Media.new(mail) and be done with it.

For the purposes of our example application, the user can tag the photos they upload by putting the different tags in the subject. We send that in as another option in the job hash for beanstalkd. Each user is assigned a unique identifier in their account that lets them send email to the application, for example “aslkdf32@mydomain.com”. We grab the first recipient from mail.to, which comes in as an array. We strip the domain and send what is left as the “to” field. Finally, the temporary location on disk where MMS2R saved the media is put into the queue as the :attachment option. Our mail is in the queue.
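As a rough illustration of those two steps, here is a sketch in plain Ruby. The helper names are hypothetical, and the comma-or-whitespace tag format is an assumption; the receiver script itself passes the subject through untouched and leaves any splitting to the worker.

```ruby
# Hypothetical helpers; the names and the tag format (comma or whitespace
# separated) are assumptions made for illustration.

# "aslkdf32@mydomain.com" -> "aslkdf32"
def recipient_token(address, domain)
  address.to_s.strip.downcase.sub(/@#{Regexp.escape(domain.downcase)}\z/, '')
end

# "Beach, sunset  2009" -> ["beach", "sunset", "2009"]
def parse_tags(subject)
  subject.to_s.split(/[,\s]+/).map(&:downcase).reject(&:empty?).uniq
end

puts recipient_token('aslkdf32@mydomain.com', 'mydomain.com')
puts parse_tags('Beach, sunset  2009').inspect
```

Anchoring the pattern at the end of the address avoids stripping the domain text if it happens to appear elsewhere in the string.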

Processing the Queue

Now that we have our email in the queue, we need to get it out. For this part, we’re actually going to load the Rails environment. I have this in the lib directory. The code would look something like this:

#!/usr/bin/env ruby
require File.join(File.dirname(__FILE__), '..', 'config', 'environment')
require 'rubygems'
require 'beanstalk-client'
require 'yaml'
beanstalk_config = YAML::load(File.open("#{RAILS_ROOT}/config/beanstalk.yml"))
 
@logger = Logger.new("#{RAILS_ROOT}/log/queue.#{Rails.env}.log")
@logger.level = Logger::INFO
 
BEANSTALK = Beanstalk::Pool.new(beanstalk_config[Rails.env])
 
loop do
  job = BEANSTALK.reserve
  job_hash = job.ybody
  case job_hash[:type]
  when 'received_email'
    @logger.info("Got email: #{job_hash.inspect}")
    if EmailProcessor.process(job_hash)
      job.delete
    else
      @logger.warn("Did not process email: #{job_hash.inspect}")
      job.bury
    end
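  else
    @logger.warn("Don't know how to process #{job_hash.inspect}")
    # Bury unknown jobs; otherwise they return to the queue when their
    # reservation times out and get picked up again and again.
    job.bury
  end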
end
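The worker reads its queue addresses from config/beanstalk.yml, a file that is not shown here. Since the result of the lookup is handed straight to Beanstalk::Pool.new, a reasonable guess is one list of host:port strings per environment. The layout below is that assumption, not a documented format:

```ruby
require 'yaml'

# Assumed contents of config/beanstalk.yml: one list of "host:port"
# strings per Rails environment. The addresses are quoted so YAML
# always treats them as strings.
SAMPLE_BEANSTALK_YML = <<~YAML
  development:
    - '127.0.0.1:11300'
  production:
    - '127.0.0.1:11300'
YAML

config = YAML.load(SAMPLE_BEANSTALK_YML)
addresses = config.fetch('production')
puts addresses.inspect
```

Using fetch instead of [] makes a missing environment fail loudly at startup rather than passing nil to the pool.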

The first require loads the Rails environment so we have access to all of our ActiveRecord models. We want to keep our code DRY and use only one method of processing an attachment, routing messages, or the like. If we were using attachment_fu or paperclip, we would keep this code in the model. You might even want to make a separate class, such as a presenter, for your logic. In this case the EmailProcessor class finds the user based on the reception_email attribute and then executes the “do_stuff” method to process the message. It would look something like this:

require 'local_file'
require 'tmail'
require 'mms2r'
 
class EmailProcessor
  attr_accessor :user, :options
 
  def self.process(*args)
    email_processor = new(*args)
    email_processor.do_stuff
  end
 
  def find_user
    @user = User.find(:first, :conditions => {:reception_email => @options[:to]})
  end
 
  def do_stuff
    # Your actual logic would go here...
  end
 
  def initialize(*args)
    @options = args.extract_options!
    find_user
  end
end

This uses the LocalFile class from Ben Rubenstein(4).
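Because the worker deletes the job when EmailProcessor.process returns a truthy value and buries it otherwise, do_stuff should return true or false deliberately. The sketch below shows only that contract, in plain Ruby. The class name, the lookup lambda (standing in for User.find) and store_photo (standing in for real attachment handling) are all hypothetical:

```ruby
# A plain-Ruby stand-in that shows the return-value contract the worker
# relies on. Nothing here touches ActiveRecord; the lookup lambda replaces
# the User query and store_photo is a placeholder.
class SketchEmailProcessor
  attr_reader :user, :options

  def self.process(options, lookup)
    new(options, lookup).do_stuff
  end

  def initialize(options, lookup)
    @options = options
    @user = lookup.call(options[:to])
  end

  # Returns true when the email was handled, false when the job should be buried.
  def do_stuff
    return false if user.nil?
    return false unless options[:attachment] && File.exist?(options[:attachment])

    store_photo(options[:attachment], options[:tags])
    true
  end

  private

  def store_photo(path, tags)
    # Real code would hand the file to the model (attachment_fu, paperclip, ...).
    user[:photos] << { :path => path, :tags => tags }
  end
end
```

Returning false for an unknown recipient or a missing file means those jobs end up buried, where they can be inspected later instead of being silently dropped.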

We’re not quite done yet. We need to make the mail_processor run as a daemon instead of just running “ruby mail_processor.rb” when we want to launch it. We’ll use the ‘daemons’ library for that. This will take care of setting up PID files and lets us do ruby mail_processor_control.rb start and ruby mail_processor_control.rb stop. We’re also using the “daemons_extension” file from Rapleaf that actually gives feedback on stopping of the daemon. The script itself is extremely simple and goes in the lib directory with your mail_processor.rb script:

require 'rubygems'
require 'daemons'
require File.join(File.dirname(__FILE__), 'daemons_extension')
 
ENV['RAILS_ENV'] ||= 'development'
 
options = {
  :app_name  => 'processor',
  :dir_mode  => :script,
  :dir       => '../log',
  :backtrace => true,
  :mode      => :load,
  :monitor   => true
}
 
Daemons.run(File.join(File.dirname(__FILE__), 'mail_processor.rb'), options)

Now just start it by running “ruby mail_processor_control.rb start” and your daemon will be up and running. That’s it! You’re receiving e-mail in your Rails app. It is very important that you load the Rails environment in your worker and not in the daemon controller. The controller only manages the worker process; if the environment is loaded only there, the worker won’t have access to Rails and will error out.

Considerations

Depending on your configuration, you may want to use a different message queue than beanstalkd. I’ve personally found beanstalkd to be reliable, but your architecture might call for something else. For example, you may want to put your message queue on another server. If you did this, the worker wouldn’t have access to the temporary storage that MMS2R defaults to for saving the attachments. In that case you could put the attachments directly in the queue, store them on S3, etc.
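One way to carry an attachment inside the job itself is to Base64-encode the file so it survives YAML serialization. This is only a sketch with hypothetical helper names. Keep in mind that beanstalkd caps job size (65,535 bytes by default, adjustable with its -z option), so this only suits small files unless that limit is raised.

```ruby
# Hypothetical helpers for shipping a small attachment inside the job hash.
# Array#pack('m0') and String#unpack1('m0') are Ruby's built-in strict
# Base64 encoder and decoder, so no extra library is needed.

def pack_attachment(path)
  { :filename => File.basename(path),
    :data     => [File.binread(path)].pack('m0') }
end

# Writes the decoded file into dir and returns the path it was written to.
def unpack_attachment(payload, dir)
  # basename guards against a filename that tries to escape dir.
  target = File.join(dir, File.basename(payload[:filename]))
  File.binwrite(target, payload[:data].unpack1('m0'))
  target
end
```

The receiver would merge the result of pack_attachment into the job hash in place of the :attachment path, and the worker would call unpack_attachment with a directory of its own before processing.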

Some people have reported problems with the daemons library, where their daemon just halts and stops responding. I’ve never encountered that, and I’ve had a similar setup running for months. You will also want to put your mail_processor_control script under some sort of process supervision, such as monit or god.

You may be asking yourself why we didn’t use ActionMailer to handle the incoming emails, since it can do that. The answer is that if you do it the way it’s described, for example, on the Rails wiki, it will spin up a new Rails process for each email that’s received. Under any significant load, this will fail(5). Another drawback to that approach is that if there is a failure, you lose the email. With this type of architecture, it remains in the queue and you can just process it later.

Conclusion

This is a good start to handling email in your application. Being able to process email is a great way to enhance your app and give your users mobile access. With email-capable phones becoming ubiquitous, they no longer need to be tied to a computer to use your app. Remember, an app that can be used anywhere is an app that will be used anywhere.

Citations

  1. http://railstips.org/2008/10/27/using-gmail-with-imap-to-receive-email-in-rails
  2. No it won’t.
  3. http://xph.us/software/beanstalkd/
  4. http://www.benr75.com/articles/2008/01/04/attachment_fu-now-with-local-file-fu
  5. Because rails can’t scale. See http://canrailsscale.com for more information.