Receiving Email with Rails

This article originally appeared in the first issue of Rails Magazine. It is reproduced here, not quite verbatim, with a couple of corrections and additions.

Photo from esparta on Flickr.

Introduction

Receiving email is a great way to add functionality to your application. This is one area, though, that is not very well documented with Rails. Sure, we have ActionMailer documentation, but how does something like this actually work in a production environment and what are the concerns? I had to tackle this problem recently and no solution that was “in the wild” would work with the requirements I had for this application. In this article, we will take a look at a couple of options and go into detail with one method that has not received much coverage.

John Nunemaker, on the Rails Tips blog, posted a solution to this problem in “Using GMail with IMAP to Receive Email in Rails”(1). He uses a script that connects to an IMAP server every minute or so and polls for new e-mails. The Rails environment is loaded and, if there are new messages waiting, they are processed using an ActiveRecord model. He uses the ‘Daemons’ library to keep the script running, give it start/stop commands, and keep a pidfile.

This is a perfectly valid and functional way to process e-mail. The application I was working on, though, had to be able to handle and process e-mail in as little time as possible. People would likely be e-mailing things from their mobile phones and might want to check on them soon after an upload. At that point, polling every X number of minutes wasn’t a viable solution, so I had to come up with something else.

It’s also worth noting that polling for email should take into account the size of your user base. Let’s say that we have an app with 10,000 active users. Let’s also say that during a peak hour, they all decide to email our application. Finally, we’ll say that they send an average of 5 emails apiece. With these hypothetical numbers, that is 50,000 emails in an hour, which works out to about 833 emails per minute. If your IMAP server is being polled every three minutes, that’s going to leave you about 2,500 e-mails to download and process each time.
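To make the arithmetic concrete, here is a small sketch of the backlog estimate. It assumes the emails arrive evenly over a one-hour peak window, and the method name is made up for illustration.

```ruby
# Rough backlog estimate for a polling setup.
# Assumption: all emails arrive evenly across the peak window.
def emails_per_poll(users, emails_each, peak_minutes, poll_interval_minutes)
  per_minute = (users * emails_each).to_f / peak_minutes
  (per_minute * poll_interval_minutes).round
end

per_minute = (10_000 * 5) / 60.0
puts "Per minute: #{per_minute.round}"
puts "Per 3-minute poll: #{emails_per_poll(10_000, 5, 60, 3)}"
```

Changing the last argument shows how quickly the batch size grows as the polling interval gets longer.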

Configuring Your Email Server

Email itself is a problem that has been largely solved. There is a wealth of email servers available, but this article will take a look at Postfix. Postfix is most easily installed using the package manager of your distribution and may already be installed if you have a VPS. I prefer Ubuntu on the server side, so this article will focus on that flavor. Just be aware that certain configuration file locations may vary depending on your distribution. So let’s get started.

First we need to add or change some DNS records. The instructions for how to do this will vary depending on how you have your DNS hosted. I personally use DNS Made Easy and recommend it to my clients as well, should they need DNS hosting. DNS Made Easy has very reasonable rates and quotas. Regardless of your host you need to create the following records:

  • An “A” record for your bare domain name.
  • An “A” record for “mail”.
  • An “MX” record with priority 10 that points to “mail”.
  • Optional: an SPF record.

OK, I was lying. The SPF record isn’t really optional. This is going to be a TXT Record and should read something like this:

	v=spf1 mx -all

There are several different variations you can use with SPF records but going through them would be beyond the scope of this article. Do some research and pick the combination that’s right for your setup.
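If the record above looks cryptic, the following sketch spells out how it reads. It is a simplified parser written for illustration only (the method name is hypothetical and it ignores modifiers such as redirect=); it is not how a mail server evaluates SPF.

```ruby
# Map each SPF qualifier to the result a receiving server reports on a match.
QUALIFIERS = { '+' => :pass, '-' => :fail, '~' => :softfail, '?' => :neutral }

# Split an SPF record into its mechanisms and the result each one yields.
def parse_spf(record)
  version, *terms = record.split
  raise ArgumentError, 'not an SPF record' unless version == 'v=spf1'

  terms.map do |term|
    first = term[0, 1]
    if QUALIFIERS.key?(first)
      { :mechanism => term[1..-1], :result => QUALIFIERS[first] }
    else
      # A mechanism without a qualifier defaults to "+" (pass).
      { :mechanism => term, :result => :pass }
    end
  end
end

parse_spf('v=spf1 mx -all').each do |rule|
  puts "#{rule[:mechanism]} => #{rule[:result]}"
end
```

In other words, mail sent from one of the domain's MX hosts passes, and mail from anywhere else fails.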

Now, the first thing that you’re going to need to do is figure out what address you want to receive mail as. This is going to be your catch-all address. In this case we’re going to choose "killerrobot" because that just might keep spammers at bay(2). I mean, who’s going to spam a killer robot?

Reading tutorials around the web or even looking in some books will tell you that you can tell postfix to forward mail simply by piping it in /etc/aliases. You might be tempted to do something like pipe everything to a ruby script:

# /etc/aliases
...
killerrobot: "|/usr/bin/ruby /var/www/apps/myapp/current/lib/mail_receiver.rb"
*: killerrobot

This, unfortunately, won’t work. If you do it this way, all of your scripts are going to run as root. This is not what you want and can be a security concern. The proper way to do this is with a Postfix content filter. Open up /etc/postfix/master.cf. The first line after all of the comments should look like this:

# /etc/postfix/master.cf
smtp      inet  n       -       -       -       -       smtpd

Add a line right below that to tell postfix that you’re using a filter:

# /etc/postfix/master.cf
smtp      inet  n       -       -       -       -       smtpd
 -o content_filter=myapp_filter:

Then go all the way down to the bottom of the file and add your filter:

# /etc/postfix/master.cf 
smtp      inet  n       -       -       -       -       smtpd
 -o content_filter=myapp_filter:
...
myapp_filter unix -     n       n       -       -       pipe
 flags=Xhq user=deploy argv=/usr/bin/ruby /var/www/apps/myapp/current/lib/mail_receiver.rb

The “X” flag in “flags=Xhq” tells Postfix that an external command performs final delivery. This changes the reported message status from “relayed” to “delivered”. The “h” flag folds the domain part of the recipient addresses to lowercase, and the “q” flag quotes whitespace and other special characters in the address localparts. Now, reload Postfix by running sudo postfix reload. At this point, you should have a very basic mail server configured to receive email and pipe it into the mail_receiver.rb script.
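One detail worth knowing before writing the script: Postfix hands the message to the filter command on standard input and interprets the command's exit status using the sysexits.h conventions, where 0 means success and 75 (EX_TEMPFAIL) asks Postfix to try again later. The sketch below shows that contract with the standard library only. The receive method and the block that stands in for the queue are assumptions for illustration, not part of the setup above.

```ruby
require 'stringio'

EX_OK       = 0   # message accepted
EX_TEMPFAIL = 75  # temporary failure; Postfix should defer and retry

# Reads a raw message from `input` and hands it to the given block
# (a stand-in for "put it on the queue"). Returns the exit status to use.
def receive(input)
  raw = input.read
  yield raw
  EX_OK
rescue StandardError => e
  warn "mail_receiver failed: #{e.message}"
  EX_TEMPFAIL
end

# Demo with an in-memory message instead of $stdin.
status = receive(StringIO.new("Subject: hi\n\nbody")) do |raw|
  puts "queued #{raw.bytesize} bytes"
end
puts "exit status: #{status}"
```

In a real filter script the last step would be something like exit receive($stdin) { ... }, so that a queue outage defers the mail instead of bouncing it.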

Handling The Email

We’re going to put all of our incoming mail into a message queue and parse it with the ‘mms2r’ gem. In this article I’m going to use beanstalkd(3) but you could substitute your favorite message queue for this part of the architecture. I’m going to assume that a message queue is already installed and running and that you have both the ‘tmail’ and ‘mms2r’ gems installed.

We want our mail_receiver script to be super lean. It’s only going to serve one function: put the incoming mail into a queue. We’ll process the queue later, but for now we just want to get it in there and handle any attachments. We want it to be super lean because if we’re receiving a lot of mail we don’t want this script to be memory intensive or take a long time to start up or run. It will look something like this:

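#!/usr/bin/env ruby
# Reads one raw email from standard input (piped in by Postfix) and queues a job.
require 'rubygems'
require 'tmail'
require 'mms2r'
require 'beanstalk-client'
 
message = $stdin.read
mail = TMail::Mail.parse(message)
mms = MMS2R::Media.new(mail)
 
if !mail.to.nil?
  # Guard against messages that arrive without any media attached.
  media = mms.default_media
  BEANSTALK = Beanstalk::Pool.new(['127.0.0.1:11300'])
  # yput serializes the job hash to YAML before putting it on the queue.
  BEANSTALK.yput({:type => 'received_email', 
    :to => mail.to.flatten.first.gsub('@mydomain.com', ''), 
    :tags => mail.subject, 
    :attachment => media && media.path})
end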

What we’re doing here is taking the email message from standard input and parsing it into a TMail object. TMail is a great library that takes care of most of the formatting for us. It lets us do things like refer to mail messages as objects and use mail.to, mail.from, etc. If we have attachments, they’re going along as well.

MMS2R is an amazing piece of software. It works for both emails and, as the name implies, MMS messages as well. What makes it so amazing? There are dozens of different formats that an attachment can come in from both email and MMS. Different phone carriers each have their own way of doing MMS attachments, each of them slightly different. MMS2R alleviates the problem of trying to parse all of these different formats and does it all for you. In this way we can call MMS2R::Media.new(mail) and be done with it.

For the purposes of our example application, the user can tag the photos they upload by putting the different tags in the subject. We send that in as another option in the job hash for beanstalkd. Each user is assigned a unique identifier in their account that lets them send email to the application, for example “aslkdf32@myapp.com”. We grab the first recipient from mail.to because that comes in as an array. We take the domain out and send the rest in as the “to” field. Finally, the temporary media location on disk that we parsed using MMS2R is put into the queue as the :attachment option. Our mail is in the queue.
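Here is a standalone sketch of that address and subject handling, using only plain Ruby so it can be tried without the gems. The build_job name, the APP_DOMAIN constant, and the comma-separated tag format are assumptions for illustration; the receiver script above simply passes the raw subject along.

```ruby
# Hypothetical domain the application receives mail on.
APP_DOMAIN = 'myapp.com'

# Build the job hash from the recipient list, subject, and attachment path.
# Returns nil when the first recipient is not an address on APP_DOMAIN.
def build_job(recipients, subject, attachment_path)
  address = Array(recipients).flatten.first.to_s.downcase
  token, domain = address.split('@', 2)
  return nil unless domain == APP_DOMAIN && !token.to_s.empty?

  # Assumption: tags are separated by commas in the subject line.
  tags = subject.to_s.split(',').map { |t| t.strip }.reject { |t| t.empty? }
  { :type => 'received_email', :to => token, :tags => tags,
    :attachment => attachment_path }
end

p build_job(['aslkdf32@myapp.com'], 'beach, sunset', '/tmp/photo.jpg')
p build_job(['someone@example.org'], 'spam', nil)
```

Checking the domain instead of stripping it with gsub means mail addressed to some other domain is rejected rather than queued with a strange identifier.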

Processing the Queue

Now that we have our email in the queue, we need to get it out. For this part, we’re actually going to load the Rails environment. I have this in the lib directory. The code would look something like this:

#!/usr/bin/env ruby
require File.join(File.dirname(__FILE__), '..', 'config', 'environment')
require 'rubygems'
require 'beanstalk-client'
require 'yaml'
beanstalk_config = YAML.load_file("#{RAILS_ROOT}/config/beanstalk.yml")
 
@logger = Logger.new("#{RAILS_ROOT}/log/queue.#{Rails.env}.log")
@logger.level = Logger::INFO
 
BEANSTALK = Beanstalk::Pool.new(beanstalk_config[Rails.env])
 
loop do
  # reserve blocks until a job is available.
  job = BEANSTALK.reserve
  job_hash = job.ybody
  case job_hash[:type]
  when 'received_email'
    @logger.info("Got email: #{job_hash.inspect}")
    if EmailProcessor.process(job_hash)
      job.delete
    else
      @logger.warn("Did not process email: #{job_hash.inspect}")
      job.bury
    end
  else
    @logger.warn("Don't know how to process #{job_hash.inspect}")
    # Bury unknown jobs so they are not reserved again and again.
    job.bury
  end
end

The first line loads the Rails environment so we have access to all of our ActiveRecord models. We want to keep our code DRY and use only one method of processing an attachment, routing messages, or the like. If we were using attachment_fu or paperclip, we would keep this code in the model. You might even want to make a separate class, such as a presenter, for your logic. In this case the EmailProcessor class finds the user based on the reception_email attribute and then executes the “do_stuff” method to process the message. It would look something like this:

require 'local_file'
require 'tmail'
require 'mms2r'
 
class EmailProcessor
  attr_accessor :user, :options
 
  # Returns a truthy value on success. The queue worker deletes the job
  # when it does and buries the job when it does not.
  def self.process(*args)
    email_processor = new(*args)
    email_processor.do_stuff
  end
 
  def initialize(*args)
    @options = args.extract_options!
    find_user
  end
 
  def find_user
    @user = User.find(:first, :conditions => {:reception_email => @options[:to]})
  end
 
  def do_stuff
    # Your actual logic would go here...
  end
end

This uses the LocalFile class from Ben Rubenstein(4).
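Keep in mind that the queue worker deletes a job only when EmailProcessor.process returns a truthy value, so do_stuff needs to return one on success. The sketch below shows that delete-or-bury decision in isolation. FakeJob and handle_job are hypothetical names that stand in for a beanstalkd job and the body of the worker loop, and rescuing exceptions is an addition of mine rather than something the worker above does.

```ruby
# Stand-in for a beanstalkd job; it records which call the worker made.
class FakeJob
  attr_reader :state

  def initialize(body)
    @body = body
    @state = :reserved
  end

  def ybody
    @body
  end

  def delete
    @state = :deleted
  end

  def bury
    @state = :buried
  end
end

# Delete the job when the processor returns a truthy value. Bury it otherwise,
# including when the processor raises, so one bad email cannot crash the loop.
def handle_job(job)
  ok = begin
    yield job.ybody
  rescue StandardError
    false
  end
  ok ? job.delete : job.bury
  job.state
end

puts handle_job(FakeJob.new({ :to => 'aslkdf32' })) { |opts| !opts[:to].nil? }
puts handle_job(FakeJob.new({ :to => nil })) { |opts| !opts[:to].nil? }
```

Buried jobs stay in beanstalkd until they are kicked or deleted, which is what lets you inspect and reprocess failed emails later.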

We’re not quite done yet. We need to make the mail_processor run as a daemon instead of just running “ruby mail_processor.rb” when we want to launch it. We’ll use the ‘daemons’ library for that. This will take care of setting up PID files and lets us do ruby mail_processor_control.rb start and ruby mail_processor_control.rb stop. We’re also using the “daemons_extension” file from Rapleaf that actually gives feedback on stopping of the daemon. The script itself is extremely simple and goes in the lib directory with your mail_processor.rb script:

require 'rubygems'
require 'daemons'
require File.join(File.dirname(__FILE__), 'daemons_extension')
 
ENV['RAILS_ENV'] ||= 'development'
 
options = {
  :app_name  => 'processor',
  :dir_mode  => :script,
  :dir       => '../log',
  :backtrace => true,
  :mode      => :load,
  :monitor   => true
}
 
Daemons.run(File.join(File.dirname(__FILE__), 'mail_processor.rb'), options)

Now just start it by running “ruby mail_processor_control.rb start” and your daemon will be up and running. That’s it! You’re receiving e-mail in your Rails app. It is very important that you load the Rails environment in your worker and not in the daemon controller. The controller only manages the worker and does not have access to the Rails environment, so it will error out if you try to load it there.

Considerations

Depending on your configuration, you may want to use a different message queue than beanstalkd. I’ve personally found beanstalkd to be reliable but your architecture might call for something else. For example, you may want to put your message queue on another server. If you did this then you wouldn’t have access to the temporary storage that MMS2R defaults to for saving the attachments. In that case you could put the attachments directly in the queue, on S3, etc.
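As a rough illustration of the “directly in the queue” option, the sketch below Base64-encodes the file into the job hash on the receiving side and writes it back to a temp file on the worker side. Both method names are hypothetical. Keep in mind that beanstalkd caps the size of a job (64KB by default unless raised with its -z option) and that Base64 inflates the data by about a third, so this only suits small attachments.

```ruby
require 'tempfile'

# Receiver side: embed the file's bytes in the job hash.
def job_with_inline_attachment(path, extra = {})
  data = File.open(path, 'rb') { |f| f.read }
  # pack('m0') is Base64 without line breaks.
  extra.merge(:filename => File.basename(path),
              :attachment_data => [data].pack('m0'))
end

# Worker side: write the decoded bytes to a local temp file and return it.
def restore_attachment(job_hash)
  file = Tempfile.new('attachment')
  file.binmode
  file.write(job_hash[:attachment_data].unpack('m0').first)
  file.flush
  file
end

# Demo: round-trip a small binary file.
src = Tempfile.new('photo')
src.binmode
src.write("\xFF\xD8 not really a jpeg".b)
src.flush

job = job_with_inline_attachment(src.path, { :type => 'received_email' })
copy = restore_attachment(job)
puts File.size(copy.path) == File.size(src.path)
```

For anything larger, uploading the file to shared storage and queueing only its location keeps the jobs small.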

Some people have reported problems using the daemons library and having their daemon just halt and stop responding. I’ve never encountered that and I’ve had a similar setup running for months. You will also want to put your mail_processor_control under some sort of process supervision, such as monit or god.

You may be asking yourself why we didn’t use ActionMailer to handle the incoming emails, since it can do that. The answer is that if you do it the way it’s described, for example, on the Rails wiki, it will spin up a new Rails process for each email that’s received. Under any significant load, this will fail(5). Another drawback to that approach is that if there is a failure, you lose the email. With this type of architecture, it remains in the queue and you can just process it later.

Conclusion

This is a good start to handling email in your application. Being able to process email is a great way to enhance your app and give your users mobile access. With email-capable phones becoming ubiquitous, they no longer need to be tied to a computer to use your app. Remember, an app that can be used anywhere is an app that will be used anywhere.

Citations

  1. http://railstips.org/2008/10/27/using-gmail-with-imap-to-receive-email-in-rails
  2. No it won’t.
  3. http://xph.us/software/beanstalkd/
  4. http://www.benr75.com/articles/2008/01/04/attachment_fu-now-with-local-file-fu
  5. Because rails can’t scale. See http://canrailsscale.com for more information.

18 Responses to “Receiving Email with Rails”

  1. giles bowkett April 24, 2009 at 7:14 am #

    hey jason – also check out Astrotrain on github. it receives e-mail and issues xmpp or http in response. blatant plug for my employer but it's what enables Tender to turn e-mails into support discussions. like how Tripit turns e-mails into itineraries automagically.

  2. technoweenie April 24, 2009 at 7:33 am #

    Piping emails to a ruby process can present some scaling issues. Each email is basically another ruby process that loads rubygems, tmail, mms2r, etc. It's probably better to just have postfix dump emails to a maildir and have tmail read them in a little ruby daemon: http://tmail.rubyforge.org/rdoc/classes/TMail/M… . I handled all of our email processing fine with ruby pipes for awhile. But when I moved to our current host, they were quick to suggest using a maildir.

    I didn't know about mms2r, but I'll take a look at that. I think I parse the mail parts manually with tmail right now, blah :)

  3. morgancurrie April 24, 2009 at 7:36 am #

    how do you guys deal with stripping the quoted of the reply?

  4. phegaro September 25, 2009 at 7:40 am #

    Some of the other information that i have read talks about how doing it this way means that a new ruby process is started and killed for each email that comes into the system. According to the math above that would mean that you have about 833 processes being created/deleted per minute on the server. Are you guys having issues with that model?

    It seems the polling model avoids this but not sure if the tradeoff for fewer processes is worth the loss in responsiveness and can we just poll more often to get rid of the issue? Like once every minute?

  5. jasonseifer September 25, 2009 at 11:43 am #

    Doing it this way specifically avoids spinning up a new process each time as you're running it as a daemon. You wind up having one process running constantly that processes each one as they come in.

  6. phegaro September 25, 2009 at 6:37 pm #

    I thought the mail_receiver.rb file will be launched for each email received and then will stick the email in the queue. I realize that it does not do much but it is creating a ruby process and letting it exit for each email. Is that a high overhead on a system?

  7. Great article. Only I would use an unless statement, rather than if and a boolean negation operator. more ruby like and more readable IMHO.

    i.e.,

    unless mail.to.nil?
    BEANSTALK = Beanstalk::Pool.new(['127.0.0.1:11300'])
    …
    end

  8. Jose November 9, 2009 at 6:42 pm #

    I have several webapps needing this service hosted on the same server (each app receiving mail sent to a different domain).

    I'm thinking of creating a separate Rails project just to do the mail receiving. The domain info would be stored in the model so the webapps can separate their mail from each other.

    Does this sound reasonable?

  9. Me January 13, 2010 at 5:55 pm #

    Is this true?

  10. phegaro January 13, 2010 at 6:04 pm #

    After looking at this we went with a model where we had a daemon that would just sit there and poll frequently to an IMAP server. We are using a google apps account to do this and it makes this much easier. You dont have to run your own mail server, you can just poll as frequent or non frequent as you want and if you really want instant you can turn on IMAP IDLE support and then you can get the mail as they come in. Since there is only one daemon running we no longer have this problem.

  11. Devon Sonterre September 5, 2010 at 8:34 am #

    A wide variety of things cause to do this.

  12. Oma Gildner October 9, 2010 at 10:51 am #

    Awesome blogging site! Now this technologies won’t cease to surprise me personally. So anyways, please maintain the truly great work. If you have the time, I want to listen to your new thoughts on other related things to that. I have to say, my interest is piqued. Now whereis the subscribe option!

  13. Walter Schreppers February 14, 2011 at 5:40 am #

    Thanks for this great tutorial! I will be using this in a project soon.
    One question though, when you edit main.cf instead of using aliases I don’t see you specify killerrobot anywhere?

    So this means all emails go to the mail_receiver.rb? So that is all emails from all domains running on this postfix ?

    If this is so, it might also be an ideal place to do some ruby based spam filtering ;)

    Anyway thanks for the great write up. Hope you can find the time to clearify where killerapp@yourdomain.com is used. Or maybe ‘anything’@yourdomain.com. Or how it can be modified to do this since I have about 10 domains running on 1 server, don’t want all of their mails being processed by this system, just one or two of the used domains from my postfix/bind.

    Kind regards,
    W.

  14. ben May 5, 2011 at 6:41 pm #

    <3 u jason

  15. m4rtijn April 5, 2012 at 5:06 am #

    Hi,

    hope this topic is not dead yet.

    How about the security risk, having your app & maybe DB connection data on a mail server which is reachable from the outside?

    Cheers
    Martijn

  16. Olivier El Mekki April 22, 2013 at 12:57 pm #

    > This, unfortunately, won’t work. If you do it this way, all of your scripts are going to run as root.

    Why not using sudo ? I suppose that on most system, it will be run as some kind of postfix user rather than root, and you can add a line in sudoers so that postfix can call your ruby command (and only that) as your application user, without password.

    It also seems to me it has the advantage to let your app process mail sent to a defined recipient only, and has mail to other recipients (read : human system users) delivered to their mailbox.

  17. Olivier El Mekki April 23, 2013 at 7:57 am #

    Ok, this is working for me : https://gist.github.com/oelmekki/dd1a9ff78584e43aa8f2

    So basically, postfix is set to deliver mails as user “incoming_mailer”. Sudoers is set to let this user call our ruby script with no password. And finally, alias is set to pipe mail to that user and switching to app user through sudo.

    Additionally, I’ve made a regex virtual alias to match /contact\+.*/ recipients, so I can embed thread id.

    The advantage of this is that system user can still receive mails through normal transport, and only “contact” pseudo user’s mails are piped to script.
