One of the little joys of being a sysadmin is solving a problem by composing a highly specific tool out of much more general tools and libraries. The evolution of this maildir linting program shows how effective that model is, and illustrates a bit of the iterative problem solving that’s a sysadmin’s bread and butter. This post gets us to a basic level of functionality; a coming Part 2 will extend it to meet a new set of requirements.

While migrating from our current mail system to a newer, shinier one, we discovered a number of messages that imapsync couldn’t process because they were missing Message-ID headers. Fortunately we store mail on the backend in Maildir format, where every message is its own file, so we can iterate over the messages without worrying about locking, and we can use Python’s built-in email library (and os.walk) to handle all the messy parsing and header-splitting:

import os
import time
from email import message_from_string, utils

total_count = 0
for dirname, dirnames, filenames in os.walk(maildir):
    for thisfile in filenames:
        fullpath = os.path.join(dirname, thisfile)
        f = open(fullpath, 'r')  # no "with" block: we might be using an older python
        data = f.read()
        f.close()
        message = message_from_string(data)
        # header names are case-insensitive, so compare them in upper case
        if 'MESSAGE-ID' not in (header.upper() for header in message.keys()):
            msgid = utils.make_msgid("RETCON.%f.%i" % (time.time(), total_count))
            message.add_header("Message-Id", msgid)
            total_count = total_count + 1
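
If you’re on Python 3, the same loop can be wrapped up as a generator. What follows is only a sketch under a few assumptions: the name add_missing_message_ids is made up for illustration, files are read as bytes (real mail isn’t always valid text), and nothing is written back to disk.

```python
import os
import time
from email import message_from_bytes, utils


def add_missing_message_ids(maildir):
    """Yield (path, original_bytes, patched_message) for each message lacking a Message-ID.

    Hypothetical helper: the name and return shape are illustrative only.
    """
    count = 0
    for dirname, _dirnames, filenames in os.walk(maildir):
        for name in filenames:
            path = os.path.join(dirname, name)
            with open(path, "rb") as f:
                data = f.read()
            message = message_from_bytes(data)
            # Message's "in" test matches header names case-insensitively.
            if "Message-ID" not in message:
                msgid = utils.make_msgid("RETCON.%f.%i" % (time.time(), count))
                message.add_header("Message-Id", msgid)
                count += 1
                yield path, data, message
```

Each yielded tuple carries the path, the untouched bytes and the patched message, which is everything a later diff or backup step would need.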

This is probably fine, but we’d like to see exactly what it changes in case it doesn’t do what we expect. Another handy corner of the standard library, difflib, can show each message before and after as a unified diff:

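import difflib

before = data.split('\n')
after = message.as_string().split('\n')
# lineterm='' because the split lines no longer carry their newlines
for line in difflib.unified_diff(before, after, fromfile=fullpath, tofile=fullpath, lineterm=''):
    print(line)  # the parenthesized form works on Python 2 and 3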
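
To get a feel for what that output looks like without touching a real mailbox, here is a small self-contained sketch. The sample message, the placeholder Message-Id and the "original"/"patched" labels are all invented for the example.

```python
import difflib
from email import message_from_string

# A made-up message with no Message-ID header.
original = "From: alice@example.com\nSubject: hello\n\nbody text\n"
message = message_from_string(original)
message.add_header("Message-Id", "<placeholder@example.com>")

diff_lines = list(
    difflib.unified_diff(
        original.split("\n"),
        message.as_string().split("\n"),
        fromfile="original",
        tofile="patched",
        lineterm="",  # the split lines have no trailing newlines
    )
)
for line in diff_lines:
    print(line)
```

For a message this simple, the diff should contain a single added line carrying the new header, which is exactly the kind of sanity check we’re after.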

For further paranoia, we’d also like to save a backup of every message we modify. Look no further than tarfile for a nifty way to write a tbz2 directly, without creating any intermediate files to clog things up:

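import tarfile
backuptar = tarfile.open(full_backup_path, "w|bz2")  # "|" means a stream, compressed as it is written
...
backuptar.add(fullpath)  # the original copy of each message we modify
...
backuptar.close()  # finish the archive once all messages are processed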
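
Pulled together, the backup step might look something like the sketch below. The backup_files helper and its arguments are hypothetical names chosen for the example; only tarfile.open and add come from the standard library.

```python
import tarfile


def backup_files(paths, archive_path):
    """Write the given files into a bzip2-compressed tar archive.

    Hypothetical helper, shown only to illustrate the streaming mode.
    """
    # "w|bz2" opens a compressed write stream, so no uncompressed
    # .tar file is ever created on disk.
    with tarfile.open(archive_path, "w|bz2") as backup:
        for path in paths:
            backup.add(path)
```

Using the archive as a context manager closes it even if add raises partway through, so we don’t leave a half-finished stream open.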

With some error checking and command-line parsing added, we have not only a solution to the immediate problem but also a nice platform to extend, should we find other things that need adjusting during the migration.
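
As a rough idea of what that command-line layer could look like, here is a minimal argparse sketch. The option names (--backup, --dry-run) and the parse_args helper are assumptions made up for illustration, not the interface of the actual script.

```python
import argparse
import os


def parse_args(argv):
    """Parse and sanity-check the command line (hypothetical options)."""
    parser = argparse.ArgumentParser(
        description="Add missing Message-ID headers to messages in a Maildir."
    )
    parser.add_argument("maildir", help="path to the Maildir to scan")
    parser.add_argument("--backup", metavar="TARFILE",
                        help="save originals of modified messages to this tbz2")
    parser.add_argument("--dry-run", action="store_true",
                        help="print diffs without changing anything")
    args = parser.parse_args(argv)
    # Basic error checking: fail early with a usage message.
    if not os.path.isdir(args.maildir):
        parser.error("%s is not a directory" % args.maildir)
    return args
```

Calling parse_args(sys.argv[1:]) at the top of the script would then give the rest of the code a single args object to consult.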