(Almost) Obtain Raw Mail Message Source

Let’s say you need to get the raw RFC 2822 mail message source out of Mail.app and into a script. The accepted way to do this is:

on run {input, parameters}
    set theSource to {}
    tell application "Mail"
        repeat with aMessage in input
            set end of theSource to aMessage's source & return
        end repeat
    end tell

    return theSource as text
end run

Problems with this approach

However, this fails because Mail.app does not provide the full message source over the bridge, which is buggier than a bait store these days.

I get reasonably accurate source until the end of the first MIME part (which is usually the text one), and then the output only lists the attachments but does not include them.
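One way to confirm this symptom is to parse whatever source you did get and look at how much data each part carries. The following is only a minimal Python 3 sketch: the helper names list_parts and empty_attachments are made up for illustration, and it assumes the truncated parts show up as headers followed by an empty body.

```python
import email

def list_parts(raw_bytes):
    """Return (content_type, filename, decoded_size) for every non-container part."""
    msg = email.message_from_bytes(raw_bytes)
    rows = []
    for part in msg.walk():
        # multipart/* are just containers
        if part.get_content_maintype() == 'multipart':
            continue
        # get_payload(decode=True) returns None for container-like parts
        payload = part.get_payload(decode=True) or b''
        rows.append((part.get_content_type(), part.get_filename(), len(payload)))
    return rows

def empty_attachments(raw_bytes):
    """Names of parts that declare a filename but carry no data (assumed symptom)."""
    return [name for _, name, size in list_parts(raw_bytes) if name and size == 0]
```

Feeding it the bytes that came out of Mail.app (for instance sys.stdin.buffer.read()) and getting a non-empty list back from empty_attachments suggests that Mail.app did not provide the full source.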

If Mail.app did return the full source, you’d be able to save all attachments with this script:

import os, os.path, sys, email, mimetypes

# Python 3: read the raw message from stdin as bytes so 8-bit content survives
msg = email.message_from_binary_file(sys.stdin.buffer)
downloads = os.path.join(os.environ['HOME'], 'Downloads')

counter = 1
for part in msg.walk():
    # multipart/* are just containers
    if part.get_content_maintype() == 'multipart':
        continue
    data = part.get_payload(decode=True)
    if data is None:
        # Nothing to write (e.g. an embedded message/rfc822 wrapper)
        continue
    # Keep only the last path component so a message can't write outside Downloads
    filename = os.path.basename(part.get_filename() or '')
    if not filename:
        ext = mimetypes.guess_extension(part.get_content_type())
        if not ext:
            # Use a generic bag-of-bits extension
            ext = '.bin'
        filename = 'part-%03d%s' % (counter, ext)
    counter += 1
    target = os.path.join(downloads, filename)
    copy = 1
    prefix = "Copy %d of "
    # Never overwrite an existing file; prepend "Copy N of " instead
    while os.path.exists(target):
        target = os.path.join(downloads, (prefix % copy) + filename)
        copy += 1
    with open(target, 'wb') as fp:
        fp.write(data)

Workaround

Drag the messages to the Finder, which saves each one as an .eml file, and build a script that uses this block instead, taking as input the result of Ask for Finder Items:

import os, os.path, sys, email, mimetypes

downloads = os.path.join(os.environ['HOME'], 'Downloads')

# Python 3: expects one file path per line on stdin
for line in sys.stdin:
    path = line.strip()
    if not path.lower().endswith('.eml'):
        # Skip anything that isn't a saved message instead of stopping
        continue
    with open(path, 'rb') as f:
        msg = email.message_from_binary_file(f)
    counter = 1
    for part in msg.walk():
        print(part.get_content_type())
        # multipart/* are just containers
        if part.get_content_maintype() == 'multipart':
            continue
        data = part.get_payload(decode=True)
        if data is None:
            # Nothing to write (e.g. an embedded message/rfc822 wrapper)
            continue
        # Applications should really sanitize the given filename so that an
        # email message can't be used to overwrite important files; as a bare
        # minimum, keep only the last path component
        filename = os.path.basename(part.get_filename() or '')
        if not filename:
            ext = mimetypes.guess_extension(part.get_content_type())
            if not ext:
                # Use a generic bag-of-bits extension
                ext = '.bin'
            filename = 'part-%03d%s' % (counter, ext)
        counter += 1
        target = os.path.join(downloads, filename)
        copy = 1
        prefix = "Copy %d of "
        # Never overwrite an existing file; prepend "Copy N of " instead
        while os.path.exists(target):
            target = os.path.join(downloads, (prefix % copy) + filename)
            copy += 1
        print(len(data))
        with open(target, 'wb') as fp:
            fp.write(data)
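
The comment in the script about sanitizing attachment names deserves a little more than a note. Here is a minimal Python 3 sketch of what that could look like; safe_filename and its fallback default are hypothetical names, and the character whitelist is a conservative assumption rather than a complete defense.

```python
import os
import re

def safe_filename(name, fallback='attachment.bin'):
    """Reduce an untrusted attachment name to a plain, single-component file name."""
    # Drop any directory components, treating backslashes as separators too
    name = os.path.basename(name.replace('\\', '/'))
    # Replace everything except word characters, dots, dashes and spaces
    name = re.sub(r'[^\w.\- ]', '_', name)
    # Strip leading/trailing dots and spaces (no hidden files, no '..')
    name = name.strip(' .')
    return name or fallback
```

With this in place, the result of part.get_filename() would go through safe_filename before being joined onto the Downloads path, so a name like ../../etc/passwd ends up as just passwd.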