Renaming and Filing Photos and Videos in the HEIC/HEIF era

Maintaining a coherent photo archive has been an issue for me ever since I started using a digital camera, and I’ve been hacking scripts for making sure things are consistent , which, if you’ve been using Macs long enough, was a full four years before supported .

My needs are actually pretty simple: I’ve long decided to just keep a master copy of everything on my NAS using a simple hierarchy (YYYY/MM/YYYYMMDDHHMMSS.foo) and eschew fancy albums.

There are two main reasons for that:

  • Pretty event-oriented albums are always short-lived because you end up having to use copies of your masters in one way or another, and I don’t want to tie my archive to a gazillion different album/social sharing tools [1].
  • Apple just can’t seem to maintain any kind of stable photo sharing service over the years: even updating an album on the Apple TV, on the same network as your Mac, is an unsightly mess.

So the real challenge is ensuring the files are filed properly according to creation date.
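Once a file carries a timestamp name, working out where it belongs in the YYYY/MM hierarchy is mechanical. A minimal sketch of that mapping follows; the `archive_path` helper and the `/volume1/photos` root are illustrative assumptions, not part of my actual setup, and the optional one-letter suffix is only there to tolerate names that collided on the same second.

```python
import re
from pathlib import PurePosixPath

# 14-digit timestamp, optional one-letter collision suffix, then an extension
STAMP = re.compile(r"^(\d{4})(\d{2})\d{8}[a-z]?\.\w+$")


def archive_path(root: str, filename: str):
    """Return root/YYYY/MM/filename, or None if the name is not a timestamp."""
    match = STAMP.match(filename)
    if match is None:
        return None
    year, month = match.groups()
    return PurePosixPath(root) / year / month / filename


# hypothetical NAS root, for illustration only
print(archive_path("/volume1/photos", "20220304101112a.heic"))
print(archive_path("/volume1/photos", "IMG_1234.JPG"))
```

Anything that comes back as `None` still needs renaming before it can be filed.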

Where they come from is largely irrelevant, although my main “inbox” is still iCloud Photos and most things tend to go through there in one way or another for the sake of triage and the odd tweak [2].

But filenames are always a jumble and I just can’t rely on filesystem dates, so I’ve had a number of different strategies over the years, especially as we moved on from taking photos in JPEG and Canon .cr2 formats (which use metadata) to HEIF/HEIC photos from more modern iPhones.

Plus, of course, video, which even Swiss Army knife-type tools like exiftool have trouble with, and which is now at least half the storage of any new year I add to my archive.

Tooling

On the Mac ecosystem, things really haven’t improved that much in the past few years.

The best I can say is that the Photos app exports literally everything well enough if you have “Download originals to this Mac” set, even down to (quite surprisingly) actually setting filesystem modification dates correctly, so I can mostly trust it to do the right thing.

But in this new, nerfed scripting era of Shortcuts there just aren’t any decent ways to go about renaming files in bulk depending on their metadata, and Photos cannot normalize file names the way I want to.

I’ve used a script to wrap jhead for a long while, but that doesn’t work for HEIF files or video, so recently I decided to cheat and resort to reading Spotlight (mds) metadata via a hacky shell script:

#!/bin/zsh

for FILE in "$@" 
do
    if [[ ! -f $FILE ]] then
        continue
    fi
    EXTENSION="${FILE##*.}"

    # Ensure we only handle the kinds of media we want
    if [[ "DNG dng JPG JPEG jpeg jpg PNG png MOV mov HEIC heic" != *$EXTENSION* ]] then
        continue
    fi

    # Normalize extensions: lowercase everything, then fold jpeg into jpg
    EXTENSION="${(L)EXTENSION}"
    if [[ "$EXTENSION" == "jpeg" ]] then
      EXTENSION="jpg"
    fi

    # Now grab the date that macOS (Spotlight) has already indexed for us
    METADATA=$(mdls "$FILE")
    # The space prevents matching derived properties
    ISO_DATE=$(echo $METADATA | grep "kMDItemContentModificationDate " | cut -d= -f 2 | sed 's/[^0-9]//g' | cut -c1-14)
    if [[ ${#ISO_DATE} -eq 14 ]] then 
      # Compare against the basename so paths like ./foo.jpg are handled too
      if [[ "${FILE:t}" == "$ISO_DATE.$EXTENSION" ]] then
        continue
      fi
      if [[ ! -f "$ISO_DATE.$EXTENSION" ]] then
        mv "$FILE" "$ISO_DATE.$EXTENSION"
      else
        for SUFFIX in {a..z}; do
          SUFFIXED_DATE="$ISO_DATE$SUFFIX"
          if [[ ! -f "$SUFFIXED_DATE.$EXTENSION" ]] then
            mv "$FILE" "$SUFFIXED_DATE.$EXTENSION"
            break
          fi
        done
      fi
    fi
done

This is a little barbaric, though, and only works on macOS.
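To make the date extraction in that script explicit, here is a minimal Python sketch of the same parsing step. It only processes text that has already been captured from `mdls`; the `mdls_stamp` helper and the sample output are assumptions made up for illustration. Requiring whitespace right after the key name plays the same role as the trailing space in the `grep` pattern: it keeps derived properties such as `kMDItemContentModificationDate_Ranking` from matching.

```python
import re


def mdls_stamp(mdls_output: str, key: str = "kMDItemContentModificationDate"):
    """Return the 14-digit timestamp for `key`, or None if it is not present."""
    # whitespace must follow the key, so "<key>_Ranking" and friends are ignored
    pattern = re.compile(rf"^[ \t]*{re.escape(key)}[ \t]+=[ \t]*(.+)$", re.MULTILINE)
    match = pattern.search(mdls_output)
    if match is None:
        return None
    # keep digits only and drop the trailing timezone offset
    digits = re.sub(r"\D", "", match.group(1))[:14]
    return digits if len(digits) == 14 else None


# made-up sample in the general shape of mdls output
SAMPLE = (
    "kMDItemContentModificationDate_Ranking = 2022-03-04 00:00:00 +0000\n"
    "kMDItemContentModificationDate         = 2022-03-04 10:11:12 +0000\n"
)
print(mdls_stamp(SAMPLE))
```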

Going Cross-Platform

I wanted something more reliable and future-proof, so I found a HEIF plugin for Pillow (pi_heif), reached for the ffmpeg bindings and wrote this:

#!/usr/bin/env python3

from os import listdir, rename, stat, chdir
from os.path import splitext, exists
from stat import S_ISDIR
from string import ascii_lowercase
from sys import argv, exit
from time import strftime, gmtime
from typing import Optional

from ffmpeg import probe
from PIL import Image, UnidentifiedImageError
from PIL.ExifTags import TAGS
# imported only so that Pillow can open HEIF/HEIC files
from pi_heif import HeifImagePlugin  # noqa: F401

# build a list of alphabetical suffixes, starting with a blank
SUFFIXES = [''] + list(ascii_lowercase)

PHOTO_EXTENSIONS = [".jpg",".jpeg",".heic", ".png", ".cr2", ".dng", ".gif"]
VIDEO_EXTENSIONS = [".mp4", ".m4v", ".mov"]

def parse_exif(image: Image.Image) -> Optional[dict]:
    """Return the image's top-level EXIF tags keyed by name, or None if it has none."""
    if not image.info.get("exif"):
        return None
    # fall back to the numeric tag ID when Pillow has no name for it
    return {TAGS.get(k, k): v for k, v in image.getexif().items()}


def safe_rename(filename: str, date: str, marker: str="-") -> str:
    """Rename filename to <date><suffix><ext>, using the first suffix that is free."""
    ext = splitext(filename)[1].lower()
    for s in SUFFIXES:
        new_filename = f"{date}{s}{ext}"
        if not exists(new_filename):
            print(f"{filename} -{marker}-> {new_filename}")
            rename(filename, new_filename)
            return new_filename
    # every suffix is already taken, so leave the file untouched
    print(f"{filename} -!-> {filename}")
    return filename


def scan_files(path: str) -> int:
    if exists(path) and S_ISDIR(stat(path).st_mode):
        chdir(path)
    else:
        print(f"invalid path {path}")
        return -1

    for filename in listdir():
        (name, ext) = splitext(filename)
        ext = ext.lower()
        # photos
        if ext in PHOTO_EXTENSIONS:
            try:
                image = Image.open(filename, "r")
                tags = parse_exif(image)
                image.close()
            except UnidentifiedImageError:
                tags = None
            # use EXIF data
            if tags and 'DateTime' in tags:
                # We get this as a string, so we can use it right away
                date = tags['DateTime'].replace(" ","").replace(":","")
                if not filename.startswith(date):
                    safe_rename(filename, date)
            # use modification date instead (Apple Photos sets it correctly on export)
            else:
                # note: gmtime() yields UTC, while EXIF DateTime is normally camera-local time
                date = strftime("%Y%m%d%H%M%S", gmtime(stat(filename).st_mtime))
                if not filename.startswith(date):
                    safe_rename(filename, date, marker="?")
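        # video
        elif ext in VIDEO_EXTENSIONS:
            streams = probe(filename)["streams"]
            for s in streams:
                # not every stream carries tags, so don't assume the key exists
                stream_tags = s.get('tags', {})
                if 'creation_time' in stream_tags:
                    date = stream_tags['creation_time'].replace("T",'').replace("-","").replace(":","")[:14]
                    if not filename.startswith(date):
                        safe_rename(filename, date)
                    # the first stream with a creation time decides the name
                    break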
        else:
            print(f"skipping {filename}")
    return 0

if __name__ == "__main__":
    if len(argv) == 2:
        exit(scan_files(argv[1]))
    else:
        print(f"Usage: {__file__} <path>")

This is designed to work in almost exactly the same way the old CLI tool used to, but for all the file formats I have (except .jxr JPEG-XR files from the , which I can file manually).
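Both branches of the script boil down to squeezing a timestamp string into 14 digits by stripping separators. A slightly stricter variant would parse the string first, so that malformed values (a zeroed-out EXIF date, say) are rejected instead of becoming a file name. This is only a sketch of that idea: `exif_stamp` and `video_stamp` are hypothetical helpers that are not in the script above, and the input formats are assumed to be the usual `YYYY:MM:DD HH:MM:SS` for EXIF and an ISO-8601-style string for the video `creation_time` tag.

```python
from datetime import datetime
from typing import Optional

STAMP_FORMAT = "%Y%m%d%H%M%S"


def exif_stamp(value: str) -> Optional[str]:
    """Convert an EXIF DateTime such as '2022:03:04 10:11:12' to 14 digits."""
    try:
        # some writers pad the field with NULs or spaces
        parsed = datetime.strptime(value.strip("\x00 "), "%Y:%m:%d %H:%M:%S")
    except ValueError:
        return None
    return parsed.strftime(STAMP_FORMAT)


def video_stamp(value: str) -> Optional[str]:
    """Convert a creation_time such as '2022-03-04T10:11:12.000000Z' to 14 digits."""
    try:
        # ignore fractional seconds and the timezone designator
        parsed = datetime.strptime(value[:19], "%Y-%m-%dT%H:%M:%S")
    except ValueError:
        return None
    return parsed.strftime(STAMP_FORMAT)


print(exif_stamp("2022:03:04 10:11:12"))
print(video_stamp("2022-03-04T10:11:12.000000Z"))
print(exif_stamp("0000:00:00 00:00:00"))
```

A `None` result would then be the cue to fall back to the modification date, just as the script does when there is no EXIF data at all.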

I just archived all of my photos from 2021 and 2022 with the above, so I would call it “good enough” for the moment.

It currently sits alongside an imagehash version that I hope to finish some day and use to batch remove duplicates and cropped versions–which is going to be essential once I start archiving the photos my kids take as well…
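That version isn't shown here, but the grouping half of the problem can be sketched without any imaging library at all. The snippet below assumes each file has already been reduced to an integer perceptual hash (which is the part a library like imagehash would provide) and groups files whose hashes differ by only a few bits. The function names, the sample hashes and the threshold of 4 are all made up for illustration, and heavily cropped versions may well need something smarter than a plain bit-distance check.

```python
def hamming(a: int, b: int) -> int:
    """Number of bits in which two hashes differ."""
    return bin(a ^ b).count("1")


def group_near_duplicates(hashes: dict, max_distance: int = 4) -> list:
    """Group file names whose hash is close to the first member of a group."""
    groups = []
    for name in sorted(hashes):
        for group in groups:
            if hamming(hashes[name], hashes[group[0]]) <= max_distance:
                group.append(name)
                break
        else:
            # no existing group is close enough, so start a new one
            groups.append([name])
    # only groups with more than one file are duplicate candidates
    return [group for group in groups if len(group) > 1]


# made-up 8-bit hashes; real perceptual hashes are typically 64 bits
SAMPLE_HASHES = {
    "20220304101112.jpg": 0b11110000,
    "20220304101112a.jpg": 0b11110001,
    "20220305090000.jpg": 0b00001111,
}
print(group_near_duplicates(SAMPLE_HASHES))
```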


  1. I’ve also mostly given up on Flickr, although I might take up Pixelfed if it becomes more usable and my photographer friends join up as well. ↩︎

  2. I’ve also never found in myself the faith required to trust Adobe with my photos, although I routinely try hoping that they aren’t fussy about storage. Guess what, everyone likes to reinvent the photo database wheel, and developers seem unable to just take a read-only filesystem tree from a NAS and work with it as is. ↩︎