The 'rpt' command, UNIX program design (and introducing 'fea')

2019-10-18


A little more than three years ago, I wrote a small utility called rpt for easily renaming files. Back then, I must have had the need to rename files following a certain pattern, and it became tedious (well.. RePeTitive..) to do that on the command line. At the same time, I knew that it would be easy to do it with vim, given that the muscle memory for that one was already tediously acquired.

So, out of that came.. rpt, a simple command line tool that you can install via pip:

python3 -m pip install rpt

By default, it uses the mv command, so the idea is that you can call it as rpt * (or e.g. rpt *.jpg) and you get a list of files in your favorite text editor. You can't (read: should not) change the line ordering or remove lines, but any line that you change will cause the file to be renamed, much like opening the "folder" in Netrw and editing it there. (The Netrw mode in vim does have a keybinding, R, that allows you to rename a file, but it asks for the new name down in the command line, which is not really useful for editing.)

After saving the file, rpt will look at it and pick out any changes, and call mv <oldfilename> <newfilename> (dealing properly with spaces) on the files, which is perfect for quick file renaming.
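The "pick out any changes" step can be sketched roughly like this. This is not rpt's actual code, just a minimal illustration that assumes the edited file contains exactly one file name per line and no comment lines; plan_renames is a made-up helper name. It also shows why reordering or removing lines is a bad idea: old and new names are paired by line position.

```python
import shlex

def plan_renames(old_names, edited_text):
    """Pair each original name with the edited line at the same position."""
    new_names = edited_text.splitlines()
    if len(new_names) != len(old_names):
        # A removed or added line makes the pairing ambiguous
        raise ValueError('line count changed; cannot tell which file is which')
    # Only lines that actually changed result in a rename
    return [(old, new) for old, new in zip(old_names, new_names) if old != new]

old = ['holiday 1.jpg', 'holiday 2.jpg', 'notes.txt']
edited = 'beach 1.jpg\nholiday 2.jpg\nnotes.txt\n'

for src, dst in plan_renames(old, edited):
    # Print the command instead of running it; shlex.quote handles the spaces
    print('mv', shlex.quote(src), shlex.quote(dst))
```

Only the first file name was edited here, so only one mv command is printed.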

Adding fancy features

It then also gained options to rearrange the generated command line in a variety of ways, so that you could make it work with e.g. mpg123, converting MP3 files to WAV:

rpt --command=mpg123 --output=-w --swap *.mp3

Of course, by that point, it would have made more sense to delegate all this re-arrangement of command lines to the program given via --command and remove the logic from rpt, as "Program design in the UNIX environment" by Pike and Kernighan suggests: rpt should do one thing (pick up the arguments, open the user's editor, then create a command line and execute it). The options --swap, --output=-w and --command=mpg123 are not that easy to remember to begin with, so you could just as well write a script mp3towav that takes care of putting the arguments in the right order, with the right options to declare input/output parameters:

#!/bin/sh
exec mpg123 -w "$2" "$1"

And at this point, we could just as well take mpg123 -w {new} {old} and give that to rpt to build a command out of it, and that's just what happened today...

Simple command line with rpt 2.0.0

Version 2.0.0 of rpt removes the --swap, --output and --input options and just has the --command option (now also available as -c). The option takes a command line template in which {new} and {old} are replaced with the new and old file name, so you can build nice, arbitrary command lines.
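As a rough sketch of that kind of placeholder substitution (not rpt's real implementation; build_command is a hypothetical name, and whether rpt quotes the names for a shell or passes them as a plain argument list is an assumption on my part):

```python
import re
import shlex

def build_command(template, old, new):
    """Fill {old} and {new} in template with shell-quoted file names."""
    values = {'old': old, 'new': new}
    # Substitute in a single pass, so a file name that happens to contain
    # the text "{new}" is not expanded a second time
    return re.sub(r'\{(old|new)\}',
                  lambda m: shlex.quote(values[m.group(1)]),
                  template)

print(build_command('mpg123 -w {new} {old}', 'my song.mp3', 'my song.wav'))
```

With the template from the mpg123 example, this prints a command line with the WAV output name first and the MP3 input name second, both quoted because of the space.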

Check out rpt 2.0.0 on the rpt release page.

Auto-editing

In addition to that, auto-editing using sed is also possible (it basically always worked, but was never really "advertised" as such). Setting EDITOR to a sed command makes the editing step non-interactive, as in this example from the documentation:

env EDITOR="sed -i -e '/^[^#]/ s/$/.bak/'" rpt -c cp *.py

Of course, at that point, you could do similar things with a shell script in an easier way:

for file in *.py; do cp "$file" "$file.bak"; done

But that is not the main use case of rpt. The main use case is still mass-renaming files with your favorite text editor, interactively.

Read on for the special case.

Non-Interactive use

Imagine a command like this:

SOMECMD "cp {} {}.bak" *.py

That would output commands (with correctly quoted file names) to pipe into a shell, simply replacing {} with each argument and printing one command per line.

Here's a first try:

#!/usr/bin/env python3

import sys, shlex

_, cmd, *args = sys.argv

print('\n'.join(cmd.replace('{}', shlex.quote(arg)) for arg in args))

Cool, this now allows us to run commands like above (adding a .bak extension), but it does not allow us to easily replace ".mp3" with ".wav", one of the other use cases of rpt.

Note that you could do something similar with ls and sed, but there would still be issues with shell quoting:

ls *.py | sed -e 's/\(.*\)[.]\(.*\)/cp "\1.\2" "\1.bak"/'

If a file name contains ", $, !! or any other special character, this wouldn't work correctly. Also, the regular expression is a bit unwieldy. "I'll just use ' for quoting then!" you say, but then a file with ' in the name comes up, so that's no good, either.
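To make the quoting problem concrete, here is a small check that wraps a few ugly file names in naive quotes and compares that with shlex.quote. It uses shlex.split as a stand-in for the shell's word splitting; survives is a made-up helper name. Note that shlex.split does not model $ or !! expansion, so it is, if anything, too kind to the naive double-quote approach.

```python
import shlex

def survives(quoted, name):
    """True if a POSIX-style parser reads quoted back as exactly name."""
    try:
        return shlex.split(quoted) == [name]
    except ValueError:
        # Raised for unbalanced quotes, e.g. a stray ' or " in the name
        return False

names = ['some$weird "filename.py', "this is ridiculous'.py", 'and !! more.py']

for name in names:
    naive_double = '"' + name + '"'
    naive_single = "'" + name + "'"
    print(repr(name),
          'double:', survives(naive_double, name),
          'single:', survives(naive_single, name),
          'shlex.quote:', survives(shlex.quote(name), name))
```

The first name breaks naive double quotes, the second breaks naive single quotes, and shlex.quote round-trips all three.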

Introducing: fea -- For each argument

Here is the "for each argument" script, fea:

#!/usr/bin/env python3
# fea: For Each Argument
# 2019-10-18 Thomas Perl <m@thp.io>
# https://thp.io/2019/rpt-and-fea.html

import sys, shlex, re

_, cmd, *args = sys.argv

print('\n'.join(re.sub(r'[{](.)([^}]*?)(\1)([^}]*?)(\1)[}]',
    lambda m: shlex.quote(re.sub(m.group(2), m.group(4), arg)),
    cmd).replace('{}', shlex.quote(arg)) for arg in args))

The fea script only prints out the commands, so you can check whether it would execute the right ones. If you are happy with what it proposes, just pipe its output into sh -e -x to execute it (-e stops at the first failing command, -x echoes each command as it runs).

The following replacements happen here:

{} -> insert arg
{#regex#replacement#} -> run re.sub(regex, replacement) on the arg,
                         you can pick any character for "#", as long
                         as it appears at the start, middle and end
                         (to separate the regex from the replacement)
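To see both rules at work without touching any files, the same logic can be wrapped in a function and called directly. This is only a restructuring of the script above for experimenting; the names expand and PATTERN are not part of fea.

```python
import re
import shlex

# {<sep>regex<sep>replacement<sep>}, where <sep> is any single character
PATTERN = re.compile(r'[{](.)([^}]*?)(\1)([^}]*?)(\1)[}]')

def expand(cmd, arg):
    """Apply the fea replacement rules to cmd for a single argument."""
    def substitute(m):
        # group(2) is the regex, group(4) the replacement
        return shlex.quote(re.sub(m.group(2), m.group(4), arg))
    # Regex placeholders first, then the plain {} placeholder
    return PATTERN.sub(substitute, cmd).replace('{}', shlex.quote(arg))

print(expand('cp {} {}.bak', 'a b.py'))
print(expand('oggenc {} -o {/wav$/ogg/}', 'my song.wav'))
print(expand(r'markdown {} > {#input/(.*).md#output/\1.html#}', 'input/a.md'))
```

The first line printed is cp 'a b.py' 'a b.py'.bak: the quoted name and the .bak suffix are adjacent, so the shell joins them into a single word.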

Example commands:

# Create backup files
fea "cp {} {}.bak" *.py

# Encode WAV files with oggenc
fea "oggenc {} -o {/wav$/ogg/}" *.wav

# Decode MP3 files with mpg123
fea "mpg123 -w {/mp3$/wav/} {}" *.mp3

# Render markdown documents to HTML
fea "markdown {} > {/md$/html/}" *.md

# Fancy replacement
fea "markdown {} > {#input/(.*).md#output/\1.html#}" input/*.md

# As mentioned above, note that you need to pipe the
# fea output into a shell to execute the command
fea "cp {} {}.bak" *.py | sh -e -x

Of course, most of these can be done in the shell, too:

for file in input/*.md; do
    markdown "$file" > "output/$(basename "${file%.md}.html")"
done

But oh my, did I get the shell quoting right this time?

For fea, ugly file names are no problem:

touch 'some$weird "filename.py'
touch 'and !! more.py'
touch 'why oh why?.py'
touch "this is ridiculous'.py"
fea "cp {} {}.bak" *.py | sh -e -x

There may well be something built-in or existing for this already, but I haven't found anything yet. Let me know if you know how to solve this in an easy and concise way with shell built-ins or other tools (awk/perl come to mind?).

Thomas Perl · 2019-10-18