POSIX sh equivalent for Bash% q printing

advertisements

Suppose I have a #!/bin/sh script which can take a variety of positional parameters, some of which may include spaces, either/both kinds of quotes, etc. I want to iterate "[email protected]" and for each argument either process it immediately somehow, or save it for later. At the end of the script I want to launch (perhaps exec) another process, passing in some of these parameters with all special characters intact.

If I were doing no processing on the parameters, othercmd "[email protected]" would work fine, but I need to pull out some parameters and process them a bit.

If I could assume Bash, then I could use printf %q to compute quoted versions of args that I could eval later, but this would not work on e.g. Ubuntu's Dash (/bin/sh).

Is there any equivalent to printf %q that can be written in a plain Bourne shell script, using only built-ins and POSIX-defined utilities, say as a function I could copy into a script?

For example, a script trying to ls its arguments in reverse order:

#!/bin/sh
args=
for arg in "[email protected]"
do
    args="'$arg' $args"
done
eval "ls $args"

works for many cases:

$ ./handle goodbye "cruel world"
ls: cannot access cruel world: No such file or directory
ls: cannot access goodbye: No such file or directory

but not when ' is used:

$ ./handle goodbye "cruel'st world"
./handle: 1: eval: Syntax error: Unterminated quoted string

and the following works fine but relies on Bash:

#!/bin/bash
args=
for arg in "[email protected]"
do
    printf -v argq '%q' "$arg"
    args="$argq $args"
done
eval "ls $args"


This is absolutely doable.

The answer you see by Jesse Glick is approximately there, but it has a couple of bugs, and I have a few more alternatives for your consideration, since this is a problem I ran into more than once.

First, and you might already know this, echo is a bad idea, one should use printf instead, if the goal is portability: "echo" has undefined behavior in POSIX if the argument it receives is "-n", and in practice some implementations of echo treat -n as a special option, while others just treat it as a normal argument to print. So that becomes this:

esceval()
{
    printf %s "$1" | sed "s/'/'\"'\"'/g"
}

Alternatively, instead of escaping embedded single quotes by making them into:

'"'"'

..instead you could turn them into:

'\''

..stylistic differences I guess (I imagine performance difference is negligible either way, though I've never tested). The resulting sed string looks like this:

esceval()
{
    printf %s "$1" | sed "s/'/'\\\\''/g"
}

(It's four backslashes because double quotes swallow two of them, and leaving two, and then sed swallows one, leaving just the one. Personally, I find this way more readable so that's what I'll use in the rest of the examples that involve it, but both should be equivalent.)

BUT, we still have a bug: command substitution will delete at least one (but in many shells ALL) of the trailing newlines from the command output (not all whitespace, just newlines specifically). So the above solution works unless you have newline(s) at the very end of an argument. Then you'll lose that/those newline(s). The fix is obviously simple: Add another character after the actual command value before outputting from your quote/esceval function. Incidentally, we already needed to do that anyway, because we needed to start and stop the escaped argument with single quotes. Honestly, I don't understand why that wasn't done to begin with. You have two alternatives:

esceval()
{
    printf '%s\n' "$1" | sed "s/'/'\\\\''/g; 1 s/^/'/; $ s/$/'/"
}

This will ensure the argument comes out already fully escaped, no need for adding more single quotes when building the final string. This is probably the closest thing you will get to a single, inline-able version. If you're okay with having a sed dependency, you can stop here.

If you're not okay with the sed dependency, but you're fine with assuming that your shell is actually POSIX-compliant (there are still some out there, notably the /bin/sh on Solaris 10 and below, which won't be able to do this next variant - but almost all shells you need to care about will do this just fine):

esceval()
{
    printf \'
    UNESCAPED=$1
    while :
    do
        case $UNESCAPED in
        *\'*)
            printf %s "${UNESCAPED%%\'*}""'\''"
            UNESCAPED=${UNESCAPED#*\'}
            ;;
        *)
            printf %s "$UNESCAPED"
            break
        esac
    done
    printf \'
}

You might notice seemingly redundant quoting here:

printf %s "${UNESCAPED%%\'*}""'\''"

..this could be replaced with:

printf %s "${UNESCAPED%%\'*}'\''"

The only reason I do the former, is because one upon a time there were Bourne shells which had bugs when substituting variables into quoted strings where the quote around the variable didn't exactly start and end where the variable substitution did. Hence it's a paranoid portability habit of mine. In practice, you can do the latter, and it won't be a problem.

If you don't want to clobber the variable UNESCAPED in the rest of your shell environment, then you can wrap the entire contents of that function in a subshell, like so:

esceval()
{
  (
    printf \'
    UNESCAPED=$1
    while :
    do
        case $UNESCAPED in
        *\'*)
            printf %s "${UNESCAPED%%\'*}""'\''"
            UNESCAPED=${UNESCAPED#*\'}
            ;;
        *)
            printf %s "$UNESCAPED"
            break
        esac
    done
    printf \'
  )
}

"But wait", you say: "What I want to do this on MULTIPLE arguments in one command? And I want the output to still look kinda nice and legible for me as a user if I run it from the command line for whatever reason."

Never fear, I have you covered:

esceval()
{
    case $# in 0) return 0; esac
    while :
    do
        printf "'"
        printf %s "$1" | sed "s/'/'\\\\''/g"
        shift
        case $# in 0) break; esac
        printf "' "
    done
    printf "'\n"
}

..or the same thing, but with the shell-only version:

esceval()
{
  case $# in 0) return 0; esac
  (
    while :
    do
        printf "'"
        UNESCAPED=$1
        while :
        do
            case $UNESCAPED in
            *\'*)
                printf %s "${UNESCAPED%%\'*}""'\''"
                UNESCAPED=${UNESCAPED#*\'}
                ;;
            *)
                printf %s "$UNESCAPED"
                break
            esac
        done
        shift
        case $# in 0) break; esac
        printf "' "
    done
    printf "'\n"
  )
}

In those last four, you could collapse some of the outer printf statements and roll their single quotes up into another printf - I kept them separate because I feel it makes the logic more clear when you can see the starting and ending single-quotes on separate print statements.

P.S. There's also this monstrosity I made, which is a polyfill which will select between the previous two versions depending on if your shell seems to be capable of supporting the necessary variable substitution syntax (it looks awful though, because the shell-only version has to be inside an eval-ed string to keep the incompatible shells from barfing when they see it): https://github.com/mentalisttraceur/esceval/blob/master/sh/esceval.sh