I know how to parse a string into variables in the manner of this SO question, e.g.
ABCDE-123456
becomes:
var1=ABCDE
var2=123456
via, say, cut
. I can do that in one script, no problem.
But I have a few dozen scripts which parse strings/arguments all in the same fashion (same arguments & variables, i.e. same parsing strategy). And sometimes I need to make a change or add a variable to the parsing mechanism.
Of course, I could go through every one of my dozens of scripts and change the parsing manually (even if just copy & paste), but that would be tedious and more error-prone to bugs/mistakes.
Is there a modular way to do parse strings/arguments as such?
I thought of writing a script which parses the string/args into variables and then export
s, but the export
command does not work form child-to-parent, (only vice-versa).
Something like this might work:
parse_it () {
SEP=${SEP--}
string=$1
names=${@:2}
IFS="$SEP" read $names <<< "$string"
}
$ parse_it ABCDE-123456 var1 var2
$ echo "$var1"
ABCDE
$ echo "$var2"
123456
$ SEP=: parse_it "foo:bar:baz" id1 id2 id3
$ echo $id2
bar
The first argument is the string to parse, the remaining arguments are names of variables that get passed to read
as the variables to set. (Not quoting $names
here is intentional, as we will let the shell split the string into multiple words, one per variable. Valid variable names consist of only _, letters, and numbers, so there are no worries about undesired word splitting or pathname generation by not quoting $names
). The function assumes the string uses a single separator of "-", which can be overridden via the environment.
For more complex parsing, you may want to use a custom regular expression (bash
4 or later required for the -g
flag to declare
):
parse_it () {
reg_ex=$1
string=$2
shift 2
[[ $string =~ $reg_ex ]] || return
i=1
for name; do
declare -g "$name=${BASH_REMATCH[i++]}"
done
}
$ parse_it '(.*)-(.*):(.*)' "abc-123:xyz" id1 id2 id3
$ echo "$id2"
123