How to determine the number of lines between two strings using Bash and the standard utilities?

I have a file which contains data like this:

abc
abc, Iteration 1
abc
abc, Iteration 2
...
abc
abc, Iteration 19
abc
abc, Iteration 20

I would like to determine the number of lines between the two lines that end exactly in the strings "Iteration 1" and "Iteration 2", and store that number in the variable numlines. In the example above, numlines should contain the value 1.

I would like to use wc -l, sed, or awk.


Vijay's helpful sed answer is concise, but it always processes the entire input file. It also creates extra child processes, because wc -l must be invoked as well, although that will hardly matter overall.

Try the following awk solution, which exits as soon as the end of the range is found. With large input files this may matter, depending on where in the file the range is positioned. It also creates only a single child process, because the subshell is optimized away in favor of the simple awk command:

numlines=$(awk '/Iteration 1$/ {b=NR; next} /Iteration 2$/ {print NR-b-1; exit}' file)
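
As a quick check, the command can be run against a small sample file modeled on the data in the question. The temporary file and its contents are assumptions made for illustration only:

```shell
# Create a sample file resembling the question's data (contents are illustrative).
tmpfile=$(mktemp)
printf '%s\n' 'abc' 'abc, Iteration 1' 'abc' 'abc, Iteration 2' 'abc' 'abc, Iteration 3' > "$tmpfile"

# Record the line number of the start line; at the end line, print the count
# of lines strictly between the two and stop reading the file.
numlines=$(awk '/Iteration 1$/ {b=NR; next} /Iteration 2$/ {print NR-b-1; exit}' "$tmpfile")

echo "$numlines"   # prints 1
rm -f "$tmpfile"
```

Because awk exits at the first line ending in Iteration 2, the trailing lines of the sample file are never read.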

Tip of the hat to karakfa for helping to optimize the command.

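Note: /Iteration 1$/ and /Iteration 2$/ are regular expressions that match the strings Iteration 1 and Iteration 2 at the end of a line ($). Because of the $ anchor, /Iteration 1$/ does not match lines ending in Iteration 10 through Iteration 19.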
The strings at hand happen not to contain regular-expression metacharacters that need escaping (with \), but you may have to do so in other cases.
If the strings to match are not literals known in advance, generic escaping would be difficult; in that case, consider Ed Morton's solution, which is based on strings, not regular expressions.
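
To illustrate what a string-based variant could look like, here is a minimal sketch. It is not Ed Morton's code, which is not reproduced here. The helper function endswith, the environment variable names start_str and end_str, and the sample data are all hypothetical. The strings are passed through the environment rather than with -v, because -v interprets backslash escape sequences in the value:

```shell
# Sample data whose marker strings contain regex metacharacters (illustrative only).
tmpfile=$(mktemp)
printf '%s\n' 'x' 'run [1] (a+b)' 'y' 'z' 'run [2] (a+b)' > "$tmpfile"

start='run [1] (a+b)'
end='run [2] (a+b)'

# Pass the literal strings via the environment and compare with substr(),
# so that no regex escaping is needed.
numlines=$(start_str="$start" end_str="$end" awk '
  # Hypothetical helper: true if line ends in the literal string str.
  function endswith(line, str,    n, m) {
    n = length(line); m = length(str)
    return m <= n && substr(line, n - m + 1) == str
  }
  BEGIN { s = ENVIRON["start_str"]; e = ENVIRON["end_str"] }
  endswith($0, s)      { b = NR; next }              # remember the start line
  b && endswith($0, e) { print NR - b - 1; exit }    # lines strictly in between
' "$tmpfile")

echo "$numlines"   # prints 2
rm -f "$tmpfile"
```

Unlike the regex version, this sketch only reacts to the end string after the start string has been seen (the b && guard), which is an added assumption about the desired behavior.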