In Perl, how can I remove all spaces that are not inside double quotation marks ("")?

I'm trying to come up with a regex that will remove all space characters from a string as long as they are not inside double quotes (").

Example string:

some string with "text in quotes"

Result:

somestringwith"text in quotes"

So far I've come up with something like this:

    $str =~ /"[^"]+"|/g;

But it doesn't seem to be giving the intended result.

I'm honestly very new to Perl and haven't had much regex experience, so if anyone willing to answer could also provide some insight into the why and how, that would be great!

Thanks!

EDIT

The string will not contain escaped double quotes.

It should actually always be formatted like this:

Some.String = "Some Value"

Result would be

Some.String="Some Value"


Here is a technique using split to separate the quoted strings. It relies on your data being consistent and will not work with loose (unmatched) quotes.

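    use strict;
    use warnings;

    # Run with perl -n so that each input line is in $_.
    # The capturing parentheses make split keep the quoted strings.
    my @line = split /("[^"]*")/;
    for (@line) {
        # Strip spaces and tabs from everything except the quoted parts.
        unless (/^"/) {
            s/[ \t]+//g;
        }
    }
    print @line;  # elements of @line were modified in place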

Basically, you split up the string in order to isolate the quoted strings. Once that is done, perform the substitution on all other strings. Since the array elements are aliased in the loop, substitutions are performed on the actual array.
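As a minimal sketch of the same idea applied to a plain string rather than to $_, the steps can be wrapped in a sub. The name strip_unquoted_spaces is made up for illustration, and the sketch assumes, as the question states, that quotes are balanced and never escaped.

```perl
use strict;
use warnings;

# Hypothetical helper name, chosen for illustration only.
sub strip_unquoted_spaces {
    my ($str) = @_;

    # The capturing parentheses make split return the quoted strings
    # as their own list elements instead of discarding them.
    my @parts = split /("[^"]*")/, $str;

    for my $part (@parts) {
        # $part aliases the array element, so s/// edits @parts itself.
        # Elements that start with a double quote are left untouched.
        $part =~ s/[ \t]+//g unless $part =~ /^"/;
    }

    return join '', @parts;
}

print strip_unquoted_spaces('some string with "text in quotes"'), "\n";
print strip_unquoted_spaces('Some.String = "Some Value"'), "\n";
```

For the two example strings from the question this should print somestringwith"text in quotes" and Some.String="Some Value".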

To see the output, run the script like so:

    perl -n script.pl inputfile

To edit inputfile in place while saving a backup in inputfile.bak:

    perl -n -i.bak script.pl inputfile
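The -n switch wraps the script in an implicit loop that reads the input line by line. A rough sketch of the same processing with the loop written out explicitly might look like the following. The sub name process_handle and the in-memory sample data are assumptions made for the demonstration, not part of the original script.

```perl
use strict;
use warnings;

# Hypothetical helper: applies the split technique to every line
# read from $in and writes the result to $out.
sub process_handle {
    my ($in, $out) = @_;
    while (my $line = <$in>) {
        my @parts = split /("[^"]*")/, $line;
        for my $part (@parts) {
            # [ \t] rather than \s, so the trailing newline survives.
            $part =~ s/[ \t]+//g unless $part =~ /^"/;
        }
        print {$out} @parts;
    }
    return;
}

# Demonstration on in-memory filehandles instead of a real file.
my $input  = qq{Some.String = "Some Value"\nOther.Key = "a b c"\n};
my $result = '';
open my $in,  '<', \$input  or die "cannot open input: $!";
open my $out, '>', \$result or die "cannot open output: $!";
process_handle($in, $out);
close $out;
close $in;

print $result;
```

Passing real filehandles (for example one opened on inputfile, plus \*STDOUT) instead of the in-memory ones should behave like the -n invocation.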

With that said, I'm not sure what your edit means. Do you want to change the first line below into the second?

    Some.String = "Some Value"

    Some.String="Some Value"