Removing unwanted characters from a text file using SAS

I am trying to change the delimiter from comma to pipe in a text file using SAS. The data in the input file looks like-

Site,Variable,20151120010000,5.82,1,1,Project|Code|comment
Site,Variable,20151120020000,5.82,1,1,Project|Code|comment
Site,Variable,20151120030000,5.81,1,1,Project|Code|comment, out of service

I want to change the commas (delimiter) to pipe but if there is a comma (for example in the last line), I don't want to change it to pipe. Basically Project|Code|comment, out of service is one column. I am using the code below (as suggested by a stack overflow member)-

%let flname1=D:\temp\comma_file_%sysfunc(today(),yymmddn8.).txt;
%put &=flname1;
%let flname2=D:\temp\pipe_file_%sysfunc(today(),yymmddn8.).txt;
%put &=flname2;
data _null_;
length x1-x9 $200;
infile "&flname1" dsd dlm=',' truncover;
file "&flname2" dsd dlm='|';
input x1-x9;
put x1-x9;
run;

The output I get using this code looks like-

Site|Variable|20151120010000|5.82|1|1|"Project|Code|comment"||
Site|Variable|20151120020000|5.82|1|1|"Project|Code|comment"||
Site|Variable|20151120030000|5.81|1|1|"Project|Code|comment"|out of service|

I want the output to look like-

Site|Variable|20151120010000|5.82|1|1|Project|Code|comment
Site|Variable|20151120020000|5.82|1|1|Project|Code|comment
Site|Variable|20151120030000|5.81|1|1|Project|Code|comment,out of service

This might be pretty easy but I am just starting to learn SAS. Any help is greatly appreciated.


Just read the file as a series of text fields and re-write it using a different delimiter. Your problem is that the first few fields are comma delimited and the last two are pipe delimited. It looks like you have three fields but that the first field is 7 comma delimited values. So read the last two columns using ('|') as the delimiter and the first 7 using both pipe and comma ('|,') as the delimiter. Then re-write it using another the delimiter. You will need to make two filerefs to allow it be processed using different delimiters.

filename original "sample1.dat";
filename copy     "sample1.dat";
filename out      "sample1.csv";
data _null_;
   length field1-field9 $200;
   infile original dsd dlm='|' truncover;
   input field7-field9 ;
   infile copy dsd dlm=',|' truncover;
   input field1-field7 ;
   file out dsd dlm='|';
   put field1-field9;
run;

That will generate what you requested.

Note that if you write the new file using comma (',') as the delimiter it will look like this instead since SAS will protect the embedded delimiter with quotes.

Site,Variable,20151120010000,5.82,1,1,Project,Code,comment
Site,Variable,20151120020000,5.82,1,1,Project,Code,comment
Site,Variable,20151120030000,5.81,1,1,Project,Code,"comment, out of service"

Or you could use the SCAN() function to break the first field out into 7. That eliminates the need to read the line with multiple delimiters.

data _null_;
   length field1-field9 $200;
   infile original dsd dlm='|' truncover;
   input field7-field9 ;
   array field (9);
   do i=1 to 7;
     field(i) = scan(field7,i,',','m');
   end;
   file out dsd dlm='|';
   put field1-field9;
run;