Trying to replace ; with ASCII control characters in a text file using a cmd file


I am trying to write a .cmd file that takes a text file, adds a string to the beginning of each line, replaces all the semicolons with ASCII control code 30 (RS), and ends each line with an RS followed by ASCII control code 31 (US). When I put the RS and US characters in the file, Notepad won't save it unless the encoding is Unicode, but when I save the file as Unicode, it won't run.

Here is what I have that is not working:

@echo on > Convert.txt & setLocal enableDELAYedexpansion

set old=;
set new=▲
set [email protected]▲▲▲
set endstr=▲▼

for /f "tokens=* delims= " %%a in (test1.txt) do (
set str=%%a
set str=%BGNSTR%!str:%old%=%new%!%endstr%
>> Convert.txt echo !str!
)

If I substitute any other characters for the RS/US, it does what I want it to do.


Your script should work as expected, but ▲ and ▼ are not ASCII control codes:

▲    U+25B2    Black Up-Pointing Triangle
▼    U+25BC    Black Down-Pointing Triangle

Unfortunately, neither ASCII control code 30 (RS) nor ASCII control code 31 (US) is visible here, so I have included a screenshot from a hexadecimal editor. In the following script, the bgnstr variable is slightly shortened to keep the output of the mycharmap.bat script further below at an acceptable length.

@ECHO OFF
SETLOCAL EnableExtensions EnableDelayedExpansion
set old=;
set "new="
set "bgnstr=@TS"
set "endstr="
> Convert.txt (
  for /f "tokens=* delims= " %%a in (test1.txt) do (
    set str=%%a
    set str=%BGNSTR%!str:%old%=%new%!%endstr%
    echo !str!
  )
)

Input/output:

==> type test1.txt
a;b;c
d;e;f;
h;i;j

==> D:\bat\SO\39006271.bat

==> type Convert.txt
@TSabc
@TSdef
@TShij
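
If the editor refuses to save the raw control characters, one possible workaround is to let the script generate them at run time instead of typing them. The following is only a minimal sketch, not the script used for the output above. It assumes that it runs as a batch file (so that %~dp0 and %~nx0 resolve to the script itself) and that forfiles.exe is available. It relies on forfiles replacing 0xHH in its /c command string with the character of that code. The variable names RS and US are my own choice.

```batch
@echo off
setlocal EnableExtensions EnableDelayedExpansion

rem forfiles replaces 0xHH in its /c string with the character of that code,
rem so the control characters never have to be typed into the editor.
rem /m "%~nx0" matches only this script, so each command runs exactly once.
for /f "delims=" %%A in ('forfiles /p "%~dp0." /m "%~nx0" /c "cmd /c echo(0x1E"') do set "RS=%%A"
for /f "delims=" %%A in ('forfiles /p "%~dp0." /m "%~nx0" /c "cmd /c echo(0x1F"') do set "US=%%A"

set "old=;"
rem Assumed prefix: @TS followed by three RS characters.
set "bgnstr=@TS%RS%%RS%%RS%"
set "endstr=%RS%%US%"

> Convert.txt (
  for /f "tokens=* delims= " %%a in (test1.txt) do (
    set "str=%%a"
    set "str=%bgnstr%!str:%old%=%RS%!%endstr%"
    echo !str!
  )
)
```

Because the script file itself then contains only printable ASCII, it can be saved in the ordinary ANSI encoding that cmd.exe expects. As in the scripts above, input lines containing ! would be altered by delayed expansion.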

The ASCII control codes are invisible in the output of type Convert.txt above; the mycharmap.bat output in the next listing and the following screenshot show them. The mycharmap.bat script comes from my answer at superuser.com: Full description of Windows Alt+x codes

==> for /F "skip=2 delims=" %G in ('type Convert.txt') do @mycharmap.bat "'%G'"
Ch Unicode    Alt?    CP    IME    Alt   Alt0    IME 0405/cs-CZ; CP852; ANSI 1250

 @  U+0040      64         …64…     64    064    Commercial At
 T  U+0054      84         …84…     84    084    Latin Capital Letter T
 S  U+0053      83         …83…     83    083    Latin Capital Letter S
    U+001E                 …30…           030    Information Separator Two
    U+001E                 …30…           030    Information Separator Two
    U+001E                 …30…           030    Information Separator Two
 h  U+0068     104        …104…    104   0104    Latin Small Letter H
    U+001E                 …30…           030    Information Separator Two
 i  U+0069     105        …105…    105   0105    Latin Small Letter I
    U+001E                 …30…           030    Information Separator Two
 j  U+006A     106        …106…    106   0106    Latin Small Letter J
    U+001E                 …30…           030    Information Separator Two
    U+001F                 …31…           031    Information Separator One
 @TShij
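
If neither a hexadecimal editor nor mycharmap.bat is at hand, the bytes can probably also be checked with tools that ship with Windows. The sketch below assumes that certutil.exe is present and that its -encodehex verb behaves as on my system. Convert.hex is a hypothetical name for the dump file.

```batch
@echo off
rem Write a hexadecimal dump of Convert.txt to Convert.hex (-f overwrites an existing dump).
certutil -f -encodehex Convert.txt Convert.hex >nul

rem In the dump, RS appears as byte 1e and US as byte 1f.
type Convert.hex
```

In the dump, every semicolon of test1.txt should have become 1e, and every line should end with 1e 1f followed by the line break bytes 0d 0a.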