tr最熟知的用法就是echo $var | tr ‘a-z’ “A-Z”这样进行大写化或者反过来小写化了. tr的作用就是字符的替换和删除. 以下是msys里tr的说明:
Usage: tr [OPTION]... SET1 [SET2]
Translate, squeeze, and/or delete characters from standard input,
writing to standard output.
- -c, –complement first complement SET1
- -d, –delete delete characters in SET1, do not translate
- -s, –squeeze-repeats replace sequence of characters with one
- -t, –truncate-set1 first truncate SET1 to length of SET2
- –help display this help and exit
- –version output version information and exit
SETs are specified as strings of characters. Most represent themselves.
Interpreted sequences are:
- \NNN character with octal value NNN (1 to 3 octal digits)
- \ backslash
- \a audible BEL
- \b backspace
- \f form feed
- \n new line
- \r return
- \t horizontal tab
- \v vertical tab
- CHAR1-CHAR2 all characters from CHAR1 to CHAR2 in ascending order
- [CHAR1-CHAR2] same as CHAR1-CHAR2, if both SET1 and SET2 use this
- [CHAR*] in SET2, copies of CHAR until length of SET1
- [CHAR*REPEAT] REPEAT copies of CHAR, REPEAT octal if starting with 0
- [:alnum:] all letters and digits
- [:alpha:] all letters
- [:blank:] all horizontal whitespace
- [:cntrl:] all control characters
- [:digit:] all digits
- [:graph:] all printable characters, not including space
- [:lower:] all lower case letters
- [:print:] all printable characters, including space
- [:punct:] all punctuation characters
- [:space:] all horizontal or vertical whitespace
- [:upper:] all upper case letters
- [:xdigit:] all hexadecimal digits
- [=CHAR=] all characters which are equivalent to CHAR
Translation occurs if -d is not given and both SET1 and SET2 appear.
-t may be used only when translating. SET2 is extended to length of
SET1 by repeating its last character as necessary. Excess characters
of SET2 are ignored. Only [:lower:] and [:upper:] are guaranteed to
expand in ascending order; used in SET2 while translating, they may
only be used in pairs to specify case conversion. -s uses SET1 if not
translating nor deleting; else squeezing uses SET2 and occurs after
translation or deletion.
tr的基本作用是从输入(管道,标准/重定向输入,文件)中将SET1的字符替换为相应的SET2字符. 替换准则是两个SET中一一对应的替换.所以最好是两个SET等长,因此tr作用最大就是大小写替换了.因为不能字符串替换的效果,所以使用其实很有限,除了大小写替换,还有就是删除字符了.
选项:
tr SET1 SET2 < test.txt
: 将SET1中相应位置的字符替换为SET2中的字符.-d SET1
: 删除匹配的,此时不需要SET2.例如我要删除\t.tr -d "\t"
-s SET1 SET2
: 将连续的SET1替换为SET2中对应单个字符.例如将多个空格变为1个:tr -s ' ' ' '
-t SET1 SET2
: 将SET1长度截断至和SET2一样,即舍弃SET1后面过长的部分,如tr -t 1234 ab
将01234变为0ab34-c SET1 SET2
: 将SET1以外的字符替换为SET2中最后一个字符.例如tr -c 12 ab
将01234变为b12bb.因此SET2更常为单字符.
字符集法则:
- 自动补全: 当SET1>SET2,重复SET2最后一个字符来补全SET2长度到SET1; 当SET1<SET2时,SET2后面多出的无效.
- 特殊字符: 支持如\t,\n这些转移, 另外支持\123八进制指定相应字符号.
- 升序集
a-z
: 使用-
号表示从字符a到z的升序集,同样表达是[a-z] - 补长集
[c*]
: (只适合于SET2) 当两个集合长度不一,[c*]会重复一个char补全到两个集合相等,例如tr 123456 a[b*]c
等效于123456 abbbbc - 重复次数
[c*N]
: N是重复次数.例如tr 123456 a[b*3]c
等效于123456 abbbcc - 预定义集合:
- [:alnum:] :所有字母字符与数字
- [:alpha:] :所有字母字符
- [:blank:] :所有水平空格
- [:cntrl:] :所有控制字符
- [:digit:] :所有数字
- [:graph:] :所有可打印的字符(不包含空格符)
- [:lower:] :所有小写字母. 例如
tr [:lower:] [:upper:]
- [:print:] :所有可打印的字符(包含空格符)
- [:punct:] :所有标点字符
- [:space:] :所有水平与垂直空格符
- [:upper:] :所有大写字母
- [:xdigit:] :所有 16 进位制的数字
题外话: declare
另外,一个大小写替换的方法是用声明declare. 顾明思议, 对变量进行声明和限定:
declare -l var
和declare -u var
声明了变量内容全是小写或全是大写.但声明后变量的值是不变的,需要再次赋值时才起效.-l和-u选项要在较新的bash才起效,所以建议还是使用tr来进行该操作.
取消这两个声明设置使用+l/+u
.
例如:
v="abc"
echo $v # abc
declare -u v
echo $v # abc,声明不改变值
v="abc"
echo $v # ABC,再次赋值时声明起效
declare -l v="ABC"
echo $v # abc, 声明同时赋值马上起效.