R | Replacing characters in a table | gsub()


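For example, the following call removes every "1" from the HIS_ITEMNAME column of the data frame jydat_glu:

    jydat_glu$HIS_ITEMNAME <- gsub("1", "", jydat_glu$HIS_ITEMNAME)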
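To try the column-wise call on its own, here is a minimal sketch. The contents of the data frame are made up for illustration; only the names jydat_glu and HIS_ITEMNAME come from the line above.

```r
# Hypothetical stand-in for jydat_glu; the values are invented for this sketch
jydat_glu <- data.frame(
  HIS_ITEMNAME = c("glucose1", "1glucose", "hba1c"),
  stringsAsFactors = FALSE
)

# Remove every "1" from each element of the column
jydat_glu$HIS_ITEMNAME <- gsub("1", "", jydat_glu$HIS_ITEMNAME)

print(jydat_glu$HIS_ITEMNAME)
# [1] "glucose" "glucose" "hbac"
```

Note that the call removes every "1", including ones in the middle of a name such as "hba1c", which may or may not be what is wanted.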

gsub() can delete, append, replace, and trim text within strings. It works on a single string as well as on a character vector of strings.

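Usage: gsub("pattern", "replacement", x), where pattern is the text (or regular expression) to find, replacement is the text to put in its place, and x is the string or character vector to process.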

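In gsub(), every operation comes down to replacing the matched pattern with the replacement. Setting the replacement to the empty string "" deletes the match; setting it to the matched text plus extra content appends to it; replacing and trimming work the same way.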
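The four operations can be sketched on one string. The input "sample_01" and the variable names are hypothetical, chosen only for illustration.

```r
x <- "sample_01"

deleted  <- gsub("_", "", x)                  # delete: replacement is ""
appended <- gsub("sample", "sample_new", x)   # append: matched text plus extra content
replaced <- gsub("01", "02", x)               # replace: swap one substring for another
trimmed  <- gsub("_.*$", "", x)               # trim: drop everything from "_" to the end

print(c(deleted, appended, replaced, trimmed))
# [1] "sample01"      "sample_new_01" "sample_02"     "sample"
```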

    > text <- "AbcdEfgh . Ijkl MNM"
    > gsub("Efg", "AAA", text)   # change Efg to AAA; matching is case-sensitive
    [1] "AbcdAAAh . Ijkl MNM"
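Because matching is case-sensitive by default, a lowercase pattern will not match "Efg". A short sketch of the difference, using the ignore.case argument of gsub() (the variable names are only for illustration):

```r
text <- "AbcdEfgh . Ijkl MNM"

# Default: case-sensitive, so "efg" does not match and the string is unchanged
unchanged <- gsub("efg", "AAA", text)

# ignore.case = TRUE makes the match case-insensitive
insensitive <- gsub("efg", "AAA", text, ignore.case = TRUE)

print(unchanged)    # [1] "AbcdEfgh . Ijkl MNM"
print(insensitive)  # [1] "AbcdAAAh . Ijkl MNM"
```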

 

Any character, including spaces, tabs, and newlines, can be matched.

 

    > gsub(" I", "i", text)   # the space in the pattern is matched too
    [1] "AbcdEfgh .ijkl MNM"
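Tabs and newlines can be handled the same way as the space above. The string messy below is a made-up example; "\t" and "\n" are the usual R escapes for tab and newline.

```r
messy <- "a\tb\nc  d"   # contains a tab, a newline, and two spaces

no_tab     <- gsub("\t", " ", messy)            # tab -> single space
no_newline <- gsub("\n", " ", messy)            # newline -> single space
collapsed  <- gsub("[[:space:]]+", " ", messy)  # any run of whitespace -> one space

print(collapsed)
# [1] "a b c d"
```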



A pattern can also match several times in one string; gsub() replaces every occurrence.

    > gsub("M", "N", text)
    [1] "AbcdEfgh . Ijkl NNN" 
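As a small illustration of replacing all occurrences, and of processing a whole vector at once, the sketch below compares gsub() with base R's sub(), which replaces only the first match in each element. The vector v is invented for the example.

```r
v <- c("MNM", "mmm", "M-M")

all_matches <- gsub("M", "N", v)  # every "M" in every element
first_match <- sub("M", "N", v)   # only the first "M" in each element

print(all_matches)  # [1] "NNN" "mmm" "N-N"
print(first_match)  # [1] "NNM" "mmm" "N-M"
```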



gsub() patterns are regular expressions, which allows more general batch operations:

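    > gsub("^.* ", "a", text) # replace everything from the start through the last space with a
    [1] "aMNM"
    > gsub("^.* I(j).*$", "\\1", text) # keep only the captured j
    [1] "j"
    > gsub(" .*$", "b", text) # replace everything from the first space to the end with b
    [1] "AbcdEfghb"
    > gsub("\\.", "\\+", text) # . and + are metacharacters in a pattern and must be escaped with \\ to match literally
    [1] "AbcdEfgh + Ijkl MNM"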
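Two related points may be worth trying out. First, fixed = TRUE makes gsub() treat the pattern as a literal string, which avoids escaping altogether. Second, ".*" is greedy, which is why the first example above runs to the last space; a non-greedy ".*?" stops at the first one. The sketch below uses perl = TRUE for the greedy and non-greedy pair; that choice is an assumption made here, not something the examples above require.

```r
text <- "AbcdEfgh . Ijkl MNM"

# fixed = TRUE: "." is taken literally, so no escaping is needed
literal_dot <- gsub(".", "+", text, fixed = TRUE)
print(literal_dot)  # [1] "AbcdEfgh + Ijkl MNM"

# Without fixed = TRUE, "." is a metacharacter and matches every character
every_char <- gsub(".", "+", text)

# Greedy ".*" runs to the last space; non-greedy ".*?" stops at the first
greedy <- gsub("^.* ", "a", text, perl = TRUE)
lazy   <- gsub("^.*? ", "a", text, perl = TRUE)
print(greedy)  # [1] "aMNM"
print(lazy)    # [1] "a. Ijkl MNM"
```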


Syntax    Description
\\d    Digit, 0,1,2 ... 9
\\D    Not Digit
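\\s    Whitespace character (space, tab, newline, ...)
\\S    Not a whitespace character
\\w    Word character (letter, digit, or underscore)
\\W    Not a word character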
\\t    Tab
\\n    New line
^    Beginning of the string
$    End of the string
\    Escape special characters, e.g. \\ is "\", \+ is "+"
|    Alternation match. e.g. /(e|d)n/ matches "en" and "dn"
.    Any single character (a newline is excluded when perl = TRUE)
[ab]    a or b
[^ab]    Any character except a and b
[0-9]    All Digit
[A-Z]    All uppercase A to Z letters
[a-z]    All lowercase a to z letters
[A-Za-z]    All uppercase and lowercase letters (note: [A-z] also matches the ASCII characters between Z and a, such as [ and _)
i+    i at least one time
i*    i zero or more times
i?    i zero or 1 time
i{n}    i occurs n times in sequence
i{n1,n2}    i occurs n1 - n2 times in sequence
i{n1,n2}?    Non-greedy match: as few repetitions as possible
i{n,}    i occurs >= n times
[:alnum:]    Alphanumeric characters: [:alpha:] and [:digit:]
[:alpha:]    Alphabetic characters: [:lower:] and [:upper:]
[:blank:]    Blank characters: e.g. space, tab
[:cntrl:]    Control characters
[:digit:]    Digits: 0 1 2 3 4 5 6 7 8 9
[:graph:]    Graphical characters: [:alnum:] and [:punct:]
[:lower:]    Lower-case letters in the current locale
[:print:]    Printable characters: [:alnum:], [:punct:] and space
[:punct:]    Punctuation character: ! " # $ % & ' ( ) * + , - . / : ; < = > ? @ [ \ ] ^ _ ` { | } ~
[:space:]    Space characters: tab, newline, vertical tab, form feed, carriage return, space
[:upper:]    Upper-case letters in the current locale
[:xdigit:]    Hexadecimal digits: 0 1 2 3 4 5 6 7 8 9 A B C D E F a b c d e f
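A few entries from the table, sketched on a made-up string s. One detail to keep in mind: POSIX classes such as [:digit:] are only valid inside a bracket expression, so in a pattern they are written [[:digit:]], [[:punct:]], and so on.

```r
s <- "ID: 42, code AB-7f!"

no_digits <- gsub("\\d", "", s)             # remove digits
no_punct  <- gsub("[[:punct:]]", "", s)     # remove punctuation
only_alnum <- gsub("[^[:alnum:]]", "", s)   # keep letters and digits only

# Quantifier: two or more consecutive "o" become "0"
quantified <- gsub("o{2,}", "0", "fo foo fooo")

# Alternation: "en" or "dn" becomes "X"
alternated <- gsub("(e|d)n", "X", "en dn an")

print(c(no_digits, no_punct, only_alnum))
# [1] "ID: , code AB-f!" "ID 42 code AB7f"  "ID42codeAB7f"
print(c(quantified, alternated))
# [1] "fo f0 f0" "X X an"
```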
————————————————
Original link: https://blog.csdn.net/lztttao/article/details/82086346
