Regular Expression

Regular Expression (WIKI: Regular Expression)

     

Python: re

Functions

re.search(pattern, string, flags=0)

Finds the first substring of string that matches the regular expression pattern. Returns None if there is no match.

flags: class re.RegexFlag
  (options that control matching, such as ignoring case)
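
As a minimal sketch of the behaviour described above, the example below calls re.search with and without a flag. The sample text and the variable names (text, first, missing, found) are made up for illustration.

```python
import re

text = "Order ID: abc-123, backup id: XYZ-789"

# search() scans the string and stops at the first match.
first = re.search(r"[a-z]+-\d+", text)
print(first.group())    # abc-123

# Matching is case-sensitive by default, so a lowercase pattern misses "XYZ".
missing = re.search(r"xyz-\d+", text)
print(missing)          # None

# re.IGNORECASE is one of the re.RegexFlag options.
found = re.search(r"xyz-\d+", text, flags=re.IGNORECASE)
print(found.group())    # XYZ-789
```

Because search returns None on failure, it is usually worth checking the result before calling group() on it.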

re.findall(pattern, string, flags=0)

Finds all substrings of string that match the regular expression pattern and returns them as a list. The matches do not overlap. Returns an empty list (not None) if there is no match.
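
A short sketch of re.findall, assuming made-up input strings. It shows the non-overlapping behaviour, the empty-list result, and what happens when the pattern contains one capturing group.

```python
import re

# Non-overlapping: after "aa" is matched at index 0, scanning resumes at index 2.
pairs = re.findall(r"aa", "aaaa")
print(pairs)        # ['aa', 'aa']

# No match gives an empty list.
none_found = re.findall(r"\d+", "abc")
print(none_found)   # []

# With exactly one capturing group, the list holds the group's text
# instead of the whole match.
keys = re.findall(r"(\w+)=\d+", "x=1, y=22")
print(keys)         # ['x', 'y']
```
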
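re.match(pattern, string, flags=0)

Finds a substring at the beginning of string that matches the regular expression pattern. The substring must be string[0 : i] with i >= 0 (a pattern that can match the empty string gives i = 0). Returns None if there is no match.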
re.fullmatch(pattern, string, flags=0)

Checks whether the whole string matches the regular expression pattern. Returns None if it does not.
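
The difference between re.match, re.search and re.fullmatch is easiest to see side by side. This is only an illustrative sketch; the sample string s and the variable names are assumptions.

```python
import re

s = "2025-04-11 release"

# match() only succeeds at the start of the string.
at_start = re.match(r"\d{4}", s)
print(at_start.group())                 # 2025
not_at_start = re.match(r"release", s)
print(not_at_start)                     # None

# search() finds the same word anywhere in the string.
anywhere = re.search(r"release", s)
print(anywhere.group())                 # release

# fullmatch() requires the pattern to cover the entire string.
partial = re.fullmatch(r"\d{4}", s)
print(partial)                          # None
whole = re.fullmatch(r"\d{4}-\d{2}-\d{2} \w+", s)
print(whole.group())                    # 2025-04-11 release
```
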
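re.compile(pattern, flags=0)

Compiles a regular expression pattern into a pattern object that can be reused. The sequence

prog = re.compile(pattern)
result = prog.match(string)

is equivalent to

result = re.match(pattern, string)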
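
A small sketch of reusing a compiled pattern, assuming an illustrative pattern and input. The name word is hypothetical; the flag is passed once at compile time and then applies to every call on the pattern object.

```python
import re

# Compile once, with the flag attached to the pattern object.
word = re.compile(r"[a-z]+", flags=re.IGNORECASE)

first_word = word.match("Hello world").group()
all_words = word.findall("Hello world")
print(first_word)   # Hello
print(all_words)    # ['Hello', 'world']

# The module-level function gives the same result for a single call.
same = re.match(r"[a-z]+", "Hello world", flags=re.IGNORECASE).group()
print(same)         # Hello
```

Compiling mainly pays off when the same pattern is used many times, for example inside a loop.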

re.split(pattern, string, maxsplit=0, flags=0)
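
re.split splits string at each occurrence of pattern and returns the pieces as a list. The sketch below uses made-up input to show the basic call, the maxsplit limit, and the effect of a capturing group in the pattern.

```python
import re

# Split on commas, swallowing any whitespace around them.
parts = re.split(r"\s*,\s*", "a, b ,c,d")
print(parts)      # ['a', 'b', 'c', 'd']

# maxsplit limits the number of splits; the rest stays in the last item.
limited = re.split(r"\s*,\s*", "a, b ,c,d", maxsplit=2)
print(limited)    # ['a', 'b', 'c,d']

# A capturing group keeps the separators in the result.
with_sep = re.split(r"(-)", "1-2-3")
print(with_sep)   # ['1', '-', '2', '-', '3']
```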

 

re.sub(pattern, repl, string, count=0, flags=0)

 

sub: short for substitute
repl: short for replacement
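
re.sub returns a copy of string in which matches of pattern are replaced by repl, which may be a string (with backreferences such as \1) or a function that receives the Match object. The following is a minimal sketch with made-up inputs.

```python
import re

# Replace every digit.
masked = re.sub(r"\d", "#", "Tel: 555-1234")
print(masked)     # Tel: ###-####

# count limits how many matches are replaced.
once = re.sub(r"\d", "#", "Tel: 555-1234", count=1)
print(once)       # Tel: #55-1234

# Backreferences in repl refer to groups of the pattern.
swapped = re.sub(r"(\w+) (\w+)", r"\2 \1", "Isaac Newton")
print(swapped)    # Newton Isaac

# repl can also be a function that takes the Match and returns a string.
doubled = re.sub(r"\d+", lambda m: str(int(m.group()) * 2), "3 apples, 10 pears")
print(doubled)    # 6 apples, 20 pears
```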

re.subn(pattern, repl, string, count=0, flags=0)
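
re.subn performs the same replacement as re.sub but returns a tuple (new_string, number_of_substitutions). A short sketch with illustrative inputs:

```python
import re

# Collapse each run of whitespace into a single space and count the runs.
cleaned, n = re.subn(r"\s+", " ", "a   b \t c")
print(cleaned, n)         # a b c 2

# When nothing matches, the string is unchanged and the count is 0.
unchanged, zero = re.subn(r"\d", "", "abc")
print(unchanged, zero)    # abc 0
```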

 

re.escape(pattern)
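
re.escape returns a copy of pattern with the regex special characters escaped, which is useful when arbitrary text has to be matched literally. The sketch below assumes a made-up literal and text to show the difference.

```python
import re

text = "is 1+1=2 (approx.) true?"
literal = "1+1=2 (approx.)"

# Used as-is, "+", "(", ")" and "." are treated as regex syntax,
# so the pattern does not match the literal text.
unescaped = re.search(literal, text)
print(unescaped)          # None

# After escaping, the same text is matched character for character.
escaped = re.search(re.escape(literal), text)
print(escaped.group())    # 1+1=2 (approx.)

print(re.escape("."))     # \.
```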

 

class re.Match

Match.group()

Returns one or more subgroups of the match.

import re

m = re.match(r"(\w+) (\w+)", "Isaac Newton, physicist")
m.group(0)       # The entire match: 'Isaac Newton'
m.group(1)       # The first parenthesized subgroup: 'Isaac'
m.group(2)       # The second parenthesized subgroup: 'Newton'
                 # The pattern has only two (\w+) groups and stops
                 #   at the comma, so ", physicist" is not matched.
m.group(1, 2)    # Multiple arguments give us a tuple: ('Isaac', 'Newton')

Using the (?P<name>...) syntax to name subgroups.

r"(\w+) (\w+)" -> r"(?P<first_name>\w+) (?P<last_name>\w+)"

 

m = re.match(r"(?P<first_name>\w+) (?P<last_name>\w+)",
    "Isaac Newton, physicist")
m.group('first_name') # 'Isaac'
m.group('last_name')  # 'Newton'

Match.__getitem__(g)

m[g] is identical to m.group(g).
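
A brief sketch of the subscript form, reusing the named-group pattern from the Match.group() example. Both integer indices and group names work as keys.

```python
import re

m = re.match(r"(?P<first_name>\w+) (?P<last_name>\w+)",
             "Isaac Newton, physicist")

print(m[0])              # Isaac Newton  (same as m.group(0))
print(m[1])              # Isaac         (same as m.group(1))
print(m["last_name"])    # Newton        (same as m.group('last_name'))
```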

Match.expand(template)

 

m = re.match(r"(?P<first_name>\w+) (?P<last_name>\w+)", \
    "Isaac Newton, physicist")
print(m.expand(r'His name is \1 \2')) 
  # His name is Isaac Newton
print(m.expand(r'His name is \g<1> \g<2>'))
  # His name is Isaac Newton
print(m.expand(r'His name is \g<first_name> \g<last_name>'))
  # His name is Isaac Newton

Match.groups(default=None)

Return a tuple containing all the subgroups of the match, from 1 up to however many groups are in the pattern.

The default argument is used for groups that did not participate in the match; it defaults to None.

 

m = re.match(r"(\d+)\.(\d+)\s(\d+)", "12.345 ")
print(m)        # None

m = re.match(r"(\d+)\.(\d+)\s?(\d+)?", "12.345")
m.groups()      # ('12', '345', None)
m.groups('678') # ('12', '345', '678')

Match.groupdict(default=None)

See: (?P<name>...)
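
Match.groupdict returns a dictionary of all named subgroups keyed by group name, with default used for groups that did not take part in the match. The sketch below is illustrative only; the optional job group is a hypothetical addition to the earlier name pattern.

```python
import re

# The third named group is optional, so it may not participate in the match.
m = re.match(r"(?P<first_name>\w+) (?P<last_name>\w+)(?:, (?P<job>\w+))?",
             "Isaac Newton")

plain = m.groupdict()
print(plain)
# {'first_name': 'Isaac', 'last_name': 'Newton', 'job': None}

filled = m.groupdict(default="unknown")
print(filled)
# {'first_name': 'Isaac', 'last_name': 'Newton', 'job': 'unknown'}
```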

Helper Tools

Last Updated on 2025/04/11 by A1go

