Regular Expression
- 2023.09.22
- Incomplete Python re Regular Expression
Regular Expression (WIKI: Regular Expression)
Python: re
Functions
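re.search(pattern, string, flags=0) | Scans string for the first location where the regular expression pattern matches and returns the corresponding Match object. Returns None if no position matches. |
re.findall(pattern, string, flags=0) | Finds all substrings of string that match the regular expression pattern; these substrings do not overlap. Returns an empty list (not None) if there is no match. |
re.match(pattern, string, flags=0) | Finds a substring of string that matches the regular expression pattern, where the substring must start at the beginning of string, i.e. it is string[0 : i] with i >= 0. Returns None if there is no such match. |
re.fullmatch(pattern, string, flags=0) | Checks whether the entire string matches the regular expression pattern. Returns None otherwise. |
re.compile(pattern, flags=0) | Compiles pattern into a regular expression object (re.Pattern) that can be reused. |
re.split(pattern, string, maxsplit=0, flags=0) | Splits string at the occurrences of pattern. If maxsplit is nonzero, at most maxsplit splits are made. |
re.sub(pattern, repl, string, count=0, flags=0) | Returns the string obtained by replacing the non-overlapping matches of pattern in string with repl. If count is nonzero, at most count matches are replaced. |
re.subn(pattern, repl, string, count=0, flags=0) | Same as re.sub, but returns a tuple (new_string, number_of_substitutions). |
re.escape(pattern) | Escapes the special characters in pattern so that it can be matched as a literal string. |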
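The differences between the functions in the table are easiest to see side by side. The following is only a minimal sketch: the sample string and the variable names are made up for illustration and are not part of the original notes.

```python
import re

text = "cat bat rat"

# re.search: first match anywhere in the string, or None.
found = re.search(r"[br]at", text)
print(found.group())  # bat

# re.match: the match must start at index 0.
at_start = re.match(r"[br]at", text)
print(at_start)  # None, because the string starts with "cat"

# re.fullmatch: the pattern must cover the whole string.
whole = re.fullmatch(r"(\wat ?)+", text)
print(whole.group())  # cat bat rat

# re.findall: list of all non-overlapping matches; [] when nothing matches.
all_words = re.findall(r"\wat", text)
no_words = re.findall(r"dog", text)
print(all_words, no_words)  # ['cat', 'bat', 'rat'] []

# re.compile: build a reusable pattern object.
word = re.compile(r"\w+")
print(word.findall(text))  # ['cat', 'bat', 'rat']

# re.split with maxsplit=1: split only at the first separator.
parts = re.split(r"\s+", text, maxsplit=1)
print(parts)  # ['cat', 'bat rat']

# re.sub / re.subn: replace matches; subn also reports the count.
replaced = re.sub(r"[cbr]at", "hat", text, count=2)
replaced_all, n = re.subn(r"[cbr]at", "hat", text)
print(replaced)  # hat hat rat
print(replaced_all, n)  # hat hat hat 3

# re.escape: make a literal string safe to use as a pattern.
literal = "1+1"
unescaped = re.search(literal, "1+1=2")
escaped = re.search(re.escape(literal), "1+1=2")
print(unescaped)  # None: "+" is read as a quantifier
print(escaped.group())  # 1+1
```

Passing maxsplit and count as keyword arguments keeps the calls readable and avoids confusing them with flags.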
class re.Match
Match.group()
Returns one or more subgroups of the match.
import re
m = re.match(r"(\w+) (\w+)", "Isaac Newton, physicist")
m.group(0)     # The entire match: 'Isaac Newton'
m.group(1)     # The first parenthesized subgroup: 'Isaac'
m.group(2)     # The second parenthesized subgroup: 'Newton'
# The pattern has only two (\w+) groups and re.match does not have to
# consume the whole string, so ", physicist" is not part of the match.
m.group(1, 2)  # Multiple arguments give us a tuple: ('Isaac', 'Newton')
Use the (?P<name>...) syntax to name subgroups.
r"(\w+) (\w+)" ⇒ r"(?P<first_name>\w+) (?P<last_name>\w+)"
m = re.match(r"(?P<first_name>\w+) (?P<last_name>\w+)", "Isaac Newton, physicist")
m.group('first_name')  # 'Isaac'
m.group('last_name')   # 'Newton'
Match.__getitem__(g)
m[g] ≡ m.group(g)
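As a small illustration of the equivalence above, indexing a match object works with both group numbers and group names. This sketch reuses the "Isaac Newton" pattern from the earlier examples.

```python
import re

m = re.match(r"(?P<first_name>\w+) (?P<last_name>\w+)", "Isaac Newton, physicist")

print(m[0])            # Isaac Newton
print(m[1])            # Isaac
print(m["last_name"])  # Newton
```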
Match.expand(template)
m = re.match(r"(?P<first_name>\w+) (?P<last_name>\w+)", "Isaac Newton, physicist")
print(m.expand(r'His name is \1 \2'))                          # His name is Isaac Newton
print(m.expand(r'His name is \g<1> \g<2>'))                    # His name is Isaac Newton
print(m.expand(r'His name is \g<first_name> \g<last_name>'))   # His name is Isaac Newton
Match.groups(default=None)
Return a tuple containing all the subgroups of the match, from 1 up to however many groups are in the pattern.
The default argument is used for groups that did not participate in the match; it defaults to None.
m = re.match(r"(\d+)\.(\d+)\s(\d+)", "12.345 ")
print(m)           # None
m = re.match(r"(\d+)\.(\d+)\s?(\d+)?", "12.345")
m.groups()         # ('12', '345', None)
m.groups('678')    # ('12', '345', '678')
Match.groupdict(default=None)
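Match.groupdict returns a dictionary of the named subgroups, keyed by group name; default plays the same role as in Match.groups. A minimal sketch follows; the second pattern and its group names (whole, frac) are hypothetical, chosen only to show a named group that does not participate in the match.

```python
import re

m = re.match(r"(?P<first_name>\w+) (?P<last_name>\w+)", "Isaac Newton, physicist")
print(m.groupdict())  # {'first_name': 'Isaac', 'last_name': 'Newton'}

# The optional fractional part is absent here, so "frac" did not participate.
m2 = re.match(r"(?P<whole>\d+)(\.(?P<frac>\d+))?", "12")
print(m2.groupdict())     # {'whole': '12', 'frac': None}
print(m2.groupdict("0"))  # {'whole': '12', 'frac': '0'}
```

Unnamed groups, such as the outer optional group in the second pattern, do not appear in the dictionary.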
Helper Tools
Last Updated on 2025/04/11 by A1go