Search through text with regex in Python
- 1. Create your regular expression pattern
- 2. Use re.findall() and re.search() to find your matches
- Reference
In this tutorial, I will go over two common functions, re.findall() and re.search(), that I find useful for searching through text.
1. Create your regular expression pattern
- I usually use regex101.com to check whether my regular expression pattern matches the text I want.
- A screenshot of example usage: [screenshot]
- Some common tokens I use:
  - . : any single character (except a newline, by default)
  - \s : any whitespace character
  - \d : any digit
  - \b : a word boundary
  - \w : any word character (letter, digit, or underscore)
  - a|b : match either a or b
  - (...) : capture the ... part into a group
  - a? : zero or one of a
  - a* : zero or more of a
  - a+ : one or more of a
  - a{8} : exactly 8 of a
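To see a few of these tokens in action, here is a minimal sketch. The sample strings and variable names are made up for illustration; only the re.findall() calls are the standard library's.

```python
import re

# Hypothetical sample text, made up for this sketch
sample = "Room 12345678 has 2 cats and 1 dog"

# \d{8}: exactly 8 digits in a row
eight_digits = re.findall(r"\d{8}", sample)

# \d+: one or more digits
numbers = re.findall(r"\d+", sample)

# a|b: either alternative
animals = re.findall(r"cat|dog", sample)

# \b, \s, \w and (...): a standalone digit, one whitespace, then a word.
# With two groups in the pattern, each match is a tuple of the two groups.
counts = re.findall(r"\b(\d)\s(\w+)", sample)

# a?, a*, a+: optional, zero-or-more, one-or-more
optional_s = re.findall(r"cats?", "cat cats")
zero_or_more = re.findall(r"lo*l", "ll lol loool")
one_or_more = re.findall(r"lo+l", "ll lol loool")

# . : any single character between "c" and "t"
any_char = re.findall(r"c.t", "cat cot c-t")

print(eight_digits)   # ['12345678']
print(numbers)        # ['12345678', '2', '1']
print(animals)        # ['cat', 'dog']
print(counts)         # [('2', 'cats'), ('1', 'dog')]
print(optional_s, zero_or_more, one_or_more, any_char)
```

Note how \b keeps the last digit of 12345678 from being picked up by the counts pattern: there is no word boundary in the middle of a run of digits.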
2. Use re.findall() and re.search() to find your matches
- re is a module in the Python standard library that provides regular expression matching operations.
- Two of the functions I use most often are re.findall() and re.search():
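import re

text = 'The result depends on the number of capturing groups in the pattern. If there are no groups, return a list of strings matching the whole pattern.'

# findall() returns every non-overlapping match; with two groups, each item is a tuple
match = re.findall(r'\b(\w+)\b\s(groups)', text)
match  # [('capturing', 'groups'), ('no', 'groups')]

# search() returns a Match object for the first match only, or None if nothing matches
match = re.search(r'\b(\w+)\b\s(groups)', text)
match  # Match object for 'capturing groups'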
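If you want to check the matching groups of re.search(), you can do the following. group(0) is the whole match, and group(1) and group(2) are the first and second captured groups:
match.group(0), match.group(1), match.group(2)  # ('capturing groups', 'capturing', 'groups')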
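One thing worth guarding against is that re.search() returns None when nothing matches, so calling .group() on the result directly would raise an AttributeError. Below is a small sketch of how I might wrap this; first_pair is a hypothetical helper name, not something from the re module. It also shows that re.findall() returns whole-match strings when the pattern has no groups.

```python
import re

# Same pattern as above, compiled once so it can be reused
pattern = re.compile(r'\b(\w+)\b\s(groups)')

def first_pair(text):
    """Hypothetical helper: return the captured groups of the first match, or None."""
    m = pattern.search(text)
    if m is None:
        # No match: avoid calling .groups() on None
        return None
    # groups() returns all captured groups as a tuple
    return m.groups()

found = first_pair('If there are no groups, return a list of strings.')
missing = first_pair('Nothing to see here.')

# Without capturing groups, findall() returns the whole matched strings
whole = re.findall(r'\b\w+\b\sgroups', 'capturing groups and no groups')

print(found)    # ('no', 'groups')
print(missing)  # None
print(whole)    # ['capturing groups', 'no groups']
```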