A regular expression (regex) is a sequence of characters that forms a search pattern. In Python, regular expressions are used to check whether a string contains a specified pattern. They are widely used in UNIX tools. Python provides them through the re module, which raises the exception re.error if an error occurs while compiling or using a regular expression. One of the module's main functions is re.search(), which takes a regex and a string. It returns a Match object for the first match, or None if there is no match.
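As a minimal sketch of the behavior described above, the snippet below shows re.search() returning a match on success and None otherwise, and re.error being raised for a malformed pattern. The helper name find_first and the sample patterns are illustrative assumptions, not part of the re module.

```python
import re


def find_first(pattern, text):
    """Return the first matched substring, or None if nothing matches.

    Hypothetical helper, used here only for illustration.
    """
    try:
        match = re.search(pattern, text)
    except re.error as exc:
        # Raised when the pattern cannot be compiled.
        print("Invalid pattern:", exc)
        return None
    return match.group() if match else None


text = "Developer Helps: First solve, then write the code"
print(find_first(r"solve", text))   # pattern is present
print(find_first(r"python", text))  # pattern is absent, so None
print(find_first(r"(solve", text))  # unbalanced parenthesis triggers re.error
```

Checking the result against None before calling methods on it avoids an AttributeError when the pattern is not found.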
Python search(), start() and end() functions
The example below shows how these functions work together.
import re

s = 'Developer Helps: First solve, then write the code'
match = re.search(r'write', s)
print('Start Index:', match.start())
print('End Index:', match.end())
Output:
Start Index: 35
End Index: 40
Here the r prefix in r'write' stands for raw, not regex. A raw string differs from a regular string in that Python does not interpret the \ character as an escape character. This matters because the regular expression engine uses the \ character for its own escaping.
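To make the difference concrete, here is a small sketch comparing a regular string and a raw string that contain the same characters in source code. The pattern \bcode\b (a word boundary on each side of "code") is an illustrative choice, not taken from the example above.

```python
import re

# In a regular string, "\b" becomes a single backspace character.
# In a raw string, the backslash and the letter b are kept as two characters.
regular = "\bcode\b"
raw = r"\bcode\b"
print(len(regular), len(raw))  # 6 8

s = "Developer Helps: First solve, then write the code"

# The raw pattern reaches the regex engine as \b (word boundary) and matches.
raw_match = re.search(raw, s)
# The regular-string pattern looks for literal backspace characters instead.
regular_match = re.search(regular, s)

print(raw_match)      # a Match object for 'code'
print(regular_match)  # None
```

Writing patterns as raw strings by default avoids this kind of silent mismatch.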
Metacharacters in Python regular expressions
Metacharacters are special characters that affect how the regular expressions around them are interpreted. They do not match themselves; instead, they indicate rules. Characters such as |, +, and * are metacharacters.
| Metacharacter | Meaning |
|---|---|
| \ | Drops the special meaning of the character following it |
| [] | Represents a character class |
| ^ | Matches the beginning of the string |
| $ | Matches the end of the string |
| . | Matches any character except a newline |
| ? | Matches zero or one occurrence of the preceding element |
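The metacharacters listed above can be tried out with re.findall(), which returns every non-overlapping match as a list. This is a minimal sketch; the sample text and variable names are assumptions chosen for illustration.

```python
import re

text = "cat\ncot cut c.t"

# . matches any single character except a newline
any_char = re.findall(r"c.t", text)      # ['cat', 'cot', 'cut', 'c.t']

# \ drops the special meaning of the dot, so only a literal dot matches
escaped = re.findall(r"c\.t", text)      # ['c.t']

# [] defines a character class: here, either a or o
char_class = re.findall(r"c[ao]t", text)  # ['cat', 'cot']

# ^ anchors the match at the beginning, $ at the end of the string
at_start = re.findall(r"^cat", text)     # ['cat']
at_end = re.findall(r"c\.t$", text)      # ['c.t']

# ? makes the preceding character optional (zero or one occurrence)
optional = re.findall(r"colou?r", "color colour")  # ['color', 'colour']

for result in (any_char, escaped, char_class, at_start, at_end, optional):
    print(result)
```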
match function syntax
re.match(pattern, string, flags=0)
This function attempts to match the RE pattern at the beginning of string, with optional flags. The parameters of the match function are described below.
- pattern: The regular expression to match.
- string: The string to search. The pattern is matched only at the beginning of this string.
- flags: Optional modifiers that change how the pattern is matched. You can combine several flags using bitwise OR (|).
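The parameters above can be sketched as follows. The sample strings and patterns are assumptions for illustration; the point is that re.match() only succeeds at position 0, and that flags are combined with bitwise OR.

```python
import re

text = "Regex is fast"

# The pattern occurs at the beginning, so re.match() succeeds.
at_start = re.match(r"regex", text, flags=re.IGNORECASE)

# "fast" is in the string but not at the beginning, so re.match() returns None.
not_at_start = re.match(r"fast", text)

# Two flags combined with bitwise OR: ignore case, and let . match a newline.
combined = re.match(r"regex.*fast", "REGEX is\nfast",
                    flags=re.IGNORECASE | re.DOTALL)

print(at_start.group())   # Regex
print(not_at_start)       # None
print(combined.span())    # (0, 13)
```

When the pattern may appear anywhere in the string, re.search() is the function to use instead.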
Python program to understand re.match()
The example below shows how the match() function behaves.
import re

substring = 'string'
# string_a does not start with the pattern, so re.match() returns None
string_a = '''We are learning regex with Developer Helps regex is very useful for string matching. It is fast too.'''
# string_b starts with the pattern, so re.match() returns a Match object
string_b = '''string We are learning regex with Developer Helps regex is very useful for string matching. It is fast too.'''

print(re.match(substring, string_a, re.IGNORECASE))
print(re.match(substring, string_b, re.IGNORECASE))
Output:
None
<re.Match object; span=(0, 6), match='string'>
Python Regular Expression Modifiers
Regular expression functions accept optional modifiers that control various aspects of matching. The modifiers are passed as an optional flags argument. You can combine multiple modifiers using bitwise OR (|). The table below summarizes them:
| Modifier | Description |
|---|---|
| re.I | Performs case-insensitive matching |
| re.L | Interprets words according to the current locale |
| re.M | Makes $ match at the end of each line and ^ match at the start of each line |
| re.S | Makes a period (.) match any character, including a newline |
| re.U | Matches letters according to the Unicode character set |
| re.X | Permits more readable ("verbose") regular expression syntax. Whitespace in the pattern is ignored and # starts a comment |
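The effect of several of these modifiers can be seen in a short sketch. The sample text and patterns below are assumptions made for illustration.

```python
import re

text = "first line\nSecond line"

# re.I: letter case is ignored
ignore_case = re.findall(r"LINE", text, re.I)      # ['line', 'line']

# re.M: ^ matches at the start of every line, not only the start of the string
without_m = re.findall(r"^\w+", text)              # ['first']
with_m = re.findall(r"^\w+", text, re.M)           # ['first', 'Second']

# re.S: . also matches the newline between the two lines
without_s = re.search(r"first.*Second", text)      # None
with_s = re.search(r"first.*Second", text, re.S)   # a Match object

# re.X: whitespace in the pattern is ignored and # starts a comment;
# combined here with re.I using bitwise OR
verbose = re.compile(r"""
    (\w+)   # capture a word
    \s+     # followed by whitespace
    LINE    # followed by the word line, in any case
""", re.X | re.I)
words = verbose.findall(text)                      # ['first', 'Second']

print(ignore_case, without_m, with_m, without_s, with_s.group(), words)
```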
The Match object has properties and methods used to retrieve information about the search and its result:
- .span() returns a tuple containing the start and end positions of the match.
- .string returns the string passed into the function.
- .group() returns the part of the string where there was a match.
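As a brief sketch, these three members can be read from the Match object returned by the earlier re.search() example. The None check is an added precaution, since re.search() returns None when nothing matches.

```python
import re

s = "Developer Helps: First solve, then write the code"
match = re.search(r"write", s)

if match is not None:
    print(match.span())    # (35, 40)
    print(match.group())   # write
    print(match.string)    # the full input string s

    # Slicing the input with the span gives the same text as group().
    start, end = match.span()
    print(s[start:end] == match.group())  # True
```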