"multiple repeat" error when a regex has consecutive wildcards in Python's re module

When a regex compiled in Python contains repetition wildcards (quantifiers) such as plus or asterisk repeated back to back, the second one tries to repeat the first, and compilation fails with a "multiple repeat" error unless the characters are escaped. This can happen with any re module function that compiles the pattern, such as search and sub.

>>> import re
>>> re.compile("r++")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.7/re.py", line 190, in compile
    return _compile(pattern, flags)
  File "/usr/lib/python2.7/re.py", line 242, in _compile
    raise error, v # invalid expression
sre_constants.error: multiple repeat

>>> re.compile("r**")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.7/re.py", line 190, in compile
    return _compile(pattern, flags)
  File "/usr/lib/python2.7/re.py", line 242, in _compile
    raise error, v # invalid expression
sre_constants.error: multiple repeat
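
Since functions like search and sub compile the pattern internally, the same error can surface from them too, and it can be caught as re.error. The following is only a minimal sketch: the helper name try_compile is made up for illustration, and it uses "r**" rather than "r++" because Python 3.11 and later accept "++" as a possessive quantifier instead of raising.

```python
import re

def try_compile(pattern):
    """Hypothetical helper: return (compiled, None) or (None, error message)."""
    try:
        return re.compile(pattern), None
    except re.error as exc:
        # re.error is the exception shown as sre_constants.error in the traceback
        return None, str(exc)

compiled, message = try_compile("r**")
print(message)  # the message starts with "multiple repeat"

# re.search compiles the pattern first, so it fails the same way
try:
    re.search("r**", "some text")
    search_failed = False
except re.error:
    search_failed = True
```

Catching the error this way is mostly useful when the pattern comes from user input and you want to report it rather than crash.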


To avoid errors like this, escape the string with re.escape whenever you need an exact, literal match without any wildcard expansion.

>>> re.compile(re.escape('r++'))
<_sre.SRE_Pattern object at 0xb74e28a0>
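
As a rough illustration of using the escaped pattern, the sketch below searches for and replaces the literal text "r++". The sample sentence and variable names are assumptions made up for this example.

```python
import re

text = "I write c++ and r++ code"   # made-up sample input
literal = "r++"                     # string to match exactly, not as a regex

# re.escape backslash-escapes the plus signs so they match literally
pattern = re.compile(re.escape(literal))

match = pattern.search(text)
found = match.group(0) if match else None

# The same escaped pattern works with sub
replaced = re.sub(re.escape(literal), "R", text)

print(found)
print(replaced)
```

Only the search pattern needs escaping here; the replacement string passed to sub has its own, separate escaping rules.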
