Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


How to choose a regex pattern in Python

I'm studying Python 3 but I'm struggling to get regex with the re module.

Here's my problem: I have the string

phrase = "s000000000 s1133122 s21 s3 s4 s5212638476234857634 s6 s7 s8 s9000"

and, using the function

re.findall(pattern, phrase)

I'd like to extract:

  1. s0-s9 strings without the additional characters;
  2. s0-s3 strings without the additional characters;
  3. s0-s3 strings with the additional characters;
  4. s4-s9 strings with the additional characters.

I managed to accomplish the first three tasks with the following patterns:

  1. pattern = "s[0-9]"
  2. pattern = "s[0-3]"
  3. pattern = "s[0-3]+"

For the last task, though, I tried to replicate what I did in the third one and used

pattern = "s[4-9]+"

but, instead of getting the result

["s4", "s5212638476234857634", "s6", "s7", "s8", "s9000"]

I get

["s4", "s5", "s6", "s7", "s8", "s9"]

Why is that? What am I missing? The book I'm studying from states that the plus sign means "one or more characters", and the s[0-3]+ pattern does in fact work, but I cannot make it work for this specific problem.

question from:https://stackoverflow.com/questions/65923618/how-to-choose-a-regex-pattern-in-python


1 Answer


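In s[4-9]+, the plus sign repeats the class [4-9], so the match stops at the first digit outside 4-9: the 2 in s5212638476234857634 ends the match at s5. The s[0-3]+ pattern only appeared to work because every digit following s0-s3 in your phrase happens to lie in 0-3. Match one digit from 4 to 9, then any digits:

s[4-9]\d*

Note: you might want to start matching from a word boundary if s should not be preceded by any word characters: \bs[4-9]\d*. In Python, it would look like r'\bs[4-9]\d*'.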

Details:

  • s - an s char
  • [4-9] - a digit from 4 to 9
  • \d* - zero or more digits.
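
To make the difference concrete, here is a minimal sketch that runs the pattern from the question and the corrected pattern side by side on the question's phrase. The variable names narrow and wide are only illustrative.

```python
import re

phrase = "s000000000 s1133122 s21 s3 s4 s5212638476234857634 s6 s7 s8 s9000"

# "+" repeats the class [4-9] itself, so the match ends at the first
# digit that is not in the range 4-9.
narrow = re.findall(r"s[4-9]+", phrase)

# One digit from 4 to 9, followed by zero or more digits of any value.
wide = re.findall(r"s[4-9]\d*", phrase)

print(narrow)  # ['s4', 's5', 's6', 's7', 's8', 's9']
print(wide)    # ['s4', 's5212638476234857634', 's6', 's7', 's8', 's9000']
```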

Python demo:

import re
# Raw string, so \d reaches the regex engine unchanged
rx = r"s[4-9]\d*"
text = "s000000000 s1133122 s21 s3 s4 s5212638476234857634 s6 s7 s8 s9000"
print(re.findall(rx, text))
# => ['s4', 's5212638476234857634', 's6', 's7', 's8', 's9000']
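
The word-boundary variant from the note above can be sketched the same way. The sample string here is made up for illustration (it is not from the question) and contains an s preceded by other word characters.

```python
import re

# Hypothetical input: "abs4" and "xs9" contain an s that is not at the
# start of a word.
sample = "abs4 s42 xs9 s7"

# Without \b, the pattern also matches inside longer words.
loose = re.findall(r"s[4-9]\d*", sample)

# With \b, the s must start at a word boundary.
strict = re.findall(r"\bs[4-9]\d*", sample)

print(loose)   # ['s4', 's42', 's9', 's7']
print(strict)  # ['s42', 's7']
```

For the phrase in the question both patterns return the same list, since every s there follows a space or the start of the string.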
