You can use a regular expression that matches all the text from `MyKeyword` to `MyData` in lazy mode:
>>> import re
>>> re.findall(r"MyKeyword.*?MyData\.?", "MyKeyword This is my data, MyData. MyKeyword and chunk of text here. Random text. MyData is this etc etc ")
['MyKeyword This is my data, MyData.', 'MyKeyword and chunk of text here. Random text. MyData']
`.*?` means zero or more characters, but in lazy mode (`*?`), i.e. as few as possible; `\.?` means an optional literal period. The backslash matters: an unescaped `.?` would match any single character, including the space after the second `MyData`.
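To see what lazy mode changes, here is a minimal sketch comparing the greedy and lazy forms on a made-up sample string (the string is only an illustration, not the asker's data). The period is escaped as `\.` so that it matches only a literal period.

```python
import re

# Hypothetical sample text with two keyword/data chunks.
text = "MyKeyword a MyData. MyKeyword b MyData."

# Greedy: .* runs as far as it can, so it stops at the LAST MyData
# and both chunks are merged into a single match.
greedy = re.findall(r"MyKeyword.*MyData\.?", text)

# Lazy: .*? stops at the FIRST MyData after each MyKeyword,
# so each chunk is returned separately.
lazy = re.findall(r"MyKeyword.*?MyData\.?", text)

print(greedy)  # ['MyKeyword a MyData. MyKeyword b MyData.']
print(lazy)    # ['MyKeyword a MyData.', 'MyKeyword b MyData.']
```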
EDIT (according to the new requirement):
The regex you need is something like
MyKeyword.*?(?= ?MyData|$)|MyData.*?(?= ?MyKeyword|$)
It starts at the point where it matches `MyKeyword` (resp. `MyData`), then consumes as few characters as possible, as above, until it reaches `MyData` (resp. `MyKeyword`) or the end of the string.
Indeed:
- `|` is a special character that means "or"
- `$` matches the end of the string
- a space followed by `?` is an optional space
- `(?=<expr>)` is called a positive lookahead and means "followed by `<expr>`"; it checks for `<expr>` without including it in the match
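As a rough check of how these pieces combine, the sketch below applies the alternation regex to a short made-up string (the sample text and the variable names are assumptions for illustration). Because the lookahead does not consume the next marker, the following match can start exactly at that marker.

```python
import re

# Each chunk starts at one marker and ends just before the other marker
# (or at the end of the string).
pattern = r"MyKeyword.*?(?= ?MyData|$)|MyData.*?(?= ?MyKeyword|$)"

# Hypothetical sample text alternating between the two markers.
text = "MyKeyword first chunk MyData second chunk MyKeyword third chunk"

chunks = re.findall(pattern, text)
print(chunks)
# ['MyKeyword first chunk', 'MyData second chunk', 'MyKeyword third chunk']
```

The optional space inside the lookahead keeps the separating space out of each chunk. If the input ends with trailing whitespace, the last chunk will include it, since only `$` ends that match.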