Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share

C# - Regex match takes a very long time to execute

I wrote a regular expression that parses a file path into different groups (DRIVE, DIR, FILE, EXTENSION).

^((?<DRIVE>[a-zA-Z]):\\)*((?<DIR>[a-zA-Z0-9_]+(([a-zA-Z0-9_\s_\-\.]*[a-zA-Z0-9_]+)|([a-zA-Z0-9_]+)))\\)*(?<FILE>([a-zA-Z0-9_]+(([a-zA-Z0-9_\s_\-\.]*[a-zA-Z0-9_]+)|([a-zA-Z0-9_]+))\.(?<EXTENSION>[a-zA-Z0-9]{1,6})$))

I tested it in C#. When the path is valid, the match returns very quickly, which is what I expected.

string path = @"C:\Documents and Settings\jhr\My Documents\Visual Studio 2010\Projects\FileEncryptor\Dds.FileEncryptor\Dds.FileEncryptor.csproj";

=> OK

But when I test a path that I know will not match, like this:

string path = @"C:\Documents and Settings\jhr\My Documents\Visual Studio 2010\Projects\FileEncryptor\Dds.FileEncryptor\Dds.FileEncryptor?!??????";

=> BUG

The test freezes when it reaches this line of code:

Match match = s_fileRegex.Match(path);

When I look in Process Explorer, I see the QTAgent32.exe process stuck at 100% CPU. What does that mean?


1 Answer

The problem you are experiencing is called catastrophic backtracking. It is caused by the large number of ways your regular expression can match the start of the string. When the overall match fails, the backtracking regular expression engine in .NET has to try each of those ways in turn before giving up, which is why the failing path is so slow.
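While the pattern itself is being fixed, one way to stop a test run from hanging is to give the match a time limit. The following is only a minimal sketch, assuming .NET Framework 4.5 or later, where the Regex constructor accepts a match timeout (it is not available in .NET 4.0). The GuardedRegex class and its TryMatch method are hypothetical names introduced for illustration, not part of the original code.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper (not from the original post): runs a match with a time limit.
public static class GuardedRegex
{
    // Returns true if the engine finished within 'timeout' (check match.Success
    // to see whether the input actually matched). Returns false if the engine
    // gave up because the time limit was exceeded.
    public static bool TryMatch(string pattern, string input, TimeSpan timeout, out Match match)
    {
        // The third constructor argument is the match timeout.
        var regex = new Regex(pattern, RegexOptions.None, timeout);
        try
        {
            match = regex.Match(input);
            return true;
        }
        catch (RegexMatchTimeoutException)
        {
            // Thrown by the engine when matching takes longer than the timeout.
            match = Match.Empty;
            return false;
        }
    }
}
```

A call such as GuardedRegex.TryMatch(pattern, path, TimeSpan.FromSeconds(1), out match) would then return false instead of freezing the process. A timeout only limits the damage; the pattern still needs to be corrected.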

I think you are using * too frequently in your regular expression. * does not mean "concatenate" - it means "0 or more times". For example there should not be a * here:

((?<DRIVE>[a-zA-Z]):\\)*

There should be at most one drive specification. You should use ? instead here, or else no quantifier at all if you want the drive specification to be compulsory. Similarly there appear to be other places in your regular expression where the quantifier is incorrect.
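As an illustration of the advice above, here is one possible revision of the pattern. It is a sketch under stated assumptions rather than a definitive fix: the drive uses ? instead of *, and each name is written as a run of word characters followed by zero or more "separator run, then word run" pairs, so there is only one way to split a given name. Unlike the original, it accepts single-character names. The PathParser class and its Parse method are hypothetical names used only for this example.

```csharp
using System.Text.RegularExpressions;

// Hypothetical wrapper (not from the original post) around a revised pattern.
public static class PathParser
{
    // A name is: word-run (separator-run word-run)*.
    // Word characters and separator characters are disjoint sets, so a name
    // can only be split one way, which avoids the ambiguous nested quantifiers.
    private static readonly Regex s_fileRegex = new Regex(
        @"^(?:(?<DRIVE>[a-zA-Z]):\\)?" +                              // optional drive, at most one
        @"(?:(?<DIR>[a-zA-Z0-9_]+(?:[\s.-]+[a-zA-Z0-9_]+)*)\\)*" +    // zero or more directories
        @"(?<FILE>[a-zA-Z0-9_]+(?:[\s.-]+[a-zA-Z0-9_]+)*" +           // file name ...
        @"\.(?<EXTENSION>[a-zA-Z0-9]{1,6}))$");                       // ... ending in .extension

    public static Match Parse(string path)
    {
        return s_fileRegex.Match(path);
    }
}
```

Because DIR sits inside a repeated group, match.Groups["DIR"].Value holds only the last directory; match.Groups["DIR"].Captures lists every directory in order.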

