Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
921 views
in Technique[技术] by (71.8m points)

regex - Why is lookahead (sometimes) faster than capturing?

This question is inspired by this other one.

Comparing s/,(d)/$1/ to s/,(?=d)//: the former uses a capture group to replace only the digit but not the comma, the latter uses a lookahead to determine whether the comma is succeeded by a digit. Why is the latter sometimes faster, as discussed in this answer?

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Answer

0 votes
by (71.8m points)

The two approaches do different things and have different kinds of overhead costs. When you capture, perl has to make a copy of the captured text. Look-ahead matches without consuming; it has to mark the location where it starts. You can see what's happening by using the re 'debug' pragma:

use re 'debug';
my $capture = qr/,(d)/;
Compiling REx ",(d)"
Final program:
   1: EXACT  (3)
   3: OPEN1 (5)
   5:   DIGIT (6)
   6: CLOSE1 (8)
   8: END (0)
anchored "," at 0 (checking anchored) minlen 2 
Freeing REx: ",(d)"
use re 'debug';
my $lookahead = qr/,(?=d)/;
Compiling REx ",(?=d)"
Final program:
   1: EXACT  (3)
   3: IFMATCH[0] (8)
   5:   DIGIT (6)
   6:   SUCCEED (0)
   7: TAIL (8)
   8: END (0)
anchored "," at 0 (checking anchored) minlen 1 
Freeing REx: ",(?=d)"

I'd expect look-ahead to be faster than capturing in most cases, but as noted in the other thread regex performance can be data dependent.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

2.1m questions

2.1m answers

60 comments

56.8k users

...