Regex – What regex can match sequences of the same character

perl regex

A friend asked me this and I was stumped: Is there a way to craft a regular expression that matches a sequence of the same character? E.g., match on 'aaa', 'bbb', but not 'abc'?

m|\w{2,3}| 

That wouldn't do the trick, as it would also match 'abc'.

m|a{2,3}| 

That wouldn't do the trick either, as it wouldn't match 'bbb', 'ccc', etc.

Best Solution

Sure thing! Grouping and references are your friends:

(.)\1+

This will match 2 or more consecutive occurrences of the same character. For word-constituent characters only, use \w instead of ., i.e.:

(\w)\1+
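
To see both patterns in action, here is a minimal Perl sketch. The helper names (first_run, first_word_run, is_single_char_run) and the sample strings are made up for illustration and are not part of the original answer. The anchored variant is an extra assumption about what the question wants: it accepts a string only if the whole string is one repeated character.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the first run of two or more identical
# characters found anywhere in $text, or undef if there is none.
sub first_run {
    my ($text) = @_;
    # (.) captures one character; \1+ requires that same character to
    # follow at least once more, so a match is at least two characters.
    return $text =~ /(.)\1+/ ? $& : undef;
}

# Same idea, restricted to word characters (letters, digits, underscore).
sub first_word_run {
    my ($text) = @_;
    return $text =~ /(\w)\1+/ ? $& : undef;
}

# Assumption: if the entire string must be one repeated character,
# anchor the pattern at both ends.
sub is_single_char_run {
    my ($text) = @_;
    return $text =~ /\A(.)\1+\z/ ? 1 : 0;
}

for my $sample ('aaa', 'bbb', 'abc', 'a--b', 'book') {
    my $any  = first_run($sample)      // '(none)';
    my $word = first_word_run($sample) // '(none)';
    printf "%-5s any: %-7s word: %-7s whole: %d\n",
        $sample, $any, $word, is_single_char_run($sample);
}
```

Note that . does not match a newline unless the /s modifier is used, so a run of blank lines would not be caught by the first pattern as written.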