News, Events, Trends, Activities, Conferences and Notes to do with Web Culture, Development, New Media, Content Management, Mobile and PDA Access and Web Infrastructure
|
See Also: Home Links Personal Site Blogroll FriendFeed CV | |
Wiki Menu:Tags:
|
Understanding GreedinessThis talk was presented by Jacinta Richardson of Perl training Australia and looked at getting around the natural greediness of regular expressions. I use regexes a lot and was aware of the need to override greediness but never really understood what it was about greediness that made it inneficient. Some times you really do want to match greedily, ie the largest possible match in a string, but occasionally you really only want the 1st possible match in a pattern, not from the 1st to the last which is the default greedy behaviour. Its tempting to use the '.' metacharacter in regexes which matches any character (except newline) to find any text between to other marker sections of text. For example to find BLADE between 954 and '2003' in the string 954BLADE2003 you could use a regex like /4(.*)2/ which would return BLADE, this is a simple example and a very short string so performance wise is fine, and there is only one match. What the regex engine actually does in this search is the following..
As mentioned this works find in this short and contrived example, but what if the string was many KB or MB long. An obvious improvment is to use something like /4(\w*)2/ which will stop the greedy swallow after the 1st non word character, ie the '2' in this case. Another improvment if we actually know the length of the target string ('BLADE' in this case which is '5' chars long) is to use something like /4(.{5})/ which fetches exaclty 5 word-chars following the '4'. This is faster apparently coz we're not checking type in the regex where just fetching raw characters of a fixed length. This jump-to-line-end and step back approach that regex engines (apparently they all behave similarly) was interesting to understand a little better, Jacinta called it the "Are we there yet" syndrome making reference to the way kids ask that all the time in the car as the journey grinds along.
See Also: OSDC 2006 | Web Development | Notes Index |