Text Pattern Matching With Python And Perl

2005-02-07

Python

Suppose you want to replace all strings of the form “<img src="some.gif" width="30" height="20">” with “<img src="some.png" width="30" height="20">” in your HTML files.

What you need is to match a text pattern and capture parts of it for use in the replacement. By historical convention, such a pattern is called a regular expression, or regex. Python provides regex in its “re” module. Here's an example of how to use it in our case.

# -*- coding: utf-8 -*-
# Python

import re

text = r'''<html>
blab blab
<P> look at this <img src="./some.gif" width="30" height="20"> pict
and this one <img class="floating" src="../that.gif">, both are
beautiful, but also look: <img src ="my.gif">, and sequel
 <img src=
"girl.gif"> yeah! </p>

'''

new = re.sub(r'src\s*=\s*"([^"]+)\.gif"', r'src="\1.png"', text)

print(new)

The first argument to re.sub is a regex pattern. The second argument is the replacement string, which can refer to captured groups (here “\1” stands for the text matched by the first parenthesized group). The third argument is the text to be searched. An optional fourth argument, count, is the maximum number of replacements to make. If omitted, all occurrences of the pattern are replaced.
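To illustrate the optional count argument, here is a minimal sketch. The sample string and the variable names (all_replaced, first_only) are made up for this illustration; the pattern is the same one used above.

```python
import re

# Hypothetical sample text with three gif references.
text = '<img src="a.gif"> <img src="b.gif"> <img src="c.gif">'
pattern = r'src\s*=\s*"([^"]+)\.gif"'

# No count given: every match is replaced.
all_replaced = re.sub(pattern, r'src="\1.png"', text)

# count=1: only the first match is replaced, the rest are left alone.
first_only = re.sub(pattern, r'src="\1.png"', text, count=1)

print(all_replaced)
print(first_only)
```

Passing count by keyword, as here, keeps it from being confused with the flags argument that re.sub also accepts.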

See the regex doc here: Python Regex Documentation: String Pattern Matching.

Perl

In Perl, regex replacement is done with “s///”. For example:

$text = "123";
$text =~ s/2/9/;
print $text; 

If all you want is to test for a match instead of replacing, use the match operator: “$text =~ m/regexPatternHere/”.
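For comparison with the Python section, a match test can be sketched with re.search, which plays roughly the same role as Perl's m//. The variable name result is only for illustration.

```python
import re

text = "123"

# re.search returns a match object if the pattern occurs anywhere in the
# string, or None otherwise, so it can be used directly as a truth test.
if re.search(r'2', text):
    result = "matched"
else:
    result = "no match"

print(result)
```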

Reference: perldoc perlop.

Reference: perldoc perlre.




Page created: 2005-01.
© 2005 by Xah Lee.