Emacs Lisp: Multi-Pair String Replacement Function

2010-08, …, 2011-11-15

This page discusses the Emacs Lisp package 〔xfrp_find_replace_pairs.el〕 for doing multi-pair find & replace. It explains why the package is needed, how to use it, and some implementation details.

Need for Multi-Pair Replacement

You have a region in a buffer, and you want to apply more than one find & replace pair to it. For example:

HTML entities:

URL percentage encoding:

For ~10 more examples, see: Emacs Lisp Multi-Pair Find & Replace Applications.

Standard Elisp Solution for Multi-Pair Replacement

The normal way to do find & replace in a region is like this:

(defun replace-html-chars-region (start end)
  "Replace “&”, “<”, and “>” with their HTML entities.
This works on the current region."
  (interactive "r")
  (save-restriction
    (narrow-to-region start end)

    ;; Replace “&” first, so the “&” in “&lt;” and “&gt;” is not escaped again.
    (goto-char (point-min))
    (while (search-forward "&" nil t) (replace-match "&amp;" nil t))

    (goto-char (point-min))
    (while (search-forward "<" nil t) (replace-match "&lt;" nil t))

    (goto-char (point-min))
    (while (search-forward ">" nil t) (replace-match "&gt;" nil t))))

Basically, you narrow to the region, and for each pair you move back to the beginning and run a while loop. This is quite cumbersome.

It would be nicer if you could write it like this:

(defun replace-html-chars-region (start end)
  (interactive "r")
  (replace-pairs-region start end
                        '(["&" "&amp;"]
                          ["<" "&lt;"]
                          [">" "&gt;"])))
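
A naive way to get that call shape is to fold the per-pair loop into one helper. The following is only a sketch: my-replace-pairs-region-sequential is a hypothetical name, not a function from the package, and it simply applies the pairs one after another. It assumes every find string is non-empty.

```elisp
(defun my-replace-pairs-region-sequential (start end pairs)
  "Apply each [FIND REPLACEMENT] in PAIRS to the region START..END, in order.
Hypothetical helper for illustration; not part of xfrp_find_replace_pairs.el.
PAIRS is a sequence (list or vector) of two-element vectors."
  (save-excursion
    (save-restriction
      (narrow-to-region start end)
      ;; Match case exactly instead of following the user's setting.
      (let ((case-fold-search nil))
        (mapc (lambda (pair)
                ;; Restart from the top of the narrowed region for every pair.
                (goto-char (point-min))
                (while (search-forward (elt pair 0) nil t)
                  ;; FIXEDCASE and LITERAL are both t: insert the text as-is.
                  (replace-match (elt pair 1) t t)))
              pairs)))))

;; Usage on a temporary buffer:
(with-temp-buffer
  (insert "a < b & c > d")
  (my-replace-pairs-region-sequential
   (point-min) (point-max)
   '(["&" "&amp;"] ["<" "&lt;"] [">" "&gt;"]))
  (buffer-string))
;; returns "a &lt; b &amp; c &gt; d"
```

Because the pairs run in order, the “&” pair has to come first here; otherwise the “&” in “&lt;” would itself be escaped.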

Emacs Lisp Package: xfrp_find_replace_pairs.el

I wrote an elisp package that solves this problem. It can be downloaded at: code.google.com xfrp_find_replace_pairs.el.

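It implements these functions, among them the ones discussed on this page: replace-pairs-in-string, replace-pairs-region, replace-regexp-pairs-in-string, replace-regexp-pairs-region, and replace-pairs-in-string-recursive.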

For each function, there's a plain text version and a regex version.

Each function also has a string version and a region version. The string version works on a given string; the region version works on a region in a buffer. This saves you from converting between strings and regions yourself.

Here's a sample call for replace-pairs-in-string:

(replace-pairs-in-string "abcdef" [["a" "1"] ["b" "2"] ["c" "3"]])
;; returns "123def"

Find & Replace Feedback Loop Problem

One interesting issue with multi-pair find & replace is feedback: if the pairs are applied to the input string one after another, a later pair can match text produced by an earlier replacement. You may then end up with a result that no single find & replace pair asked for.

For example, suppose the input string is “abcd”, and you want to replace “a” with “c” and “c” with “d”. If the replacements are done sequentially in a loop, you'll get “dbdd”, not “cbdd”.
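
The effect is easy to reproduce with a few lines of elisp. This is a minimal sketch of the sequential loop, not code from the package; my-sequential-replace is a hypothetical name used only for this illustration.

```elisp
(defun my-sequential-replace (str pairs)
  "Apply each [FIND REPLACEMENT] in PAIRS to STR, one pair after another.
Hypothetical helper for illustration.  Each pair sees the output of the
previous pair, which is exactly what causes the feedback problem."
  (let ((case-fold-search nil))         ; match case exactly
    (mapc (lambda (pair)
            (setq str (replace-regexp-in-string
                       (regexp-quote (elt pair 0)) ; treat FIND as plain text
                       (elt pair 1)
                       str
                       t                ; FIXEDCASE: keep replacement case
                       t)))             ; LITERAL: no \1 style expansion
          pairs)
    str))

(my-sequential-replace "abcd" [["a" "c"] ["c" "d"]])
;; returns "dbdd": the "c" produced from "a" is replaced again by "d"
```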

The functions replace-pairs-in-string and replace-pairs-region do not have the feedback loop problem. They guarantee that a replacement is done if and only if the original input string contains one of your find strings.
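
One possible way to get this guarantee is a two-pass scheme with placeholders. The sketch below is an illustration under stated assumptions, not necessarily how the package does it: my-replace-pairs-no-feedback is a hypothetical name, and the code assumes that neither the input nor any replacement string contains characters from the Unicode private use area starting at U+E000.

```elisp
(defun my-replace-pairs-no-feedback (str pairs)
  "Replace each [FIND REPLACEMENT] in PAIRS within STR, without feedback.
Hypothetical helper for illustration.  Assumes STR and the replacement
strings contain no characters from U+E000 upward (private use area)."
  (let ((case-fold-search nil)          ; match case exactly
        (i 0))
    ;; Pass 1: swap every find string for a unique placeholder character.
    ;; Placeholders never match a later find string, so no feedback.
    (mapc (lambda (pair)
            (setq str (replace-regexp-in-string
                       (regexp-quote (elt pair 0))
                       (string (+ #xE000 i))
                       str t t))
            (setq i (1+ i)))
          pairs)
    ;; Pass 2: swap every placeholder for the real replacement text.
    (setq i 0)
    (mapc (lambda (pair)
            (setq str (replace-regexp-in-string
                       (string (+ #xE000 i))
                       (elt pair 1)
                       str t t))
            (setq i (1+ i)))
          pairs)
    str))

(my-replace-pairs-no-feedback "abcd" [["a" "c"] ["c" "d"]])
;; returns "cbdd"
```

The pairs are still tried in the order given, so an earlier pair takes precedence where two find strings overlap in the input.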

For a version that does feedback, use replace-pairs-in-string-recursive.

Here's an example showing their difference:

(replace-pairs-in-string
  "aaaaa" [["aaa" "b"] ["ba" "c"]] )    ; ⇒ "baa"

(replace-pairs-in-string-recursive
  "aaaaa" [["aaa" "b"] ["ba" "c"]] )    ; ⇒ "ca"

The regex versions, replace-regexp-pairs-in-string and replace-regexp-pairs-region, apply each find & replace pair sequentially, so a later pair can match text produced by an earlier one.
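
To make the sequential behavior concrete, here is a minimal sketch of a regex pair loop. It is not the package's code; my-replace-regexp-pairs-sequential is a hypothetical name, and the pairs in the example are made up for illustration.

```elisp
(defun my-replace-regexp-pairs-sequential (str pairs)
  "Apply each [REGEXP REPLACEMENT] in PAIRS to STR, one pair after another.
Hypothetical helper for illustration.  REPLACEMENT may use \\1, \\2, and
so on to refer to groups matched by REGEXP."
  (let ((case-fold-search nil))         ; match case exactly
    (mapc (lambda (pair)
            (setq str (replace-regexp-in-string
                       (elt pair 0)     ; used as a regexp, not quoted
                       (elt pair 1)
                       str
                       t)))             ; FIXEDCASE: keep replacement case
          pairs)
    str))

(my-replace-regexp-pairs-sequential
 "12-34" [["[0-9]+" "N"] ["N-N" "X"]])
;; returns "X": the second pair matches the "N-N" produced by the first
```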

Applications

For about 10 examples of using multi-pair find & replace, see: Emacs Lisp Multi-Pair Find & Replace Applications.
