Xah Lee, 2010-08, …, 2011-11-15
This page discusses the emacs lisp package 〔xfrp_find_replace_pairs.el〕 for doing multi-pair find & replace. It explains why the package is needed, how to use it, and some implementation details.
You have a given region in a buffer. You want to apply more than one find & replace pair to it. For example:
HTML entities:
& ⟷ &amp;
< ⟷ &lt;
> ⟷ &gt;
URL percentage encoding:
space ⟷ %20
~ ⟷ %7e
, ⟷ %2c
For ~10 more examples, see: Emacs Lisp Multi-Pair Find & Replace Applications.
The normal way to do find replace in a region is like this:
(defun replace-html-chars-region (start end)
  "Replace “<” to “&lt;” and some other chars in HTML.
This works on the current region."
  (interactive "r")
  (save-restriction
    (narrow-to-region start end)
    ;; "&" must be handled first, otherwise the "&" of "&lt;" and "&gt;"
    ;; would itself be replaced.
    (goto-char (point-min))
    (while (search-forward "&" nil t) (replace-match "&amp;" nil t))
    (goto-char (point-min))
    (while (search-forward "<" nil t) (replace-match "&lt;" nil t))
    (goto-char (point-min))
    (while (search-forward ">" nil t) (replace-match "&gt;" nil t))))
Basically, you narrow to region, and for each pair you use a while loop. This is quite cumbersome.
It would be nicer if you could write it like this:
(defun replace-html-chars-region (start end)
  (interactive "r")
  (replace-pairs-region start end
                        '(["&" "&amp;"]
                          ["<" "&lt;"]
                          [">" "&gt;"])))
I wrote an elisp package that solves this problem. It can be downloaded at: code.google.com xfrp_find_replace_pairs.el.
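It implements these functions:
replace-pairs-in-string
replace-regexp-pairs-in-string
replace-pairs-region
replace-regexp-pairs-region
replace-pairs-in-string-recursive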
For each function, there's a plain text version and a regex version.
Each function also has a string version and a region version. The string version works on a given string, the region version works on a region in a buffer. This saves you from doing string/region conversion.
Here's a sample call for replace-pairs-in-string:
(replace-pairs-in-string "abcdef" [["a" "1"] ["b" "2"] ["c" "3"]])
;; returns "123def"
One interesting issue with multi-pair find & replace: if the pairs are applied to the input string sequentially, one after another, the text produced by an earlier pair can be matched and replaced again by a later pair. You may end up with a result that does not correspond to applying each pair to the original input string.
For example, suppose the input string is “abcd”, and you want to replace “a” by “c” and “c” by “d”. The intended result is “cbdd”. But if the replacement is done sequentially in a loop, you'll get “dbdd”, because the “c” produced by the first pair is then replaced by “d”.
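To see the feedback problem concretely, here is a minimal sketch of the naive sequential approach. The function name my-replace-pairs-sequential is hypothetical and is not part of xfrp_find_replace_pairs.el; it only uses the built-in replace-regexp-in-string and regexp-quote.

```elisp
;; Hypothetical helper for illustration; not part of the package.
(defun my-replace-pairs-sequential (str pairs)
  "Apply each [FIND REPLACEMENT] of PAIRS to STR, one pair after another.
Later pairs see the output of earlier pairs, so feedback can occur."
  (let ((case-fold-search nil)          ; match case-sensitively
        (result str))
    (mapc
     (lambda (pair)
       (setq result
             (replace-regexp-in-string
              (regexp-quote (elt pair 0)) ; treat the find string literally
              (elt pair 1)
              result
              t      ; FIXEDCASE: keep the replacement's case as given
              t)))   ; LITERAL: no backslash escapes in the replacement
     pairs)
    result))

(my-replace-pairs-sequential "abcd" [["a" "c"] ["c" "d"]]) ; ⇒ "dbdd"
```

The first pair turns “abcd” into “cbcd”, and the second pair then replaces both “c”, including the one that was just created.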
The functions replace-pairs-in-string and replace-pairs-region do not have the feedback loop problem. They guarantee that a replacement is done IF AND ONLY IF the original input string contains one of your find strings.
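One way to get this no-feedback behavior is to scan the input just once, matching any of the find strings and looking up its replacement. The following is only a sketch under that assumption, not necessarily how the package implements it. The name my-replace-pairs-single-pass is hypothetical, and the sketch assumes all find strings are non-empty.

```elisp
;; Hypothetical helper for illustration; not the package's implementation.
(defun my-replace-pairs-single-pass (str pairs)
  "Replace the find strings of PAIRS in STR in one left-to-right pass.
Replacement text is never scanned again, so no feedback occurs."
  (let ((case-fold-search nil)          ; match case-sensitively
        ;; Build an alist ((FIND . REPLACEMENT) ...) from the pairs.
        (table (mapcar (lambda (pair) (cons (elt pair 0) (elt pair 1)))
                       pairs)))
    (replace-regexp-in-string
     (regexp-opt (mapcar #'car table))  ; regexp matching any find string
     (lambda (match) (cdr (assoc match table))) ; look up the replacement
     str
     t      ; FIXEDCASE
     t)))   ; LITERAL

(my-replace-pairs-single-pass "abcd" [["a" "c"] ["c" "d"]])        ; ⇒ "cbdd"
(my-replace-pairs-single-pass "aaaaa" [["aaa" "b"] ["ba" "c"]])    ; ⇒ "baa"
```

Because each match is taken from the original text only, the “c” that replaces “a” is never turned into “d”.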
For a version that does feedback, use replace-pairs-in-string-recursive.
Here's an example showing their difference:
(replace-pairs-in-string "aaaaa" [["aaa" "b"] ["ba" "c"]])           ; ⇒ "baa"
(replace-pairs-in-string-recursive "aaaaa" [["aaa" "b"] ["ba" "c"]]) ; ⇒ "ca"
The regex versions, replace-regexp-pairs-in-string and replace-regexp-pairs-region, do each find & replace pair sequentially, so a later pair can match text produced by an earlier pair.
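As a rough illustration of what sequential regex replacement on a region looks like, here is a sketch built from the same narrow-and-loop pattern shown earlier. The name my-replace-regexp-pairs-region-sketch is hypothetical; the package's own functions may differ in details such as case handling. The sketch assumes no regex in the pairs can match the empty string, otherwise the loop would not terminate.

```elisp
;; Hypothetical helper for illustration; not part of the package.
(defun my-replace-regexp-pairs-region-sketch (start end pairs)
  "Apply each [REGEX REPLACEMENT] of PAIRS to the region START..END in order.
Each pair scans the whole region, including text inserted by earlier pairs."
  (save-excursion
    (save-restriction
      (narrow-to-region start end)
      (let ((case-fold-search nil))     ; match case-sensitively
        (mapc
         (lambda (pair)
           (goto-char (point-min))
           (while (re-search-forward (elt pair 0) nil t)
             ;; FIXEDCASE is t; LITERAL is nil so \& and \1 work.
             (replace-match (elt pair 1) t)))
         pairs)))))

(with-temp-buffer
  (insert "a1b22")
  (my-replace-regexp-pairs-region-sketch
   (point-min) (point-max)
   [["[0-9]+" "<\\&>"] ["<" "["]])
  (buffer-string)) ; ⇒ "a[1>b[22>"
```

The original text contains no “<”, yet the second pair still fires, because it runs on the output of the first pair.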
For about 10 examples of using multi-pair find & replace, see: Emacs Lisp Multi-Pair Find & Replace Applications.