Xah Lee, 2007-10-09
This page shows an example of writing an Emacs Lisp function that cleans up a file's content by repeated application of find-and-replace operations. If you don't know elisp, first take a look at Emacs Lisp Basics.
I want to write a command that performs several find-and-replace pairs, one after another, on the current file.
I have a website, Math Surface Gallery, which contains a Java applet called JavaView that allows people to view 3D objects interactively (i.e. live rotation with the mouse). For example, this is one of the Java applet pages: Costa surface applet. There are about 70 such surfaces. Each of these surfaces has a raw data file that the Java applet reads. For example, for the Costa surface above, the raw data file is: costa.mgs.gz. These files are just Mathematica graphics in plain text, compressed with gzip.
The content of the file looks like this:
Graphics3D[{{
Polygon[{{3.552, -0.001061, 2.689}, {3.552, 0.03079, 2.689},
{3.025, 0.02634, 2.524}, {3.025, -0.001061, 2.524}}],
Polygon[{{3.552, 0.03079, 2.689}, {3.550, 0.1250, 2.689},
{3.023, 0.1074, 2.524}, {3.025, 0.02634, 2.524}}],
Polygon[{...}],
...
}}]
Since a file contains thousands or tens of thousands of polygons, it can get large, and it takes a while for the Java applet to load it over the net. One way to reduce file size is to reduce the number of polygons. But for a given file, spaces and end-of-line characters can also be deleted, and the decimal numbers can be safely truncated to 3 decimal places. So, typically, I open the file and do global find-and-replace operations (“Alt+x query-replace”): replace “, ” with just “,”, delete line endings (replace “\n” with nothing), and collapse runs of multiple spaces. To truncate decimals to 3 places, I use “Alt+x query-replace-regexp” with the pattern “\([0-9]\)\.\([0-9][0-9][0-9]\)[0-9]+” and replace it with “\1.\2”.
After a while, this process gets repetitious. It would be nice to have an Emacs command that, when invoked, performs all these find-and-replace operations on the current file in one shot. This would reduce some 50 keystrokes and eyeballing to a single brainless button punch.
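To see what the truncation pattern does outside of an interactive session, here is a small sketch that applies it to a string with “replace-regexp-in-string”. The function name “my-truncate-decimals” is made up for this illustration. Note that inside Lisp source code each backslash of the pattern is typed twice.

```elisp
;; Hypothetical helper for illustration only; the name is not from the article.
(defun my-truncate-decimals (text)
  "Return TEXT with every decimal number cut to 3 decimal places."
  (replace-regexp-in-string
   ;; one digit, a dot, three digits to keep, then one or more digits to drop
   "\\([0-9]\\)\\.\\([0-9][0-9][0-9]\\)[0-9]+"
   "\\1.\\2"
   text))

(my-truncate-decimals "{3.552, -0.001061, 2.689}, {3.552, 0.03079, 2.689}")
;; ⇒ "{3.552, -0.001, 2.689}, {3.552, 0.030, 2.689}"
```

Numbers that already have 3 or fewer decimals are left alone, because the pattern requires at least a fourth decimal digit. The extra digits are cut off, not rounded: 0.03079 becomes 0.030, not 0.031.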
Here's the solution:
(defun replace-mgs ()
  "Reduce size of a mgs file by removing whitespace and truncating numbers.
This function does several find and replace operations on the current buffer:
removing spaces, removing newlines, truncating numbers to 3 decimals, etc.
The goal of these replacements is to reduce the file size of a
Mathematica Graphics file (.mgs) that is read over the net by JavaView."
  (interactive)
  ;; delete line endings
  (goto-char (point-min))
  (while (search-forward "\n" nil t) (replace-match "" nil t))
  ;; collapse runs of spaces into a single space
  (goto-char (point-min))
  (while (search-forward-regexp " +" nil t) (replace-match " " nil t))
  ;; remove the space after each comma
  (goto-char (point-min))
  (while (search-forward ", " nil t) (replace-match "," nil t))
  ;; truncate decimal numbers to 3 decimal places
  (goto-char (point-min))
  (while (search-forward-regexp "\\([0-9]\\)\\.\\([0-9][0-9][0-9]\\)[0-9]+" nil t)
    (replace-match "\\1.\\2" t nil)))
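This function is relatively simple. It does a series of replacements, each in its own “while” loop, and moves the cursor to the beginning of the file before each one. Because the third argument to the search functions is “t”, a search that finds nothing returns nil instead of signaling an error, and that ends the loop. The gist is “search-forward”, “search-forward-regexp”, and “replace-match”.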
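The same loop can be written once and driven by a list of find-replace pairs. The following is only a sketch of that idea, not the code used on the site: the name “my-replace-pairs-in-buffer” is made up, and it treats every search string as a regex (the original uses plain “search-forward” for the literal strings).

```elisp
;; Hypothetical generalization for illustration; not part of the original command.
(defun my-replace-pairs-in-buffer (pairs)
  "Apply each (REGEXP . REPLACEMENT) in PAIRS to the whole current buffer, in order."
  (dolist (pair pairs)
    (goto-char (point-min))                       ; restart from the top for each pair
    (while (re-search-forward (car pair) nil t)   ; nil when nothing is left to match
      (replace-match (cdr pair) t nil))))

;; Try it on a small sample in a temporary buffer.
(with-temp-buffer
  (insert "Polygon[{{3.552, -0.001061, 2.689},\n{3.025, 0.02634, 2.524}}],\n")
  (my-replace-pairs-in-buffer
   '(("\n" . "")
     ("  +" . " ")
     (", " . ",")
     ("\\([0-9]\\)\\.\\([0-9][0-9][0-9]\\)[0-9]+" . "\\1.\\2")))
  (buffer-string))
;; ⇒ "Polygon[{{3.552,-0.001,2.689},{3.025,0.026,2.524}}],"
```

A pair whose regex can match the empty string would loop forever in this sketch, so it is only suitable for patterns like the ones above that always consume at least one character.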
The “search-forward” function takes a string, searches forward from the cursor position, and moves the cursor to the end of the first match. “search-forward-regexp” does the same, but takes a regex pattern instead of a literal string. “replace-match” simply replaces the text matched by the last search.
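A quick way to watch the cursor movement is to run a search in a temporary buffer and look at the positions afterwards. This is a small sketch; the function name “my-search-demo” is made up for the illustration.

```elisp
;; Hypothetical demo function; positions in a buffer start at 1.
(defun my-search-demo ()
  "Return (FOUND POINT MATCH-START MATCH-END MISSING) for a sample search."
  (with-temp-buffer
    (insert "a, b, c")
    (goto-char (point-min))
    (let* ((found (search-forward ", " nil t))     ; position after the first ", "
           (start (match-beginning 0))             ; where the match begins
           (end (match-end 0))                     ; where the match ends
           (missing (search-forward "zzz" nil t))) ; no match: returns nil, cursor stays
      (list found (point) start end missing))))

(my-search-demo)
;; ⇒ (4 4 2 4 nil)
```

The search returns the new cursor position, which is the end of the match. The failed search returns nil and leaves the cursor where it was, which is what lets a “while” loop stop cleanly.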
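One interesting aspect of “search-forward-regexp” is that you must type 2 backslashes to get one backslash in the pattern. This is because backslash is the escape character in an Emacs Lisp string, so a literal backslash has to be written as “\\”. The Lisp reader turns that into a single backslash, and the resulting string is what gets passed to emacs's regex engine.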
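The two layers can be checked directly with “length” and “string-match”. A few expressions to evaluate, as a sketch:

```elisp
;; The Lisp string "\\." holds 2 characters: a backslash and a dot.
(length "\\.")                     ; ⇒ 2

;; As a regex, backslash-dot matches only a literal dot.
(string-match "\\." "3.552")       ; ⇒ 1  (index of the dot)

;; An unescaped dot matches any character, so it matches at index 0.
(string-match "." "3.552")         ; ⇒ 0

;; In emacs regex, grouping is backslash-paren; a bare paren is a literal paren.
(string-match "(" "f(x)")          ; ⇒ 1

(let ((s "3.552"))
  (string-match "\\([0-9]\\)\\." s)
  (match-string 1 s))              ; ⇒ "3"
```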
Another thing of interest is that the first 2 optional parameters of the “replace-match” function are “fixedcase” and “literal”, both booleans. If “fixedcase” is non-nil, then emacs will not alter the case of the replacement (otherwise it decides smartly based on the case of the matched text). If “literal” is non-nil, then emacs will interpret the replacement string literally, so “\1” and “\2” are inserted as is instead of being replaced by the captured groups. (In our case, we want “literal” to be “nil” when we use search-forward-regexp with captured groups.)
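The effect of the two flags is easy to see in a temporary buffer. The two helper names below are made up for this sketch.

```elisp
;; Hypothetical demo helpers; names are not from the article.
(defun my-literal-demo (literal)
  "Truncate the number 0.03079, passing LITERAL to `replace-match'."
  (with-temp-buffer
    (insert "0.03079")
    (goto-char (point-min))
    (when (re-search-forward "\\([0-9]\\)\\.\\([0-9][0-9][0-9]\\)[0-9]+" nil t)
      (replace-match "\\1.\\2" t literal))
    (buffer-string)))

(defun my-fixedcase-demo (fixedcase)
  "Replace HELLO with goodbye, passing FIXEDCASE to `replace-match'."
  (with-temp-buffer
    (insert "HELLO world")
    (goto-char (point-min))
    (let ((case-fold-search t))          ; let lowercase \"hello\" match HELLO
      (when (search-forward "hello" nil t)
        (replace-match "goodbye" fixedcase t)))
    (buffer-string)))

(my-literal-demo nil)    ; ⇒ "0.030"          groups are substituted
(my-literal-demo t)      ; ⇒ "\\1.\\2"        the text \1.\2 is inserted as is
(my-fixedcase-demo t)    ; ⇒ "goodbye world"  replacement case kept
(my-fixedcase-demo nil)  ; ⇒ "GOODBYE world"  case follows the matched text
```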
Emacs is beautiful!
Addendum: here's the Mathematica code to export graphics into a text file forcing all numbers to be printed in a simple “d.dddd” format.
Otherwise, Mathematica may print numbers in various forms such as “2.25`*^-9”, “\(7.2389`\)”, “3.141592653589793238462643383279503`20”.
writeToFileRounded[expr_Graphics3D,fileName_?StringQ,prec_:4]:=Module[{},
OpenWrite[fileName];
WriteString[fileName,"Graphics3D["];
WriteString[fileName,
StringReplace[
ToString@
NumberForm[First@SetPrecision[Chop[expr,10^-(prec+1)],prec],
ExponentFunction\[Rule](If[-Infinity<#<Infinity,Null,#]&)],
"],"->"],\n"]];
WriteString[fileName,"]"];
Close[fileName]
];
writeToFileRounded[surf,"helicoid.ma",4]
(*the first argument is a Graphics3D object, the second is a name to
save to, the third is number of decimal places for the coordinate
values.*)
Page created: 2007-10. © 2007 by Xah Lee.