2005-01-27
Suppose you want to walk a directory tree, say, to apply a string replacement to all HTML files. The os.path.walk() rises to the occasion.
# Python
import os
mydir= '/Users/xah/Documents/unix_cilre/python'
def myfun(s1, s2, s3):
    print s2 # current dir
    print s3 # list of files there
    print '------==(^_^)==------'
os.path.walk(mydir, myfun, 'somenull')
The os.path.walk(base_dir,f,arg) will walk a dir tree starting at base_dir, and whenever it sees a directory (including base_dir), it will call f(arg,current_dir,children), where the current_dir is the string of the current directory, and children is a list of all children of the current directory. Specifically: a list of strings that are file names and directory names.
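Note that os.path.walk() exists only in Python 2. It was removed in Python 3, where os.walk() is the usual replacement. As a rough sketch of the calling convention described above, the same f(arg,current_dir,children) protocol can be approximated on top of os.walk(). The name walk_with_callback is hypothetical, not part of any library, and the sketch does not reproduce os.path.walk()'s feature of letting f edit the children list to steer the traversal.

```python
import os

def walk_with_callback(base_dir, f, arg):
    """Call f(arg, current_dir, children) once per directory under base_dir.

    walk_with_callback is a hypothetical helper, written only to illustrate
    the callback protocol; it is not a drop-in copy of os.path.walk().
    """
    for current_dir, dirnames, filenames in os.walk(base_dir):
        # children holds both subdirectory names and file names,
        # as plain strings without the leading path
        children = dirnames + filenames
        f(arg, current_dir, children)
```

Calling walk_with_callback(mydir, myfun, 'somenull') with a three-argument myfun should then visit base_dir first and each subdirectory after it.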
Now, suppose for each file ending in .html we want to apply function g to it. So, whenever myfun is called, we need to loop thru the children list, find the files ending in .html, then call g on each. Here's the code.
import os
mydir= '/Users/xah/web/SpecialPlaneCurves_dir'
def g(s):
    print "g touched:", s
def myfun(dummy, dirr, filess):
    for child in filess:
        if '.html' == os.path.splitext(child)[1] and os.path.isfile(dirr+'/'+child):
            g(dirr+'/'+child)
os.path.walk(mydir, myfun, 3)
Note that “os.path.splitext()” splits a string into two parts: the portion before the last period, and the rest (the period and everything after it). Effectively it is used for getting the file suffix. The “os.path.isfile()” makes sure that this is an actual file and not a dir with a “.html” suffix.
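The test made by those two calls can be pulled out into a small predicate, which makes it easy to try on its own. This is only a sketch: is_html_file is a hypothetical name, and it uses os.path.join() instead of gluing the path together with '/'.

```python
import os

def is_html_file(dirr, child):
    """True when child, inside directory dirr, is a regular file with a .html suffix.

    is_html_file is a hypothetical helper used for illustration only.
    """
    # os.path.splitext('index.html') returns ('index', '.html');
    # only the last period counts, so 'archive.tar.gz' gives ('archive.tar', '.gz')
    suffix = os.path.splitext(child)[1]
    # isfile() rejects a directory that merely happens to be named 'something.html'
    return suffix == '.html' and os.path.isfile(os.path.join(dirr, child))
```

The comparison is case sensitive, so a file named INDEX.HTML would be skipped unless the suffix is lowercased first.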
One important thing to note: the path in mydir must not end in a slash. One'd think Python'd take care of such trivia but no. This took me a while to debug. (As of Python 2.4.2, this is fixed.)
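If the starting path comes from user input, one defensive option is to strip any trailing separator before walking. The helper below is a minimal sketch under that assumption. The name strip_trailing_sep is hypothetical, and since the text above does not say exactly what went wrong with a trailing slash, this is a precaution rather than a confirmed fix.

```python
import os

def strip_trailing_sep(path):
    """Return path without trailing path separators.

    strip_trailing_sep is a hypothetical helper. A path made only of
    separators (the root directory) is returned as a single separator.
    """
    stripped = path.rstrip(os.sep)
    return stripped or os.sep
```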
Also, the semantics of “os.path.walk()” is nice. The myfun can be a recursive function, calling itself, crystallizing a program's semantics.
Reference: Python Doc.
In Perl, use the package “File::Find”'s “find” function to traverse a dir. Example:
# perl
use File::Find qw(find);
$mydir= '/Users/xah/web/SpecialPlaneCurves_dir';
sub wanted {
    if ($_ =~/\.html$/ && -T $File::Find::name) {
        print $File::Find::name, "\n";
    }
}
find(\&wanted, $mydir);
The line “use File::Find qw(find);” imports the “find” function. The “find” function is a directory walker. It will visit every file and subdirectory in a given directory. For each, it sets the variable “$_” to the name of the file, sets the variable “$File::Find::name” to the full path of the current file, and sets the variable “$File::Find::dir” to the full path of the current dir.
The “find” function has 2 parameters. The first is a reference to a function that will be called each time “find” visits a file. The second is the path you want to traverse.
Note: The name “wanted” is just a convention used by the “File::Find” package. When your function “wanted” is called, nothing is passed to it as argument. This means you cannot write your “wanted” function in a functional programming style that takes a file path as its parameter. Instead, you must read the variable “$File::Find::name” or “$_” inside the body of “wanted” to know the current file name.
Note: also, “wanted” cannot be written as a recursive function that calls itself to descend into subdirs.
Reference: perldoc File::Find.
Page created: 2005-01. © 2005 by Xah Lee.