Xah Lee, 2005-01-27
Suppose you want to walk a directory tree, say, to apply a string replacement to all HTML files. os.path.walk() rises to the occasion.
# Python
import os
mydir = '/Users/xah/Documents/unix_cilre/python'
def myfun(s1, s2, s3):
    print s2 # current dir
    print s3 # list of files there
    print '------==(^_^)==------'
os.path.walk(mydir, myfun, 'somenull')
os.path.walk(base_dir,f,arg) walks a dir tree starting at base_dir. Whenever it sees a directory (including base_dir), it calls f(arg,current_dir,children), where current_dir is the path of the current directory as a string, and children is a list of all children of the current directory. Specifically: a list of strings that are file names and directory names. These are bare names, not full paths, so they must be joined to current_dir before use.
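To make the calling convention concrete, here is a minimal sketch of a walker with the same callback shape. It is written for Python 3, where os.path.walk no longer exists. The name walk_with_callback is made up for this illustration, and sorting the children is my own choice to make the order predictable; this is not the library's actual source.

```python
import os

def walk_with_callback(base_dir, f, arg):
    """Hypothetical helper: calls f(arg, current_dir, children) per directory."""
    try:
        # Bare names of files and subdirectories, sorted (an assumption of
        # this sketch) so that the visiting order is predictable.
        children = sorted(os.listdir(base_dir))
    except OSError:
        return  # unreadable directory: skip it silently
    f(arg, base_dir, children)
    for child in children:
        path = os.path.join(base_dir, child)
        # Descend into real subdirectories only, not symlinks to directories.
        if os.path.isdir(path) and not os.path.islink(path):
            walk_with_callback(path, f, arg)
```

Calling walk_with_callback(mydir, myfun, 'somenull') with a three-argument myfun then visits base_dir first and each subdirectory after it.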
Now, suppose for each file ending in “.html” we want to apply function g to it. So, whenever myfun is called, we need to loop through the children list, find the files ending in “.html”, then call g on each. Here's the code.
import os
mydir = '/Users/xah/web/SpecialPlaneCurves_dir'
def g(s):
    print "g touched:", s
def myfun(dummy, dirr, filess):
    for child in filess:
        if '.html' == os.path.splitext(child)[1] and os.path.isfile(dirr+'/'+child):
            g(dirr+'/'+child)
os.path.walk(mydir, myfun, 3)
Note that os.path.splitext() splits a string into two parts: the portion before the last period, and the rest (the period and everything after it). Effectively it is used for getting the file suffix. The os.path.isfile() check makes sure that this is an actual file and not a dir with a “.html” suffix.
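For readers on Python 3, where os.path.walk has been removed, the same suffix test can be sketched with os.walk. The function name find_html_files is invented for this example. Since os.walk already hands back directory names and the remaining names in separate lists, a directory named “x.html” never reaches the suffix test here.

```python
import os

# os.path.splitext('a.b.html') returns ('a.b', '.html')
# os.path.splitext('README')   returns ('README', '')

def find_html_files(base_dir):
    """Hypothetical helper: full paths of non-directory entries ending in .html."""
    matches = []
    for current_dir, subdirs, files in os.walk(base_dir):
        for name in files:  # directory names are in subdirs, not in files
            if os.path.splitext(name)[1] == '.html':
                matches.append(os.path.join(current_dir, name))
    return sorted(matches)
```

The function g from the example above could then be applied with a plain loop over the returned list.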
One important thing to note: the path in mydir must not end in a slash. One would think Python would take care of such trivia, but no. This took me a while to debug. (As of Python 2.4.2, this is fixed.)
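Whatever the cause was in older versions, one way to sidestep the trailing-slash question is to normalize the starting path once and to build child paths with os.path.join rather than by gluing strings with '/'. A small sketch follows; the helper names and the '/tmp/site' path are placeholders made up for this example.

```python
import os

def normalize_base(base_dir):
    """Hypothetical helper: drop a trailing slash and other redundant separators."""
    # Caveat: normpath also collapses 'a/../b' textually, which can change
    # the meaning of a path that goes through symlinks.
    return os.path.normpath(base_dir)

def child_path(current_dir, child):
    """Hypothetical helper: join without producing a doubled separator."""
    return os.path.join(current_dir, child)

# String gluing keeps the doubled slash:
#   '/tmp/site/' + '/' + 'index.html' is '/tmp/site//index.html'
```

With these, child_path(normalize_base(mydir), child) gives the same result whether or not mydir ends in a slash.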
Also, the semantics of os.path.walk() are nice. The function myfun can be recursive, calling itself, which crystallizes a program's semantics.
In Perl, use the package “File::Find”'s “find” function to traverse a dir. Example:
# perl
use File::Find qw(find);
$mydir = '/Users/xah/web/SpecialPlaneCurves_dir';
sub wanted {
  if ($_ =~ /\.html$/ && -T $File::Find::name) {
    print $File::Find::name, "\n";
  }
}
find(\&wanted, $mydir);
The line use File::Find qw(find); imports the “find” function. The “find” function is a directory walker. It will visit every file and subdirectory in a given directory.
For each, it
sets the variable $_ to the name of the file,
sets the variable $File::Find::name to the full path of the current file,
sets the variable $File::Find::dir to the full path of the current dir.
The “find” function has 2 parameters. The first is a reference to a function that will be called each time when “find” visits a file. The second is the path you want to traverse.
Note: The name “wanted” is just a convention used by the “File::Find” package. When your function “wanted” is called, nothing is passed to it as an argument. This means you cannot write your “wanted” function in a functional programming style, taking a file path as its parameter. Instead, you must read the variable $File::Find::name or $_ inside the body of “wanted” to know the current file name.
Note: also, “wanted” cannot be written as a recursive function that calls itself to descend into subdirs.
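For contrast with the Perl behavior described above, here is a rough Python 3 sketch of a walker that hands the same three pieces of information to the callback as ordinary arguments. The name find_entries is invented for this illustration. It is only an analogy, not a port of File::Find: for instance, it does not call the callback for the top directory itself.

```python
import os

def find_entries(wanted, top):
    """Hypothetical analog: call wanted(name, full_path, current_dir) per entry."""
    for current_dir, subdirs, files in os.walk(top):
        for name in subdirs + files:
            # name        plays the role of $_
            # full path   plays the role of $File::Find::name
            # current_dir plays the role of $File::Find::dir
            wanted(name, os.path.join(current_dir, name), current_dir)
```

A callback such as lambda name, path, d: print(path) then receives everything it needs through its parameters.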