Book: The Judo Language 0.9
If no parameters are specified, the

    C:\>type pwd.judo
    cd;
    println curDir();
    cd '~/..';
    println curDir();
    cd 'jhuang';
    println curDir();

    C:\>java judo -q pwd.judo
    C:/Documents and Settings/jhuang
    C:/Documents and Settings
    C:/Documents and Settings/jhuang

The

Make and Remove Directory

In Judo, you make a new directory with the
If you are making a directory like
When removing a directory, normally the directory must be empty. To force a removal of a directory, use the

Rename and Move Files

Renaming and moving files are both achieved by the
The first parameter is the source file or directory, the second is the target, which can be either a non-existing path name or an existing directory. If the target is a directory, the source file or directory is moved into that directory. If the target exists and is not a directory, an error occurs. Again, let's experiment on the command line:

    C:\z>dir
     Volume in drive C is Local Disk
     Volume Serial Number is 8097-678E

     Directory of C:\z

    09/04/2004  07:05a      <DIR>          .
    09/04/2004  07:05a      <DIR>          ..
    09/04/2004  07:04a                   5 alfa
    09/04/2004  07:05a      <DIR>          ddd
                   1 File(s)              5 bytes
                   3 Dir(s)  38,403,932,160 bytes free

    C:\z>java judo -x "move 'alfa', 'beta'"

    C:\z>java judo -x "move 'beta', 'ddd'"

    C:\z>java judo -x "move 'ddd/beta', 'gamma'"

    C:\z>dir
     Volume in drive C is Local Disk
     Volume Serial Number is 8097-678E

     Directory of C:\z

    09/04/2004  07:06a      <DIR>          .
    09/04/2004  07:06a      <DIR>          ..
    09/04/2004  07:04a                   5 gamma
    09/04/2004  07:05a      <DIR>          ddd
                   1 File(s)              5 bytes
                   3 Dir(s)  38,403,796,992 bytes free
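The target rule above can be modelled in a few lines. The following is a rough Python sketch, not Judo's implementation: the function names `move` and `demo` are hypothetical, and treating a name clash inside the target directory as an error is an assumption the text does not state.

```python
import os
import tempfile


def move(source, target):
    """Apply the target rule described in the text; return the final path."""
    if os.path.isdir(target):
        # Existing directory: the source ends up inside it under its own name.
        dest = os.path.join(target, os.path.basename(os.path.normpath(source)))
    else:
        # Otherwise the target is taken as the new path name.
        dest = target
    if os.path.exists(dest):
        # Existing non-directory target: the text says this is an error.
        # Treating a clash inside the target directory the same way is an assumption.
        raise FileExistsError(dest)
    os.rename(source, dest)
    return dest


def demo():
    """Replay the alfa -> beta -> ddd/beta -> gamma session in a scratch directory."""
    with tempfile.TemporaryDirectory() as z:
        with open(os.path.join(z, "alfa"), "w") as f:
            f.write("hello")
        os.mkdir(os.path.join(z, "ddd"))
        move(os.path.join(z, "alfa"), os.path.join(z, "beta"))      # rename
        move(os.path.join(z, "beta"), os.path.join(z, "ddd"))       # move into directory
        move(os.path.join(z, "ddd", "beta"), os.path.join(z, "gamma"))  # move out and rename
        return sorted(os.listdir(z))
```

Calling `demo()` ends with the same two entries as the session above, `ddd` and `gamma`.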
    ListCommand   ::= ( listFiles [ < Expr > ] | ls ) [ FileSelection ] ( ListOption )* [ StatsOption | Action ]
    FileSelection ::= Expr ( except Expr | in Expr )*
    ListOption    ::= ordered [ by ( date | size | extension ) ] | limit Expr | as tree | recursive | noHidden | fileOnly | showDir | dirOnly
    StatsOption   ::= count | ( size | compressedSize | lines | words | perFile )+
    Action        ::= remove | setFileTime [ Expr ] | setReadOnly | addToClasspath | exec Expr | Block
In plain English, listFiles uses a FileSelection along with zero or more list options to find the files and directories (or folders) in a file system directory or in an archive file, and can then do one of three things: return the matching path names to the script; compute statistics such as count, size, lines or words; or act on each match, either with a built-in action such as remove, setFileTime and setReadOnly, with an exec clause, or with a block of code.
The ls command works like listFiles but just prints out the files and returns nothing. It emulates the Unix shell command ls or the Windows dir command. For ls, no actions are allowed.
This command is rich and has many options. Only some options and actions are compatible with each other; the Judo parser enforces these rules and issues error messages if incompatible options are specified. Let's see how this command can be used to achieve various purposes on files, starting with the FileSelection.
By default, fileOnly is on, meaning that only file names are returned or displayed. To show directory names along with file names, use the showDir option. To see directory names only, use dirOnly.
The FileSelection includes an inclusive list, an exclusive list and a base, all of which are optional. The following are a couple of examples:
    C:\>type ls.judo
    ls '*.java, *.judo' except '*/save/*, */alfa*' in 'c:/temp';

    C:\>java judo -q ls.judo
    C:/temp/mytest.judo
    C:/temp/Test.java
The ls command here lists and prints the files in a file system directory. The following example lists and prints files in a jar archive:
    C:\devenv\envroot\projects\judoscript-0.9\testcases\2.ess_fs_archive>type ls_jar.judo
    ls 'src/*.java' except '*/save/*, */alfa*' in 'c:/src.jar';

    C:\>java judo -q ls_jar.judo
    src/judo.java
    src/juspProc.java
    src/rjudo.java
    src/rjudo_server.java
    src/vjudo.java
The inclusive and exclusive (except) lists are expressions evaluated as strings, and the string value is a comma-separated list of path name patterns that may contain wildcard characters like * (for 0 or more characters) and ? (for a single character). Because the path names are all absolute during the list operation, it is wise to use * as a prefix to path patterns. If the inclusive list is not present, it is assumed to be '*', except for the remove operation. If the base directory or archive file is not specified, the list operation starts at the current directory.
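To make the inclusive and exclusive lists concrete, here is a small Python model of the matching rule. It is only a sketch: `split_patterns` and `select` are hypothetical names, matching is case-sensitive, and letting `*` also match the `/` separator is an assumption about Judo's behavior. The sample paths other than the two from the transcript are made up.

```python
from fnmatch import fnmatchcase


def split_patterns(spec):
    """Split a comma-separated pattern list such as '*.java, *.judo'."""
    return [p.strip() for p in spec.split(",") if p.strip()]


def select(paths, include="*", exclude=""):
    """Keep paths matching any inclusive pattern and no exclusive pattern."""
    inc = split_patterns(include)
    exc = split_patterns(exclude)
    return [
        p for p in paths
        if any(fnmatchcase(p, pat) for pat in inc)
        and not any(fnmatchcase(p, pat) for pat in exc)
    ]


def demo():
    paths = [
        "C:/temp/Test.java",
        "C:/temp/mytest.judo",
        "C:/temp/save/Old.java",   # hypothetical: dropped by '*/save/*'
        "C:/temp/alfa.judo",       # hypothetical: dropped by '*/alfa*'
        "C:/temp/notes.txt",       # hypothetical: matches no inclusive pattern
    ]
    return select(paths, "*.java, *.judo", "*/save/*, */alfa*")
```

Because the paths are absolute, a pattern such as `save/*` without the leading `*` would match nothing here, which is why the text recommends the `*` prefix.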
There is one tricky situation with the inclusive list. Since the inclusive list can be any expression, it can also be a variable name. If a variable name happens to be a listFiles option name, confusion arises:
    fileOnly = '*.java, *.judo';
    listFiles fileOnly; // !#@~*$
The rule is that listFiles option names take precedence over variable names. So a variable named fileOnly can never be used in listFiles and ls commands; you have to use a different variable name.
The listFiles and ls commands can also take options like recursive, noHidden, fileOnly and dirOnly; they direct the command and affect the file selection in different ways. Their meanings should be self-evident.
So far we have seen the simple way of listing files, that is, printing them out via the ls command. In contrast, listFiles either returns the result to the script for programmatic processing or does in-line processing with the specified actions. We will cover all the usages in the following sections.
In the simplest case, listFiles returns the found file path names in an Array and stores it in the predefined local variable $_. The following example uses listFiles to get a list of file names and then prints them out, behaving just like ls:
Listing 15.1 ls_clone.judo

    listFiles '*.java, *.judo' except '*/save/*, */alfa*' in 'c:/temp';
    for x in $_ {
      println x;
    }
Again, be careful with $_, since a number of statements in Judo use it and it can be overwritten without notice. It is safer to assign it to a dedicated variable immediately following the listFiles command.
When the recursive option is specified, the listFiles and ls commands recurse into sub-directories for all files and/or directories that match the file selection criteria. The returned files can be sorted by path name, file time or size via the ordered by clause.
Sometimes, you just need to find a few files. In this case, specify the limit clause to improve performance. The following code snippet assumes a junit.jar lies somewhere in the vicinity of the script, finds it and adds it to the classpath:
Listing 15.2 prepare_unittest.judo

    cd #script.getFilePath(), '..'; // move up one directory
    listFiles '*/junit.jar' recursive limit 1;
    if $_.length <= 0 {
      println <err> "junit.jar is not found. Can't proceed with testing.";
      return;
    }
    #classpath.add( $_[0] );

    // now, do some unit testing
    // ....
Speaking of classpath, listFiles provides another action, addToClasspath, that makes this even easier:
Listing 15.3 addtoclasspath.judo

    cd #script.getFilePath(), '../lib/'; // move to the lib/ directory
    listFiles '*.jar, *.zip' addToClasspath;
    println #classpath; // verify
Most of the time, listFiles returns path names as an Array in the $_ local variable. You can also merge the result into an existing array like this:
Listing 15.4 add_libjars_2cp.judo

    /*
     * To add all the jar/zip files in ${deploy}/lib and ${thirdparty}/lib
     * into the (user) classpath.
     */
    arr = [];
    listFiles <arr> '*.jar, *.zip' in '${deploy}/lib';
    listFiles <arr> '*.jar, *.zip' in '${thirdparty}/lib';
    #classpath.add(arr);
By the way, you can do addToClasspath on the second listFiles command to add all the found files to the classpath.
Sometimes it is convenient to process the content of a directory as a tree rather than a list (or array). This can be easily done with the as tree option:
    C:\>type get_tree.judo
    listFiles '*.java, *.judo' except '*/save/*, */alfa*' in 'c:/temp' recursive as tree;
    println $_;
    for x in $_.getChildren() {
      println x;
    }

    C:\>java judo -q get_tree.judo
    {isDir=true,path=C:/temp}
    {isDir=true,path=C:/temp/Adobe}
    {isDir=true,path=C:/temp/Cookies}
    {isDir=true,path=C:/temp/History}
    {isDir=true,path=C:/temp/Temporary Internet Files}
    {isDir=true,path=C:/temp/VBE}
    {path=C:/temp/Test.java}
    {path=C:/temp/mytest.judo}
The returned value is a TreeNode object for the root directory, which may have one or more child nodes holding information about the files and directories. Each node has a path attribute and an isDir boolean attribute. Needless to say, only directory nodes have child nodes. Refer to Tree Node for how to work with trees. For instance, you can use TreeNode's traversal methods like bfsAllNodes() or dfsAllNodes() to display all the nodes:
Listing 15.5 get_dir_tree.judo

    listFiles '*.java, *.judo' except '*/save/*, */alfa*' in 'c:/temp' dirOnly recursive as tree;
    for x in $_.dfsAllNodes() {
      println x;
    }
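The difference between the two traversal orders is easiest to see on a tiny tree. The following Python sketch is a stand-in for illustration, not Judo's TreeNode class: the method names only echo dfsAllNodes() and bfsAllNodes(), the pre-order behavior of the depth-first walk is an assumption, and the file `C:/temp/Adobe/a.java` is hypothetical.

```python
from collections import deque


class TreeNode:
    """Minimal stand-in for a directory tree node with path and is_dir."""

    def __init__(self, path, is_dir=False):
        self.path = path
        self.is_dir = is_dir
        self.children = []

    def add(self, child):
        self.children.append(child)
        return child

    def dfs_all_nodes(self):
        """Depth-first, pre-order: a node is yielded before its children."""
        yield self
        for child in self.children:
            yield from child.dfs_all_nodes()

    def bfs_all_nodes(self):
        """Breadth-first: all nodes at one depth come before any deeper node."""
        queue = deque([self])
        while queue:
            node = queue.popleft()
            yield node
            queue.extend(node.children)


def demo():
    root = TreeNode("C:/temp", True)
    adobe = root.add(TreeNode("C:/temp/Adobe", True))
    adobe.add(TreeNode("C:/temp/Adobe/a.java"))  # hypothetical file
    root.add(TreeNode("C:/temp/Test.java"))
    dfs = [n.path for n in root.dfs_all_nodes()]
    bfs = [n.path for n in root.bfs_all_nodes()]
    return dfs, bfs
```

In the depth-first result the nested file appears right after its directory; in the breadth-first result it appears last, after both top-level entries.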
The listFiles command can not only find files but also calculate some statistics about them, namely size, lines, words and count; for files within archives, you can get compressedSize, too. The return values of these commands differ depending on the options. The count option is used on its own, whereas size, compressedSize, lines and words can be used together.
The count option returns counts of the selected files and directories. It returns an array of three elements: the count of files, the count of directories or folders, and the total count. The last element is redundant: it is always the sum of the other two. The following example demonstrates its use by running some code directly on the command line:
    C:\src>java judo -x "listFiles '*.java' count; println $_"
    [5,0,5]

    C:\src>java judo -x "listFiles '*.java' recursive count; println $_"
    [289,29,318]

    C:\src>java judo -x "listFiles '*.java' recursive fileOnly count; println $_"
    [289,0,289]
You can get file statistics with the size, compressedSize, lines and words options. These options can be used together as well. When used individually, the return value is a number; if multiple options are used together, an array of numbers is returned.
Listing 15.6 dirstats.judo

    listFiles '*' in 'C:/src/com/judoscript' dirOnly;
    for x in $_ { // get status for each directory
      listFiles '*.java, *.jj' in x recursive size lines words;
      println $_[0]:>8, ' ', $_[1]:>6, ' ', $_[2]:>6, ' ', x;
    }
The result is something like this:
       35889    993   4286 C:/src/com/judoscript/xml
      524476  15003  63402 C:/src/com/judoscript/util
       20219    455   2321 C:/src/com/judoscript/user
       40954   1273   4451 C:/src/com/judoscript/studio
      215570   6361  20776 C:/src/com/judoscript/parser
       12537    391   1398 C:/src/com/judoscript/jusp
        7181    195    883 C:/src/com/judoscript/jdk14
       29410    654   2911 C:/src/com/judoscript/gui
       32918    877   3775 C:/src/com/judoscript/ext
       70331   1886   7652 C:/src/com/judoscript/db
      297949   8179  33558 C:/src/com/judoscript/bio
           0      0      0 C:/src/com/judoscript/ant
And we just found an empty directory that can be removed.
You can also get the statistics per file. The return value is a SortedMap in which each path name is mapped to a number or an array of numbers, depending on how many options are specified. The following example shows both cases:
Listing 15.7 filestats.judo

    listFiles '*.java' fileOnly lines perFile;
    for f in $_ {
      stats = $_.(f);
      println stats:>8, ' ', f;
    }

    listFiles '*.java' fileOnly size lines words perFile;
    for f in $_ {
      stats = $_.(f);
      println stats[0]:>8, ' ', stats[1]:>6, ' ', stats[2]:>6, ' ', f;
    }
.... .......
35 C:\src\com\judoscript\ValueBase.java
369 C:\src\com\judoscript\ValueSpecial.java
45 C:\src\com\judoscript\Variable.java
1460 C:\src\com\judoscript\VariableAdapter.java
273 C:\src\com\judoscript\VersionInfo.java
83 C:\src\com\judoscript\_Thread.java
16991 TOTAL
..... .... .... .......
1409 35 190 C:\src\com\judoscript\ValueBase.java
15734 369 1860 C:\src\com\judoscript\ValueSpecial.java
1759 45 226 C:\src\com\judoscript\Variable.java
63038 1460 6331 C:\src\com\judoscript\VariableAdapter.java
14173 273 1762 C:\src\com\judoscript\VersionInfo.java
2665 83 317 C:\src\com\judoscript\_Thread.java
628614 16991 73332 TOTAL
For Unix users, features like size, lines and words may remind you of the wc utility.
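For readers who want to see what the three statistics amount to, here is a wc-style Python sketch. It is an illustration under assumptions: the function names are hypothetical, lines are counted as newline characters and words as whitespace-separated runs (as wc does), and Judo's exact counting rules may differ.

```python
import os
import tempfile


def file_stats(path):
    """Return [size in bytes, number of lines, number of words] for one file."""
    with open(path, "rb") as f:
        data = f.read()
    return [len(data), data.count(b"\n"), len(data.split())]


def stats_per_file(paths):
    """Map each path to its statistics, in sorted path order (like a sorted map)."""
    return {p: file_stats(p) for p in sorted(paths)}


def stats_total(paths):
    """Sum the per-file statistics element by element."""
    totals = [0, 0, 0]
    for stats in stats_per_file(paths).values():
        totals = [t + v for t, v in zip(totals, stats)]
    return totals


def demo():
    with tempfile.TemporaryDirectory() as d:
        a = os.path.join(d, "A.java")
        b = os.path.join(d, "B.java")
        with open(a, "wb") as f:
            f.write(b"class A {\n}\n")
        with open(b, "wb") as f:
            f.write(b"// empty\n")
        per_file = stats_per_file([b, a])
        return list(per_file.values()), stats_total([a, b])
```

The per-file form corresponds to the perFile option, and the summed form to the cumulative numbers returned without it.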
Since the listFiles command can be used for so many purposes, let us summarize its return values.
Sample Command | Return Value |
---|---|
listFiles | $_ is an Array of paths. |
listFiles as tree | $_ is a TreeNode with these attributes: path and isDir . |
listFiles count | $_ is an Array of three elements: $_[0] is the count of files, $_[1] is the count of directories or folders, and $_[2] is the sum of the other two. |
listFiles size | $_ is a number as the cumulative size of all files. The option can also be compressedSize , lines or words . |
listFiles size perFile | $_ is a SortedMap , where each path is mapped to a number as the size of that file. |
listFiles size lines words | $_ is an Array of three numbers for the size, number of lines and number of words. |
listFiles lines size | $_ is an Array of two numbers for the number of lines and size. |
listFiles lines size perFile | $_ is a SortedMap , where each path is mapped to an Array of two numbers for the number of lines and size. |
listFiles exec '..' | $_ is an Array of all the path names processed. See Run Shell Commands on Selected Files. |
listFiles { ... } | $_ is an Array of all the path names processed. See Arbitrarily Process Selected Files. |
The listFiles command natively supports three operations on selected files via the keywords remove, setFileTime and setReadOnly. The command returns the path names of all the files and directories affected.
The remove command can remove files and empty directories in the file system. The following example removes all the files left over by the vi editor:
C:\src>java judo -x "listFiles '*~' recursive remove; println $_"
To remove a directory that is not empty, you would have to use the rmdir mentioned below.
The setFileTime command can optionally take a Date value. If the time value is not specified, the current time is used, and this command becomes much like the Unix touch utility.
The setReadOnly command is used to set the read-only flag on a file or directory in the file system. The Java platform does not support setting files to be read-write, so this setReadOnly command does not have a counterpart to set files and directories to be read-write.
Of course, you can use the operating system's commands or utilities to do these operations via the exec command explained next.
For the selected files in the listFiles command, you can apply any operating system shell commands or utilities on them via the exec clause. The following is an example that does the same as setReadOnly:
Listing 15.8 list_exec.judo

    listFiles exec isWindows() ? 'attrib +r $_' : 'chmod 444 $_';
The exec clause takes an operating system command-line, which uses $_ to represent the current file being processed. In this example, we use the attrib utility on the Windows platform or the chmod command on Unix to set the read-only attribute for files and directories. You can easily modify this to make files writable as well.
Sometimes it is more efficient to run native executables directly if possible. This is because the listFiles command runs the command line once for each file or path, whereas a native executable can take wildcard characters in its parameters and handle a whole set of files in one invocation, for instance:
% chmod 666 *.java
The processed file names are also returned in an array as $_.
The exec command is really a shortcut. For example, listFiles exec 'chmod 666 $_'; is a shortcut for:
    listFiles;
    for x in $_ {
      exec 'chmod 666 ${x}';
    }
This is also true for arbitrary file processing that is discussed next.
In addition to the exec clause to run operating system commands on the selected files, you can specify a block of code to process the selected file. Again, the file name being processed is stored in $_. The following example counts the total blank and non-blank lines in all the files:
Listing 15.9 count_blank.judo

    cnt1 = 0;
    cnt2 = 0;
    listFiles '*.java, *.jj' in 'c:/src' recursive
    {
      do $_ as lines { // now, $_ has become the line just read!
        if ($_.isEmpty()) ++cnt1;
        else ++cnt2;
      }
    }
    println ' Blank lines: ', cnt1:>7;
    println 'Non-Blank lines: ', cnt2:>7;
    println ' Total Files: ', $_.length:>7;
The result is:
        Blank lines:    5983
    Non-Blank lines:   47562
        Total Files:     282
This command also returns all the path names in an array. So in this example, the three occurrences of $_ all have different meanings. The first $_, within do $_ as lines, is a string representing the path name being processed; the second $_, in if ($_.isEmpty()), is the line just read from the file; and the last $_, in println ' Total Files: ', $_.length:>7, is an array of all the path names just processed.
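The three roles of $_ become easier to tell apart when each gets its own name. The following Python sketch restates the counting logic with explicit variables; it is a model of the listing above, not a translation Judo guarantees, and the assumption is that isEmpty() means the line has no characters once the line terminator is removed.

```python
import os
import tempfile


def count_blank(paths):
    """Count blank and non-blank lines; return (blank, non_blank, processed)."""
    blank = non_blank = 0
    processed = []
    for path in paths:                      # role 1: the path being processed
        with open(path, encoding="utf-8") as f:
            for line in f:                  # role 2: the line just read
                if line.rstrip("\r\n") == "":
                    blank += 1
                else:
                    non_blank += 1
        processed.append(path)
    return blank, non_blank, processed      # role 3: all paths processed


def demo():
    with tempfile.TemporaryDirectory() as d:
        a = os.path.join(d, "A.java")
        b = os.path.join(d, "B.jj")
        with open(a, "w", encoding="utf-8") as f:
            f.write("a\n\nb\n")
        with open(b, "w", encoding="utf-8") as f:
            f.write("\n")
        blank, non_blank, processed = count_blank([a, b])
        return blank, non_blank, len(processed)
```

Here `path`, `line` and `processed` play the parts that the single name $_ plays at three different points in the Judo script.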
Next is a more elaborate example. We will go through the source code tree and update the copyright note for all the source files owned by the project. The modified source files will be generated in another directory and left there.
Listing 15.10 upd_copyright.judo

    src = 'C:/src/';
    src_len = src.length();
    target = 'C:/temp/new_src/';
    mkdir target;

    listFiles '*.java, *.jj' in src fileOnly recursive
    {
      // Construct the path for the new file:
      var path = $_.getFilePath();
      var file = $_.getFileName();
      var newPath = target + path.substring(src_len);
      mkdir newPath; // make sure the dir is there; ok if exists.
      var newfile = openTextFile(newPath + file, 'w');

      // Process the lines in the source file:
      var updated = false;
      do $_ as lines { // now, $_ holds the line just read.
        if !updated && $_.startsWith(' * Copyright (C) 2001-') {
          println <newfile> ' * Copyright (C) 2001-', #year, ' James Huang http://www.judoscript.com';
          updated = true;
        } else {
          println <newfile> $_;
        }
      }

      // Done.
      newfile.close();
      println 'Updated ', path.substring(src_len), file;
    }
This concludes our discussion of the listFiles command. The listFiles command does much more than what the name suggests. It is indeed a file processor, allowing you to obtain a set of files and directories, and process them individually (via the exec clause or a code block) or collectively (such as getting the result in a tree via the as tree option). It can also return a number of statistics. This single command includes functionalities of a number of popular shell utilities, such as ls, wc and touch on Unix.
Beyond individual file processing, Judo has some other commands to do copying and moving. These are covered in the rest of this chapter.
As we have mentioned earlier, the listFiles and ls commands can be applied to contents within zip, jar and tar (gzipped or not) archives. Obviously files and folders contained in archives are all read-only, so commands like remove, setFileTime and setReadOnly don't apply, and running shell commands with exec generally doesn't make sense. Information gathering is valid, and you can do read-only processing on files in zip or jar archives. Such processing is not available for tar archives because of their sequential nature.
Let's take a look at some examples involving zip and tar archives, starting with a zip archive like this:
    C:\>jar tvf awebapp.zip
       276 Tue Jun 15 13:33:32 PDT 2004 index.jsp
       347 Mon Aug 30 13:32:30 PDT 2004 login.jsp
         0 Mon Aug 30 13:28:26 PDT 2004 META-INF/
        55 Tue Jun 15 13:33:32 PDT 2004 META-INF/MANIFEST.MF
         0 Mon Aug 30 13:34:16 PDT 2004 WEB-INF/
         0 Mon Aug 30 13:41:06 PDT 2004 WEB-INF/classes/
         0 Mon Aug 30 13:38:44 PDT 2004 WEB-INF/classes/foo/
         0 Mon Aug 30 13:39:06 PDT 2004 WEB-INF/classes/foo/bar/
      1604 Mon Aug 30 13:39:06 PDT 2004 WEB-INF/classes/foo/bar/LoginDAO.class
         0 Mon Aug 30 13:36:22 PDT 2004 WEB-INF/lib/
    118726 Tue Jun 15 13:33:32 PDT 2004 WEB-INF/lib/commons-beanutils.jar
     31605 Tue Jun 15 13:33:32 PDT 2004 WEB-INF/lib/commons-logging.jar
    498051 Tue Jun 15 13:33:32 PDT 2004 WEB-INF/lib/struts.jar
         0 Mon Aug 30 13:28:24 PDT 2004 WEB-INF/src/
      3672 Tue Jun 15 13:33:32 PDT 2004 WEB-INF/src/build.xml
         0 Mon Aug 30 13:40:54 PDT 2004 WEB-INF/src/java/
         0 Mon Aug 30 13:37:46 PDT 2004 WEB-INF/src/java/foo/
         0 Mon Aug 30 13:38:28 PDT 2004 WEB-INF/src/java/foo/bar/
      1026 Mon Aug 30 13:38:28 PDT 2004 WEB-INF/src/java/foo/bar/LoginDAO.java
      1923 Tue Jun 15 13:33:32 PDT 2004 WEB-INF/src/README.txt
      8868 Tue Jun 15 13:33:32 PDT 2004 WEB-INF/struts-bean.tld
     66192 Tue Jun 15 13:33:32 PDT 2004 WEB-INF/struts-html.tld
     14511 Tue Jun 15 13:33:32 PDT 2004 WEB-INF/struts-logic.tld
      1942 Tue Jun 15 13:33:32 PDT 2004 WEB-INF/web.xml

    C:\>java judo -q "ls '*' in 'awebapp.zip' recursive"
    WEB-INF/
    WEB-INF/web.xml
    WEB-INF/struts-logic.tld
    WEB-INF/struts-html.tld
    WEB-INF/struts-bean.tld
    WEB-INF/src/
    WEB-INF/src/README.txt
    WEB-INF/src/java/
    WEB-INF/src/java/foo/
    WEB-INF/src/java/foo/bar/
    WEB-INF/src/java/foo/bar/LoginDAO.java
    WEB-INF/src/build.xml
    WEB-INF/lib/
    WEB-INF/lib/struts.jar
    WEB-INF/lib/commons-logging.jar
    WEB-INF/lib/commons-beanutils.jar
    WEB-INF/classes/
    WEB-INF/classes/foo/
    WEB-INF/classes/foo/bar/
    WEB-INF/classes/foo/bar/LoginDAO.class
    META-INF/
    META-INF/MANIFEST.MF
    login.jsp
    index.jsp

    C:\>java judo -q "ls '*' in 'awebapp.zip' fileOnly recursive"
    WEB-INF/web.xml
    WEB-INF/struts-logic.tld
    WEB-INF/struts-html.tld
    WEB-INF/struts-bean.tld
    WEB-INF/src/README.txt
    WEB-INF/src/java/foo/bar/LoginDAO.java
    WEB-INF/src/build.xml
    WEB-INF/lib/struts.jar
    WEB-INF/lib/commons-logging.jar
    WEB-INF/lib/commons-beanutils.jar
    WEB-INF/classes/foo/bar/LoginDAO.class
    META-INF/MANIFEST.MF
    login.jsp
    index.jsp
The following program gets the overall sizes of the top level folders:
Listing 15.11 dirstats_zip.judo

    listFiles '*' in 'awebapp.zip' dirOnly;
    for x in $_ { // get status for each directory
      listFiles '*.java, *.jj' in x recursive size compressedSize;
      println $_[0]:>8, ' ', $_[1]:>8, ' ', x;
    }
C:\>java judo -q dirstats_zip.judo
748120 590201 WEB-INF/
55 56 META-INF/
The same operations can be done on tar archives:
    C:\>tar tvfz awebapp.tar.gz
    drwxr-xr-x jhuang/None       0 2004-08-30 13:28:26 META-INF/
    -rw-r--r-- jhuang/None      55 2004-06-15 13:33:33 META-INF/MANIFEST.MF
    drwxr-xr-x jhuang/None       0 2004-08-30 13:34:17 WEB-INF/
    drwxr-xr-x jhuang/None       0 2004-08-30 13:41:08 WEB-INF/classes/
    drwxr-xr-x jhuang/None       0 2004-08-30 13:38:46 WEB-INF/classes/foo/
    drwxr-xr-x jhuang/None       0 2004-08-30 13:39:08 WEB-INF/classes/foo/bar/
    -rw-r--r-- jhuang/None    1604 2004-08-30 13:39:08 WEB-INF/classes/foo/bar/LoginDAO.class
    drwxr-xr-x jhuang/None       0 2004-08-30 13:36:23 WEB-INF/lib/
    -rw-r--r-- jhuang/None  118726 2004-06-15 13:33:33 WEB-INF/lib/commons-beanutils.jar
    -rw-r--r-- jhuang/None   31605 2004-06-15 13:33:33 WEB-INF/lib/commons-logging.jar
    -rw-r--r-- jhuang/None  498051 2004-06-15 13:33:33 WEB-INF/lib/struts.jar
    drwxr-xr-x jhuang/None       0 2004-08-30 13:28:26 WEB-INF/src/
    -rw-r--r-- jhuang/None    3672 2004-06-15 13:33:33 WEB-INF/src/build.xml
    drwxr-xr-x jhuang/None       0 2004-08-30 13:40:54 WEB-INF/src/java/
    drwxr-xr-x jhuang/None       0 2004-08-30 13:37:48 WEB-INF/src/java/foo/
    drwxr-xr-x jhuang/None       0 2004-08-30 13:38:28 WEB-INF/src/java/foo/bar/
    -rw-r--r-- jhuang/None    1026 2004-08-30 13:38:28 WEB-INF/src/java/foo/bar/LoginDAO.java
    -rw-r--r-- jhuang/None    1923 2004-06-15 13:33:33 WEB-INF/src/README.txt
    -rw-r--r-- jhuang/None    8868 2004-06-15 13:33:33 WEB-INF/struts-bean.tld
    -rw-r--r-- jhuang/None   66192 2004-06-15 13:33:33 WEB-INF/struts-html.tld
    -rw-r--r-- jhuang/None   14511 2004-06-15 13:33:33 WEB-INF/struts-logic.tld
    -rw-r--r-- jhuang/None    1942 2004-06-15 13:33:33 WEB-INF/web.xml
    -rw-r--r-- jhuang/None     276 2004-06-15 13:33:33 index.jsp
    -rw-r--r-- jhuang/None     347 2004-08-30 13:32:31 login.jsp

    C:\>java judo -q "ls '*' in 'awebapp.tar.gz' recursive"
    login.jsp
    index.jsp
    WEB-INF/
    WEB-INF/web.xml
    WEB-INF/struts-logic.tld
    WEB-INF/struts-html.tld
    WEB-INF/struts-bean.tld
    WEB-INF/src/
    WEB-INF/src/README.txt
    WEB-INF/src/java/
    WEB-INF/src/java/foo/
    WEB-INF/src/java/foo/bar/
    WEB-INF/src/java/foo/bar/LoginDAO.java
    WEB-INF/src/build.xml
    WEB-INF/lib/
    WEB-INF/lib/struts.jar
    WEB-INF/lib/commons-logging.jar
    WEB-INF/lib/commons-beanutils.jar
    WEB-INF/classes/
    WEB-INF/classes/foo/
    WEB-INF/classes/foo/bar/
    WEB-INF/classes/foo/bar/LoginDAO.class
    META-INF/
    META-INF/MANIFEST.MF

    C:\>java judo -q "ls '*' in 'awebapp.tar.gz' fileOnly recursive"
    login.jsp
    index.jsp
    WEB-INF/web.xml
    WEB-INF/struts-logic.tld
    WEB-INF/struts-html.tld
    WEB-INF/struts-bean.tld
    WEB-INF/src/README.txt
    WEB-INF/src/java/foo/bar/LoginDAO.java
    WEB-INF/src/build.xml
    WEB-INF/lib/struts.jar
    WEB-INF/lib/commons-logging.jar
    WEB-INF/lib/commons-beanutils.jar
    WEB-INF/classes/foo/bar/LoginDAO.class
    META-INF/MANIFEST.MF
Files and folders within zip and tar archives all start at the root with no name. This fact becomes obvious when you get the file names in a tree:
Listing 15.12 get_tree_zip.judo

    listFiles '*' in 'awebapp.zip' fileOnly recursive as tree;
    for x in $_.dfsAllNodes() {
      println x;
    }
C:\>java judo -q get_tree_zip.judo
{isDir=true,path=}
{path=login.jsp}
{path=index.jsp}
{path=WEB-INF/web.xml}
{path=WEB-INF/struts-logic.tld}
{path=WEB-INF/struts-html.tld}
{path=WEB-INF/struts-bean.tld}
{path=WEB-INF/src/java/foo/bar/LoginDAO.java}
{path=WEB-INF/src/build.xml}
{path=WEB-INF/src/README.txt}
{path=WEB-INF/lib/struts.jar}
{path=WEB-INF/lib/commons-logging.jar}
{path=WEB-INF/lib/commons-beanutils.jar}
{path=WEB-INF/classes/foo/bar/LoginDAO.class}
{path=META-INF/MANIFEST.MF}
The first node, which is the root, has an empty path name.
Let's port the count_blank.judo program to make it work with files residing in a zip archive.
    listFiles '*.java, *.jj' in 'C:/src.jar' recursive
    {
      do $_ in 'C:/src.jar' as lines {
        ......
      }
    }
For brevity we omitted unchanged parts. There are two significant changes: one is in listFiles ... in 'C:/src.jar', and the second is in do $_ in 'C:/src.jar' as lines. This should work and give the right result.
But there is one performance concern. The do $_ in 'C:/src.jar' as lines statement will open and close the zip archive for every single source file contained in it. Since the zip archive is already opened by the listFiles command itself, why not use that open zip archive for this purpose? As a remedy, Judo provides a built-in variable, $$archive, that holds the open zip archive and is available only in the processing block. Hence the revised version:
Listing 15.13 count_blank_zip.judo

    cnt1 = 0;
    cnt2 = 0;
    listFiles 'src/com/judoscript/*.java, src/com/judoscript/*.jj' in 'C:/src.jar' recursive
    {
      do $_ in $$archive as lines { // now, $_ has become the line just read!
        if ($_.isEmpty()) ++cnt1;
        else ++cnt2;
      }
    }
    println ' Blank lines: ', cnt1:>7;
    println 'Non-Blank lines: ', cnt2:>7;
    println ' Total Files: ', $_.length:>7;
As discussed in 14. Print, File I/O and In-Script Data, you can open the files within a zip archive with calls such as $$archive.openTextFile($_) to do more sophisticated read-only operations.
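The idea of reusing one open archive for every member can be sketched in Python with the standard zipfile module. This only illustrates the pattern, not how Judo implements $$archive: the function name, the suffix filter and the sample entries are all hypothetical.

```python
import io
import zipfile


def count_blank_in_zip(archive_file, suffixes=(".java", ".jj")):
    """Count blank and non-blank lines of matching members of a zip archive."""
    blank = non_blank = 0
    names = []
    # The archive is opened once and reused for every member,
    # which is the role $$archive plays in the Judo listing.
    with zipfile.ZipFile(archive_file) as archive:
        for name in archive.namelist():
            if not name.endswith(suffixes):
                continue
            with archive.open(name) as member:
                for raw in member:  # iterates over the member line by line (bytes)
                    if raw.decode("utf-8").rstrip("\r\n") == "":
                        blank += 1
                    else:
                        non_blank += 1
            names.append(name)
    return blank, non_blank, names


def demo():
    # Build a small in-memory archive with hypothetical entries.
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("src/A.java", "a\n\nb\n")
        zf.writestr("src/notes.txt", "\n\n")
        zf.writestr("src/g.jj", "\nx\n")
    buf.seek(0)
    return count_blank_in_zip(buf)
```

The slower alternative would construct a new `zipfile.ZipFile` inside the loop for each member, which mirrors what do $_ in 'C:/src.jar' as lines does.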
The Judo copy command operates between directories and archives such as zip, jar, tar and gzipped tar files. It has good support for jar files, including the jar file manifest. The command's syntax is:
    CopyCommand   ::= copy ( FileSelection | URL ) ( to | into ) Expr ( CopyOption | ArchiveOption )+
    CopyOption    ::= force | echo | Echo | keepDirs | dupOk
    ArchiveOption ::= compress | store | ( under | strip | manifest ) Expr
    URL           ::= Expr
The file selection is exactly the same as that of the listFiles command, that is, files can be selected from a directory or an archive. The to and into clauses specify the target: the to clause is for a destination directory or a file (in which case the source must be a single file); the into clause specifies a new archive, whose type is determined by the file extension such as .zip, .jar, .war, .tar and .tar.gz. Let's look at copying files in the local file system first.
In a copy command, when the target of the to clause does not exist, it is assumed that a single source file will be copied to this name. If the target path name is not absolute, it is relative to the current directory. If there is more than one source file or the source is a directory, an exception is raised. Let's again run some Judo code from the command line and see what happens:
    C:\x>java judo -x "copy 'alfa' to 'beta'"

    C:\x>md y

    C:\x>java judo -x "copy 'alfa' to 'y'"

    C:\x>java judo -x "copy 'alfa' to 'y/gamma'"

    C:\x>dir
     Volume in drive C is Local Disk
     Volume Serial Number is 8097-678E

     Directory of C:\x

    09/02/2004  01:46p      <DIR>          .
    09/02/2004  01:46p      <DIR>          ..
    08/30/2004  06:15a                   5 alfa
    08/30/2004  06:15a                   5 beta
                   2 File(s)             10 bytes
                   2 Dir(s)  38,433,390,592 bytes free

    C:\x>dir y
     Volume in drive C is Local Disk
     Volume Serial Number is 8097-678E

     Directory of C:\x\y

    09/02/2004  01:49p      <DIR>          .
    09/02/2004  01:49p      <DIR>          ..
    08/30/2004  06:15a                   5 alfa
    08/30/2004  06:15a                   5 gamma
                   2 File(s)             10 bytes
                   2 Dir(s)  38,433,390,592 bytes free
When copying files from a base, the relative paths can be retained via the keepDirs option.
    C:\>md z

    C:\>java judo -x "copy 'y/gamma' to 'z'"

    C:\x>dir z
     Volume in drive C is Local Disk
     Volume Serial Number is 8097-678E

     Directory of C:\x\z

    09/02/2004  01:58p      <DIR>          .
    09/02/2004  01:58p      <DIR>          ..
    08/30/2004  06:15a                   5 gamma
    09/02/2004  01:57p      <DIR>          y
                   1 File(s)              5 bytes
                   3 Dir(s)  38,432,976,896 bytes free

    C:\>java judo -x "copy 'y/gamma' to 'z' keepDirs"

    C:\x>dir z\y
     Volume in drive C is Local Disk
     Volume Serial Number is 8097-678E

     Directory of C:\x\z\y

    09/02/2004  01:57p      <DIR>          .
    09/02/2004  01:57p      <DIR>          ..
    08/30/2004  06:15a                   5 gamma
                   1 File(s)              5 bytes
                   2 Dir(s)  38,432,976,896 bytes free
Copying a single file is simple, but the real power of the copy command lies in dealing with sets of files.
To copy a tree of files and directories to a different directory, use the recursive option; the keepDirs option is implicitly turned on:
    C:\>md y

    C:\>java judo -x "copy '*' in 'C:/x' to 'C:/y' recursive"
When copying file(s) from one location to another in the file system, by default the copy command compares the file's time and size; if both time and size are the same between the source and target file, it passes over the file without physically copying it. The force option disables this optimization. The echo option displays the source files actually being copied, and the Echo option displays both copied and passed files.
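The time-and-size check can be modelled as follows. This is a hedged Python sketch rather than Judo's code: `copy_file` is a hypothetical name, comparing modification times at whole-second resolution is an assumption, and so is the idea that the copy preserves the source's timestamp (which is what lets a second run skip the file).

```python
import os
import shutil
import tempfile


def copy_file(src, dst, force=False):
    """Copy src to dst unless dst already has the same size and time.

    Returns 'copied' or 'passed', matching the two cases the Echo option reports.
    """
    if not force and os.path.exists(dst):
        s, d = os.stat(src), os.stat(dst)
        if s.st_size == d.st_size and int(s.st_mtime) == int(d.st_mtime):
            return "passed"
    # copy2 also copies the modification time, so a later run can skip this file.
    shutil.copy2(src, dst)
    return "copied"


def demo():
    with tempfile.TemporaryDirectory() as d:
        src = os.path.join(d, "alfa")
        dst = os.path.join(d, "beta")
        with open(src, "w") as f:
            f.write("hello")
        return [
            copy_file(src, dst),              # target missing
            copy_file(src, dst),              # same size and time
            copy_file(src, dst, force=True),  # optimization disabled
        ]
```

The three calls in `demo()` correspond to a first copy, a repeated copy that is passed over, and a repeated copy with the force option.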
The same copy command is also a versatile archiving command. In the simplest form, it is almost identical to copying files in the file system, except that it uses the into clause rather than to.
copy '*' in 'C:/x' into 'C:/test.zip' recursive;
The archive can be a zip, jar, tar or gzipped tar file, whose type is determined by the archive file extension. These extensions are recognized: zip, jar, war, ear, rar, tar, taz and tar.gz; these extensions can be in mixed case as well. What if a file with an unknown extension is intended to be used as, say, a tar file? You can create the archive file via the createZip(), createJar() and createTar() system functions, and use that open archive object as the destination:
    zip = createZip('iamdoc.doc');
    copy '*' recursive into zip;
    zip.close();

    tar = createTar('ship.tarball');
    copy '*' recursive into tar;
    tar.close();
The open archive object is important for archiving multiple sources, as discussed in Save Multiple File Sets into a Single Archive.
The copy command has these archiving options: compress, store, manifest, under and strip. The first three options are zip-/jar-specific.
The under option allows you to copy a tree of files under a specific prefix within the archive. Conversely, when copying files out of an archive, you can use strip to strip that prefix. Suppose we have a directory like this:
    C:\src\com\judoscript\
    C:\src\com\judoscript\util\
    C:\src\com\judoscript\parser\
    C:\src_native\
and we want to archive files in C:\src\ into a zip (or tar) file under src_java/ like this:
    src_java/com/judoscript/
    src_java/com/judoscript/util/
    src_java/com/judoscript/parser/
This is the way to do this:
copy '*' in 'C:/src/' recursive into 'src.zip' under 'src_java';
Later, when copying them out of src.zip, we use this:
copy '*' in 'src.zip' recursive to 'C:/x' strip 'src_java/';
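The effect of under and strip on entry names can be reproduced with Python's zipfile module. The sketch below is an illustration only: `archive_under` and `extract_strip` are hypothetical helpers, the file `X.java` is made up, and skipping entries that do not start with the prefix is an assumption about what strip does.

```python
import os
import tempfile
import zipfile


def archive_under(base, zip_path, under):
    """Store every file below base with the given prefix prepended to its name."""
    prefix = under.strip("/") + "/"
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for root, _dirs, files in os.walk(base):
            for name in files:
                full = os.path.join(root, name)
                rel = os.path.relpath(full, base).replace(os.sep, "/")
                zf.write(full, prefix + rel)


def extract_strip(zip_path, dest, strip):
    """Extract entries below the prefix, removing the prefix from their paths."""
    prefix = strip.strip("/") + "/"
    with zipfile.ZipFile(zip_path) as zf:
        for info in zf.infolist():
            if info.is_dir() or not info.filename.startswith(prefix):
                continue  # assumption: entries outside the prefix are skipped
            rel = info.filename[len(prefix):]
            out = os.path.join(dest, *rel.split("/"))
            os.makedirs(os.path.dirname(out), exist_ok=True)
            with zf.open(info) as src, open(out, "wb") as dst:
                dst.write(src.read())


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        base = os.path.join(tmp, "src")
        os.makedirs(os.path.join(base, "com", "judoscript", "util"))
        with open(os.path.join(base, "com", "judoscript", "util", "X.java"), "w") as f:
            f.write("class X {}\n")
        zip_path = os.path.join(tmp, "src.zip")
        archive_under(base, zip_path, "src_java")
        with zipfile.ZipFile(zip_path) as zf:
            names = zf.namelist()
        out = os.path.join(tmp, "x")
        extract_strip(zip_path, out, "src_java/")
        restored = os.path.isfile(os.path.join(out, "com", "judoscript", "util", "X.java"))
        return names, restored
```

A production version would also reject entry names containing `..` before writing to disk; the sketch leaves that out for brevity.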
For zip or jar files, by default files are compressed. If you want to just store the files without compressing them, such as when creating Java executable jar files, use the store option. The compress option is also available but is almost always redundant. For jar files, you can also specify a manifest text along the way. The following is an example to create a Java executable jar:
Listing 15.14 make_xjar.judo |
---|
copy '*.java, *.properties' in 'C:/temp/classes/' recursive into 'judo.jar' store manifest [[* Manifest-Version: 1.0 Main-Class: judo Created-By: James Jianbo Huang (c) 2001-(* #year *) *]] ; |
Save Multiple File Sets into a Single Archive
The copy target in the into clause can be an open archive object, returned by the createZip(), createJar() and createTar() system functions. Hence, you can easily copy multiple sets of files into a single archive. The under clause is also handy for organizing files stored within the archive. For instance, I have a source file directory, a documentation directory and an example directory. Every day I make a backup file with this structure:
src/ docs/ examples/
This is easily done with Judo:
Listing 15.15 backup.judo |
---|
zf = createZip('~/archives/work-'+Date().fmtDate('yyyyMMdd')+'.zip'); copy '*' in 'c:/src/' except '*/alfa*, */beta*, */save/*' recursive noHidden echo into zf under 'src/'; copy '*' in 'c:/docs/' except '*/alfa*, */beta*, */save/*' recursive noHidden echo into zf under 'docs/'; copy '*' in 'c:/examples/' except '*/alfa*, */beta*, */save/*' recursive noHidden echo into zf under 'examples/'; zf.close(); |
In the first line, a new zip file is created based on the date. Then, three sets of files are copied under different folder names before finally the zip archive is closed (and saved). Don't forget to close it!
When copying multiple sets of files into a single archive, it is possible to have duplicate files. If this is acceptable, specify the dupOk option; otherwise the copy fails.
So far, we have seen how the copy command can copy files between file systems and archives. In fact, it can copy public internet resources as well. All you have to do is specify a URL as the source, most likely an HTTP or FTP URL. Only a single URL can be given as the source. You can still save it to a location in the file system or into an archive.
To copy the resource to a file, like copying a single file from the file system, you can specify a directory or a file path name:
C:\>cd z
C:\z>java judo -x "copy 'http://www.yahoo.com/index.html'"
C:\z>java judo -x "copy 'http://www.yahoo.com/index.html' to 'i.html'"
C:\z>java judo -x "copy 'http://www.yahoo.com'"
C:\z>java judo -x "copy 'http://www.yahoo.com/'"
C:\z>dir
 Volume in drive C is Local Disk
 Volume Serial Number is 8097-678E
 Directory of C:\z
09/03/2004 10:22p <DIR> .
09/03/2004 10:22p <DIR> ..
09/03/2004 10:21p 36,682 default.htm
09/03/2004 10:22p 36,676 i.html
09/03/2004 10:22p 36,676 index.html
 3 File(s) 110,034 bytes
If no file name is specified in the URL, Judo uses a default file name, "default.htm". If the URL does include a file name, such as index.html in the example, that name is used. Sometimes the last part of the URL is not really a file name, for instance, http://finance.yahoo.com/q/cq?s=%5edji+%5eixic+beas+goog, but Judo will simply use cq as the target file name. Therefore, if you are copying a resource whose URL is dynamic in nature, it is better to provide a target file name explicitly.
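The naming rule just described can be approximated in a few lines of Java with java.net.URI. This is a sketch of the observable behavior only, not Judo's code; the class and method names are hypothetical, and treating any path that ends in a slash as having no file name is an assumption.

```java
import java.net.URI;

public class UrlFileNameSketch {
    static final String DEFAULT_NAME = "default.htm";

    /** Returns the last path segment of the URL, or the default name if there is none. */
    static String targetFileName(String url) {
        // getPath() excludes the query string, so "?s=..." never becomes part of the name.
        String path = URI.create(url).getPath();
        if (path == null || path.isEmpty() || path.endsWith("/")) {
            return DEFAULT_NAME;
        }
        return path.substring(path.lastIndexOf('/') + 1);
    }

    public static void main(String[] args) {
        System.out.println(targetFileName("http://www.yahoo.com/index.html")); // index.html
        System.out.println(targetFileName("http://www.yahoo.com"));            // default.htm
        System.out.println(targetFileName("http://www.yahoo.com/"));           // default.htm
        System.out.println(targetFileName(
            "http://finance.yahoo.com/q/cq?s=%5edji+%5eixic+beas+goog"));      // cq
    }
}
```

The last call shows why a dynamic URL deserves an explicit target: the derived name cq says nothing about the content.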
For static internet resources, you can retain the path of the remote resource in the local file system or archives via the keepDirs option:
C:\z>java judo -x "copy 'http://dir.yahoo.com/Computers_and_Internet/index.html' keepDirs"
C:\z>dir Computers_and_internet
 Volume in drive C is Local Disk
 Volume Serial Number is 8097-678E
 Directory of C:\z\Computers_and_internet
09/03/2004 10:30p <DIR> .
09/03/2004 10:30p <DIR> ..
09/03/2004 10:30p 22,589 index.html
 1 File(s) 22,589 bytes
This feature, coupled with the techniques in 25. SGML and JSP Scraping, can be used to construct a web crawler efficiently.
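The effect of keepDirs on the local target path can be illustrated with java.nio.file. The sketch below only computes where a download would land; it does not fetch anything, and the class name, method name and boolean flag are hypothetical stand-ins for the Judo option. It assumes the URL path ends in a file name.

```java
import java.net.URI;
import java.nio.file.Path;
import java.nio.file.Paths;

public class KeepDirsSketch {
    /** Computes the local target for a URL, with or without the remote directory path. */
    static Path localTarget(Path baseDir, String url, boolean keepDirs) {
        String path = URI.create(url).getPath();
        // Drop the leading slash so the remote path resolves relative to baseDir.
        String relative = path.startsWith("/") ? path.substring(1) : path;
        if (!keepDirs) {
            // Without keepDirs only the file name survives.
            relative = relative.substring(relative.lastIndexOf('/') + 1);
        }
        return baseDir.resolve(relative);
    }

    public static void main(String[] args) {
        String url = "http://dir.yahoo.com/Computers_and_Internet/index.html";
        System.out.println(localTarget(Paths.get("z"), url, true));
        System.out.println(localTarget(Paths.get("z"), url, false));
    }
}
```

A crawler built on this idea would mirror each fetched page at its remote path, so relative links between saved pages keep working.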
Network resources can also be copied into archives. The following example emulates copying part of the Yahoo! Computers and Internet directory into both a zip file and a tar file.
Listing 15.16 download_yahoo_dir.judo |
---|
tar = createTar('yahoo_comp.tar.gz'); zip = createZip('yahoo_comp.zip'); urls = [ 'http://dir.yahoo.com/Computers_and_Internet/index.html', 'http://dir.yahoo.com/Computers_and_Internet/Software/index.html', 'http://dir.yahoo.com/Computers_and_Internet/Macintosh/index.html', 'http://dir.yahoo.com/Computers_and_Internet/Internet/index.html', 'http://dir.yahoo.com/Computers_and_Internet/Internet/WAIS/index.html' ]; for u in urls { copy u into tar keepDirs; copy u into zip keepDirs; } tar.close(); zip.close(); |
After execution, the zip archive has these files:
    0 Fri Sep 03 22:35:06 PDT 2004 Computers_and_Internet/
22721 Fri Sep 03 22:35:06 PDT 2004 Computers_and_Internet/index.html
    0 Fri Sep 03 22:35:06 PDT 2004 Computers_and_Internet/Software/
23306 Fri Sep 03 22:35:06 PDT 2004 Computers_and_Internet/Software/index.html
    0 Fri Sep 03 22:35:06 PDT 2004 Computers_and_Internet/Macintosh/
25034 Fri Sep 03 22:35:06 PDT 2004 Computers_and_Internet/Macintosh/index.html
    0 Fri Sep 03 22:35:06 PDT 2004 Computers_and_Internet/Internet/
20739 Fri Sep 03 22:35:06 PDT 2004 Computers_and_Internet/Internet/index.html
    0 Fri Sep 03 22:35:06 PDT 2004 Computers_and_Internet/Internet/WAIS/
 8981 Fri Sep 03 22:35:06 PDT 2004 Computers_and_Internet/Internet/WAIS/index.html
Judo provides built-in encryption for files and data, based on the javax.crypto package included in JDK 1.4 and up. In Judo, encryption and decryption are password-based; they are provided via these system functions:
function encryptFile
function decryptFile
function encrypt
function decrypt
function setCryptoClassName
The encrypt and decrypt functions can take byte arrays, strings or java.io.InputStream as input, and produce the encrypted or decrypted result as a byte array.
The default implementation uses MD5 and DES encryption, implemented in class com.judoscript.util.PBEWithMD5AndDES. If you have highly confidential information to safeguard, you can provide your own crypto class via the system function setCryptoClassName(); the crypto class must extend com.judoscript.util.PBEBase and implement its encrypt() and decrypt() methods.
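To give a feel for what password-based encryption with MD5 and DES looks like at the javax.crypto level, here is a minimal round-trip sketch using the standard JCE algorithm name PBEWithMD5AndDES. It is not the source of Judo's com.judoscript.util.PBEWithMD5AndDES class: the class name, the fixed salt and the iteration count are assumptions chosen for illustration, and a real application should use a random salt. Note that DES and MD5 are weak by current standards, which is one reason the pluggable crypto class matters.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import javax.crypto.Cipher;
import javax.crypto.SecretKey;
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.PBEKeySpec;
import javax.crypto.spec.PBEParameterSpec;

public class PbeSketch {
    private static final String ALGORITHM = "PBEWithMD5AndDES";
    // Illustrative values only (assumptions); this algorithm requires an 8-byte salt.
    private static final byte[] SALT = {1, 2, 3, 4, 5, 6, 7, 8};
    private static final int ITERATIONS = 1000;

    /** Runs the cipher in the given mode with a key derived from the password. */
    private static byte[] crypt(int mode, String password, byte[] input) {
        try {
            SecretKey key = SecretKeyFactory.getInstance(ALGORITHM)
                    .generateSecret(new PBEKeySpec(password.toCharArray()));
            Cipher cipher = Cipher.getInstance(ALGORITHM);
            cipher.init(mode, key, new PBEParameterSpec(SALT, ITERATIONS));
            return cipher.doFinal(input);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("encryption or decryption failed", e);
        }
    }

    static byte[] encrypt(String password, byte[] plain) {
        return crypt(Cipher.ENCRYPT_MODE, password, plain);
    }

    static byte[] decrypt(String password, byte[] encrypted) {
        return crypt(Cipher.DECRYPT_MODE, password, encrypted);
    }

    public static void main(String[] args) {
        byte[] secret = "top secret".getBytes(StandardCharsets.UTF_8);
        byte[] encrypted = encrypt("my password", secret);
        byte[] decrypted = decrypt("my password", encrypted);
        System.out.println(new String(decrypted, StandardCharsets.UTF_8));
    }
}
```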
Big files are hard to transfer or to save on different media. Downloading a 550MB file, for instance, may pose great problems for less-than-fast connections. Sometimes you may want to back up a 1GB file onto 650MB CD-ROMs. Judo provides a file chopping and assembling utility just for this purpose.
|