diff/cmp for remote files

December 24, 2009

rdiff, rcmp – diff remote files

rdiff and rcmp extend the diff and cmp utilities to remote files.
Each file or directory argument is either a remote filename of the form [[user@]host1:]file, or a local filename.

Note: the scp program is used to retrieve a remote file. Setting up ssh authentication (~/.ssh/authorized_keys on the remote host) may be required. Google ‘ssh authorized keys’ for info.
Note2: rdiff can be hard-linked to rcmp (ie: ln rdiff rcmp). The name the script is invoked by determines whether it runs diff or cmp.

Example usages:
    $ rdiff -b produser@prodhost:/home/prod/bin/xyz.sh xyz.sh
    $ rcmp produser@prodhost:/home/prod/javalib/xyz.jar $HOME/dev/java/lib/xyz.jar
    $ rcmp ruser@rhost:/usr/local/abc:def /usr/local/abc:def
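
The remote-or-local decision can also be sketched in plain sh, without perl. This is only an illustration of the rule described in Note2 of the script header, not the script's actual test; the function name is_remote is made up for the example. Like the script's patterns, it assumes a remote path is absolute (host:/path).

```sh
# is_remote FILE: succeed if FILE looks like [user@]host:/path.
# (is_remote is a hypothetical helper, not part of rdiff.)
is_remote() {
    case "$1" in
      /*)   return 1;;   # a leading '/' always means a local file
      *:/*) return 0;;   # "host:/path" or "user@host:/path"
      *)    return 1;;   # anything else is treated as local
    esac
}

for f in produser@prodhost:/home/prod/bin/xyz.sh xyz.sh /usr/local/abc:def; do
    if is_remote "$f"; then echo "remote: $f"; else echo "local:  $f"; fi
done
```

The first case arm is what lets a local name containing a colon, such as /usr/local/abc:def, be compared against its remote copy.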

#! /bin/sh
#  Name:        rdiff, rcmp
#  Synopsis:    rdiff [-b] [[user@]host1:]file1 [[user@]host2:]file2
#               rcmp [[user@]host1:]file1 [[user@]host2:]file2
#  Description: Run diff/cmp on remote files.
#               Note: This script uses scp or rcp to copy remote files.
#                     (~/.ssh/authorized_keys, or ~/.rhosts may need to be configured)
#               Note2: Simple pattern matching is used to determine if a file is remote or local.
#                      A remote filename matches a "user@host:" or "host:" prefix pattern.  To force
#                      a match for a local file, specify the full path with the leading '/'.

# Copy (remote) or link (local) file $1 to temp file $2
getfile()
{
        # Remote if the name has a host: prefix and does not start with '/'
        if perl -e 'exit !((($ARGV[0] =~ /.*\@.*:\/.*/) || ($ARGV[0] =~ /.*:\/.*/)) &&
                         !($ARGV[0] =~ /^\//));' "$1"; then
                scp "$1" "$2"
                # rcp "$1" "$2"
        else
                ln "$1" "$2";  # This is slight overkill, after all we already
                               # have the file.  But, it simplifies file cleanup...
        fi
}

# Main
case $0 in
  *rdiff) cmd=diff;;
  *rcmp) cmd=cmp;;
  *) echo "$0: must be invoked as rdiff or rcmp" >&2; exit 1;;
esac

while getopts b opt; do
        case $opt in
          b) options="${options} -b";;
          \?) exit 1;;
        esac
done
shift `expr $OPTIND - 1`

# Temp file names (assumed here; any unused names will do)
tmpfile1=/tmp/rdiff1.$$
tmpfile2=/tmp/rdiff2.$$
trap "rm -f $tmpfile1 $tmpfile2" 0 15

if getfile "$1" "$tmpfile1" && getfile "$2" "$tmpfile2"; then
        eval $cmd $options $tmpfile1 $tmpfile2
fi


December 12, 2009

Here’s a fun, short program.
The primary goal is to write a program that outputs the source code of the program when it is executed (ie: clone itself). A secondary goal is to make the program as small as possible.

Here’s a sh/ksh/csh/bash implementation:
cat $0

Here’s a version written in C:
char *s="char *s=%c%s%c; main() { printf(s,34,s,34); putchar(0x0a); }"; main() { printf(s,34,s,34); putchar(0x0a); }

getopts.java is a simple command-line option parser for Java. It is based on the Unix getopts shell utility of the same name.

Example usage: See main() below.

/* Name:        $Id$
 * Description: getopts is a simple command line parser based on the getopts shell parser.
 */

package HHjlib;

public class getopts
{
        private String argv[];
        private int argc;
        private int optind = 0;   // index of the argv element being parsed
        private int optind2 = 0;  // index of the option character within that element
        private String optarg;

        /** Initialize the getopts parser with the command line argument array */
        public getopts(String args[])
        {
                argv = args;
                argc = args.length;
                optind = 0;
                optind2 = 1;
        }

        /** The getOption() method parses positional parameters.
         *  The optstring parameter contains the option characters to be recognized;
         *  if a character is followed by a colon, the option is required to have an argument.
         *  Note: The colon and question mark characters may not be used as option characters.
         *  Each time it is invoked, getOption() returns the next option.  If an option
         *  requires an argument, the argument may be retrieved using the getOptionArg()
         *  method after getOption() processing.  If an invalid option is seen, getOption()
         *  returns a '?'.  If a required argument is missing, it returns a ':'.  When the
         *  end of options is encountered, getOption() returns the null character.
         */
        public char getOption(String optstring)
        {
                // Stop at the first argument that is not of the form "-x..."
                if (optind >= argc || argv[optind].length() < 2 || argv[optind].charAt(0) != '-')
                        return '\0';

                char argv_option = argv[optind].charAt(optind2);
                int optlen = optstring.length();
                for (int indx = 0; indx < optlen; ++indx) {
                        char opt = optstring.charAt(indx);
                        if (argv_option == opt) {
                                int argv_length = argv[optind].length();
                                if ((indx + 1 < optlen) && (optstring.charAt(indx + 1) == ':')) {
                                        if (optind2+1 < argv_length) {
                                                // Argument attached to the option: -ddarg
                                                optarg = argv[optind].substring(optind2+1);
                                                optind++;
                                                optind2 = 1;
                                        } else if (optind+1 < argc) {
                                                // Argument is the next word: -d darg
                                                optarg = argv[optind+1];
                                                optind += 2;
                                                optind2 = 1;
                                        } else {
                                                optarg = Character.toString(opt);
                                                return ':';
                                        }
                                } else {
                                        if (optind2 + 1 < argv_length) {
                                                optind2++;      // More options bundled in this word: -abc
                                        } else {
                                                optind++;
                                                optind2 = 1;
                                        }
                                }
                                return opt;
                        }
                }
                optarg = Character.toString(argv_option);
                return '?';
        }

        /** Return a cmdline option argument or the option character if command line parsing failed */
        public String getOptionArg()
        {
                return optarg;
        }

        /** Return the index of the next cmdline array element to be processed */
        public int getOptionIndex()
        {
                return optind;
        }

        public static void main(String args[])
        {
                getopts cmdline = new getopts(args);
                char option;
                while ((option = cmdline.getOption("abcd:e:")) != '\0') {
                        switch(option) {
                        case 'a':
                        case 'b':
                        case 'c':
                                System.out.println("Option='" + option + "'");
                                break;
                        case 'd':
                        case 'e':
                                System.out.println("Option='" + option + "', Argument='" + cmdline.getOptionArg() + "'");
                                break;
                        case '?':
                                System.err.println("Error: Invalid option '-" + cmdline.getOptionArg() + "'");
                                System.exit(1);
                        case ':':
                                System.err.println("Error: Missing option argument for '-" + cmdline.getOptionArg() + "'");
                                System.exit(1);
                        }
                }
                for (int indx = cmdline.getOptionIndex(); indx < args.length; ++indx) {
                        System.out.println(args[indx]);
                }
        }
}

/*
:!javac %
:!java `basename % .java` -a abc
:!java `basename % .java` -a -b a b c
:!java `basename % .java` -ab -c a b c
:!java `basename % .java` -a -bc a b c
:!java `basename % .java` -abc arg1 arg2
:!java `basename % .java` -ddarg arg1; # option argument attached to the option
:!java `basename % .java` -d darg arg1; # option argument as a separate word
:!java `basename % .java` -ddarg arg1 -e earg; # parsing stops at arg1; -e earg are printed as operands
:!java `basename % .java` -q; # Invalid option
:!java `basename % .java` -d; # error: missing option argument
*/
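
For comparison, here is a rough sketch of the same option loop using the shell's built-in getopts, which the Java class is modelled on. It mirrors main() above with the same "abcd:e:" option string; the function name parse_demo is made up for the example.

```sh
# parse_demo ARGS...: mirror main() of getopts.java using the shell built-in.
# (parse_demo is a hypothetical name used only for this example.)
parse_demo() {
    OPTIND=1    # restart parsing each time the function is called
    while getopts abcd:e: option; do
        case $option in
          a|b|c) echo "Option='$option'";;
          d|e)   echo "Option='$option', Argument='$OPTARG'";;
          \?)    return 1;;
        esac
    done
    shift $((OPTIND - 1))
    for arg; do echo "$arg"; done   # remaining operands
}

parse_demo -ab -d darg arg1 arg2
```

In the shell, OPTIND and OPTARG play the roles of getOptionIndex() and getOptionArg().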

Here’s a very simple utility program that I use to debug how the shell interpreter parses quoted arguments in shell scripts. Shell quoted strings can be one of the trickiest things to get right in a script. White-space characters in filenames and directory names, string concatenation, and nested single, double, and back-tick quotes are only a few of the complications!

Note: To compile the code    $ cc printargs.c -o printargs

Here’s an example of debugging using printargs:
  $ cp $file $target

This works fine as long as neither the $file nor the $target name contains white space characters. But if one does, the command will fail. For example: turn on tracing with ‘set -x’ and set file='Star Trek IV.mp4'. The trace will display ‘+ cp Star Trek IV.mp4 destdir’ which looks right, but actually is wrong. Let’s prefix the cp command with printargs to see why.

$ file='Star Trek IV.mp4'; target=destdir
$ printargs cp $file $target
+ printargs cp Star Trek IV.mp4 destdir
1 : 'cp'
2 : 'Star'
3 : 'Trek'
4 : 'IV.mp4'
5 : 'destdir'

The copy command is actually trying to copy three files {Star, Trek, IV.mp4} to the dest directory. It should be just one file ('Star Trek IV.mp4'). The fix is to quote both $file and $target ie: cp "$file" "$target".

$ file='Star Trek IV.mp4'; target=destdir
$ printargs cp "$file" "$target"
1 : 'cp'
2 : 'Star Trek IV.mp4'
3 : 'destdir'

After debugging, remove the printargs prefix from the command and the script should be good to go.
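
The same check can be tried without compiling anything. The sketch below is a shell-function stand-in for printargs (it implements none of the -c, -i, -q options) and shows how word splitting changes the argument count.

```sh
# Shell stand-in for printargs: print each argument on its own numbered line.
printargs() {
    n=1
    for arg in "$@"; do
        printf "%d : '%s'\n" "$n" "$arg"
        n=$((n + 1))
    done
}

file='Star Trek IV.mp4'; target=destdir
printargs cp $file $target        # unquoted: 5 arguments
printargs cp "$file" "$target"    # quoted:   3 arguments
```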

/* Name:        %I%
 * Synopsis:    printargs [-ciq] command command_options command_args ...
 * Description: printargs displays cmdline arguments.  This is useful for debugging
 *              shell scripts that escape the space, backslash, quote chars ('"`)
 *              and other wildcard characters (eg: "[*?]").
 *                Options: -c  Echo the command line
 *                         -i  Do not display argument indices
 *                         -q  Do not display single quotes surrounding each argument
 */

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(int argc, char *argv[])
{
        int opt;
        int echo_cmdline = 0;
        int display_indices = 1;
        int indx;
        char *quotes = "'";

        while ((opt = getopt(argc, argv, "ciq")) != -1) {
                switch(opt) {
                case 'c':
                        echo_cmdline = 1;
                        break;
                case 'i':
                        display_indices = 0;
                        break;
                case 'q':
                        quotes = "";
                        break;
                default:
                        fprintf(stderr, "%s: Invalid option '%c'\n", argv[0], optopt);
                        fprintf(stderr, "Syntax: %s [-ciq] command command_options command_args ...\n", argv[0]);
                        exit(1);
                }
        }

        if (echo_cmdline == 1) {
                printf("CmdLine:");
                for (indx = optind; indx < argc; ++indx)
                        printf(" %s", argv[indx]);
                putchar('\n');
        }

        /* argv[argc] is NULL, so stop at argc - 1 */
        for (indx = optind; indx < argc; ++indx) {
                if (display_indices == 1)
                        printf("%d : ", indx - optind + 1);
                printf("%s%s%s\n", quotes, argv[indx], quotes);
        }
        return 0;
}



1. hgrep displays grep results in highlights.
2. rgrep runs grep on all the files in a directory tree.

Example usage:
  $ hgrep main *.c; # Search for main in all C files
  $ rgrep -e '[Ff]lour' -e cornstarch $HOME/recipes; # Search all files in the recipes directory for flour or cornstarch

— hgrep —

#! /bin/sh
#  Synopsis:    hgrep pattern file ...
#  Description: Highlighted grep

smso=`tput smso`
rmso=`tput rmso`
pattern="$1"; shift
grep "$pattern" "$@" | sed -e "s@\(${pattern}\)@${smso}\1${rmso}@g"
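
To see what the sed substitution does without a terminal, the standout escape sequences can be swapped for visible markers. This is a small sketch only; the function name highlight and its marker arguments are made up for the example, and like hgrep it assumes the pattern does not contain an '@'.

```sh
# highlight PATTERN ON OFF: wrap every match of PATTERN on stdin in ON...OFF.
# (highlight and its marker arguments are hypothetical, for illustration only.)
highlight() {
    sed -e "s@\($1\)@$2\1$3@g"
}

echo 'int main(void); /* main entry */' | highlight main '[' ']'
# prints: int [main](void); /* [main] entry */
```

Passing `tput smso` and `tput rmso` as the two markers gives the same effect as hgrep.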

— rgrep —

#! /bin/sh
# Name:         $Id: rgrep,v 1.1 2009/10/11 20:37:21 hhong Exp $
# Synopsis:     rgrep [-i] pattern directory ...
# Synopsis:     rgrep [-i] -e pattern [-e pattern ...] directory ...
# Description:  Run grep on all the files of a directory tree

while getopts e:i opt; do
        case $opt in
          e) pattern="$pattern -e '$OPTARG'";;
          i) ignorecase='-i';;
        esac
done
shift `expr $OPTIND - 1`
if [ -z "$pattern" ]; then
        pattern="'$1'"; shift
fi

for arg; do
        if [ ! -d "$arg" ]; then
                echo "$0: Error '$arg' is not a directory." >&2
                exit 1
        fi
done

find "$@" -type d -print | while read dir; do
        edir=`ls -A "$dir"`
        if [ -n "$edir" ]; then  # Check for empty directory
                # Use eval in case $pattern or $dir contain whitespace characters
                eval grep $ignorecase $pattern "'$dir'/*" /dev/null
        fi
done

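
As an aside, the same tree search can be sketched with find's -exec ... {} + form, which hands file names straight to grep and so avoids the eval quoting. This is an alternative technique, not the rgrep script above; the function name rgrep_find is made up, and it takes a single pattern only.

```sh
# rgrep_find PATTERN DIR...: grep every regular file under the given directories.
# (rgrep_find is a hypothetical name; this is not the author's rgrep.)
rgrep_find() {
    pattern=$1; shift
    # /dev/null forces grep to print file names even when only one file is passed
    find "$@" -type f -exec grep -e "$pattern" /dev/null {} +
}
```

Usage would look like: rgrep_find '[Ff]lour' "$HOME/recipes"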

Perform number base conversions and calculations.
Note: The script behaves differently depending on the name it is invoked by (the executable may still be the same file). Hard-linking the following names to the math.sh script ($ ln math.sh …) provides these shortcuts…

  • x2d (hex to decimal), x2o (hex to octal), x2b (hex to binary)
  • d2x (decimal to hex), d2o (decimal to octal), d2b (decimal to binary)
  • o2x (octal to hex), o2d (octal to decimal), o2b (octal to binary)
  • b2x (binary to hex), b2d (binary to decimal), b2o (binary to octal)

Example usage:
  $ d2x;  # (Interactive) decimal to hex conversions
  $ x2d a b c d e f 10/2
  $ echo ‘f+1 ff+2’ | x2d

#! /bin/sh
#  Synopsis:    math.sh [-i inputbase] [-o outputbase] [math-expression | Number]  ...
#  Description: math.sh performs base number conversions, and 'bc -l' calculations
#               Note1: Spaces should be avoided in math-expressions.  A space character in
#                      an expression may be interpreted as a command line field separator.
#               Note2: Renaming or (hard/symbolic) ln'ing to the script alters the default input
#                     and output base (ie: b2o,b2d,b2x, o2b,o2d,o2x, d2b,d2o,d2x, x2b,x2o,x2d)
#                                          (b=binary, o=octal, d=decimal, x=hexadecimal)
#  Example Usage: $ math.sh;  # Interactive decimal calculator
#                 $ math.sh '1.11111111^2'
#                 $ echo 'ff ff+1' | x2d

case $0 in
  *b2o) ibase=2;  obase=8;;
  *b2d) ibase=2;  obase=10;;
  *b2x) ibase=2;  obase=16;;
  *o2b) ibase=8;  obase=2;;
  *o2d) ibase=8;  obase=10;;
  *o2x) ibase=8;  obase=16;;
  *d2b) ibase=10; obase=2;;
  *d2o) ibase=10; obase=8;;
  *d2x) ibase=10; obase=16;;
  *x2b) ibase=16; obase=2;;
  *x2o) ibase=16; obase=8;;
  *x2d) ibase=16; obase=10;;
  *)    ibase=10; obase=10   # Default: decimal calculator
        while getopts i:o: opt; do
                case "$opt" in
                  i) ibase=$OPTARG;;
                  o) obase=$OPTARG;;
                  \?) exit 1;;
                esac
        done
        shift `expr $OPTIND - 1`;;
esac

{
echo "obase=$obase"
echo "ibase=$ibase"
if [ $# -eq 0 ]; then
        cat         # No arguments: read expressions from stdin
else
        echo "$*"
fi | tr 'a-f ' 'A-F\n'
} | bc -l

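
One detail worth spelling out is the order of the two echo lines: obase is set before ibase, because once ibase=16 is in effect bc would read a following "obase=10" as hex 10, ie: 16. The sketch below only generates the bc input so that it can be inspected; the function name bc_program is made up for the example.

```sh
# bc_program IBASE OBASE EXPR...: print the text that would be fed to bc.
# (bc_program is a hypothetical helper, for illustration only.)
bc_program() {
    ibase=$1; obase=$2; shift 2
    echo "obase=$obase"     # must come first, while bc still reads decimal
    echo "ibase=$ibase"
    # bc wants upper-case hex digits and one expression per line
    echo "$*" | tr 'a-f ' 'A-F\n'
}

bc_program 16 10 ff ff+1
```

Piping that output into bc -l prints 255 and 256, which is what x2d ff ff+1 does.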

This is a utility script that operates similarly to the find file (ff.sh) script previously presented. But instead of searching a directory tree, it searches the components of a $PATH environment variable.

Example usage:
    $ ffpath javac jdb; # Find the Java compiler and debugger
    $ ffpath -m myscript1 myscript2; # Display the myscript1 and myscript2 scripts using more
    $ ffpath -v PERL5LIB -s';' -e 'pod2text' sybdbi.pm; # Find and convert the Sybase DBi Perl doc to text

#! /bin/sh
#  Name:        $Id: ffpath,v 1.4 2008/10/09 12:18:51 hhong Exp $
#  Synopsis:    ffpath.sh [-v envvar] [-s field_separator] [-[l1cm]] [-e prog] file ...
#  Description: Search $PATH for file(s)
#               Options: -v envvar  Specify environment variable
#                        -s fs      Use 'fs' for the directory field separator
#                        -l         list file info using 'ls -l' long format (default)
#                        -1         list one file per line ie: 'ls -1'
#                        -c         Use 'cat' to display the file
#                        -m         Use 'more' to display the file
#                        -e prog    Run 'prog' on file

while getopts v:s:l1cme: opt; do
        case "$opt" in
          v) envvar=$OPTARG;;
          s) sep=$OPTARG;;
          l) display='lsl';;
          1) display='ls1';;
          c) display='cat';;
          m) display='more';;
          e) display='exec'; exec=$OPTARG;;
          \?) exit 1;;
        esac
done
shift `expr $OPTIND - 1`

for file; do
        # Split the search path into one directory per line
        eval echo \$${envvar:-PATH} | tr "${sep:-:}" '\n' | while read dir; do
                if [ -f "$dir/$file" ]; then
                        case "${display:-lsl}" in
                          ls1) ls -1 "$dir/$file";;
                          lsl) ls -l "$dir/$file";;
                          echo) echo "$dir/$file";;
                          cat) cat "$dir/$file";;
                          more) less -rXe "$dir/$file";;
                          exec) eval $exec "$dir/$file";;
                        esac
                fi
        done
done
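
The core of the script is the split-and-test loop. Here is a stripped-down sketch of just that part, taking the directory list and separator as arguments so it can be tried on any list; the function name search_path is made up for the example, and empty list components (which mean the current directory in $PATH) are not handled.

```sh
# search_path LIST SEP FILE: print DIR/FILE for every DIR in LIST that holds FILE.
# (search_path is a hypothetical helper; LIST is a string such as "$PATH".)
search_path() {
    echo "$1" | tr "$2" '\n' | while read -r dir; do
        if [ -f "$dir/$3" ]; then
            echo "$dir/$3"
        fi
    done
}

search_path "$PATH" : sh    # lists every sh found on $PATH
```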