LinuxDevCenter.com

Building Unix Tools with Ruby
Pages: 1, 2, 3

Read Command-Line Options and Arguments

The specification presented in an earlier section lists several options that csvt should understand. Your script can access the options and arguments in two ways: by reading them directly from the ARGV array (passed to your script automatically) or by using the GetoptLong library to parse ARGV for you. The latter method is preferred: it's easier and saves time.
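
For comparison, here is a minimal sketch of the first approach, reading the argument list by hand. The helper name parse_by_hand is hypothetical and only -e/--extract is recognized. It ignores forms such as --extract=1,5,6 and abbreviated long options, which is part of the reason the GetoptLong route is less work.

```ruby
# Hypothetical helper: a hand-rolled parser that understands only
# -e/--extract followed by a separate value. Everything else is
# treated as a file name.
def parse_by_hand(argv)
  extract = nil
  files   = []
  args    = argv.dup            # leave the caller's array untouched

  until args.empty?
    arg = args.shift
    case arg
    when "-e", "--extract"
      extract = args.shift      # nil if the value is missing
    else
      files << arg
    end
  end

  { extract: extract, files: files }
end

result = parse_by_hand(["-e", "1,5,6", "data.csv"])
puts result[:extract]           # the raw column list, still a string
puts result[:files].join(" ")
```

In a real script you would pass ARGV instead of a literal array.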



GetoptLong is not loaded by default, so you must explicitly require it before you can use it:

require 'getoptlong'

After your script imports getoptlong, you will also need to create a new instance of GetoptLong:

opts = GetoptLong.new(
    [ "--extract",          "-e",   GetoptLong::REQUIRED_ARGUMENT ],
    [ "--remove",           "-r",   GetoptLong::REQUIRED_ARGUMENT ],
    [ "--help",             "-h",   GetoptLong::NO_ARGUMENT ],
    [ "--usage",            "-u",   GetoptLong::NO_ARGUMENT ],
    [ "--version",          "-v",   GetoptLong::NO_ARGUMENT ]
)

The arguments passed to GetoptLong.new are arrays, one per option. Each array holds the long and the short name of the option and an argument flag that fine-tunes the behavior of the option parser. The example above shows how the csvt option specification is turned into code. It is a good habit to define both long and short options, but if for some reason that isn't possible or desired, you can leave out the name you do not want and list only the other one. The argument flag can be set to REQUIRED_ARGUMENT, NO_ARGUMENT, or OPTIONAL_ARGUMENT. GetoptLong uses these settings to decide how it should interpret the contents of ARGV.
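
The following sketch shows an option table being put to work. The helper parse_options and the options --verbose and -q are hypothetical and are not part of csvt; they are there only to illustrate an entry with a single name and the OPTIONAL_ARGUMENT flag. Each array in this sketch lists only the names it wants, followed by the flag.

```ruby
require 'getoptlong'

# Hypothetical helper: parse an argument list with a small option table
# and return the [name, argument] pairs in the order GetoptLong reports them.
def parse_options(argv)
  ARGV.replace(argv)   # GetoptLong reads the global ARGV
  opts = GetoptLong.new(
    [ "--extract", "-e", GetoptLong::REQUIRED_ARGUMENT ],
    [ "--verbose",       GetoptLong::OPTIONAL_ARGUMENT ],  # long name only
    [ "-q",              GetoptLong::NO_ARGUMENT ]         # short name only
  )

  seen = []
  opts.each { |opt, arg| seen << [opt, arg] }
  seen
end

# The short and the long form are reported under the same name,
# the first one listed in the array.
short_form = parse_options(["-e", "1,2"])
long_form  = parse_options(["--extract", "1,2"])
puts short_form == long_form
```

Because both spellings arrive under one name, a single handler per option is enough.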

Once you have a properly initialized instance of the option parser, you can add code that checks which options have been selected and what mistakes have been made. GetoptLong provides a lot of help here; your job is limited to defining a few variables at the top level of the script and handling any errors that may occur at this stage.

First, let's define those variables:

version        = "0.0.1" # used by the --version or -v option handler
extract_f      = false   # set to true when --extract or -e are used
extract_args   = []      # stores the list of arguments of --extract or -e
remove_f       = false   # set to true when --remove or -r are used
remove_args    = []      # stores the list of arguments of --remove or -r
ex_options_n   = 0       # used to store the number of mutually exclusive
                         # options, when > 1, the script will terminate
have_options_f = false   # set to true when at least one option is used

Next, you need to check which options have been used. The block of code that tests this, and sets the parameters that change the behavior of csvt, follows the pattern shown below:

begin
    opts.each do |opt, arg|
        case opt
            when option
                 ... option handler ...
            when option
                 ... option handler ...
        end
    end

rescue
    ... handle exceptions ...
end

The begin-rescue-end construct that wraps the opts.each do loop adds an exception handler (the rescue clause), which lets the script handle unexpected situations gracefully. We need that handler because we do not want the user to see the backtrace printed by the Ruby interpreter when GetoptLong raises an exception. A short error message and a help screen are much more user-friendly.

Let's get down to the details. The opts.each do |opt, arg| loop reads options and their arguments, if any are expected:

begin
    opts.each do |opt, arg|

Should the user pass an undefined option (e.g., -w), GetoptLong will display an error message about the unsupported option and raise an exception, which stops the execution of the script unless you rescue it. This sounds a bit drastic, but as you will see in a moment, you can handle that situation easily.
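
As a rough illustration of that behavior, the sketch below feeds a few argument lists to a parser and reports what happened. The helper scan is hypothetical. Setting quiet to true, which silences GetoptLong's own message, is a choice made for this sketch; the script in this article leaves it at the default.

```ruby
require 'getoptlong'

# Hypothetical helper: returns :ok, or the class of the GetoptLong error
# raised while scanning the given argument list.
def scan(argv)
  ARGV.replace(argv)   # GetoptLong reads the global ARGV
  opts = GetoptLong.new(
    [ "--extract", "-e", GetoptLong::REQUIRED_ARGUMENT ],
    [ "--help",    "-h", GetoptLong::NO_ARGUMENT ]
  )
  opts.quiet = true    # suppress GetoptLong's own message on $stderr

  begin
    opts.each { |_opt, _arg| }   # option handlers would go here
    :ok
  rescue GetoptLong::Error => e  # parent class of the parser's errors
    e.class
  end
end

p scan(["-h"])          # :ok
p scan(["-w"])          # GetoptLong::InvalidOption
p scan(["--extract"])   # GetoptLong::MissingArgument
```

Rescuing GetoptLong::Error rather than using a bare rescue keeps unrelated bugs in the handlers from being mistaken for bad command-line input.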

If the value of opt is one of the known options (e.g., --extract), it will be examined by the following case control structure, which sets the extract_f flag and checks which columns from the source file the user wants to print.

Notice that it does not matter if the user uses the long or the short version of the --extract option. GetoptLong treats them both as the same option, which means that you only need to write one handler.

case opt
    when "--extract"
        extract_f    = true
        extract_args = arg.split(",")

        tmp = 0
        extract_args.each do |column|
            begin
                extract_args[tmp] = Integer(column)
                tmp += 1
            rescue
                $stderr.print "csvt: non-integer column index\n"
                printusage(1)
            end
        end

        ex_options_n   += 1
        have_options_f  = true

The --extract option handler sets the extract_f flag, splits the argument that follows the option (remember, it is a list of numbers separated with commas), and checks that every item in the list is an integer index. When all goes well, the ex_options_n exclusive options counter is incremented and the have_options_f flag is set to indicate that at least one option was selected by the user. The counter is used later to detect the case where the user selects mutually exclusive options.
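
The split-and-validate step can also be written more compactly. The sketch below is one possible variant, not the code used in this article: parse_columns is a hypothetical helper, and it passes an explicit base of 10 to Integer, which the handler above does not do. Without the base, Integer("010") is read as octal and yields 8.

```ruby
# Hypothetical helper: turn "1,5,6" into [1, 5, 6], or return nil if any
# item in the list is not a decimal integer.
def parse_columns(list)
  list.split(",").map { |column| Integer(column, 10) }
rescue ArgumentError
  nil
end

p parse_columns("1,5,6")   # [1, 5, 6]
p parse_columns("4,x")     # nil
p parse_columns("010")     # [10]
```

A caller would print the error message and the help screen when the result is nil.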

Because the --extract and --remove options are quite similar in the way they work, their handlers are also almost identical (see below).

    when "--remove"
        remove_f    = true
        remove_args = arg.split(",")

        tmp = 0
        remove_args.each do |column|
            begin
                remove_args[tmp] = Integer(column)
                tmp += 1
            rescue
                $stderr.print "csvt: non-integer column index\n"
                printusage(1)
            end
        end

        ex_options_n   += 1
        have_options_f  = true

Requests for csvt version information are handled by the code shown below. Notice that it doesn't matter whether other options were used. Once --version or -v is found, csvt prints version information and exits with 0 (no errors).

    when "--version"
        print $0, ", version ", version, "\n"
        exit(0)

Should the user need some help on csvt usage, our script displays the help screen and exits with 0.

    when "--help"
        printusage(0)

    when "--usage"
        printusage(0)
    end
end

Once the loop ends, it's time to check for possible errors like mutually exclusive and missing options. Both are considered errors and result in displaying an error message followed by the help screen.

#################################################################
# test for mutually exclusive options: --extract and --remove

if ex_options_n > 1
    $stderr.print $0, ": cannot use --extract (-e) and --remove (-r) together\n"
    printusage(1)
end

#################################################################
# test for missing options

if have_options_f == false
    printusage(1)
end
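
One thing to be aware of: because ex_options_n counts every use of -e or -r, giving the same option twice (for example, -e 1 -e 2) also trips the first test. A possible alternative, sketched below, tests the two flags directly. The helper option_error and the wording of its second message are assumptions made for this sketch; the script in this article simply prints the help screen when no option is given.

```ruby
# Hypothetical helper: returns an error message for an invalid combination
# of flags, or nil when the combination is acceptable.
def option_error(extract_f, remove_f)
  if extract_f && remove_f
    "cannot use --extract (-e) and --remove (-r) together"
  elsif !extract_f && !remove_f
    "one of --extract (-e) or --remove (-r) is required"
  end
end

message = option_error(true, true)
$stderr.puts "csvt: #{message}" if message
```

With this variant the caller decides what to do with the message, typically printing it and then calling printusage(1).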

The last piece of the option-processing block of code is the exception handler, which prints the help screen, exits csvt, and returns error code 1.

rescue
    # all other errors
    printusage(1)
end
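
It may look as if the calls to printusage inside the begin block would be intercepted by this handler, since exit works by raising an exception. They are not: a bare rescue catches only StandardError and its subclasses, and exit raises SystemExit, which is not one of them. The sketch below demonstrates this; exit_inside_rescue is a hypothetical helper, and the outer rescue exists only so that the sketch can inspect the exit status instead of terminating.

```ruby
# A bare rescue catches StandardError and its subclasses only.
# exit raises SystemExit, which passes straight through it.
def exit_inside_rescue
  begin
    exit(1)              # stands in for printusage(1)
  rescue
    return :swallowed    # not reached: SystemExit is not a StandardError
  end
rescue SystemExit => e
  e.status               # rescued here only so the sketch keeps running
end

p exit_inside_rescue     # 1
```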

Your code should look like this now:

require 'getoptlong'

    version        = "0.0.1" # used by the --version or -v option handler
    extract_f      = false   # set to true when --extract or -e are used
    extract_args   = []      # stores the list of arguments of --extract or -e
    remove_f       = false   # set to true when --remove or -r are used
    remove_args    = []      # stores the list of arguments of --remove or -r
    ex_options_n   = 0       # used to store the number of mutually exclusive
                             # options, when > 1, the script will terminate
    have_options_f = false   # set to true when at least one option is used 

    def printusage(error_code)
        print "csvt -- extract columns of data from a CSV (Comma-Separated Values) file\n"
        print "Usage: csvt [POSIX or GNU style options] file ...\n\n"
        print "POSIX options                     GNU long options\n"
        print "    -e col[,col][,col]...             --extract col[,col][,col]...\n"
        print "    -r col[,col][,col]...             --remove col[,col][,col]...\n"
        print "    -h                                --help\n"
        print "    -u                                --usage\n"
        print "    -v                                --version\n\n"

        print "Examples: \n"
        print "csvt -e 1,5,6 file            print column 1,5 and 6 from file\n"
        print "csvt --extract 4,1 file       print column 4 and 1 from file\n"
        print "csvt -r 2,7,1 file            print all columns except 2,7 and 1 from file\n"
        print "csvt --remove 6,0 file        print all columns except 6 and 0 from file\n"
        print "cat file | csvt --remove 6,0  print all columns except 6 and 0 from file\n\n"
        print "Send bug reports to bugs@foo.bar\n"
        print "For licensing terms, see source code\n"
        exit(error_code)
    end

    opts = GetoptLong.new(
        [ "--extract",     "-e",   GetoptLong::REQUIRED_ARGUMENT ],
        [ "--remove",      "-r",   GetoptLong::REQUIRED_ARGUMENT ],
        [ "--help",        "-h",   GetoptLong::NO_ARGUMENT ],
        [ "--usage",       "-u",   GetoptLong::NO_ARGUMENT ],
        [ "--version",     "-v",   GetoptLong::NO_ARGUMENT ]
    )

    begin
        opts.each do |opt, arg|
            case opt
                when "--extract"
                    extract_f    = true
                    extract_args = arg.split(",")

                    tmp = 0
                    extract_args.each do |column|
                        begin
                            extract_args[tmp] = Integer(column)
                            tmp += 1
                        rescue
                            $stderr.print "csvt: non-integer column index\n"
                            printusage(1)
                        end
                    end

                    ex_options_n   += 1
                    have_options_f  = true

                when "--remove"
                    remove_f    = true
                    remove_args = arg.split(",")

                    tmp = 0
                    remove_args.each do |column|
                        begin
                            remove_args[tmp] = Integer(column)
                            tmp += 1
                        rescue
                            $stderr.print "csvt: non-integer column index\n"
                            printusage(1)
                        end
                    end

                    ex_options_n   += 1
                    have_options_f  = true

                when "--help"
                    printusage(0)

                when "--usage"
                    printusage(0)

                when "--version"
                    print "csvt, version ", version, "\n"
                    exit(0)
            end
        end

        #################################################################
        # test for mutually exclusive options: --extract and --remove

        if ex_options_n > 1
            $stderr.print "csvt: cannot use --extract (-e) and --remove (-r) together\n"
            printusage(1)
        end

        #################################################################
        # test for missing options 

        if have_options_f == false
            printusage(1)
        end

    rescue 
        printusage(1)
    end
