 Published on ONLamp.com (http://www.onlamp.com/)


FreeBSD Basics

Working With Text

10/18/2000

Once you have your FreeBSD system up and running, what do you spend most of your time doing? Why, working with and creating files, of course. In today's article, I want to concentrate on manipulating text in files; we'll start with some useful commands that came with your FreeBSD system, then we'll examine some of the utilities in the ports collection.

One of the most useful utilities for quick manipulation of text files is the cat utility. Be careful when using cat, especially if you're quick with the Enter key, as it is easy to destroy existing files. By default, cat will display the specified file on your screen. That is,

cat filename

will display the contents of a file named filename.

You can also use cat to display multiple files like so:

cat filename1 filename2 filename3

However, even the fastest speed-reader will miss most of the output of the cat command if the file(s) are longer than one screen. To force cat to display its output one screen at a time, pipe its output to the more command like so:

cat file1 file2 | more

Or, save yourself some typing and let the more utility directly display the files for you like so:

more file1 file2

Both commands will get the job done, but if you use cat, it won't tell you when file1 ends and file2 begins; if you just use more, it will. I usually use the more utility to display files, unless I'm absolutely certain that the file is only a few lines long. If the file you wish to view has been gzipped and has a .gz extension, use the utilities zcat or zmore to view the file without having to gunzip it first.
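
As a small sketch of the points above, the commands below build two throwaway files in a temporary directory (the directory and the names file1 and file2 are only for illustration), join them with cat, and then read a gzipped copy with zcat without unpacking it first.

```shell
# Work in a scratch directory so nothing in your home directory is touched.
dir=$(mktemp -d)
printf 'first file\n' > "$dir/file1"
printf 'second file\n' > "$dir/file2"

# cat prints the files back to back, with no marker between them.
cat "$dir/file1" "$dir/file2"

# Compress a copy of file1, keeping the original, then read it with zcat.
gzip -c "$dir/file1" > "$dir/file1.gz"
zcat "$dir/file1.gz"
```

Swapping zcat for zmore in the last line should page through the compressed file one screen at a time.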

You can also use cat as a quick and dirty editor to create a new file. By default, cat reads a file and sends its results to your terminal; to force it to instead read your input and send it to a file, you need to use the > redirector. Try this in your home directory:

cat > test

Notice that you've lost your prompt as the cat command is waiting for input to send to a file called test. Type what you want to appear in the file, and when you're finished, press Enter, then Ctrl-d.

This is a test file
with a couple of lines of text

and a blank line.
^d

You should now have your prompt back. To view this new file, use cat without the redirector:

cat test

Simple, wasn't it? However, this is where cat can be dangerous. If I already had a file named test in this directory, the > redirector would happily overwrite that file without warning me first. I would be very unhappy if that file happened to be my thesis, or my resume, or any other file I may have been attached to -- something to keep in mind when using cat with the > redirector.
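
One possible safeguard, assuming you work in a Bourne-style shell such as sh or bash, is the shell's noclobber option (set -C). The sketch below uses a scratch directory and an illustrative file named test; C shell users would type set noclobber instead.

```shell
# Create a file we care about in a scratch directory.
dir=$(mktemp -d)
printf 'my thesis\n' > "$dir/test"

# With noclobber switched on, > refuses to overwrite an existing file.
set -C
if ! { printf 'oops\n' > "$dir/test"; } 2>/dev/null; then
    echo "test already exists, left untouched"
fi

# Switch the option back off and confirm the original text survived.
set +C
cat "$dir/test"
```

While noclobber is on, the >| redirector overrides it for the occasions when you really do mean to replace a file.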

You can also use cat to join multiple files into one file. This command:

cat file1 file2 file3 > bigfile

will create a new file called bigfile out of the contents of file1, file2, and file3 in that order. If there already was a file called bigfile, it will be destroyed (there's that > redirector, again), but you will still have your original three files.

You can also use cat with the >> redirector, which appends to (adds to the end of) a file. If the file does not already exist, it will be created for you. So if you type:

cat file4 file5 >> bigfile

the contents of file4 and file5 will be added to bigfile without destroying the earlier contents of file1 through file3, while

cat file4 file5 >> newbigfile

will create a newbigfile that contains the contents of file4 and file5. You'll notice that the >> redirector is much safer than the > redirector. However, if you're a fast typist, look before you press Enter and make sure you really did type >> instead of >.
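
To watch both redirectors at work without risking real files, here is a minimal sketch that runs in a scratch directory. The one-line contents written into file1 through file5 are made up for the demonstration.

```shell
# Build five small files in a scratch directory.
dir=$(mktemp -d)
cd "$dir"
for n in 1 2 3 4 5; do
    printf 'contents of file%s\n' "$n" > "file$n"
done

# > creates bigfile (or replaces it) from the first three files.
cat file1 file2 file3 > bigfile

# >> adds to the end of bigfile and leaves the first three lines alone.
cat file4 file5 >> bigfile

# All five lines should now appear, in order.
cat bigfile
```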

Remember back in high school when you had to write those 1,500-word essays? There may still be times when you need to know how many words or lines are in a file. You don't need a fancy editor to do this for you, and fortunately, your FreeBSD system is much quicker at counting than you are. If I were to leave this article and type:

wc wo*
      97     689    3717 working_with_text

the wc or word count command would tell me that this article has 97 lines, 689 words, and 3717 bytes of information. If I wanted to see how this compared to the other articles in my working directory, I could type this:

cd ~/articles
wc *
     198    1326    8078 buildx.txt
     361    2411   13447 change_prompt
     431    2763   16147 cron_intro
     245    1556    9146 desktop.txt
     263    1768   10426 ethereal.txt
     351    2104   12855 howto_rtfm_part1
     374    2246   14018 howto_rtfm_part2
     363    2642   15523 intro_to
     334    1392    7987 loginshe.txt
     267    1528    8837 mountfs.txt
     321    2319   13712 networking_with_tcpip
     305    1524    9235 nfs.txt
     268    2101   11761 permissions_part1
     367    2235   12373 permissions_part2
     386    2063   12430 ppp.txt
     257    1478    8979 sharity.txt
     310    1738    9777 useful.txt
      97     689    3717 working_with_text
    5498   33883  198448 total

If I just need to know the number of words, I would type:

wc -w filename

and to just know the number of lines:

wc -l filename

Using wc is a lot quicker than opening up an editor, hunting for my mouse under my piles of scrap paper, and trying to find the correct menu option.
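
If you want the counts inside a script, one common approach is to feed the file to wc on standard input, so that only the number is printed. The sketch below uses a small made-up sample file; the variable names are illustrative.

```shell
# A four-line sample: three lines of text and one blank line.
f=$(mktemp)
printf 'one two three\nfour five\n\nsix\n' > "$f"

# Lines, words, and bytes, followed by the file name.
wc "$f"

# Reading from standard input drops the file name from the output.
lines=$(wc -l < "$f")
words=$(wc -w < "$f")
echo "lines: $lines words: $words"
```

Some versions of wc pad the number with leading spaces, so trim it (for example with tr -d ' ') before comparing it as a string.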

To actually number the lines in a file, you can use a switch with the cat utility. Let's say I'm teaching a friend how to write a simple script; it will be easier on us both if I can refer to a line number when pointing out syntax. This command:

cd ~/perlscripts
cat -n square
     1	#!/usr/bin/perl -w
     2	
     3	print shift() **2, "\n";
     4	

will show that this perlscript has four lines: two lines with text and two blank lines. If I'm only interested in numbering the lines that contain text, I would use this command instead:

cat -b square
     1	#!/usr/bin/perl -w

     2	print shift() **2, "\n";
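
To compare the two switches on a file of your own, the following sketch builds a four-line stand-in script (its contents are made up, laid out like the example above) and numbers it both ways.

```shell
# Two lines of text, each followed by a blank line.
f=$(mktemp)
printf '#!/bin/sh\n\necho hello\n\n' > "$f"

# -n numbers every line, blank or not.
cat -n "$f"

# -b numbers only the lines that contain text.
cat -b "$f"
```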

I could now demonstrate to my friend the simple beauty of a perlscript; that one line of code (line 2 of the cat -b output) creates a script that will calculate the square of a number like so:

./square 43
1849

But you don't want to get me started on Perl; let's continue working with text files.

Let's say you're typing out that memo to your boss and you can't remember if the word "actually" has one or two "l"s. The quickest way to find out is to run the look utility at another virtual terminal like so:

look actual
actual
actualism
actualist
actualistic
actuality
actualization
actualize
actually
actualness

Notice that I just supplied the root word "actual" and received all of the possibilities that could be added to that root, including the one I was looking for. I've yet to find a quicker way to get the correct spelling of a word, along with other possibilities that I may actually prefer.
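
The look utility can also search a sorted word list of your own if you name the file after the search string. The sketch below assumes a tiny made-up list; since look is not installed on every system, it checks for the command first and then shows a grep search that finds the same words.

```shell
words=$(mktemp)
# look expects the list to be sorted; this one is already in order.
printf '%s\n' act actual actuality actually actuary > "$words"

# look is not installed everywhere, so check for it first.
if command -v look >/dev/null 2>&1; then
    look actual "$words"
fi

# grep anchored at the start of the line finds the same words.
grep '^actual' "$words"
```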

However, if you are a terrible speller, you may prefer an interactive spell checker that will check an entire document for you. Both aspell and ispell in the ports collection will do this for you. Both utilities can be run from the command line, and aspell can be integrated into e-mail readers and other editors. Let's take a quick look at both; I'll start with ispell. If you've installed the ports collection, become root, make sure you're connected to the Internet, and type:

cd /usr/ports/textproc/ispell
make && make install

When it's finished installing, leave the superuser account. If you are in the C shell, type:

rehash

Now let's create a quick text file with some spelling mistakes:

cd ~
cat > typos
This is a very quik file
to demunstrate my
terruble spelling.
^d

To spellcheck this file using ispell, type:

ispell typos

which will highlight the first misspelled word and give you various options for dealing with the misspelling like so:

  quik		File: typos

This is a very quik file 

00: quib
01: quick
02: quid
03: quin
etc.
[SP] <number> R)epl A)ccept I)nsert L)ookup U)ncap Q)uit e(X)it or ? for help

Note the menu at the bottom of the screen. Since the correct spelling has been offered, typing the number shown beside it (01 here) will replace "quik" with "quick," and ispell will move on to the next misspelled word; the R)epl option is for typing in a replacement of your own. When you are finished, type "x" to save your changes; if you decide that you preferred your misspellings, use "q" to exit without saving the changes.

You can also add words to the ispell dictionary; this is most useful for acronyms or personal names. To do this, press "i" to insert into the dictionary. These inserts will be stored in a file in your home directory called .ispell_english. To find out about the other useful features of ispell, use the "?" while in ispell, or read its manpage.

Although ispell is easy to use, it won't catch all of your misspellings. If I were a really terrible speller and had written this line in the typos file:

This is a veery kwik file 

ispell would bypass the word "veery" completely and only offer the word "kaik" as a substitute for "kwik".

Let's try aspell on this file. Again, as root and while connected to the Internet, type:

cd /usr/ports/textproc/aspell
make && make install

Don't forget to leave the superuser account and cd back to your home directory when you are finished. Let's quickly overwrite that typos file with the > redirector:

cd
cat > typos
This is a veery kwik file.
^d

The syntax for aspell is a little longer than for ispell; don't forget the word check, or you'll receive a syntax error.

aspell check typos
This is a *veery* kwik file.
1) very             6) veers
2) veer             7) weary
3) Vera             8) every
4) vary             9) verier
5) leery            0) were
i) ignore           I) Ignore all
r) Replace          R) Replace all
a) Add              x) Exit
?

Note that the misspelled word is in asterisks instead of highlighted; it did catch the word "veery" that ispell missed. Also, instead of a menubar at the bottom, the actions are mixed in with the possible spelling options. If we were to continue spell checking this file, aspell would also give a viable alternative to the word "kwik".
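
Both checkers can also print a plain list of suspect words instead of starting an interactive session. The sketch below assumes a current aspell release, which accepts a list subcommand, and an ispell that supports the -l flag; the versions in your ports tree may differ, so check their documentation. Each checker is skipped if it is not installed.

```shell
typos=$(mktemp)
printf 'This is a veery kwik file.\n' > "$typos"

# Print each misspelled word on its own line, if the checker is installed.
if command -v aspell >/dev/null 2>&1; then
    aspell list < "$typos" || echo "aspell could not run; is a dictionary installed?"
fi
if command -v ispell >/dev/null 2>&1; then
    ispell -l < "$typos" || echo "ispell could not run; is a dictionary installed?"
fi
```

The words reported depend on the installed dictionaries, so the two lists may not match.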

Usually I would tell you to read the manpage for aspell to see all of its features, but it does not have one. Instead, you'll have to:

cd /usr/local/share/doc/aspell/man-html

to find its documentation. You could then open up the file manual.html in your favorite web browser and follow the hyperlinks. The manual is well worth browsing, especially if you would like to integrate aspell into your e-mail reader.

In next week's article, we'll take a look at the Webmin utility found in the ports collection.

Dru Lavigne is a network and systems administrator, IT instructor, author and international speaker. She has over a decade of experience administering and teaching Netware, Microsoft, Cisco, Checkpoint, SCO, Solaris, Linux, and BSD systems. A prolific author, she pens the popular FreeBSD Basics column for O'Reilly and is author of BSD Hacks and The Best of FreeBSD Basics.



 

Copyright © 2009 O'Reilly Media, Inc.