LinuxDevCenter.com

The Writer's Workbench

Checking text for readability

The style command analyzes the writing style of a text. It runs a number of readability tests, outputs their results, and reports some statistics about the sentences in the text.



Give as an argument the name of the text file to check. For example, to check the readability of the file banquet-speech.txt, you'd type:

$ style banquet-speech.txt RET

Like diction, style reads text from the standard input if no file name is given.

The readability formulas that style applies, and whose scores it outputs, are as follows:

  • the Kincaid Formula, originally developed for Navy training manuals and a good readability measure for technical documentation;
  • the Automated Readability Index (ARI);
  • the Coleman-Liau Formula;
  • the Flesch Reading Ease formula, which gives an approximation of readability from 0 (difficult) to 100 (easy);
  • the Fog Index, which gives a school grade reading level;
  • the WSTF Index, a readability indicator for German documents; and
  • the Wheeler-Smith Index, Lix formula, and SMOG-Grading tests, all readability indicators that give a school grade reading level.
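
To get a feel for what these scores measure, here is a minimal sketch that computes three of them from raw counts with awk. It uses the commonly published forms of the ARI, Kincaid, and Flesch Reading Ease formulas; the function names are made up for this example and are not part of style, and style's own way of counting syllables and sentences may give somewhat different numbers.

```shell
# Hypothetical helpers, not part of style. Each one takes raw counts
# and prints a score using the commonly published form of the formula.

# ari CHARS WORDS SENTENCES
ari() {
    awk -v c="$1" -v w="$2" -v s="$3" \
        'BEGIN { printf "%.1f\n", 4.71 * (c / w) + 0.5 * (w / s) - 21.43 }'
}

# kincaid SYLLABLES WORDS SENTENCES
kincaid() {
    awk -v y="$1" -v w="$2" -v s="$3" \
        'BEGIN { printf "%.1f\n", 11.8 * (y / w) + 0.39 * (w / s) - 15.59 }'
}

# flesch SYLLABLES WORDS SENTENCES
flesch() {
    awk -v y="$1" -v w="$2" -v s="$3" \
        'BEGIN { printf "%.1f\n", 206.835 - 84.6 * (y / w) - 1.015 * (w / s) }'
}

# Example: a passage of 100 words, 400 letters, 150 syllables, 5 sentences.
ari 400 100 5       # prints 7.4
kincaid 150 100 5   # prints 9.9
flesch 150 100 5    # prints 59.6
```

Longer words and longer sentences push ARI and Kincaid up (harder) and push the Flesch score down (also harder), which is why the scales run in opposite directions.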

The sentence characteristics of the text that style outputs are as follows:

  • Number of characters
  • Number of words, their average length, and average number of syllables
  • Number of sentences and average length in words
  • Number of short and long sentences
  • Number of paragraphs and average length in sentences
  • Number of questions and imperatives
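
A few of these figures are easy to approximate by hand. The sketch below is only an illustration, not how style itself works: text_stats is a hypothetical function that counts whitespace-separated words and treats any word ending in a period, exclamation point, or question mark as the end of a sentence.

```shell
# text_stats: hypothetical helper that reads text on standard input and
# prints a word count, a sentence count, and the average sentence length.
text_stats() {
    awk '
        {
            for (i = 1; i <= NF; i++) {
                words++
                # Crude rule: a word ending in ., ! or ? closes a sentence.
                if ($i ~ /[.!?]$/) sentences++
            }
        }
        END {
            # Avoid dividing by zero when no sentence ending was seen.
            if (sentences == 0) sentences = 1
            printf "words=%d sentences=%d avg=%.1f\n", words, sentences, words / sentences
        }'
}

printf '%s\n' 'Thank you all for coming. Is everyone seated? Then let us begin the banquet.' | text_stats
# prints: words=14 sentences=3 avg=4.7
```

Abbreviations such as "Dr." will fool this rule, so treat the sentence count as a rough estimate.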

Finding difficult sentences

To output just "difficult" sentences of a text, use the -r option followed by a number; style will output only those sentences whose ARI readability index is greater than the number you give.

For example, to output all sentences in the file banquet-speech.txt whose ARI is greater than 20, type:

$ style -r 20 banquet-speech.txt RET
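
If you're curious what such a filter involves, the following sketch scores each sentence separately with the commonly published ARI formula and prints the ones above a threshold. The function name hard_sentences is hypothetical, the sentence splitting is crude, and style's own per-sentence calculation may differ, so don't expect identical results.

```shell
# hard_sentences MIN: hypothetical helper that reads text on standard input
# and prints each sentence whose approximate ARI is greater than MIN.
hard_sentences() {
    # Join all input lines, then put each sentence on its own line.
    tr '\n' ' ' |
    awk '{ gsub(/[.!?]+ */, "&\n"); printf "%s", $0 }' |
    awk -v min="$1" '
        NF > 0 {
            sub(/ +$/, "")                      # drop trailing blanks
            letters = $0
            gsub(/[^A-Za-z0-9]/, "", letters)   # keep letters and digits only
            # ARI for a single sentence: the words-per-sentence term is just NF.
            score = 4.71 * (length(letters) / NF) + 0.5 * NF - 21.43
            if (score > min) print
        }'
}

printf '%s\n' 'We ate. The extraordinarily complicated administrative responsibilities overwhelmed everybody.' |
    hard_sentences 20
```

Here only the second sentence is printed; "We ate." scores far below 20.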

Displaying long sentences

You can use style to output sentences longer than a certain length by giving that length, in words, as an argument to the -l option.

For example, to output all sentences longer than 14 words in the file banquet-speech.txt, type:

$ style -l 14 banquet-speech.txt RET
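
The same idea can be sketched with standard tools. This is an approximation for illustration, not what style does internally: long_sentences is a hypothetical function that splits text at periods, exclamation points, and question marks, then prints the sentences with more words than the number you give.

```shell
# long_sentences MIN: hypothetical helper that reads text on standard input
# and prints each sentence containing more than MIN words.
long_sentences() {
    # Join all input lines, then put each sentence on its own line.
    tr '\n' ' ' |
    awk '{ gsub(/[.!?]+ */, "&\n"); printf "%s", $0 }' |
    awk -v min="$1" 'NF > min { sub(/ +$/, ""); print }'
}

printf '%s\n' 'Short one. This sentence has exactly six words.' | long_sentences 4
# prints: This sentence has exactly six words.
```

To try it on a file, you'd redirect the file to the function's standard input, as in `long_sentences 14 < banquet-speech.txt`.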

Spelling

Two additional commands that Walker says were part of the Writer's Workbench have long been standard on Linux: look and spell. Both tools work on the system dictionary file, /usr/dict/words (on many newer systems, /usr/share/dict/words). This file is nothing more than a word list (albeit a very large one), sorted in alphabetical order and containing one word per line. Words that are correct regardless of case are listed in lower-case letters, and words that rely on some form of capitalization in order to be correct (such as proper nouns) appear in that form.

The look tool outputs words in the system dictionary that begin with the text you give as an argument. It's useful for checking to see which words begin with a particular phrase or prefix.

For example, to list all the words in the dictionary that begin with the text "homew", you'd type:

$ look homew RET

This command will output words such as "homeward" and "homework."
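
Because the dictionary is just a list with one word per line, a prefix search is simple to imitate. The sketch below is a rough stand-in for look rather than a copy of it; prefix_words is a hypothetical function that ignores case and works on any word list you name, here a small temporary file made up for the example.

```shell
# prefix_words PREFIX FILE: hypothetical helper that prints every line of
# FILE beginning with PREFIX, ignoring case.
prefix_words() {
    awk -v p="$1" 'index(tolower($0), tolower(p)) == 1' "$2"
}

# A tiny made-up word list, so the example does not depend on the
# location of the system dictionary.
words=$(mktemp)
printf '%s\n' home homeward homework honey > "$words"

prefix_words homew "$words"
# prints:
# homeward
# homework

rm -f "$words"
```

Unlike look, which can take advantage of the file being sorted, this version reads every line of the list.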

When you're unsure whether or not a particular word is spelled correctly, use spell to find out. It reads from the standard input and outputs any words that don't appear in the system dictionary file -- so if a word is potentially misspelled, it will be echoed back on the screen after you type it.

For example, to check if the word "occurance" is spelled correctly, you'd type:

$ spell RET
occurance RET
occurance
^D
$

In this example, spell echoed the word "occurance" after it was typed, meaning that this word was not in the system dictionary and therefore was likely a misspelling. A Control-D was typed to exit spell and return to the shell prompt.
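
The basic idea behind this kind of check, comparing the words in a text against a sorted word list, can be sketched with tr, sort, and comm. This is an illustration only, not the actual implementation of spell; unknown_words is a hypothetical function, and it lower-cases everything, so it ignores the capitalization rules described above and splits contractions such as "don't" into separate pieces.

```shell
# unknown_words DICT: hypothetical helper that reads text on standard input
# and prints, one per line, the words that do not appear in the list DICT.
unknown_words() {
    dict_sorted=$(mktemp)
    # Lower-case the list and sort it in the C locale so comm sees a
    # consistent ordering on both of its inputs.
    tr '[:upper:]' '[:lower:]' < "$1" | LC_ALL=C sort -u > "$dict_sorted"
    # One word per line, lower-cased, blank lines dropped, sorted and unique;
    # comm -23 then keeps the lines found only in the text.
    tr -cs '[:alpha:]' '\n' | tr '[:upper:]' '[:lower:]' | sed '/^$/d' |
        LC_ALL=C sort -u | LC_ALL=C comm -23 - "$dict_sorted"
    rm -f "$dict_sorted"
}

# A tiny made-up dictionary for the example.
dict=$(mktemp)
printf '%s\n' an is occurrence rare this > "$dict"

printf '%s\n' 'This occurance is rare.' | unknown_words "$dict"
# prints: occurance

rm -f "$dict"
```

With a real word list in place of the temporary file, you could check a whole document by redirecting it to the function's standard input.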

Next week: How to make and manage documents with SGML-tools.

Michael Stutz was one of the first reporters to cover Linux and the free software movement in the mainstream press.





