Text Filters and Formatters for DOS (text converters, formatters, filters, sort, case)
Also see HTML utilities. Ratings: [* fair] to [* * * * * excellent]
* * * *
LM is a text file "line manipulator", yet this description sorely understates its uses. Frankly, the command line syntax of this program will be unintelligible to most casual users, and the included documentation is of little value to the novice. It does not resemble the "standard" syntax of most DOS programs. I only include LM in my list because of its combination of small size (37K) and versatility: it can perform nearly any type of text file formatting. I have written a couple of batch files to do the following:
1. Strip carriage returns out of text files (for import into a word processor)
LM -atTYffilename -^strip.txt
2. To reflow text to a given margin width (e.g., 65):
LM -atTYffilename -^tmp.txt -; -W1w65ftmp.txt > out.txt
del tmp.txt
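If LM's switches are too cryptic, the same reflow idea can be sketched in a few lines of Python. This is not how LM works internally; it is a minimal illustration that assumes paragraphs are separated by blank lines. The function name `reflow` is my own, and the default width of 65 is just the example margin used above.

```python
import textwrap


def reflow(text: str, width: int = 65) -> str:
    """Join each paragraph's lines, then re-wrap the paragraph to `width` columns."""
    # Normalize DOS line endings so that only "\n" remains.
    text = text.replace("\r\n", "\n")
    # Assumption: paragraphs are separated by one or more blank lines.
    paragraphs = [p for p in text.split("\n\n") if p.strip()]
    # Collapse all whitespace inside a paragraph, then wrap at word boundaries.
    wrapped = [textwrap.fill(" ".join(p.split()), width=width) for p in paragraphs]
    return "\n\n".join(wrapped) + "\n"
```

Calling `reflow(open("filename").read(), 65)` and writing the result to a new file gives roughly the effect of the second batch file, without the temporary file.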
LM also can strip or append lines, perform file renaming, file finding, and even play music from text files containing special musical notation. Some sample commands and batch files are provided which the novice user can easily modify to his or her needs.
Info for power users (from the documentation):
"The main operations supported are grip/non-grip, search/replace, synchronised line appendage from other files, input/output line selection by line numbers or passwords, spaces/empty lines absorption, filewise update or renaming, line width imposition and etc. Input lines can also be taken from only the command line."
* * *
Fixtext is a command line utility that performs two general functions. It can convert among DOS, UNIX and MAC text formats. It can also translate (replace) characters or strings within a text. For example, it can convert uppercase letters to lowercase letters, or it can convert ASCII characters to their ANSI (Windows) equivalents. There is only one hitch to translation: the user must write the translation tables. While this permits a great deal of flexibility and customization, it also requires effort to create and edit the translation tables (not difficult, but time consuming). A few specific operations such as trimming leading/trailing line spaces and expanding tabs are hard coded as command line switches. I'll soon be adding free utilities that perform specific translation tasks (uppercase to lowercase, etc.). reviewed 6-14-97.
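To make the two functions concrete, here is a rough Python sketch of line-ending conversion and table-driven character translation. It does not reproduce Fixtext's table format or switches, which I am not describing here; the names `convert_newlines`, `translate` and `UPPER_TO_LOWER` are made up for illustration, and "mac" is assumed to mean the classic Mac convention of a lone carriage return.

```python
# Line terminators for the three text formats (assumed conventions).
ENDINGS = {"dos": b"\r\n", "unix": b"\n", "mac": b"\r"}


def convert_newlines(data: bytes, target: str = "unix") -> bytes:
    """Convert any mix of DOS, UNIX and MAC line endings to the target format."""
    # Collapse everything to "\n" first; CRLF must be handled before a lone CR.
    normalized = data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")
    return normalized.replace(b"\n", ENDINGS[target])


# A hypothetical translation table: uppercase ASCII letters to lowercase.
UPPER_TO_LOWER = bytes.maketrans(
    b"ABCDEFGHIJKLMNOPQRSTUVWXYZ",
    b"abcdefghijklmnopqrstuvwxyz",
)


def translate(data: bytes, table: bytes) -> bytes:
    """Replace each byte according to a 256-entry translation table."""
    return data.translate(table)
```

The table is the time-consuming part: every character pair has to be spelled out by hand, just as the review says.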
* * *
Typical uses of the sort program are sorting address lists, dictionaries, and other databases. Includes both 32-bit (for 386 computers and above) and 16-bit programs. Both handle very large files.
* * *
Paginate is one of the few programs I'll probably never use, but which I can still highly recommend to a specific audience. Paginate is best described as a comprehensive command line ASCII document formatter. As its name suggests, it can paginate a document for printing. But Paginate can also add page headers and footers, indent paragraphs, produce tables, wrap text at defined margins, etc. In order to generate a formatted document, one has to insert instruction codes within the document to be processed. Frankly, a word processor requires much less work and time for most tasks, and I suspect most home users will have little need for Paginate. But others will undoubtedly love it. Well designed for its purpose. reviewed 6-17-97
*
If you're like me, you hate using Ghostscript to view text-only PostScript documents because it's a disk space eater and can be a bit imposing for the novice. PSX is a small, simple command line PostScript document-to-text converter that I found somewhere on a BBS. It does a very inconsistent job of translation (sometimes good, sometimes very poor), but if you just want to explore the contents of a PostScript text file you downloaded off the Net, this app may suffice. It is donationware. I suspect you won't find the latest version (psx102e) anywhere on the Net except here. I couldn't locate an ftp site that carried it, although I saw one mention of it (in a Norwegian discussion group!) and apparently it is on Compuserve.
* * *
Many documents on the Web are ASCII text and lines are often broken by carriage returns at 60-80 columns. If you import this text into many word processors the carriage returns are retained and interpreted as paragraph marks. This usually fouls your attempts to apply special paragraph styles in your word processor because each line is now considered a paragraph. To get rid of these paragraph marks you need to run the text file through a filter prior to importing into your word processor. Some word processors may include such a filter, but mine (WinWord 2.0) doesn't.
The REMOVE package contains two versions: DOS and Windows (3.1). It strips single carriage returns while preserving actual paragraph boundaries (i.e., double carriage returns). I prefer to use LM for this task, but REMOVE is more user friendly. This package also contains CONV, which can convert among DOS, MAC, and UNIX text formats.
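The rule REMOVE applies (drop single line breaks, keep double ones) is simple enough to sketch with one regular expression in Python. This is my own illustration of the idea, not REMOVE's code; the function name `strip_single_breaks` is hypothetical.

```python
import re


def strip_single_breaks(text: str) -> str:
    """Replace lone line breaks with spaces; leave blank-line paragraph breaks alone."""
    # Normalize DOS line endings so that only "\n" remains.
    text = text.replace("\r\n", "\n")
    # A newline with no newline before it and an ordinary character after it
    # is a soft break inside a paragraph, so it becomes a space.
    return re.sub(r"(?<!\n)\n(?=[^\n])", " ", text)
```

After this pass each paragraph is one long line, which is what a word processor expects on import.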
* * *
Xray extracts plain text from binary files. I'm often curious about what text is contained in executables, DLLs, etc. It can also be used as a crude means of getting plain text from any word processor file, although formatting is lost in the process.
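The usual way to do this kind of extraction is to scan for runs of printable characters. The Python sketch below shows that approach; I don't know that Xray works this way, and the minimum run length of 4 and the name `extract_text` are my own assumptions.

```python
import re


def extract_text(data: bytes, min_len: int = 4) -> list[str]:
    """Return every run of at least `min_len` printable ASCII characters."""
    # Printable ASCII is space (0x20) through tilde (0x7e); tabs are allowed too.
    pattern = re.compile(rb"[\x20-\x7e\t]{%d,}" % min_len)
    return [run.decode("ascii") for run in pattern.findall(data)]
```

For example, `extract_text(open("program.exe", "rb").read())` would list the messages and file names embedded in an executable (`program.exe` is a placeholder name).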
* * *
Review in the works. The command line syntax in the docs is inaccurate. Use the help switch for proper syntax.
* * *
The following utilities were selected from Timo Salmi's (Garbo FTP administrator) larger text filters package. These are text filters with very specific tasks. Although I use them infrequently, I reach for them more often than I would expect. Text editors just can't perform these tasks easily.
| CUTW | Omit/extract whole words from lines based on their position in line (1st word, 2nd word) |
| CUT | Omit/extract columns from files. |
| SLICE | Omit/extract rows from files. |
| COL | Take file and convert strings to a single column. |
| DETAB | Convert tab characters to spaces (adjustable number) |
| CONCAT | Join files side by side (columnar) |
My only minor 'style' gripe with this package is that most of the utilities rely on piping and redirection instead of switch options for input and output files (but I guess that's how a true filter should behave!). reviewed 6-7-97
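To show what these small filters do, here are rough Python equivalents of four of them. They are sketches of the operations described in the table, not ports of Timo Salmi's programs; the function names and the 1-based, inclusive numbering are my assumptions.

```python
def cut_words(line: str, keep: set[int]) -> str:
    """Like CUTW: keep only the words whose 1-based position is in `keep`."""
    words = line.split()
    return " ".join(w for i, w in enumerate(words, start=1) if i in keep)


def cut_columns(line: str, first: int, last: int) -> str:
    """Like CUT: keep character columns first..last (1-based, inclusive)."""
    return line[first - 1:last]


def slice_rows(lines: list[str], first: int, last: int) -> list[str]:
    """Like SLICE: keep rows first..last (1-based, inclusive)."""
    return lines[first - 1:last]


def detab(line: str, tab_size: int = 8) -> str:
    """Like DETAB: expand tab characters to spaces at the given tab stops."""
    return line.expandtabs(tab_size)
```

A true filter would read lines from standard input and print the results to standard output, so these functions could be wrapped in a loop over `sys.stdin` and used with piping and redirection in the same way.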
* *
Wrap is a useful filter that you may use more frequently than you imagine. If you've ever tried to view text files using the TYPE or MORE commands, you'll have noticed that long lines are wrapped at the right edge of the screen. Unfortunately, DOS is not smart enough to break lines without also breaking words apart. Wrap is a simple filter which prevents words from being split at the right margin. It is also useful when redirecting output to files. The result is a much more readable standard output. I liked it enough to port it to OS/2.
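The idea behind a filter like Wrap is a greedy word wrap: keep adding words to the current line until the next one would cross the margin. Here is a small Python sketch of that technique. It is not Wrap's source, and the default width of 79 columns and the name `wrap_line` are my assumptions.

```python
def wrap_line(line: str, width: int = 79) -> list[str]:
    """Break one long line into several, splitting only between words."""
    out: list[str] = []
    current = ""
    for word in line.split():
        if not current:
            # First word on a line is always accepted, even if longer than width.
            current = word
        elif len(current) + 1 + len(word) <= width:
            # The word still fits after a separating space.
            current += " " + word
        else:
            # The word would cross the margin: start a new line with it.
            out.append(current)
            current = word
    out.append(current)  # An empty input line yields one empty output line.
    return out
```

Note that this simple version collapses runs of spaces and drops leading indentation, which a more careful filter might preserve.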
(c)1997 Richard L. Green