SORT(C) XENIX System V SORT(C)
Name
sort - Sorts and merges files.
Syntax
sort [-cmu] [-ooutput] [-ykmem] [-zrecsz] [-dfiMnr] [-b] [-tx]
[+pos1] [-pos2] [files]
Description
sort sorts lines of all the named files together and writes
the result on the standard output. The standard input is
read if - is used as a file name or if no input files are
named.
Comparisons are based on one or more sort keys extracted
from each line of input. By default, there is one sort key,
the entire input line, and ordering is determined by the
collating sequence defined by the locale (see locale(M)).
The following options alter the default behavior:
-c Check that the input file is sorted according to the
ordering rules; give no output unless the file is out
of sort.
-m Merge only; the input files are assumed to be already sorted.
-u Unique: suppress all but one line in each set of lines
having equal keys. This option can result in unwanted
characters being placed at the end of the sorted file.
-ooutput
The argument given is the name of an output file to use
instead of the standard output. This file may be the
same as one of the inputs. There may be optional
blanks between -o and output.
-ykmem
The amount of main memory used by the sort has a large
impact on its performance. Sorting a small file in a
large amount of memory is a waste. If this option is
omitted, sort begins using a system default memory
size, and continues to use more space as needed. If
this option is presented with a value, kmem, sort will
start using that number of kilobytes of memory, unless
the administrative minimum or maximum is violated, in
which case the corresponding extremum will be used.
Thus, -y0 is guaranteed to start with minimum memory.
By convention, -y (with no argument) starts with
maximum memory.
-zrecsz
Causes sort to use a buffer size of recsz bytes for the
Page 1 (printed 2/7/91)
merge phase. Input lines longer than the buffer size
will cause sort to terminate abnormally. Normally, the
size of the longest line read during the sort phase is
recorded and this maximum is used as the record size
during the merge phase, eliminating the need for the -z
option. However, when the sort phase is omitted (-c or
-m options) a system default buffer size is used, and
if this is not large enough, the -z option should be
used to prevent abnormal termination.
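The behavior of -c, -m and -u can be modeled in a few lines of
Python. This is only a sketch of the rules described above, not
the XENIX implementation: the function names are invented for
illustration, and plain Python string comparison stands in for
the locale's collating sequence.

```python
import heapq

def is_sorted(lines):
    """Model of -c: True when no line sorts before its predecessor."""
    return all(a <= b for a, b in zip(lines, lines[1:]))

def merge_sorted(*inputs):
    """Model of -m: merge inputs that are each already sorted."""
    return list(heapq.merge(*inputs))

def unique(sorted_lines, key=lambda line: line):
    """Model of -u: keep one line from each run of equal keys."""
    out = []
    for line in sorted_lines:
        if not out or key(out[-1]) != key(line):
            out.append(line)
    return out

a = ["apple", "cherry", "cherry"]
b = ["banana", "cherry", "date"]
merged = merge_sorted(a, b)
print(is_sorted(merged))  # True
print(unique(merged))     # ['apple', 'banana', 'cherry', 'date']
```

The merge step never re-sorts its inputs, which is why -m is only
safe when every input file is already in order.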
The following options override the default ordering rules.
-d ``Dictionary'' order: only letters, digits and blanks
(spaces and tabs) are significant in comparisons.
Dictionary order is defined by the locale setting (see
locale(M)).
-f Fold lowercase letters into uppercase. Conversion
between lowercase and uppercase letters is governed by
the locale setting (see locale(M)).
-i Ignore non-printable characters in non-numeric
comparisons. Non-printable characters are defined by
the locale setting (see locale(M)).
-M Compare as months. The first three non-blank
characters of the field are folded to upper case and
compared so that ``JAN'' < ``FEB'' < ... < ``DEC''.
Invalid fields compare low to ``JAN''. The -M option
implies the -b option (see below).
-n An initial numeric string, consisting of optional
blanks, an optional minus sign, and zero or more digits
with optional decimal point, is sorted by arithmetic
value. The -n option implies the -b option (see
below). Note that the -b option is only effective when
restricted sort key specifications are in effect.
-r Reverse the sense of comparisons.
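As a rough illustration of the -n, -M and -r rules, the Python
sketch below turns a field into a comparison key. The function
names are hypothetical, and two details are assumptions rather
than statements from this page: a field with no leading numeric
string is given the value 0, and floating point stands in for
whatever arithmetic sort uses internally.

```python
import re

MONTHS = ["JAN", "FEB", "MAR", "APR", "MAY", "JUN",
          "JUL", "AUG", "SEP", "OCT", "NOV", "DEC"]

def numeric_key(field):
    """Model of -n: value of the initial numeric string in the field."""
    # Optional blanks, optional minus sign, digits with optional point.
    text = re.match(r"[ \t]*(-?\d*\.?\d*)", field).group(1)
    if not any(ch.isdigit() for ch in text):
        return 0.0  # assumption: no digits means the value is zero
    return float(text)

def month_key(field):
    """Model of -M: invalid fields rank below JAN."""
    abbrev = field.lstrip(" \t")[:3].upper()
    return MONTHS.index(abbrev) + 1 if abbrev in MONTHS else 0

print(sorted(["10", "9", "-3", "2.5"], key=numeric_key))
# ['-3', '2.5', '9', '10']
print(sorted(["mar", "Jan", "???", "dec"], key=month_key))
# ['???', 'Jan', 'mar', 'dec']
print(sorted(["10", "9", "-3"], key=numeric_key, reverse=True))
# ['10', '9', '-3']  (the effect of adding -r)
```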
When ordering options appear before restricted sort key
specifications, the requested ordering rules are applied
globally to all sort keys. When attached to a specific sort
key (described below), the specified ordering options
override all global ordering options for that key.
The notation +pos1 -pos2 restricts a sort key to one
beginning at pos1 and ending at pos2. The characters at
positions pos1 and pos2 are included in the sort key
(provided that pos2 does not precede pos1). A missing -pos2
means the end of the line.
Specifying pos1 and pos2 involves the notion of a field (a
minimal sequence of characters followed by a field separator
or a newline). By default, the first blank (space or tab)
of a sequence of blanks acts as the field separator. All
blanks in a sequence of blanks are considered to be part of
the next field; for example, all blanks at the beginning of
a line are considered to be part of the first field. The
treatment of field separators can be altered using the
options:
-tx Use x as the field separator character; x is not
considered to be part of a field (although it may be
included in a sort key). Each occurrence of x is
significant (e.g., xx delimits an empty field).
-b Ignore leading blanks when determining the starting and
ending positions of a restricted sort key. If the -b
option is specified before the first +pos1 argument, it
will be applied to all +pos1 arguments. Otherwise, the
b flag may be attached independently to each +pos1 or
-pos2 argument (see below).
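The field rules above can be pictured with a short Python sketch.
It is an approximation for illustration only: split_fields is a
made-up name, and trailing blanks at the end of a line are simply
dropped here.

```python
import re

def split_fields(line, sep=None):
    """Split a line into fields the way the text above describes."""
    if sep is None:
        # Default: each run of blanks is kept as the start of the
        # field that follows it.
        return re.findall(r"[ \t]*[^ \t]+", line)
    # With -tx: every occurrence of x separates two fields.
    return line.split(sep)

print(split_fields("  alpha  beta gamma"))
# ['  alpha', '  beta', ' gamma']
print(split_fields("a::b", sep=":"))
# ['a', '', 'b']

# The b flag amounts to ignoring the blanks that lead each field.
print([f.lstrip(" \t") for f in split_fields("  alpha  beta gamma")])
# ['alpha', 'beta', 'gamma']
```

Because leading blanks belong to a field by default, two lines
that differ only in spacing can sort differently unless -b (or the
b flag) is given.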
Pos1 and pos2 each have the form m.n optionally followed by
one or more of the flags b, d, f, i, n, or r. A starting
position specified by +m.n is interpreted to mean the n+1st
character in the m+1st field. A missing .n means .0,
indicating the first character of the m+1st field. If the b
flag is in effect, n is counted from the first non-blank in
the m+1st field; +m.0b refers to the first non-blank
character in the m+1st field.
A last position specified by -m.n is interpreted to mean the
nth character (including separators) after the last
character of the mth field. A missing .n means .0,
indicating the last character of the mth field. If the b
flag is in effect, n is counted from the last leading blank
in the m+1st field; -m.1b refers to the first non-blank in
the m+1st field.
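The position arithmetic is easier to follow with a worked model.
The Python sketch below computes the key selected by +m.n and
-m.n as a slice of the line. It reflects one reading of the rules
above and is not taken from the sort source; extract_key and its
(m, n, b) tuples are hypothetical names, and ordering flags other
than b are left out.

```python
import re

def field_spans(line, sep=None):
    """Return (start, end) offsets of each field, end exclusive."""
    if sep is None:
        return [m.span() for m in re.finditer(r"[ \t]*[^ \t]+", line)]
    spans, start = [], 0
    for part in line.split(sep):
        spans.append((start, start + len(part)))
        start += len(part) + len(sep)
    return spans

def skip_blanks(line, i):
    while i < len(line) and line[i] in " \t":
        i += 1
    return i

def extract_key(line, pos1, pos2=None, sep=None):
    """pos1 and pos2 are (m, n, b) tuples for +m.n[b] and -m.n[b]."""
    spans = field_spans(line, sep)
    m, n, b = pos1
    if m < len(spans):
        start = spans[m][0]          # first character of field m+1
        if b:
            start = skip_blanks(line, start)
        start += n
    else:
        start = len(line)
    end = len(line)                  # a missing -pos2 means end of line
    if pos2 is not None:
        m, n, b = pos2
        if b and m < len(spans):
            # n counted from the last leading blank of field m+1
            end = skip_blanks(line, spans[m][0]) + n
        elif not b and m <= len(spans):
            # n characters past the last character of field m
            end = (spans[m - 1][1] if m > 0 else 0) + n
    return line[start:end]           # empty if pos2 precedes pos1

line = "alpha beta gamma"
print(repr(extract_key(line, (1, 0, False), (2, 0, False))))  # ' beta'
print(repr(extract_key(line, (1, 0, False), (1, 2, False))))  # ' b'
print(repr(extract_key(line, (1, 0, True), (1, 1, True))))    # 'b'
print(repr(extract_key("root:x:0:0", (2, 0, False), (3, 0, False), sep=":")))  # '0'
```

The second and third calls correspond to +1.0 -1.2 and
+1.0b -1.1b: without b the key carries the separating blank, with
b it is exactly the first non-blank character of the second field.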
When there are multiple sort keys, later keys are compared
only after all earlier keys compare equal. Lines that
otherwise compare equal are ordered with all bytes
significant.
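One way to picture this rule is as a tuple comparison: each line
is ranked by its first key, then its second, and finally by the
raw bytes of the whole line. The Python sketch below is an
illustration under that reading; sort_lines and the two key
functions are invented names, not part of sort.

```python
def sort_lines(lines, key_funcs):
    """Order lines by each key in turn, then by the bytes of the line."""
    def composite(line):
        # Tuples compare element by element, so a later key is only
        # consulted when every earlier key compares equal.
        return tuple(f(line) for f in key_funcs) + (line.encode(),)
    return sorted(lines, key=composite)

def first_field(line):
    return line.split()[0]

def second_field_numeric(line):
    return int(line.split()[1])

lines = ["b 2 x", "a 2 y", "a 1 z", "a 2 x"]
result = sort_lines(lines, [first_field, second_field_numeric])
print(result)
# ['a 1 z', 'a 2 x', 'a 2 y', 'b 2 x']
```

Here "a 2 x" and "a 2 y" agree on both keys, so their relative
order is decided by the final whole-line comparison.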
Examples
Sort the contents of infile with the second field as the
sort key:
sort +1 -2 infile
Sort, in reverse order, the contents of infile1 and infile2,
placing the output in outfile and using the first character
of the second field as the sort key:
sort -r -o outfile +1.0 -1.2 infile1 infile2
Sort, in reverse order, the contents of infile1 and infile2
using the first non-blank character of the second field as
the sort key:
sort -r +1.0b -1.1b infile1 infile2
Print the password file (passwd(F)) sorted by the numeric
user ID (the third colon-separated field):
sort -t: +2n -3 /etc/passwd
Print the lines of the already sorted file infile,
suppressing all but the first occurrence of lines having the
same third field (the options -um with just one input file
make the choice of a unique representative from a set of
equal lines predictable):
sort -um +2 -3 infile
Files
/usr/tmp/stm???
See Also
coltbl(M), comm(C), join(C), locale(M), uniq(C)
Diagnostics
Comments and exits with non-zero status for various trouble
conditions (e.g., when input lines are too long), and for
disorders discovered under the -c option. When the last
line of an input file is missing a newline character, sort
appends one, prints a warning message, and continues.