NAME
	  awk -	Finds lines in files and makes specified changes to
	  them

     SYNOPSIS
	  awk -f program [-Fcharacter] [file ...]

	  awk [-Fcharacter] statement ...  [file ...]


	  The awk command is a more powerful pattern-matching command
	  than the grep	command.  It can perform limited processing on
	  the input lines, instead of simply displaying	lines that
	  match.

     FLAGS
	  -Fcharacter
	      Uses character as	the field separator character (a space
	      by default).

	  -f program
	      Searches for the patterns	and performs the actions found
	      in the file program.


     DESCRIPTION
	  The awk command provides a flexible text-manipulation
	  language suitable for	simple report generation.  awk is a
	  more powerful	tool for text manipulation than	either sed or
	  grep.

	  The awk command:


	    o  Performs	convenient numeric processing.

	    o  Allows variables	within actions.

	    o  Allows general selection	of patterns.

	    o  Allows control flow in the actions.

	    o  Does not	require	any compiling of programs.


	  Pattern-matching and action statements can be	specified
	  either on the	command	line or	in a program file.  In either
	  case,	the awk	command	first reads all	matching and action
	  statements, then it reads a line of its input	and compares
	  it to	each specified pattern.	 If the	line matches a speci-
	  fied pattern,	awk performs the specified actions and writes
	  the result to	standard output.  When it has compared the
	  current input	line to	all patterns, it reads the next	line.

	  The awk command reads	input files in the order stated	on the
	  command line.	 If you	specify	a filename as a	- (dash) or do
	  not specify a	filename, awk reads standard input.

	  Enclose pattern-action statements on the command line	in ''
	  (single quotes) to protect them from interpretation by the
	  shell.  Consecutive pattern-action statements	on the same
	  command line must be separated by a ;	(semicolon), within
	  one set of quotes.  Consecutive pattern-action statements in
	  an awk program file must be on separate lines.
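
	  As a minimal sketch of the quoting and separator rules above,
	  the following command passes two pattern-action statements
	  inside one set of single quotes.  The sample input is made
	  up for illustration.

```shell
# Two pattern-action statements in one quoted program, separated by ";".
printf 'apple 3\nbanana 12\ncherry 7\n' |
awk '$2 > 5 { print $1, "is large" } ; /^a/ { print $1, "starts with a" }'
```

	  Each input line is compared to both patterns in turn, so a
	  line can trigger either action, both, or neither.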

	  You can assign values	to variables on	the awk	command	line
	  as follows:

	  variable=value
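
	  For illustration, the following sketch sets a variable on
	  the command line and uses it in a pattern.  The name limit
	  and the sample input are assumptions made for this example.

```shell
# "limit" is a hypothetical variable name set on the command line;
# the trailing "-" tells awk to read standard input.
printf '5\n15\n25\n' | awk '$1 > limit { print }' limit=10 -
```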


	  The awk command treats input lines as	fields separated by
	  spaces, tabs,	or a field separator you set with the FS vari-
	  able.	 (Consecutive spaces are recognized as a single
	  separator.)  Fields are referenced as	$1, $2,	and so on.  $0
	  refers to the	entire line.
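
	  A short sketch of field references, using made-up input in
	  which the fields are separated by several spaces and a tab:

```shell
# $3 and $1 are individual fields; $0 is the whole line, unchanged.
printf 'alpha   beta\tgamma\n' | awk '{ print $3, $1; print $0 }'
```

	  The first line of output shows the two fields joined by a
	  single space; the second shows the original line with its
	  white space intact.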

	Pattern-Action Statements
	  Pattern-action statements follow the form:

	  pattern {action}



	  If a pattern lacks a corresponding action, awk writes	the
	  entire line that contains the	pattern	to standard output.
	  If an	action lacks a corresponding pattern, awk applies it
	  to every line.
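
	  The two defaults can be sketched as follows; the sample
	  input is made up for illustration.

```shell
# Pattern without an action: matching lines are written unchanged.
printf 'error: disk full\nok\nerror: timeout\n' | awk '/error/'

# Action without a pattern: the action runs for every line.
printf 'error: disk full\nok\nerror: timeout\n' | awk '{ print NR ": " $0 }'
```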

	Actions
	  An action is a sequence of statements	that follow C language
	  syntax.  These statements can	include:


	  if (expression) statement [ else statement ]

	  while	(expression) statement

	  for (expression;expression;expression) statement

	  for (variable	in array) statement

	  break

	  continue

	  { [ statement	... ] }

	  variable=expression

	  print	[ expression_list ] [ >file ] [	| command ]

	  printf format[ ,expression_list ] [ >file | >>file | | command ]

	  next

	  exit [ expression ]

	  delete array[expression]


	  Statements can end with a semicolon, a newline character, or
	  the right brace enclosing the	action.

	  Expressions can have string or numeric values and are built
	  using the operators +, -, *, /, %, and ^ (exponentiation), a
	  space for string concatenation, and the C operators ++, --,
	  +=, -=, *=, /=, %=, ^=, >, >=, <, <=, ==, !=, and ?:.
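
	  As an illustration only, the following sketch combines
	  several of the statements and operators listed above.  The
	  variable names total and label are made up for this example.

```shell
printf '1 2 3\n4 5 6\n' |
awk '{
    total = 0
    for (i = 1; i <= NF; i++)      # C-style loop over the fields
        total += $i
    if (total > 10)
        label = "big"
    else
        label = "small"
    # A space between operands concatenates them; ^ is exponentiation.
    print "line " NR ":", total, label, 2 ^ NR
}'
```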

	  Because the actions process fields, input white space	is not
	  preserved in the output.

	  The file and command arguments can be	literal	names or
	  expressions enclosed in parentheses.	Identical string
	  values in different statements refer to the same open	file.

	  The print statement writes its arguments to standard output
	  (or to a file if >file or >>file is present, or to a pipe if
	  | command is present), separated by the current output field
	  separator and terminated by the current output record
	  separator.

	  The printf statement formats its expression list according
	  to the format of the printf() subroutine (see the OSF/1
	  Programmer's Reference) and writes the result to the same
	  destinations.  Unlike print, printf does not add the output
	  field separator or the output record separator; any
	  separators or newline characters must appear in the format.
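
	  The following sketch shows print writing to a pipe and
	  printf writing to standard output.  The sample input and
	  the choice of sort as the command are assumptions made for
	  illustration.

```shell
printf 'pear 0.5\nfig 2\napple 1.25\n' |
awk 'BEGIN { OFS = "-" }
     { print $1, $2 | "sort" }      # fields joined by OFS, sent to sort
     END {
         close("sort")              # wait for sort to finish writing
         printf "%d items, %s\n", NR, "done"   # output is exactly the format
     }'
```

	  Closing the pipe before the final printf keeps the sorted
	  lines ahead of the summary line in the output.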

	Variables
	  Variables can	be scalars, array elements (denoted x[i]), or
	  fields.
	  Variable names can consist of	uppercase and lowercase	alpha-
	  betic	letters, the underscore	character, the digits (0 to
	  9), and extended characters.	Variable names cannot begin
	  with a digit.

	  Variables are	initialized to the null	string.	 Array sub-
	  scripts can be any string; they do not have to be numeric.
	  This allows for a form of associative	memory.	 Enclose
	  string constants in expressions in ""	(double	quotes).  Mul-
	  tiple	subscripts such	as [i,j,k] are permitted; the consti-
	  tuents are concatenated and separated	by the value of	SUBSEP
	  (see the description in the following	list).
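
	  A minimal sketch of string and multiple subscripts follows.
	  The array names count and seen and the sample input are made
	  up for this example, and the (i,j) in array test is assumed
	  to be available in the awk being used.

```shell
printf 'red 2\nblue 5\nred 4\n' |
awk '{ count[$1] += $2; seen[$1, NR] = 1 }
     END {
         print "red", count["red"]
         print "blue", count["blue"]
         # The subscript ("red", 3) is "red" SUBSEP "3".
         if (("red", 3) in seen) print "red seen on line 3"
     }'
```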

	  There	are several variables with special meaning to awk.
	  They include:


	  ARGC
	      Argument count, assignable.

	  ARGV
	      Argument array, assignable; nonnull members are inter-
	      preted as	filenames.

	  FS  Input field separator (default is	a space).  If it is a
	      space, then any number of	spaces and tabs	can separate
	      fields.

	  NF  The number of fields in the current input	line (record),
	      with a limit of 99.

	  NR  The number of the	current	input line (record).

	  FNR The number of the	current	input line (record) in the
	      current file.

	  FILENAME
	      The name of the current input file.

	  RS  Input record separator (default is a newline character).

	  OFS The output field separator (default is a space).

	  ORS The output record	separator (default is a	newline	char-
	      acter).

	  OFMT
	      The output format for numbers (default %.6g).

	  SUBSEP
	      Separates	multiple subscripts (default is	031).
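
	  For illustration, the following sketch prints a few of the
	  special variables for each line of some made-up input:

```shell
# NR is the line number, NF the field count, $NF the last field;
# OFS is placed between the arguments of print.
printf 'a b c\nd e\n' |
awk 'BEGIN { OFS = ":" }
     { print NR, NF, $NF }'
```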

	Functions
	  Functions are	defined	at the position	of a pattern-action
	  statement, as	follows:

	  function foo(a, b, c)	{ ... ;	return x }



	  Arguments are	passed by value	if scalar and by reference if
	  array	name; functions	can be called recursively.  Arguments
	  are local to the function; all other variables are global.
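
	  The argument-passing rules can be sketched as follows.  The
	  function names max and fill are hypothetical; the extra
	  argument i in fill serves only as a local variable.

```shell
printf '3 9\n8 2\n' |
awk 'function max(a, b) { return (a > b) ? a : b }   # scalars: by value
     function fill(arr, n,    i) {                   # array: by reference
         for (i = 1; i <= n; i++) arr[i] = i * i
     }
     { print max($1, $2) }
     END { fill(sq, 3); print sq[1], sq[2], sq[3] }'
```

	  Because sq is passed by reference, the elements that fill
	  assigns remain visible after the function returns.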

	  There	are several built-in functions that can	be used	in awk
	  actions.  (For information about regular expressions as
	  referred to in this section, see grep.)


	  length(argument)
	      Returns the length, in characters, of argument, or of
	      the entire line if there is no argument.

	  blength(argument)
	      Returns the length, in bytes, of argument, or of the
	      entire line if there is no argument.

	  close(argument)
	      Closes the file or pipe named by argument.  Note that you
	      must enclose a filename in double quotes when redirecting
	      output with the awk command; otherwise, it is treated as
	      an awk variable.  For example:

	      print "Hello" > "/tmp/junk"
	      close ("/tmp/junk")



	  exp(number)
	      Takes the	exponential of its argument.

	  rand()
	      Returns a random number n, where 0 <= n < 1.

	  srand(number)
	      Sets seed	for rand.  The default is the time of day.

	  log(number)
	      Takes the	base e logarithm of its	argument.

	  sqrt(number)
	      Takes the	square root of its argument.

	  int(number)
	      Takes the	integer	part of	its argument.

	  substr(string,position,number)
	      Returns the substring of string that begins at position
	      and is number characters long.

	  index(string,string2)
	      Returns the position in string where string2 occurs, or
	      0	(zero) if it does not occur.

	  match(string,regular_expression)
	      Returns the position in string where regular_expression
	      occurs, or 0 (zero) if it	does not occur.	 The RSTART
	      and RLENGTH built-in variables are set to	the position
	      and length, in bytes, of the matched string.

	  split(string,a,[regular_expression])
	      Splits string into array elements	a[1], a[2], . .	.,
	      a[number], and returns number. The separation is done
	      with the specified regular expression or with the	FS
	      field separator if regular_expression is not given.

	  sub(regular_expression,string2,[string])
	      Substitutes string2 for the first	occurrence of the reg-
	      ular expression regular_expression in string.  If	string
	      is not given, the	entire line is used.

	  gsub(regular_expression,string2,[string])
	      Same as sub except that all occurrences of the regular
	      expression are replaced; both sub	and gsub return	the
	      number of	replacements.

	  sprintf(fmt,expression1,expression2, ...)
	      Formats the expressions according	to the printf format
	      string fmt and returns the resulting string.

	  system(command)
	      Executes command and returns its exit status.
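
	  As a hedged illustration, the following sketch calls several
	  of the string functions described above on made-up values.
	  The program consists only of a BEGIN action, so it reads no
	  input.

```shell
awk 'BEGIN {
    s = "hello world"
    print length(s)                  # number of characters in s
    print substr(s, 7, 5)            # 5 characters starting at position 7
    print index(s, "wor")            # position of "wor" in s
    n = split("a:b:c", parts, ":")   # parts[1], parts[2], parts[3]
    print n, parts[2]
    t = s
    gsub(/o/, "0", t)                # replace every "o" in t
    print t
    print sprintf("%05.1f", 3.14159)
    if (match(s, /w[a-z]+/)) print RSTART, RLENGTH
}'
```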


	  The getline function sets $0 to the next input record	from
	  the current input file; getline < file sets $0 to the	next
	  record from file.  getline x sets variable x instead.
	  Finally, command | getline pipes the output of command into
	  getline.  Each call of getline returns the next line of out-
	  put from command. In all cases, getline returns 1 for	a suc-
	  cessful input, 0 (zero) for End-of-File, and -1 for an
	  error.
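
	  A minimal sketch of reading command output with getline.
	  The command string and the variable names cmd, line, and n
	  are made up for this example.

```shell
awk 'BEGIN {
    cmd = "echo one; echo two"
    # getline returns 1 for each line read and 0 at End-of-File.
    while ((cmd | getline line) > 0)
        n++
    close(cmd)
    print n, line        # line still holds the last line that was read
}'
```

	  Comparing the return value with 0 (zero) also ends the loop
	  if getline returns -1 for an error.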

	Patterns
	  Patterns are arbitrary Boolean combinations of regular
	  expressions and relational expressions (the !, ||, and &&
	  operators and parentheses for grouping).  You must start and
	  end regular expressions with slashes.  You can use regular
	  expressions as described for grep, including the following
	  special characters:


	  +   One or more occurrences of the pattern.

	  ?   Zero or one occurrence of	the pattern.

	  |   Either of two patterns.

	  ( ) Grouping of expressions.


	  Isolated regular expressions in a pattern apply to the
	  entire line.	Regular	expressions can	occur in relational
	  expressions.	Any string (constant or	variable) can be used
	  as a regular expression, except in the position of an	iso-
	  lated	regular	expression in a	pattern.

	  If two patterns are separated	by a comma, the	action is per-
	  formed on all	lines between an occurrence of the first pat-
	  tern and the next occurrence of the second.

	  Regular expressions can contain extended (multibyte) charac-
	  ters with one	exception: range constructs in character class
	  specifications using brackets	cannot contain multibyte
	  extended characters.	Individual instances of	extended (mul-
	  tibyte) characters can appear	within brackets; however,
	  extended characters are treated as separate 1-byte charac-
	  ters.

	  As in egrep, inclusion in ranges is determined by the
	  collating sequence as defined by the current locale.  The
	  wildcard characters *, +, and ? match characters and
	  character strings, not bytes.

	  There	are two	types of relational expressions	that you can
	  use.	The first type has the form:

	  expression  match_operator  pattern


	  where	match_operator is either: ~ (for contains) or !~ (for
	  does not contain).

	  The second type has the form:

	  expression  relational_operator  expression

	  where	relational_operator is any of the six C	relational
	  operators: <,	>, <=, >=, ==, and !=.	A conditional can be
	  an arithmetic	expression, a relational expression, or	a
	  Boolean combination of these.
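
	  The two forms of relational expression can be sketched as
	  follows, using made-up input:

```shell
printf 'alice 42 admin\nbob 17 guest\ncarol 58 admin\n' |
awk '$3 ~ /^adm/ { print $1, "matches" }        # expression ~ pattern
     $2 < 18     { print $1, "is under 18" }    # expression < expression'
```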

	  You can use the BEGIN	and END	special	patterns to capture
	  control before the first and after the last input line is
	  read,	respectively.  BEGIN must be the first pattern;	END
	  must be the last.  BEGIN and END do not combine with other
	  patterns.
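
	  For illustration, a short sketch in which BEGIN writes a
	  header before any input is read and END writes a summary
	  after the last line; the sample input is made up:

```shell
printf '10\n20\n30\n' |
awk 'BEGIN { print "start" }
     { total += $1 }
     END { print "lines:", NR, "total:", total }'
```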

	  You have two ways to designate a character other than	white
	  space	to separate fields.  You can use the -Fcharacter flag
	  on the command line, or you can start	program	with the fol-
	  lowing sequence:

	  BEGIN { FS = "c" }



	  Either action	changes	the field separator to c.
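
	  As a sketch, both of the following commands split a made-up
	  input line at each : (colon) and write the same output.  In
	  the program text, the separator is written as a string
	  constant in double quotes.

```shell
# Field separator given with the -F flag.
printf 'root:x:0\n' | awk -F: '{ print $1, $3 }'

# Field separator assigned in a BEGIN action.
printf 'root:x:0\n' | awk 'BEGIN { FS = ":" } { print $1, $3 }'
```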

	  There	are no explicit	conversions between numbers and
	  strings.  To force an	expression to be treated as a number,
	  add 0	(zero) to it.  To force	it to be treated as a string,
	  append a null string ("").
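
	  The effect of forcing a numeric comparison can be sketched
	  as follows.  The variables a and b are made up for this
	  example and hold string constants.

```shell
awk 'BEGIN {
    a = "10"; b = "9"
    print (a < b)            # compared as strings: "10" sorts before "9"
    print (a + 0 < b + 0)    # compared as numbers: 10 is not less than 9
}'
```

	  The first comparison yields 1 (true) and the second yields
	  0 (false).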

     EXAMPLES
	   1.  To display the lines of a file that are longer than 72
	       characters, enter:

	       awk 'length > 72' chapter1



	       This selects each line of the file chapter1 that is
	       longer than 72 characters.  awk then writes these lines
	       to standard output because no action is specified.

	   2.  To display all lines between the	words start and	stop,
	       enter:

	       awk  '/start/,/stop/'  chapter1



	   3.  To run an awk program (sum2.awk)	that processes a file
	       (chapter1), enter:

	       awk  -f	sum2.awk  chapter1



	   4.  The following awk program computes the sum and average
	       of the numbers in the second column of the input	file:

		    {
			 sum +=	$2
		    }
	       END  {
		    print "Sum:	", sum;
		    print "Average:", sum/NR;
		    }



	       The first action adds the value of the second field of
	       each line to the sum variable.  (awk initializes sum,
	       like all variables, to the null string, which has the
	       numeric value 0 (zero).)  The keyword END before the
	       second action causes awk to perform that action after
	       all of the input file is read.  The NR variable, which
	       is used to calculate the average, is a special variable
	       containing the number of records (lines) that were read.

	   5.  To print	the names of the users who have	the C shell as
	       the initial shell, enter:

	       awk  -F:	'$7 ~ /csh/ {print $1}'	/etc/passwd



	   6.  To print	the first two fields in	reversed order,	enter:

	       awk '{ print $2,	$1 }'



	   7.  The following awk program prints the first two fields
	       of the input file in reversed order, with input fields
	       separated by a comma and optional spaces or by one or
	       more spaces, then adds up the first column and prints
	       the sum and average:

	       BEGIN	 { FS =	",[ ]*|[ ]+" }
		    { print $2,	$1}
		    { s	+= $1 }
	       END  { print "sum is", s, "average is", s/NR }



     RELATED INFORMATION
	  Commands:  grep(1)/egrep(1)/fgrep(1),	sed(1).

	  Functions:  printf(3).

	  "Using Internationalization Features"	in the OSF/1 User's
	  Guide.

	  The discussion of awk	in the OSF/1 Applications Programmer's
	  Guide.