AWK: A Powerful Unix/Linux Tool

If you work as a sysadmin, in application support, or as an SRE, it is important to know the AWK tool for quick data manipulation, extraction, and analysis.

What is AWK?

AWK is a pattern-matching program for processing files.

The AWK utility is a data extraction and reporting tool. It uses a data-driven scripting language consisting of a set of actions to be taken against textual data for the purpose of producing formatted reports.

Here, I will cover some of the important commands that are helpful for handling day-to-day tasks.

What can we do with AWK?

The basic syntax of an AWK command is as follows:

awk [options] 'selection_criteria' file1
Pattern-action statement:
awk 'pattern {action}' file_name
E.g., print the first column of every line that has more than one field. This also skips empty lines (use NF > 0 to match every non-empty line).
NF: Number of fields
$ awk 'NF > 1 { print $1 }' test.txt
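To make the pattern-action idea concrete, here is a minimal sketch. The sample input is made up for illustration and is passed in through printf instead of a file, so the variable names (sample, multi, nonempty) are assumptions and not part of any real data set.

```shell
# Hypothetical sample input: two multi-field lines, one empty line,
# and one line with a single field.
sample=$(printf 'alice 30 admin\n\nbob\ncarol 25 dev\n')

# NF > 1 is the pattern: the action runs only on lines with more than one field.
multi=$(printf '%s\n' "$sample" | awk 'NF > 1 { print $1 }')
echo "$multi"

# NF > 0 matches every non-empty line, including the single-field one.
nonempty=$(printf '%s\n' "$sample" | awk 'NF > 0 { print $1 }')
echo "$nonempty"
```

With this input, the first command prints alice and carol, while the second also prints bob.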

Some of the built-in variables commonly used in AWK:

$1, $2, $3, etc. are used to extract data fields.

FS (Field Separator) → The input field separator. By default, fields are split on runs of spaces or tabs.
$0 → The entire line.
$1 → first field, $2 → second field, ..., $NF → last field.
NF → Number of fields in the current line. $NF returns the last column.
NR → The number of input records (lines) read so far, i.e. the current line number.
ORS (Output Record Separator) → The string printed after every output record. The default is a newline.
OFS (Output Field Separator) → The string placed between the comma-separated arguments of print. The default is a single space; set OFS to use a different separator.

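FNR (File Record Number) → The record (line) number within the current input file. It restarts at 1 for each new file, whereas NR keeps counting across files.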
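The following sketch shows how these variables behave together. It assumes a made-up colon-separated input and two throwaway files created with mktemp; the variable names and file names are hypothetical. The separators are set in a BEGIN block, which AWK runs once before reading any input.

```shell
# Hypothetical colon-separated input.
records=$(printf 'root:0:/root\nalice:1000:/home/alice\n')

# FS splits input on ":", OFS joins the printed fields with " | ".
# NR is the record number, NF the field count, $NF the last field.
report=$(printf '%s\n' "$records" |
  awk 'BEGIN { FS = ":"; OFS = " | " } { print NR, NF, $1, $NF }')
echo "$report"

# ORS replaces the newline that normally ends each output record.
joined=$(printf '%s\n' "$records" |
  awk 'BEGIN { FS = ":"; ORS = ";" } { print $1 }')
echo "$joined"

# NR keeps counting across files, while FNR restarts at 1 for each file.
tmpdir=$(mktemp -d)
printf 'a\nb\n' > "$tmpdir/one.txt"
printf 'c\n' > "$tmpdir/two.txt"
counts=$(awk '{ print NR, FNR, $0 }' "$tmpdir/one.txt" "$tmpdir/two.txt")
echo "$counts"
rm -r "$tmpdir"
```

In the last command, the line from the second file is reported with NR equal to 3 but FNR equal to 1, which is the practical difference between the two counters.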

# Print the third column of the test.data file.
$ awk '{ print $3 }' test.data
# Print columns 4 and 1.
$ awk '{ print $4, $1 }' test.data
# Number each line.
$ awk '{ print FNR, $0 }' test.data
# Number each line, with a tab between the number and the text.
$ awk '{ print FNR "\t" $0 }' test.data   # $0 is the entire line
# Count lines (same as wc -l).
$ awk 'END { print NR }' test.data   # NR: number of records read
# Print last field of each line.
$ awk '{ print $NF}' test.data
# Print last field of last line.
$ awk 'END { print $NF }' test.data
# Print every line where the value of the 4th field is greater than 10.
$ awk '$4>10{ print }' test.data
# Print the lines starting from line 10.
$ awk 'NR > 9 { print }' test.data
# Split fields on a custom field separator ("--") and print the second field.
$ awk -F "--" '{ print $2 }' test.data
# Print alternate lines (the odd-numbered ones).
$ awk 'NR%2' test1.data
# Omit every third line.
$ awk 'NR%3' test.data
# Substitute the first ABC on each line with XYZ.
$ awk '{ sub(/ABC/, "XYZ"); print }' test.data
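The one-liners above rely on a test.data file that is not shown. As a rough sketch, the following runs a few of them against a small made-up data set held in a shell variable; the contents and the variable names are assumptions for illustration. It also contrasts sub with gsub, a standard AWK function not covered above that replaces every match on a line instead of only the first.

```shell
# Hypothetical stand-in for test.data: four whitespace-separated fields per line.
data=$(printf 'ABC x ABC 5\ndef y ghi 20\nABC z jkl 15\n')

# Keep lines whose 4th field is greater than 10.
big=$(printf '%s\n' "$data" | awk '$4 > 10')

# NR % 2 is non-zero (true) on odd lines, so lines 1 and 3 are printed.
odd=$(printf '%s\n' "$data" | awk 'NR % 2')

# sub() replaces only the first match on each line; gsub() replaces all of them.
first=$(printf '%s\n' "$data" | awk '{ sub(/ABC/, "XYZ"); print }')
all=$(printf '%s\n' "$data" | awk '{ gsub(/ABC/, "XYZ"); print }')

printf '%s\n' "$big" "$odd" "$first" "$all"
```

On the first line of this input, sub leaves the second ABC untouched (XYZ x ABC 5), while gsub rewrites both (XYZ x XYZ 5).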

Stay hungry; Stay Foolish!!