Linux awk command - pattern scanning and processing language


Linux awk command Function Description

The awk command reads its input line by line, splits each line into fields (runs of spaces and tabs are the default separator), and then processes those fields. Compared with grep, which searches, and sed, which edits, awk is particularly well suited to analyzing data and generating reports.
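As a minimal sketch of the default splitting described above, the following feeds awk two made-up lines (the sample data is an assumption, not taken from a real system) whose fields are separated by a mix of spaces and tabs. `$1` is the first field and `$NF` is the last one.

```shell
# Hypothetical sample input: fields separated by irregular spaces and a tab.
out=$(printf 'alice   30\tengineer\nbob 25   designer\n' | awk '{print $1, $NF}')
echo "$out"
```

Because the default separator treats any run of blanks as a single delimiter, both lines yield exactly three fields regardless of spacing.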

awk is a programming language for working with text and data under Linux/UNIX. Data can come from standard input (stdin), from one or more files, or from the output of other commands. It supports advanced features such as user-defined functions and dynamic regular expressions. It can be used directly on the command line, but is more often run as a script. awk has many built-in features, such as arrays and functions, with a syntax that resembles C. Flexibility is the biggest advantage of awk.
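The two features mentioned above can be sketched together. In this illustrative example, `c2f` is a hypothetical user-defined function and `pat` is a hypothetical variable holding a regular expression chosen at run time; the city and temperature data are made up.

```shell
# Hypothetical data: a name and a Celsius temperature per line.
# c2f is an illustrative user-defined function; pat holds a dynamic regex.
out=$(printf 'oslo 10\nrome 25\nriga 0\n' |
  awk -v pat='^r' '
    function c2f(c) { return c * 9 / 5 + 32 }
    $1 ~ pat { print $1, c2f($2) }
  ')
echo "$out"
```

Only the lines whose first field matches the pattern stored in `pat` are printed, with the temperature converted by the function.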

Linux awk command Syntax

awk [Option] -f <Program File> [File]
awk [Option] '<Program>' [File]

The meaning of each option in the command is shown in the following table:

Option                    Description
-f <Program File>         Reads the AWK program source from the specified program file
-F <Field Separator>      Uses the specified value as the input field separator
-v <variable=value>       Assigns a value to a variable before the program starts
-mf <value>               Sets a memory limit: the f flag sets the maximum number of fields (ignored by gawk)
-mr <value>               Sets a memory limit: the r flag sets the maximum record size (ignored by gawk)
-O                        Enables optimization of the internal representation of the program
--compat                  Runs in compatibility mode (same as --traditional)
--dump-variables=<File>   Writes a sorted list of global variables, their types, and final values to the file
--exec=<File>             Similar to the -f option, but this is the last option processed
--gen-po                  Scans and parses the AWK program and generates a GNU .po file on standard output
--non-decimal-data        Recognizes octal and hexadecimal values in input data
--profile=<File>          Sends profiling data to the file. The default file is awkprof.out
--re-interval             Enables interval expressions in regular expression matching
--source=<Program Text>   Uses the specified program text as the source code for the AWK program
--traditional             Runs in compatibility mode, matching traditional UNIX AWK behavior, including its regular expressions
--usage                   Displays a relatively short summary of the available options on standard output
--use-lc-numeric          Forces use of the locale's decimal point character when parsing input data
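The three most common options from the table, `-F`, `-v`, and `-f`, can be sketched as follows. The variable name `label`, the sample record, and the temporary program file are assumptions made for illustration.

```shell
# -F sets the input field separator; -v assigns the awk variable "label"
# (a hypothetical name) before the program starts.
out=$(printf 'root:x:0:0:root:/root:/bin/bash\n' |
  awk -F ':' -v label='shell' '{print label "=" $7}')
echo "$out"

# -f reads the program from a file; here a temporary file holds a one-line program.
prog=$(mktemp)
printf '{print $1}\n' > "$prog"
out2=$(printf 'root:x:0:0:root:/root:/bin/bash\n' | awk -F ':' -f "$prog")
rm -f "$prog"
echo "$out2"
```

Keeping the program in a file with `-f` avoids shell quoting problems once a program grows beyond a line or two.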

AWK has a number of built-in variables. Some carry information about the input and the environment, and others can be changed to control how awk behaves. The following table lists the most commonly used ones:

Variable    Description
ARGC        Number of command-line arguments
ARGV        Array of command-line arguments
ENVIRON     An array containing the values of the current environment
FILENAME    Name of the current input file
FNR         Number of input records read so far from the current input file
FS          Input field separator, which is a space by default
NF          Number of fields in the current input record
NR          Total number of input records read so far
OFS         Output field separator
ORS         Output record separator
RS          Input record separator, which by default is a newline character
OFMT        Output format for numbers, "%.6g" by default
RT          Record terminator: the input text that matched RS
RSTART      Index of the first character matched by match()
RLENGTH     Length of the string matched by match()
SUBSEP      Separator for multiple subscripts in array indices, "\034" by default
TEXTDOMAIN  The text domain of the AWK program
ARGIND      Index in ARGV of the file currently being processed
BINMODE     On non-POSIX systems, specifies the use of "binary" mode for all file I/O
CONVFMT     Conversion format for numbers, "%.6g" by default
IGNORECASE  When nonzero, makes all regular expression and string operations case-insensitive
PROCINFO    An array whose elements provide access to information about the running AWK program
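A short sketch of how a few of these variables interact, using made-up colon-separated input: `FS` and `OFS` are set in a `BEGIN` block, `NR` and `NF` are read for each record, and `NR` still holds the total in the `END` block.

```shell
# Hypothetical input with a different number of fields on each line.
out=$(printf 'a:b:c\nd:e\n' |
  awk 'BEGIN {FS = ":"; OFS = "-"} {print NR, NF, $1} END {print "total", NR}')
echo "$out"
```

The commas in `print` are replaced by `OFS` on output, which is why the values appear joined by hyphens.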

String constants in AWK are sequences of characters enclosed in double quotes. The following table lists the escape sequences commonly used inside them:

Escape sequence  Description
\\               A literal backslash
\a               Alert character, usually the ASCII BEL character
\b               Backspace
\f               Form feed
\n               Newline
\r               Carriage return
\t               Horizontal tab
\v               Vertical tab
\xhex digits     The character whose code is given by the hexadecimal digits following \x
\c               The literal character c
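The escape sequences in the table can be tried out in a `BEGIN` block, which needs no input. This sketch uses only `\t`, `\n`, `\\`, and `\"`; the strings themselves are arbitrary examples.

```shell
# Each print shows one or more escape sequences inside a double-quoted awk string.
out=$(awk 'BEGIN {
  print "name\tshell\nroot\t/bin/bash"   # tab and newline
  print "back\\slash"                    # literal backslash
  print "quote: \"x\""                   # literal double quote
}')
echo "$out"
```

The whole awk program is wrapped in single quotes so that the shell passes the backslashes through to awk unchanged.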

Linux awk command Example

Show only the last 5 users logged into the system

last -n 5 | awk '{print $1}'


Show only the accounts in the /etc/passwd file

[root@rhel ~]# cat /etc/passwd |awk -F ':' '{print $1}'
root
bin
daemon
adm
lp
sync
shutdown
halt
..........................

Show only the accounts in the /etc/passwd file and their corresponding shells, with a tab separating each account from its shell

[root@rhel ~]# cat /etc/passwd | awk -F ':' '{print $1 "\t" $7}'
root    /bin/bash
bin     /sbin/nologin
daemon  /sbin/nologin
adm    /sbin/nologin
lp     /sbin/nologin
sync   /bin/sync
shutdown     /sbin/shutdown
halt   /sbin/halt
mail   /sbin/nologin
uucp   /sbin/nologin
........................(Omitted)

Show only the accounts and their corresponding shells in /etc/passwd, separated by a comma, with the header line "name, shell" added before the first line and "blue, /bin/nosh" added after the last line

[root@rhel ~]# cat /etc/passwd | awk -F ':' 'BEGIN {print "name, shell"} {print $1 ", " $7} \
> END {print "blue, /bin/nosh"}'
name, shell
root, /bin/bash
bin, /sbin/nologin
daemon, /sbin/nologin
adm, /sbin/nologin
lp, /sbin/nologin
sync, /bin/sync
shutdown, /sbin/shutdown
halt, /sbin/halt
........................(Omitted)
tcpdump, /sbin/nologin
radiusd, /sbin/nologin
blue, /bin/nosh

Search for all lines in the /etc/passwd file that have the root keyword

[root@rhel ~]# awk -F: '/root/' /etc/passwd
root:x:0:0:root:/root:/bin/bash
operator:x:11:0:operator:/root:/sbin/nologin

Search for all lines in the /etc/passwd file that begin with the root keyword

[root@rhel ~]# awk -F: '/^root/' /etc/passwd
root:x:0:0:root:/root:/bin/bash

Search for all lines in the /etc/passwd file that have the root keyword and display the corresponding shell

[root@rhel ~]# awk -F: '/root/{print $7}' /etc/passwd
/bin/bash
/sbin/nologin
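Note that `/root/` matches the keyword anywhere in the line, which is why the operator account (home directory /root) also appears above. As a sketch of one way to restrict the match to a single field, the following compares `$1` directly; the three sample records are copied from the outputs shown in this article.

```shell
# Sample passwd records taken from the outputs shown in this article.
data='root:x:0:0:root:/root:/bin/bash
operator:x:11:0:operator:/root:/sbin/nologin
bin:x:1:1:bin:/bin:/sbin/nologin'

# Pattern matched against the whole line: root and operator both match.
any=$(printf '%s\n' "$data" | awk -F: '/root/ {print $1}')
# Comparison against the first field only: just the root account matches.
exact=$(printf '%s\n' "$data" | awk -F: '$1 == "root" {print $1}')
echo "$any"
echo "$exact"
```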

Print statistics for the /etc/passwd file: the file name, the line number, the number of columns in each line, and the full content of each line

[root@rhel ~]# awk -F ':' '{print "filename:" FILENAME ", linenumber:" NR \
> ", columns:" NF ", linecontent:" $0}' /etc/passwd
filename:/etc/passwd, linenumber:1, columns:7, linecontent:root:x:0:0:root:/root:/bin/bash
filename:/etc/passwd, linenumber:2, columns:7, linecontent:bin:x:1:1:bin:/bin:/sbin/nologin
filename:/etc/passwd, linenumber:3, columns:7, linecontent:daemon:x:2:2:daemon:/sbin:/sbin/nologin
filename:/etc/passwd, linenumber:4, columns:7, linecontent:adm:x:3:4:adm:/var/adm:/sbin/nologin
filename:/etc/passwd, linenumber:5, columns:7, linecontent:lp:x:4:7:lp:/var/spool/lpd:/sbin/nologin
filename:/etc/passwd, linenumber:6, columns:7, linecontent:sync:x:5:0:sync:/sbin:/bin/sync
filename:/etc/passwd, linenumber:7, columns:7, linecontent:shutdown:x:6:0:shutdown:/sbin:/sbin/shutdown
filename:/etc/passwd, linenumber:8, columns:7, linecontent:halt:x:7:0:halt:/sbin:/sbin/halt
filename:/etc/passwd, linenumber:9, columns:7, linecontent:mail:x:8:12:mail:/var/spool/mail:/sbin/nologin
........................(Omitted)

Count the number of accounts in the /etc/passwd file

[root@rhel ~]# awk '{count++; print $0} END {print "user count is", count}' /etc/passwd
root:x:0:0:root:/root:/bin/bash
bin:x:1:1:bin:/bin:/sbin/nologin
daemon:x:2:2:daemon:/sbin:/sbin/nologin
adm:x:3:4:adm:/var/adm:/sbin/nologin
lp:x:4:7:lp:/var/spool/lpd:/sbin/nologin
sync:x:5:0:sync:/sbin:/bin/sync
........................(Omitted)
cyrus:x:76:12:Cyrus IMAP Server:/var/lib/imap:/sbin/nologin
ldap:x:55:55:LDAP User:/var/lib/ldap:/sbin/nologin
squid:x:23:23::/var/spool/squid:/sbin/nologin
tcpdump:x:72:72::/:/sbin/nologin
radiusd:x:95:95:radiusd user:/home/radiusd:/sbin/nologin
user count is 67
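When only the total is needed, the manual counter can be replaced by the built-in `NR`, which still holds the number of records read when the `END` block runs. A minimal sketch on three made-up, shortened records:

```shell
# Hypothetical shortened passwd-style records; NR in END is the total record count.
out=$(printf 'root:x:0\nbin:x:1\ndaemon:x:2\n' | awk 'END {print "user count is", NR}')
echo "$out"
```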

Display the accounts in the /etc/passwd file, showing a zero-based index (stored as an array subscript, not the UID) and the username

[root@rhel ~]# awk -F ':' 'BEGIN {count=0} {name[count]=$1; count++} \
> END {for (i = 0; i < NR; i++) print i, name[i]}' /etc/passwd
0 root
1 bin
2 daemon
3 adm
4 lp
5 sync
6 shutdown
7 halt
8 mail
9 uucp
10 operator
11 games
12 gopher
........................(Omitted)
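awk arrays are associative, so the subscript does not have to be a number. As a sketch under that assumption, the following counts how many accounts use each shell by using the shell path itself as the subscript; the three records are copied from the outputs shown in this article, and `sort` is added because the order of `for (s in count)` is unspecified.

```shell
# Count accounts per login shell; count[] is indexed by the shell path ($7).
out=$(printf '%s\n' \
  'root:x:0:0:root:/root:/bin/bash' \
  'bin:x:1:1:bin:/bin:/sbin/nologin' \
  'daemon:x:2:2:daemon:/sbin:/sbin/nologin' |
  awk -F ':' '{count[$7]++} END {for (s in count) print s, count[s]}' | sort)
echo "$out"
```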

Count the number of bytes occupied by files in the current directory

[root@rhel ~]# ls -l | awk 'BEGIN {size=0} {size=size+$5} END {print "[end]size is", size}'
[end]size is 170057
Note: the total does not include the contents of subdirectories.

Show the size in MB occupied by files in the current directory

[root@rhel ~]# ls -l | awk 'BEGIN {size=0} {size=size+$5} \
> END {print "[end]size is", size/1024/1024, "MB"}'
[end]size is 0.162179 MB

Show the size in MB occupied by files in the current directory, excluding entries of exactly 4096 bytes (usually directories)

[root@rhel ~]# ls -l | awk 'BEGIN {size=0; print "[start]size is", size} \
> {if ($5 != 4096) {size=size+$5}} END {print "[end]size is", size/1024/1024, "MB"}'
[start]size is 0
[end]size is 0.130929 MB
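Filtering on a size of 4096 also drops any regular file that happens to be exactly 4096 bytes. As an alternative sketch (a different technique from the command above, not a correction of it), the following selects only lines whose permission string starts with `-`, which marks regular files in `ls -l` output. The listing is made-up sample text so that the result is reproducible.

```shell
# Hypothetical ls -l output; real listings vary by system and locale.
listing='total 12
-rw-r--r-- 1 root root 1048576 Jan  1 00:00 a.bin
drwxr-xr-x 2 root root    4096 Jan  1 00:00 docs
-rw-r--r-- 1 root root  524288 Jan  1 00:00 b.bin'

# Sum field 5 (size in bytes) for regular files only, then print the total in MB.
out=$(printf '%s\n' "$listing" |
  awk '/^-/ {size += $5} END {printf "%.2f MB\n", size / 1024 / 1024}')
echo "$out"
```

Using `printf` with `%.2f` fixes the number of decimal places instead of relying on the default output format.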