Sed & AWK



Sed and AWK are two programs that process text with the help of regular expressions. AWK is a full programming language, so it can handle more complex requirements. As a general rule, however, AWK should be used only when the problem cannot be solved with Sed, because in simple cases Sed is faster, more compact and easier to use than AWK.
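As a small illustration of that rule, the sketch below applies the same one-word substitution with both tools. The sample line and the replacement word are made up for the demonstration; only the relative size of the two commands matters.

```shell
# Hypothetical sample line, used only for this comparison.
line='01-qt/qca-2.1.3-2.x86_64.rpm: Rebuilt.'

# Sed: a single substitute command.
with_sed=$(printf '%s\n' "$line" | sed 's/Rebuilt/Updated/')

# AWK: the same edit needs a function call plus an explicit print.
with_awk=$(printf '%s\n' "$line" | awk '{ sub(/Rebuilt/, "Updated"); print }')

printf '%s\n%s\n' "$with_sed" "$with_awk"
```

Both commands print the same edited line; the Sed version is simply shorter.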
The following example makes this clear. Suppose we have a ChangeLog.txt file of package updates; its full contents appear further down, where the file is created.



Updates are grouped into batches separated by a line of dashes (----------). Each group consists of a timestamp line, a blank line, the updated items, another blank line, and a closing line of dashes. There may also be a blank line above the timestamp line.

Now we want to extract the top group, keeping only its updated items and at most 5 of them. Each item should have its leading directory name and the '/' character removed. The items are then presented as a numbered list.


That goal can be achieved with AWK using the following source code, saved in a file named log.awk:


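BEGIN {
    lines = 0                 # number of item lines seen so far
}

# Stop at the first separator line: only the top group is wanted.
/-----/ {
    exit
}

# Item lines start with a digit, for example "01-qt/...".
/^[0-9]+/ {
    lines += 1
    if ( lines > 5 ) exit     # print at most 5 items

    OLDFS = FS
    FS = "/"
    $0 = $0                   # re-split the current record using the new FS
    print $NF                 # last field: the text after the final '/'
    FS = OLDFS
}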
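The FS handling in the last rule relies on the fact that assigning to $0 makes AWK split the record again with the current FS. Here is a minimal sketch of that step on its own, using one item line from the change log as input (the shell variable names are only illustrative):

```shell
line='01-qt/qca-2.1.3-2.x86_64.rpm: Rebuilt.'

# Set FS, then reassign $0 so the record is split again on "/".
# $NF is then everything after the last "/".
resplit=$(printf '%s\n' "$line" | awk '{ FS = "/"; $0 = $0; print $NF }')

printf '%s\n' "$resplit"
# prints: qca-2.1.3-2.x86_64.rpm: Rebuilt.
```

Without the `$0 = $0` step, a POSIX-conforming AWK keeps the fields it computed when the line was read, so the new FS would only take effect from the next record on.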

We create the file ChangeLog.txt:


cat > ChangeLog.txt << EOF
Tue Dec 18 05:03:28 UTC 2018

01-qt/qca-2.1.3-2.x86_64.rpm: Rebuilt.
01-security-advance/clamav-0.101.0-2.x86_64.rpm: Rebuilt.
01-security-advance/e2fsprogs-1.44.5-1.x86_64.rpm: Upgraded.
01-servers/openldap-2.4.46-2.x86_64.rpm: Rebuilt.
01-servers/sendmail-8.15.2-2.x86_64.rpm: Rebuilt.
02-databases/mysql-8.0.13-2.x86_64.rpm: Rebuilt.
06-vm/qemu-3.0.0-2.x86_64.rpm: Rebuilt.

------------------------------------------------------------

Sat Dec 15 15:58:24 UTC 2018

00-security-core/libgpg-error-1.33-1.x86_64.rpm: Upgraded.
00-security-core/libgcrypt-1.8.4-1.x86_64.rpm: Upgraded.
00-security-core/cryptsetup-2.0.6-1.x86_64.rpm: Upgraded.
00-sys/omarine-update-3.2-1.x86_64.rpm: Upgraded.
01-security-advance/clamav-0.101.0-1.x86_64.rpm: Upgraded.
01-security-advance/cyrus-sasl-2.1.27-1.x86_64.rpm: Upgraded.
01-security-advance/gnutls-3.6.5-1.x86_64.rpm: Upgraded.
01-security-advance/nettle-3.4.1-1.x86_64.rpm: Upgraded.
01-security-advance/polkit-0.115-1.x86_64.rpm: Upgraded.

------------------------------------------------------------

Fri Dec 14 06:11:44 UTC 2018

00-sys/omarine-update-3.1-1.x86_64.rpm: Upgraded.
01-security-advance/nss-3.40.1-1.x86_64.rpm: Upgraded.
06-firefox/firefox-64.0-1.x86_64.rpm: Upgraded.

------------------------------------------------------------
EOF

Run the program as follows:


awk -f log.awk ChangeLog.txt | cat -n



However, Sed produces the same result more simply:


cat ChangeLog.txt | sed -n '1,/-------/p' | \
sed -e '1,/^ *$/d' -e '/^ *$/,/-----/d' | \
sed -e '6,$d' -e 's,^.\+/,,' | \
cat -n



Command Explanations:


There are three sed commands, each piping its output into the next:
sed -n '1,/-------/p': Print from the first line through the first ---------- line
sed -e '1,/^ *$/d' -e '/^ *$/,/-----/d': Delete from the first line through the next blank line (the timestamp line and the blank line after it), then delete the remaining blank line and the ---------- line below it
sed -e '6,$d' -e 's,^.\+/,,': Delete from the 6th line (if any) to the end of the text stream, then remove the directory name together with the '/' character. The commas act as the delimiter of the s command, so the '/' needs no escaping; \+ is a GNU sed extension
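
To see what each stage contributes, the three commands can be run one at a time with their intermediate results kept in variables. This is only a sketch: the miniature change log, the temporary file and the variable names stage1 to stage3 are assumptions made for the demonstration, and the last stage uses 's,^.*/,,' as a portable stand-in for the '\+' form.

```shell
# Hypothetical miniature change log, written to a temporary file.
log=$(mktemp)
cat > "$log" << 'EOF'
Tue Dec 18 05:03:28 UTC 2018

01-qt/qca-2.1.3-2.x86_64.rpm: Rebuilt.
01-servers/openldap-2.4.46-2.x86_64.rpm: Rebuilt.

------------------------------------------------------------

Sat Dec 15 15:58:24 UTC 2018

00-sys/omarine-update-3.2-1.x86_64.rpm: Upgraded.

------------------------------------------------------------
EOF

# Stage 1: keep the top group, up to and including its dashed line.
stage1=$(sed -n '1,/-------/p' "$log")

# Stage 2: drop the timestamp line with its blank line, then the
# trailing blank line and the dashed line.
stage2=$(printf '%s\n' "$stage1" | sed -e '1,/^ *$/d' -e '/^ *$/,/-----/d')

# Stage 3: cap the list at 5 lines and strip the directory prefix.
stage3=$(printf '%s\n' "$stage2" | sed -e '6,$d' -e 's,^.*/,,')

printf '%s\n' "$stage3"
rm -f "$log"
```

Printing stage1 and stage2 as well shows the input shrinking step by step: first to the top group, then to its item lines, and finally to the bare package names.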
