Parsing and passing from awk

From FBSD_tips

DRAFT - INCOMPLETE

Rationale

FreeBSD (and Unix in general) has lots of 'little' languages that each address one area or problem domain well and (like all really good tools) leave everything else to other tools. M4, Awk and Sed are great examples of this. I have always favored Awk for parsing chores. It is quick to write in, surprisingly performant and is in BASE. I heavily favor tools that are in BASE ... except when I don't. :) Awk is not an exceptional system administration language though; for that I like Bourne shell. Doing the parsing in Awk and then using the resulting data in sh is the tidbit that this article is about.

Awk script

I present here a general-case Awk script that parses typical Unix data files and emits Bourne shell assignment statements for the variable name / column pairs we pass it.

match.awk :

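# Return the part of str before the first occurrence of chr.
# awk strings are 1-indexed, so start at 1; with a start of 0 some awk
# implementations (gawk, for example) return one character too few.
function matchallbefore(str, chr) {
 return substr(str, 1, index(str, chr) - 1)
}

# Return the part of str after the first occurrence of chr.
function matchallafter(str, chr) {
 return substr(str, index(str, chr) + 1)
}

BEGIN {
  # VALUES looks like "1:FIRST,2:LAST"; split it into column:name pairs.
  split(VALUES,ARRAY,/,/)
}
{
 OUTPUT = ""
 # Only process lines that have exactly COLS fields.
 if ( split($0,DLINE,SEP) == COLS)
 {
  # Emit one "export NAME=value;" per requested column.
  for (I in ARRAY)
  {
   OUTPUT = OUTPUT "export " matchallafter(ARRAY[I],":") "=" DLINE[matchallbefore(ARRAY[I],":")] ";"
  }
  print OUTPUT
 }
}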

This Awk script requires that you pass three variable assignments to it on the command line.

1) VALUES takes a comma-separated list of pairs. Each pair consists of a column number and the name of the shell variable that the column's contents should be assigned to, separated by a colon.

2) COLS takes the number of columns in the data file.

3) SEP takes the separator character.
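
To see how a VALUES string breaks down, here is a minimal sketch that splits it the same way the script does and prints each column/name pair. The function name show_mapping is made up for this illustration and is not part of match.awk.

```shell
#!/bin/sh
# show_mapping is a hypothetical helper, used only for this illustration.
# It prints one "column N -> NAME" line per pair in a VALUES string.
show_mapping() {
  awk -v VALUES="$1" 'BEGIN {
    n = split(VALUES, pairs, /,/)        # "1:FIRST,2:LAST" -> pairs[1], pairs[2]
    for (i = 1; i <= n; i++) {
      split(pairs[i], kv, ":")           # "1:FIRST" -> kv[1]="1", kv[2]="FIRST"
      printf "column %s -> %s\n", kv[1], kv[2]
    }
  }'
}

show_mapping "1:FIRST,2:LAST,3:COLOR"
```

For the VALUES string used later in this article, this prints one line each for columns 1, 2 and 3.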

You can download this script directly with this command (the URL is quoted so that the shell does not interpret the ? and & characters):

curl --output match.awk 'http://bsdtips.utcorp.net/mediawiki/index.php?title=Match.awk&action=raw'

Example

Data file

data.txt :

joe:green:red
sam:brown:blue
julie:smith:yellor

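Shell script

This uses 'eval' to accept the variable assignments that awk emitted into the shell's environment. Because the loop sits on the receiving end of a pipe, most Bourne-style shells (FreeBSD's sh included) run it in a subshell, so the variables are only visible inside the loop.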

test.sh :

#!/bin/sh

 awk -v VALUES=1:FIRST,2:LAST,3:COLOR -v COLS=3 -v SEP=":" -f match.awk < data.txt | \
 while read -r ASSN
 do
  # Quote the expansion so the assignments reach eval unmodified.
  eval "$ASSN"
  echo "First name : ${FIRST}, Last name : ${LAST}, Color : ${COLOR}"
 done
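
Since the loop above is fed by a pipe, most Bourne-style shells run it in a subshell, and FIRST, LAST and COLOR vanish when it ends. When only a single record is needed afterwards, one option is command substitution, which evals the assignments in the current shell. The sketch below uses a simplified inline awk program in place of match.awk so that it is self-contained; the function name emit_assignments is an assumption made for illustration.

```shell
#!/bin/sh
# emit_assignments is a hypothetical stand-in for match.awk: it turns the
# first three-field, colon-separated line on stdin into shell assignments.
emit_assignments() {
  awk -F: 'NF == 3 { printf "FIRST=%s; LAST=%s; COLOR=%s\n", $1, $2, $3; exit }'
}

# Command substitution instead of a pipe into a loop: eval runs in the
# current shell, so the variables are still set on the following lines.
eval "$(printf 'joe:green:red\n' | emit_assignments)"

echo "First name : ${FIRST}, Last name : ${LAST}, Color : ${COLOR}"
```

This only suits the one-record case; for a whole file the loop form is still the simpler choice.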

The 'VALUES' list is the essence of the script here. What the list says is this: column 1 will be assigned to FIRST, column 2 to LAST and column 3 to COLOR. COLS says there are 3 columns in the file (lines with fewer or more are ignored) and SEP is the separator character.
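
The COLS check can be tried in isolation. This sketch applies the same split()-count test to three sample lines, only one of which has exactly three fields. The function name keep_valid and the sample lines are invented for the example.

```shell
#!/bin/sh
# keep_valid is a hypothetical helper: it prints only the input lines that
# split into exactly $1 fields on separator $2, as the COLS test does.
keep_valid() {
  awk -v COLS="$1" -v SEP="$2" 'split($0, fields, SEP) == COLS { print "kept: " $0 }'
}

printf 'joe:green:red\nshort:line\nsam:brown:blue:extra\n' | keep_valid 3 ":"
```

Only the first sample line passes; the two-field and four-field lines are dropped silently, just as match.awk drops them.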

Output

./test.sh
First name : joe, Last name : green, Color : red
First name : sam, Last name : brown, Color : blue
First name : julie, Last name : smith, Color : yellor

Discussion

This very simple tip lets you parse files with Awk syntax, which is very well suited to the job, and then use the data from within Bourne shell (which I have always found cumbersome for parsing). The basic underlying mechanism of transmitting variable/value pairs between scripting languages has many more applications than simply parsing data files.
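
One thing to keep in mind when reusing this mechanism: eval executes whatever text it is given, so a value containing spaces, semicolons or quotes would break the assignment or run as a command. A common safeguard is to wrap each value in single quotes before emitting it. The following is a minimal sketch of that idea, not part of the original match.awk; the names emit_quoted and shquote are made up here.

```shell
#!/bin/sh
# emit_quoted is a hypothetical helper: it reads lines from stdin and emits
# each one as a VALUE=... assignment, single-quoted so eval treats it as data.
emit_quoted() {
  awk -v q="'" '
    # Wrap s in single quotes, rewriting each embedded quote as: quote,
    # backslash, quote, quote (the usual sh idiom for a literal quote).
    function shquote(s) {
      gsub(q, q "\\" q q, s)
      return q s q
    }
    { print "VALUE=" shquote($0) }'
}

eval "$(printf '%s\n' "it's here; echo oops" | emit_quoted)"
echo "$VALUE"
```

With the quoting in place, the semicolon and the embedded quote arrive in VALUE as plain text instead of being interpreted by the shell.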
