Conference turris::digital_unix

Title: DIGITAL UNIX (FORMERLY KNOWN AS DEC OSF/1)
Notice: Welcome to the Digital UNIX Conference
Moderator: SMURF::DENHAM
Created: Thu Mar 16 1995
Last Modified: Fri Jun 06 1997
Last Successful Update: Fri Jun 06 1997
Number of topics: 10068
Total number of notes: 35879

8708.0. "awk formatting problem" by NQOS01::16.29.16.102::Pellerin () Tue Feb 04 1997 23:16

I have a customer who is having a problem with an awk script that runs fine 
on Solaris, but causes a formatting problem on record output when the same 
script is run on Digital UNIX 4.0B.

This is a file that comes from a VAX and until now, has been formatted on a 
Solaris machine and loaded into Sybase on the Sun.  They now are 
investigating using Digital UNIX, and are testing the process.  If he runs 
the script on Solaris and loads the database on the Digital UNIX box, all is 
well.  It is only when he runs the script on Digital UNIX and attempts 
loading the database that he encounters the "oversized row" Sybase error.

The obvious question is what is different between the two awks.  He says he is 
using the "nawk" command under Solaris.  He did spell it out, n-a-w-k.  Are 
there many differences between the awk on Solaris and the awk on Digital 
UNIX, and has anyone heard of "nawk"? (man -k nawk yields zip)

Any help will be appreciated, as I am awk challenged.

Regards,

 -BAP
T.R  Title  User  Personal Name  Date  Lines
8708.1. "Get gawk and the O'Reilly book." by ZEKE::palium.zko.dec.com::stoddard (Interdum vincit draco!) Wed Feb 05 1997 10:11 (16 lines)
	There are three widely used versions of "awk":  plain old
	"awk", "nawk" and "gawk".  "nawk" and "gawk" have some
	features that are not available in plain "awk".  Your 
	customer is probably using one or more of these features.
	You can download a copy of "gawk" from a GNU archive and
	try that on the customer's system.

	For a very good description of awk and family, get a copy
	of "sed & awk" by Dale Dougherty from O'Reilly Press.
	(ISBN 0-937175-59-5)  This book will give you a complete
	breakdown of the differences between the various flavors
	of awk.

	Have a GREAT day!
	Pete

8708.2. "nawk is now awk..." by QUARRY::reeves (Jon Reeves, UNIX compiler group) Wed Feb 05 1997 12:52 (3 lines)
We used to ship our "new generation" awk separately, as nawk, back in the
ULTRIX days.  Since then, that awk became our standard awk, so "nawk" went
away as a separate command.  "gawk" is still a different command.
8708.3. "Correction: awk -> oawk (see .4)" by QUARRY::reeves (Jon Reeves, UNIX compiler group) Wed Feb 05 1997 13:10 (5 lines)
I've just been told my last note was unclear.

What I meant to say:
What used to be called nawk is now called awk.
What used to be called awk went away.
8708.4. "One more time" by RHETT::PARKER () Wed Feb 05 1997 13:18 (19 lines)
    
    Hi Jon,
    
    Still not totally clear. Did the old awk really go away or
    get renamed to oawk?
    
    I've got 4 flavors on a 4.0A machine:
    
     6162 -rwxr-xr-x   2 bin      bin       155648 Aug 19 21:27 /usr/bin/awk
     6242 -rwxr-xr-x   1 bin      bin       221184 Aug 19 21:10 /usr/bin/gawk
     6162 -rwxr-xr-x   2 bin      bin       155648 Aug 19 21:27 /usr/bin/nawk
     6482 -rwxr-xr-x   1 bin      bin       122880 Aug 19 21:02 /usr/bin/oawk
    
    Where awk and nawk are hard linked.
    
    Thanks, 
    
    Lee
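
The hard-link observation above can be checked directly: two names for one file report the same inode number, which is the first column of ls -i. A minimal sketch follows. The names inode_of and same_inode are hypothetical helpers, not system commands, and the comparison looks at inode numbers only, so it assumes both paths are on the same filesystem.

```shell
#!/bin/sh
# inode_of FILE: print the inode number that ls -i reports for FILE.
inode_of() {
    ls -i "$1" | awk '{ print $1 }'
}

# same_inode A B: succeed when both files exist and report the same inode
# number. Assumes both paths live on the same filesystem.
same_inode() {
    [ -e "$1" ] && [ -e "$2" ] &&
        [ "$(inode_of "$1")" = "$(inode_of "$2")" ]
}

# Example (paths taken from the listing above; they may not exist everywhere):
#   same_inode /usr/bin/awk /usr/bin/nawk && echo "awk and nawk are one file"
```

A link count of 2 in the ls -l column, as in the listing, is a hint; matching inode numbers confirm which two names are the pair.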
    
8708.5. "Oops." by QUARRY::reeves (Jon Reeves, UNIX compiler group) Wed Feb 05 1997 13:25 (2 lines)
You're right, awk became oawk.  Just can't get rid of stuff around here
(coming soon: the DIGITAL UNIX garage sale!  Get your old AXP logos!)
8708.6. "still confused" by NEMAIL::PELLERIN () Wed Feb 05 1997 23:06 (11 lines)
    ok guys, it was fun watching you wax nostalgic, but I still am
    confused.  If our old "nawk" is now "awk", then is it the same as the
    Solaris "nawk"?  After all UNIX is UNIX right? ;')    
    
    It looks as though I'll have to go in and do the debugging of his
    script myself.  Looks like I'm gonna learn awk.  How is the pay for a
    good awk programmer?
    
    Thanks.
    
     -BAP 
8708.7. "" by VAXCPU::michaud (Jeff Michaud - ObjectBroker) Thu Feb 06 1997 02:05 (7 lines)
> After all UNIX is UNIX right? ;')    

	yes, if you also believe Windows is Windows.

> How is the pay for a good awk programmer?

	not as good as the pay for a good perl programmer...
8708.8. "Wax Sun :-)" by RHETT::PARKER () Thu Feb 06 1997 10:42 (41 lines)
    
    Hi BAP, 
    
    Yes, it appears that our awk(1) is basically the same as nawk(1)
    on Solaris. 
    
    Once again, Digital UNIX proves to be ahead of Solaris and much
    better, of course!!
    
    From Digital UNIX awk(1) man page:
    
      Interfaces documented on this reference page conform to industry standards
      as follows:
    
      awk:  XPG4, XPG4-UNIX
    
    From Solaris (SunOS 5.4) nawk(1) man page:
    
    NOTES
         nawk is a new version of awk that provides capabilities una-
         vailable in previous versions.  This version will become the
         default version of awk in the next major release.
    
    
    Sun is just a little behind. :-) 
    
    So, you can use nawk/awk on Digital UNIX and nawk on Solaris -
    probably interchangeably. Got something you want me to test
    on both? 
    
    Is that what you needed to know? BTW: Our man page is much more
    complete than the SunOS man page. 
    
    And, of course, Digital UNIX is THE MOST standards compliant
    UNIX in the entire Universe! 
    
    Hope this helps!
    
    Lee
    
          
8708.9. "The plot thickens" by NQOS01::lexser12.lex.dec.com::Pellerin () Fri Feb 07 1997 13:43 (23 lines)
Thanks for the responses.  re: -.1, I have just acquired the following from 
the customer:

- Bourne shell script that invokes the awk command in question (the script is 
identical on Sun and Digital)
- A small sample of the original file
- samples of both output files after the sh script is run on Sun and Digital. 
 
During my visit I learned that the Sybase BCP program that reads in the 
"awked" file is actually updating the database, but a warning message is 
displayed on our server (from the complaining BCP program) to the tune of 
"oversized row...".

Since the database actually gets properly updated, I suspect that there may be 
an extra character (of no significance) at the beginning or end of the record 
that is being truncated by BCP - but I do not know BCP and am guessing (from 
experience with similar data conversion chores I've had...).

I'll investigate and post.  If I need help I'll yell.

Thanks.
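
One way to look for a stray character at the end of a record is to let awk report any line whose tab-separated field count is unexpected or that ends in a carriage return. This is only a sketch under stated assumptions: check_rows is a hypothetical helper name, the expected field count is something you supply, and nothing here is specific to BCP.

```shell
#!/bin/sh
# check_rows FILE NFIELDS: print "line-number field-count length" for each
# line whose tab-separated field count differs from NFIELDS, or which ends
# in a carriage return. Prints nothing when every line looks as expected.
check_rows() {
    awk -F '\t' -v want="$2" '
        NF != want || /\r$/ { printf("%d %d %d\n", NR, NF, length($0)) }
    ' "$1"
}

# Example (file name and count are placeholders):
#   check_rows customer_bcp.dat 26
```

Running the same check on the Sun and Digital output files and comparing the reports narrows down which records differ; od -c on a single suspect line then shows the exact bytes.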


8708.10. "printf - %d is the problem?" by NQOS01::mko-ras-port-22.mko.dec.com::Pellerin () Mon Feb 10 1997 08:09 (155 lines)
Ok, I've completed my investigation and awk does not appear to be the main 
problem.  I do a "diff" of my new tested file and the Solaris-outputted file 
and get no differences.

HOWEVER - the "printf" statement that creates the records puts a decimal 
field as the first in every record.  I suspect that the %d presents a 
different sized field than the same %d does in Solaris.  

Please respond with comments, suggestions, etc.  

Thanks.  -BAP

The contents of the "nawk" script follows:

# @(#) customer.nawk 1.13 1/4/94 19:09:52 SSB - Capital Markets 
{
while (getline rec)
	{
	if (substr(rec, 1, 7) == "TRAILER")
		exit

	cust_no = substr(rec, 1, 9)

	if (cust_no == "         ")
		cust_no = ""

	cust_sname = substr(rec, 10, 17)

	if (cust_sname == "                 ")
		cust_sname = "UNKNOWN"

	cust_lname = substr(rec, 27, 35)

	if (cust_lname == "                                   ")
		cust_lname = ""

	cust_sub = substr(rec, 62, 1)
	if (cust_sub == " ")
		cust_sub = ""

	addr2 = substr(rec, 63, 35)

	if (addr2 == "                                   ")
		addr2 = ""

	addr3 = substr(rec, 98, 35)

	if (addr3 == "                                   ")
		addr3 = ""

	addr4 = substr(rec, 133, 35)

	if (addr4 == "                                   ")
		addr4 = ""

	zip = substr(rec, 168, 10)

	if (zip == "          ")
		zip = ""

	cust_type_cd = substr(rec, 178, 3)

	if (cust_type_cd == "999" || cust_type_cd == "00 " ||
		cust_type_cd == "   ")
		cust_type_cd = ""

	cntry_risk = substr(rec, 181, 3)

	if (cntry_risk == "   ")
		cntry_risk = ""

	cntry_res = substr(rec, 184, 3)

	if (cntry_res == "   ")
		cntry_res = ""

	invest_man_no = substr(rec, 187, 9)

	if (invest_man_no == "         ")
		invest_man_no = ""

	cust_class = substr(rec, 196, 1)

	if (cust_class == " ")
		cust_class = ""

	acct_man_cd = substr(rec, 197, 3)

	if (acct_man_cd == "999" || acct_man_cd == "00 " ||
		acct_man_cd == "   ")
		acct_man_cd = ""

	tax_id = substr(rec, 200, 9)

	if (tax_id == "         ")
		tax_id = ""

	tax_status = substr(rec, 209, 1)

	if (tax_status == " ")
		tax_status = ""

	parent_no = substr(rec, 210, 9)

	if (parent_no == "         ")
		parent_no = ""

	key_accnt = substr(rec, 219, 1)

	if (key_accnt == " ")
		key_accnt = "N"

	cust_status = substr(rec, 220, 1)

	if(cust_status == " ")
		cust_status = ""

	front_off_cd = substr(rec, 221, 10)

	if(front_off_cd == "          ")
		front_off_cd = ""

	if ((cust_class == "F")&&(loc == "BOS")) 
		fund_acct_no = substr(rec, 222, 4)
	else
		fund_acct_no = ""
        relationship_cd = substr(rec, 231,1)

        if(relationship_cd == " ")

		relationship_cd = ""

        industry_cd = substr(rec, 232,3)

        if(industry_cd == " ")

		industry_cd = ""



	CUSTID = CUSTID + 1

	printf 
("%d\t%s\t%s\t%s\t%s\t%s\t%s\t%s\t%s\t%s\t%s\t%s\t%s\t%s\t%s\t%s\t%s\t%s\t%s\
t%s\t%s\t%s\t%s\t%s\t\t\n",
	CUSTID, cust_no, cust_sname, cust_sub, cust_lname, addr2, addr3,
	addr4, zip, cust_type_cd, cntry_risk, cntry_res, invest_man_no,
	cust_class, acct_man_cd, tax_id, tax_status, parent_no,
	key_accnt, cust_status, front_off_cd, fund_acct_no, relationship_cd, 
	industry_cd) > "customer_bcp.dat"
	}
}



8708.11. "Test methodology suggested" by SMURF::BINDER (Errabit quicquid errare potest.) Mon Feb 10 1997 15:12 (16 lines)
    Re .10
    
    The awk(1) manpage refers the reader to printf(3) for information on
    how it does formatted printing.  According to printf(3), the default
    precision for %d is 1, which is a minimum digit count.  With no field
    width given, exactly the digits needed to express the integer are
    printed, with no padding.
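
The %d behaviour described here is easy to confirm on each system. The sketch below brackets the output so that any padding would be visible, and prints a width of 5 and a precision of 3 alongside for contrast. show_d is a hypothetical helper name; the command name awk is an assumption, so substitute nawk on Solaris.

```shell
#!/bin/sh
# show_d N: print N with plain %d, with a field width of 5, and with a
# precision of 3, each inside brackets so padding is visible.
show_d() {
    awk -v n="$1" 'BEGIN { printf("[%d] [%5d] [%.3d]\n", n, n, n) }'
}

show_d 7        # prints: [7] [    7] [007]
show_d 12345    # prints: [12345] [12345] [12345]
```

If both systems print the same first column for the same input, the bare %d in the customer's script is unlikely to account for a difference in row size.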
    
    You say you've run diff on "my new tested file and the
    Solaris-outputted file" - does this mean the files generated by the
    respective awk commands, or the final files output from the script?  If
    you haven't run diff on the awk outputs, please do so - you can tee the
    awk output into a test file somewhere and then save it.  Run diff, and
    then also run the sum command and the ls -l command to verify that the
    files are exactly exactly the same.  :-)
    
    -dick
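
The comparison suggested above can be bundled into one small function. This is a sketch, not the procedure from the note itself: it swaps in cmp -s for diff (a byte-for-byte test that does not care about line structure) and cksum for sum, and compare_outputs is a hypothetical name.

```shell
#!/bin/sh
# compare_outputs A B: the first output line is "identical" or "different";
# the following lines give the byte count and checksum of each file.
compare_outputs() {
    if cmp -s "$1" "$2"; then
        echo "identical"
    else
        echo "different"
    fi
    for f in "$1" "$2"; do
        wc -c < "$f"
        cksum < "$f"
    done
}

# Example (file names are placeholders for the two awk outputs):
#   compare_outputs sun_out.dat digital_out.dat
```

When the result is "different", running cmp without -s on the same pair reports the position of the first differing byte.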