Conference turris::digital_unix

Title:	DIGITAL UNIX(FORMERLY KNOWN AS DEC OSF/1)
Notice:	Welcome to the Digital UNIX Conference
Moderator:	SMURF::DENHAM

Created:	Thu Mar 16 1995
Last Modified:	Fri Jun 06 1997
Last Successful Update:	Fri Jun 06 1997
Number of topics:	10068
Total number of notes:	35879

8708.0. "awk formatting problem" by NQOS01::16.29.16.102::Pellerin () Tue Feb 04 1997 23:16

I have a customer who is having a problem with an awk script that runs fine 
on Solaris, but causes a formatting problem on record output when the same 
script is run on Digital UNIX 4.0B.

This is a file that comes from a VAX and until now, has been formatted on a 
Solaris machine and loaded into Sybase on the Sun.  They now are 
investigating using Digital UNIX, and are testing the process.  If he runs 
the script on Solaris and loads the database on the Digital UNIX box, all is 
well.  It is only when he runs the script on Digital UNIX and attempts 
loading the database that he encounters the "oversized row" Sybase error.

Obvious question is what is different between the two awks?  He says he is 
using the "nawk" command under Solaris.  He did spell it out, n-a-w-k.  Are 
there many differences between the awk on Solaris and the awk on Digital 
UNIX, and has anyone heard of "nawk"? (man -k nawk yields zip)

Any help will be appreciated, as I am awk challenged.

Regards,

 -BAP

T.R	Title	User	Personal Name	Date	Lines
8708.1	Get gawk and the O'Reilly book.	ZEKE::palium.zko.dec.com::stoddard	Interdum vincit draco!	`Wed Feb 05 1997 10:11`	16
	There are three widely used versions of "awk": plain old "awk", "nawk" and "gawk". "nawk" and "gawk" have some features that are not available in plain "awk". Your customer is probably using one or more of these features. You can download a copy of "gawk" from a GNU archive and try that on the customer's system.

	For a very good description of awk and family, get a copy of "sed & awk" by Dale Dougherty from O'Reilly Press. (ISBN 0-937175-59-5) This book will give you a complete breakdown of the differences between the various flavors of awk.

	Have a GREAT day!
	Pete
8708.2	nawk is now awk...	QUARRY::reeves	Jon Reeves, UNIX compiler group	`Wed Feb 05 1997 12:52`	3
	We used to ship our "new generation" awk separately, as nawk, back in the ULTRIX days. Since then, that awk became our standard awk, so "nawk" went away as a separate command. "gawk" is still a different command.
8708.3	Correction: awk -> oawk (see .4)	QUARRY::reeves	Jon Reeves, UNIX compiler group	`Wed Feb 05 1997 13:10`	5
	I've just been told my last note was unclear. What I meant to say: What used to be called nawk is now called awk. What used to be called awk went away.
8708.4	One more time	RHETT::PARKER		`Wed Feb 05 1997 13:18`	19
	Hi Jon,

	Still not totally clear. Did the old awk really go away or get renamed to oawk? I've got 4 flavors on a 4.0A machine:

	6162 -rwxr-xr-x   2 bin   bin   155648 Aug 19 21:27 /usr/bin/awk
	6242 -rwxr-xr-x   1 bin   bin   221184 Aug 19 21:10 /usr/bin/gawk
	6162 -rwxr-xr-x   2 bin   bin   155648 Aug 19 21:27 /usr/bin/nawk
	6482 -rwxr-xr-x   1 bin   bin   122880 Aug 19 21:02 /usr/bin/oawk

	Where awk and nawk are hard linked.

	Thanks,
	Lee
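The hard-link observation in .4 (awk and nawk share inode 6162 and a link count of 2) can be checked mechanically. The following is a minimal sketch, not something posted in the thread; the function name same_inode is hypothetical, and it assumes both paths live on the same filesystem, since inode numbers are only unique per filesystem.

```shell
#!/bin/sh
# same_inode FILE1 FILE2
# Succeeds (exit 0) when both paths report the same inode number,
# which on a single filesystem means they are hard links to one file.
# Hypothetical helper for illustration only.
same_inode() {
    # "ls -di" prints "<inode> <name>"; keep only the inode column.
    a=$(ls -di "$1" 2>/dev/null | awk '{ print $1 }')
    b=$(ls -di "$2" 2>/dev/null | awk '{ print $1 }')
    # An empty value means ls failed (missing file), so treat as "not same".
    [ -n "$a" ] && [ "$a" = "$b" ]
}
```

On a system laid out like the one in .4, `same_inode /usr/bin/awk /usr/bin/nawk && echo "hard linked"` would be the corresponding check; the paths are taken from the listing above and may differ elsewhere.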
8708.5	Oops.	QUARRY::reeves	Jon Reeves, UNIX compiler group	`Wed Feb 05 1997 13:25`	2
	You're right, awk became oawk. Just can't get rid of stuff around here (coming soon: the DIGITAL UNIX garage sale! Get your old AXP logos!)
8708.6	still confused	NEMAIL::PELLERIN		`Wed Feb 05 1997 23:06`	11
	ok guys, it was fun watching you wax nostalgic, but I still am confused. If our old "nawk" is now "awk", then is it the same as the Solaris "nawk"? After all UNIX is UNIX right? ;') It looks as though I'll have to go in and do the debugging of his script myself. Looks like I'm gonna learn awk. How is the pay for a good awk programmer? Thanks. -BAP
8708.7		VAXCPU::michaud	Jeff Michaud - ObjectBroker	`Thu Feb 06 1997 02:05`	7
	> After all UNIX is UNIX right? ;')

	yes, if you also believe Windows is Windows.

	> How is the pay for a good awk programmer?

	not as good as the pay for a good perl programmer...
8708.8	Wax Sun :-)	RHETT::PARKER		`Thu Feb 06 1997 10:42`	41
	Hi BAP,

	Yes, it appears that our awk(1) is basically the same as nawk(1) on Solaris. Once again, Digital UNIX proves to be ahead of Solaris and much better, of course!!

	From Digital UNIX awk(1) man page:

	    Interfaces documented on this reference page conform to industry standards as follows:

	    awk: XPG4, XPG4-UNIX

	From Solaris (SunOS 5.4) nawk(1) man page:

	    NOTES
	    nawk is a new version of awk that provides capabilities unavailable in previous versions. This version will become the default version of awk in the next major release.

	Sun is just a little behind. :-)

	So, you can use nawk/awk on Digital UNIX and nawk on Solaris - probably interchangeably. Got something you want me to test on both? Is that what you needed to know?

	BTW: Our man page is much more complete than the SunOS man page. And, of course, Digital UNIX is THE MOST standards compliant UNIX in the entire Universe!

	Hope this helps!
	Lee
8708.9	The plot thickens	NQOS01::lexser12.lex.dec.com::Pellerin		`Fri Feb 07 1997 13:43`	23
	Thanks for the responses.

	re: -.1, I have just acquired the following from the customer:

	- Bourne shell script that invokes the awk command in question (the script is identical on Sun and Digital)
	- A small sample of the original file
	- samples of both output files after the sh script is run on Sun and Digital.

	During my visit I learned that the Sybase BCP program that reads in the "awked" file is actually updating the database, but a warning message is displayed on our server (from the complaining BCP program) to the tune of "oversized row...". Since the database actually gets properly updated I predict that there may be an extra character (of no significance) at the beginning or end of the record that is being truncated by BCP - but I do not know BCP and am guessing (from experience with similar data conversion chores I've had...).

	I'll investigate and post. If I need help I'll yell.

	Thanks.
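One way to test the "extra character at the beginning or end of the record" guess in .9 is to tabulate record lengths in each output file and compare the two tables. This is only a sketch under assumptions: the function name record_lengths is hypothetical, and it counts what awk's length() reports, which is bytes or characters depending on the awk and locale.

```shell
#!/bin/sh
# record_lengths FILE
# Prints one line per distinct record length: "<length> <count>",
# sorted by length. A stray leading or trailing character shows up
# as every length being off by one between the two systems' files.
# Hypothetical helper for illustration only.
record_lengths() {
    awk '{ n[length($0)]++ }
         END { for (len in n) print len, n[len] }' "$1" | sort -n
}
```

Running it on the Sun-generated and Digital-generated copies of the output and diffing the results would show whether any rows differ in size. If a length does differ, `od -c` on the first few records is a standard way to see which byte is extra.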
8708.10	printf - %d is the problem?	NQOS01::mko-ras-port-22.mko.dec.com::Pellerin		`Mon Feb 10 1997 08:09`	155
	Ok, I've completed my investigation and awk does not appear to be the main problem. I do a "diff" of my new tested file and the Solaris-outputted file and get no differences.

	HOWEVER - the "printf" statement that creates the records puts a decimal field as the first in every record. I suspect that the %d presents a different sized field than the same %d does in Solaris.

	Please respond with comments, suggestions, etc.

	Thanks.

	-BAP

	The contents of the "nawk" script follows:

	# @(#) customer.nawk 1.13 1/4/94 19:09:52 SSB - Capital Markets
	{
	    while (getline rec) {
	        if (substr(rec, 1, 7) == "TRAILER")
	            exit
	        cust_no = substr(rec, 1, 9)
	        if (cust_no == " ")
	            cust_no = ""
	        cust_sname = substr(rec, 10, 17)
	        if (cust_sname == " ")
	            cust_sname = "UNKNOWN"
	        cust_lname = substr(rec, 27, 35)
	        if (cust_lname == " ")
	            cust_lname = ""
	        cust_sub = substr(rec, 62, 1)
	        if (cust_sub == " ")
	            cust_sub = ""
	        addr2 = substr(rec, 63, 35)
	        if (addr2 == " ")
	            addr2 = ""
	        addr3 = substr(rec, 98, 35)
	        if (addr3 == " ")
	            addr3 = ""
	        addr4 = substr(rec, 133, 35)
	        if (addr4 == " ")
	            addr4 = ""
	        zip = substr(rec, 168, 10)
	        if (zip == " ")
	            zip = ""
	        cust_type_cd = substr(rec, 178, 3)
	        if (cust_type_cd == "999" || cust_type_cd == "00 " || cust_type_cd == " ")
	            cust_type_cd = ""
	        cntry_risk = substr(rec, 181, 3)
	        if (cntry_risk == " ")
	            cntry_risk = ""
	        cntry_res = substr(rec, 184, 3)
	        if (cntry_res == " ")
	            cntry_res = ""
	        invest_man_no = substr(rec, 187, 9)
	        if (invest_man_no == " ")
	            invest_man_no = ""
	        cust_class = substr(rec, 196, 1)
	        if (cust_class == " ")
	            cust_class = ""
	        acct_man_cd = substr(rec, 197, 3)
	        if (acct_man_cd == "999" || acct_man_cd == "00 " || acct_man_cd == " ")
	            acct_man_cd = ""
	        tax_id = substr(rec, 200, 9)
	        if (tax_id == " ")
	            tax_id = ""
	        tax_status = substr(rec, 209, 1)
	        if (tax_status == " ")
	            tax_status = ""
	        parent_no = substr(rec, 210, 9)
	        if (parent_no == " ")
	            parent_no = ""
	        key_accnt = substr(rec, 219, 1)
	        if (key_accnt == " ")
	            key_accnt = "N"
	        cust_status = substr(rec, 220, 1)
	        if(cust_status == " ")
	            cust_status = ""
	        front_off_cd = substr(rec, 221, 10)
	        if(front_off_cd == " ")
	            front_off_cd = ""
	        if ((cust_class == "F")&&(loc == "BOS"))
	            fund_acct_no = substr(rec, 222, 4)
	        else
	            fund_acct_no = ""
	        relationship_cd = substr(rec, 231,1)
	        if(relationship_cd == " ")
	            relationship_cd = ""
	        industry_cd = substr(rec, 232,3)
	        if(industry_cd == " ")
	            industry_cd = ""
	        CUSTID = CUSTID + 1
	        printf ("%d\t%s\t%s\t%s\t%s\t%s\t%s\t%s\t%s\t%s\t%s\t%s\t%s\t%s\t%s\t%s\t%s\t%s\t%s\t%s\t%s\t%s\t%s\t%s\t\t\n",
	            CUSTID, cust_no, cust_sname, cust_sub, cust_lname,
	            addr2, addr3, addr4, zip, cust_type_cd,
	            cntry_risk, cntry_res, invest_man_no, cust_class, acct_man_cd,
	            tax_id, tax_status, parent_no, key_accnt, cust_status,
	            front_off_cd, fund_acct_no, relationship_cd, industry_cd) > "customer_bcp.dat"
	    }
	}
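The %d suspicion in .10 is easy to test in isolation on each machine. The sketch below is not from the thread: the function name show_pct_d is hypothetical, and it assumes an awk that accepts -v (true of POSIX awk, nawk and gawk; the old oawk may not). It formats a number with the same "%d" conversion the script uses for CUSTID and prints the result in brackets along with its length, so any padding would be visible.

```shell
#!/bin/sh
# show_pct_d NUMBER
# Formats NUMBER with awk's "%d" and prints "[<text>] length=<n>".
# If %d added padding on one system, the brackets would show it.
# Hypothetical helper for illustration only.
show_pct_d() {
    awk -v n="$1" 'BEGIN {
        s = sprintf("%d", n + 0)     # n + 0 forces a numeric value
        printf("[%s] length=%d\n", s, length(s))
    }'
}
```

For example, `show_pct_d 1` prints `[1] length=1` on an awk that follows the usual printf rules. Running the same command under nawk on Solaris and awk on Digital UNIX and comparing the two lines would confirm or rule out the hypothesis.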
8708.11	Test methodology suggested	SMURF::BINDER	Errabit quicquid errare potest.	`Mon Feb 10 1997 15:12`	16
	Re .10

	The awk(1) manpage refers the reader to printf(3) for information on how it does formatted printing. According to printf(3), the default precision for %d is 1. This means that exactly the number of digits required to express the input integer will be printed.

	You say you've run diff on "my new tested file and the Solaris-outputted file" - does this mean the files generated by the respective awk commands, or the final files output from the script? If you haven't run diff on the awk outputs, please do so - you can tee the awk output into a test file somewhere and then save it. Run diff, and then also run the sum command and the ls -l command to verify that the files are exactly exactly the same. :-)

	-dick
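The comparison suggested in .11 can be wrapped in a small script. This is a sketch under assumptions rather than the method anyone in the thread used: the function name compare_outputs is hypothetical, cmp and cksum stand in for diff and sum (cksum output is specified by POSIX, whereas sum output varies between systems), and wc -c replaces reading the size from ls -l.

```shell
#!/bin/sh
# compare_outputs FILE1 FILE2
# First line of output is "identical" or "different" (byte-for-byte,
# via cmp). Then one line per file with its size and CRC checksum.
# Hypothetical helper for illustration only.
compare_outputs() {
    if cmp -s "$1" "$2"; then
        echo "identical"
    else
        echo "different"
    fi
    for f in "$1" "$2"; do
        size=$(wc -c < "$f" | tr -d ' ')
        crc=$(cksum < "$f" | awk '{ print $1 }')
        printf '%s: %s bytes, cksum %s\n' "$f" "$size" "$crc"
    done
}
```

Since the script in .10 writes directly to customer_bcp.dat rather than to standard output, that file is the awk output to save from each system; copying it under two different names (the names are up to you) and passing both to compare_outputs gives the byte-level answer.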