COPYRIGHT (C) 1984-2021 MERRILL CONSULTANTS DALLAS TEXAS USA
MXG NEWSLETTER FORTY-TWO
****************NEWSLETTER FORTY-TWO************************************
MXG NEWSLETTER NUMBER FORTY-TWO - Feb 7, 2003
Technical Newsletter for Users of MXG : Merrill's Expanded Guide to CPE
TABLE OF CONTENTS
I. MXG Software Version.
II. MXG Technical Notes
III. MVS Technical Notes
IV. DB2 Technical Notes.
V. IMS Technical Notes.
VI. SAS Technical Notes.
VII. CICS Technical Notes.
VIII. Windows NT Technical Notes.
IX. Incompatibilities and Installation of MXG.
See member CHANGES and member INSTALL.
X. Online Documentation of MXG Software.
See member DOCUMENT.
XI. Changes Log
Alphabetical list of important changes
Highlights of Changes - See Member CHANGES.
I. MXG Software Version VV.RR is now available.
1. Major enhancements added in MXG VV.RR:
See CHANGES.
II. MXG Technical Notes
2. I want BUILDPDB to ABEND if there are no SMF records to read.
This step can be inserted ahead of the include of BUILDPDB:
DATA _NULL_;
IF END AND NREC=. THEN DO;
PUT ' NO SMF RECORDS FOUND. JOB ABENDED WITH USER 99';
ABORT ABEND 99;
END;
INFILE SMF END=END;
INPUT;
NREC=1;
STOP;
and it will either read one record or ABEND the job.
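The same guard pattern, sketched in Python purely for illustration (the file path and exit code mirror the SAS step above; nothing here is MXG code):

```python
import os
import sys

def require_nonempty(path):
    """Abort with a nonzero return code if the input file has no data,
    mirroring the ABORT ABEND 99 guard in the SAS step above."""
    if os.path.getsize(path) == 0:
        print(' NO SMF RECORDS FOUND. JOB ABENDED WITH USER 99')
        sys.exit(99)
```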
1. How do you rebuild an old daily PDB?
BUILDPDB has two inputs: today's SMF records, and yesterday's SPIN
data library of SPINxxxx datasets containing incomplete jobs.
BUILDPDB has two outputs: today's PDB library, and the updated SPIN
library of incomplete jobs, to be used as tomorrow's input SPIN.
The SPIN architecture only affects these job-related datasets:
PDB.JOBS/PDB.STEPS/PDB.PRINT/PDB.NJEPURGE/PDB.SPUNJOBS
built from the SMF 6, 26, and 30 records. The parameter SPINCNT
that you chose in your IMACSPIN member controls how long jobs are
held in the SPIN library.
All other data sets in the PDB are created directly from the input
SMF file and are not affected by the contents of the SPIN library.
To recreate a daily PDB, the SPIN library must be restored to be as
it was before that day's BUILDPDB job ran.
a. If the day to be recreated is within the last seven days:
BUILDPDB backs up all of the SPINxxxx datasets into that day's PDB
library for your backup, so you can restore the SPIN library from
that previous day's PDB library and use it for your BUILDPDB job:
// EXEC MXGSAS
//PDBPREV DD DSN=PREVDAY.PDB.LIBRARY,DISP=SHR
//SPIN DD DSN=SPINNEW,DISP=(,CATLG),UNIT=SYSDA....
//SMF DD DSN=THAT.DAYS.SMF.FILE,DISP=SHR
//PDB DD DSN=PDB.TO.CREATE,DISP=(,CATLG),UNIT=....
//SYSIN DD *
PROC COPY IN=PDBPREV OUT=SPIN; SELECT SPIN:;
%INCLUDE SOURCLIB(BUILDPDB);
If you were restoring forward from this day, you would run this job
on the first day, and would then use the DSN=SPINNEW as your new
SPIN library. If you were just recreating one day's PDB, you would
delete the SPINNEW dataset after the recreate.
b. If the day to be recreated is earlier than seven days ago:
You are in a world of hurt, unless you wisely back up your SPIN
dataset at the end of each day's BUILDPDB job, or you wisely
backed up each day's PDB library after BUILDPDB has executed.
If you do not have a backup of your SPIN library from the previous
BUILDPDB execution, and you need to rebuild the JOBS, STEPS and
PRINT datasets for billing accuracy, then you must first create and
repopulate a SPIN library, by reading each of the previous SPINCNT
day's SMF files, one step at a time. These SPIN-rebuild steps read
only the SMF 6, 26, and 30 records and execute only the first and
fifth phases of the BUILDPDB logic, for speed and minimum space.
For example, if SPINCNT is 30, you would go back to the 31st day
before the day of interest, and execute these 31 steps:
//W31 EXEC MXGSAS
//SPIN DD DSN=SPINNEW,DISP=(,CATLG),UNIT=SYSDA....
//PDB DD UNIT=SYSDA,SPACE=(CYL,(1000,1000)),DISP=(,DELETE)
//SMF DD DSN=SMFDAY(-30),DISP=SHR
//SYSIN DD *
%LET MACFILE= %QUOTE( IF ID IN (6,26,30); ) ;
%INCLUDE SOURCLIB(BUILD001,BUILD005);
//W30 EXEC MXGSAS
//SPIN DD DSN=SPINNEW,DISP=OLD
//PDB DD UNIT=SYSDA,SPACE=(CYL,(1000,1000)),DISP=(,DELETE)
//SMF DD DSN=SMFDAY(-29),DISP=SHR
//SYSIN DD *
%LET MACFILE= %QUOTE( IF ID IN (6,26,30); ) ;
%INCLUDE SOURCLIB(BUILD001,BUILD005);
... steps for SMF(-28) thru SMF(-1)
//W0 EXEC MXGSAS
//SPIN DD DSN=SPINNEW,DISP=OLD
//PDB DD UNIT=SYSDA,SPACE=(CYL,(1000,1000)),DISP=(,DELETE)
//SMF DD DSN=SMFDAY(0),DISP=SHR
//SYSIN DD *
%LET MACFILE= %QUOTE( IF ID IN (6,26,30); ) ;
%INCLUDE SOURCLIB(BUILD001,BUILD005);
and you would then have the SPIN library recreated to use for the
input to actually create the day's PDB library.
If SPINCNT is N, we must go back N+1 days rather than just N days.
Jobs that were in SPIN before the first rebuilt day were output
into a daily PDB when their matching SMF records were read; since
we don't have those old SPIN records, those jobs will not match up,
and they would have remained in our SPIN if we executed only
SPINCNT times. By running SPINCNT+1 times, the SMF records we read
that will never match up will have been removed from the SPIN
library, so we will have exactly repopulated the SPIN as it existed.
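The SPINCNT+1 requirement can be demonstrated with a small Python sketch; this is a toy aging model of the SPIN purge rule only, not MXG's actual job-matching logic:

```python
def rebuild_spin(daily_unmatched_jobs, spincnt):
    """Toy model of the SPIN purge rule: an incomplete job's records stay
    in SPIN for at most SPINCNT days, then are flushed (to PDB.SPUNJOBS)
    and removed.  daily_unmatched_jobs is one list per rebuild day of job
    ids whose records will never match up.  Returns final SPIN contents."""
    spin = {}                                  # job id -> days spent in SPIN
    for todays_jobs in daily_unmatched_jobs:   # one rebuild step per day
        # age everything already in SPIN; purge anything past SPINCNT days
        spin = {job: age + 1 for job, age in spin.items() if age + 1 <= spincnt}
        for job in todays_jobs:                # today's unmatched records
            spin[job] = 0
    return set(spin)
```

With SPINCNT=2, a never-matching job read on the first rebuilt day survives two further replayed days but is purged on the third, which is why the rebuild must start SPINCNT+1 days before the day of interest.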
c. And what if you no longer have the daily SMF files, but instead
now have only a weekly file of SMF records that was built by the
concatenation of those daily SMF files?
Perfect accuracy gets complicated.
III. MVS Technical Notes.
33. APARs OW55729 and OW55902 are HIPER fixes, dealing with MCCAFCTH
thresholds; OW55729 suggests as a circumvention, setting:
         Total Real Size    Threshold minimum
             <2 GB          MCCAFCTH=(350,400)
             2-6 GB         MCCAFCTH=(2000,2500)
             >6 GB          MCCAFCTH=(5000,6000)
Without the APARs, zero values for AFC were seen on z/OS 1.3.
In z900 architecture with all real storage, UIC has not been
found to be a useful indicator of storage health; the AFC
(Available Frame Count) seems to better show storage health,
dropping faster and sooner than the UIC drops under stress.
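The circumvention table above amounts to a simple lookup; a Python sketch for illustration (the boundary handling at exactly 2 GB and 6 GB is my assumption, since the APAR table only gives ranges):

```python
def mccafcth_circumvention(real_storage_gb):
    """Return the (low, ok) MCCAFCTH thresholds suggested as a
    circumvention by APAR OW55729, keyed by total real storage size."""
    if real_storage_gb < 2:
        return (350, 400)
    if real_storage_gb <= 6:          # assumed inclusive upper bound
        return (2000, 2500)
    return (5000, 6000)
```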
32. Multiple periods for Reporting Classes was added in z/OS 1.2.
See the discussion in Juergen Holtz's paper "z/OS Workload Manager,
The Latest & Greatest", SHARE San Francisco, Session 2515, at
http://proceedings.share.org/sanfrancisco/
31. Benchmark of cost of ASMTAPES/MXGTMNT ML-27 at 2, .5, and .1 sec:
The high CPU cost of ML-20 thru ML-26 was finally confirmed to be due
to IBM recovery after the 0E0 ABEND when Cross Memory Services finds
the address space is gone. One of the beauties of MXGTMNT was that
it captured the READTIME of the JOB in that XMEM call, plus some data
that was not available elsewhere, but since IBM cannot provide a way
to know that an ASID is gone, we simply cannot afford to use XMEM for
monitoring mount durations.  The redesign of MXGTMNT in ML-27 turns
off XMEM (except for step transition detection), which not only
drops the CPU cost of the monitor so that it is no longer an issue,
but also captures almost all allocations and many more of those very
fast VTS mounts, without spending elapsed time in those recovery
modules.
Some of the improvement in capture accuracy is due to corrections
of the earlier designs; at the first UCB with an 0E0, ML-19 and
earlier levels terminated that interval's UCB scan, never looking
at the rest of the UCBs; if you had tape devices with addresses
above your VTS devices, MXGTMNT probably never saw them.
Billy Westland, at Kaiser Permanente, built a job stream that created
thousands of very fast DISP=NEW VTS mounts writing one block of data,
running ML-27 at 2, 0.5, and 0.1 seconds sampling interval. The test
drove the 64 VTS UCBs to an AVERAGE of 4.4 mounts per second (note:
not seconds per mount!), and there were many instances of 12-15 tape
mounts in one second, a far higher rate than you are likely ever to
see in a real data center.
Chuck Hopf ran ML-27 at 1/2-second samples on a system with 344 tape
devices, but all were offline, so his previously reported CPU cost of
MXGTMNT of only one second per hour shows only the cost of finding
nothing.
But the below benchmarks do show that the CPU cost of MXGTMNT is
driven primarily by the number of mounts and allocations that are
seen by sampling, and not the scanning of the UCBs itself.
RAW DATA FROM ML-27 WITH XMEM VERY-FAST-VTS-MOUNTS BENCHMARK:
          ==SMF Record Count==  ==MXGTMNT==   Seconds  Mounts
  Sample   Type                  CPU    CPU    with a    per
  INTRVL    21    TALO   TMNT   secs  per hr   mount   second
    2.00   2621    769     93   3.84    14       605    4.3
     .50   6379   5892    855  13.54    22      1445    4.4
     .10   2651   2666    776  44.02    74       656    4.1
     .50      0      0      0   1        0      NONE   ZERO
The table shows that with this unrealistically high tape mount rate
of over 4 mounts per second, the hourly CPU cost of MXGTMNT ranges
from 14 to 74 seconds per hour as sampling interval is changed; you
will see much less CPU time per hour in your real environment.
Note that at 2 seconds, we captured only 30% of the allocations and
only 4% of the mounts; at 1/2 second, we captured 92% of the
allocations and 13% of these mounts; and at a 1/10-second interval,
we saw 100% of the allocates and 30% of these very fast mounts.
The elapsed times of these mounting steps were as small as 180
milliseconds, yet we still saw the vast majority of them.
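Those capture percentages can be rechecked directly from the table's record counts; a quick Python sketch of the arithmetic (TALO and TMNT counts divided by the SMF 21 mount count for the same run):

```python
def capture_pct(captured, total):
    """Percent of the SMF-21-counted events that MXGTMNT sampling saw,
    capped at 100 to absorb sampling jitter (as in the 1/10-second run,
    where TALO slightly exceeded the SMF 21 count)."""
    return min(100, round(100.0 * captured / total))

# (interval, SMF 21 count, TALO count, TMNT count) rows from the table
rows = [(2.00, 2621, 769, 93), (0.50, 6379, 5892, 855), (0.10, 2651, 2666, 776)]
for interval, smf21, talo, tmnt in rows:
    print(interval, capture_pct(talo, smf21), capture_pct(tmnt, smf21))
```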
30. APAR PQ69884 for TCP/IP V3 SMF 119 record reports invalid byte
count for failed login attempts.
29. APAR OW44949 corrects loss of TYPE 21 (Tape Dismount) SMF records
when there is a long busy timeout during rewinding of a tape on 3590
devices, and excessive OBRTMP records are created.
28. ObjectStar (HURON, TYPEHURN) SMF records are created in error with
PUT03. The vendor's fix (AP09925) resolved the bad SMF records,
which caused INPUT STATEMENT EXCEEDED RECORD LENGTH error in MXG.
27. For the FICON Director, these RMF APARs/PTFs are required;
otherwise, all of the FICON Director measurements are zeroes:
OS/390 R2.10 to z/OS 1.1: OW46170, OW46825, OW48950, OW57147
z/OS 1.2: OW52396
26. APAR PQ67187 for TCP/IP V3 SMF 119 record ("IBM Communications
Server for z/OS Version 1 Release 2 and 4 IP") corrects an error in
the SMF 118 output byte count, which was sometimes 'FFFFFFFF'x (a -1
if input with IB4, but MXG intentionally inputs with PIB4, giving a
value of 4,294,967,295 that won't be accidentally overlooked!).
25. APAR OW56033, for "NPM for TCPIP" SMF records (TYPENPIP in MXG),
reports that duplicate records were created by the AEST044 program
called by the AESTNETS started task; the PTF corrects that error.
MXG Change 20.070 reported the duplicates and added code to delete
them, so no new change is required to support this APAR.
24. Information APAR II10752 discusses ICF Catalog Performance Issues,
causes of high CPU time in the CAS address space, and suggests many
parameter values that will help or hinder catalog access.
Quite a bit of tutorial information is presented.
23. APAR PQ68057, SMF 119 FTP records, corrects missing local port and
IP address for the data connection, in both server and client data.
22. APAR PQ68360 for SMF 118 TCP records documents that the DSNAME is
blank if an ftp "GET" is done for a file and the local file is a
DDNAME. Temporary fix: specify the datasetname explicitly instead
of using a DD card; the error is "FIN" - "Fixed in Next".
21. APAR PQ57651 corrects RMF SMF 79 Subtype 15 (IMS Long Lock) to
populate R76FRSNA, the database name involved in the lock.
20. PROC SYNCSORT Early Warning EW5504-1 corrects their WER224A ERROR
if that product is used with SAS V8.2; note this is the additional
product PROC SYNCSORT Release 2.2C, and not for the base SYNCSORT.
19. Error message IOS050I CHANNEL DETECTED ERROR, Interface Control
Checks, and WER061A I/O ERR MISCP were due to missing IBM patches
in their microcode; the error occurred on SHARK DASD connected to
FICON, and a circumvention to set new work volumes to 'DISNEW' in
SMS (so the job must then use non-FICON work packs) worked until
IBM provided the patches (that were missed by IBM's install team!).
18. APAR OW56112 for JES3 Type 25 corrects count of scratch volumes.
17. APAR OW55509 provides new function to update WLM's calculation of
the 4-hour rolling average MSU even when no capacity limit has been
defined. Also, at IPL the 4-hour rolling average had no history
and was calculated over too short a time period, which has caused
the system to be capped during IPL, so now all 48 entries are
initialized with no load (one uncapped service unit per 5-minute
interval, as close to zero as you can get), so the average that
controls at IPL time is that of an unloaded system.
Note that MXG's MSU calculations keep the prior history and will
report the true rolling average across IPLs.
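The 48-bucket initialization described above can be sketched as follows; this is a simplified model of the rolling-average mechanics only, not WLM's actual code:

```python
from collections import deque

class RollingMSU:
    """48 five-minute buckets = 4 hours.  Per APAR OW55509, at IPL every
    bucket starts at one uncapped service unit, so the controlling
    average at IPL time is that of an unloaded system."""
    def __init__(self, buckets=48):
        self.window = deque([1] * buckets, maxlen=buckets)

    def record_interval(self, service_units):
        self.window.append(service_units)   # oldest 5-minute bucket drops off

    def rolling_avg(self):
        return sum(self.window) / len(self.window)
```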
16. JARS truncates SMF records; a site converting from JARS to MXG had
INPUT STATEMENT EXCEEDED errors on an SMF 110-2 with LENGTH=32718,
because their JARS SMF copy/dump job's JCL had LRECL=32722, which
truncates any records with LRECL 32723-32760. JARS didn't fail; it
doesn't use any of the long records (110-2, 74, etc.); its copy just
destroyed them. As has been stated since day one of MXG:
For SMF files: ALWAYS use RECFM=VBS,LRECL=32760,BLKSIZE=0
to read or write, and never use a smaller LRECL, for SMF VBS files.
(With SMS, BLKSIZE=0 creates 27998 on DASD, 32760 on tape)
And, that same DCB will read or write ANY MVS V, VB, or VBS file,
since V and VB on MVS are subsets of VBS (all three have a 4-byte
BDW and 4-byte RDWs; V has one record per block, VB packs multiple
records per block, and VBS additionally uses the third byte of each
RDW for the spanning bits that let a record span blocks).
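That BDW/RDW layout can be demonstrated with a small parser sketch in Python (an illustration only: names are mine, and a real SMF file would additionally need EBCDIC and spanned-segment reassembly handling):

```python
import struct

def split_vb_block(block):
    """Split one MVS V/VB/VBS block into (span_indicator, data) records.
    The block starts with a 4-byte BDW: a big-endian halfword block length
    that includes the BDW itself, then two zero bytes.  Each record starts
    with a 4-byte RDW: a halfword record length including the RDW, then
    the span-indicator byte used by VBS, then a zero byte."""
    blklen = struct.unpack('>H', block[0:2])[0]
    assert blklen == len(block), 'BDW length must cover the whole block'
    records, pos = [], 4                       # skip past the BDW
    while pos < blklen:
        reclen, span = struct.unpack('>HB', block[pos:pos + 3])
        records.append((span, block[pos + 4:pos + reclen]))  # data after RDW
        pos += reclen
    return records
```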
Okay, what about LRECL=32767 instead of LRECL=32760 for VBS?
Yes, MVS currently does allow you to create a VBS file with 32767
LRECL, and SAS does support that value, but many other software
products fail with 32767. The documented design maximum LRECL for
SMF has always been 32760 (i.e., 32756 bytes of actual data, plus 4
bytes of RDW), and IBM has issued APAR/PTFs to acknowledge/correct
cases when an SMF developer failed to obey that design maximum of
32760. Since there is little value in the extra 7 bytes, and lots
of opportunity for I/O errors, 32760 remains the value of choice.
This might change if the current hardware/software constraints of
two-byte lengths in software and physical track length in hardware
routines are revised to fully support longer blocks and records.
15. IBM APAR OW57219 has been opened to correct errors in counts of the
in-flight batch delay samples (they included ALL of the batch delay
samples). The impossible values are set missing by Change 19.198
to circumvent the IBM error, but that change does not need to be
removed when IBM puts the correct values in the record.
14. A posting from Martin Packer: VIO is a Data in Memory (DIM)
technique that has been shown to use more CPU time. Obviously,
it's better if the VIO is done entirely to memory, but it is one of
those techniques that does require that additional cycles be used.
13. How can you measure delays to jobs due to Tape Allocation Recovery?
Tape Allocation Recovery cannot be measured from tape mount data:
NOTE: NO LONGER TRUE; ASMTAPES AT ML-29 (MXG 21.04+) NOW WILL
CREATE A NEW SUBTYPE AND NEW DATASET TYPETARC FOR EACH
TAPE ALLOCATION RECOVERY EVENT. 28AUG2003.
PDB.ASUMTAPE (built from TYPETMNT, TYPETALO, and TYPE21) is a mount
event dataset, and each observation tracks one tape mount from
issuance of the mount until verification that the requested volume
has been mounted. The MNTTM is the Mount Pending Delay to the step
for that specific mount event. You can have many Tape Mount Events
with only a single Tape Device Allocation (e.g., multi-volume
dataset).
Tape Allocation Recovery cannot be measured from IBM Type 10 SMF
"Allocation Recovery" records:
An SMF Type 10 is only written when the operator/operator software
replied to the IEF238 message with the DEVNR/UCBADDR of an offline
tape device (that was then varied online by the system and given to
that step as the allocation recovery).
- This is not the normal way that allocations are recovered
- You don't normally have tape devices offline
(Late at night, your tape apes may vary offline those drives that
are the furthest walk from the tape storage area, and then, as
the workload arrives in the early morning, they will let
allocation recoveries vary those drives back online!)
- There is no start time of the delay in the SMF 10 record
- The normal way an allocation is recovered, since you don't have a
bunch of offline tapes, is for your operator/operator software to
reply to the IEF238 message with WAIT, then reply either
HOLD - lets the step keep the devices already allocated
NOHOLD - frees all devices already allocated
which causes the step to wait until a tape device is freed by
another job, and then that device is allocated to this delayed
step, but there is no SMF 10 record written for these normal
recoveries.
Long ago, a SHARE request was made to enhance the SMF 10, but it
was discarded. I have just sent a request thru my z/OS Technical
Liaison Consultant to re-open the possibility of writing an SMF 10
for every Allocation Recovery event, with the addition of the
Start of Delay datetime, and how the operator replied:
DEVICE/WAITHOLD/WAITNOHOLD
Tape Allocation Recovery delay might be inferable from the type 30
step record, because the delay for allocation occurs during the
period from Start of Allocation (ALOCTIME) until the Program Load
(LOADTIME), and MXG variable ALOCTM=LOADTIME-ALOCTIME records the
duration it took to allocate the step, but:
ALOCTM is a function of the number of DDs in the step;
-it takes more time to allocate a new dataset on DASD (must search
all DSCB5's in the VTOC for best fit) than to allocate an old
dataset on DASD (one VTOC read)
-you don't know how many DDs are NEW/OLD in the 30
-you must use a heuristic that compares the actual ALOCTM with an
"expected-time-per-DD" times NUMDD to identify those steps with
longer-than-reasonable ALOCTM durations as likely to have been
delayed due to allocation recovery,
but
ALOCTM also includes the elapsed time to execute your site's ACS
allocation rules
and, much worse,
ALOCTM also includes any delay due to HSM recall, and the step
record has no clue that a recall occurred.
-There is an HSM record, processed with TYPEHSM, written for each
recall, which could be merged in to identify that the step did a
recall, but the step could have had both a recall and an
allocation recovery.
It may be possible to use the MXGTMNT Tape Allocation dataset
PDB.TYPETALO (one observation for each tape allocation event) and
compare its start of each device's actual allocation with the
ALOCTIME in PDB.STEPS to look for unreasonable delays that might
have been caused by allocation recovery, but you would have to
exclude dynamic allocations (they can be identified because their
start in TYPETALO will be after the step's LOADTIME), and you would
thus miss entirely any allocation recoveries for dynamic
allocations (and DB2 database log offloads, for example, use
dynamic allocation).
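The expected-time-per-DD heuristic described above could be sketched like this; the field names mirror MXG's PDB.STEPS variables, but the half-second-per-DD budget is an invented illustration value, not an MXG recommendation:

```python
def suspect_allocation_recovery(steps, expected_sec_per_dd=0.5):
    """Flag steps whose allocation duration greatly exceeds an
    expected-time-per-DD budget, as candidates for having been delayed
    by tape allocation recovery.  Each step dict carries ALOCTIME,
    LOADTIME (seconds-of-day here for simplicity) and NUMDD."""
    flagged = []
    for step in steps:
        aloctm = step['LOADTIME'] - step['ALOCTIME']   # MXG's ALOCTM
        if aloctm > expected_sec_per_dd * step['NUMDD']:
            flagged.append(step['STEPNAME'])
    return flagged
```

Remember the caveats above: a long ALOCTM can equally be ACS rule time or an HSM recall, so this only nominates suspects.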
12. APAR OW52227 revises the State Samples Breakdown in the WLMGL
Report, as noted in the RMF Changes for z/OS 1.4:
"Up to this release, state samples have been reported as a
percentage of average transaction response time (response time
breakdown). The response time is calculated when a transaction
completes. This can result in percentages greater than 100 when
samples are included for long running transactions which have not
completed in the gathering interval (see also "Understanding
Workload Activity Data for IMS or CICS" in topic 6.1).
Percentages greater than 100 in the breakdown section are now
avoided by showing the state values as percentages of the total
transaction samples (state samples breakdown) instead of
percentages of response time.
This functionality is available as SPE and needs to be installed as
APAR OW52227."
11. APAR PQ67187 for TCP/IP states that SMF 118 record with negative
byte counts (in the Bytes Out field) can occur when performing a
GET with the FTP client.
10. APAR OW55803 notes that you can see IEC705I "Tape ON devn" messages
but still not get an SMF 14 or 15 record written, e.g., job ABENDS
with a SYS 013 RC 20 or SYS 613 RC 20, because "tape ON" message
is detected after data set labels have been created or destroyed
(SL/AL mounted & NL requested). The APAR moves the detect/write of
the IEC705I message until after the OPEN was successful, but does
not alter that for ABEND conditions, SMF 14/15 records will not be
created when an OPEN abend is detected; they are created only after
CLOSE, EOV, and FEOV abends, or a successful open.
9. APAR OW56112 for the JES3 TYPE 25 SMF record fixes an incorrect
count of scratch volumes and an error in message IAT5110.
8. APAR OW56133 reports that an ABEND in SMF Interval Processing in
MASTER forces a re-IPL. 0C4 in SMF Interval Processing IEFTB728
was caused by someone having uncaptured a captured UCB, but SMF's
ESTAE routine IEEMB836 was not designed to retry, percolating the
ABEND to detach all of the TCBs at and below the Job Step TCB.
Unfortunately, since the SMF Interval Processing was running in
the MASTER address space, all TCBs in MASTER after the first four
were abended, terminating SYSLOG, LOGREC, and the IEEVWAIT task that
attaches all commands that run in MASTER. The PTF caused the SMF
Interval processing to always return to the dispatcher, rather than
percolate back to RTM.
7. APAR OW56457 reports SMF 19 records are written as SMF=0, after
PTF UW86732 was installed.
6. APAR OW55788, various RMF III (RMFVSAM) problems including wrong
values in ERB3GEN3 records, and incorrect Private Area size and
address in VSTOR RMFIII report.
5. APAR OW55709, JES3 only, an obscure situation that can create a
corrupted SMF 26 Purge Record for a job whose output had been NJE'd
when the job number was greater than 32767; ABEND in IATXSMF.
4. APAR OW52300 corrects an error that caused WLM-managed initiators
to take a long time to start, and APAR OW53104 for JES2 may be
involved to prevent errors if you are using JOBCLASS XEQCOUNT.
3. APAR OW56582 fixes problems with latch processing that cause the
NQC field to increase improperly, causing a system hang (in one
case, when a task holding RACF with SHR is swapped out for any
reason, since it cannot be raised to PVLDP if the NQC is GT 1).
While the primary task hit was RACF, the actual error is in GRS.
2. IDMS APAR Q020117 is required for IDMS Version 15 to correctly
capture CPU time in the IDMLOG02 dataset created by VMACIDML.
Without that APAR, the CPU time (STCTIMUS+STCTIMSU) was only 24%
of the true CPU time.
1. HiperCache product caused MONTHBLD to ABEND erratically (last month
ran fine), failing with messages about the TAPETEMP DD, either with
an OUT OF BUFFER MEMORY error or INVALID FORMAT FOR ACCESS METHOD
SASV6SEQ. Some runs failed on the MONTH.JOBS (first), some after
building many of the MONTH datasets (each using TAPETEMP). Messages
SVCHxxxx identified HiperCache as owning TAPETEMP; disabling
HiperCache for this job eliminated the ABENDs. You remove/add the
job from/to the HiperCache JOBNAMES table, depending on whether you
enable it by default and EXCLUDE in JOBNAMES, or vice versa. BMC
is replacing HiperCache with a MainView product, so not too much
time will be spent on this, since disabling solved the errors.
IV. DB2 Technical Notes.
V. IMS Technical Notes.
1. The JCLIMSLn/ASMIMSLn/TYPEIMSA programs leave the BMPS dataset in
the //WORK library, because PIMSBMP=WORK is set by MXG default.
If you want to have both IMSTRAN and BMPS datasets in one //IMSTRAN
data library, you can use %LET PIMSBMP=IMSTRAN; before the %INCLUDE
of TYPEIMSA member, but then //IMSTRAN must always point to a disk
dataset, because IMSTRAN.IMSTRAN and IMSTRAN.BMPS would both be open
at the same time, and that's not supported if IMSTRAN is a tape.
So to write the BMPS dataset to an IMSTRAN library that could be on
either disk or tape, you would add this code after TYPEIMSA include:
PROC COPY IN=WORK OUT=IMSTRAN; SELECT BMPS;
VI. SAS Technical Notes.
11. SAS/GRAPH Warning "The intervals are not evenly spaced" or "No minor
tick marks will be drawn...." set CC=4 in SAS V8 (was CC=0 in V6).
The SAS/GRAPH option NOGRAPHC on the GOPTIONS statement will force
a return code of zero even with these warnings.
10. CA-ALLOCATE PTF is required for it to be able to extend SAS DASD
datasets to multiple volumes. CA PTF Q022786 corrects errors,
such as ERROR: Write to XXX.XXX.XXX failed. File is full and may be
damaged. SAS Note SN-008518 reports the CA error.
9. Conversion from very old SAS Version causes SAS ERROR 22-322.
SAS ERROR 22-322 with OTHER=?< $HEX2. ?> text will occur when you
use %INCLUDE SOURCLIB(FORMATS); to update an old MXG format library
that was originally created under Version 5, 6, or 7 of SAS, as the
physical architecture of SAS "Catalogs" has incompatibly changed.
Just erase/scratch the old format file/dataset/catalog and rerun
the FORMATS program to create a new format library under the new
version of SAS. If your MXG format library is that old, you should
also check PDB/SPIN/WEEK/MONTH/TREND libraries that are probably of
that old version, and you should create new datasets with SAS V8/V9
and copy into them the old data, then delete the old, rename the new
back to the old DSNAME, so all your SAS data libraries are in the V8
architecture (which is unchanged in V9; you can read/write to/from
V8/V9 SAS data libraries with SAS V8/V9).
8. Benchmarks on z/OS 1.2 on z900 of SAS Version 9.0 and Version 8.2.
This note was revised March 28, 2003.
The V9.0 to V8.2 benchmarks reported in this note in Newsletter
FORTY-TWO were invalid comparisons: FULLSTATS were enabled in V9.0
but were disabled in V8.2. The benchmarks were rerun with options
NOSTATS NOFULLSTATS NOSTIMER NOFULLSTIMER NOMEMRPT for both versions
and the results show that there is NO increase in CPU time for V9.0,
and there is a significant improvement in elapsed run time with V9:
Eight comparison tests are run. S1 is the SAS initialization cost,
S2 is the cost of both SAS and MXG Initialization. S3, S4 and S5
are single-data-step tests with program TYPE30 to measure compile
cost (DD DUMMY), and the cost with 700MB and 2100MB input. Steps
S6, S7, and S8 are multiple-data-and-proc-step tests with program
BUILDPDB with 0, 700MB, and 2100MB of input:
S1 - DATA _NULL_; with // EXEC SASV9 - i.e., no MXG VMXGINIT
S2 - DATA _NULL_; with // EXEC MXGSASV9, i.e. VMXGINIT executed
S3 - INC (TYPE30); - MXGSASVn - with //SMF DD DUMMY
S4 - INC (TYPE30); - MXGSASVn - with //SMF DD DSN=700MB - 1x
S5 - INC (TYPE30); - MXGSASVn - with //SMF DD DSN=2100MB - 3x
S6 - INC (BUILDPDB); - MXGSASVn - with //SMF DD DUMMY
S7 - INC (BUILDPDB); - MXGSASVn - with //SMF DD DSN=700MB - 1x
S8 - INC (BUILDPDB); - MXGSASVn - with //SMF DD DSN=2100MB - 3x
SAS Version 9.00 TS M0 SAS Version 8.2 TS2M0
Step TCB SRB Elaps SMF TCB SRB Elaps SMF
min min min EXCP min min min EXCP
Comparison with NO Statistics enabled in SAS options:
S1 SAS .00 .00 0.0 1029 .00 .00 0.0 660
S2 MXG .02 .00 0.1 1222 .02 .00 0.0 846
S3 DUMMY .03 .00 0.1 1551 .03 .00 0.1 1148
S4 TYPE30 2.98 .01 4.9 34389 3.09 .01 6.3 33976
S5 TYPE30 8.85 .03 13.5 100000 9.35 .03 27.1 100000
S6 DUMMY 1.13 .01 2.9 28287 .97 .00 3.1 14324
S7 BUILD 5.57 .03 10.7 74596 5.55 .02 14.5 60032
S8 BUILD 12.15 .06 19.7 145000 12.41 .04 28.5 130000
For all runs, the CPU time for V9 is the same or slightly less than
the V8 CPU time, and the elapsed time with V9 is much less than V8.
These runs are with the Pre-Production 9.00 release; we will re-run
these same tests when the Production V9.1 is available for testing.
7. New SAS V9 SYSMSG messages count blocks and I/O operations.
SAS V9 on "MVS" prints new messages on SYSMSG for each SAS step with
count of blocks and I/O operations for each SAS database that was
accessed during that step:
+SAS processed 21881 blocks and performed 4897 I/O operations
on library SYS02311.T182432.RA000.DPTTSM01.WORK04.H01
These messages count only I/O to SAS data libraries, catalogs, etc;
there is no message for INFILE nor FILE I/O. Tests with DD DUMMY
and a known file, and the new count messages, show that:
- I/O counts for INFILE/FILE that are passed by SAS into type 30
"EXCP" counters are the number of Blocks, which is correct and
consistent with IBM documentation for what should be recorded
for sequential access datasets.
- I/O counts for Data Libraries, Catalogs, etc, that are passed by
SAS are NOT the number of Blocks, but instead are the number of
I/O operations (counted as an SSCH or SIO in RMF).
This is not new in V9; SAS has always done it this way, sending
the count of "EXCP" macros in its IEASMFEX call, instead of the
count of Blocks, but that inconsistency with expectations is now
being reviewed by SAS Development.
This "EXCP" confusion is understandable; an ASM programmer issues
an IBM "EXCP" macro to do I/O, and thus counting "EXCP" macros for
EXCP count is easy. But each EXCP macro is serviced by IOS, which
creates the SSCH (Start Sub-Channel, the channel program that does
the real I/O), so an EXCP macro with BUFNO=5 will move 5 Blocks
as a result of that one "EXCP" macro the ASMer coded.
And I fibbed in my example text of the new SAS message; the actual
V9 text says "... and performed 4897 EXCPS on ..."; my example of
"... 4897 I/O Operations on ..." was to stress what is counted,
but is also a change under consideration by SAS development.
Now that you understand what's in the new messages, you can see
they may be very useful, especially in diagnostics, to figure out
how much was read/written from which library. Those messages for
the preceding V9 benchmarks show this raw detail on SYSMSG:
Step  Elapsed     SMF    ==SAS MESSAGE: BLOCKS/"EXCPS"==
        min      EXCP        PDB            WORK
 S1     0.0      1033         -            29/12
 S2     0.1      1229         -            217/98
 S3     0.1      1555         -            332/137
 S4     3.6     29057         -            7047/1389
 S5    11.0     85046         -            21881/4897
 S6     5.3     28333     1013/632         12924/5681
 S7    19.2     77409     6795/1552        87453/25860
 S8    43.2    136999     7080/1637        114573/33363
And: TYPE30: had 33/33 to FORMATS, 261/149 to SASHELP.
BUILDPDB: 1780/1780 to FORMATS, 2518/901 to SASHELP.
(MANY data steps in BUILDPDB, one in TYPE30)
and BUILDPDB: 86/75 to SPIN, 13/7 to CICSTRAN.
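The blocks-versus-"EXCPS" ratio in those messages is simple arithmetic; a sketch of the S5 WORK-library numbers (the roughly 4.5 blocks per channel program is consistent with buffered I/O moving several blocks per EXCP macro when BUFNO > 1):

```python
def blocks_per_excp(blocks, excp_macros):
    """Average blocks moved per EXCP macro (i.e., per SSCH/I/O operation),
    the ratio visible in the SAS blocks/"EXCPS" messages above."""
    return blocks / excp_macros

# S5's WORK library message: 21881 blocks moved by 4897 I/O operations
print(blocks_per_excp(21881, 4897))
```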
6. BMC reports that Fix 361824, available from MAINVIEW Batch
Optimizer (MBO) product support at 1-800-841-2031, corrects the
failure associated with SAS and PAV volumes that caused physical I/O
errors on the SAS Data Library because related I/Os were processed
out of order. With the fix, a logical wait will occur between I/Os
when the PAV feature is enabled. Mar 14, 2003.
This was the original note:
The BMC Optimizer Product Hiper-Cache feature appears to be the
culprit that causes I/O errors when PAV (Parallel Access Volume) is
enabled on ESS (SHARK) DASD volumes that are used for SAS WORK
libraries, with these error messages in daily BUILDPDB jobs:
EXPECTING PAGE 218, GOT PAGE -1 INSTEAD.
PAGE VALIDATION ERROR WHILE READING WORK.XXXXXXX.DATA
FILE WORK.XXXXXXX.DATA IS DAMAGED. I/O PROCESSING DID NOT COMPLETE
but the errors did not repeat when the job was rerun without change.
Two sites had daily job ABENDs about once a week for about a
month; both have V8.2 and 82BX03 Hot Fix Bundle installed; one
SHARK is in 3990 mode, and the other is in 2105-E20 mode.
Those sites have now disabled the Mainview Optimizer product and the
ABEND has not reoccurred (i.e., PAV and SHARK by themselves do not
cause an error, only when Optimizer is also active). To disable the
Mainview Batch Optimizer product for a step, add these DD statements
//DAP@NVPO DD DUMMY
//DAP@NNPO DD DUMMY
to the JCL for that step, or in MXGSASV8 JCL proc for all MXG steps.
I conjectured that the erratic nature might confirm that PAV was
involved; perhaps an exposure exists only when there is concurrent
use of the device, i.e., when the IBM code for PAV is active.
SAS, IBM, and BMC Technical Support are now investigating GTF traces
to locate the exact cause of the error, but the error has only been
seen when Optimizer is active with both PAV and SHARK DASD present.
This note will be updated when a corrective fix has been determined.
When PAV itself was the suspect, this note on how you would disable
PAV was researched; it is kept here now only for future reference:
Your Storage guru would use the IBM Storage Expert Application,
that runs on the OS/2 platform that controls the SHARKs, to set
the count of alias (PAVs) for a DEVNR to zero.
5. New SAS Hot Fix Bundle 82BX03 now replaces all previous Hot Fix and
Bundles, and should be installed on SAS V8.2 for MXG execution, as
it corrects additional I/O errors SAS fixed after 82BA77 et al.
30Oct02.
4. From Scott Barry's posting to MXG-L, to create a datetime value in
DDMMMYYYY.HH.MM.SS.SSSS (where separators must be dots, not colons):
DATA _NULL_;
NOWIS=DATETIME();
DMYHMSS=TRANSLATE(PUT(NOWIS,DATETIME23.4),'.',':');
PUT 'DDMMMYYYY.HH.MM.SS.SSSS=' DMYHMSS;
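An equivalent of that dotted timestamp in Python, for comparison (the function name is mine; SAS's DATETIME23.4 carries four fractional digits, reproduced here from the microseconds):

```python
from datetime import datetime

def dot_timestamp(now=None):
    """DDMMMYYYY.HH.MM.SS.SSSS with dots as separators, like the SAS
    TRANSLATE(PUT(NOWIS,DATETIME23.4),'.',':') trick above."""
    now = now or datetime.now()
    stamp = now.strftime('%d%b%Y.%H.%M.%S').upper()
    return '%s.%04d' % (stamp, now.microsecond // 100)
```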
3. COMPRESS=YES requires more virtual storage than COMPRESS=NO; this
is not really a problem, just an FYI; it was only noted in a
data step with over 350 datasets being created; job took 93M with
COMPRESS=YES, and "only" 81M with COMPRESS=NO.
2. SAS USER COMPLETION CODE=1310 is a virtual memory problem; your JCL
did not specify REGION= correctly, and/or your installation virtual
memory limits are way too small for SAS. There was no //SASLOG DD
output created - no WELCOME TO SAS message - as a further clue that
SAS was never loaded in memory. This occurred under SAS V8.
1. A very strange "NOTE: EQUAL SIGN not found." on the SAS log occurred
during the execution phase of a DATA/INFILE program when a new block
of code (for a new subtype) was tested. By inserting PUT statements
(an ACM survey discovered that the number one debugging tool of
all programmers is the insertion of "print" statements!)
the note was created when record 1362 was read, but records 1350 to
1361 were of the new subtype, so the new code had executed without
producing the note; this note was somehow data-driven. Examination
of the new code found that a syntax error, an incomplete comment
that swallowing following statements, was the actual cause!
VII. CICS Technical Notes.
1. APAR PQ67142 reports discrepancies between the ABEND codes in CICS
type 110-1 (CICSTRAN dataset) and the codes in DFHAC2236 messages.
VIII. Windows NT Technical Notes.
IX. Incompatibilities and Installation of MXG 20.20.
1. Incompatibilities introduced in MXG 20.20 (since MXG 19.19):
See CHANGES.
2. Installation and re-installation procedures are described in detail
in member INSTALL (which also lists common Error/Warning messages a
new user might encounter), and sample JCL is in member JCLINSTL.
X. Online Documentation of MXG Software.
MXG Documentation is now described in member DOCUMENT.
XI. Changes Log
--------------------------Changes Log---------------------------------
You MUST read each Change description to determine if a Change will
impact your site. All changes have been made in this MXG Library.
Member CHANGES always identifies the actual version and release of
MXG Software that is contained in that library.
The CHANGES selection on our homepage at http://www.MXG.com
is always the most current information on MXG Software status,
and is frequently updated.
Important changes are also posted to the MXG-L ListServer, which is
also described by a selection on the homepage. Please subscribe.
The actual code implementation of some changes in MXG SOURCLIB may be
different than described in the change text (which might have printed
only the critical part of the correction that needs to be made by
users).
Scan each source member named in any impacting change for any comments
at the beginning of the member for additional documentation, since the
documentation of new datasets, variables, validation status, and notes,
are often found in comments in the source members.
Alphabetical list of important changes after MXG 19.19 now in MXG 20.20:
Dataset/
Member Change Description
See Member CHANGES or CHANGESS in your MXG Source Library, or
on the homepage www.mxg.com.
Inverse chronological list of all Changes:
Changes 20.341 thru 20.001 are contained in member CHANGES.