COPYRIGHT (C) 1984-2021 MERRILL CONSULTANTS DALLAS TEXAS USA
MXG NEWSLETTER FIFTY-TWO
*********************NEWSLETTER FIFTY-TWO*******************************
MXG NEWSLETTER NUMBER FIFTY-TWO, AUG 24, 2008.
Technical Newsletter for Users of MXG : Merrill's Expanded Guide to CPE
TABLE OF CONTENTS
I. MXG Software Version.
II. MXG Technical Notes
III. MVS, aka z/OS, Technical Notes
IV. DB2 Technical Notes.
V. IMS Technical Notes.
VI. SAS Technical Notes.
VI.A. WPS Technical Notes.
VII. CICS Technical Notes.
VIII. Windows NT Technical Notes.
IX z/VM Technical Notes.
X. Incompatibilities and Installation of MXG.
See member CHANGES and member INSTALL.
XI. Online Documentation of MXG Software.
See member DOCUMENT.
XII. Changes Log
Alphabetical list of important changes
Highlights of Changes - See Member CHANGES.
COPYRIGHT (C) 1984,2008 MERRILL CONSULTANTS DALLAS TEXAS USA
I. The 2008 Annual Version MXG 25.25 was dated January 28, 2008.
All sites were mailed a letter with the ftp download instructions.
The availability announcement was posted to both MXG-L and ITSV-L.
You can always request the current version using the form at
http://www.mxg.com/ship_current_version.
1. The current version is MXG 26.07, dated Aug 24, 2008.
See CHANGES member of MXG Source, or http://www.mxg.com/changes.
II. MXG Technical Notes
3. Why are some tape mounts NOT captured by ASMTAPEE/MXGTMNT monitor,
and how MXG solves that problem.
The standard answer, it seems, is "it all depends ...".
The ASMTAPEE/MXGTMNT Tape Mount Monitor uses an IBM-provided exit,
if your tape drive allocations are controlled by IBM software, or
MXGTMNT uses the MXG-provided ASMHSCEX code (that you install in
STK user exit SLSUX01), if HSC controls tape drive allocations.
For the IBM IEF_VOLUMEMNT exit, MXGTMNT captures every mount that
is issued with an IEF233A or IEF233D mount message, and we do NOT
capture any mount that is issued with IEC501A nor IEC501E messages.
Specifically, those IEC501x messages that are NOT captured are for
second-and-subsequent volume mounts of a multi-volume tape dataset.
IBM has confirmed that is "working as designed" for their exit, as
it is taken only for Allocation's mounts, whereas the IEC501x mounts
are OPEN/CLOSE/EOV mounts that do not go thru that exit.
The IBM Volume Mount Exit also misses ALL mounts issued by some
programs: DFHSM, OPC, and DMS jobs mount tapes that MXGTMNT does not
capture in the IBM exit, because those mounts are issued from OPEN
which doesn't use the IBM exit. These mounts can cause SMF type 21
dismount records, but some have a blank volume serial, and some
missed mounts do not have standard SYSLOG mount messages. Also,
none of DFHSMS mounts on 3590s are captured, while mounts for other
jobs on 3590s are captured by the MXGTMNT monitor.
For the STK SLSUX01 exit, STK Support installed our exit and it
captured 100% of all HSC-controlled tape mounts, both to virtual and
to real tape devices, in several tests in their labs by Sun
Technicians.
The solution to these missed mounts in the MXGTMNT event monitor is
its separate capture of SYSLOG tape mount events, and MXG's ASUMTAPE
program that combines the MXGTMNT event, the SYSLOG events, and the
IBM TYPE21 dismount event, to create the PDB.ASUMTAPE dataset that
DOES contain an observation for EVERY tape mount event.
The MXGTMNT monitor captures SYSLOG messages in its subtype 8 record
for these mount-related events:
IEC501A IEC501E IEC705I IEF233A IEF233D IEF502E IEF234E IOS070E
IECTMS6 IECTMS9 IOS690I IEF235D
Dataset TYPESYMT (SYslog MounTs) decodes those SYSLOG records, which
include the JOB, JESNR, SYSTEM, ASID, and the EVENTIME. These SYSLOG
events are used in the ASUMTAPE program to populate these variables:
Used to set SYLMTIME - SYSLOG MOUNT START TIMESTAMP:
IEF233A - First Volume Mount, JCL Allocation Issued
IEF233D - First Volume Mount, Dynamic Allocation Issued
IEF501A - OPEN/CLOSE/EOV, MULTI-VOL, or DEFERRED MOUNT
IEF501E - 2nd+ Volume for OPEN/CLOSE/EOV "Look Ahead"
Used to set SYLVTIME - SYSLOG MOUNT VERIFY/END TIMESTAMP:
IECTMS6 - DEVNR,VOLSER,IS APPROVED FOR TRTCH CHANGE
IECTMS9 - DEVNR,VOLSER, DSNAME17 at OPEN
IEC705I - TAPE ON DEVNR,VOLSER
Used to set SYLKTIME - SYSLOG KEEP TIMESTAMP:
IEF502E - Intermediate Volume KEEP
IEC234E - Last Volume KEEP
Additional SYSLOG messages, below, are captured in TYPESYMT, for
investigation in cases of long tape mount delays, but they are not
used in the construction of PDB.ASUMTAPE:
IEF690I - FOLLOWING VOLUMES UNAVAILABLE
IEF235D - JJJ STEP WAITING FOR VOLUMES
IEC205I - VOLUME LIST
ASUMTAPE creates variables BEGTMNT, ENDTMNT, the begin and end times
of each tape mount event, their delta in TOTMNTTM, the mount delay
to jobs, as well as TAPMTDTM, the duration when the tape volume was
mounted from mount until its keep/dismount for this job:
BEGTMNT='BEGIN TIME*OF TAPE*MOUNT EVENT'
IF SYLMTIME GT 0 and TMNTTIME GT 0 THEN
BEGTMNT=MIN(TMNTTIME,SYLMTIME);
ELSE iF SYLMTIME GT 0 THEN BEGTMNT=SYLMTIME;
ELSE IF TMNTTIME GT 0 THEN BEGTMNT=TMNTTIME;
ELSE BEGTMNT=.;
It is the minimum timestamp of the start of the mount event,
from SYSLOG or MXGTMNT.
ENDTMNT='END TIME*OF TAPE*MOUNT EVENT'
IF SYLVTIME GT 0 AND TENDTIME GT 0 THEN
ENDTMNT=MAX(TENDTIME,SYLVTIME);
ELSE IF SYLVTIME GT 0 THEN ENDTMNT=SYLVTIME;
ELSE IF TENDTIME GT 0 THEN ENDTMNT=TENDTIME;
ELSE ENDTMNT=.;
It is the maximum verification time or mount end, from SYSLOG
or MXGTMNT.
TOTMNTTM='TIME IT TOOK*TO MOUNT*TAPE VOLUME'
IF ENDTMNT GT 0 AND BEGTMNT GT 0 THEN
TOTMNTTM=ENDTMNT-BEGTMNT;
It is the duration the job was delayed for this tape mount.
TAPMTDTM='DURATION*TAPE WAS*MOUNTED*TO DISMOUNT'
IF (SYLKTIME GT 0 OR TY21TIME GT 0) AND BEGTMNT GT 0 THEN
TAPMTDTM=MAX(SYLKTIME,TY21TIME)-BEGTMNT;
It is the duration that the tape volume was mounted on the
device for this mount event.
The variable DSNAME will be populated in PDB.ASUMTAPE if the DSNAME
was captured in TYPETMNT or if it was non-blank in any of these
SYSLOG messages that can contain a DSN: IEC233A,IEC705I, IEC501A or
IEC501E. Some events can never have a DSNAME (e.g., HSM only-234E
KEEP), but variable MNTHAVE identifies which SYSLOG records were
found for this mount event so MNTHAVE can be used to identify those
cases where the DSNAME is always blank.
Each line below is an example, left to right, of the sequence of
the SYSLOG messages for several example mount events:
233A 234E
233A TMS6 TMS9 705I TMS014 234E
233A TMS6 TMS9 705I TMS014 502E for first vol
501A TMS6 TMS9 705I TMS014 502E for intermediates
501A TMS6 TMS9 705I TMS014 234E for final volume
or
501E TMS6 TMS9 705I TMS014 234E for final volume
233A 070E TMS014 502E
690D 235D 705I 234E
2. The CPU cost of performance monitoring and capacity planning.
One MXG user reports they currently write 500 GB of SMF data per day
or an average rate of 6 MegaBytes per second across all platforms.
They dump SMF multiple times each day, and build multiple "PDB's"
throughout the day, and run many ad hoc analysis reports as well.
They have SMF, RMF, OMEGAMON, and NETVIEW monitors consuming CPU.
The daily total CPUTM for each of their workloads were:
OMEGAMON 28:56:37
MXG JOBS 19:05:01
RMF III 12:20:05
RMF I 6:29:11
SMF DUMPS 4:12:30
MONITORS 2:17:10
SMF ASID 0:29:16
TOTAL CPUTM 73:30:50 = 2% of 3744 HOURS with 156 CPs
Thus this sites total daily cost of 74 CPU hours is an average use
of 3 CP engines all day long, but with 156 CP Engines, that is ONLY
2% of the installed CP engine capacity, for the entire CPU cost of
performance monitoring, data collection, building PDBs, archiving,
and all MXG daily reporting and ad hoc analysis.
The UKCMG2008.PPT presentation at http://www.mxg.com/downloads ends
with the above statistics and a SAS/GRAPH showing the daily profile
of this site's CPU consumption for all of the above work.
1. The MXGTMNT Tape Mount Monitor must be at the current ML-39 level
before you execute it under z/OS 1.9. Otherwise it will ABEND with
B78-5C, which was corrected in ML-39.
III. MVS, a/k/a z/OS, Technical Notes.
27. APAR PK85069 corrects ABEND with SMF 120 Subtype 9, but notes that
"Certain types of threads lack a request-specific object that is used
in generating the SMF 120 subtype 9 CPU usage subsection. When
WebSphere Application Server for z/OS attempts to write out the SMF
120 subtype 9 CPU usage subsection, it encounters a null pointer,
which causes the server to terminate.
PROBLEM CONCLUSION:
The SMF 120 subtype 9 CPU usage subsection will now only be written
from threads that contain the necessary request-specific object.
26. Enabling HIPERDISPATCH=YES in OPT for a z9 processor unintentionally
disables IRD. An IBM APAR will be created, but you can correct the
error by setting HIPERDISPATCH=NO until the PTF for the APAR exists.
APAR OS26225 has been opened to ultimately correct this IBM error.
25. APAR OA23592 reports incorrect values in SMF30UCT (MXG PRODTCB in
dataset TYPE30MU, Measured Usage, and in SMF89UCT (MXG PRODTCB in
dataset TYPE89). Values are much larger than they should be, and
could be massively larger, when subtraction incorrectly produced
a "negative" number that MXG sees as large positive value.
24. APAR PK66063 for TCP/IP V3 corrects many things that impact the
SMF 119 records, as well as TCP/IP itself. Jul 20, 2008.
23. APAR OA25065 and OA25603, together, cause SMF 42 subtype 6 interval
records to now be written for the SMSPDSE and SMSPDSE1 address
spaces; those are not "full-function" address spaces (i.e., the
started before SMF was fully enabled at IPL), and the 42-6 were only
written for full function ASIDs, but this APAR revised code for
those address spaces.
22. APAR OA25225 corrects a continual growth in storage used for the TCT
because TCTT30UJ work area (used for SMF type 30 records) was not
freed by IEFTB721 at job end, causing orphaned storage in subpool
255, which could lead to an auxiliary storage shortage,
resulting in MSGIRA200E.
21. APAR OA25825 reports zIIP work not being dispatched on CPs when zIIP
is full but CPs have capacity. Algorithm acknowledged as wrong.
20. APAR OA25095 reports that SMF 72-3 records may not be written for
some CICS or IMS Reporting Class Data. z/OS 1.9 stopped writing
72-3 for inactive Reporting Classes, but that inactive-test was not
correct for CICS or IMS address spaces that are managed to the goals
of the region in the WLM policy and also have reporting only classes
set up in the CICS or IMS subsystem. This caused the variables
IMSTRAN and CICSTRAN in the PDB.RMFINTRV dataset to have zero
counts.
19. APAR OA24435 reports RMF MON III zFS Summary Report incorrectly
reports 0 for USE% for an aggregate >= 2G in size. Jun 10, 2008.
18. CF Utilization when you have shared ICFs and your CFs are at
microcode level 15 can be wrong; the correction is a microcode
update to the CF, MCL number G40953.004, which is documented as
CFCC code returning inaccurate value to software applications
used to calculate performance data(RMF, Omegamon). Incorrect
processor wait time will affect processor utilization numbers.
Problem only shows up when using SHARED CP's or SHARED ICF's in
the CFCC image. Jun 10, 2008.
17. APAR PK62236 reports that SMF 116 records for long running threads
can be corrupted by statistics from a different queue.
16. APAR PK65203 reports that SMF 115 records for Version 6 do not
include GETS/PUTS via the new internal SPIGET/SPIPUT calls,
causing major reduction in MQGET/MQPUT counts between releases.
15. APAR OA24361 corrects high CPU time in RMF I address space when
VSTORE is specified to monitor an address space's virtual storage
usage, and the address space has lots of subtasks sharing the same
subpool. May 14, 2008.
14. APAR OA25063 confirms that SMF 42 subtype 6 records are NOT written
for the SMSPDSE and SMSPDSE1 address spaces, because they are not
full function. The APAR is OPEN, so it is not clear if this will be
corrected, i.e, for all not-full-function-ASIDs (those that started
before SMF had completed its initialization, and identifiable
because they write SMF 30 interval subtype 6 records instead of 2.)
May 13, 2008.
13. The IEF374I step termination message EXT xxxxK value records the
virtual storage used above the line, and is useful to prove that
OUT OF MEMORY errors were the result of site restrictions or due
to the absence of a REGION parameter on the JOB statement, when
that EXT xxxxK value showed only 32M was used.
The message syntax is VIRT xxxxK SYS xxxxK EXT xxxxK SYS xxxK
and this note is only about the last two fields in the message.
This note revised after IBM provided documentation, May 14, 2008.
The IEF347I message SYS xxxxK value previously was observed to have
a value limited by the size of the private area, typically 10MB, and
the sum of the SYS xxxxK value and the EXT xxxxK value matched the
value that SAS reports for Total Memory on the SAS log.
The IEF032I message replaced IEF374I with content unchanged.
But a job was observed to have recorded a SYS value of 516,208K, or
over 504MB; that job had a REGION=300MB limit on the JOB statement,
and its EXT value was 180MB, so that job used 180MB of the 300MB
REGION limit, plus the 504MB outside that REGION limit, for a total
of 180MB+504MB=684MB total virtual storage!
The IBM "LOOK AT" documentation for IEF374I message states that the
SYS xxxxK value is the high-watermark that the address space used
from the extended LSQA, the extended SWA, and the "extended high
private area", which in this message, refers to 'authorized' private
subpools. When it is talking about 'user' region subpools it uses
the term "user region of the private area".
The EXT xxxxK is reported from TCTELWM which is for user region
subpools. The value in TCTELWM cannot (and does not) exceed the
REGION value (except that a REGION value less than 16M will always
get 32M Above the Line, and, of course, a user's requested REGION
can be altered in the site's IEFUSI exit.
The SYS xxxxK is reported from TCTEHWM which is just LSQA and SWA
(authorized private subpools), not user region subpools. That value
of 504MB is recorded in SMF type step termination records, SMF30EAR,
is recorded in SMF type 30 step termination records, MXG variable
LSQSZHI, documented as the Local System Queue and SWA areas above
the 16MB line.
Further examination of another site's SYSLOG showed that almost all
jobs reported 9990K for the SYS xxxxK value, but there were six jobs
with values over 100MB, the largest being 940MB, so this is not a
unique observation at one site's accidentally observed job.
However, it is unclear if there is a real problem here:
If that extended private area can be page fixed (for example, SORT
packages can page-fix half of the REGION size), this additional and
uncontrolled virtual storage allocation could definitely impact the
real storage usage of the entire system.
But, this virtual storage allocation may be caused by LE, Language
Edition, which allocates storage heaps and is known to over-allocate
when sizes are not correctly specified; fortunately, LE's heap area
cannot be page-fixed by a sort products. I have an open query with
IBM support about any potential impacts of this allocation; this
note will be updated when more is known. May 14, 2008.
May 22: Additional information from IBM in reply to my questions:
-Is the SYS, i.e. the Extended LSQA or Extended SWA area, fixed
memory?
LSQA subpools can be 203-205 (DREF), 213-215 (DREF), 223-225 (FIXED)
233-235 (FIXED), and 253-255 (FIXED). DREF and FIXED are always
backed in real. With DREF, RSM can change the frame that backs that
data. The SWA subpools are 229-230, 236-237 and 249, which are all
pageable.
-Isn't the purpose of REGION= to limit virtual storage allocated?
Yes
-If so, isn't this over-allocation a defect?
I'd call it "working as designed"; the design is based on an
assumption of well-behaved programs when it comes to applications
that are using authorized subpools such as LSQA and SWA (and common
storage, too).
-Is there any real cost to large virtual storage allocations?
If you exceed the availability of virtual storage addresses in an
address space, you'll get an ABEND878, for example. Of course, if
the virtual is getting backed in either real or aux, you can also
end up with shortages in those kinds of storage as well.
-If the step record shows no PAGEOUTS, does that guarantee that the
pages were never initially backed on auxiliary storage, i.e., no
physical I/O for paging if never referenced/stored?
We only back virtual pages in auxiliary storage if we need to do
page replacement when the system is real storage constrained. If the
step record shows no PAGEOUTS it is an indication that virtual is
not getting backed on AUX. Also, read the Subpools Attributes Table
8-1 in MVS Diagnosis Reference for more details on how storage is
backed in real. For example, "Virtual storage is first backed by
central storage when it is referenced or when it is page-fixed by a
program using the PGSER macro. The location of the central storage
backing this subpool depends on the value of the LOC parameter on
the GETMAIN, STORAGE, or CPOOL macro invocation used to obtain the
storage."
-I had not contacted the vendor of this new application, yet, as
I didn't want to cause them alarm unless there is an exposure.
The software vendor should be aware of which LSQA and/or SWA
subpools they are using, as this is authorized storage, and use of
it (especially ELSQA fixed and/or DREF) should be used
carefully.
-z/OS 1.12 message IEF032I/33I replaced IEF374I/276I.
12. zIIP and zAAP measurements when they are faster than CPU engines,
a/k/a "knee-capped" CP (a/k/a GP) engines of slower speed.
When specialty engines are faster than the speed of your CPs, there
is a normalization factor to convert the recorded seconds to their
NORMALIZED (EQUIVALENT) time, as if they had executed on your CPs.
Eg: If the Normalization Factor is 3, then one second of raw zIIP
time becomes three seconds of normalized time, i.e., that one
second on zIIP would have taken three seconds on the GP engine.
In all MXG datasets whose data comes from address spaces or Service/
Reporting Classes, and all transaction records from address spaces:
ALL ZIIP/ZAAP CPU TIME DURATION VARIABLES ARE NORMALIZED.
Those variables have 'ZIP' or 'IFA' in their labels for identity.
The records are SMF 30, 72, 89, 101, 110, and 120, RMF III ZRB,
and IMS56FA log records, which create these MXG datasets: TYPE30_4,
TYPE30_5, TYPE30_V, (and hence datasets PDB.STEPS, PDB.JOBS and
PDB.SMFINTRV, created from 30s), TYPE72GO (and hence RMFINTRV),
TYPE89 and TYPE892, DB2ACCT, CICSTRAN, TYPE1209, ZRBRCDS, ZRBRCDR,
and IMS56FA, and vendor product records that replicate these data
sources. All of the Service Units are segregated by Engine Type.
- IN RMF 70 HARDWARE RECORDS ZIIP TIMES ARE NOT NORMALIZED. -
The zIIP/zAAP normalization factors are only provided in the
SMF 30, 72, 79, 89, and 120, and RMF RCD records.
However, the IBM RMF reports present these data quite differently.
This system has the normalization factor, R723NFFS=569/256=2.222,
that is, one second of zIIP is equal to 2.222 seconds of CP time.
====================================================================
MXG Dataset TYPE72GO dataset values:
====================================================================
SERVICE CPUUNITS ZIPUNITS CPUTCBTM CPUZIPTM
3,932,091 1,793,920 2,137,167 178.92 213.16
====================================================================
RMF WORKLOAD REPORT:
====================================================================
Under "SERVICE TIMES", the RMF "CPU" value of 392.9 seconds is the
total of the real CPU time on CP engines, plus the NORMALIZED CPU
time on the zIIP and zAAP engines; it is NOT the CPU "TCB" time.
( 392.9 = 178.92 + 213.16 "RMF CPU" = CPUTCBTM + CPUZIPTM )
But also under "SERVICE TIMES", the RMF "IIP" (zIIP) value of 96.1
seconds is the UN-NORMALIZED, raw, seconds on the zIIP engine.
And the RMF "AAP" value for zAAPs is also the UN-NORMALIZED value.
And under "SERVICE", the RMF "CPU" value of 3931K is the TOTAL
SERVICE units from CPs, zIIPs, and zAAPs.
REPORT BY: POLICY=OWL WORKLOAD=CSSDDF
TRANSACTIONS ---SERVICE---- SERVICE TIMES ---APPL %---
AVG 0.23 IOC 0 CPU 392.9 CP 4.98
MPL 0.23 CPU 3931K SRB 0.0 AAPCP 0.00
ENDED 51 MSO 0 RCT 0.0 IIPCP 0.07
END/S 0.01 SRB 0 IIT 0.0
#SWAPS 0 TOT 3931K HST 0.0 AAP N/A
EXCTD 0 /SEC 1092 AAP N/A IIP 2.67
AVG ENC 0.23 IIP 96.1
====================================================================
While the workload datasets have normalized CPU time, in all of the
"hardware" datasets, TYPE70, TYPE70PR, ASUM70PR etc., the CPU times
for the zIIP and zAAP engines are the raw seconds of CPU Dispatch
Time on those engines, and is NOT normalized. As a result, then,
the total ZIPACTTM recorded in TYPE70 for the above system for the
day was 10,887 seconds, but the total CPUZIPTM in TYPE72GO for the
day was 23,079 seconds.
Those 10,887 raw hardware seconds would be 24,190 normalized seconds
so the zIIP capture ratio at this site is 23079/24190 = 95.4%.
11. Increased uncaptured CPU time and elongated elapsed time on z10 for
zip-engine-using jobs with back-level z/OS 1.7 or 1.6 are reported
after OA20135 was applied are corrected in APAR OA24462. Same error
for z/OS 1.8 is corrected in APAR OA21991 and does not occur on 1.9.
10. APAR PK63170 finally has DB2 setting the SMFxFLG in the SMF header
to indicate that the subtype value in the header is valid, BUT ONLY
for SMF 100 and SMF 101 records; DB2 failed to also set the flag for
the SMF 102, with over 350 subtypes, but (apparently) DB2 was not
willing to move the 102 subtype (IFCID) into the header from the DB2
Product segment at the end of the record; that enhancement request
has been sent to IBM DB2 developers, so there's always a chance it
might happen!
9. APAR OA22341 reports correction to RMF Monitor III CPC Report, only
for intervals in which logical processors were varied online or
offline. The MSU and physical utilization counts were too low,
because online and dispatch times were not considered during these
changing intervals. Now they are.
8. Understanding RMF Workload Manager report - Excellent IBM Discussion
Source..........: CA ASKQQA
Last updated....: 20080331
PROBLEM DETAILS:
I have a few questions:
1) Is there a way to determine quickly how much CPU each SERVICE
CLass is using?
2) Recently we had a sharp increase in our CPU running the same
workload as last week.
12/26/2007
CPU SYSA (55%), SYSB(54%)
01/02/2008
CPU SYSA (39%), SYSB(74%)
We are not having any problems, but we did see SYSB spiking into
the high 90% within the a given 15minute interval, but it average
out to 74% for the fifteen minutes (10:00 to 10:15). Looking at
TMON it shows us that Service Classes (CICSABK and CICSAAT)
increased on SYSB. When I looked at the RMF workload manager
report here is what I see. For the APPL% or CP% IT WAS 324.9.
VELOCITY MIGRATION: I/O MGMT 86.3% INIT MGMT 86.3%
---RESPONSE TIME--- EX PERF AVG --USING%-- -----
HH.MM.SS.TTT VEL INDX ADRSP CPU I/O TOTAL
GOAL 60.0%
ACTUALS
SYSB 86.3% 0.7 6.0 38.1 25.1 10.0
------ EXECUTION DELAYS % ------------- ---DLY%-- -CRYPTO%- %
I/O CPU AUX UNKN IDLE USG DLY QUIE
XMEM
7.6 2.3 0.1 0.0 26.8 0.0 0.0 0.0
Is the USING% for CPU the actual CPU% that this service classes was
using during this 15 minute interval?
We are trying to assess what caused the CPU to spike this week.
There is no additional workload added year end will not be
processing until next week.
Does the APPL% or CP% correlate to actual CPU use?
IBM RESPONSE:
In the RMF WLMGL Report, the field APPL% CP is the sum of the cpu
times (tcb, srb, rct, iit, hst) divided by the reporting interval.
An engine can theoretically be dispatched for the entire interval,
so his is like saying the percentage of an engine. For example, if
APPL% CP is 324.9, that's like saying 3 and a quarter of engine's
worth of cpu resource. So you can quickly scan the APPL% values by
srvclass, to see which srvclass had increased usage of cpu resource
during the SYSB cpu usage spike. Once you've identified the
srvclass which had increased 'APPL% CP' drastically (comparing to
interval from a good normal time), you can go back to the WLM
policy to check what types of jobs get classified into that
srvclass that has grown.
CUSTOMER UPDATE:
Thanks for the quick reply. I do have another question. While
looking at SYSB CPU Activity report, it shows the LPAR MGMT time
being greater than 5 during several intervals (Highest at 6.50). My
question is: Here is the Partition Data Information:
MVS PARTITION NAME ZZ0202
IMAGE CAPACITY 620
NUMBER OF CONFIGURED PARTITIONS 10
NUMBER OF PHYSICAL PROCESSORS 12
CP 12
ICF 0
WAIT COMPLETION NO
DISPATCH INTERVAL DYNAMIC
--------- PARTITION DATA ----------------- -- LOGICAL
----MSU---- -CAPPING-- PROCESSOR-
NAME S WGT DEF ACT DEF WLM% NUM TYPE
ZZ0202 A 81 0 442 NO 0.0 12 CP
ZZ0201 A 5 0 15 YES 0.0 1 CP
ZZ0203 A 3 0 3 NO 0.0 1 CP
ZZ0204 A 5 0 6 NO 0.0 1 CP
ZZ0205 A 3 0 2 NO 0.0 2 CP
ZZ0206 A 3 0 4 NO 0.0 1 CP
ZZ0207 A 1 0 0 NO 0.0 1 CP
ZZ0208 A 1 0 0 NO 0.0 1 CP
ZZ0209 A 5 0 1 NO 0.0 2 CP
*PHYSICAL*
TOTAL
PARTITION PROCESSOR DATA --
----DISPATCH TIME DATA----
EFFECTIVE TOTAL
02.01.38.060 02.08.12.931
00.04.09.386 00.04.13.893
00.00.56.486 00.00.59.156
00.01.47.096 00.01.49.786
00.00.28.968 00.00.31.089
00.01.01.477 00.01.03.485
00.00.00.000 00.00.00.000
00.00.00.000 00.00.00.000
00.00.15.943 00.00.17.880
00.04.51.677
------------ ------------
02.10.17.420 02.21.59.902
-- AVERAGE PROCESSOR UTILIZATION PERCENTAGES --
LOGICAL PROCESSORS --- PHYSICAL PROCESSORS ---
EFFECTIVE TOTAL LPAR MGMT EFFECTIVE TOTAL
67.58 71.23 3.66 67.57 71.23
27.71 28.21 0.04 2.31 2.35
6.28 6.57 0.02 0.52 0.55
11.90 12.20 0.02 0.99 1.02
1.61 1.73 0.02 0.27 0.29
6.83 7.05 0.02 0.57 0.59
0.00 0.00 0.00 0.00 0.00
0.00 0.00 0.00 0.00 0.00
0.89 0.99 0.02 0.15 0.17
2.70 2.70
------ ------ ------
6.50 72.38 78.89
Also from the CPU ACTIVITY you see
CPU ONLINE TIME LPAR BUSY MVS BUSY
NUMBER PERCENTAGE TIME PERC TIME PER
100.00 80.82 88.68
100.00 80.19 86.27
100.00 78.81 83.67
100.00 76.98 81.06
100.00 74.73 78.31
100.00 72.41 75.72
100.00 70.07 73.24
100.00 67.88 70.94
100.00 66.39 69.71
100.00 64.16 67.21
100.00 62.05 64.84
100.00 60.29 62.88
TOTAL/AVERAGE 71.23 75.21
Can you explain why we are seeing the LPAR Busy is less than MVS
Busy. Now the only difference between SYSA and SYSB is that SYSB
has one LPAR CAPPED. One other item. After working the numbers, it
shows me because the weight is (SYSB)810 out of 1090 (Total
weights). This tells me that I am guarantee 8.88% CP which to me is
9CPs, whereas all the rest of the LPARs (4 of them at 0.6 CP, 2 of
them at 0.4 and 2 at 0.1). Since we are capping one Active
Partition, what is the bottom line to this. Am I limiting my Main
LPAR by only giving it a weight of only 810. If I bump it up to 900
and ensure the rest equals 1000, this will ensure my main LPAR
would get at least 10-11 CP? Your thoughts.
IBM RESPONSE: MVS BUSY is a totally different view of cpu
utilization from LPAR BUSY (and it can be a little confusing at
first), so LPAR BUSY and MVS BUSY values won't necessarily match.
The MVS BUSY is the percentage of time this z/OS did not go into
a wait state. So MVS BUSY represents how busy the LPAR was, but it
doesn't show how much the LPAR has consumed its online logical
engines. You would look at the LPAR BUSY to determine this.
The LPAR BUSY is the percentage of this LPAR's online logical
CPs that the LPAR actually consumed. If the number of logical CPs
for an LPAR is equal to the number of physical CPs for the box,
then LPAR BUSY is like saying what percentage of the box the LPAR
is using.
PR/SM will distribute the cpu resources to all LPARs on the same
CEC based on their set WEIGHTs, regardless these LPARs are in the
same sysplex or not.
If the processor box is not 100% utilized, PR/SM would allow an
LPAR to use more than its weight % share, but only if there is some
other LPAR that does not have enough work to do to consume its full
weight % share. Because ZZ0202 is not being capped, PR/SM will
allow it to use more than its weight % share, if the processor box
is not 100% utilization and if there is some other LPAR that does
not consume its full weight % share.
The PHYSICAL PROCESSORS TOTAL is 78.89% so the processor box is not
100% utilized in this reporting interval. But I agree that if you
bump the weight of ZZ0202 to 900 out of total of 1000, this will
ensure ZZ0202 gets its 90% weight share of cpu resource, when the
processor box is pushing at 100% utilized.
Let me know if I can be of further assistance.
CUSTOMER UPDATE:
Thanks for the great explanation, but I am a little confused. When
I looked at the CPU Activity reports during two 15 minute periods
10:15 and 10:30, the LPAR Management is at 6.50 and 5.33. So this
tells me that HYPERVISOR is working hard compare to a maximum of
2.3 and 1.2 on the previous week same timeframe. As well since LPAR
Busy is less than MVS Busy, this tells me MVS did not get all of
its work done.
1) IS this true and is this why the LPAR Management is HIGH?
Further when I looked at the Partition Processor Data it says:
----DISPATCH TIME DATA----
EFFECTIVE TOTAL
02.01.38.060 02.08.12.931
00.04.09.386 00.04.13.893
00.00.56.486 00.00.59.156
00.01.47.096 00.01.49.786
00.00.28.968 00.00.31.089
00.01.01.477 00.01.03.485
00.00.00.000 00.00.00.000
00.00.00.000 00.00.00.000
00.00.15.943 00.00.17.880
00.04.51.677
2) Could you explain what effective versus Total means in these two
columns?
Finally, when I looked at: the AVERAGE PROCESSOR Utilization. You
will see ZZ0204 LPAR is 11.90 and 12.20 in the following table.
LOGICAL PROCESSORS
EFFECTIVE TOTAL
67.58 71.23
27.71 28.21
6.28 6.57
11.90 12.20
1.61 1.73
6.83 7.05
0.00 0.00
0.00 0.00
0.89 0.99
On previous reports this ZZ0204 LPAR was effective 4.4 and 4.5.
3) Can I assume that this means this LPAR was doing more work and
got the processor when it needed it?
4) Now because we only have one LPAR capped in the whole
enterprise, and it is sitting on this particular CPC. Does that
do anything bad on the way it handles the stealing and assigning
of Physical CP. BY the way, the capped LPAR is part of the Same
SYSPLEX, it is the Backup CMC LPAR?
IBM RESPONSE:
First to point out, the sum of your weights is 107, so the big LPAR
is actually defined to have 81/107 = .757 = 75.7 % of the box.
Also, 0.757 * 12 = 9.084 = 9 CPs, so the LPAR is already defined
to be able to use 9 CPs worth of CPU resource.
1) LPAR BUSY being less than MVS BUSY means that MVS dispatched
work onto one of its logical CPs, but PR/SM took away that
physical engine away from the logical engine to give to another
LPAR. Therefore MVS still thinks it has a logical engine (MVS
BUSY clock keeps ticking) but PR/SM knows that LPAR is no longer
running on a physical engine (LPAR BUSY clock is no longer
ticking). The LPAR MGMT column shows overhead of PR/SM
HYPERVISOR, yes. But a big difference between LPAR BUSY and MVS
BUSY does not necessarily mean a big difference in LPAR MGMT.
2) The difference between EFFECTIVE and TOTAL is LPAR MGMT time.
3) Yes the ZZ0204 LPAR must have had more work to do than the
previous interval you are comparing it with, since it used about
3x the amount of CPU compared with the previous interval.
4) Having a capped LPAR only means that the LPAR is not allowed to
use more than its weight % share of the box. It should not
greatly affect the Hypervisor overhead. I imagine that the same
LPAR was capped last week so we can rule out the capping as
being the cause of the increased Hypervisor overhead. The only
worry I would have about capping is that since the LPAR is in
the same sysplex as the other LPARs, is it able to get the
resources it needs so as not to affect sysplex-wide resources
like SYSTEMS level enqueues. I imagine it is getting enough
since it is still using less than its weight % share.
Let me mention the type of things that can cause higher LPAR
MGMT. You want to keep the total # of logical CPs low. When
the ratio of logical to real CPs increases (ie. more logical)
then the pr/sm dispatch interval is shortened. This is so that
pr/sm can give good response time to all the logical CPs. But
this causes extra overhead. Therefore you might want to look
at why you have so many small LPARs, and can you possibly
combine some of them so that we have less work to do in managing
one logical CP for each LPAR when the LPAR hardly ever does
anything. Also, make sure your HMC is not getting any more
messages to the HMC log compared to 'normal'. We saw a problem
some time back when LPAR MGMT went high (>10%), and it turned
out there was an IEFUSI exit issuing messages to the HMC
console. We spin waiting for access to the service processor to
deliver that message, and then do DIAG 44 instruction to tell
PR/SM that we are just spinning, and this caused the higher LPAR
MGMT.
7. APAR OA22993 reports a storage leak in SMSVSAM MMF processing when
RMF III collects SMF 42 records, due to IGWMCIDB control blocks
being left behind when keeping statistics on large numbers of data
sets, leading to ABEND 878 when Subpool 229 Protect Key 5 fills.
When RMF collects statistics, a new CIDB block is obtained each
time and is not freed when SMF42 records size is greater than
what fits in RMF provided space. Over time with alot of
statistics collection, SMSVSAM is filled with CIDB blocks which
eventually leads to ABEND878.
Note: You can put the SMSVSAM JOB name in VSTORE parameter in
your RMF Monitor I options (ERBRMFxx in PARMLIB) to enable
virtual storage monitoring and use TYPE78SP and TYPE78PA
data sets to track that JOB's virtual storage in sub pools
and in its private area, to detect this problem early.
6. CPU Parked Time Metric.
PR/SM data for LCPUADDR 5 in dataset TYPE70PR:
"Online Duration"
======================SMF70ONT===========================
299.97
Online, "Parked" Online,"Dispatched or Not Parked"
=====CPUPATTM====== ======= (SMF70ONT-CPUPATTM) =========
103.22 196.75
Online Online
"Dispatched" "Not Parked"
====LCPUPDTM==== ======PATWAITM======
96.80 99.96
MVS data for CPUID 5 in dataset TYPE70:
SMF70WAT
=ORIGWAIT=
0.0000
This data for LCPUADDR=5 shows a CP engine that was parked for 103
seconds of that 5 minute interval. RMF subtracts the SMF70PAT
parked duration from the SMF70ONT online duration to calculate the
Percent MVS Busy value. In this interval, ORIGWAIT was zero for
this engine, as MVS never entered the wait state on that engine,
so RMF calculates the MVS busy percent as:
PCTMVSBY= 100*(SMF70ONT-ORIGWAIT-SMF70PAT)/(SMF70ONT-SMF70PAT);
PCTMVSBY= 100*( 299 -0 -103) /(299 -103) = 100%
The IBM calculation of the PCTCPUBY, the LPAR CPU busy percent, is
NOT altered by parked time; PCTCPUBY=32%, calculated as
PCTCPUBY= 100*(LCPUPDTM/SMF70ONT); = 100 * (96 / 299 );
The "PATWAITM", the time when the CP engine is "not parked", is the
time when this CP engine could/should have been parked, but was
still online and not-dispatched, because the algorithm to park a
CPU only executes occasionally. It is not created in TYPE70PR.
MXG Change 26.191 implemented the change in PCTMVSVY calculation.
5. APAR OA23174 enables the use of zIIP engines for XRC.
4. APAR OA20921 reports incorrect total frames in TYPE71 for z/OS 1.8
systems with more than 128GB real storage. RMF reported 540,932
while D,STORE reported 540672.
3. Increase in I/O Counts:
APAR Identifier ...... II10752
ICF CATALOG PERFORMANCE PROBLEMS
Note 15 states:
Beginning with HDZ11G0 and in subsequent versions of DFSMS I/O
statistics for catalogs and the Catalog Address Space will appear
differently than earlier releases. Prior to z/OS 1.3 VSAM did the
I/O to VSAM data sets, including catalogs. Starting with HDZ11G0
VSAM uses Media Manager to do all I/O. Prior to HDZ11G0 VSAM
specifically omits the collection of Start-I/O or block counts
when accessing a catalog. Media Manager does not differentiate
between I/O to catalog or another type of data set. You may now
see higher I/O counts for Catalog Address Space I/O requests. The
actual I/O rates have not changed, simply the reporting of them.
2. Improve IDCAMS EXPORT processing of catalogs>
APAR Identifier ...... II10752
ICF CATALOG PERFORMANCE PROBLEMS
Note 16 states:
To improve IDCAMS EXPORT processing of catalogs, specify the
BUFND, BUFNI and BUFNO parameters. To specify BUFND and BUFNI you
will need to use the INFILE parameter for EXPORT. Sample JCL is
below:
//EXPRTCAT EXEC PGM=IDCAMS
//SYSPRINT DD SYSOUT=*
//INCAT DD DSN=MY.CATALOG,DISP=SHR,
// AMP=('BUFND=XXX','BUFNI=YYY')
//OUTCAT DD DSN=MY.EXPORTED.CATALOG,DISP=(NEW,CATLG),
// UNIT=SYSDA,SPACE=(CYL,(10,10)),BUFNO=ZZ
//SYSIN DD *
EXPORT MY.CATALOG -
INFILE(INCAT) -
OUTFILE(OUTCAT) -
TEMPORARY
/*
For BUFND (XXX) use the number of CI's per CA for data component
of the catalog. For BUFNI, compute the number of index records by
dividing the High Used RBA of the index component by the index
component CISIZE and add a value of 5 to 10 to that calculation.
For BUFNO (ZZ) use a value in the range of 30 to 40.
1. APAR PK56492: ITCAMfWAS Version 6 generated huge count of SMF ID=92
subtype 10/11 records. APAR provides a PTF to disable the function
that was erroneously creating those records.
0. zIIP and zAAP measurements when they are faster than CPU engines.
When specialty engines are faster than the speed of your CPs, there
is a normalization factor to convert the recorded seconds to their
NORMALIZED (EQUIVALENT) time, as if they had executed on your CPs.
In all MXG workload datasets, TYPE72GO and RMFINTRV, (and TYPE30),
all time variables for zIIPs and zAAPS are NORMALIZED seconds, and
all of the service units are segregated by engine type.
However, the IBM RMF reports present these data quite differently.
This system has the normalization factor, R723NFFS=569/256=2.222,
that is, one second of zIIP is equal to 2.222 seconds of CP time.
====================================================================
MXG Dataset TYPE72GO dataset values:
====================================================================
SERVICE CPUUNITS ZIPUNITS CPUTCBTM CPUZIPTM
3,932,091 1,793,920 2,137,167 178.92 213.16
====================================================================
RMF WORKLOAD REPORT:
====================================================================
Under "SERVICE TIMES", the RMF "CPU" value of 392.9 seconds is the
total of the real CPU time on CP engines, plus the NORMALIZED CPU
time on the zIIP and zAAP engines; it is NOT the CPU "TCB" time.
( 392.9 = 178.92 + 213.16 "RMF CPU" = CPUTCBTM + CPUZIPTM )
But also under "SERVICE TIMES", the RMF "IIP" (zIIP) value of 96.1
seconds is the UN-NORMALIZED, raw, seconds on the zIIP engine.
And the RMF "AAP" value for zAAPs is also the UN-NORMALIZED value.
And under "SERVICE", the RMF "CPU" value of 3931K is the TOTAL
SERVICE units from CPs, zIIPs, and zAAPs.
REPORT BY: POLICY=OWL WORKLOAD=CSSDDF
TRANSACTIONS ---SERVICE---- SERVICE TIMES ---APPL %---
AVG 0.23 IOC 0 CPU 392.9 CP 4.98
MPL 0.23 CPU 3931K SRB 0.0 AAPCP 0.00
ENDED 51 MSO 0 RCT 0.0 IIPCP 0.07
END/S 0.01 SRB 0 IIT 0.0
#SWAPS 0 TOT 3931K HST 0.0 AAP N/A
EXCTD 0 /SEC 1092 AAP N/A IIP 2.67
AVG ENC 0.23 IIP 96.1
====================================================================
While the workload datasets have normalized CPU time, in all of the
"hardware" datasets, TYPE70, TYPE70PR, ASUM70PR etc., the CPU times
for the zIIP and zAAP engines are the raw seconds of CPU Dispatch
Time on those engines, and is NOT normalized. As a result, then,
the total ZIPACTTM recorded in TYPE70 for the above system for the
day was 10,887 seconds, but the total CPUZIPTM in TYPE72GO for the
day was 23,079 seconds.
Those 10,887 raw hardware seconds would be 24,190 normalized seconds
so the zIIP capture ratio at this site is 23079/24190 = 95.4%.
IV. DB2 Technical Notes.
4. APAR PK90013 provides enhancements to a batch reporting program,
DSN1SMFP, that supports "Common Criteria", an international standard
that helps to ensure security of computer systems in a network
environment. A Common Criteria-compliant environment is very
restrictive and is not intended for use by most DB2 customers.
The DSN1SMFP program reads these DB2 IFCID records:
* 0003: Accounting - DDF Data by Location (security-relevant
fields only)
* 0004: Trace Start
* 0005: Trace Stop
* 0023: Utility Start
* 0024: Utility Change
* 0025: Utility End
* 0083: An Identify Request End
* 0106: System Parameters (security-relevant fields only)
* 0140: Audit Authorization Failures
* 0141: Audit DDL Grant/Revoke
* 0142: Audit DDL Create/Alter/Drop
* 0143: Audit First Write
* 0144: Audit First Read
* 0145: Audit DML Statement
* 0269: Trusted Connection
* 0270: Trusted Context Create/Alter
* 0350: SQL Statement
and apparently writes each IFCID to a separate DD. If you nave
need of DSN1SMFP reporting from MXG, please provide an example
report, and MXG will be enhanced to match the report. However,
I believe as a minimum, you can use
%READDB2(IFCIDS= 3 4 5 23 24 25 83 106 140 141 142 143 144 145 269
270 350);
%VMXGPRAL(DDNAME=WORK,NOBS=MAX);
to print ALL of the variables from each of those IFCIDs.
3. APAR PK69111 reports "millions of" IFCID 173 (SMF 102) records being
written, currently no PTF but a local fix of "Stop RLF". Jul 20, 08.
2. DB2 SMF 102 IFCID 142 ALTER records are not written for all alters.
At present, only ALTERs where the AUDIT attribute is changed are
audited. Changes such as the addition of a column, VARCHAR length,
etc, are not currently written to SMF. DB2 support commented that
they do have an upcoming design change request for DB2 V9 that will
change the audit behaviour for the ALTER TABLE such that any ALTER
of an audited table will be audited, including ALTERs to add
columns, but no date has been announced. May 22, 2008.
1. APAR PK62743 for Websphere for Z/OS 510 reports increased zAAP CPU
and Elapsed Runtime Increases. The CPU and runtime increases are
directly related to the number of times a resource lookup is done as
the application runs. Under LOCAL FIX: If possible, change the
application code to do less resource lookup calls. (Caching resource
data often helps reduce the number of resource lookup calls, FYI.)
V. IMS Technical Notes.
VI. SAS Technical Notes.
9. IBM APAR OA25725 required for SAS ITRM if some files are stored in a
zSeries File System (zFS): SAS, and several Customers of SAS ITRM
3.1.1, have discovered, and IBM has corrected, a problem in the zFS
file system component of the z/OS operating system. This problem is
fully documented in the following Usage Note:
Usage Note 16333: Possible corruption of SAS IT Resource
Management aggregation table if it is stored in a zSeries File
System (zFS) available at
http://support.sas.com/kb/16/333.html.
ALL consumers of SAS ITRM 3 are encouraged to obtain and apply APAR
OA25725 at their earliest possible convenience.
8. SAS Note 32065 lists all z/OS dataset names used by SAS V9.2. 28May.
The following is a description of all the physical data sets that
are created when installing SAS version 9 on z/OS. You may not
have all of these data sets because some only are created if you
license specific SAS products. This list applies to SAS 9.0
through SAS 9.1.3. The data sets are slightly different in SAS 8.2
and SAS 9.2.
SAS Technical Support highly recommends that you not delete any of
these data sets, even if you know you will never use them. Future
updates or adding additional products to this image may fail if
the image is not complete. If you want to save DASD space, then we
recommend that you archive any unused data sets to tape instead of
deleting them.
Files that make up the SAS System
** &prefix is the prefix specified at the time of your install.
** For more information on Languages / Encodings see the last
section in the SAS Installation Guide for z/OS
&prefix.BAMISC - Base Miscellaneous PDS
&prefix.CLIST - Where generated CLISTS are written
&prefix.CNTL - Install CNTL data set
- If you installed using the wizard, it will be called
&prefix.Vxxxxxxx.CNTL where xxxxxxx is based on the
julian date of the installation.
&prefix.CNTL.RENEW - Contains SID/setinit information
&prefix.CTMISC - SAS/Connect Miscellaneous PDS
&prefix.DBCS.LIBRARY - Double Byte Character Set Load Library
&prefix.DBRM - SAS/Access to DB2 Miscellaneous PDS
&prefix.DQ.* - SAS/Data Quality data sets
&prefix.xxyy.SASHELP - SASHELP library (xx=language,yy=encoding)
- Allocated to SASHELP DD in CLIST and PROC
&prefix.xxyy.SASMSG - SASMSG library (xx=language,yy=encoding)
- Allocated to SASMSG DD in CLIST and PROC
&prefix.xxyy.XREG.TXT - Registry source (xx=language,yy=encoding)
- The registry is built during the install
&prefix.GRMISC - SAS/Graph Miscellaneous PDS
&prefix.ITMADPT.* - SAS Solution adapters files
&prefix.ITRM.* - IT Resource Management files (ITRM)
&prefix.LIBRARY - SAS Load Library
- Allocated to the STEPLIB DD in the JCL proc, and to tasklib
in the CLIST
&prefix.NEWS - News data set, echoed into SAS LOG
&prefix.PROCLIB - Generated JCL procs are written here
&prefix.SAMPLE - Sample library - contains source code
&prefix.SAMPSIO - Sample data library
- SAS data library that contains the data that the programs
in the SAMPLE library use
&prefix.SASC.TRANSLIB - SAS/C transient library
&prefix.*.TTF - True type font files
&prefix.SASSAML - Used with SAS/Share product
&prefix.SEMISC - SAS/Session Miscellaneous PDS
&prefix.TKMVSENV - Default settings for environment vars
- The TKMVSENV member is allocated to the TKMVSENV DD in
CLIST/JCL Proc
&prefix.TOOLKIT.* - SAS/Toolkit files
&prefix.USAGE.HFAUD - Hot Fix Audit files
&prefix.USAGE.LIBRARY - SAS Note library
&prefix.USAGE.ZAPS - Zap library
&prefix.WEB.TAR - USS TAR files
- Used during installation
&prefix.yy.AUTOLIB - Autocall library (yy=encoding)
- Allocated to SASAUTOS DD in CLIST and PROC
&prefix.yy.ITRM.* - ITRM SAS data libraries (yy=encoding)
&prefix.yy.MAPS - SAS/Graph Maps data set (yy=encoding)
&prefix.yy.SRVCFG - Config files for various servers
&prefix.yy.SRVCLIST - generated CLISTS for various servers
&prefix.yy.SRVCNTL - Misc stuff for various servers
&prefix.yy.SRVENV - Environment variable settings for servers
&prefix.yy.SRVPARM - parm files for various servers
&prefix.yy.SRVPROC - procs for various servers
&prefix.yy.SRVREXX - REXX execs for various servers
&prefix.yy.TARFILES - USS tar files
- Used during the install
&prefix.yy.TESTS - Installation validation files
- Used after the install by the VALID jobs
&prefix.yy.XREG.TXT - Registry source (yy=encoding)
- The registry is built during the install
&prefix.ZT.*.GFONT - Asian language graphic fonts
7. Support for SAS Version 9.2: WARNING/RETURN CODE issues RESOLVED!
Previous MXG Versions run without error with SAS Version 9.2.
and
MXG Version 26.03 or SAS Hot Fix F9BA07 eliminated the new WARNING.
SAS V9.2 Hot Fix F9BA07 corrects the WARNING/RETURN CODE issues that
were introduced in SAS 9.2, that were reported in this Note prior to
August 20, 2008, and that were also circumvented in MXG 26.03.
SAS Note SN-031850 discusses the original problem, but that Hot Fix
restores SAS 9.2 to the prior 9.1.3 behavior; INSTALL THIS HOT FIX!
SAS V9.2 does set CONDITION CODE 4 for all WARNING messages, whereas
only some WARNINGs previously set a non-zero RETURN CODE But, prior
to the Hot Fix (and without MXG 26.03 revisions) a new message
"WARNING: Multiple lengths specified for variable XXXXXX"
set condition code 4 (17,000 times in the MXG QA with first V9.2!).
HOWEVER: EVEN THEN, THE V9.2 OUTPUT DATASETS WERE PERFECTLY VALID,
AND THOSE MESSAGES HAVE NO REAL IMPACT ON MXG OUTPUT.
Changes 26.189, 26.090, 26.078, 26.065 and 26.060 have V9.2 details.
Additionally, SAS changed the DSNAMES of some of their datasets in
their z/OS JCL procedure; see Change 26.193.
6. Using SAS/ITRM's %cpdupchk macro to remove duplicate input records
can be very CPU-intensive, when there are lots of output datasets
created, as this example using TYPETNG (over 200 datasets) shows:
With %cpdupchk macro, cpprocess step took:
real time 10:14:17.77
cpu time 5:52:36.51
Without %cpdupchk macro, cpprocess step took:
real time 4:07:35.24
cpu time 2:14:21.26
5. SAS error message
LIBRARY xxxx IS NOT IN A VALID FORMAT FOR ACCESS METHOD SASE7
resulted at one site because the RECFM=FS had not been set in the
DCB when the library was created. Usually, DISP=NEW datasets will
get their DCB from SAS, but site SMS allocation defaults may need
to be overridden to specify RECFM=FS for SAS datasets.
4. SAS format SIZEKMG displays bytes differently than MXG's MGBYTES.
The SAS format SIZEKMG prints byte values with K, M, etc suffix,
as MXG's MGBYTES format has done for years, but SAS rounds values,
so 32255 bytes (31.4KB) is 31KB with SIZEKMG and 31KB with MGBYTES.
so 32256 bytes (31.5KB) is 32KB with SIZEKMG and 31KB with MGBYTES.
so 32767 bytes (31.9KB) is 32KB with SIZEKMG and 31KB with MGBYTES.
so 32768 bytes (32.0KB) is 32KB with SIZEKMG and 32KB with MGBYTES.
If you prefer the rounding up in your printed reports, you can use
this old style macro definition:
MACRO MGBYTES SIZEKMG %
and all of the MGBYTES. references in MXG source code will instead
be compiled as SIZEKMG. and you'll have rounded values printed.
Jan 29, 2008.
3. SAS REALMEMSIZE is not useful option for MVS.
The REALMEMSIZE option was designed primarily for use on operating
systems such as Windows or Unix where an individual user's virtual
memory can exceed real memory. On MVS that situation is unlikely.
REALMEMSIZE is probably not a useful tuning option on MVS so it is
recommended to be left at its default value. The option
specification is only for compatibility with other systems.
REALMEMSIZE is not adjusted to the real memory size because the
concept doesn't really apply on MVS, given that virtual memory is
capped by the REGION specification and we don't even know how much
real memory is on the system. The option is really about REAL
memory, not virtual memory.
Jan 25, 2007, provided by SAS Developers.
2. MXG "compatibility or support" with SAS/ITRM.
There is often confusion with the term "support" or "compatibility"
between SAS/ITRM Versions and MXG Software Versions. In Cary, SAS
executes the current SAS/ITRM (now 2.7) every night with the newest
MXG Version (now 25.11), and it has been years since there has been
any execution issues; often, you must "drop in" the newest
MXG Version to support a new operating system release, and you can
always do that without installing a new ITRM Version, so we believe
that MXG and ITRM are always mutually "compatible".
With regard to ITRM "supporting" new MXG variables or new datasets
in ITRM output "PDB" data libraries, the ITRM Version does impact:
Only the MXG variables and datasets that existed in the MXG Version
that was used to update the ITRM dictionary will be kept in ITRM
output datasets (not even the DETAIL will have new vars/datasets).
However: you can use the MXG EXPDBOUT exit to PROC COPY any MXG
dataset from WORK to a separate LIBNAME; the EXPDBOUT
exit is taken after SMF has been written to WORK and
before ITRM has processed them.
The ITRM 2.7 dictionary was created from MXG 24.04, but a hotfix is
coming soon with an ITRM dictionary based on Sep, 2007's MXG 25.08.
1. Use of SAS SPD Engine (SPDE) with MXG.
MXG makes extensive use of SAS Views which are not supported with
the SAS SPD Engine (SPDE), per SAS 9 documentation on The SPD
Engine. Also, consider that a zFS (as well as hiperspace, for that
matter) is a fixed-size at SAS initialization time and do not honor
the concept of getting additional "extents" depending on an
individual process requirement. Scott Barry posting, 21Dec2007.
VI.A. WPS Technical Notes.
7. Comparison of SAS V9.2 and WPS 2.3 on z/OS:
BUILDPDB and ASUMs with 448 MegaByte input SMF file;
z/OS Step Totals:
SAS V9.2 WPS 2.3 RATIO
mm:ss sec mm:ss sec
Total CPU time 03:45 225 08:30 510 2.26
Total Elapsed time 08:00 480 16:58 1018 2.10
Total EXT Memory 104 MegaBytes 188 MegaBytes
Total SYS Memory 11 MegaBytes 504 MegaBytes
Total Memory 115 MegaBytes 692 MegaBytes
SMF Data Step
mm:ss sec mm:ss sec
Elapsed time 01:12 72 03:27 207 2.85
SMF Read Rate 6.22 MB per sec 2.16 MB per sec
See MVS Technical Note 13 in this newsletter for additional
information on the virtual storage in the EXT and SYS fields
of the IEF374I step termination message.
-z/OS 1.12 message IEF032I/33I replaced IEF374I/276I.
6. With the below five exceptions, the MXG QA test did complete with
WPS 2.2.2 Build 8792 on Windows earlier this year.
5. WPS 2.2.2 Build 8792 failed in ANAL30DD because the %RUN; statement
is not supported, leading to these error messages:
%RUN;
WARNING: Apparent invocation of macro "RUN" not resolved
ERROR: Expected a statement keyword : found "%"
The %RUN; statement was unintended; it was supposed to be just RUN;
but (unbeknownst to me before now), there is a %RUN; statement that
is used to terminate a %INCLUDE *; statement for terminal input.
WPS Issue 5050, corrected in WPS 2.2.2 GA Build 9037, Mar 10, 2008.
4. WPS 2.2.2 Build 8792 failed in ANALRMFR with a compiler error inside
a %macro expansion:
+ SMF70SPN SMF70CIN SMF70STN SORTNUM; TITLE DATA=TEMP70S ;
+! PROC PRINT DATA=TEMP70S HEADING=H;
+! WHERE (SMF70CIN='IFA' OR SMF70CIN='IIP');
+! RUN; DATA _NULL_;
+! RETAIN PCTEFVT0-PCTEFVT9 PCTEFVTA PCTEFVTB PCTEFVT;
ERROR: Found "DATA" when expecting one of ANGLE, A, COLOR, C,
FONT, F, HEIGHT, H, JUSTIFY, J, ROTATE, R
WPS Issue 5049, corrected in WPS 2.2.2 GA Build 9037, Mar 10, 2008.
3. WPS 2.2.2 Build 8792 failed in ANALCISH in the $macro processor with
ERROR: The %IF condition is invalid. The condition was:
&REP EQ CICDMR OR &REP EQ CICDMG AND &SWDOMN EQ 0
Discovered Feb 9.
WPS Issue 5048, corrected in WPS 2.2.2 GA Build 9037, Mar 10, 2008.
2. WPS 2.2.2 Build 8792 executed MXG QA tests but these members do not
execute for the reason noted:
ANALAVAL - PROC CALENDAR does not exist in WPS.
ANALCICS - HBAR statement in PROC CHART is not supported.
ANALCISH - ERROR: THE %IF CONDITION is invalid.
ANALMONI - HBAR statement in PROC CHART is not supported.
ANALMPL - VREVERSE in PLOT statement is not known in PROC PLOT.
ANALTAPE - VREVERSE in PLOT statement is not known in PROC PLOT.
ANALPATH - OVERPRINT option in PUT statement is not supported.
ANALPRNT - HBAR statement in PROC CHART is not supported.
ANALRMFR - ERROR: Found "DATA" when expecting Angle.
ANALSMF - HBAR statement in PROC CHART is not supported.
ANAL30DD - %RUN;- Expected a statement keyword, found "%"
ANAL80A - PROC REPORT not known.
1. WPS 2.2.1, the then current GA Version, failed in MXG QA test,
with ERROR: Illegal length 40 supplied for format
(an internal WPS "format", not related to an MXG Format).
This has been corrected (Bug 5004) in WPS 2.2.2 Build 8792,
but only for z/OS; as of Feb 9, there is no z/OS Build 8777+.
WPS Issue 5004, corrected in WPS 2.2.2 GA Build 9037, Mar 10, 2008.
VII. CICS Technical Notes.
1. MXG updates to an ASKQQA Item:
How can I exclude specific fields from performance-class monitoring
records; we don't use any web related transactions, so how can I
exclude those fields from my SMF 110 subtype 1 CICSTRAN records?
Excluding unused fields in the MCT can reduce the size of the CMF
performance records written as SMF 110 records and is very commonly
done by MXG users.
a. There are some sample MCTs in SDFHSAMP that show how to exclude
fields. The names of the members are of the form DFHMCTxx. One
these, DFHMCTF$, is designed for a file owning region (FOR) where
there would be no web work done so it does show the web entries
being excluded. However, it does exclude other items that you
would not want to exclude in a TOR or AOR.
b. The groups that you might exclude that are related to the web
include DFHDOCH (document handler), DFHSOCK (assuming you do not
use any CICS TCPIPSERVICEs) and DFHWEBB (web support). There may
be individual fields in some other groups that relate to the web so
you could review the other groups.
Other groups not specifically related to the web that are often
candidates for exclusion are DFHCBTS (business transaction
services), DFHDATA (IMS and DB2), DFHEJBS (EJBs) and DFHFEPI
(FEPI).
VIII. Windows NT Technical Notes.
IX. z/VM Technical Notes.
X. Incompatibilities and Installation of MXG vv.yy.
1. Incompatibilities introduced in MXG 25.yy (since MXG 24.24):
See CHANGES.
2. Installation and re-installation procedures are described in detail
in member INSTALL (which also lists common Error/Warning messages a
new user might encounter), and sample JCL is in member JCLINST9 for
SAS V9.1.3 or JCLINST8 for SAS V8.2.
XI. Online Documentation of MXG Software.
MXG Documentation is now described in member DOCUMENT.
XII. Changes Log
--------------------------Changes Log---------------------------------
You MUST read each Change description to determine if a Change will
impact your site. All changes have been made in this MXG Library.
Member CHANGES always identifies the actual version and release of
MXG Software that is contained in that library.
The CHANGES selection on our homepage at http://www.MXG.com
is always the most current information on MXG Software status,
and is frequently updated.
Important changes are also posted to the MXG-L ListServer, which is
also described by a selection on the homepage. Please subscribe.
The actual code implementation of some changes in MXG SOURCLIB may be
different than described in the change text (which might have printed
only the critical part of the correction that need be made by users).
Scan each source member named in any impacting change for any comments
at the beginning of the member for additional documentation, since the
documentation of new datasets, variables, validation status, and notes,
are often found in comments in the source members.
Alphabetical list of important changes after MXG 24.24 now in MXG 25.01:
Dataset/
Member Change Description
See Member CHANGES or CHANGESS in your MXG Source Library, or
on the homepage www.mxg.com.
Inverse chronological list of all Changes:
Changes 25.yyy thru 25.001 are contained in member CHANGES.