Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!natinst!sequoia!uudell!pensoft!robin
From: robin@pensoft.UUCP (Robin Wilson)
Newsgroups: comp.unix.aix
Subject: Re: Making A request to IBM
Summary: Complete description of IBM software support.
Message-ID: <3300@pensoft.UUCP>
Date: 20 Mar 91 16:08:40 GMT
References: <1296@dkunix9.dk.oracle.com> <1991Mar15.123532.8036@odi.com> <5958@awdprime.UUCP>
Organization: Pencom Software, Austin, TX
Lines: 182

OK here it is... the complete (almost) description of how IBM software support works.

The customer is supposed to call the Systems Engineer (SE) for ANY problems with their system. The SE is responsible for Problem Determination (PD) and Problem Source Identification (PSI). Often the SE's experience is with systems other than IBM AIX, so the PD/PSI is sometimes not very complete when done by a "less experienced" SE. This is one area where IBM sometimes has trouble.

If the SE is either not experienced enough, or is unable to resolve your problem, he is supposed to determine whether the problem is a "DEFECT" or "HOW-TO". Unfortunately, someone with little experience usually doesn't know enough to tell a "DEFECT" from a "HOW-TO". So they call the channel that provides the fastest response: AIX Software DEFECT Support Level 2.

For a second, let's assume they really did know that the problem was "HOW-TO" and they followed the proper channel to resolve it. HOW-TO problems go through the following chain of people. The SE contacts the Area Specialist (AS). If the AS cannot solve the problem, he directs the SE to the National Technical Support Center (NTSC). The SE can contact this center through electronic mail ONLY, so this method usually takes several days to resolve a problem. (IBM is working on making the Tech Support group accessible by phone, but right now that is not a possibility.)
If the NTSC cannot resolve the problem, they will contact Level 2 DEFECT support (to find out if "Anybody has heard of this problem"). If Level 2 can't help, NTSC will contact Level 3 (the Change Team), and finally NTSC will contact Development.

NOTE: there is a distinct difference between Level 3 (CT) and Development. Development writes the NEW code for the "next" or "future" releases, and CT maintains the existing release(s).

Now here's what happens when the problem goes from the SE to Level 2 DEFECT support. (This does not assume that the problem is a HOW-TO; either way it starts out the same at Level 2.)

IBM Level 2 Software Defect Support (L2) is just what the name implies: DEFECT support. They are contacted by calling the 1-800-237-5511 number. The people answering the phone are at one of several regional support centers. They are Level 1 support, and they receive calls for ALL IBM systems and OS's. They ask the caller a few questions: "customer number", "type of machine", "operating system", etc. Then the Level 1 person enters the information into a database system that is distributed across the nation and routes it to the proper L2 group for the product (well, usually they do... sometimes they accidentally route calls to the wrong group, but usually a callback solves that).

If your product happens to be the RS/6000 and AIX V.3, they route you to Austin, Texas, and live transfer your call to a Level 2 representative here in Austin. (RT, AIX 370, and AIX PS/2 also get routed to Austin, but their calls are not live transferred; the Level 2 person must call the customer back.) The person answering the call is a FULL Level 2 support person who has been assigned to answer incoming calls at that particular time (or who just saw the light blinking on the wall -- that indicates an incoming call has not yet been answered -- and decided to grab the call).
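
For anyone who would rather see the Level 1 routing step as a table, here is a rough sketch in Python. It only encodes what is described above; the names (Call, AUSTIN_PRODUCTS, route) are made up for illustration and have nothing to do with whatever system IBM actually runs. The post does not say where other products are routed, so that case is left unspecified.

```python
from dataclasses import dataclass

# Products that are routed to Level 2 in Austin, Texas.
# True  = the call is live transferred to a Level 2 person.
# False = the Level 2 person calls the customer back.
AUSTIN_PRODUCTS = {
    "RS/6000 AIX V.3": True,
    "RT": False,
    "AIX 370": False,
    "AIX PS/2": False,
}


@dataclass
class Call:
    """The basic questions Level 1 asks (hypothetical field names)."""
    customer_number: str
    machine_type: str
    product: str


def route(call):
    """Return (site, handoff) for a call taken by Level 1."""
    if call.product in AUSTIN_PRODUCTS:
        live = AUSTIN_PRODUCTS[call.product]
        return ("Austin", "live transfer" if live else "callback")
    # Assumption: anything else goes to some other L2 group that is
    # not named here.
    return ("unspecified L2 group", "unspecified")
```

So route(Call("0000000", "RS/6000", "RS/6000 AIX V.3")) gives ("Austin", "live transfer"), while an RT call gives ("Austin", "callback").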
NOTE: Just for clarification, L2 for the RS/6000 takes over 250 calls every day, so sometimes it takes several minutes to get to a specific call during a busy time. IBM is committed to providing instant response to all calls, but sometimes the system they use hits a glitch (when this happens they are usually quick to make adjustments).

Anyway, when the L2 person takes the live call, they have no way of knowing what the caller is having trouble with, so they start by asking questions. Sometimes you get lucky and happen to have your call answered by someone knowledgeable in your specific problem, but more often the first person you talk to will take down basic information and queue your problem over to someone who works in the group that best understands your problem. These people will review the basic information provided by the call taker, and then proceed to resolve the problem. This usually requires contacting the customer for more information, running test cases, attempting to re-create your problem locally, etc.

If the problem is a "HOW-TO" question, Level 2 is required to send the customer back to the SE. When I left L2 (about 2 months ago) we averaged about a 60-40 HOW-TO to DEFECT ratio. So you can see that DEFECT support often spends significant amounts of time determining whether a problem is HOW-TO, and then contacting the customer to have them call the SE back.

When L2 has sufficiently tested a problem to determine that "there appears to be a defect", they will pass the problem on to the Change Team (CT). The CT (also called Level 3 (L3)) will then read through the problem record and evaluate the problem as a possible code defect. Try to remember that by-and-large the L2 person is less knowledgeable (although not in all cases) than the CT person, so some of the problems are rejected by CT as "User Errors" (which is functionally the same as "HOW-TO"). Basically the CT can close a problem in the following ways:

  USE: User error. The customer is not using the program as it was intended/documented.

  IDD: Documentation error (IDD is the IBM document group). This is sometimes used instead of a USE when the documentation is unclear, but the code was not intended to be used the way the customer was attempting to use it.

  PER: Programming Error (in the IBM supplied code). This indicates that a software DEFECT was found, and corrected.

  PRS: Permanent Restriction. There is a software DEFECT (or code error) but it is not reasonable to fix it at this time. (It may be fixed in a later release.)

  SUG: Suggestion. The code is working as designed, but the requested change is being evaluated for a possible future enhancement.

  MCH: Machine Error. The provided debug information indicates a hardware error. This can either be a hardware design defect, or a specific defective piece of hardware.

  UR5: Unreproducible at the described level. Basically this is only used when the problem is clear (i.e. this program should do this but instead it does this...)

There may be a few more, but these are the most widely used closing codes.

When a DEFECT is corrected, CT reviews the code change, and then builds the code change into an update. The update is then tested by the Regression Test Lab, to see if the update has any defects. Of course the regression test is not perfect, so problems sometimes slip through... Then the update is tested by the CT people who made code fixes. Each person tests their own fixes. Some programs require special equipment and cannot be tested in the lab environment at IBM, so they are sent off to the customer to test before the Regression Tests begin (just the specific program that was failing). The customer then tests the new version, and the CT person merely verifies that the code the customer tested is the same as the "REAL" update.

Some other procedural thingies...

When Level 1 takes your call, they create a PMR (Problem Management Record), and then queue that PMR to L2. Level 2 is responsible for the PMR until it is resolved.
When Level 2 decides that the problem is a possible DEFECT, they create an APAR (Authorized Program Analysis Report). The APAR is sent to CT, and a copy of the PMR is sent to the CT person who will work the APAR. CT is responsible for the resolution of the APAR.

When a PMR is created, it is given a PRIORITY. This indicates how quickly the customer requires a response on the PMR FROM LEVEL 2. This priority is set by the customer. It is a number from 1-4, and indicates the following:

  1) - 1 hour contact required...
  2) - 2 hour contact required...
  3) - 1 day contact required...
  4) - 1 week contact required...

NOTE: there is no requirement that the problem be serious for the customer, only that the customer be contacted within the specified time period. On the live transfers, this number is irrelevant until the problem is queued up to the Subject Matter group for the problem. Then this number is used to determine the priority the call takes in getting a Level 2 response.

When an APAR is created, it is given a Severity. This is also a number from 1-4, but it is not determined by the customer. This number is determined by the Level 2 person who creates the APAR, and is based on several criteria:

  1) - The customer's machine is not operational. This requires 24-hours-a-day response. CT must work round-the-clock to provide a workaround solution to get the customer minimally operational.

  2) - The customer's operations are seriously impacted. The CT must provide a "code fix" (for DEFECT problems) within 10 days.

  3) - The customer is not seriously impacted, but the problem is affecting customer operations minimally. The CT must provide a "code fix" within 26 days.

  4) - Reserved for DOCUMENTATION errors. The IDD group must accept the problem and agree to a documentation change within 40 days.

The L2 person will attempt to work with the customer to get the proper severity set for a problem, but usually these are the criteria that he/she must meet in order for CT management to work the problem.
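
To keep the two numbers straight (PMR priority is set by the customer, APAR severity is set by Level 2), here is a small sketch in Python that turns the lists above into lookup tables. The function and table names are made up for illustration. It also assumes plain calendar time, since nothing above says whether "days" means working days, so treat the dates it produces as rough.

```python
from datetime import datetime, timedelta

# PMR priority (set by the customer): how soon Level 2 must make contact.
PRIORITY_CONTACT_WINDOW = {
    1: timedelta(hours=1),
    2: timedelta(hours=2),
    3: timedelta(days=1),
    4: timedelta(weeks=1),
}

# APAR severity (set by the Level 2 person): days allowed for the commitment.
# Severity 1 has no day count; CT works round-the-clock on a workaround.
# Severity 4 is the window for IDD to accept a documentation change.
SEVERITY_DEADLINE_DAYS = {1: None, 2: 10, 3: 26, 4: 40}


def contact_due(opened, priority):
    """Latest time Level 2 should contact the customer for a PMR."""
    return opened + PRIORITY_CONTACT_WINDOW[priority]


def fix_due(opened, severity):
    """Deadline for the APAR commitment, or None for severity 1."""
    days = SEVERITY_DEADLINE_DAYS[severity]
    if days is None:
        return None
    # Assumption: calendar days, not working days.
    return opened + timedelta(days=days)
```

For example, a priority 2 PMR opened at 16:00 on 20 Mar 1991 is due a contact by 18:00 the same day, and a severity 3 APAR opened on 20 Mar 1991 would be due a code fix by 15 Apr 1991 under the calendar-day assumption.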
NOTE: the SEV1 24 hour response is intended to be for a "workaround" to the problem. That means that CT will spend their efforts trying to find the FASTEST method to get the customer operational (minimally). Sometimes this will involve the actual code fix, but often it does not. Once the customer is operating again, the problem should be lowered to Sev2.

Sorry for the length of this posting, but I hope this clears up some of the mystery behind IBM software support.

+-----------------------------------------------------------------------------+
|The views expressed herein, are the sole responsibility of the typist at hand|
+-----------------------------------------------------------------------------+
|UUCP: pensoft!robin                                                          |
|USNail: 701 Canyon Bend Dr.                                                  |
|        Pflugerville, TX 78660                                               |
|  Home: (512)251-6889          Work: (512)343-1111                           |
+-----------------------------------------------------------------------------+