Newsgroups: comp.archives
Path: utzoo!utgpu!news-server.csri.toronto.edu!ox.com!msen.com!emv
From: buck@nrl-cmf.UUCP (Loren Buchanan)
Subject: [unix-admin] Re: Network Queuing System (NQS)
Message-ID: <1991Jun20.132659.29048@ox.com>
Followup-To: comp.unix.admin
Keywords: NQS, Cray, SGI, Sun, VAX, IBM, Stardent
Sender: emv@msen.com (Edward Vielmetti, MSEN)
Reply-To: buck@caligula.nrl.navy.mil (Loren Buchanan)
Organization: Naval Research Laboratory, Washington, DC
References: <336@ra.nrl-cmf.UUCP>
X-Original-Date: 14 Jun 91 19:46:10 GMT
Date: Thu, 20 Jun 1991 13:26:59 GMT
Approved: emv@msen.com (Edward Vielmetti, MSEN)
X-Original-Newsgroups: comp.unix.admin
Lines: 268

Archive-name: unix/batch/nqs/1991-06-14
Archive: pemrac.space.swri.edu:/public/convexug/nqs.tar.Z [129.162.150.4]
Original-posting-by: buck@nrl-cmf.UUCP (Loren Buchanan)
Original-subject: Re: Network Queuing System (NQS)
Reposted-by: emv@msen.com (Edward Vielmetti, MSEN)

This is the response document to the questions I posed about NQS last week. I have filtered out most of the noise (and in one case most of the meat). It appears as though we will start with the code from COSMIC, 382 East Broad St., Athens GA 30602, or if you want to call, John A. Gibson, Director, (404) 542-3265.

Does anyone have any experience with the Sterling or General Atomics versions they would care to share with the rest of us?

From: bernhold@qtp.ufl.edu

>Are there reasons to not use NQS?

Most of the vendors you listed don't (to my knowledge) offer NQS with their systems. You'll have to get it elsewhere. It is not PD. It was developed on contract from NASA with public funds. It is sold to try to recover costs via NASA's COSMIC distribution center. The cost of the original version of NQS (_not_ what you'll get from Cray!) last time I checked was $6000. Don't know if you'd get some kind of deal for being "family".

The current commercial versions of NQS are very nice -- much advanced over the one to be had from COSMIC (that may have changed by now -- see below), but either should probably be workable. The one thing you'll probably need which isn't in the older version is the ability to specify a remote username to run under (verified with .rhosts, etc.). Otherwise, there is no facility (in the original NQS) for one userid to submit the job to run under another userid on the remote machine. Given a knowledge of the communication protocol between NQS daemons, this shouldn't be hard to implement in the old code (I say that without having looked at the old code!).

Caveat: We are running the original NQS only and haven't yet tried to speak to a machine running a more current commercial version -- who knows what may have changed in the protocols!

About the different versions: The original is definitely available from COSMIC. With some work, we got it to run on our Suns and FPS. When I asked for information on NQS a while ago, I was told that a) the original is being upgraded -- bugs fixed, perhaps _some_ enhanced capabilities and this may be at COSMIC by now; b) there is a brand new development, NQS II beginning, which is to rewrite the whole thing from scratch to address needs which didn't exist when NQS (I) was designed -- mostly distributed computing, I think. Since NQS is going to be a POSIX standard too, I imagine, but don't know for sure, that NQS II will become POSIX-compliant. I think NQS II is expected to be available from COSMIC also, but I don't know the time frame.

I don't know the legality of it, but there used to be a copy of the original NQS available from the Convex User Group archive on pemrac.space.swri.edu. I checked on it a while after its existence had been widely announced on the net, and it was still there -- so either no one who cares heard about it or no one cares or someone is being stubborn in not removing it. Take it as you will.

I would like to hear any more up-to-date information -- particularly on (a) vendors planning to support NQS and (b) updated versions of NQS and where to obtain them.

From: jones@hermes.chpc.utexas.edu

You should run NQS on the Cray. You can get a version of NQS from COSMIC (at a one-time price) that you can run on your SGI, SUN, VAX and Stardent. You may have to do some porting. It's not hard once you understand the source: I ported it to AIX in about two days, but it will take at least a month of work to get to the point where you can do this. The nice thing about the COSMIC version is that you can do what you want with it so long as you don't give it to foreigners. (You will also have to modify it to understand Cray's tape conventions.)

STERLING SOFTWARE also sells NQS. They sell it per CPU and they also do maintenance on NQS. I don't know yet if they support the Cray tape conventions. They have ported it to AIX.

You can also check out RQS from Cray. It allows you to submit jobs to the Cray NQS and get the output files back.

Bill Jones

From: nash@ucselx.sdsu.edu (Ron Nash)

Here in San Diego, the Cray runs EZBATCH. Here is the manual:
[[[with large chunks of the manual deleted]]]

EZBATCH

Scope
     EZBATCH discusses the basics of using the Network Queuing System (NQS), the UNICOS batch facility.

Last Revision
     May 30, 1991

Documentation
     To view this document at your terminal, use the interactive SDSC utility doc:

          doc view ezbatch

     For a list of other doc options, including printing your documents, enter doc and respond to the prompts, or see the doc man page.

Consulting
     For questions about or problems with any SDSC hardware, software, or facilities, please call the SDSC consultants at (619)534-5100 between 0800 and 1700 Pacific time. To send your questions online, enter the following and respond to the prompts:

          mailx consult

     or use your local mail utility to send your question via Internet mail to the following Internet address:

          consult@y1.sdsc.edu

(c) 1991 General Atomics.
General Atomics gives authorized users of the San Diego Supercomputer Center (SDSC) permission to make copies of this document. Authorized users include academic, industrial, and government researchers with SDSC accounts as well as officials of the National Science Foundation and the University of California. This material may not be used for commercial purposes. Permission for any other use of this material and by any other party must be obtained from General Atomics.

Table of Contents

Documentation Conventions
Introduction
     NQS Requests
     NQS Output
NQS Queues
     Batch Queues
          Standard Queues
          Queues for Large Disk Requirements
          Test Queue
          Queues for High or Low Priority
          Table of Batch Queues
     Pipe Queues
          Table of Pipe Queues
Choosing a Queue
     Choosing a Priority
     Determining Your Job's Memory Requirements
     Determining Your Job's Local Disk Requirements
NQS Commands
     The qsub Command
          Submitting Scripts with Command Options
          Useful qsub Options
          Example qsub Command Line
          Specifying qsub Options in the Shell Script
          Submitting Shells Interactively
          Message after Successful Submission
          Submission Example
          Useful Shell Flags
     The qsmart Utility
          The qsmart Command Line
          Example qsmart Command Line with Options
          Interactive qsmart Example
     The qstat Command
          The qstat Command Line
          Default qstat Display
          Using qstat to Examine Your Jobs
          Using qstat to Examine the Queue Complexes
     The qdel Command
     The qlimit Command
     The qmsg Command
     The qrank Command
          Ranking by Time Submitted and Priority
          The qrank Command Line
          Default qrank Display
          Displaying a Single Request
          Displaying Queues and Primary Complexes
Revision History

INTRODUCTION

You can run jobs under UNICOS on the Cray Y-MP in three different ways: interactively in
the foreground, interactively in the background, and in batch. The Network Queueing System (NQS) is the UNICOS batch facility, which will help you make the best use of SDSC system resources.

By submitting your jobs to the batch queue, you allow NQS to schedule your job according to the resources requested and to run it when those resources are available. By redistributing the load on the system over a 24-hour period, this scheduling of jobs balances the load during the day and prevents the machine from idling late at night when the number of interactive jobs reaches a minimum.

NQS also lets you

o  Stretch your allocation. When you run jobs interactively in the foreground or background, you are charged two times the amount of CPU time you use. By running in batch, you can reduce the amount you are charged for each job.

o  Checkpoint your program. Jobs run in NQS are automatically checkpointed. After a system shutdown (or crash), checkpointed jobs continue to run from the last checkpoint rather than from the beginning, which can save you from excessive charges and time delays caused by rerunning your entire job.

o  Run jobs that are too large or too small to be run interactively. Interactive jobs are limited to 6 Mwords of memory, 20 CPU minutes, and 60 Mwords of disk space. By using NQS, you can run jobs that require up to 6000 CPU minutes, 32 Mwords of memory, and 1000 Mwords of disk space.

o  Continue running your jobs after you log out. Interactive jobs, including those run in the background, terminate when you log out (unless you specify nohup on the command line).

Thus endeth the summary of responses (thanks to all who responded, even if none of your message ended up in this one).

B Cing U

Buck
--
Loren Buchanan (buck@caligula.nrl.navy.mil) | #include
NRL Code 5842, 4555 Overlook Ave.           | #include
Washington, DC 20375 (202) 767-3884         | #include
Phone tag, America's fastest growing business sport.
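
The interactive and batch limits quoted in the EZBATCH excerpt can be checked mechanically before deciding how to run a job. What follows is only a minimal sketch in Python, not anything from NQS or SDSC: the names Limits, fits, where_can_it_run and interactive_charge are made up for illustration, and the numbers are simply the ones quoted in the excerpt (6 Mwords, 20 CPU minutes and 60 Mwords of disk interactively; 32 Mwords, 6000 CPU minutes and 1000 Mwords of disk in batch; interactive work charged at two times the CPU time used). The excerpt does not give the batch charge rate, so the sketch does not model it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Limits:
    """Resource figures in the units used by the excerpt (hypothetical type)."""
    memory_mwords: float
    cpu_minutes: float
    disk_mwords: float


# Figures as quoted in the EZBATCH excerpt; real site limits may differ.
INTERACTIVE = Limits(memory_mwords=6, cpu_minutes=20, disk_mwords=60)
BATCH = Limits(memory_mwords=32, cpu_minutes=6000, disk_mwords=1000)

# The excerpt says interactive jobs are charged two times the CPU time used.
INTERACTIVE_CHARGE_FACTOR = 2


def fits(job: Limits, limits: Limits) -> bool:
    """Return True if every requirement of the job is within the limits."""
    return (job.memory_mwords <= limits.memory_mwords
            and job.cpu_minutes <= limits.cpu_minutes
            and job.disk_mwords <= limits.disk_mwords)


def where_can_it_run(job: Limits) -> list:
    """List the modes ("interactive", "batch") whose limits the job satisfies."""
    modes = []
    if fits(job, INTERACTIVE):
        modes.append("interactive")
    if fits(job, BATCH):
        modes.append("batch")
    return modes


def interactive_charge(cpu_minutes: float) -> float:
    """CPU minutes charged for an interactive run, per the quoted 2x rule."""
    return INTERACTIVE_CHARGE_FACTOR * cpu_minutes


if __name__ == "__main__":
    job = Limits(memory_mwords=16, cpu_minutes=300, disk_mwords=200)
    print(where_can_it_run(job))
```

A job needing 16 Mwords of memory exceeds the quoted interactive limit on memory alone, so only the batch limits admit it; a job asking for more than 32 Mwords fits neither set of limits.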
--
comp.archives file verification
pemrac.space.swri.edu
-rw-r--r--  1 root     ops       1051724 Jan 22  1990 /public/convexug/nqs.tar.Z
found nqs ok
pemrac.space.swri.edu:/public/convexug/nqs.tar.Z