Path: utzoo!utgpu!jarvis.csri.toronto.edu!rutgers!apple!usc!ucla-cs!uci-ics!zardoz!tgate!ka3ovk!drilex!axiom!linus!alliant!lewitt
From: lewitt@Alliant.COM (Martin E. Lewitt)
Newsgroups: comp.arch
Subject: Re: Double Width Integer Multiplication and Division
Summary: VLIW assembly can be easy
Message-ID: <3243@alliant.Alliant.COM>
Date: 4 Jul 89 09:33:25 GMT
References: <1035@aber-cs.UUCP> <1370@l.cc.purdue.edu> <2274@wyse.wyse.com>
Reply-To: lewitt@Alliant.COM (Martin E. Lewitt)
Organization: Alliant Computer Systems, Littleton, MA
Lines: 36

In article <2274@wyse.wyse.com> stevew@wyse.UUCP (Steve Wilson xttemp dept303) writes:
--- much deleted ---
>Seriously, there are machines that were NEVER meant to be programmed in
>assembly! Try programming a VLIW machine in assembly. Chances are that unless
>your the guy that architected it, or wrote the compilers for it, you can't.
>(I've worked on such a box, at best it certainly ain't easy ;-)
--- some deleted ---
>Steve Wilson

I'm not sure what machines out there are giving you the impression
that VLIW assembly is difficult, but that wasn't my experience at all.
I found programming the FPS AP-120B array processors and their 64-bit
mini-super follow-on products simple and elegant, especially compared
to CISCs such as the 68000 and the VAX. I don't think I'd want to try
the 8086.

With the FPS products, I seldom had to consult an architecture or
assembly manual. I just kept a diagram of the architecture in front of
me and pictured the data being latched onto this bus or into that
register or functional unit. There were a few simple rules to follow,
and most of the restrictions made sense, like don't latch two things
onto the same bus in the same instruction. At the end you counted your
instructions and you knew how fast you were running.

On the CISCs, there are multi-cycle instructions and addressing modes,
instruction alignment and resource dependency stalls, etc. At the end
of a task, I still find myself wondering if I missed some instruction
or addressing mode which might have been faster.

Maybe some VLIWs out there are more difficult because they are pushing
the technology harder, trying to encode more in an instruction word or
something. They might sacrifice some of the generality that their bus
structure diagram would lead you to believe was there. I'm curious
about these experiences with other VLIW architectures.
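
To make the two rules of thumb above concrete (one driver per bus per instruction, and run time read straight off the instruction count), here is a minimal sketch in Python. It is not a model of the AP-120B: the bus and unit names, the representation of an instruction word as a list of (source, bus) pairs, and the cycle time are all hypothetical, and it assumes straight-line code with one instruction per cycle and no stalls.

```python
from collections import Counter

# Hypothetical model: one instruction word is a list of (source, bus) pairs,
# each meaning "latch this source onto this bus during this instruction".


def bus_conflicts(instruction):
    """Return the buses driven by more than one source in one instruction."""
    counts = Counter(bus for _source, bus in instruction)
    return sorted(bus for bus, n in counts.items() if n > 1)


def check_program(program):
    """Map the index of each offending instruction to its conflicting buses."""
    problems = {}
    for index, instruction in enumerate(program):
        conflicts = bus_conflicts(instruction)
        if conflicts:
            problems[index] = conflicts
    return problems


def run_time_ns(program, cycle_ns):
    """Assumes one instruction per cycle and no stalls (straight-line code)."""
    return len(program) * cycle_ns


# Illustrative three-instruction program; all names are made up.
program = [
    [("mem", "bus_a"), ("adder", "bus_b")],
    [("multiplier", "bus_a"), ("mem", "bus_a")],  # two drivers on bus_a
    [("adder", "bus_a")],
]

print(check_program(program))     # which instructions break the bus rule
print(run_time_ns(program, 100))  # cycle_ns=100 is an arbitrary placeholder
```

The contrast with the CISC case is that no comparable one-line timing function exists there: instruction timings vary with addressing mode, alignment, and stalls, so counting instructions does not give the run time.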