Path: utzoo!mnetor!uunet!portal!cup.portal.com!Howard_Reed_Johnson
From: Howard_Reed_Johnson@cup.portal.com
Newsgroups: comp.sys.ibm.pc
Subject: Re: Passing pointers in C to 8086 assembler
Message-ID: <4889@cup.portal.com>
Date: 28 Apr 88 08:13:21 GMT
References: <7759@ihlpa.ATT.COM>
Organization: The Portal System (TM)
Lines: 195
XPortal-User-Id: 1.1001.3570

This is from ihnp4.uucp!ihlpa!jimx (Jim Harris):
> I'm getting myself confused, and need some pointers (pun intended)
> on passing pointers from a C function to an 8086 assembly
> language procedure.

This may be old hat to experienced 8086 / C programmers.

Judging from ihnp4.uucp!ihlpa!jimx's recent posting, he'd be content
to have his assembly procedure working under just the small memory
model.  I could talk about segment registers and other memory models,
but I'll leave that for another occasion.  However, one needs to
distinguish between the code segment and data segment found in a
typical small-model .EXE program.  Just remember to put data in the
data segment, not the code segment.  Typical Microsoft segment info
looks like this:

        _TEXT   segment byte public 'CODE'
                public  _foo
        _foo    proc    near
                ; ...
                ret
        _foo    endp
        _TEXT   ends

        _DATA   segment byte public 'DATA'
        local_msg db    80 dup (0)
        _DATA   ends

        DGROUP  group   _DATA
        CGROUP  group   _TEXT
                assume  cs:CGROUP, ds:DGROUP

Here's his ASM code, with modifications:

        _foo    proc    near
                push    bp
                mov     bp, sp
                push    di
                push    si

So far, so good.  Looks pretty standard.  Graphically, the stack
looks like this:

                +---------------------------------------+
        msg[80] | parameter data array                  |
                |                                       |
                |  . . .                                | =AAAA
                +---------------------------------------+

                +---------------------------------------+
        bp+4->  | data parameter: address (offset) =AAAA| =BBBB
                +---------------------------------------+
        bp+2->  | return address (offset)               |
                +---------------------------------------+
        bp  ->  | saved previous value of bp register   | =CCCC
                +---------------------------------------+
                | saved previous value of di register   |
                +---------------------------------------+
        sp  ->  | saved previous value of si register   |
                +---------------------------------------+

> Here's where I get confused.  My pointer (16-bit offset, I believe)
> is at [bp+4].  How do I address the second character of msg[]?

                mov     bx, [bp+4]      ; bx = the pointer (offset of msg)
                mov     al, [bx+1]      ; 2nd char at offset 1 in 0-origin array

> [bp+5] is one address above [bp+4], right?  Or is it the address
> pointed to by bp+5?

Jim, you're out in left field.  bp+5 is the address one byte above
bp+4, and [bp+5] is whatever is stored at that address.  Since the
2-byte pointer occupies bp+4 and bp+5, [bp+5] gets you the high byte
of the pointer itself, not a character of msg[].

> If I want msg[4], how do I address it?

                mov     bx, [bp+4]      ; bx = the pointer (offset of msg)
                mov     al, [bx+4]      ; offset 4 in 0-origin array

At this point, it is important to distinguish between pointers and
data.  A pointer contains an address, which in turn can be used to
reference data at various (different) locations.  (C calls this
"dereferencing" the pointer.)

C:
        char dat_var = 'a';
        char *ptr_var = &dat_var;
        register char al;

        al = *ptr_var;

ASM:
        dat_var db      'a'
        ptr_var dw      dat_var

                mov     bx, ptr_var
        ;       mov     bx, word ptr ptr_var    ; implied in previous line
                mov     al, [bx]                ; reference 'a' via a pointer
        ;       mov     al, byte ptr [bx]       ; equivalent to previous line

Conversely, to generate a pointer for later use, one needs to take
the address of a variable (the & operator in C, OFFSET in MASM):

C:
        ptr_var = &dat_var;
        al = *ptr_var;

ASM:
                mov     bx, offset dat_var
                mov     word ptr ptr_var, bx
                mov     al, [bx]
        ;       mov     al, dat_var     ; at this point, equivalent to prev. line

When an ordinary variable such as dat_var has its address stored into
a pointer variable, and that pointer is later used to read or write
the data, the ordinary variable is being affected through an "alias".
(Look for the word "alias" in your compiler's tutorial on its code
optimizations).
> ex: does [si+cx] = [bp+4+cx]??

Data can be referenced via addresses stored in a limited set of
registers, as well as through ordinary variable references.
Acceptable register combos are: [bx], [si], [di], [bp], [bx][si],
[bx][di], [bp][si], and [bp][di], each optionally with a constant
displacement.  Period.  Therefore, both [si+cx] and [bp+4+cx] are
illegal: cx cannot appear in an address.  Even if they were legal,
the two would usually reference different memory locations.  The
first would index from the address held in si; the second would index
into the stack frame itself.

Going back to your code:

                mov     si, [bp+4]
        ; oops  mov     di, local_msg
                mov     di, offset local_msg    ; need the address, not the contents
        ; oops  mov     cx, 4
                mov     bx, 4                   ; let's try a different register
        ; oops  mov     al, [si+cx]
        ; oops  mov     [di+cx], al
                mov     al, [si+bx]
                mov     [di+bx], al
                ...

> or graphically:
>           ____                 ____
> bp+4   -->|____|-->[bp+4]---->|____|
>           |____|              |____|
>           |____|              |____|
>           |____|              |____|
> bp+4+4 -->|____|-->[bp+8]-->? |____|<--[bp+4][4] == [bp+4+4]???

"Near" pointers occupy 2 bytes, so there could be no more than 3
pointers in the stack locations from bp+4 through bp+8: at bp+4,
bp+6, and bp+8.  It would be a faux pas (bug) to reference pointers
at bp+5 or bp+7.

> I think we can agree that obviously [bp+8] is not the same thing
> as [bp+4]+4.  But the book I have seems to imply that [bp+4][4] is
> the same as [bp+4+4], which I have trouble accepting.  So what is
> the solution?

Microsoft got weird on us when they put together the syntax for
ASM/MASM.  Any time you see brackets [] in an operand, there is only
ONE level of indirection: everything inside and around the brackets
is summed into a single address, and memory is read once at that
address.  [bx][si] does NOT mean we're doing 2-dimensioned arrays.
Try this series of canonicalizations for clarification:

        [bp+8]                  <==>    [ bp +8 ]
        [ bp +8 ]               <==>    ptr bp +8

        [bp+4]+4                <==>    [ bp +4 ] +4
        [ bp +4 ] +4            <==>    ptr bp +4 +4
        ptr bp +4 +4            <==>    ptr bp +8

        [bp+4][4]               <==>    [ bp +4 ] [ +4 ]
        [ bp +4 ] [ +4 ]        <==>    ptr bp +4 +4
        ptr bp +4 +4            <==>    ptr bp +8

        [bp+4+4]                <==>    [ bp +4 +4 ]
        [ bp +4 +4 ]            <==>    ptr bp +4 +4
        ptr bp +4 +4            <==>    ptr bp +8

        -6[bx][si]              <==>    -6 [ +bx ] [ +si ]
        -6 [ +bx ] [ +si ]      <==>    ptr -6 +bx +si

> If you have a good reference for passing pointers to assembly language,
> please let me know.  I have "Supercharging C with Assembly Language",
> by Chesley and Waite.  I haven't found where (if) they deal with this,
> so even a page reference would help.  The more I look at this, the
> more confusing it seems to get!

I'd consider this article a good start.  Another way to study this is
to generate an assembly listing from your compiler.  Microsoft C can
do this via "msc /Fc foo.c" or "cl /Fc /c foo.c".

I've found the following book to be useful for both novices and
experts (who are somewhat new to DOS programming):

        Campbell, Joe
        Crafting C Tools for the IBM PCs
        Prentice-Hall, Englewood Cliffs, NJ 07632
        QA76.8.I2594C36 1986
        ISBN 0-13-188418-2

If you thought this was bad, wait 'til you deal with segmented
addressing headaches.  Better yet, try writing useful OS/2 device
drivers with the contortions of its pain-in-the-neck "protected
mode"-without-context-info!