Path: utzoo!utgpu!news-server.csri.toronto.edu!rpi!think.com!sdd.hp.com!wuarchive!uunet!mcsun!hp4nl!star.cs.vu.nl!kjb
From: kjb@cs.vu.nl (Kees J. Bot)
Newsgroups: comp.os.minix
Subject: Re: [source] #! in MM -- take 2
Message-ID: <10033@star.cs.vu.nl>
Date: 23 May 91 09:25:46 GMT
References:
Sender: news@cs.vu.nl
Lines: 74

I'm posting my comments on Klamer's second try at a #! implementation
in MM to remind you about my own implementation of #!, which I posted
on May 13.  So far, the only comments I have received on my version
came from Klamer, telling me that it is slower than his because it
makes two more calls to FS.  Apart from being a little slower, using my
version is still the easiest way to fix the bugs in Klamer's version.

klamer@mi.eltn.utwente.nl (Klamer Schutte) writes:
>Here is the second version of my #!interpreter patch for mm/exec.c.
>This version has all known bugs fixed.

Except for not doing setuid and this other "feature".

>One feature (bug ???) remains: i keep alignment from the data argv[] and
>envp[] point to intact. There (migh ???) be a tradition of having this
>data in the form of strings with only 1 \0 in between.
>Where is the manual page for execve(2) ???? Or does POSIX(*) say anything
>about this?

I know of three places to look for the proper format of the initial
stack:

  - The old V7 manuals under exec(2), written when users were not
    considered too stupid to know such things.

  - The source code of execve(2).

  - The source code of ps(1).

The ps(1) source contains this interesting comment:

/*
 * Get_args inspects /dev/mem, using bufp, and tries to locate the initial
 * stack frame pointer, i.e. the place where the stack started at exec time.
 * It is assumed that the end of the stack frame looks as follows:
 *	argc	<-- initial stack frame starts here
 *	argv[0]
 *	...
 *	NULL	(*)
 *	envp[0]
 *	...
 *	NULL	(**)
 *	argv[0][0] ... '\0'
 *	...
 *	argv[argc - 1][0] ... '\0'
 *	envp[0][0] ... '\0'
 *	...
 *	[trailing '\0']
 * Where the total space occupied by this original stack frame <= ARG_MAX.
 * Get_args reads in the last ARG_MAX bytes of the process' data, and
 * searches back for two NULL ptrs (hopefully the (*) & (**) above).
 * If it finds such a portion, it continues backwards, counting ptrs until:
 * a) either a word is found that has as its value the count (supposedly argc),
 * b) another NULL word is found, in which case the algorithm is reiterated, or
 * c) we wind up before the start of the buffer and fail.
 * Upon success, get_args returns a pointer to the concatenated arg list.
 * Warning: this routine is inherently unreliable and probably doesn't work if
 * ptrs and ints have different sizes.
 */

I decided to go over Klamer's patch with a fine-tooth comb this time.
(I wish someone would do that with my patch, with a mental -pedantic
flag on.)

  - ALIGN aligns to a multiple of 2, execve to a multiple of
    sizeof(char *).

  - The interpreter is found relative to '/'.  (Move the first
    tell_fs(CHDIR, ...) inside the do loop.)

  - Setuid bits on the script are still ignored.  (Wouldn't it be nice
    to allow people to explore the security risks of a setuid script?)

  - Change 'know' to 'now' in patch_stack.  (-pedantic)

  - The old argv[0][] is not removed from the initial stack.

  - The ALIGN(len) is still at the wrong place.  Try moving only the
    strings by disp bytes, then move the pointers by
    argc*sizeof(char *) bytes.  Do an ALIGN(disp) just before the
    return.

  - If stk_bytes is close to ARG_MAX then the last few environment
    variables may be truncated.

  - Read_header returns 0 when there is nothing behind #!.

  - The size_ok function may return something other than -100.
--
Kees J. Bot  (kjb@cs.vu.nl)
Systems Programmer, Vrije Universiteit Amsterdam
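
To make the stack layout from the ps(1) comment concrete, here is a
minimal sketch in C that builds such a frame in a buffer and rounds its
size up to a multiple of the pointer size, as the alignment remark
suggests.  It is not taken from MM, execve(2) or ps(1): the names
word_t, WORD_ALIGN, put_word, get_word and build_frame are made up for
illustration, and storing the pointers as byte offsets from the start
of the frame (to be relocated later) is an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical pointer-sized stack word; assumed as wide as a char *. */
typedef uintptr_t word_t;

/* Round n up to a multiple of the word size. */
#define WORD_ALIGN(n) \
    (((n) + sizeof(word_t) - 1) / sizeof(word_t) * sizeof(word_t))

/* Store one word in the given slot of the frame. */
void put_word(char *buf, size_t slot, word_t value)
{
    memcpy(buf + slot * sizeof(word_t), &value, sizeof value);
}

/* Fetch one word from the given slot of the frame. */
word_t get_word(const char *buf, size_t slot)
{
    word_t value;

    memcpy(&value, buf + slot * sizeof(word_t), sizeof value);
    return value;
}

/*
 * Build a frame laid out as in the ps(1) comment:
 *   argc, argv[0..argc-1], NULL, envp[0..], NULL, strings, padding.
 * Pointers are stored as byte offsets from the start of the frame
 * (assumption of this sketch).  Returns the frame size, which is a
 * multiple of sizeof(word_t), or 0 if the frame does not fit in buf.
 */
size_t build_frame(char *buf, size_t bufsize, int argc,
                   char *const argv[], char *const envp[])
{
    size_t envc = 0, nwords, strbytes = 0, total, off, len, i, w = 0;

    assert(argc >= 0);
    while (envp[envc] != NULL)
        envc++;

    /* argc + argv pointers + NULL + envp pointers + NULL */
    nwords = 1 + (size_t)argc + 1 + envc + 1;
    for (i = 0; i < (size_t)argc; i++)
        strbytes += strlen(argv[i]) + 1;
    for (i = 0; i < envc; i++)
        strbytes += strlen(envp[i]) + 1;

    total = WORD_ALIGN(nwords * sizeof(word_t) + strbytes);
    if (total > bufsize)
        return 0;
    memset(buf, 0, total);              /* also gives the trailing '\0's */

    off = nwords * sizeof(word_t);      /* strings follow the pointers */
    put_word(buf, w++, (word_t)argc);
    for (i = 0; i < (size_t)argc; i++) {
        len = strlen(argv[i]) + 1;
        put_word(buf, w++, (word_t)off);
        memcpy(buf + off, argv[i], len);
        off += len;
    }
    put_word(buf, w++, 0);              /* NULL (*) */
    for (i = 0; i < envc; i++) {
        len = strlen(envp[i]) + 1;
        put_word(buf, w++, (word_t)off);
        memcpy(buf + off, envp[i], len);
        off += len;
    }
    put_word(buf, w++, 0);              /* NULL (**) */
    return total;
}
```

Because only the final size is rounded up, the strings stay packed with
a single '\0' between them, and offset 0 can stand in for NULL since
that slot always holds argc.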