Path: utzoo!attcan!uunet!mcsun!ukc!inmos!conor@lion.inmos.co.uk From: conor@lion.inmos.co.uk (Conor O'Neill) Newsgroups: comp.std.c Subject: Hash functions on pointers Message-ID: <13983@ganymede.inmos.co.uk> Date: 24 Jan 91 11:15:48 GMT Sender: news@inmos.co.uk Reply-To: conor@inmos.co.uk () Organization: INMOS Limited, Bristol, UK. Lines: 65 I am curious as to whether it is possible to write a _portable_ _ANSI C_ hash function from pointers to integers. In this sense, portable means that it should work and be vaguely efficient on different machines; I do not mean that a given pointer should map to the same integer on different machines (indeed, since pointer representations may be different, this would be impossible). What I require is a function which maps a pointer onto a number between 0 and N-1. The function may be (indeed, would be expected to be) many-to-one. Obviously the vacuous int hash (void *ptr) { return 0; } isn't really good enough, even though it could be considered a hash function. The problem is that the effect of casting a pointer to an integer is implementation defined; thus an implementation could, for example, always return 99. ANSI standard, section 3.3.4: ``A pointer may be converted to an integral type. The size of integer required and the result are implementation-defined. If the space provided is not long enough, the behaviour is undefined'' This seems to indicate that there must exist an integral type (which may be a different type for each implementation) to which a pointer may be converted. (This does not imply that information may not be lost). Is there any way of determining what this type is? Do the other parts of the standard indicate that long int would always be enough? Or maybe unsigned long int? Neither seems clear to me. Ideally there would have been defined something like ptrint_t, by analogy with size_t, which would be an integral type which is `big enough'. Presumably `integral type' means a standard type; is it permissible to require an extended type such as long long int if that were supported? The standard continues: ``An arbitrary integer may be converted to a pointer. The result is implementation-defined. (Footnote 42).'' This seems to be less relevant for the suggested hash function, except for the footnote (which does not constitute part of the standard): ``Footnote 42: The mapping functions for converting a pointer to an integer or an integer to a pointer are intended to be consistent with the addressing structure of the execution environment.'' This indicates that a `quality of implementation' factor is that there is some sensible relationship between the `value' of the pointer and the integral value; otherwise an implementation could satisfy the standard by always returning 99 when casting a pointer to an integer. Given all the above, is the following hash function legal and portable? (I.e. will it have `undefined' effect in any implementation?) (Assume that N is less than INT_MAX). int hash (void *ptr) { unsigned long int x = (unsigned long int)ptr; return x % N; } --- Conor O'Neill, Software Group, INMOS Ltd., UK. UK: conor@inmos.co.uk US: conor@inmos.com "It's state-of-the-art" "But it doesn't work!" "That is the state-of-the-art".