Xref: utzoo comp.unix.wizards:24904 comp.unix.programmer:1583 comp.mail.sendmail:3041 comp.unix.aix:4531
Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!uunet!airgun!markw
From: markw@airgun.wg.waii.com (Mark Whetzel)
Newsgroups: comp.unix.wizards,comp.unix.programmer,comp.mail.sendmail,comp.unix.aix
Subject: domain query subroutine res_search
Keywords: sendmail domain res_search
Message-ID: <934@airgun.wg.waii.com>
Date: 15 Apr 91 19:51:12 GMT
Organization: Western Geophysical, Houston
Lines: 86

I am working with another programmer on porting the IDA sendmail to
the IBM RT running AIX 2.2.1.  (Yes, it's yucky IBM, but it works and
it's paid for :-)

So far so good on making it work, but we may have found a bug in AIX
at the latest maint level: code that works on one RT (2705 level)
won't work on another RT (1773 level) at a higher maint level.

The problem area of the code deals with domain server queries in
domain.c; in particular, it uses the res_search system subroutine.
I can't find any documentation for this routine.  I have checked
AIX V3 (RS6000), SUNOS (4.0.1) and CONVEXOS, and none of these systems
document it either.  I can find res_init, res_mkquery and res_send,
but not res_search.

What is happening is that res_search, when called to look for MX
records, returns -1 with h_errno set to TRY_AGAIN (value 2) rather
than NO_DATA (value 4).  The NO_DATA value indicates that the host
name is valid, but no records of the requested type could be found.
The nameserver is reachable, and a piece of test code that queries
with type = T_A works properly and returns a valid answer, but
queries of type T_MX fail with this TRY_AGAIN error.  This causes
sendmail to defer the mail while it waits for a positive response
from the nameserver.  We currently do not have many MX records in our
nameserver, and the system name being queried does not have any MX
records on file.
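
To make the difference between the two h_errno values concrete, here
is a minimal sketch of how a mailer might act on each code.  It is not
taken from sendmail: the function classify_mx_failure and the enum
mx_action are hypothetical names made up for illustration, and the
fallback numeric values are assumptions based on the BSD <netdb.h>
values (TRY_AGAIN = 2 and NO_DATA = 4 are the values reported in this
post).  The branching follows the same logic as the IDA fragment
quoted in this post.

```c
#include <errno.h>
#include <netdb.h>

/* Fallbacks in case <netdb.h> does not expose the resolver codes.
 * ASSUMPTION: these are the traditional BSD values. */
#ifndef HOST_NOT_FOUND
#define HOST_NOT_FOUND 1
#endif
#ifndef TRY_AGAIN
#define TRY_AGAIN 2
#endif
#ifndef NO_RECOVERY
#define NO_RECOVERY 3
#endif
#ifndef NO_DATA
#define NO_DATA 4
#endif

/* Hypothetical outcome codes for a failed MX lookup. */
enum mx_action {
    MX_USE_HOST_ITSELF, /* no MX data: deliver to the host directly */
    MX_NO_SUCH_HOST,    /* the name does not exist: bounce the mail */
    MX_RETRY_LATER,     /* temporary failure: queue the mail */
    MX_UNKNOWN          /* h_errno value not handled here */
};

/*
 * Decide what to do after a resolver call has returned -1.
 *   herr            - the h_errno value left by the resolver
 *   err             - the errno value left by the resolver
 *   use_name_server - nonzero if a nameserver is required to be up
 */
enum mx_action classify_mx_failure(int herr, int err, int use_name_server)
{
    switch (herr) {
    case NO_DATA:
    case NO_RECOVERY:
        /* the name is known, but has no MX records */
        return MX_USE_HOST_ITSELF;
    case HOST_NOT_FOUND:
        return MX_NO_SUCH_HOST;
    case TRY_AGAIN:
        /* no nameserver running, and none is required */
        if (!use_name_server && err == ECONNREFUSED)
            return MX_USE_HOST_ITSELF;
        return MX_RETRY_LATER;
    default:
        return MX_UNKNOWN;
    }
}
```

With this mapping, classify_mx_failure(NO_DATA, 0, 1) gives
MX_USE_HOST_ITSELF, while classify_mx_failure(TRY_AGAIN, 0, 1) gives
MX_RETRY_LATER.  That is why a resolver that answers TRY_AGAIN for a
host with no MX records leaves the mail sitting in the queue.
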
Here is the code fragment from domain.c in the IDA sendmail:

[some code deleted]

    typedef union {
            HEADER qb1;
            char qb2[PACKETSZ];
    } querybuf;

    extern int h_errno;
    querybuf answer;

[some code deleted]

    errno = 0;
    n = res_search(host, C_IN, T_MX, (char *)&answer, sizeof(answer));
    if (n < 0)
    {
            if (tTd(8, 1))
                    printf("getmxrr: res_search failed (errno=%d, h_errno=%d)\n",
                        errno, h_errno);
            switch (h_errno)
            {
    # ifndef NO_DATA
    # define NO_DATA NO_ADDRESS
    # endif /* NO_DATA */
              case NO_DATA:
              case NO_RECOVERY:
                    /* no MX data on this host */
                    goto punt;

              case HOST_NOT_FOUND:
                    /* the host just doesn't exist */
                    *rcode = EX_NOHOST;
                    break;

              case TRY_AGAIN:
                    /* couldn't connect to the name server */
                    if (!UseNameServer && errno == ECONNREFUSED)
                            goto punt;

                    /* it might come up later; better queue it up */
                    *rcode = EX_TEMPFAIL;
                    break;
            }

Any pointers on what may be wrong?  Where is this routine discussed,
or is the documentation on all these systems simply lacking?

I am going to report this to IBM, but with an undocumented routine it
may be tricky.  As I indicated, on a different system all works OK.

PS.  I have checked that /etc/resolv.conf has the proper contents; it
is identical to the file on other systems at our site, and other
hostname lookups work correctly (telnet, rlogin, host, etc.).
I have also tested this on the RS6000 and get h_errno=4 there, just
like on the 2705 level RT.

Thanks for any light you can shed on this funny routine and its
origins.

Markw
-- 
Mark Whetzel                  My comments are my own, not my company's.
Western Geophysical - A division of Western Atlas International,
A Litton/Dresser Company
DOMAIN addr: markw@airgun.wg.waii.com
UUNET address: uunet!airgun!markw