Path: utzoo!utgpu!watserv1!watmath!uunet!mcsun!ukc!mrccrc!dcurtis
From: dcurtis@crc.ac.uk (Dr. David Curtis)
Newsgroups: bionet.molbio.genome-program
Subject: Combining 2-point data
Summary: Long and rambling
Message-ID: <505@tin.crc.ac.uk>
Date: 26 Feb 91 11:18:46 GMT
Reply-To: dcurtis@crc.ac.uk (Dr. David Curtis)
Organization: MRC Human Genome Resource Centre
Lines: 66

The problem is simple: how much evidence can we adduce for the regional
localisation of e.g. a disease gene based on linkage data using a number
of other markers? This must be commonly addressed, but I don't seem to
be having much luck with the literature.

One approach: do a full multipoint analysis, e.g. with LINKMAP, to
provide a location score. With polymorphic markers and large families
this is limited to, say, three markers plus the disease gene. What can
I do with the data from the other nearby markers which I cannot include
in the analysis? Or I could do lots of multipoints using different
subsets of markers, but each time I am only using some of the available
information, and anyway which of the multipoints should I "trust" to be
giving the "correct" answer?

Other approaches would in general look at some way to combine all the
two-point data, but which is the best way and what is the strength of
the evidence thus obtained? I looked at the paper by Olson and Boehnke
(Am J Hum Genet 47:470-482, 1990) comparing different algorithms to
order the markers, but this did not really tell me much about the
degree of evidence that the disease gene was or was not linked to a
group of markers known to be linked to each other.

Morton and Andrews, in their paper on MAP (Ann Hum Genet 53:263-269,
1989), describe how they order loci and then say that "Global support
for a locus expresses the evidence on chromosome assignment as sigma
Z(thetaE) where thetaE is the recombination rate expected between the
locus and another marker on the chromosome, and the summation is over
all syntenic markers.
Although the lods from the same data set are dependent, this has
remarkably little effect on significance levels."

This sounds like exactly what I need, except I cannot believe that it
is correct. If I understand them correctly, they are saying that,
having found the best position for the markers and disease locus, I can
calculate global support for the disease locus being linked to the
other markers by summing the lod scores of the disease locus with each
of the other markers, each taken at the distance which separates them
on the new map. It seems to me that this could easily give a large
overestimate of the global support, and I think their second sentence
is incorrect: I think that the fact that the data are dependent could
have a large effect on significance levels. I believe Edwards mentioned
this point in a letter to Nature in 1989 (although as he was writing in
the context of a multipoint analysis I did not agree with him
entirely).

If we had a small family and a highly informative (say, completely
informative) marker which gave a small positive lod score with a
disease at a certain distance, then we would expect that another
extremely informative marker studied in the same family and tightly
linked to the first (say, another polymorphism of the first marker)
would give the same positive lod score at the same distance. So we know
that studying the new polymorphism would (if I have understood Morton's
approach correctly) double the lod score. But clearly we have not
doubled our evidence in favour of linkage, which still stands where it
was before. Intuitively, the more closely linked and the more
informative two markers are, the less independent information the
second gives over and above the information from the first.

What are people's views on this subject? As I say, it must be a common
problem. How can we best utilise data from all markers to judge the
extent of evidence in favour of a disease gene being located
approximately in a particular region?
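To put numbers on the double-counting worry, here is a minimal sketch
in Python. It rests on assumptions that are mine and not taken from
either paper: every meiosis is phase-known and fully informative, the
second marker sits at zero recombination from the first, and the counts
(6 meioses, 1 recombinant) and the function names are purely
illustrative. It compares the summed two-point lods with the lod from
the joint likelihood of the same data.

```python
import math


def two_point_lod(n_meioses, n_recombinants, theta):
    """Lod score for phase-known, fully informative meioses.

    Simplifying assumption: each meiosis is scored directly as
    recombinant or non-recombinant, so the likelihood is binomial.
    """
    if not 0.0 < theta <= 0.5:
        raise ValueError("theta must be in (0, 0.5]")
    k, n = n_recombinants, n_meioses
    log_l_theta = k * math.log10(theta) + (n - k) * math.log10(1.0 - theta)
    log_l_null = n * math.log10(0.5)
    return log_l_theta - log_l_null


def joint_lod_with_duplicate_marker(n_meioses, n_recombinants, theta):
    """Lod from the joint likelihood of disease, marker 1 and marker 2.

    Assumption: marker 2 never recombines with marker 1, so every
    meiosis contributes a factor of 1 for the marker-marker interval
    under both the linked and the unlinked hypothesis.
    """
    marker_marker_term = n_meioses * math.log10(1.0)  # always zero
    k, n = n_recombinants, n_meioses
    log_l_theta = (k * math.log10(theta)
                   + (n - k) * math.log10(1.0 - theta)
                   + marker_marker_term)
    log_l_null = n * math.log10(0.5) + marker_marker_term
    return log_l_theta - log_l_null


# Hypothetical small family: 6 informative meioses, 1 recombinant.
n, k, theta = 6, 1, 0.1

single = two_point_lod(n, k, theta)
# Marker 2 shows the same recombinants, so its two-point lod is identical.
summed = single + two_point_lod(n, k, theta)
joint = joint_lod_with_duplicate_marker(n, k, theta)

print(f"single two-point lod : {single:.3f}")
print(f"sum of two-point lods: {summed:.3f}")
print(f"joint-likelihood lod : {joint:.3f}")
```

Under these assumptions the summed lod is exactly twice the single lod,
while the joint-likelihood lod is unchanged by the second marker. Real
markers are less than fully informative and not at zero distance, so
the overcount would be smaller than a factor of two, but this sketch
does not say by how much.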
Dave Curtis

Academic Department of Psychiatry,    Janet:       dc@UK.AC.UCL.SM.PSYCH
Middlesex Hospital,                   Elsewhere:   dc@PSYCH.SM.UCL.AC.UK
Mortimer Street, London W1N 8AA.      EARN/Bitnet: dc%PSYCH.SM.UCL@UKACRL
Tel 071-636 8333 Fax 071-323 1459     Usenet:      ...!mcsun!ukc!mrccrc!D.Curtis