Path: utzoo!news-server.csri.toronto.edu!cs.utexas.edu!sun-barr!newstop!texsun!convex!news
From: patrick@convex.COM (Patrick F. McGehearty)
Newsgroups: comp.lang.fortran
Subject: Re: Distributed processing in FORTRAN
Message-ID: <1991Mar08.143537.28618@convex.com>
Date: 8 Mar 91 14:35:37 GMT
References: <MCCALPIN.91Mar7205707@pereland.cms.udel.edu>
Sender: news@convex.com (news access account)
Reply-To: patrick@convex.COM (Patrick F. McGehearty)
Organization: Convex Computer Corporation, Richardson, Tx.
Lines: 54
Nntp-Posting-Host: mozart.convex.com

In article <MCCALPIN.91Mar7205707@pereland.cms.udel.edu> mccalpin@perelandra.cms.udel.edu (John D. McCalpin) writes:
Some info on a generalized method of using Fortran File I/O
to share data in an application.

>(1) The application is an explicitly integrated, three-dimensional
>geophysical fluid dynamical model using multiple domains each being
>integrated by an independent process.  On a single cpu, these
>processes communicate via files, using a code section that looks
>like:
>	DO TIME=1,INFINITY
>	    Integrate the equations one time step
>
>	    OPEN(out_unit,file=out_file,form='unformatted')
>	    REWIND(out_unit)		! don't count on this by default!
>	    WRITE(out_unit) boundary data for other processes
>	    CLOSE(out_unit)		! to force flushing the buffers
>
>	    OPEN(in_unit,file=in_file,form='unformatted')
>	    REWIND(in_unit)
> 8888	    READ(in_unit,ERR=8888,END=8888) boundary data from other processes
>	    CLOSE(in_unit)
>
>	    Swap time levels
>	END DO

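For reference, the quoted exchange step could be written as a pair of
self-contained routines roughly like the sketch below.  This is only an
illustration, not the original model code: the module and routine names
(boundary_exchange, put_boundary, get_boundary) and the max_tries limit
are hypothetical, and it assumes a compiler that supports NEWUNIT= and
IOSTAT= rather than the ERR=/END= branch in the quoted loop.

```fortran
module boundary_exchange
  implicit none
  integer, parameter :: dp = kind(1.0d0)
contains

  ! Write the boundary values to fname, replacing any previous contents.
  subroutine put_boundary(fname, values)
    character(len=*), intent(in) :: fname
    real(dp), intent(in) :: values(:)
    integer :: u
    open(newunit=u, file=fname, form='unformatted', status='replace')
    write(u) values
    close(u)                 ! close so the data is flushed to the file
  end subroutine put_boundary

  ! Poll fname until a complete record can be read or max_tries is reached.
  ! The file is reopened on every attempt instead of re-reading the same
  ! unit (an assumption of this sketch, not the quoted code's behaviour).
  subroutine get_boundary(fname, values, max_tries, ok)
    character(len=*), intent(in) :: fname
    real(dp), intent(out) :: values(:)
    integer, intent(in) :: max_tries
    logical, intent(out) :: ok
    integer :: u, ios, attempt

    ok = .false.
    do attempt = 1, max_tries
      open(newunit=u, file=fname, form='unformatted', status='old', &
           action='read', iostat=ios)
      if (ios /= 0) cycle    ! file not there yet: try again
      read(u, iostat=ios) values
      close(u)
      if (ios == 0) then
        ok = .true.
        return
      end if
    end do
  end subroutine get_boundary

end module boundary_exchange
```

Note that this sketch does not detect stale data: if the file from the
previous time step is still in place, get_boundary will read it without
complaint.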

My questions and comments are:
(1) How much data is being written (kilobytes or megabytes)?
	If kilobytes, then the data should be able to stay in the
	buffer cache on most Unix file systems.
(2) How long does the "integrate the equations" time step take?
	The answers to (1) and (2) together largely determine whether
	overhead will dominate computation.  If the WRITE/READ can
	be done in a small fraction ( < 0.01 ) of the integration
	time, then there is a reasonable opportunity for moderate
	degrees of parallelism.
(3) A properly implemented parallel Unix OS will implicitly provide
	the functionality of the "sginap(0)" syscall whenever the
	READ operation would block, so the explicit call should be
	unnecessary.

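The ratio in (2) can be measured directly.  The sketch below is one way
to do it, under stated assumptions: the names exchange_overhead,
step_efficiency and time_exchange are hypothetical, the efficiency model
assumes the file exchange is not overlapped with computation, and the
timing uses the SYSTEM_CLOCK intrinsic with a scratch file that is
deleted afterwards.

```fortran
module exchange_overhead
  use, intrinsic :: iso_fortran_env, only: int64
  implicit none
  integer, parameter :: dp = kind(1.0d0)
contains

  ! Fraction of wall-clock time spent computing when each step costs
  ! t_compute seconds of integration plus t_io seconds of file exchange.
  ! Assumes the exchange is not overlapped with computation.
  pure function step_efficiency(t_compute, t_io) result(eff)
    real(dp), intent(in) :: t_compute, t_io
    real(dp) :: eff
    eff = t_compute / (t_compute + t_io)
  end function step_efficiency

  ! Time one unformatted WRITE followed by a READ of n values through
  ! the file fname, using the OPEN/REWIND/CLOSE pattern of the quoted loop.
  function time_exchange(n, fname) result(seconds)
    integer, intent(in) :: n
    character(len=*), intent(in) :: fname
    real(dp) :: seconds
    real(dp), allocatable :: buf_out(:), buf_in(:)
    integer :: u
    integer(int64) :: c0, c1, rate

    allocate(buf_out(n), buf_in(n))
    buf_out = 1.0_dp
    call system_clock(c0, rate)

    open(newunit=u, file=fname, form='unformatted', status='replace')
    write(u) buf_out
    close(u)                       ! closing forces the buffers to be flushed

    open(newunit=u, file=fname, form='unformatted', status='old')
    rewind(u)
    read(u) buf_in
    close(u, status='delete')      ! remove the scratch file

    call system_clock(c1)
    seconds = real(c1 - c0, dp) / real(rate, dp)
  end function time_exchange

end module exchange_overhead
```

Calling time_exchange with the real boundary-array size gives t_io;
timing one integration step the same way gives t_compute.  At the
threshold in (2), t_io = 0.01 * t_compute, step_efficiency returns
about 0.99.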

>This modified version of the code runs with an efficiency of in excess
>of 98% for two processes on the same cpu.  The next step is to find
>out if similar functions are available on other O/S's and to start
>running the code using NFS to share the files between machines.

NFS has higher overhead and lower throughput than I/O to a directly
mounted disk.  If the ratio computed in (2) is small enough, and each
time step takes seconds or longer, then a high degree of parallelism
is still available.

>--
>John D. McCalpin			mccalpin@perelandra.cms.udel.edu
>Assistant Professor			mccalpin@brahms.udel.edu
>College of Marine Studies, U. Del.	J.MCCALPIN/OMNET