Practical: Working environment
------------------------------

1. Your working directory:
     cd ~/CGV/{nr}                  with {nr} = number of your group (01..19)

2. Choose your task:  {task}  (out of 01 02 04)

3. Fetch your skeleton:
     cp ~/CGV/skel/cgv_{task}.F90 .

4. Add your code, compile, run and test it
   (correct result? same as the serial result?)

5. If your task works, extract your part
   (from /*== task_ii begin ==*/ to /*== task_ii end ==*/) into cgvp{task}.F90

6. When all groups have finished, everyone can check the total result with:
     ls -l ../*/cgvp*.F90
     cat ../00/cgvp00.F90 ../*/cgvp08.F90 ../*/cgvp01.F90 ../*/cgvp02.F90 \
         ../*/cgvp03.F90  ../*/cgvp04.F90 ../*/cgvp05.F90 ../*/cgvp06.F90 \
         ../*/cgvp07.F90  >  cgv_all.F90
   Caution:
     - duplicate parts must be selected by hand ({nr} instead of *)
     - missing parts may also be fetched from ../source/parts/cgvp{task}.F90

7. Compile and run cgv_all.F90
   - on the T3E:
       f90 -o cgv_all cgv_all.F90 -lm               (compile)
       fpart                                        (check whether there are free CPUs)
       mpirun -np 2 ./cgv_all                       (run in parallel)
   - on many other platforms:
       f90 -o cgv_all cgv_all.F90 -lm -lmpi         (compile)
     or
       mpif90 -o cgv_all cgv_all.F90 -lm            (compile)
       mpirun -np 2 ./cgv_all                       (run in parallel)
       mpirun -np 4 ./cgv_all -n 100 -m 100         (run in parallel with a larger dataset)
   - on non-MPI platforms:
       f90 -Dserial -o cgv_all cgv_all.F90 -lm      (compile)
       ./cgv_all                                    (run serially)


Practical: Options
------------------

Compile-time options [default]:
  -Dserial        compile without MPI and without data distribution     [parallel]
  -DWITH_MEMOPS   use the memcpy and memset functions instead of loops
                  for memory copy and set operations                    [loops]
  -DFASTMEMXS     run through the physical area during the matrix-vector
                  multiplication, computing two rows per loop iteration
                  for better memory-bandwidth usage

Run-time options [default]:
  -m {m}               vertical dimension of the physical heat area     [4]
  -n {n}               horizontal dimension                             [4]
  -imax {iter_max}     maximum number of iterations in the CG solver    [500]
  -eps {epsilon}       abort criterion of the solver for the residual
                       vector                                           [1e-6]
  -twodims             choose a 2-dimensional domain decomposition      [1-dim]
  -mprocs {m_procs}    choose the number of processors, vertical
                       (-twodims needed)
  -nprocs {n_procs}    and horizontal            [given by MPI_Dims_create]
  -prtlev 0|1|2|3|4|5  printing and debug level [1]:
                         1 = only || result - exact solution ||
                             and the partial result matrix
                         2 = and the residual norm after each iteration
                         3 = and the result of the physical heat matrix
                         4 = and all vector and matrix information
                             in the 1st iteration
                         5 = and in all iterations

Example:
  mpirun -np 4 ./cgv_all -m 200 -n 200 -twodims
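
Illustration for step 4: the code added between the task markers is typically a small
parallel building block of the CG solver. The following stand-alone sketch is NOT the
course solution and is not taken from the skeleton; it only illustrates one such
building block, a distributed dot product, in which every process sums over its local
rows and MPI_Allreduce combines the partial sums. The file and variable names
(cgv_dot_sketch, n_local, p, q) are made up for this illustration.

    ! cgv_dot_sketch.F90 -- illustrative sketch only, not part of the course skeleton
    program cgv_dot_sketch
      use mpi
      implicit none
      integer, parameter :: n_local = 4          ! local chunk size (assumed)
      double precision   :: p(n_local), q(n_local)
      double precision   :: dot_local, dot_global
      integer            :: ierror, i, my_rank

      call MPI_Init(ierror)
      call MPI_Comm_rank(MPI_COMM_WORLD, my_rank, ierror)

      ! dummy data; in the real solver p and q are CG search/residual vectors
      p = 1.0d0
      q = 2.0d0

      ! local partial sum over the rows owned by this process
      dot_local = 0.0d0
      do i = 1, n_local
         dot_local = dot_local + p(i)*q(i)
      end do

      ! global reduction: every process obtains the complete dot product
      call MPI_Allreduce(dot_local, dot_global, 1, MPI_DOUBLE_PRECISION, &
                         MPI_SUM, MPI_COMM_WORLD, ierror)

      if (my_rank == 0) print *, 'global dot product = ', dot_global

      call MPI_Finalize(ierror)
    end program cgv_dot_sketch

It can be compiled and run in the same way as the course code, e.g.
mpif90 -o dot_sketch cgv_dot_sketch.F90 and mpirun -np 2 ./dot_sketch.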
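
Illustration for the -Dserial compile-time option: .F90 files pass through the C
preprocessor, so -Dserial defines the macro serial and selects the serial code path.
The pattern below is an assumed, generic example, not copied from the skeleton: with
serial defined, all MPI calls are compiled out and rank/size are fixed for a single
process.

    ! serial_toggle_sketch.F90 -- illustrative sketch only (assumed pattern)
    program serial_toggle_sketch
    #ifdef serial
      implicit none
      integer :: num_procs, my_rank
      num_procs = 1                 ! serial build: exactly one "process"
      my_rank   = 0
    #else
      use mpi
      implicit none
      integer :: num_procs, my_rank, ierror
      call MPI_Init(ierror)
      call MPI_Comm_size(MPI_COMM_WORLD, num_procs, ierror)
      call MPI_Comm_rank(MPI_COMM_WORLD, my_rank, ierror)
    #endif

      print *, 'running on process ', my_rank, ' of ', num_procs

    #ifndef serial
      call MPI_Finalize(ierror)
    #endif
    end program serial_toggle_sketch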
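
Illustration for the -twodims, -mprocs and -nprocs run-time options: they describe a
2-dimensional decomposition of the heat area, and when -mprocs/-nprocs are not given
the factorisation is chosen by MPI_Dims_create, as noted above. The sketch below is
not taken from cgv_all.F90; it only shows how such a decomposition is typically set up
with MPI_Dims_create and MPI_Cart_create (variable names are made up).

    ! decomp_sketch.F90 -- illustrative sketch only, not part of the course skeleton
    program decomp_sketch
      use mpi
      implicit none
      integer :: ierror, num_procs, my_rank
      integer :: dims(2), coords(2), cart_comm
      logical :: periods(2), reorder

      call MPI_Init(ierror)
      call MPI_Comm_size(MPI_COMM_WORLD, num_procs, ierror)

      ! let MPI choose a balanced m_procs x n_procs factorisation;
      ! zero entries in dims mean "free to choose"
      dims    = 0
      periods = .false.             ! the heat area is not periodic
      reorder = .true.
      call MPI_Dims_create(num_procs, 2, dims, ierror)
      call MPI_Cart_create(MPI_COMM_WORLD, 2, dims, periods, reorder, &
                           cart_comm, ierror)

      ! each process can query its position in the process grid
      call MPI_Comm_rank(cart_comm, my_rank, ierror)
      call MPI_Cart_coords(cart_comm, my_rank, 2, coords, ierror)

      if (my_rank == 0) print *, 'decomposition: ', dims(1), ' x ', dims(2)

      call MPI_Finalize(ierror)
    end program decomp_sketch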