 Parallel Numerics
 overview
 Partial Differential Equations
 Solution Methods for Partial Differential Equations
 different aspects of parallelism
 Parallelism in hardware view
 Parallelism in software view
 Parallelism in performance view
 Parallelism in algorithmic view
 parallel operations on independent sets of data 1
 Vectorization: multiple gather
 Vectorization: scattering data
 F90: array syntax
 F90: index vectors
 F95: forall ... end forall different from do ... enddo
 parallel operations on independent sets of data 2
 recursive task generation 1
 recursive task generation 2
 functional decomposition 1
 functional decomposition 2
 neighbourhoods
 Finite Differences (Approximation of Differential Operators by Differences)
 Finite Differences 2
 Matrix for regular grid with 5-point stencil
 Finite Volumes (Flux Balance over neighbouring sides)
 finite volume and finite differences
 Finite Volume
 Triangular Grid with Relation Matrix
 Finite Elements (local matrices form a global matrix)
 unstructured self adaptive grids: CEQ 1
 unstructured self adaptive grids: CEQ 2
 unstructured self adaptive grids: CEQ 3
 unstructured self adaptive grids: CEQ 4
 domain decomposition: overlapping grids and communication
 domain decomposition: properties
 domain decomposition: communicated surface for cube
 regular grid partitioning of regular grid
 domain decomposition: irregular grid partitioning of regular grid
 domain decomposition: recursive Spectral Bisection
 domain decomposition: Recursive Spectral Bisection
 Partitioning Software
 Large Systems
 two different approaches
 main directions of linear solvers
 direct solvers
 simple iterative procedures: Gauss-Seidel
 simple iterative procedures: Jacobi
 Krylov space algorithms
 Conjugate Gradient Squared (CGS)
 BiCGSTAB(2)
 operations of Krylov space algorithms
 preconditioning
 MILU for general matrix
 overlapping domain decomposition
 nonoverlapping domain decomposition
 nonoverlapping Domain Decomposition
 nonoverlapping DD: several 2D domains
 nonoverlapping DD:
 nonoverlapping DD: separation of inner and boundary nodes
 nonoverlapping Domain Decomposition
 nonoverlapping DD: Schur complement
 nonoverlapping DD:
 nonoverlapping DD: dividing boundary nodes
 nonoverlapping DD: Preconditioner 1
 nonoverlapping DD: Preconditioner 2
 nonoverlapping DD: reference
 coloring: defining independent sets
 coloring of a graph: neighbouring nodes have different colors
 coloring
 Multigrid 1
 Multigrid 2
 Hierarchy of triangles
 hardware and implementation
 High Performance for Fast Methods
 parameters of performance limits
 bandwidth
 latency
 overcoming latency
 Latency from processor to various locations
 dependency of bandwidth and latency
 dependency of bandwidth and latency 2
 Bandwidth-performance relation for multiblock grid with cubes
 Example multiblock technique (optimal case)
 distributions in quadratic patches
 distribution in line patches
 High Performance Systems
 ASCI White, IBM [2000]
 Earth Simulator Project: end 2001
 programming techniques
 getting processor performance by loops with arrays
 flexible but expensive: linked list 1
 flexible but expensive: linked list 2
 nice but expensive: recursive calls
 recursive calls 2
 modifying a hierarchical grid
 inhibiting processor performance
 good for performance
 cache reuse for heat transfer: different implementations on SGI R10000
 object oriented programming
 Formulation of flexible neighbourhoods
 discrete operators and sparse matrices
 handling sparse matrices
 Sparse matrix example
 Sparse matrix: row ordered example
 Row ordered and jagged diagonal matrices: data structure
 Row ordered: matrix vector multiplication
 Sparse matrices: jagged diagonal example
 jagged diagonal matrix: matrix vector multiplication
 Row ordered and jagged diagonal matrix formulation: performance
 Conclusion
 Books and URLs 1
 Books and URLs 2
 End
