Effective Bandwidth (b_eff) Benchmark

The algorithm of b_eff (version 3.1)

The effective bandwidth b_eff measures the accumulated bandwidth of the communication network of a parallel and/or distributed computing system. Several message sizes, communication patterns and methods are used. The algorithm uses an average to take into account that in real applications short and long messages result in different bandwidth values.

Definition of the effective bandwidth b_eff:

b_eff = logavg ( logavg_{cartesian pattern} ( sum_L ( max_mthd ( max_rep ( b(cartes.pat.,L,mthd,rep) ))) / 21 ),
                 logavg_{random pattern}    ( sum_L ( max_mthd ( max_rep ( b(random pat.,L,mthd,rep) ))) / 21 )
               )

with

b(pat,L,mthd,rep) = L * (total number of messages of a pattern "pat") * looplength / (maximum time on each process for executing the communication pattern looplength times)

- Each measurement is repeated 3 times (rep=1..3). The maximum bandwidth of all repetitions is used (see max_rep in the formula above).
- Each pattern is programmed with three methods. The maximum bandwidth of all methods is used (max_mthd).
- The measurement is done for different message sizes. The message length L has the following 21 values:
  L = 1B, 2B, 4B, ... 2kB, 4kB, 4kB*(a**1), 4kB*(a**2), ... 4kB*(a**8)
  with 4kB*(a**8) = L_max and L_max = (memory per processor) / 128,
  and looplength = min( 300, L_max / L ).
- The average of the bandwidth over all message sizes is computed (sum_L(...)/21).
- A set of cartesian and random patterns is used (see the details section below).
- The average over all cartesian patterns and the average over all random patterns are computed on the logarithmic scale (logavg_{cartesian pattern} and logavg_{random pattern}).
- Finally, the effective bandwidth is the logarithmic average of these two values (logavg(logavg_{cartesian pattern}, logavg_{random pattern})).
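
The 21 message lengths and the loop length can be illustrated with a minimal Python sketch. It is not taken from b_eff.c. The function names are hypothetical, 1 kB is assumed to be 1024 bytes, the factor a is derived from 4kB*(a**8) = L_max, and the rounding of the intermediate lengths and the integer division in looplength are assumptions.

```python
def message_lengths(mem_per_proc_bytes):
    """Return the 21 message lengths in bytes (sketch, not the b_eff.c code)."""
    l_max = mem_per_proc_bytes // 128            # L_max = (memory per processor) / 128
    a = (l_max / 4096) ** (1 / 8)                # from 4kB * a**8 = L_max
    lengths = [2 ** i for i in range(13)]        # 1 B, 2 B, ..., 4 kB (13 values)
    # 4kB*a**1 ... 4kB*a**7; rounding to whole bytes is an assumption
    lengths += [round(4096 * a ** k) for k in range(1, 8)]
    lengths.append(l_max)                        # 4kB*a**8 = L_max
    return lengths


def looplength(length, l_max):
    """looplength = min(300, L_max / L), using integer division (assumption)."""
    return min(300, l_max // length)
```

For example, with 128 MB per processor, L_max is 1 MB and a evaluates to 2, so the lengths are the powers of two from 1 B to 1 MB.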

Details of the algorithm:

In each measurement, each node sends messages to one or more nodes. Three cyclic cartesian topologies and the following communication patterns are used:
- 1-dimensional x-direction
- 2-dimensional x-direction
- 2-dimensional y-direction
- 2-dimensional x-direction and y-direction
- 3-dimensional x-direction
- 3-dimensional y-direction
- 3-dimensional z-direction
- 3-dimensional x-direction and y-direction and z-direction
If the number of nodes is larger than 8 then the size of each topology is reduced until each dimension is larger than 1. If the number of nodes is larger than 4 then the size of the 2- and 3-dimensional topologies is reduced to the next even number. If the size of a direction is 1 then this direction is omitted. Patterns without any direction longer than 1 are omitted.
For the random patterns the same ring-communication as in the 1-dimensional topology is used, but the processes get random ranks.
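
The relation between the 1-dimensional ring and a random pattern can be sketched as follows. This is only an illustration: the function name is hypothetical, and it assumes that each process communicates with both of its ring neighbours, which the text above does not state explicitly. A random pattern is modelled as the same ring over a shuffled rank order.

```python
import random


def ring_partners(n, seed=None):
    """Map each rank to its (left, right) neighbour in a cyclic ring.

    With seed=None the ranks keep their natural order (1-dimensional
    cartesian ring). With a seed, the ranks are shuffled first, which
    models a random pattern (sketch under the assumptions stated above).
    """
    order = list(range(n))
    if seed is not None:
        random.Random(seed).shuffle(order)       # processes get random ranks
    partners = {}
    for pos, rank in enumerate(order):
        left = order[(pos - 1) % n]              # cyclic: position 0 wraps to n-1
        right = order[(pos + 1) % n]
        partners[rank] = (left, right)
    return partners
```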
The average is computed in two steps to guarantee that the cartesian and random patterns are weighted the same.
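
The two-step averaging can be sketched in Python as follows. The sketch assumes that the logarithmic average is the geometric mean, and that the measurements of one pattern are stored as a hypothetical nested list `pattern[L_index][method_index][rep_index]`; neither the names nor the data layout come from b_eff.c.

```python
import math


def logavg(values):
    """Average on the logarithmic scale (geometric mean)."""
    values = list(values)
    return math.exp(sum(math.log(v) for v in values) / len(values))


def pattern_bandwidth(pattern):
    """sum_L(max_mthd(max_rep(b))) divided by the number of message lengths.

    In the benchmark the number of message lengths is 21.
    """
    best_per_length = [max(max(reps) for reps in methods) for methods in pattern]
    return sum(best_per_length) / len(best_per_length)


def b_eff(cartesian_patterns, random_patterns):
    """Step 1: logavg within each group. Step 2: logavg of the two results."""
    cart = logavg(pattern_bandwidth(p) for p in cartesian_patterns)
    rand = logavg(pattern_bandwidth(p) for p in random_patterns)
    return logavg([cart, rand])
```

Because the second step averages exactly two numbers, each group contributes equally regardless of how many cartesian or random patterns were measured.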

Background

The first approach, by Karl Solchenbach, Hans-Joachim Plum and Gero Ritzenhoefer [1,2], was based on the bi-section bandwidth.

Due to several problems a redesign was done by the b_eff group. This redesign tries not to violate the rules defined by Rolf Hempel in [3] and by William Gropp and Ewing Lusk in [4].

Output of the b_eff Benchmark

Each run of the benchmark on a particular system results in an output file. The last line of this output file reports, e.g.,

b_eff = 9709.549 MB/s = 37.928 * 256 PEs with 128 MB/PE on sn6715 hwwt3e 2.0.4.71 unicosmk CRAY T3E

This line reports:

- the effective bandwidth b_eff of the whole system,
- the effective bandwidth of each processor (or node),
- the number of processors (or nodes),
- the memory of each processor (or node),
- the output of uname -a.
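
As an illustration, the summary line can be split into these fields with a small Python sketch. The regular expression is inferred only from the single example line above, so the exact layout on other systems is an assumption, and the function and field names are hypothetical.

```python
import re

# Pattern inferred from the one example line shown above (assumption).
SUMMARY_RE = re.compile(
    r"b_eff\s*=\s*(?P<total>[\d.]+)\s*MB/s"
    r"\s*=\s*(?P<per_pe>[\d.]+)\s*\*\s*(?P<pes>\d+)\s*PEs"
    r"\s+with\s+(?P<mem>[\d.]+)\s*MB/PE"
    r"\s+on\s+(?P<uname>.+)"
)


def parse_summary(line):
    """Return the fields of a b_eff summary line, or None if it does not match."""
    m = SUMMARY_RE.search(line)
    if m is None:
        return None
    return {
        "b_eff_mb_s": float(m.group("total")),     # whole system
        "per_pe_mb_s": float(m.group("per_pe")),   # per processor (or node)
        "pes": int(m.group("pes")),                # number of processors
        "mem_mb_per_pe": float(m.group("mem")),    # memory per processor
        "uname": m.group("uname").strip(),         # output of uname -a
    }
```

In the example line, the per-processor value times the number of PEs (37.928 * 256) reproduces the system value up to rounding.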

The preceding sections of the output file contain mainly the measurement values of b(pat,L,mthd,rep) and some analysis tables. A full description of the output file is available here.

Source code

Draft: b_eff.c (version 3.1)

References:

[1]: Karl Solchenbach: Benchmarking the Balance of Parallel Computers. SPEC Workshop on Benchmarking Parallel and High-Performance Computing Systems (copy of the slides), Wuppertal, Germany, Sept. 13, 1999.
[2]

Links

Pallas Effective Bandwidth Benchmark
MPI at HLRS
This page: www.hlrs.de/mpi/b_eff/

Rolf Rabenseifner
