DASH
0.3.0
Static Public Member Functions
static std::vector< double > unit_weights (const TeamLocality_t &tloc)
Shared memory bandwidth capacities of every unit factored by the mean memory bandwidth capacity of all units in the team.
Definition at line 63 of file LoadBalancePattern.h.
inline static
Shared memory bandwidth capacities of every unit factored by the mean memory bandwidth capacity of all units in the team.
Consequently, a vector of 1's is returned if all units have identical memory bandwidth.
The memory bandwidth balancing weight for a unit is relative to the bytes/cycle measure of its affine core and considers the lower bound ("maximum of minimal") throughput between the unit and any other unit in the host system's shared memory domain.
This is mostly relevant for accelerators that have no direct access to the host system's shared memory. For example, Intel MIC accelerators are connected to the host via a 6.2 GB/s PCIe bus, and a single MIC core operates at 1.1 GHz with 4 hardware threads. The resulting measure (bytes/cycle) is calculated as:
Mpk = 6.2 GB/s
Cpk = 1.1 GHz * 4 = 4.4 G cycles/s
BpC = Mpk / Cpk = 6.2 / 4.4 = 1.41 bytes/cycle
The principal idea is that any data used in operations on the MIC target must first be moved over the slow PCIe interconnect. The offload overhead therefore reduces the amount of data assigned to a MIC accelerator, despite its superior ops/s performance.
Definition at line 98 of file LoadBalancePattern.h.