Optimal Computer Crash Performance Precaution
Efraim Laksman, Håkan Lennerstad, Lars Lundberg
Abstract
For a parallel computer system with m identical
computers, we study optimal performance precaution for one possible
computer crash. We want to calculate the cost of crash precaution in
the case of no crash. We thus define a tolerance level r
meaning that we only tolerate that the completion time of a parallel
program after a crash is at most a factor r + 1 larger
than if we use optimal allocation on m - 1
computers. This is an r-dependent restriction of the set
of allocations of a program. Then, what is the worst-case ratio of the
optimal r-dependent completion time in the case of no
crash and the unrestricted optimal completion time of the same
parallel program? We denote the maximal ratio of completion times
f(r,m) - i.e., the ratio for worst-case
programs. In the paper we establish upper and lower bounds of the
worst-case cost function f(r,m) and characterize
worst-case programs.
Full Text: PDF PostScript