Thursday, August 22, 2019

Notes on Pareto optimal Cloud Bursting

Pareto-Optimal Cloud Bursting
 Farahabady, YLee, and  Zomaya
October 2014


focus of cloud bursting is on performance to cost ratio.
BoT applications are chosen- horizontally scalability

degree of performance is not strongly correlated with cost.

2 conflicting objectives:
1. minimizing cost(c)
2.  maximizing performance by minimizing makespan (t)

Schedule -
not possible to achieve best values for both criteria simultaneously.
sigma-bar - is said to be dominated by a schedule sigma if both the cost and time metrics are worser than or equal to the latter.

sigma star - a schedule that is not dominated by other schedule.  It is called pareto-optimal schedule.

Pareto frontier - set of all feasible pareto-optimal points.

Each user possesses a utility function, U (c,t) - represents value of a particular schedule from the user's interest.

**Actual user preferences are difficult to obtain...
** complex utility function makes the scheduling problem intractable.

Main contribution : allows users to recognize the best performance to cost ratio
the framework presented is also capable of effectively dealing with uncertainties in application or resource performance.

Problem formulated as binary integer programming problem.

- k different resource types with speeds (s1,.... sk) and cost(c1,..., ck)
- no. of resources that can be leased by the public cloud - Lk
- no of homogeneous resources that are available in the private cloud is Lv

BoT Application model
 - application B, consists of n tasks all available at scheduling time t=0
   (execution of the task is dependent on the clock spee,hence it is CPU intensive.
- the running time of task j is denoted by Pj

** preemption is not considered

If task j is assigned to run on resource i,
processing time = Pj / si
cost of execution  = (ci X Pj) / si

- |B|R - represents the amount of time taken by a BoT application to complete in a reference machine.

Objective function

 min z = theta Cost + (1 - theta) Time : theta is between 0 and 1

value of theta - helps user make better realization of the optimization goal.


  • theta = 1 :  generates a solution that all tasks run in resources of the most cost effective type, i.e. with the highest values of si / ci

eg. speed is 1000 MIPS, cost is $2.00 per hour
1000/2 = 500

speed is 500 MIPS, cost is $1.00 per hour
500/2 = 250


  • theta = 0 : leads to a load balance solution among available resources.

                 each resource of type i receives a load proportional to the value of si / sum of the total number of resources X the speed


Let E denote a set of pareto points denoted by the user
Let Tbest and Tworst be the best and worst makespan achievable for the execution of a given BoT application.













Writing pseudocode

I had no idea writing pseudo code will be so difficult.  I'm still not sure whether it's more appropriate to write an algorithm or pseudocode.

https://link.springer.com/content/pdf/bbm%3A978-1-4471-5173-9%2F1.pdf

So there's this link from Springer on how to write a pseudo code, I think that's a good start.  There was a review some few years back that I'm mixing up various conventions when writing a pseudo code.
Then I tried to follow some conventions and it was just too complicated.

Let's see if I'm able to write with this convention.

Also I feel an algorithm is better

Notes on Algorithms and Complexity

From a 1993 handbook on OR and MS

Sequencing and Scheduling.

Definition:optimal allocation of scarce resources to activities over time.

Algorithms and Complexity

Complexity Theory : provides a mathematical framework where mathematical problems can be classified as "easy" or "hard"

NP Complete problem :  hardest problem in NP, meaning in that if it is solvable, then each problem in NP would be solvable.
so P=NP
NP-completeness - strong evidence that a polynomial time algorithm does not exist.

A problem is NP-complete if every problem is NP and it reduces to NP-complete

Optimization problem is NP Hard - if the associated decision problem is NP-complete.

optimization problem is NP Hard - meaning it is impossible to always find an optimal solution quickly.
possible to have some approximation algorithms.

Arbitrary precedence constraints, release dates, deadlines

if release dates are specified, makes the problem strongly NP Hard.

when jobs have to meet deadlines : shown to be NP- Hard by Lawler

Many have developed elimination criteria. i.e., of it meets certain criteria for tasks, then schedule, etc.
if elimination criteria are met then the problem can be approximated to polynomial time.

The rest of the chapter examines each type of scheduling problem ones with uniform processors, heterogeneous processor, job-shop.

There is a reference to Ibarra and Kim from which I'd somewhat understood the scheduling problem.