The $k$-server problem is one of the most important and well-studied problems in the field of online algorithms. The goal of this blog is to describe the problem to you and give you some flavor of the underlying techniques used in the recent breakthrough by Sebastien Bubeck, Michael Cohen, James Lee, Aleksander Madry and me. (Note that three of them are members of ADSI!) For a detailed blog with proofs, please see here.
In this problem, we control the movement of a set of $k$ servers on a fixed metric space of $n$ vertices. In each iteration, we are given a request at some vertex. If there is no server at that location, we must choose a server, move it there and pay the movement cost of this server. Our goal is to minimize the total distance of all servers’ moves. This problem is very general and was originally proposed to model problems related to cache management.
We call an algorithm $\alpha$-competitive if for any request sequence, its total movement is at most $\alpha \cdot \mathrm{OPT}$, where $\mathrm{OPT}$ is the optimal movement cost if the whole request sequence is given in the beginning. A priori, it is unclear that any competitive algorithm exists because we are comparing against a powerful algorithm that knows the whole request sequence and is free to make any moves.
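To make the definition concrete, here is a small sketch that computes the offline optimum by dynamic programming over server configurations and compares it with a naive greedy online rule (always move the closest server). The three-point metric, the request sequence and all function names are illustrative assumptions of this sketch, not taken from the paper.

```python
from itertools import combinations

def offline_opt(dist, start, requests):
    """Optimal offline cost via DP over server configurations.

    Uses the standard fact that lazy schedules (move one server, only when
    the request is uncovered) are optimal, so servers stay on distinct points.
    """
    n, k = len(dist), len(start)
    configs = list(combinations(range(n), k))
    cost = {c: float("inf") for c in configs}
    cost[tuple(sorted(start))] = 0.0
    for r in requests:
        new_cost = {c: float("inf") for c in configs}
        for c, val in cost.items():
            if val == float("inf"):
                continue
            if r in c:                      # request already covered
                new_cost[c] = min(new_cost[c], val)
                continue
            for s in c:                     # try moving server s to r
                nxt = tuple(sorted((set(c) - {s}) | {r}))
                new_cost[nxt] = min(new_cost[nxt], val + dist[s][r])
        cost = new_cost
    return min(cost.values())

def greedy_cost(dist, start, requests):
    """Online rule: always move the closest server to the request."""
    servers, total = list(start), 0.0
    for r in requests:
        if r in servers:
            continue
        i = min(range(len(servers)), key=lambda j: dist[servers[j]][r])
        total += dist[servers[i]][r]
        servers[i] = r
    return total

# Hypothetical instance: points at positions 0, 1, 3 on a line, 2 servers.
dist = [[0, 1, 3], [1, 0, 2], [3, 2, 0]]
start = (0, 2)
requests = [1, 0] * 10
ratio = greedy_cost(dist, start, requests) / offline_opt(dist, start, requests)
```

On this instance greedy keeps shuttling one server between the two nearby points, while the offline optimum pays once to bring the far server over, so the ratio grows with the length of the sequence. This shows that greedy is not $\alpha$-competitive for any fixed $\alpha$.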
To illustrate the learning aspect, consider a $2$-server problem on a graph with 3 vertices . Suppose that , for and the request is
Since and are so far away, the optimal strategy for this sequence is to put a server on and a server on . However, if appears less frequently than once per iterations, a better strategy would put both servers on most of the time. Therefore, no memoryless algorithm is competitive, and we indeed need to learn something.
One main focus for the $k$-server problem is to achieve a competitive ratio independent of $n$ because of the common case $n \gg k$. Two of the major open problems are to find a $k$-competitive deterministic algorithm and an $O(\log k)$-competitive randomized algorithm for the $k$-server problem. For the deterministic problem, Koutsoupias and Papadimitriou gave a deterministic algorithm that achieves a competitive ratio of $2k-1$ on any metric space. However, there has not been much progress on the randomized problem. In particular, no $o(k)$-competitive algorithm with a ratio independent of $n$ was known except for simple classes of graphs such as weighted complete graphs. Due to its importance and the gap between the lower and upper bounds, the following “easier” problem has been a major target of the field. And in my opinion, this was the most important conjecture in online algorithms.
(Weak randomized $k$-server conjecture) There is a randomized algorithm for the $k$-server problem on any graph with competitive ratio $\mathrm{polylog}(k)$.
The previous best competitive ratio was $\tilde{O}(\log^3 n \log^2 k)$ (by Bansal, Buchbinder, Naor and Madry in 2011). In our paper, we give an algorithm with competitive ratio $O(\log^2 k)$ on hierarchically separated trees (HST, a much larger class of graphs) and as a corollary $O(\log^2 k \log n)$ on general graphs. Soon after our paper, James Lee built upon our paper and gave an algorithm with competitive ratio $O(\log^6 k)$, finally resolving the weak randomized $k$-server conjecture!
Online Learning and Mirror Descent
Our algorithm is based on mirror descent with a multiscale entropy. So, let me describe an online learning problem and the mirror descent algorithm for it.
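In this problem, we are given a convex set $K \subseteq \mathbb{R}^n$. In the $t$-th iteration, we select a vector $x_t \in K$, then the adversary selects a loss vector $\ell_t$ in $\mathbb{R}^n$ and we receive a loss $\langle \ell_t, x_t \rangle$ for that iteration. Our goal is to minimize the regret (the difference between our loss and the loss of the optimal fixed strategy)
$$\sum_{t} \langle \ell_t, x_t \rangle - \min_{x \in K} \sum_{t} \langle \ell_t, x \rangle.$$
This problem can be solved by mirror descent:
$$x_{t+1} = \arg\min_{x \in K} \; \eta \langle \ell_t, x \rangle + D_\Phi(x, x_t)$$
where $\eta$ is the step size, $\Phi$ is a convex function on $K$ called the mirror map and $D_\Phi$ is the Bregman divergence associated to $\Phi$, defined by
$$D_\Phi(y, x) = \Phi(y) - \Phi(x) - \langle \nabla \Phi(x), y - x \rangle.$$
When $y$ is very close to $x$, $D_\Phi(y, x) \approx \frac{1}{2} (y - x)^\top \nabla^2 \Phi(x) (y - x)$ and hence mirror descent is simply moving in the direction $-\ell_t$ while making sure the point stays in $K$ and is not too far from the previous point in the norm induced by $\nabla^2 \Phi(x_t)$.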
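As a minimal sketch of the update, assume the convex set is the probability simplex and the mirror map is the negative entropy. In that special case the mirror descent step has a closed form: multiply each coordinate by an exponential of its loss and renormalize. The loss sequence, the step size and the function names below are made up for illustration.

```python
import math

def mirror_descent_step(x, loss, eta):
    """One entropic mirror descent step on the probability simplex.

    With the negative-entropy mirror map, minimizing
    eta * <loss, y> + (Bregman divergence from x to y) over the simplex
    gives a multiplicative update followed by normalization.
    """
    w = [xi * math.exp(-eta * li) for xi, li in zip(x, loss)]
    total = sum(w)
    return [wi / total for wi in w]

def run(losses, eta):
    """Play mirror descent against a fixed loss sequence; return final point and regret."""
    n = len(losses[0])
    x = [1.0 / n] * n                      # start at the uniform distribution
    incurred = 0.0
    for loss in losses:
        incurred += sum(xi * li for xi, li in zip(x, loss))
        x = mirror_descent_step(x, loss, eta)
    # A linear loss over the simplex is minimized at a vertex.
    best_fixed = min(sum(l[i] for l in losses) for i in range(n))
    return x, incurred - best_fixed

# Hypothetical losses: coordinate 1 is always free, the others always cost 1.
losses = [[1.0, 0.0, 1.0]] * 50
x_final, regret = run(losses, eta=0.5)
```

Here the iterate concentrates on the free coordinate and the regret stays bounded by a small constant even though the sequence has 50 rounds.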
For me, a general piece of wisdom when facing a new learning problem is to check whether mirror descent or one of its variants works well. See my favorite example here.
Weighted Complete Graph
Unfortunately, applying mirror descent to the $k$-server problem is not as easy as picking a good mirror map, as I had wished. Let me first describe the algorithm for complete graphs with a metric of the form $d(i,j) = w_i + w_j$. For this and many other graphs, it is known how to turn a fractional solution that is feasible,
$$x \in [0,1]^n, \quad \sum_{i} x_i = k,$$
into an integral solution with the movement cost bounded by $O(1)$ times the fractional movement cost $\sum_t \sum_i w_i \, |x_i^{(t+1)} - x_i^{(t)}|$ (a natural continuous definition of movement cost). Therefore, it suffices to propose a fractional algorithm (the first prerequisite for applying mirror descent).
Our algorithm is motivated by the competitive algorithm of Bansal, Buchbinder and Naor in 2007. To make its relation to mirror descent clear, I describe our process here as a discrete process with an infinitesimally small step size $\eta$. Instead of working on the fractional server $x$, our algorithm is defined on the fractional anti-server $z = 1 - x$:
- Let .
- Let .
- When the request for the vertex arrives
- While .
- where is the coordinate vector at .
- While .
In short, when a request arrives at vertex $\sigma$, we run mirror descent with the cost vector $e_\sigma$ (the coordinate vector at $\sigma$) until all anti-mass leaves the coordinate $\sigma$.
The reason for using the cost $e_\sigma$ is to put a “cost” at coordinate $\sigma$ that forces all anti-mass at $\sigma$ to leave (equivalently, attracting servers to move to $\sigma$). Since the feasible set $K$ is a simplex, the standard choice of the mirror map is the entropy $\sum_i z_i \log z_i$. However, this mirror map is not suitable because its gradient blows up on the boundary. Following the idea in (BBN07), we shift all variables by $\delta$ in the mirror map.
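Since the step size is infinitesimally small, with the shifted and weighted entropy $\Phi(z) = \sum_i w_i (z_i + \delta) \log (z_i + \delta)$ the update becomes, away from the boundary, the flow
$$\frac{d z_i}{d t} = \frac{z_i + \delta}{w_i} \left( \mu - \mathbb{1}[i = \sigma] \right),$$
where the multiplier $\mu$ is chosen so that $\sum_i z_i$ stays constant.
Using this, one can show that the algorithm is moving the anti-mass from coordinate $\sigma$ to coordinate $i$ at a rate proportional to $\frac{z_i + \delta}{w_i}$. Namely, the algorithm tends to move the server from vertices with smaller weight and less server mass to the request. One can show this algorithm has an $O(\log k)$ competitive ratio.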
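The dynamics described above can be sketched as a discretized flow. This is a rough illustration under stated assumptions, not the paper's exact algorithm: it assumes anti-mass flows from the requested vertex to every other vertex $j$ at a rate proportional to $(z_j + \delta)/w_j$, stops feeding a vertex once its anti-mass reaches 1, and uses $\delta = 1/k$ as an illustrative shift. The function name, the weights and the step size are hypothetical.

```python
def serve_request(z, w, sigma, delta, step=1e-3):
    """Move all anti-mass off coordinate sigma in small steps.

    z[j] is the fractional anti-server at vertex j (1 minus server mass).
    Anti-mass leaving sigma is split among the other vertices in proportion
    to (z[j] + delta) / w[j]; vertices already at z[j] = 1 receive nothing.
    """
    z = list(z)
    n = len(z)
    while z[sigma] > 1e-12:
        active = [j for j in range(n) if j != sigma and z[j] < 1.0 - 1e-12]
        rates = {j: (z[j] + delta) / w[j] for j in active}
        total = sum(rates.values())
        amount = min(step, z[sigma])
        # Shrink the step if it would push some coordinate above 1.
        amount = min([amount] + [(1.0 - z[j]) * total / rates[j] for j in active])
        for j in active:
            z[j] += amount * rates[j] / total
        z[sigma] -= amount
    z[sigma] = 0.0
    return z

# Hypothetical instance: n = 4 vertices, k = 2 servers sitting on vertices 0 and 1.
k = 2
w = [1.0, 2.0, 4.0, 4.0]
z_before = [0.0, 0.0, 1.0, 1.0]            # anti-server = 1 - server mass
z_after = serve_request(z_before, w, sigma=2, delta=1.0 / k)
```

In this instance the request at vertex 2 pulls server mass from vertices 0 and 1, and more of it comes from the lighter vertex 0, so its anti-mass ends up larger. The total anti-mass $n - k$ is preserved throughout.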
Hierarchically Separated Tree
In general, it suffices to solve the $k$-server problem on HST metrics (BBNM11). This reduction costs an $O(\log n)$ factor in the competitive ratio. Given a tree with vertex weights $w$, we define the metric on the leaves by $d(i,j) = w_{\mathrm{lca}(i,j)}$ where $\mathrm{lca}(i,j)$ is the least common ancestor of $i$ and $j$. We call this tree an HST if $w_v \le w_u / \tau$ for a fixed $\tau > 1$ whenever $v$ is a child of $u$, and we call the metric space an HST metric.
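As a small sketch of this definition, the code below computes leaf distances on a weighted tree, assuming the distance between two distinct leaves is the weight of their least common ancestor and that the HST condition means each child weight is at most the parent weight divided by a factor `tau`. The tree, the weights, the value of `tau` and the function names are assumptions of this example.

```python
def ancestors(parent, v):
    """Path from v up to the root, inclusive."""
    path = [v]
    while parent[path[-1]] is not None:
        path.append(parent[path[-1]])
    return path

def lca(parent, i, j):
    """Least common ancestor of i and j by walking both paths to the root."""
    seen = set(ancestors(parent, i))
    for v in ancestors(parent, j):
        if v in seen:
            return v
    raise ValueError("vertices are not in the same tree")

def hst_distance(parent, w, i, j):
    """Distance between leaves: weight of their least common ancestor."""
    return 0.0 if i == j else w[lca(parent, i, j)]

def is_hst(parent, w, tau=2.0):
    """Check that weights shrink by a factor of at least tau from parent to child."""
    return all(w[v] <= w[p] / tau for v, p in parent.items() if p is not None)

# Hypothetical tree: root "r" with children "a", "b"; leaves 0, 1 under "a" and 2, 3 under "b".
parent = {"r": None, "a": "r", "b": "r", 0: "a", 1: "a", 2: "b", 3: "b"}
w = {"r": 4.0, "a": 2.0, "b": 2.0, 0: 1.0, 1: 1.0, 2: 1.0, 3: 1.0}
near = hst_distance(parent, w, 0, 1)   # leaves in the same subtree
far = hst_distance(parent, w, 0, 2)    # leaves separated at the root
```

Leaves that split higher in the tree are farther apart, which is the multiscale structure the mirror map has to respect.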
Two main questions to be decided are:
- How do we represent the solution?
- What is the mirror map we use?
One natural choice is to represent anti-mass using
where is the root of the tree and is the parent of . The main issue with this representation is that it cannot distinguish between the case where we need exactly 1 server in a subtree and the case where we need 2 servers with 50% probability. Another small issue is that some constraints are never active. Using the fact that the algorithm is always moving servers to the request until , one can show that and are never active.
Fixing these issues, we have a new representation
where are sets of pairs with . The first set of constraints ensures that there are only $k$ servers in total and the second set of constraints ensures that the number of servers in the children of a vertex is at most the number of servers at .
The mirror map we pick is a generalization of the mirror map on a simplex
where the shift .
Apart from some technical issues, our algorithm for HST is the same as the algorithm described for the complete graph, but with the new representation and the new mirror map.
The proof for classical mirror descent is clean, short and optimal. Unfortunately, our proof is slightly longer (i.e. a few pages) and the bound we get for HST does not seem optimal. We hoped to get the ultimate algorithm for $k$-server, at least for HST; however, our algorithm does not seem to be the one from the BOOK. So, what is the algorithm from the BOOK for the $k$-server problem (at least for HST)?
On the other hand, it is interesting to see if HST is necessary for an -competitive algorithm on general graphs (or for a path). Maybe I am too naive on this and HST is indeed the right tool?