Sunday, August 21, 2011

Piecewise-linear Functions Redux

A recent question on OR-Exchange about step functions in mathematical programs made me decide to update an earlier post about piecewise-linear functions.

In the prior post I dealt with continuous piecewise-linear functions (of a single variable) in mathematical programs. As a quick recap, if the function represents a diseconomy of scale (convex in the objective of a minimization problem or concave in the objective of a maximization problem), as in the first plot below, it can be modeled strictly with linear inequality constraints (no binary variables needed). Note that convexity implies continuity everywhere except possibly the endpoints (where you can have a jump discontinuity and still be convex or concave). The method I previously described requires continuity at the endpoints as well.

If the function is continuous but lacks the appropriate convexity/concavity, as in the next plot (which is neither convex nor concave), the all-inequality trick fails. It may also fail for convex or concave functions that appear in constraints, either because the convexity of the function does not match the direction of the inequality (you have $f(x)\ge b$ where $f()$ is convex or $f(x)\le b$ where $f()$ is concave), or because $f()$ is tied to other variables in the constraints in such a way that there is a diseconomy of scale to the direct contribution of $f()$ to the objective value but an economy of scale overall (overestimating the value of convex $f()$ or underestimating the value of concave $f()$ allows the other variables to take values that should be infeasible given the value of $x$). If $f()$ appears in any constraints and you are not absolutely certain that it matches the directions of the constraints in which it appears and represents an overall diseconomy of scale, treat it as an arbitrary continuous piecewise-linear function, as discussed next.
Again recapping the previous post, for an arbitrary continuous piecewise linear function $f(x)$ with break points $a_{1}\lt a_{2}\lt\cdots\lt a_{n+1}$ and values $f_{i}=f(a_{i})$ at the break points, we introduce continuous variables $\lambda_{1},\dots,\lambda_{n+1}$, add the constraint $\lambda_{1}+\cdots+\lambda_{n+1}=1$, and tell the solver that the $\lambda$ variables form an SOS2 set. We then add the constraint $x=a_{1}\lambda_{1}+\cdots+a_{n+1}\lambda_{n+1}$ and replace $y=f(x)$ with $y=f_{1}\lambda_{1}+\cdots+f_{n+1}\lambda_{n+1}$. If the solver does not understand SOS2 sets, it is necessary to introduce binary variables (as described in the previous post).

Step functions, such as the one plotted below, are not continuous, so nothing from the earlier post applies to them. Suppose we have break points $a_{i}$ as above, with $f(x)=f_{i}$ when $a_{i}\le x\lt a_{i+1}$. The first problem we confront is that mathematical programs abhor strict inequalities. To a mathematical programming solver, there is no difference between $x\lt a_{i+1}$ and $x\le a_{i+1}$ unless $x$ is a discrete variable (i.e., there is a largest possible value for $x$ strictly less than $a_{i+1}$. That will not be the case here, so we will simply have to deal with the fact that if the solver decides it wants $x=a_{i+1}$, it will set $f(x)$ arbitrarily to whichever of $f_{i}$ and $f_{i+1}$ makes for the better objective result.

To model a step function, we use an SOS1 rather than SOS2 set. Add binary variables $u_{1},\dots,u_{n}$ constrained by $u_{1}+\cdots+u_{n}=1$, declare them to be an SOS1 set, and include the constraints \begin{eqnarray*} \sum_{i=1}^{n}a_{i}u_{i} &\le & x &\le &\sum_{i=1}^{n}a_{i+1}u_{i}\\ y & &= && \sum_{i=1}^{n}f_{i}u_{i}. \end{eqnarray*} (Explicit declaration of the SOS1 set may not be necessary; some solvers recognize SOS1 sets automatically, and some may not exploit the useful properties of an SOS1 set even if the set is identified for them by the modeler.)

One last point: both the SOS1 and SOS2 methods require that $x$ have a finite upper bound ($a_{n+1}$). That upper bound will show up as a coefficient in the constraint matrix. The larger it is, the more likely the model is to be numerically unstable. So avoid the temptation to set $a_{n+1}$ equal to the largest floating point value your compiler can handle. Infinity is not your friend here.

Hmm. This is all a bit dry.  Maybe a soundtrack would help? 

No comments:

Post a Comment

Due to intermittent spamming, comments are being moderated. If this is your first time commenting on the blog, please read the Ground Rules for Comments. In particular, if you want to ask an operations research-related question not relevant to this post, consider asking it on Operations Research Stack Exchange.