You are right that the code uses L2 optimization instead of L1. For the L1 optimization, I believe GPT people use the iterative reweighted least-squares (IRLS) algorithm. To understand IRLS, we need to start from the linear algebra solution for the L2 problem:
Following the nomenclature in the GPT paper, let the "primal image" be a vector v_p, and the "gradient images" be vectors v_x and v_y. The L2 optimization seeks an image v such that:
alpha * (v-v_p)^2 + (Dx*v - v_x)^2 + (Dy*v - v_y)^2
is minimized. The Dx and Dy here are the finite-difference operators, which can be represented as (sparse) matrices. From calculus 101 we know that if there is a minimum it must occur where the derivative is 0, so we can solve the L2 optimization by setting the gradient with respect to v to zero, which gives the linear system:

(alpha * I + Dx^T*Dx + Dy^T*Dy) v = alpha * v_p + Dx^T * v_x + Dy^T * v_y,

where I is the identity matrix (Dx^T*Dx + Dy^T*Dy is the negative discrete Laplacian operator, I think this is what you were mentioning). Although the matrices are enormously large, they are sparse, so the system is solvable using iterative techniques like conjugate gradient.
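As a rough sketch of how this sparse system could be set up in Python with scipy (the names diff_ops/solve_l2, the forward-difference convention, and the replicate-boundary handling are my own choices here, not taken from the GPT code):

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import cg

def diff_ops(h, w):
    """Forward-difference operators Dx, Dy for an h-by-w image flattened
    row-major; the last difference in each row/column is set to zero
    (replicate boundary)."""
    def fwd(n):
        d0 = -np.ones(n)
        d0[-1] = 0.0  # no difference past the image border
        return sp.diags([d0, np.ones(n - 1)], [0, 1], shape=(n, n), format="csr")
    Dx = sp.kron(sp.identity(h), fwd(w), format="csr")  # horizontal differences
    Dy = sp.kron(fwd(h), sp.identity(w), format="csr")  # vertical differences
    return Dx, Dy

def solve_l2(v_p, v_x, v_y, alpha, h, w):
    """Solve (alpha*I + Dx^T*Dx + Dy^T*Dy) v = alpha*v_p + Dx^T*v_x + Dy^T*v_y."""
    Dx, Dy = diff_ops(h, w)
    A = alpha * sp.identity(h * w) + Dx.T @ Dx + Dy.T @ Dy
    b = alpha * v_p + Dx.T @ v_x + Dy.T @ v_y
    v, info = cg(A, b, atol=1e-10)
    assert info == 0  # info == 0 means conjugate gradient converged
    return v
```

A handy sanity check: if v_x and v_y are exactly the finite differences of v_p, the solver should reproduce v_p for any alpha > 0.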
Now, for the L1 optimization we need to minimize the L1 cost function:
alpha * |v-v_p| + |Dx*v - v_x| + |Dy*v - v_y|,
where the squared differences are replaced by absolute differences. The derivative trick is no longer directly applicable because of the absolute values. The strategy used by the IRLS algorithm is to cast the problem back to the previous L2 optimization by minimizing the following cost function:
alpha * (v-v_p)^2 * w_p + (Dx*v - v_x)^2 * w_x + (Dy*v - v_y)^2 * w_y.
Here the weights w_p, w_x, w_y are vectors, the squares are taken elementwise, and * represents the dot product. If we can set

w_p = 1/|v-v_p|, w_x = 1/|Dx*v - v_x|, w_y = 1/|Dy*v - v_y|

(elementwise),
the cost function becomes the original L1 cost function. Furthermore, we can minimize this weighted cost, again, by using the derivative trick. Writing W_p, W_x, W_y for the diagonal matrices with w_p, w_x, w_y on the diagonal, the corresponding linear system is:

(alpha * W_p + Dx^T*W_x*Dx + Dy^T*W_y*Dy) v = alpha * W_p * v_p + Dx^T * W_x * v_x + Dy^T * W_y * v_y
Unfortunately, to set the weights w we need the final image v, but to solve for the image v we need the weights w, which is a chicken-and-egg problem. The solution is to start from an initial guess and gradually converge to the correct answer. The complete IRLS algorithm looks like this:
Let w_p, w_x, w_y = 1, then repeat until convergence:
-- Solve the weighted linear system above for v
-- Update w_p = 1 / (|v - v_p| + eps), w_x = 1 / (|Dx*v - v_x| + eps), w_y = 1 / (|Dy*v - v_y| + eps)
(eps is a small number to prevent division by zero)
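The loop above could be sketched like this (again my own illustrative names and forward-difference construction, not the GPT code; a fixed iteration count stands in for a real convergence test):

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import cg

def diff_ops(h, w):
    """Forward-difference operators for an h-by-w image flattened row-major."""
    def fwd(n):
        d0 = -np.ones(n)
        d0[-1] = 0.0  # replicate boundary: no difference past the edge
        return sp.diags([d0, np.ones(n - 1)], [0, 1], shape=(n, n), format="csr")
    return (sp.kron(sp.identity(h), fwd(w), format="csr"),
            sp.kron(fwd(h), sp.identity(w), format="csr"))

def solve_l1_irls(v_p, v_x, v_y, alpha, h, w, n_iter=10, eps=1e-6):
    Dx, Dy = diff_ops(h, w)
    n = h * w
    # All weights start at 1, so the first pass is the plain L2 solve.
    w_p = np.ones(n); w_x = np.ones(n); w_y = np.ones(n)
    v = v_p.copy()  # warm start for conjugate gradient
    for _ in range(n_iter):
        Wp, Wx, Wy = sp.diags(w_p), sp.diags(w_x), sp.diags(w_y)
        # Weighted normal equations:
        # (alpha*Wp + Dx^T*Wx*Dx + Dy^T*Wy*Dy) v = alpha*Wp*v_p + Dx^T*Wx*v_x + Dy^T*Wy*v_y
        A = alpha * Wp + Dx.T @ Wx @ Dx + Dy.T @ Wy @ Dy
        b = alpha * (w_p * v_p) + Dx.T @ (w_x * v_x) + Dy.T @ (w_y * v_y)
        v, _ = cg(A, b, x0=v, atol=1e-10)
        # Reweight so the next weighted L2 cost approximates the L1 cost.
        w_p = 1.0 / (np.abs(v - v_p) + eps)
        w_x = 1.0 / (np.abs(Dx @ v - v_x) + eps)
        w_y = 1.0 / (np.abs(Dy @ v - v_y) + eps)
    return v
```

Note that eps also keeps the system well-conditioned: without it, a residual that hits exactly zero would produce an infinite weight.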
And that's it. I hope this is clear enough for you to implement the L1 optimization. I did not put it in the code because I feel it is not the core component of the algorithm, and I did not find any open-source implementation of it...
P.S. It would be great if we could have the beloved LaTeX support back :p