Thursday, June 25, 2009

Matrices and Fibonacci 3

Continuing our series on creating a Matrix class for Scala, now that we have the basics laid down, let’s go back to the Fibonacci problem. We need to do exponentiation of matrices. To do that, we first need to get multiplication going.

While I have no intention of explaining how to multiply matrices – the web is full of resources if you aren’t familiar with it – it’s important to restate the problem:

To find element of (row i, column j) of the matrix resulting from AxB, I need to multiply the nth element of row i of matrix A by the nth element of column j of matrix B, for all elements of said row and column, and then add the results. An initial, misguided, attempt might go like this:


(for(row <- 0 to rows
col <- 0 to B.cols
) yield A(i,col)*B(row,j)) reduceLeft (_+_)


The problem here is that you are multiplying each element of the row by every element of the column. What we want to do is to “pair” the first element of the row with the first element of the column, the second element of the row with the second of the column, and so forth, and then multiply each pair. This pairing is called “zip” in Scala. You’ll find “zip” – and the occasional variation, such as zipWithIndex – on sequence-like collections, such as List and Array. The zipping thing goes like this:


_row(i) zip other._col(j)


That will create an Array of tuples of Int: Array[Tuple2[Int,Int]], or simply Array[(Int,Int)]. Tuples can be thought of as a small “package” of objects. It’s not a collection – you do not traverse it or iterate through it. Also, each component of it has its own type, as you can see. The only thing you can do with tuples, aside from passing it around, is accessing a tuple’s element. You do this with the methods “_1”, “_2”, “_3”, etc.

Next, we need to do _1*_2 for each element of that array. You can’t do "map (_._1 * _._2)", as this translates into "map ((x1, x2) => x1._1 * x2._2)", but map only passes a single argument: the tuple. You can do this, then: "map (t => t._1 * t._2)". You can, but I won’t… Instead, I’ll pass "(_*_)", an anonymous function that takes two parameters and return the result of their multiplication, to a factory of functions.

Factories, as anyone familiar with OOP might know, are methods which create new instances of objects. So, what that has to do with a function? Well, a function in Scala is an object that implements the method “apply”. You might wonder if MatrixInt isn’t a function, then, as it implements apply itself. In fact, it could be one, if we chose to extend MatrixInt with the trait Function2 (two arguments to apply).

At any rate, we’ll use a function factory to produce a new instance of a function, based on (_*_). This factory, called Function.tupled (method “tupled” on singleton object Function), takes a function which receives multiple parameters, and converts it into a function which receives a single tuple instead. After that, we can apply the reduceLeft. Our computation, then, will go like this:


this._row(i) zip other._col(j) map Function.tupled(_*_) reduceLeft (_+_)


The method, then, is thus defined:


def *(other: MatrixInt): MatrixInt =
if (cols != other.rows)
throw new IllegalArgumentException
else
new MatrixInt(createArrays(rows, other.cols, (row, col) => {
this._row(row) zip other._col(col) map Function.tupled(_*_) reduceLeft (_+_)
}))


If we do away with the check to see if the matrices are compatible in size, and make allowance for the verbosity of the identifiers being used (like row and col instead of I and j), it comes down to a one-liner. Truly:


def *(o: MatrixInt) = new MatrixInt(createArrays(rows, o.cols, (i,j) => _row(i) zip o._col(j) map Function.tupled(_*_) reduceLeft (_+_)))


The only thing left to do is exponentiation. We could do it as a one-liner too:


def ^(n: Int) = (0 until n).foldLeft(unitMatrixInt(rows))((acc,_) => acc*this)


The method foldLeft is similar to reduce. But where reduce can be thought of as (((a+b)+c)+d)+e, foldLeft is more like this:


var acc = ?
acc = op(acc,a)
acc = op(acc,b)
acc = op(acc,c)
acc = op(acc,d)
acc = op(acc,e)


The most important difference between foldLeft and reduceLeft is that the result of foldLeft, and its accumulator, might be of a different type than the elements the operation is applied upon. With reduceLeft, once you applied acc = op(a,b), if acc where of a different type, then op(acc,c) would be a different method. It might work with some implicits or subtypes, but not for entirely different types.

Now, I’m not using the range for nothing, except counting the number of times I want to multiply the matrix. Acc is initialized with “unitMatrixInt(rows)”, which would be a method producing a “unit” matrix of the appropriate size. The “1” of matrices. Then I keep multiplying “this” against this accumulator.

I could create a list of copies of “this”, and then multiply them with reduceLeft too. It would go like this:


def **(n: Int) = List.fill(n)(this) reduceLeft (_*_)


Much simpler, but I wanted to mention foldLeft somehow. :-)

While simplicity has its merits, there is a simple optimization when doing exponentiation by taking advantage of common factors. I went for it in my final implementation:


def **(n: Int): MatrixInt =
if (rows != cols n < 0 =""> unitMatrixInt(rows)
case 1 => this
case 2 => this * this
case odd if odd % 2 == 1 => this ** (odd - 1) * this
case even => this ** (even / 2) ** 2
}


I'll make a note on the method name, here. I chose ** as this is one of the common names for such an operation on computer languages. Another, less common, name is ^. I prefer the later, but it has a very low precedence. In fact, this very method would be broken as written, because "(odd - 1) * this" would have precedence over "this ** (odd - 1)", which would completely invert the logic and cause a compilation error.

Now, aside from , ^, &, <>, = and !, :, +, -, * and / and % (ordered by precedence, by the way), all other special characters have a higher precedence. If I called it "@", for instance, then "a * b @ c" would behave as one would expect it too. Unfortunately, no one would expect "@" to be an exponentiation character. But if I were to make extensive use of this library in formulas and expressions, I might well go for something like that, to prevent unexpected errors due to precedence.

Anyway, we need to define unitMatrixInt too. We’ll do so in a companion object. This way, users of this package can create unix matrices if they want to. To create this method, we’ll need a more general factory method, though. We already have most of what we need in createArrays.

But before I get into that, I want to revise my class definition. Presently, this is what we have:


class MatrixInt(array: Array[Array[Int]]) {
private val self = clone(array)
}


I receive a matrix, and then replicate it internally, so I can be sure it won’t be changed outside my class – and I take care of not changing it inside. But there is a downside to it: “array” continues to be part of my object (see footnote 1)! It becomes a private reference inside my object, and this has some unfortunate consequences. For one thing, since each instance of my class makes reference to the arrays used to build them, those arrays never get deallocated. Now think about all references to “new MatrixInt” inside the class. I use that for every operator. That means I keep TWO copies of a MatrixInt’s array in memory, even when the copy used for construction gets immediately discarded.

This presents a conundrum.On one hand, if the instances of my class are immutable, then I need the full array to begin with. On the other hand, if I receive an array from an external source, I need to copy it before using, as the original array might get changed by whoever passed it to me.

We solve this with the help of a private constructor and an object companion. A private constructor is, of course, a constructor that cannot be accessed from outside the class. The private constructor thing is easy:


class MatrixInt private (array: Array[Array[Int]]) {


Now, how to access it? We cannot define another constructor inside MatrixInt with the same signature – receiving an array of array of Int as parameter – because it would conflict with the first. What we do, then, is use an object companion.

Simply put, an object companion is a singleton object – declared with the “object” keyword – which has the same name as a class. That grants both the object and the class visibility of each other’s private definitions.

We’ll then define an “apply” method in that object companion. As you may remember, the “apply” method receives special treatment from the Scala compiler, which enables you to do away with the name – apply – and just pass the parameters to the instance of your object. Since we have to call our object companion “MatrixInt”, then MatrixInt(…) will be converted by Scala into MatrixInt.apply(…). It looks like the instantiation of a new object, only it lacks the “new” keyword.

Let’s define it, then. First, let’s remove the “val self” from the class, and just receive it in its private constructor. As that constructor will only be accessed through MatrixInt’s object companion, that will be safe.


class MatrixInt private (self: Array[Array[Int]]) {
import MatrixInt._


The import statement makes it possible to access the methods of the object companion without preceding them with “MatrixInt.”. Being a companion object makes the private methods accessible, but does not bring them into scope automatically.

Now, for the object companion, let’s create it, the factory for MatrixInt, and a couple of helpful factories. We’ll also transfer the “clone” and “createArrays” methods to it, as this method does not make reference to a MatrixInt’s own data, so it doesn’t need to be part of each instance. The definition of toArray needs “MatrixInt.” prepended to “clone” to avoid an ambiguous reference, but nothing else should change.


object MatrixInt {
import MatrixInt._
implicit def toMatrixInt(m : Array[Array[Int]]) = MatrixInt(m)

def apply(array: Array[Array[Int]]): MatrixInt = new MatrixInt(clone(array))
def apply(rows: Int, cols: Int): MatrixInt = MatrixInt(rows, cols, 0)
def apply(rows: Int, cols: Int, value: Int): MatrixInt =
new MatrixInt(createArrays(rows, cols, ((_,_) => value)))
def apply(rows: Int, cols: Int, f: (Int,Int) => Int): MatrixInt =
new MatrixInt(createArrays(rows, cols, f))

def unitMatrixInt(n: Int) = MatrixInt(n, n, (row,col) => (if (row == col) 1 else 0))

protected def createArrays(rows: Int, cols: Int, f: (Int, Int) => Int) =
for((i: Int) <- (0 until rows).toArray) yield for((j: Int) <- (0 until cols).toArray) yield f(i,j) protected def clone(a: Array[Array[Int]]) = createArrays(a.size, a(0).size, (row, col) => a(row)(col))
}


We can now create a MatrixInt by using MatrixInt(Array(Array(…))), but we didn’t – and won’t – change that in the class. The companion object factory produces a clone of the array, just as we wanted. But the arrays produced by MatrixInt’s methods don’t get exposed, so there is no risk in using them. So, by keep using the private constructor inside MatrixInt itself, we prevent unnecessary copies of the arrays produced by each of MatrixInt’s operators.

We’ll finish today with Fibonacci, but we’ll continue later on. I’ll add some methods for completeness to MatrixInt, and then look into MatrixDouble and solve some linear equations with it.

After that, we’ll briefly explore ways to make our Matrix class parametrizable, though I’m still in doubt about performance. Still, I’m reading the comments, and I plan to address them one way or another in these posts.

So, the Fibonacci!


def fibonacci(n: Int) = (MatrixInt(Array(Array(1,1),Array(1,0))) ** n)(1,0)


Footnote 1:

A reader has brought to my attention that, actually, "array" is not being preserved. I haven't found in Scala specification anything about this, so I'm putting it down to optimization. But because it is not explicitly documented, I advise against taking advantage of it.

Do note, please, that if you declare "array" to be protected, val or var, or if any method makes use of it, then it will be preserved in the object.

4 comments:

  1. Hi,

    I'm interested in how you format the Scala code above.

    ReplyDelete
  2. Syntax Highlighter, the Alex Gorbatchev version. The Google Code version, though top on Google results, is out of date. I don't know why development ceased on Google Code.

    http://alexgorbatchev.com/wiki/SyntaxHighlighter

    ReplyDelete
  3. >But there is a downside to it: “array” continues to be part of my object

    I'm trying to verify that. I take this code
    class MatrixInt(array: Array[Array[Int]]) {
    private val self = clone(array)

    protected def createArrays(rows: Int, cols: Int, f: (Int, Int) => Int) =
    for((i: Int) <- (0 until rows).toArray)
    yield for((j: Int) <- (0 until cols).toArray)
    yield f(i,j)

    protected def clone(a: Array[Array[Int]]) =
    createArrays(a.size, a(0).size, (row, col) => a(row)(col))
    }

    And the I run
    >scalac MatrixInt.scala
    >javap -p MatrixInt
    And I get
    public class MatrixInt extends java.lang.Object implements scala.ScalaObject{
    private final int[][] self;
    public MatrixInt(int[][]);
    public int[][] clone(int[][]);
    public int[][] createArrays(int, int, scala.Function2);
    private int[][] self();
    public int $tag() throws java.rmi.RemoteException;
    }

    Only one field, private final int[][] self;
    am I missing something?

    ReplyDelete
  4. Indeed, you are correct. Scala has optimized away the "array" field. I can't find any remarks on the specification specifying (or denying) this behavior.

    Obviously, if you do reference "array" inside a method, or declare "array" to be protected/val/var, then it will be preserved.

    I'll put a remark on the article, but, as the specification is silent on this issue, I think it best not to rely on this behavior.

    Thank you for the information, though!

    ReplyDelete