Tuesday, June 30, 2009

Matrices 5

Our Matix class is now almost complete. There are few methods I intend to add to it, and one I intend to change. Most of this is pretty straight-forward, though, so I’ll be quick about it.

First, matrix transposition:

def transpose = new MatrixInt(createArrays(cols, rows, (row,col) => this(col,row)))

Pretty trivial. Next, determinant. To compute the determinant of a matrix, though, I need to compute some co-factors of it. The co-factor method is the only one I’m unhappy about. My original implementation was something like this:

for((i: Int) <- (0 until rows).toArray; if i != row)
yield for((j: Int) <- (0 until cols).toArray; if j != col)
yield f(i,j)

That’s a pretty good implemention, but then I went and factored that code out of it and into createArrays. Unfortunately, there is no “if” in createArrays, and I’m not willing to add them for the sake one one method, as this can have some performance impact, and createArrays is used everywhere.

I could create a “createArraysCond” or something. That would do, but there is only one method that would make use of it. I’m wary of premature refactoring. :-)

So I ended with the method below. Also, the determinant:

def cofactor(row: Int, col: Int) = new MatrixInt(createArrays(rows-1, cols-1, (i,j) =>
this(i + (if (i < row) 0 else 1), j + (if (j < col) 0 else 1))))

protected def sign(row: Int, col: Int) = if ((col + row) % 2 == 0) 1 else -1

lazy val determinant: Int =
if (rows != cols)
throw new UnsupportedOperationException
else rows match {
case 1 => this(0,0)
case 2 => this(0,0)*this(1,1) - this(1,0)*this(0,1)
case n => (
_row(0).zipWithIndex
map Function.tupled((value, col) => value * cofactor(0, col).determinant * sign(0, col))
reduceLeft (_+_)
)
}

I factored out “sign” because it is used in another method, and makes the code much clearer. Since our matrix is immutable, I made determinant “lazy val”. It’s a val because I need compute it only once – it’s value won’t change. It’s lazy because, that way, it won’t get computed until someone actually uses it. Otherwise, all matrices would have their determinant computed at creation time. This way, if you never need the determinant of a matrix, it won’t ever get computed.

The last methods I want to create for Matrix are “adjugate”, which will compute the adjugate, or adjoint, matrix, and “inverse”, which will compute the inverse of a Matrix. Since we are dealing with integers, computing the inverse will be a bit more difficult than if we were using doubles. Anyway, let’s do it. First, let’s define a “/” operator between matrix and integer. We’ll also define a “%” operator, which is pretty useful when dividing integers.

def /(n: Int): MatrixInt =
new MatrixInt(createArrays(rows, cols, (row, col) => this(row,col) / n))

def %(n: Int): MatrixInt =
new MatrixInt(createArrays(rows, cols, (row, col) => this(row,col) % n))

Next, I’ll do the adjugate:

def minor(row: Int, col: Int) = cofactor(row, col).determinant

lazy val adjugate: MatrixInt = new MatrixInt(createArrays(cols, rows, (row, col) =>
minor(col, row) * sign(col, row)
))

This is a bit confusing because I’m inverting at the same time I’m computing the minors. This saves a transposition step, though.

Finally, I’ll compute the inverse matrix, after asserting the matrix does have an integer inverse:

def isDivisibleBy(n: Int): Boolean =
(this % n).self flatMap (row => row) forall (_ == 0)

lazy val inverse: Option[MatrixInt] =
if (determinant == 0 || !isDivisibleBy(determinant))
None
else
Some(adjugate / determinant)

We finally have something new to look at. In the method isDivisibleBy, we first use the method “flapMap” to unnest the arrays, producing an Array[Int] containing all values of the matrix. Then we use “forall” in that. The method “forall” returns true if, and only if, the function passed to it (“_ == 0”) is true for all elements of the collection on which “forall” was invoked.

Or, in other words, “forall” will check that (this % n) results in a matrix of zeros.

The other interesting thing here is the use of Option. If you haven’t seen Option before, you probably haven’t seen much Scala code before. Option is Scala’s way of saying “this is not available”. An Option is a class, with a subclass called “Some”, and a singleton object called “None”. So where, in other languages, you might see a “null” value, in Scala the library will usually return a None instead.

On the other hand, instead of returning the value you want, it returns Some(value). This way, you need to test to see if the value you want is actually available, and then get it. There are many ways of doing that, and it pays to look up the API for Option.

The very last thing I’ll do with MatrixInt is change the exponentiation to be able to receive negative values, and we’ll see here one way of getting the value for an Option:

def **(n: Int): MatrixInt =
if (rows != cols)
throw new UnsupportedOperationException
else n match {
case 0 => unitMatrixInt(rows)
case 1 => this
case 2 => this * this
case negative if negative < 0 =>
(this ** negative.abs).inverse getOrElse (throw new UnsupportedOperationException)
case odd if odd % 2 == 1 => this ** (odd - 1) * this
case even => this ** (even / 2) ** 2
}

We’ll finish this installment with a couple of methods to solve systems of linear equations. It is tempting to create the methods inside MatrixInt itself, but solving linear equations is something you can do using matrices, not a matrix operation in itself.

I’m not sure what a good abstraction would be. We might just create the functions inside an object, and import them. Instead, I’ll create a small class to do that. First, let’s get a toString method. This method is quick and dirty, and could use some improvements. In particular, we could align the columns, like MatrixInt does, and we could treat negative numbers to avoid showing things like “+ -5”. That, I’ll leave as an exercise! :-)

class LinearEquations(val m: MatrixInt, val r: MatrixInt) {
require(m.rows == r.rows && r.cols == 1)

override def toString =
(for (rowNumber <- 0 until m.rows;
row = m.row(rowNumber).zipWithIndex)
yield (row
map Function.tupled(_+"x("+_+")")
mkString (""," + "," = ")) + r(rowNumber,0))
.mkString("\n")
}

To this class, we’ll add two solvers. First, through the inverted matrix, which should be pretty easy. We’ll also add a method which prints the equation with the unknowns replaced with the solution, so we can easily verify it by eye:

lazy val solutionInverse: Option[MatrixInt] =
if (m.inverse == None)
None
else
Some(m.inverse.get * r)

def solveInverse =
if (solutionInverse == None)
"There is no unique solution"
else
(for (rowNumber <- 0 until m.rows;
row = m.row(rowNumber).zipWithIndex)
yield (row
map Function.tupled((value, col) => value+"*"+solutionInverse.get(col,0))
mkString (""," + "," = ")) + r(rowNumber,0))
.mkString("\n")

We can already see the potential for refactoring. Now, the Cramer’s Rule. This is more difficult, because we need to compute each element of the solution. Though a bit more complex, it isn’t terribly so:

lazy val solutionCramer: Option[MatrixInt] = {
val vector = r.toArray.flatMap(x => x)
if (m.determinant == 0)
None
else
Some(MatrixInt(
for((col: Int) <- (0 until m.cols).toArray)
yield Array(
m.replaceCol(col, vector).determinant / m.determinant
)
))
}

def solveCramer =
if (solutionCramer == None)
"There is no unique solution"
else
(for (rowNumber <- 0 until m.rows;
row = m.row(rowNumber).zipWithIndex)
yield (row
map Function.tupled((value, col) => value+"*"+solutionCramer.get(col,0))
mkString (""," + "," = ")) + r(rowNumber,0))
.mkString("\n")

More refactoring potential. We’ll finish this installment with an example of LinearEquations in action, followed by the complete source code for what we have done so far.

scala> m
res5: Array[Array[Int]] = Array(Array(5, -4), Array(6, -5))

scala> r
res6: Array[Array[Int]] = Array(Array(2), Array(1))

scala> new LinearEquations(m, r)
res2: LinearEquations =
5x(0) + -4x(1) = 2
6x(0) + -5x(1) = 1

scala> res2.solveInverse
res3: java.lang.String =
5*6 + -4*7 = 2
6*6 + -5*7 = 1

scala> res2.solveCramer
res4: java.lang.String =
5*6 + -4*7 = 2
6*6 + -5*7 = 1

And now, the source code:

package com.blogspot.dcsobral.matrix

class MatrixInt private (private val self: Array[Array[Int]]) {
import MatrixInt._

val rows = self.size
val cols = self(0).size

private def _row(n: Int): Array[Int] = self(n)
private def _col(n: Int): Array[Int] = self map (_(n))

def row(n: Int): Array[Int] = _row(n) map (x => x)
def col(n: Int): Array[Int] = _col(n)

def apply(i: Int, j: Int): Int =
if (i >= rows || j >= cols || i < 0 || j < 0)
throw new IndexOutOfBoundsException
else
self(i)(j)

def update(row: Int, col: Int, newValue: Int): MatrixInt =
new MatrixInt(createArrays(rows, cols, (i,j) =>
if (row == i && col == j) newValue else this(i,j)))

def update(what: Dimension, where: Int, newValue: Array[Int]): MatrixInt =
what match {
case Row => replaceRow(where, newValue)
case Column => replaceCol(where, newValue)
}

def replaceCol(col: Int, newValue: Array[Int]): MatrixInt =
new MatrixInt(createArrays(rows, cols, (i,j) =>
if (col == j) newValue(i) else this(i,j)))

def replaceRow(row: Int, newValue: Array[Int]): MatrixInt =
new MatrixInt(createArrays(rows, cols, (i,j) =>
if (row == i) newValue(j) else this(i,j)))

override def equals(other: Any): Boolean = other match {
case that: MatrixInt =>
that.canEqual(this) && self.deepEquals(that.self)
case _ => false
}

def canEqual(other: Any): Boolean = other.isInstanceOf[MatrixInt]

def *(n: Int): MatrixInt =
new MatrixInt(createArrays(rows, cols, (row, col) => this(row,col) * n))

def /(n: Int): MatrixInt =
new MatrixInt(createArrays(rows, cols, (row, col) => this(row,col) / n))

def %(n: Int): MatrixInt =
new MatrixInt(createArrays(rows, cols, (row, col) => this(row,col) % n))

def +(other: MatrixInt): MatrixInt =
if (rows != other.rows || cols != other.cols)
throw new IllegalArgumentException
else
new MatrixInt(createArrays(rows, cols, (row, col) => this(row,col) + other(row,col)))

def -(other: MatrixInt): MatrixInt =
if (rows != other.rows || cols != other.cols)
throw new IllegalArgumentException
else
new MatrixInt(createArrays(rows, cols, (row, col) => this(row,col) - other(row, col)))

def *(other: MatrixInt): MatrixInt =
if (cols != other.rows)
throw new IllegalArgumentException
else
new MatrixInt(createArrays(rows, other.cols, (row, col) =>
this._row(row) zip other._col(col) map Function.tupled(_*_) reduceLeft (_+_)
))

def **(n: Int): MatrixInt =
if (rows != cols)
throw new UnsupportedOperationException
else n match {
case 0 => unitMatrixInt(rows)
case 1 => this
case 2 => this * this
case negative if negative < 0 =>
(this ** negative.abs).inverse getOrElse (throw new UnsupportedOperationException)
case odd if odd % 2 == 1 => this ** (odd - 1) * this
case even => this ** (even / 2) ** 2
}

def toArray = MatrixInt.clone(self)

def transpose = new MatrixInt(createArrays(cols, rows, (row,col) => this(col,row)))

def cofactor(row: Int, col: Int) = new MatrixInt(createArrays(rows-1, cols-1, (i,j) =>
this(i + (if (i < row) 0 else 1), j + (if (j < col) 0 else 1))))

protected def sign(row: Int, col: Int) = if ((col + row) % 2 == 0) 1 else -1

lazy val determinant: Int =
if (rows != cols)
throw new UnsupportedOperationException
else rows match {
case 1 => this(0,0)
case 2 => this(0,0)*this(1,1) - this(1,0)*this(0,1)
case n => (
_row(0).zipWithIndex
map Function.tupled((value, col) => value * cofactor(0, col).determinant * sign(0, col))
reduceLeft (_+_)
)
}

def minor(row: Int, col: Int) = cofactor(row, col).determinant

lazy val adjugate: MatrixInt = new MatrixInt(createArrays(cols, rows, (row, col) =>
minor(col, row) * sign(col, row)
))

def isDivisibleBy(n: Int): Boolean =
(this % n).self flatMap (row => row) forall (_ == 0)

lazy val inverse: Option[MatrixInt] =
if (determinant == 0 || !isDivisibleBy(determinant))
None
else
Some(adjugate / determinant)

override def toString = {
val maxNumSize = self.projection flatMap (_ map (_.toString.size)) reduceLeft (_ max _)
val numFormat = "%"+maxNumSize+"d"
def formatNumber(n: Int) = numFormat format n
val top = rows match {
case 1 => _row(0) map formatNumber mkString ("< ", " ", " >")
case _ => _row(0) map formatNumber mkString ("/ ", " ", " \\") // fix highlighter: "
}
val middle = rows match {
case 1 | 2 => Nil
case _ => self.toList.tail.init map (_ map formatNumber mkString("| "," "," |"))
}
val bottom = rows match {
case 1 => Nil
case _ => List(_row(rows - 1) map formatNumber mkString ("\\ ", " ", " /"))
}

top :: middle ::: bottom mkString "\n"
}
}

object MatrixInt {
import MatrixInt._
implicit def toMatrixInt(m : Array[Array[Int]]) = MatrixInt(m)

def apply(array: Array[Array[Int]]): MatrixInt = new MatrixInt(clone(array))
def apply(rows: Int, cols: Int): MatrixInt = MatrixInt(rows, cols, 0)
def apply(rows: Int, cols: Int, value: Int): MatrixInt =
new MatrixInt(createArrays(rows, cols, ((_,_) => value)))
def apply(rows: Int, cols: Int, f: (Int,Int) => Int): MatrixInt =
new MatrixInt(createArrays(rows, cols, f))

def unitMatrixInt(n: Int) = MatrixInt(n, n, (row,col) => (if (row == col) 1 else 0))

protected def createArrays(rows: Int, cols: Int, f: (Int, Int) => Int) =
for((i: Int) <- (0 until rows).toArray)
yield for((j: Int) <- (0 until cols).toArray)
yield f(i,j)

protected def clone(a: Array[Array[Int]]) =
createArrays(a.size, a(0).size, (row, col) => a(row)(col))

sealed abstract class Dimension
case object Row extends Dimension
case object Column extends Dimension
}

class LinearEquations(val m: MatrixInt, val r: MatrixInt) {
require(m.rows == r.rows && r.cols == 1)

override def toString =
(for (rowNumber <- 0 until m.rows;
row = m.row(rowNumber).zipWithIndex)
yield (row
map Function.tupled(_+"x("+_+")")
mkString (""," + "," = ")) + r(rowNumber,0))
.mkString("\n")

lazy val solutionInverse: Option[MatrixInt] =
if (m.inverse == None)
None
else
Some(m.inverse.get * r)

def solveInverse =
if (solutionInverse == None)
"There is no unique solution"
else
(for (rowNumber <- 0 until m.rows;
row = m.row(rowNumber).zipWithIndex)
yield (row
map Function.tupled((value, col) => value+"*"+solutionInverse.get(col,0))
mkString (""," + "," = ")) + r(rowNumber,0))
.mkString("\n")

lazy val solutionCramer: Option[MatrixInt] = {
val vector = r.toArray.flatMap(x => x)
if (m.determinant == 0)
None
else
Some(MatrixInt(
for((col: Int) <- (0 until m.cols).toArray)
yield Array(
m.replaceCol(col, vector).determinant / m.determinant
)
))
}

def solveCramer =
if (solutionCramer == None)
"There is no unique solution"
else
(for (rowNumber <- 0 until m.rows;
row = m.row(rowNumber).zipWithIndex)
yield (row
map Function.tupled((value, col) => value+"*"+solutionCramer.get(col,0))
mkString (""," + "," = ")) + r(rowNumber,0))
.mkString("\n")
}

Sunday, June 28, 2009

Matrices 4

We have a reasonable Matrix class at this point, but it still has some deficiencies. There are a few common matrix operations we can add to it, and there are some problems related to the usability of the class. Let’s see them.

Equality

If you try “unitMatrixInt(2) == unitMatrixInt(2)”, you’ll get false as result. This happens because we did not provide an “equals” method, and the default “equals” method compares object references. Or, in other words, it tests if the objects are the same, not if the objects are equal. If that’s still unclear, see the following example:

scala> unitMatrixInt(2) == unitMatrixInt(2)
res8: Boolean = false

scala> val a = unitMatrixInt(2)
a: teste.MatrixInt =
/ 1 0 \
\ 0 1 /

scala> val b = unitMatrixInt(2)
b: teste.MatrixInt =
/ 1 0 \
\ 0 1 /

scala> val c = a
c: teste.MatrixInt =
/ 1 0 \
\ 0 1 /

scala> a == a
res9: Boolean = true

scala> a == b
res10: Boolean = false

scala> a == c
res11: Boolean = true

Writing an equals method can be tricky, but I’m not going to discuss here why such a method must be written this or that way, as there are good references on the web. I usually recommend this one.

I will talk about the syntactical aspect of making such a method. First, though the examples show the method “==”, I keep talking about “equals”. That’s because “==”, as well as “!=”, in Scala is implemented in terms of the “equals” method. So if you want to change them, you implement “equals”.

Now, it’s not enough just to call it “equals”. It has to have the correct signature, which is to receive a single parameter of type Any, and return a Boolean. The method below will be good enough, with the addition of a canEqual method, and a small change in our class declaration:

class MatrixInt private (private val self: Array[Array[Int]]) {
...
override def equals(other: Any): Boolean = other match {
case that: MatrixInt =>
that.canEqual(this) && self.deepEquals(that.self)
case _ => false
}

def canEqual(other: Any): Boolean = other.isInstanceOf[MatrixInt]

We changed the constructor so that self got declared as a “private val”, otherwise it won’t be visible even to other objects of the same class. We could also get a copy of a MatrixInt’s array representation, and compare against that, but that can make equals quite slow. To get such a copy, though, we would use the “toArray” method, which I thought I had presented in the first installment of these posts, but I now find I didn’t. Let’s correct that oversight:

def toArray = MatrixInt.clone(self)

We produce a copy of the array, so that changes to it won’t affect this object. Anyway, now the test works as expected:

scala> val a = unitMatrixInt(2)
a: teste2.MatrixInt =
/ 1 0 \
\ 0 1 /

scala> val b = unitMatrixInt(2)
b: teste2.MatrixInt =
/ 1 0 \
\ 0 1 /

scala> val c = a
c: teste2.MatrixInt =
/ 1 0 \
\ 0 1 /

scala> a == a
res12: Boolean = true

scala> a == b
res13: Boolean = true

scala> a == c
res14: Boolean = true


Updating

We also want to check to see it returns false for different matrices. To do that, though, we would like to produce a different matrix easily. Unfortunately, we have no way to do that. Well, we can generate unit matrices of different sizes, but if we just wanted to change a small value in one we have, we can do that.

Since our class is immutable, we do not expect it to actually change to something else, but when we do “1+1”, we do not change “1”, we just produce a new value. We can do the same thing, and enable individual cell changes in our matrix, like this:

def update(row: Int, col: Int, newValue: Int): MatrixInt =
new MatrixInt(createArrays(rows, cols, (i,j) =>
if (row == i && col == j) newValue else this(i,j)))

Here are some tests, then:

scala> val a = unitMatrixInt(2)
a: teste3.MatrixInt =
/ 1 0 \
\ 0 1 /

scala> a == unitMatrixInt(3)
res20: Boolean = false

scala> a == unitMatrixInt(2)
res21: Boolean = true

scala> a == (a + a)
res22: Boolean = false

scala> a == a*2
res23: Boolean = false

scala> a == a * a // True! 1 == 1 * 1
res24: Boolean = true

scala> val b = a(1,0) = 1
b: teste3.MatrixInt =
/ 1 0 \
\ 1 1 /

scala> a
res25: teste3.MatrixInt =
/ 1 0 \
\ 0 1 /

scala> a == b
res26: Boolean = false

So, where’s “update”? It is actually used when declaring “b”. When you write “id(…) = value” in Scala, the compiler translates that to “id.update(…,value)”. So we could have also written this:

scala> val b = a.update(1,0,1)
b: teste3.MatrixInt =
/ 1 0 \
\ 1 1 /

Let’s also make it easy to change a whole column or a whole row. We can already do “a.row(0)” or “c.col(1)”, so doing “a.row(0) = …” would be nice. Scala has something similar, in that any method called “field_=” will be called if you write “id.field = …”. Unfortunately, if we try that here it won’t work, because “id.field” is already the name of a method which takes a parameter (the row or column). If it did not take any parameters, it would work. As it is, we’ll just have to make do with something else:

def replaceCol(col: Int, newValue: Array[Int]): MatrixInt =
new MatrixInt(createArrays(rows, cols, (i,j) =>
if (col == j) newValue(i) else this(i,j)))

def replaceRow(row: Int, newValue: Array[Int]): MatrixInt =
new MatrixInt(createArrays(rows, cols, (i,j) =>
if (row == i) newValue(j) else this(i,j)))

And we can use it like this:

scala> val a = unitMatrixInt(3)
a: teste6.MatrixInt =
/ 1 0 0 \
| 0 1 0 |
\ 0 0 1 /

scala> a.replaceCol(0, a.col(2))
res39: teste6.MatrixInt =
/ 0 0 0 \
| 0 1 0 |
\ 1 0 1 /

There is one alternative, though, which I show below. I’m using the update method, and a symbol to identify what is being updated:

def update(what: Symbol, where: Int, newValue: Array[Int]): MatrixInt =
what match {
case 'row => replaceRow(where, newValue)
case 'col | 'column => replaceCol(where, newValue)
case _ => throw new IllegalArgumentException
}

scala> val a = unitMatrixInt(3)
a: teste7.MatrixInt =
/ 1 0 0 \
| 0 1 0 |
\ 0 0 1 /

scala> a('row, 1) = a.row(0)
res41: teste7.MatrixInt =
/ 1 0 0 \
| 1 0 0 |
\ 0 0 1 /

scala> a('column, 2) = a.row(0)
res42: teste7.MatrixInt =
/ 1 0 1 \
| 0 1 0 |
\ 0 0 0 /

Symbols are similar to String. The technical differences aren’t really much important – it’s their intent usage that differs. While strings are made to be shown to a user or received from them, a symbol is a piece of information exchanged between different parts of a program. They are very popular in some languages, but not so much on Scala.

One reason for their lack of popularity is that you don’t get any compile-time checks using them. If I misspelled “colunm” instead of “column”, the program would compile, and I’d get an exception at run-time – if I was lucky enough to catch it before the user.

Since we are using a statically typed language, we would like to catch such errors at compile-time. We can do that with Enumerations. Using a Scala enumeration is a bit more complex than other languages, as it’s not built in the language, but a library like anything else. It will provide just the checks we want, though.

First, we need to declare an object to be our enumeration. We’ll do this in the MatrixInt object-companion. We’ll also declare a type inside the object, to be used in our method declaration. It goes like this:

object Dimension extends Enumeration {
type Dimension = Value
val Row, Column = Value
}

“Value”, here, is both a subclass of Enumeration and a method which will return a new instance of type Value each time it is called. These instances will always be different between them. Note that when you declare “val Row, Column = Value”, Scala converts this into “val Row = Value; val Column = Value”, so “Value” gets invoked twice. If “Value” was called only once, then Row would be equal to Column, which is not something we want.

We might like to import the contents of the Dimension object inside the class too, so that we can write “what: Dimension” instead of “what: Dimension.Dimension” (or even “what: MatrixInt.Dimension.Dimension”, if we hadn’t imported object MatrixInt’s contents already). Simple line:

import MatrixInt.Dimension._

Now we can rewrite our method:

def update(what: Dimension, where: Int, newValue: Array[Int]): MatrixInt =
what match {
case Row => replaceRow(where, newValue)
case Column => replaceCol(where, newValue)
}

It’s important that Row and Column were declared capitalized – the first letter is uppercase. If we used “row” and “column”, then the match would simply assign “what” to “row”, like it did to “even” and “odd” in “**”, and execute the first statement. Since we used capitalized identifiers, the case statement compares “what” to the value of Row, as if it was a literal, and assigns nothing.

Note also that we dropped the default case. One has to be careful, though, because the compiler won’t warn against missing case statements. Here are some usage examples:

scala> a(Row, 2) = a.row(1)
<console>:13: error: not found: value Row
a(Row, 2) = a.row(1)
^

scala> a(Dimension.Row, 2) = a.row(1)
res1: teste.MatrixInt =
/ 1 0 0 \
| 0 1 0 |
\ 0 1 0 /

scala> import Dimension._
import Dimension._

scala> a(Row, 2) = a.row(1)
res2: teste.MatrixInt =
/ 1 0 0 \
| 0 1 0 |
\ 0 1 0 /

Can we do something about not getting warnings about missing case statements? As a matter of fact, yes, we can. Let’s rewrite once again our method. This time, we’ll use a case objects. Let’s show it first, and then discuss it. First, we replace the whole Dimension object by this:

sealed abstract class Dimension
case object Row extends Dimension
case object Column extends Dimension

Since we don’t have any nested declaration, we can do away with the extra import line. As for the rest… the update method doesn’t actually need any changes! The type, Dimension, has been declared. Row and Column are singleton objects, capitalized so that they can be used inside case statements as if they were literals.

There are two new things here. The “sealed” keyword, and the “case” keyword. Anyone reading about Scala has probably seen case classes. A case class is a class mostly like any other, but with a few methods automatically created by the compiler. A case object is similar to case class, and used instead of them when a case class would not declare any parameter in its constructor.

The “sealed” keyword means any subclass of “Dimension” must be declared in the same file as it. This way, the compiler can know, without doubt, all subclasses of Dimension, and, thus, perform checks when they get used in match statements. Leave “case Column” out of “update”, and you’ll get this warning:

MatrixInt.scala:70: warning: match is not exhaustive!
missing combination Column

what match {
^
one warning found

The usage won’t differ much either, and, in fact, won’t require any import beyond the object-companion’s contents:

scala> val a = unitMatrixInt(3)
a: teste3.MatrixInt =
/ 1 0 0 \
| 0 1 0 |
\ 0 0 1 /

scala> a(Row,2) = a.row(0)
res15: teste3.MatrixInt =
/ 1 0 0 \
| 0 1 0 |
\ 1 0 0 /

You might wonder, then, why ever use an Enumeration instead of case objects. As a matter of fact, case objects do have advantages many times, such as here. The Enumeration class, though, has many Collection methods, such as elements (iterator on Scala 2.8), which returns an Iterator, map, flatMap, filter, etc.

We have seen three different ways of implementing our “update” method for rows and columns, each with different characteristics. The very thing that made symbols a bad choice for us – the inability to define what symbols are possible in advance – can be a necessary characteristic in other situations. This goes to the very heart of Scala’s philosophy: do not dictate a solution, but let the programmer find and use the optimal one for his needs.

In the next installment we’ll look into computing cofactors, determinants, inverse matrices and then use all that to solve linear equations. I’m afraid there won’t be almost anything new, but, on the positive side, it shouldn’t take long either, and we’ll end up with a reasonable complete Matrix class.

After that, we’ll look into branching Matrix into other types, and the trade-offs we must make.

Friday, June 26, 2009

A Mind is a Terrible Thing to Waste

I have just seen a presentation about Agilists and Architects, which resonated with an old thought of mine: the waste of experience in the software development process.

Writing software – programming – is something that takes a huge amount of experience to get right. You have to know the language, of course. You have to know it well enough to avoid getting things wrong because of an unknown feature or characteristic of the language. You have to know the language libraries, frameworks, and development tools. This is the very basic.

You have to know patterns, and you have to know anti-patterns too. You need experience to realize that a pattern apply to a problem, or that some solution is inching toward an anti-pattern.

You have to know about security, about protocols, about distributed systems, about parallel systems, about algorithms – of so many types I can’t even list them here. You have to know about databases, persistence, XML, MVC, DSL, regex, grammars, state machines… we could go on and on. You have to know usability.

You have to know about programming paradigms. That’s easy, there’s only four or five of them. Or maybe more, depending on who’s definition you are using. And, of course, each paradigm has many variations preferred by one language or another – and who has to deal with a single language these days?

You have to know computers, and experience here will bring you understanding of performance issues. Experience will also let you identify bugs, or avoid them altogether.

Then, of course, you have to know CMMI, PMBoK, FPA or any of dozens standards and best practices and processes for software development, quality, testing, team working, etc.

Finally, of course, you have to know how to communicate with other people, understand the business needs, know the existing systems, the explicit and implicit rules, and, if possible, why it exists in the state it is, why certain decisions were taken, what were the conditions that might change and make it interesting – or even necessary – to take another path, or to avoid certain paths.

Each of those things you know or don’t have a direct impact on the quality of the code you write. Sure, you can write software with only a small glimpse of all this knowledge, but many times you would have written better, more efficient and more maintainable code if you only knew there was a word to what you were doing. Whole classes might have crumbled to a single library call. You might have avoided code which is intrinsically insecure, protocols which are unfixable. Hell, you might have avoided trying to solve problems known to be unsolvable.

While deep knowledge is clearly the domain of experts in a field, all those domains come together through the hands of the generalist, who has to know something exists before he can use it.

So there are architects, who are meant to have all that experience, and lay down the trail for those inexperienced to follow. Only, of course, it doesn’t work that way. The sheer level of detail required in developing software preclude architects from being able to know or influence but a thin, though important, layer of all there is.

It couldn’t be other way, in truth – if it were possible to write a precise specification of what the software should do, what should be its structure, how it should go about doing what it is supposed to do, well, if it were possible to write such a specification, then we could feed it to the computer and be done with it. In fact, we can write those specifications – they are called “source code”. Source code is not what a computer executes. It’s a specification detailing how to produce the instructions the computer will follow to provide the service required.

In short, or, rather, in conclusion, programming experience results in code with fewer bugs, that satisfy more completely the users needs, easier to adapt to new needs, and in a shorter time.

So why the hell does any company even contemplates hiring “junior” programmers? No, let me rephrase that. Why is the average experience of programmers working – coding – for corporations so low? There is an obvious – to me – inversion of value going on there. And looking at companies that make software to be sold as service or at retail – as opposed to producing software to someone else’s specs – I don’t see much disagreement.

I can see a few reasons why that might be true, of course. First, just because someone acquires experience doesn’t mean he or she learns new stuff. And the transition from mainframe to personal computing pretty much proved that can be detrimental – when people cling to what they know against the unknown, instead of avoiding what they know to be wrong, or taking the better one of known alternatives.

So, corporations “solve” the problem by making experienced people into architects and managers. Being a manager has a quality of its own, and having experience writing software does not really add anything to it. Being an architect takes you too far from where you can make most use of your experience.

Hopefully, if you are lucky enough to work somewhere software architects have meaningful influence on the software development process, you might get programmers to avoid huge mistakes. And if your corporation is really lucky, you’ll do so without critically impairing the software development process with decisions detached from the reality of the problems being faced.

I was happy, then, with the ideas presented by Rebecca Parsons and Martin Fowler. It pains me to see so much experience and knowledge wasted because of misguided – or misinformed – processes.

Thursday, June 25, 2009

Matrices and Fibonacci 3

Continuing our series on creating a Matrix class for Scala, now that we have the basics laid down, let’s go back to the Fibonacci problem. We need to do exponentiation of matrices. To do that, we first need to get multiplication going.

While I have no intention of explaining how to multiply matrices – the web is full of resources if you aren’t familiar with it – it’s important to restate the problem:

To find element of (row i, column j) of the matrix resulting from AxB, I need to multiply the nth element of row i of matrix A by the nth element of column j of matrix B, for all elements of said row and column, and then add the results. An initial, misguided, attempt might go like this:


(for(row <- 0 to rows
col <- 0 to B.cols
) yield A(i,col)*B(row,j)) reduceLeft (_+_)


The problem here is that you are multiplying each element of the row by every element of the column. What we want to do is to “pair” the first element of the row with the first element of the column, the second element of the row with the second of the column, and so forth, and then multiply each pair. This pairing is called “zip” in Scala. You’ll find “zip” – and the occasional variation, such as zipWithIndex – on sequence-like collections, such as List and Array. The zipping thing goes like this:


_row(i) zip other._col(j)


That will create an Array of tuples of Int: Array[Tuple2[Int,Int]], or simply Array[(Int,Int)]. Tuples can be thought of as a small “package” of objects. It’s not a collection – you do not traverse it or iterate through it. Also, each component of it has its own type, as you can see. The only thing you can do with tuples, aside from passing it around, is accessing a tuple’s element. You do this with the methods “_1”, “_2”, “_3”, etc.

Next, we need to do _1*_2 for each element of that array. You can’t do "map (_._1 * _._2)", as this translates into "map ((x1, x2) => x1._1 * x2._2)", but map only passes a single argument: the tuple. You can do this, then: "map (t => t._1 * t._2)". You can, but I won’t… Instead, I’ll pass "(_*_)", an anonymous function that takes two parameters and return the result of their multiplication, to a factory of functions.

Factories, as anyone familiar with OOP might know, are methods which create new instances of objects. So, what that has to do with a function? Well, a function in Scala is an object that implements the method “apply”. You might wonder if MatrixInt isn’t a function, then, as it implements apply itself. In fact, it could be one, if we chose to extend MatrixInt with the trait Function2 (two arguments to apply).

At any rate, we’ll use a function factory to produce a new instance of a function, based on (_*_). This factory, called Function.tupled (method “tupled” on singleton object Function), takes a function which receives multiple parameters, and converts it into a function which receives a single tuple instead. After that, we can apply the reduceLeft. Our computation, then, will go like this:


this._row(i) zip other._col(j) map Function.tupled(_*_) reduceLeft (_+_)


The method, then, is thus defined:


def *(other: MatrixInt): MatrixInt =
if (cols != other.rows)
throw new IllegalArgumentException
else
new MatrixInt(createArrays(rows, other.cols, (row, col) => {
this._row(row) zip other._col(col) map Function.tupled(_*_) reduceLeft (_+_)
}))


If we do away with the check to see if the matrices are compatible in size, and make allowance for the verbosity of the identifiers being used (like row and col instead of I and j), it comes down to a one-liner. Truly:


def *(o: MatrixInt) = new MatrixInt(createArrays(rows, o.cols, (i,j) => _row(i) zip o._col(j) map Function.tupled(_*_) reduceLeft (_+_)))


The only thing left to do is exponentiation. We could do it as a one-liner too:


def ^(n: Int) = (0 until n).foldLeft(unitMatrixInt(rows))((acc,_) => acc*this)


The method foldLeft is similar to reduce. But where reduce can be thought of as (((a+b)+c)+d)+e, foldLeft is more like this:


var acc = ?
acc = op(acc,a)
acc = op(acc,b)
acc = op(acc,c)
acc = op(acc,d)
acc = op(acc,e)


The most important difference between foldLeft and reduceLeft is that the result of foldLeft, and its accumulator, might be of a different type than the elements the operation is applied upon. With reduceLeft, once you applied acc = op(a,b), if acc where of a different type, then op(acc,c) would be a different method. It might work with some implicits or subtypes, but not for entirely different types.

Now, I’m not using the range for nothing, except counting the number of times I want to multiply the matrix. Acc is initialized with “unitMatrixInt(rows)”, which would be a method producing a “unit” matrix of the appropriate size. The “1” of matrices. Then I keep multiplying “this” against this accumulator.

I could create a list of copies of “this”, and then multiply them with reduceLeft too. It would go like this:


def **(n: Int) = List.fill(n)(this) reduceLeft (_*_)


Much simpler, but I wanted to mention foldLeft somehow. :-)

While simplicity has its merits, there is a simple optimization when doing exponentiation by taking advantage of common factors. I went for it in my final implementation:


def **(n: Int): MatrixInt =
if (rows != cols n < 0 =""> unitMatrixInt(rows)
case 1 => this
case 2 => this * this
case odd if odd % 2 == 1 => this ** (odd - 1) * this
case even => this ** (even / 2) ** 2
}


I'll make a note on the method name, here. I chose ** as this is one of the common names for such an operation on computer languages. Another, less common, name is ^. I prefer the later, but it has a very low precedence. In fact, this very method would be broken as written, because "(odd - 1) * this" would have precedence over "this ** (odd - 1)", which would completely invert the logic and cause a compilation error.

Now, aside from , ^, &, <>, = and !, :, +, -, * and / and % (ordered by precedence, by the way), all other special characters have a higher precedence. If I called it "@", for instance, then "a * b @ c" would behave as one would expect it too. Unfortunately, no one would expect "@" to be an exponentiation character. But if I were to make extensive use of this library in formulas and expressions, I might well go for something like that, to prevent unexpected errors due to precedence.

Anyway, we need to define unitMatrixInt too. We’ll do so in a companion object. This way, users of this package can create unix matrices if they want to. To create this method, we’ll need a more general factory method, though. We already have most of what we need in createArrays.

But before I get into that, I want to revise my class definition. Presently, this is what we have:


class MatrixInt(array: Array[Array[Int]]) {
private val self = clone(array)
}


I receive a matrix, and then replicate it internally, so I can be sure it won’t be changed outside my class – and I take care of not changing it inside. But there is a downside to it: “array” continues to be part of my object (see footnote 1)! It becomes a private reference inside my object, and this has some unfortunate consequences. For one thing, since each instance of my class makes reference to the arrays used to build them, those arrays never get deallocated. Now think about all references to “new MatrixInt” inside the class. I use that for every operator. That means I keep TWO copies of a MatrixInt’s array in memory, even when the copy used for construction gets immediately discarded.

This presents a conundrum.On one hand, if the instances of my class are immutable, then I need the full array to begin with. On the other hand, if I receive an array from an external source, I need to copy it before using, as the original array might get changed by whoever passed it to me.

We solve this with the help of a private constructor and an object companion. A private constructor is, of course, a constructor that cannot be accessed from outside the class. The private constructor thing is easy:


class MatrixInt private (array: Array[Array[Int]]) {


Now, how to access it? We cannot define another constructor inside MatrixInt with the same signature – receiving an array of array of Int as parameter – because it would conflict with the first. What we do, then, is use an object companion.

Simply put, an object companion is a singleton object – declared with the “object” keyword – which has the same name as a class. That grants both the object and the class visibility of each other’s private definitions.

We’ll then define an “apply” method in that object companion. As you may remember, the “apply” method receives special treatment from the Scala compiler, which enables you to do away with the name – apply – and just pass the parameters to the instance of your object. Since we have to call our object companion “MatrixInt”, then MatrixInt(…) will be converted by Scala into MatrixInt.apply(…). It looks like the instantiation of a new object, only it lacks the “new” keyword.

Let’s define it, then. First, let’s remove the “val self” from the class, and just receive it in its private constructor. As that constructor will only be accessed through MatrixInt’s object companion, that will be safe.


class MatrixInt private (self: Array[Array[Int]]) {
import MatrixInt._


The import statement makes it possible to access the methods of the object companion without preceding them with “MatrixInt.”. Being a companion object makes the private methods accessible, but does not bring them into scope automatically.

Now, for the object companion, let’s create it, the factory for MatrixInt, and a couple of helpful factories. We’ll also transfer the “clone” and “createArrays” methods to it, as this method does not make reference to a MatrixInt’s own data, so it doesn’t need to be part of each instance. The definition of toArray needs “MatrixInt.” prepended to “clone” to avoid an ambiguous reference, but nothing else should change.


object MatrixInt {
import MatrixInt._
implicit def toMatrixInt(m : Array[Array[Int]]) = MatrixInt(m)

def apply(array: Array[Array[Int]]): MatrixInt = new MatrixInt(clone(array))
def apply(rows: Int, cols: Int): MatrixInt = MatrixInt(rows, cols, 0)
def apply(rows: Int, cols: Int, value: Int): MatrixInt =
new MatrixInt(createArrays(rows, cols, ((_,_) => value)))
def apply(rows: Int, cols: Int, f: (Int,Int) => Int): MatrixInt =
new MatrixInt(createArrays(rows, cols, f))

def unitMatrixInt(n: Int) = MatrixInt(n, n, (row,col) => (if (row == col) 1 else 0))

protected def createArrays(rows: Int, cols: Int, f: (Int, Int) => Int) =
for((i: Int) <- (0 until rows).toArray) yield for((j: Int) <- (0 until cols).toArray) yield f(i,j) protected def clone(a: Array[Array[Int]]) = createArrays(a.size, a(0).size, (row, col) => a(row)(col))
}


We can now create a MatrixInt by using MatrixInt(Array(Array(…))), but we didn’t – and won’t – change that in the class. The companion object factory produces a clone of the array, just as we wanted. But the arrays produced by MatrixInt’s methods don’t get exposed, so there is no risk in using them. So, by keep using the private constructor inside MatrixInt itself, we prevent unnecessary copies of the arrays produced by each of MatrixInt’s operators.

We’ll finish today with Fibonacci, but we’ll continue later on. I’ll add some methods for completeness to MatrixInt, and then look into MatrixDouble and solve some linear equations with it.

After that, we’ll briefly explore ways to make our Matrix class parametrizable, though I’m still in doubt about performance. Still, I’m reading the comments, and I plan to address them one way or another in these posts.

So, the Fibonacci!


def fibonacci(n: Int) = (MatrixInt(Array(Array(1,1),Array(1,0))) ** n)(1,0)


Footnote 1:

A reader has brought to my attention that, actually, "array" is not being preserved. I haven't found in Scala specification anything about this, so I'm putting it down to optimization. But because it is not explicitly documented, I advise against taking advantage of it.

Do note, please, that if you declare "array" to be protected, val or var, or if any method makes use of it, then it will be preserved in the object.

Wednesday, June 24, 2009

Matrices 2

I’ll continue today with a little series for beginners. On my first post I forgot something important. One of the very first things I do with a class is create a toString method. This way, I get instant feedback on how my class is doing, as I paste it into the REPL, and try out the methods I defined.

So, let’s think of a toString method. Since all data in my class is stored in the private val “self”, we could simply delegate the task of producing a string:


override def toString = self.toString


That might do at first, but it would be unreasonable to expect something better formatted than that. First, let’s see if we can’t get the output in multiple lines, using mkString:


override def toString = self mkString “\n”


Unfortunately, the output of that looks like this:


res6: String =
[I@196a753
[I@1c36f46


This wouldn’t happen with a List, but an Array is implemented to be compatible with Java, and Java had Arrays have some intrinsic limitations, as they were present in the language from the very beginning, much before generics, and some design decisions about them proved to be unfortunate. Anyway, let’s fix that:


override def toString = self map (_ mkString “ “) mkString “\n”


Better, but we can improve on that still. For one thing, it would be much better if the columns were aligned. Now, to align the columns, we first must know how big a column can get. Or, in other words, we must know the size of the number with the biggest string representation in the matrix.

So, summing it up, we have to look through all of the matrix elements, get their sizes, and choose the biggest one. Getting a number’s size is easy: number.toString.size. Now, if we had a list, this would be easy, but we have an array of arrays. When we have nested collections and we want to unnest them, we use flatMap. For instance:


scala> Array(Array(15,0),Array(2,-30)) flatMap (x => x)
res19: Array[Int] = Array(15, 0, 2, -30)


The flatMap operation we used, (x => x), simply returns what it was given. If all you want is unnest the collections, this suffices. But, actually, we want the number sizes. Since “x” is an Array, not a number, we can’t do flatMap (x => x.toString.size). But we can replace the numbers with their sizes, and then unnest the arrays, like this:


scala> Array(Array(15,0),Array(2,-30)) flatMap (_ map (_.toString.size))
res20: Array[Int] = Array(2, 1, 1, 3)


Next, we want to get the biggest one. We have a collection of number sizes, and we want to reduce it to the single biggest one. We have two such methods in Scala: reduceLeft and reduceRight. Since “max” is commutative, both will produce the same result. But reduceLeft has a better performance, as it iterates over the collection in the order given. The final code should look like this, then:


val maxNumSize = self flatMap (_ map (_.toString.size)) reduceLeft (_ max _)


Pretty easy, eh? My first version had three lines, and was much more confusing to read. It helps stating what your problem is – you’ll often see you can do exactly that in Scala. So, we have a size, what do we do with it? Well, let’s just make sure all numbers are printed to that size. If we format them with “%2d”, for instance, every number will take at least two characters, with the left side being filled with a space as needed. To produce such formatted numbers, we’ll make use of RichString’s “format” method:


override def toString = {
val maxNumSize = self.projection flatMap (_ map (_.toString.size)) reduceLeft (_ max _)
val numFormat = "%"+maxNumSize+"d"
self map (_ map (numFormat format _) mkString “ “) mkString “\n”
}


That’s it! I added a “projection” there to make it a bit more performatic, as it doesn’t produce intermediate arrays. On Scala 2.8, that method gets replaced with “view”, with “projection” remaining as deprecated. Well, that was so easy I want to do more. First, I want to produce little vertical bars on the sides, to make clear the matrix boundaries, in case we are printing them side by side. That just requires changing the inner mkString to mkString(“| “, “ “, “ |”). But I also want to bend the bars up and down, to indicate the top and bottom lines, as well as produce something special for one-line matrices.

For all of them, we are just making small changes in the inner mkString. We have one for the top line, one for the intermediate ones, and one for the bottom line. We could make a list of all these lines, and then use mkString on the list, but we have to be careful, because the “bottom” line and the “intermediate” lines might not exist, and the top line has two different formats depending on whether the matrix has more than one line or not. Taking that into account, here is a much larger, but, hopefully, nicer toString method:


override def toString = {
val maxNumSize = self.projection flatMap (_ map (_.toString.size)) reduceLeft (_ max _)
val numFormat = "%"+maxNumSize+"d"
def formatNumber(n: Int) = numFormat format n
val top = rows match {
case 1 => _row(0) map formatNumber mkString ("< ", " ", " >")
case _ => _row(0) map formatNumber mkString ("/ ", " ", " \\")
}
val middle = rows match {
case 1 | 2 => Nil
case _ => self.toList.tail.init map (_ map formatNumber mkString("| "," "," |"))
}
val bottom = rows match {
case 1 => Nil
case _ => List(_row(rows - 1) map formatNumber mkString ("\\ ", " ", " /"))
}

top :: middle ::: bottom mkString "\n"
}


In a parting note, someone suggested one technique to be able to parametrize the class. There are many, but they are slower than using "AnyVal" types directly. If I happen to be mistaken, please do post about it at length!

Tuesday, June 23, 2009

Matrices and Fibonacci 1

When I was beginning to learn Scala, I decided one of my projects would be rewrite this stuff in Scala. The idea behind that is to compute Fibonacci Numbers using matrices. I won’t bother describing how it is done, as this is well covered on both of those links. Suffice to say that I need to computer the nth power of a 2x2 matrix of integers. This project was all but forgotten for a long time…

So it happened that just yesterday I was cleaning my to-do list, and I came upon that. I took a second look into it, and decided to take the plunge. Now, Raganwald uses an optimization which enables him to do the operations on vectors, but, by now, I actually thought the whole exercise was trivial and pointless.

I took a brief look into matrix libraries for Scala. I found the Scalala (as in Scala Linear Algebra) library, which does everything I need, very efficiently at that. My problem would be reduced to something like this (untested):


def fib(n) = n match {
case 0 => throw new IllegalArgumentException
case 1 => 1
case _ =>
val m = DenseMatrix(2,2)(1); m(1,1) = 0
val mPowered = m :^ n
mPowered(0,1)
}


Boring, wasn’t it? Yeah, I thought so too. Instead, I decided to write my own Matrix library. Pretty trivial stuff, but at least I’d take something out of it, I thought. So, I’ll talk about that: writing my Matrix library. If that will bore you, you might as well stop right here. I’ll show, though, some of the design decisions and constrains I faced. For someone new to Scala, it might be interesting read.

Well, then, where to start? With a class, of course. Ideally, I’d like to write something like this:


class Matrix[T]


This way, I could use matrices of Double for most use cases, of Ints or Longs for things like the Fibonacci problem, and BigDecimal for tough stuff. Unfortunately, we can’t do that. We need access to operators such as +, - and * over type T, and that means I need an upper bound for T defining these operators. Unfortunately, these types do not share an interface defining those elements. You see, they are not really “classes” as far as Java is concerned, and not being a proper class, they can’t share an abstract class or interface with these methods. Well, BigDecimal is a class, alright, but that’s it.

Because of this, I decided, instead, to define my class as MatrixInt. This was enough for Fibonacci, but later I created a MatrixDouble as well, just to finish implementing the most basic methods such a class would need.

There is a way around that. I can get away with a structural type, but that would be so slow as to defeat the whole purpose of a Matrix class. So I’ll skip it altogether.

The next design decision is how to store my data, and what constructor to use. Generally speaking, there are two strategies for storing a matrix: as a dense matrix or as a sparse matrix. A dense matrix is simply stored as an addressable, fixed-size, contiguously allocated buffer. Or, in other words, an array. A sparse matrix is used for matrices that are very large in their dimensions, but are mostly populated with zeros. Performance can go either way, depending on what kind of data and algorithms you are using those matrices for. For Fibonacci, dense is the way to go. Since it is also the easier way to go, I chose that way.

We also need to know how many dimensions the matrices may have. For simplicity sake, I chose bi-dimensional matrices, and that’s it. With all that in mind, my class thus began:


class MatrixInt(array: Array[Array[Int]]) {


I decided to take an array of proper dimensions as my constructor, instead of providing something special myself. With Scala 2.8, you can pass Array.ofDim(2,2) and be done with it. We’ll see something similar further down too.

Now, one important thing to consider is that an Array is a mutable object. I have decided to make my matrices immutable, so that I could use them just like I use Ints or Doubles. But, to ensure that is true, I can’t really on the array passed with the constructor, as it might change. So, first thing I do is copy it. Here:


private def clone(a: Array[Array[Int]]) =
for((i: Int) <- (0 until a.size).toArray)
yield for((j: Int) <- (0 until a(0).size).toArray)
yield a(i)(j)

private val self = clone(array)


This clone array method will help in places other than the constructor. For instance, if the user wants the Array representation of the matrix, I have to clone my internal representation before passing it on. But there are more interesting things about it. Let’s consider it…

One thing to notice about it, is that instead of having a single for/yield statement, I have two, the second being fed to the first’s yield. It is built this way because I want to create an Array of Arrays. Each yield will result in one collection. Two collections, two yield statements.

Then we have ranges: (0 until a.size) and (0 until a(0).size). A statement like (0 until 5) will produce a range which, when iterated over, will yield 0, 1, 2, 3 and 4. It can do a few other things too, but let’s not be bothered by them. Note that if you want an “inclusive”, range, one which will include the upper limit (5, in the example), you use “to” instead of “until”. Or you create a new range from the existing one with “inclusive”.

Next, “.toArray”. Though a range will do as far as generating indices, I want yield to return an Array. If you want yield to return collection X, pass a collection X to the for statement. Therefore, “.toArray” to convert the range into an array.

The final thing to consider is the last yield statement. Yield a(i)(j). As I started writing my methods, I got very similar patterns, with only the last yield changing. So I started to consider how to reuse that code. The thing to consider, here, is that the intent of this double for/yield pattern is to create an array of arrays of the proper size, and populate it with a function of my choosing. Thinking about it that way, a refactoring might be done as follow:


protected def createArrays(rows: Int, cols: Int, f: (Int, Int) => Int) =
for((i: Int) <- (0 until rows).toArray)
yield for((j: Int) <- (0 until cols).toArray)
yield f(i,j)

protected def clone(a: Array[Array[Int]]) =
createArrays(a.size, a(0).size, (row, col) => a(row)(col))


Here I changed from private to protected, as I decided if I ever subclassed it, there would be no point in hiding these methods. I do not publish them, though, because they are helper methods, instead of methods acting upon the object itself.

Next, a.size and a(0).size were a bit cryptic, weren’t they? When we work on the matrices algorithms, it would be preferable to call them by what they are:


val rows = array.size
val cols = array(0).size


We would also like to be able to get a particular row or a particular column. It seems getting a particular row would be easy. Just return the subarray:


def row(n : Int): Array[Int] = self(n)


There is a serious problem with that, though. I’m returning a mutable object from my own internal data! We could solve that with clone, but clone only worked for bi-dimensional arrays. Well, before we create new clone, let’s consider a simpler alternative:


private def _row(n: Int): Array[Int] = self(n)
def row(n: Int): Array[Int] = _row(n) map (x => x)


The private definition is here so we can avoid creating new arrays unnecessarily. Internally, we’ll make sure we never change the value of our arrays. The public method creates a new array by the magic of “map”. The method “map” is, in fact, used when you do a for/yield, along with flatMap when the for/yield has multiple iterators. So, in fact, the createArrays method is actually implemented as using map.

A map will take a collection and produce a new collection with a given mapping. In this case, our mapping is (x => x), or, in other words, given x, return x, producing a copy of that array. We define cols in a likewise fashion:


private def _col(n: Int): Array[Int] = self map (_(n))
def col(n: Int): Array[Int] = _col(n)


In this case, the private definition and the public definition are the same. Still, I decided to keep a private definition so that I could use _row and _col throughout my class. We’ll also define a method to access a particular element of our matrix:


def apply(i: Int, j: Int): Int =
if (i >= rows || j >= cols || i < 0 || j < 0)
throw new IndexOutOfBoundsException
else
self(i)(j)


The method “apply” has a special meaning in Scala, as the compiler translates object(parms) into object.apply(parms). This means that I will be able to access a particular element of a matrix m with “m(i,j)”. Of course, because a matrix has bounds, I must test for them, and throw an appropriate exception if need be.

Note that I do not declare an @throws annotation, though I might if I wished to. Also, note that I’m returning the result of the if/else expression, and its type should be Int. But in the if statement, I’m throwing an exception, so what gives?

As it happens, throwing an exception has type Nothing, which is subtype to everything. I’m not sure if the compiler actually assigns a type to “throw”, as this is a keyword. But if I used Scala’s library “error” function instead, the question of the type would be important. For methods and functions which never return, such as error, declaring its type to be Nothing makes it possible to make use of them inside statements whose return type is important.

We’ll stop today with just three more definitions. Let’s multiply a matrix by an integer, producing a new matrix, as well as add and subtract two arrays. Here is how:


def *(n: Int): MatrixInt =
new MatrixInt(createArrays(rows, cols, (row, col) => this(row,col) * n))

def +(other: MatrixInt): MatrixInt =
if (rows != other.rows || cols != other.cols)
throw new IllegalArgumentException
else
new MatrixInt(createArrays(rows, cols, (row, col) => this(row,col) + other(row,col)))

def -(other: MatrixInt): MatrixInt =
if (rows != other.rows || cols != other.cols)
throw new IllegalArgumentException
else
new MatrixInt(createArrays(rows, cols, (row, col) => this(row,col) - other(row, col)))

Monday, June 22, 2009

Catching Exceptions

A fellow blogger was complaining about Google Code returning an exception instead of null, which made his code look ugly. Not so, says I! Scala (and monads, I suppose) come to rescue:


def exceptionToLeft[T](f: => T): Either[java.lang.Throwable, T] = try {
Right(f)
} catch {
case ex => Left(ex)
}


exceptionToLeft(obj.methodWhichMightThrow()) match {
case Right(x) => // do stuff with x
case Left(ex) => println("Oops, got exception " + ex.toString)
}


Enjoy!

Update:

Scala 2.8 has library help for this, at scala.util.control.Exception. It does require one to specify the exception or exceptions to be caught, which is a good thing, and it has many more usage modes. You can just ignore the exception for computations returning Unit, you can get the result as Some(T) and transform the exceptions in None, etc.

Thursday, June 18, 2009

Equality & Scala 3

This one is a quickie. I forgot -- again -- to mention that Scala does provide a way around having to define your own equals method. To be more specific, case classes come with reasonably defined equals methods.

Wednesday, June 17, 2009

Equality & Scala 2

My efforts, yesterday, to come up with an Equatable trait that could ease a bit the knowledge and sheer drudgery of making a valid “equals” method met with unexpected difficulties. While tempted to replace the incorrect code with a correct one before the article got many hits, I decided that the problems I encountered and the mistakes I made taught a lesson by itself.

Anyway, I'd like first to make a few remarks that were missing from that post. One thing to notice here is that while this trait might be useful to some people, it is slower than a well thought-out equals definition. What I'm trying to do is see how much Scala let me make the job of creating valid equals methods both easier and safer.

It all comes down to the idea that, being equality in languages with subclassing so full of pitfalls, and the general solution to it being a well-defined pattern, those languages ought to be doing something about it at the language level. Or library level, if possible.

It does cross my mind that this problem might be much more efficiently and elegantly solved in languages which makes it possible to generate code at compile time, such as Lisp with its macros, and Ruby with access to the AST. This is one thing I miss in Scala, and while I understand the reasons for it and empathize with them, I do come up some roadblocks to a scalable language now and then.

That said, let’s analyze what happened. I tried to model my trait after the Hashable class. There are two things, though, that made my job harder than Hashable’s. First, I depended on super.equals, while hashCode doesn’t depend on its super. This becomes important as super.equals would make reference to definitions such as testSuperEquals or equateValues, and these definitions would be overridden in the subclass. Therefore, when Point3D.equals called Point.equals (its super.equals), the method Point.equals would make use of Point3D.equateValues and Point3D.testSuperEquals instead of Point.equateValues and Point.testSuperEquals.

This is difficult enough to get around, but it gets worse because, as opposed to hashCode, equals has to reference not one, but two objects. Calling a super of oneself is easy. Not so calling a super to a method on another object.

Another problem is the warning about type erasure. The case match never tests for type "This", so it is necessary to resort to reflection, to make sure we don't try to assign a superclass to a subclass. I didn't get any error on that because the hash test was returning false first. Well, fixed that, and changed the test code to produce constant hashCode.

The goal, then, is to make equals independent on any definition that might get overridden by later equals. Or, in other words, independent of any other definition related to the trait Equatable itself. Of course, it still will need to access members of the other object, and those might get overridden. That, however, is expected and shouldn’t have any influence on the equals method.

To begin with, let’s think how our equals definition would look like in the class. In the next to last definition, we had this:


override def equals(other : Any) = equalsTo[Point](other, true)

where equals was defined as

protected def equalsTo[This <: Equatable](other : Any, superEquals : => Boolean) : Boolean

This definition delegates to the class the task of calling super.equals – or passing “true” if not appropriate. We can’t do that inside equalsTo, because the equalsTo method is never overridden. “Super”, inside it, will have only one meaning. So this solution will do.

Now, how do we deal with equateValues? One obvious solution would be doing this:

protected def equalsTo[This <: Equatable](other : Any, equateValues : Seq[Any], superEquals : => Boolean) : Boolean

There are at least two reasons this isn’t going to work. First, while it solves the problem of equateValues on “this” object, as it get passed as a parameter on equals’ definition, it doesn’t solve it for “that” object! In fact, we now have no way of finding out what are the elements to be compared in the other object.

A second problem, though, might not be as obvious. We might depend on mutable data or data which isn’t computed yet at the time we define equals. Getting around that is possible, but would be much worse than defining an equals method by oneself!

So, what we’ll do is pass a function instead. A function which, given a “that” object, returns the sequence we need. Or, in other words, we want a object of this type:

(that : This) => Seq[Any]

The only problem with that is that “this” inside our trait does not have type This. We’ll need to receive a reference to ourselves, properly typed! Our definition, then, should be:

protected def equalsTo[This <: Equatable](self : This, other : Any, equateValues : This => Seq[Any], superEquals : => Boolean) : Boolean

The body of our function, then, becomes:

(other : @unchecked) match {
case that : This =>
(
that.canEqual(this)
&& superEquals
&& hashCode == that.hashCode // Can speed up or slow down
&& equateValues(self).zip(equateValues(that)).foldLeft(true) {
(equals, tuple) => equals && tuple._1 == tuple._2
} && equateValues.size == that.equateValues.size
)
case _ => false
}}
}

Now, how would our equals definition look like? Here:

override def equals(other : Any) = equalsTo[Point](this, other, that => List(that.x, that.y), true)
override def equals(other : Any) = equalsTo[Point3D](this, other, that => List(that.z), super.equals(other))

For big objects, inserting the function in the parameters might be awkward. Instead, we might prefer to assign the function to a val first. For example:

private val pointValues = (that : Point) => List(that.x, that.y)

So, here’s everything together, with a bit of further editing for performance resons:

trait Equatable extends scala.util.Hashable {
def canEqual (that : Any) : Boolean

protected def equalsTo[This <: Equatable](self : This, other : Any, equateValues : This => Seq[Any], superEquals : => Boolean) : Boolean =
(other : @unchecked) match {
case that : This if self.getClass.isAssignableFrom(that.getClass) =>
// Testing for hash code can improve or decrease performance, depending on the implementation;
// if hashCode gets implemented as a val, it will make equality faster
if (that.canEqual(this) && superEquals && hashCode == that.hashCode) {
val thisValues = equateValues(self)
val thatValues = equateValues(that)
(thisValues.zip(thatValues).foldLeft(true) { (equals, tuple) => equals && tuple._1 == tuple._2 }
&& thisValues.size == thatValues.size)
} else false
case _ => false
}
}

class Point (val x : Int, val y : Int) extends Equatable {
override def toString = "(%d, %d)" format (x, y)

// Hashable definitions
// override def hashValues = List(x, y)
override def hashValues = List(0) // We don't want the hash skipping our tests

// Equatable definitions
private val pointValues = (that : Point) => List(that.x, that.y) // one way
override def equals(other : Any) = equalsTo[Point](this, other, pointValues, true)
override def canEqual(other : Any) : Boolean = other.isInstanceOf[Point]
}

class Point3D(x : Int,y : Int, val z : Int) extends Point(x,y) with Equatable {
override def toString = "(%d, %d, %d)" format (x, y, z)

// Hashable defintions
// override def hashValues = List(x, y, z)
override def hashValues = List(0) // We don't want the hash skipping our tests

// Equatable defintions
// private val point3DValues = (that : Point3D) => List(that.z)
override def equals(other : Any) = equalsTo[Point3D](this, other, that => List(that.z), super.equals(other))
override def canEqual(other : Any) : Boolean = other.isInstanceOf[Point3D]
}


And the tests. I thought about doing them as assertions, but it was too silent for my taste. Anyway,

scala> val x = new Point(1, 2); val x2 = new Point(1, 2)
x: Point = (1, 2)
x2: Point = (1, 2)

scala> x == x2 // super.equals does not get called, so we do not perform reference equality
res0: Boolean = true

scala> val y = new Point(2, 1)
y: Point = (2, 1)

scala> x == y // expected false
res1: Boolean = false

scala> val z = new Point3D(1, 2, 0)
z: Point3D = (1, 2, 0)

scala> x == z // false in that canEqual this test
res2: Boolean = false

scala> z == x // false through reflection isAssignableFrom
res3: Boolean = false

scala> val z2 = new Point3D(2, 1, 0)
z2: Point3D = (2, 1, 0)

scala> z == z2 // expected false
res4: Boolean = false

scala> val z3 = new Point3D(1, 2, 0)
z3: Point3D = (1, 2, 0)

scala> z == z3 // expected true
res5: Boolean = true

scala> val z4 = new Point3D(1, 2, 1)
z4: Point3D = (1, 2, 1)

scala> z == z4 // extected false
res6: Boolean = false

scala> x == x
res7: Boolean = true

Tuesday, June 16, 2009

Equality & Scala

I'm just through reading three different sources on equality in less than a week. They all said pretty much the same thing, with a minor variation here or there. It got me thinking about it, and I have some thoughts to share. For this discussion, I'll assume you are familiar with how to do equality correctly. My examples follow the model given in this article.

Let's start with two simple classes:



class Point (val x : Int, val y : Int) {
override def toString = "(%d, %d)" format (x, y)
}

class Point3D (x : Int, y : Int, val z : Int) extends Point(x,y) {
override def toString = "(%d, %d, %d)" format (x, y, z)
}


Now, a proper equality method in these classes would look like the following:



class Point (val x : Int, val y : Int) {
override def toString = "(%d, %d)" format (x, y)
override def hashCode = 41 * (41 + x) + y
override def equals(other : Any) : Boolean = other match {
case that : Point => (
that.canEqual(this)
&& this.x == that.x
&& this.y == that.y
)
case _ => false
}
def canEqual(other : Any) : Boolean = other.isInstanceOf[Point]
}

class Point3D (x : Int, y : Int, val z : Int) extends Point(x,y) {
override def toString = "(%d, %d, %d)" format (x, y, z)
override def hashCode = 41 * (41 * (41 + x) + y) + z
override def equals(other : Any) : Boolean = other match {
case that : Point3D => (
that.canEqual(this)
&& super.equals(that)
&& this.z == that.z
)
case _ => false
}
override def canEqual(other : Any) : Boolean = other.isInstanceOf[Point3D]
}

Now, Scala has, starting with version 2.8, a Hashable trait, with which we can simplify things:



class Point (val x : Int, val y : Int) extends scala.util.Hashable {
override def toString = "(%d, %d)" format (x, y)
override def hashValues = List(x, y)
override def equals(other : Any) : Boolean = other match {
case that : Point => (
that.canEqual(this)
&& this.x == that.x
&& this.y == that.y
)
case _ => false
}
def canEqual(other : Any) : Boolean = other.isInstanceOf[Point]
}

class Point3D (x : Int, y : Int, val z : Int) extends Point(x,y) {
override def toString = "(%d, %d, %d)" format (x, y, z)
override def hashValues = List(x, y, z)
override def equals(other : Any) : Boolean = other match {
case that : Point3D => (
that.canEqual(this)
&& super.equals(that)
&& this.z == that.z
)
case _ => false
}
override def canEqual(other : Any) : Boolean = other.isInstanceOf[Point3D]
}

While it doesn't seem to have gained us anything, it might for larger objects, and, at any rate, it removes the "magic" of a hash code, and let someone else worry how to do it.

Still, there's a lot of stuff in there just to get equality right, and these are pretty simple classes. What we see is a programming pattern, but one so common and so important that, in my opinion, it merits special attention from the language itself.

Barring that, let's see what we can do programmatically about it. I'll start with a helper function, and how it would be used:



def testEquality(one : AnyRef, other : AnyRef, elementsOne : Seq[Any], elementsOther : Seq[Any]) : Boolean = {
val classOfOne = one.getClass
val classOfOther = other.getClass

def sameClassOrSubclass: Boolean = classOfOne.isAssignableFrom(classOfOther)

def superEquals : Boolean = try {
val superEqualsMethod = classOfOne.getSuperclass.getMethod("equals", classOf[Any])
if (superEqualsMethod.getDeclaringClass != classOf[Any])
superEqualsMethod.invoke(one, other) match {
case flag : java.lang.Boolean => flag.booleanValue // Translate boxed boolean into boolean
case _ => error("Method equals on the parent class of object " + one + " does not return a boolean")
}
else true
} catch {
case _ => true
}

def canEqual : Boolean = try {
val canEqualMethod = classOfOther.getMethod("canEqual", classOf[Any])
canEqualMethod.invoke(other, one) match {
case flag : java.lang.Boolean => flag.booleanValue // Translate boxed boolean into boolean
case _ => error("Method canEqual on object " + other + " does not return a boolean")
}
} catch {
case _ => true
}

def elementsEquals : Boolean = {
elementsOne.zip(elementsOther).foldLeft(true) {
(equals, pair) => equals && pair._1 == pair._2
} && elementsOne.size == elementsOther.size
}

(sameClassOrSubclass(classOfOne, classOfOther)
&& canEqual
&& superEquals
&& elementsEquals
)
}

class Point (val x : Int, val y : Int) extends scala.util.Hashable {
override def toString = "(%d, %d)" format (x, y)

// Hashable
override def hashValues = List(x, y)

// Equality
override def equals(other : Any) : Boolean = other match {
case that : Point => testEquality(this, that, List(x, y), List(that.x, that.y))
case _ => false
}
def canEqual(other : Any) : Boolean = other.isInstanceOf[Point]
}

class Point3D(x : Int,y : Int, val z : Int) extends Point(x,y) {
override def toString = "(%d, %d, %d)" format (x, y, z)

// Hashable
override def hashValues = List(x, y)

// Equality
override def equals(other : Any) : Boolean = other match {
case that : Point3D => testEquality(this, that, List(x, y, z), List(that.x, that.y, that.z))
case _ => false
}
override def canEqual(other : Any) : Boolean = other.isInstanceOf[Point3D]
}

Now, this method always calls the parent's equals, unless it's Any's equals. You might want to parametrize this. Also, it expects canEqual to be defined if needed, which might lead to bugs. Furthermore, its usage of reflection makes it slower than needed. Finally, the definition of equals is not that much simpler than what we had before.

So, can we do better? Ideally, one could build a trait similar to Hashable, but it turns out that is not that simple. Let's try:



trait Equatable extends scala.util.Hashable {
protected type EquateThis <: Equatable
private def equalsFromAny : Boolean = {
this.getClass.getSuperclass.getMethod("equals", classOf[Any])
.getDeclaringClass == classOf[Any]
}

protected def equateValues : Seq[Any]

def canEqual (that : Any) : Boolean = true

abstract override def equals(other : Any) : Boolean = (other : @unchecked) match {
case that : EquateThis =>
(
that.canEqual(this)
&& (equalsFromAny || super.equals(that))
&& hashCode == that.hashCode // Can speed up or slow down
&& equateValues.zip(that.equateValues).foldLeft(true) {
(equals, tuple) => equals && tuple._1 == tuple._2
} && equateValues.size == that.equateValues.size
)
case _ => false
}
}

class Point (val x : Int, val y : Int) extends Equatable {
override def toString = "(%d, %d)" format (x, y)

// Hashable definitions
override def hashValues = List(x, y)

// Equatable definitions
override type EquateThis = Point
override def equateValues = List(x, y)
override def canEqual(other : Any) : Boolean = other.isInstanceOf[Point]
}

class Point3D(x : Int,y : Int, val z : Int) extends Point(x,y) with Equatable {
override def toString = "(%d, %d, %d)" format (x, y, z)

// Hashable defintions
override def hashValues = List(x, y, z)

// Equatable defintions
override type EquateThis = Point3D
override def canEqual(other : Any) : Boolean = other.isInstanceOf[Point3D]
override def equateValues = List(z)
}


That looks more like it, but it has a few problems still. It still uses reflection, for one thing, to test for super's equals method. Also, you can't parametrize that is it is. It won't get Equatable's own equals, though, as traits compiles down to part of the class being defined, not as an ancestor to it.

Next, it has a default for canEqual, and a dangerous one at that. If the programmer forgets to override it, it will lead to trouble.

But, most importantly, it doesn't work. The class Point3D can't override Point's definition for type EquateThis. I don't understand precisely why this is the case, and I'd be glad if anyone stepped in to explain this.

Anyway, let fix these problems:



trait Equatable extends scala.util.Hashable {
protected def testSuperEquals : Boolean
protected def equateValues : Seq[Any]
def canEqual (that : Any) : Boolean

protected def equalsTo[This <: Equatable](other : Any) : Boolean = (other : @unchecked) match {
case that : This =>
(
that.canEqual(this)
&& ((!testSuperEquals) || super.equals(that))
&& hashCode == that.hashCode // Can speed up or slow down
&& equateValues.zip(that.equateValues).foldLeft(true) {
(equals, tuple) => equals && tuple._1 == tuple._2
} && equateValues.size == that.equateValues.size
)
case _ => false
}
}

class Point (val x : Int, val y : Int) extends Equatable {
override def toString = "(%d, %d)" format (x, y)

// Hashable definitions
override def hashValues = List(x, y)

// Equatable definitions
override def testSuperEquals = false
override def equateValues = List(x, y)
override def equals(other : Any) = equalsTo[Point](other)
override def canEqual(other : Any) : Boolean = other.isInstanceOf[Point]
}

class Point3D(x : Int,y : Int, val z : Int) extends Point(x,y) with Equatable {
override def toString = "(%d, %d, %d)" format (x, y, z)

// Hashable defintions
override def hashValues = List(x, y, z)

// Equatable defintions
override def testSuperEquals = true
override def equals(other : Any) = equalsTo[Point3D](other)
override def canEqual(other : Any) : Boolean = other.isInstanceOf[Point3D]
override def equateValues = List(z)
}
This finally get us where we wanted. Or as close to as I could, at least. :-)

The definition of canEqual is made abstract. That forces the first class to mix Equatable in to define it, even if to a default of "true". Next, instead of trying to figure out if equality must be called on the super, we simply require the class to tell us.

Finally, the equals method. We can't (or I couldn't) get the trait to define it, but I got pretty close. In the end, class still has to define an equals method, but that method pretty much gets reduced to a simple call to one defined in the trait, passing the object being compared to and the class expected. The class expected gets passed as an explicit type parametrization.

Neat. I didn't think I'd be able to get this much! Now, who's going to port it to Java?


Update:
Ok, I spoke too soon. This method fails here:



scala> val x = new Point(1, 2); val x2 = new Point(1, 2)
x: Point = (1, 2)
x2: Point = (1, 2)

scala> x == x2 // super.equals does not get called, so we do not perform reference equality
res7: Boolean = true

scala> val y = new Point(2, 1)
y: Point = (2, 1)

scala> x == y // expected false
res8: Boolean = false

scala> val z = new Point3D(1, 2, 0)
z: Point3D = (1, 2, 0)

scala> x == z // false in the canEqual call
res9: Boolean = false

scala> z == x // false in the type check
res10: Boolean = false

scala> val z2 = new Point3D(2, 1, 0)
z2: Point3D = (2, 1, 0)

scala> z == z2 // false in the super.equals call
res11: Boolean = false

scala> val z3 = new Point3D(1, 2, 0)
z3: Point3D = (1, 2, 0)

scala> z == z3 // false in the super.super.equals call - it should have been true
res12: Boolean = false

Anyone up for fixing it?

Update 2:
Ok, I fixed it. I resorted to delegating this task to the calling class. That means I do away with testSuperEquals, but require a second parameter to equalsTo. I make it by name, so that it doesn't get evaluated needlessly.


trait Equatable extends scala.util.Hashable {
protected def equateValues : Seq[Any]
def canEqual (that : Any) : Boolean

protected def equalsTo[This <: Equatable](other : Any, superEquals : => Boolean) : Boolean = (other : @unchecked) match {
case that : This =>
(
that.canEqual(this)
&& superEquals
&& hashCode == that.hashCode // Can speed up or slow down
&& equateValues.zip(that.equateValues).foldLeft(true) {
(equals, tuple) => equals && tuple._1 == tuple._2
} && equateValues.size == that.equateValues.size
)
case _ => false
}}
}

class Point (val x : Int, val y : Int) extends Equatable {
override def toString = "(%d, %d)" format (x, y)

// Hashable definitions
override def hashValues = List(x, y)

// Equatable definitions
override def equateValues = List(x, y)
override def equals(other : Any) = equalsTo[Point](other, true)
override def canEqual(other : Any) : Boolean = other.isInstanceOf[Point]
}

class Point3D(x : Int,y : Int, val z : Int) extends Point(x,y) with Equatable {
override def toString = "(%d, %d, %d)" format (x, y, z)

// Hashable defintions
override def hashValues = List(x, y, z)

// Equatable defintions
override def equals(other : Any) = equalsTo[Point3D](other, super.equals(other))
override def canEqual(other : Any) : Boolean = other.isInstanceOf[Point3D]
override def equateValues = List(z)
}

And the testing:


scala> val x = new Point(1, 2); val x2 = new Point(1, 2)
x: Point = (1, 2)
x2: Point = (1, 2)

scala> x == x2 // super.equals does not get called, so we do not perform reference equality
res105: Boolean = true

scala> val y = new Point(2, 1)
y: Point = (2, 1)

scala> x == y // expected false
res106: Boolean = false

scala> val z = new Point3D(1, 2, 0)
z: Point3D = (1, 2, 0)

scala> x == z // false in the canEqual call
res107: Boolean = false

scala> z == x // false in the type check
res108: Boolean = false

scala> val z2 = new Point3D(2, 1, 0)
z2: Point3D = (2, 1, 0)

scala> z == z2 // false in the super.equals call
res109: Boolean = true

scala> val z3 = new Point3D(1, 2, 0)
z3: Point3D = (1, 2, 0)

scala> z == z3 // expected true
res110: Boolean = true

Update 3:
This is still seriously broken, as the very test above indicates. What is happening is that when super.equals gets called, it uses this.equateValues instead of the super's version of it. It might be easy to fix for "this", but not for "that". At this point, I'm giving up on super.equals. Let's assume the equality for all classes is defined by the equalsTo method, and require a call to super.equateValues at every subclass (that finds it necessary). Here it is


trait Equatable extends scala.util.Hashable {
protected def equateValues : Seq[Any]
def canEqual (that : Any) : Boolean

protected def equalsTo[This <: Equatable](other : Any) : Boolean = (other : @unchecked) match {
case that : This =>
(
that.canEqual(this)
&& hashCode == that.hashCode // Can speed up or slow down
&& equateValues.zip(that.equateValues).foldLeft(true) {
(equals, tuple) => equals && tuple._1 == tuple._2
} && equateValues.size == that.equateValues.size
)
case _ => false
}}
}

class Point (val x : Int, val y : Int) extends Equatable {
override def toString = "(%d, %d)" format (x, y)

// Hashable definitions
override def hashValues = List(x, y)

// Equatable definitions
override def equateValues = List(x, y)
override def equals(other : Any) = equalsTo[Point](other)
override def canEqual(other : Any) : Boolean = other.isInstanceOf[Point]
}

class Point3D(x : Int,y : Int, val z : Int) extends Point(x,y) with Equatable {
override def toString = "(%d, %d, %d)" format (x, y, z)

// Hashable defintions
override def hashValues = List(x, y, z)

// Equatable defintions
override def equals(other : Any) = equalsTo[Point3D](other)
override def canEqual(other : Any) : Boolean = other.isInstanceOf[Point3D]
override def equateValues = z :: super.equateValues
}

And test:


scala> val x = new Point(1, 2); val x2 = new Point(1, 2)
x: Point = (1, 2)
x2: Point = (1, 2)

scala> x == x2 // super.equals does not get called, so we do not perform reference equality
res138: Boolean = true

scala> val y = new Point(2, 1)
y: Point = (2, 1)

scala> x == y // expected false
res139: Boolean = false

scala> val z = new Point3D(1, 2, 0)
z: Point3D = (1, 2, 0)

scala> x == z // false in the canEqual call
res140: Boolean = false

scala> z == x // false in the type check
res141: Boolean = false

scala> val z2 = new Point3D(2, 1, 0)
z2: Point3D = (2, 1, 0)

scala> z == z2 // expected false
res142: Boolean = false

scala> val z3 = new Point3D(1, 2, 0)
z3: Point3D = (1, 2, 0)

scala> z == z3 // expected true
res143: Boolean = true