class RegexF(pattern: String) extends (String => Option[Seq[String]])
or, perhaps,
class RegexPF(pattern: String) extends PartialFunction[String, Seq[String]]
In fact, RegexPF.lift would (could) yield a RegexF. It then caught my attention that RegexF.apply has the same signature as Regex.unapplySeq, which is the standard way of handling regex in Scala! Might this be what has been bugging me about Scala's regex all along? Should we translate
val YYYYMMDD = """(\d{4})-(\d{2})-(\d{2})""".r
val MMDDYYYY = """(\d{2})/(\d{2})/(\d{4})""".r
def getYear(s: String) = s match {
  case YYYYMMDD(year, _, _) => year
  case MMDDYYYY(_, _, year) => year
}
into
val YYYYMMDD = """(\d{4})-(\d{2})-(\d{2})""".r
val MMDDYYYY = """(\d{2})/(\d{2})/(\d{4})""".r andThen (fields => fields.last +: fields.init)
def getYear(s: String) = ((YYYYMMDD orElse MMDDYYYY) andThen (_.head))(s)
I can certainly see the advantages of pattern matching, but... it doesn't compose very well. And it has some performance issues, which is a big deal for most regex usages. And being a PartialFunction would not prevent a Regex from having extractors as well.
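To make the idea concrete, here is a minimal sketch of what such a RegexPF could look like. It is only an illustration under my own assumptions: RegexPF is not part of the standard library, it simply delegates to the existing Regex.unapplySeq, and the RegexPFDemo object and safeYear name are made up for the example.

```scala
import scala.util.matching.Regex

// Hypothetical wrapper, not part of the Scala standard library.
class RegexPF(pattern: String) extends PartialFunction[String, Seq[String]] {
  private val regex: Regex = pattern.r

  // Defined only when the whole input matches the pattern.
  def isDefinedAt(s: String): Boolean = regex.pattern.matcher(s).matches()

  // Returns the captured groups; throws MatchError when the input does not match.
  def apply(s: String): Seq[String] =
    regex.unapplySeq(s).getOrElse(throw new MatchError(s))
}

object RegexPFDemo {
  val YYYYMMDD = new RegexPF("""(\d{4})-(\d{2})-(\d{2})""")

  // Rotate the fields so that the year comes first, as in the other pattern.
  val MMDDYYYY = new RegexPF("""(\d{2})/(\d{2})/(\d{4})""") andThen
    (fields => fields.last +: fields.init)

  // Try each format in turn, then keep only the first field (the year).
  val getYear: PartialFunction[String, String] =
    (YYYYMMDD orElse MMDDYYYY) andThen (_.head)

  // lift gives the String => Option[...] shape that RegexF describes.
  val safeYear: String => Option[String] = getYear.lift
}
```

With this sketch, RegexPFDemo.getYear accepts either date format, and RegexPFDemo.safeYear returns None for input that matches neither. Note that this naive version may run the match twice (once in isDefinedAt, once in apply); a serious implementation would probably override applyOrElse to avoid that.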
So this does not change anything in my little life... but it is a great thing to think about.
Daniel, I thought the same thing. But with a little bit of help you can compose Regexes and other extractors this way:
val Year = pattern {
case YYYYMMDD(year, _, _) => year
case MMDDYYYY(_, _, year) => year
}
text match { case Year(y) => y ... }
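The pattern helper is not defined in this comment. A minimal sketch, assuming it does nothing more than wrap a PartialFunction in an object with an unapply method, might look like the following (the Extractor class, the PatternSketch object and the explicit type arguments are hypothetical, chosen only to keep the example self-contained):

```scala
object PatternSketch {
  // Hypothetical helper: exposes a partial function as an extractor.
  class Extractor[A, B](pf: PartialFunction[A, B]) {
    // Some(result) where the partial function is defined, None otherwise.
    def unapply(a: A): Option[B] = pf.lift(a)
  }

  def pattern[A, B](pf: PartialFunction[A, B]): Extractor[A, B] =
    new Extractor(pf)

  val YYYYMMDD = """(\d{4})-(\d{2})-(\d{2})""".r
  val MMDDYYYY = """(\d{2})/(\d{2})/(\d{4})""".r

  // An extractor composed from the two regex extractors.
  val Year = pattern[String, String] {
    case YYYYMMDD(year, _, _) => year
    case MMDDYYYY(_, _, year) => year
  }

  def getYear(text: String): Option[String] = text match {
    case Year(y) => Some(y)
    case _       => None
  }
}
```

Because Year is itself an extractor, it can be nested inside other patterns or passed to pattern again to build larger ones.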
I have an old blog about this and similar ideas:
http://notes.langdale.com.au/Querying_a_Dataset_with_Scala_s_Pattern_Matching.html
Cheers,
Arnold