Sunday, July 26, 2015

Scala Other (Session 10)

Scala Other

Lazy

Scala can defer loading a resource until it is first used: add the lazy keyword to the declaration.

eg: obtaining a database connection

lazy val conn = DriverManager.getConnection
lazy val stmt = conn.prepareStatement(....)
lazy val rs = stmt.executeQuery()

try {
   stmt.setInt(1, xxx)
   
   while (rs.next) {
     ...
   }
}
catch {
  case ex: Exception =>
}
finally {
  DBUtils.close(rs)
  DBUtils.close(stmt)
  DBUtils.close(conn)
}

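Only when execution reaches stmt.setInt(1, xxx) is the PreparedStatement initialized, and initializing the PreparedStatement is what in turn initializes the Connection. Note that any reference to a lazy val forces its initialization, including the references in the finally block.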
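The initialization order can be observed without a database. The following is a minimal sketch: LazyDemo, its log buffer, and the string stand-ins for the connection and statement are hypothetical names introduced only for illustration, not real JDBC objects.

```scala
import scala.collection.mutable.ListBuffer

object LazyDemo {
  // Records the order in which things happen.
  val log: ListBuffer[String] = ListBuffer.empty[String]

  // Stand-in for the Connection (assumed for illustration).
  lazy val conn: String = {
    log += "conn ready"
    "connection"
  }

  // Stand-in for the PreparedStatement; it depends on conn.
  lazy val stmt: String = {
    val s = s"statement on $conn" // first use of conn forces its initialization
    log += "stmt ready"
    s
  }

  def run(): List[String] = {
    log += "before use" // nothing has been initialized at this point
    val used = stmt     // first use of stmt triggers stmt, which triggers conn
    log.toList
  }
}
```

Calling LazyDemo.run() once shows that nothing is initialized until stmt is touched, and that conn is initialized as part of initializing stmt.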

Implicit

Implicit Parameter

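Implicits are used heavily in Scala. Collections are one example: the conversions from Java collections to their Scala counterparts are all done through implicits.

The most familiar implicit in OOP is the this keyword: inside a class, this is never declared as a variable, yet it can be used.

Python's design philosophy prefers explicit over implicit, so Python methods declare self explicitly to play the role of this.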

eg:

scala> implicit val a = 10
a: Int = 10

scala> def test(a: Int)(implicit b: Int) = a + b
test: (a: Int)(implicit b: Int)Int

scala> test(1000)
res0: Int = 1010

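When a parameter is declared implicit, the Scala compiler looks for an implicit value of the matching type in scope.

Implicits make a system more flexible. Take the thread pool used by Future: Scala ships with a built-in thread pool, but you can also define your own and switch to it by changing only the import; the rest of the code stays the same.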
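As a sketch of that idea, the code below supplies a Future with a custom thread pool through an implicit ExecutionContext instead of importing the global one. CustomPoolDemo and the pool size of 2 are assumptions chosen for illustration.

```scala
import java.util.concurrent.Executors
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

object CustomPoolDemo {
  def compute(): Int = {
    // A fixed pool of two threads (the size is an arbitrary choice).
    val pool = Executors.newFixedThreadPool(2)

    // Future { ... } picks this up as its implicit ExecutionContext,
    // in place of scala.concurrent.ExecutionContext.Implicits.global.
    implicit val ec: ExecutionContext = ExecutionContext.fromExecutorService(pool)

    try Await.result(Future { 21 * 2 }, 5.seconds)
    finally pool.shutdown() // release the threads so the JVM can exit
  }
}
```

The body of the Future is unchanged; only the implicit value in scope decides where it runs.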

Note: how this works (C++)

Source code:

class CRect {
    private: 
        int m_color;
    public: 
        void setcolor(int color) {
            m_color = color;
        }
}

After compiling (conceptually):

class CRect {
    private: 
        int m_color;
    public: 
        void setcolor(int color, (CRect*)this) {
            this->m_color = color;
        }
}

Implicit Conversions

The conversions from Java collections to the corresponding Scala collections are all implemented as implicit functions.

eg: Scala wrapAsScala

implicit def asScalaBuffer[A](l: ju.List[A]): mutable.Buffer[A] = l match {
  case MutableBufferWrapper(wrapped) => wrapped
  case _ => new JListWrapper(l)
}
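The same mechanism can be tried on a small scale. This is a minimal sketch of a hand-written implicit conversion; Meters, intToMeters, and describe are hypothetical names, not part of the Scala library.

```scala
import scala.language.implicitConversions

object ImplicitConversionDemo {
  final case class Meters(value: Double)

  // The implicit function the compiler may insert when an Int
  // appears where a Meters is expected.
  implicit def intToMeters(n: Int): Meters = Meters(n.toDouble)

  def describe(m: Meters): String = s"${m.value} m"

  // describe expects Meters, so the compiler rewrites this call
  // to describe(intToMeters(5)).
  val text: String = describe(5)
}
```

The call site never mentions intToMeters, which is what makes the Java-to-Scala collection wrappers feel transparent.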

Tail Recursion

The Scala compiler optimizes tail recursion. A function is tail-recursive when the recursive call is the last operation it performs.

eg: GCD

def gcd(a: Int, b: Int): Int = if (a == 0) b else gcd(b % a, a)

Because the recursive call is the final operation, the Scala compiler optimizes this style of function.

eg: not tail recursion, because 1 is added after the recursive call returns

def boom(x: Int): Int = if (x == 0) throw new Exception("boom!") else boom(x - 1) + 1

Output:

scala> boom(2)
java.lang.Exception: boom!
  at .boom(<console>:7)
  at .boom(<console>:7)
  at .boom(<console>:7)
  ... 33 elided

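In the example above, the last expression is boom(x - 1) + 1, so this is not tail recursion. The stack trace shows three .boom entries, meaning three stack frames were used, so the calls ran as ordinary recursion.

It needs to be rewritten as: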

scala> def boom(x: Int): Int = if (x == 0) throw new Exception("boom!") else boom(x - 1)
boom: (x: Int)Int

Output:

scala> boom(2)
java.lang.Exception: boom!
  at .boom(<console>:8)
  ... 33 elided

After the rewrite there is only one .boom entry: the Scala compiler rewrites tail recursion so that it runs in a single stack frame.
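To have the compiler check that a function really is tail-recursive, the standard @tailrec annotation can be added; compilation fails if the optimization cannot be applied. The sketch below uses a hypothetical sumTo function with an accumulator, a common way to move the pending work into the argument list.

```scala
import scala.annotation.tailrec

object TailRecDemo {
  // The recursive call is the last operation, so the compiler
  // turns it into a loop that reuses one stack frame.
  @tailrec
  def sumTo(n: Int, acc: Long = 0L): Long =
    if (n == 0) acc else sumTo(n - 1, acc + n)
}
```

Because the call is compiled to a loop, a depth of one million does not overflow the stack.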

Examples from:

Programming in Scala: A Comprehensive Step-by-Step Guide, 2nd Edition

Scala Parallel Computing (Session 9)

Scala Parallel Computing

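Parallel processing suits I/O-bound programs well: several I/O operations can be in flight at the same time, which keeps the time the CPU spends waiting to a minimum.

Future and Await

In Java, this kind of work was traditionally done with Thread. Scala provides Future to hold a result that is not yet complete. Using Future requires import scala.concurrent.ExecutionContext.Implicits.global, which selects Scala's built-in thread pool, because a Future needs a thread on which to run its computation.

In a multi-threaded program, the main program usually has to wait for all threads to finish before it can exit. Scala provides Await to wait for the result of a Future.

eg: reading the contents of two files at the same time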

package com.example

import scala.concurrent.Await
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.Future
import scala.concurrent.TimeoutException
import scala.concurrent.duration.Duration
import scala.io.Source

object FutureTest {
  
  def readFile(file: String): StringBuilder = {
    val ret = new StringBuilder
    
    Source.fromFile(file).getLines() foreach { line =>
      ret ++= (line + "\r\n")
    }
    
    ret
  }
  
  def main(args: Array[String]) {
    
    println("start")
    
    val time = System.currentTimeMillis()
   
    val future1 = Future { readFile("ufo_awesome_1.tsv") }
    val future2 = Future { readFile("ufo_awesome_2.tsv") }
    
    val result = Await.result(Future.sequence(Seq(future1, future2)), Duration.Inf)
    
    println(s"end and cost: ${System.currentTimeMillis() - time} ms")
  }
}

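Future wraps the readFile job. At val future1 = Future { readFile("ufo_awesome_1.tsv") } the job is handed to a thread from the pool and the program immediately continues with the next line. Finally, Await.result retrieves the result; Await.ready can be used instead.

To wait for several Futures, put them in a Seq, use Future.sequence to combine them into a single Future, and then use Await to wait for that Future.

If you remove val result = Await.result(Future.sequence(Seq(future1, future2)), Duration.Inf), the main program finishes almost immediately.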
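The pattern can be tried without the data files. This is a minimal sketch in which a hypothetical slowLength function stands in for readFile; the 100 ms sleep and the 5 second timeout are arbitrary assumptions.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object SequenceDemo {
  // Stand-in for a slow I/O job (assumed for illustration).
  def slowLength(s: String): Int = {
    Thread.sleep(100)
    s.length
  }

  def run(): Seq[Int] = {
    // Each Future starts as soon as it is created.
    val futures: Seq[Future[Int]] =
      Seq("alpha", "be", "gam").map(s => Future(slowLength(s)))

    // Seq[Future[Int]] becomes Future[Seq[Int]]; results keep the input order.
    Await.result(Future.sequence(futures), 5.seconds)
  }
}
```

A finite timeout is used here instead of Duration.Inf so that a stuck job cannot block the caller forever.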

Callback

onSuccess

As the name suggests, this is the work to run when the job wrapped by the Future succeeds.

eg:

package com.example

import scala.concurrent.Await
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.Future
import scala.concurrent.TimeoutException
import scala.concurrent.duration.Duration
import scala.io.Source

object FutureTest {
  
  def readFile(file: String): StringBuilder = {
    val ret = new StringBuilder
    
    Source.fromFile(file).getLines() foreach { line =>
      ret ++= (line + "\r\n")
    }
    
    ret
  }
  
  def main(args: Array[String]) {
    
    println("start")
    
    val time = System.currentTimeMillis()
    
    println(s"${System.currentTimeMillis()} - create future")
    val future = Future { readFile("ufo_awesome_1.tsv"); println(s"${System.currentTimeMillis()} - read complete") }
    
    println(s"${System.currentTimeMillis()} - register onSuccess")
    future onSuccess {
      case sb => println(s"${System.currentTimeMillis()} - success")
    }
    
    println(s"${System.currentTimeMillis()} - await")
    
    val result = Await.result(future, Duration.Inf)
    
    println(s"end and cost: ${System.currentTimeMillis() - time} ms")
  }
}

Output:

start
1436100618543 - create future
1436100618816 - register onSuccess
1436100618818 - await
1436100620040 - read complete
end and cost: 1502 ms
1436100620042 - success

Note that after onSuccess is registered, the main program does not wait for the Future to finish; it keeps going. onSuccess runs only once the Future's job has completed.

onFailure

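onFailure runs when the job inside the Future throws an Exception or Error (that is, any Throwable). Note in particular that onFailure does not catch the exception: Await.result still rethrows it in the main thread.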

eg:

package com.example

import scala.concurrent.Await
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.Future
import scala.concurrent.TimeoutException
import scala.concurrent.duration.Duration
import scala.io.Source
import scala.util.control.NonFatal

object FutureTest {
  
  def readFile(file: String): StringBuilder = {
    val ret = new StringBuilder
    
    Source.fromFile(file).getLines() foreach { line =>
      ret ++= (line + "\r\n")
    }
    
    ret
  }
  
  def main(args: Array[String]) {
    
    println("start")
    
    val time = System.currentTimeMillis()    
    
    println(s"${System.currentTimeMillis()} - create future")
    
    /* ufo_awesome_3.tsv does not exist */
    val future = Future { readFile("ufo_awesome_3.tsv"); println(s"${System.currentTimeMillis()} - read complete") }
    
    println(s"${System.currentTimeMillis()} - register onSuccess")
    future onSuccess {
      case sb => println(s"${System.currentTimeMillis()} - success")
    }
    
    println(s"${System.currentTimeMillis()} - register onFailure")
    future onFailure {
      case ex: Exception => println(s"${System.currentTimeMillis()} - failure")
    }
    
    println(s"${System.currentTimeMillis()} - await")
    
    val result = Await.result(future, Duration.Inf)
    
    println(s"end and cost: ${System.currentTimeMillis() - time} ms")
  }
}

Output:

start
1436102289048 - create future
1436102289345 - register onSuccess
1436102289350 - register onFailure
1436102289351 - failure
1436102289352 - await
Exception in thread "main" java.io.FileNotFoundException: ufo_awesome_3.tsv (No such file or directory)
    at java.io.FileInputStream.open0(Native Method)
    at java.io.FileInputStream.open(FileInputStream.java:195)
    at java.io.FileInputStream.<init>(FileInputStream.java:138)
    at scala.io.Source$.fromFile(Source.scala:91)
    at scala.io.Source$.fromFile(Source.scala:76)
    at scala.io.Source$.fromFile(Source.scala:54)
    at com.example.FutureTest$.readFile(FutureTest.scala:19)
    at com.example.FutureTest$$anonfun$1.apply$mcV$sp(FutureTest.scala:44)
    at com.example.FutureTest$$anonfun$1.apply(FutureTest.scala:44)
    at com.example.FutureTest$$anonfun$1.apply(FutureTest.scala:44)
    at scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
    at scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
    at scala.concurrent.impl.ExecutionContextImpl$AdaptedForkJoinTask.exec(ExecutionContextImpl.scala:121)
    at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
    at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
    at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
    at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
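Since the callback does not stop Await.result from rethrowing, the failure has to be handled where the result is consumed. The sketch below shows two options, wrapping Await.result in Try and using Future.recover; FailureDemo and the file name missing.tsv are hypothetical. Also worth knowing: onSuccess and onFailure were deprecated in Scala 2.12 and removed in 2.13, so on newer versions onComplete, foreach, or recover are the usual choices.

```scala
import java.io.{FileNotFoundException, IOException}
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._
import scala.util.Try

object FailureDemo {
  // A Future whose job always fails (assumed for illustration).
  def failing(): Future[Int] =
    Future { throw new FileNotFoundException("missing.tsv") }

  // Option 1: capture the rethrown exception as a Failure.
  def awaitSafely(): Try[Int] =
    Try(Await.result(failing(), 5.seconds))

  // Option 2: turn the failure into a fallback value inside the Future.
  def recovered(): Int =
    Await.result(failing().recover { case _: IOException => -1 }, 5.seconds)
}
```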

onComplete

Runs when the job inside the Future finishes, whether it succeeded or failed.

eg:

package com.example

import scala.concurrent.Await
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.Future
import scala.concurrent.TimeoutException
import scala.concurrent.duration.Duration
import scala.io.Source
import scala.util.control.NonFatal
import scala.util.Success
import scala.util.Failure

object FutureTest {
  
  def readFile(file: String): StringBuilder = {
    val ret = new StringBuilder
    
    Source.fromFile(file).getLines() foreach { line =>
      ret ++= (line + "\r\n")
    }
    
    ret
  }
  
  def main(args: Array[String]) {
    
    println("start")
    
    val time = System.currentTimeMillis()

    println(s"${System.currentTimeMillis()} - create future")
    
    /* ufo_awesome_3.tsv does not exist */
    val future = Future { readFile("ufo_awesome_3.tsv"); println(s"${System.currentTimeMillis()} - read complete") }
        
    println(s"${System.currentTimeMillis()} - register onComplete")
    future onComplete {
      case Success(sb) => println(s"${System.currentTimeMillis()} - onComplete - success")
      case Failure(error) => println(s"${System.currentTimeMillis()} - onComplete - failure ${error.toString()}")
    }
    
    println(s"${System.currentTimeMillis()} - await")
    
    val result = Await.result(future, Duration.Inf)
    
    println(s"end and cost: ${System.currentTimeMillis() - time} ms")
  }
}

Output:

start
1436105851821 - create future
1436105852172 - register onComplete
1436105852175 - await
1436105852177 - onComplete - failure java.io.FileNotFoundException: ufo_awesome_3.tsv (No such file or directory)
Exception in thread "main" java.io.FileNotFoundException: ufo_awesome_3.tsv (No such file or directory)
    at java.io.FileInputStream.open0(Native Method)
    at java.io.FileInputStream.open(FileInputStream.java:195)
    at java.io.FileInputStream.<init>(FileInputStream.java:138)
    at scala.io.Source$.fromFile(Source.scala:91)
    at scala.io.Source$.fromFile(Source.scala:76)
    at scala.io.Source$.fromFile(Source.scala:54)
    at com.example.FutureTest$.readFile(FutureTest.scala:21)
    at com.example.FutureTest$$anonfun$1.apply$mcV$sp(FutureTest.scala:46)
    at com.example.FutureTest$$anonfun$1.apply(FutureTest.scala:46)
    at com.example.FutureTest$$anonfun$1.apply(FutureTest.scala:46)
    at scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
    at scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
    at scala.concurrent.impl.ExecutionContextImpl$AdaptedForkJoinTask.exec(ExecutionContextImpl.scala:121)
    at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
    at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
    at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
    at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)

Multiple Callbacks

A single Future may have several onSuccess, onFailure, and onComplete callbacks. They are not guaranteed to run in the order they appear in the code.

eg:

package com.example

import scala.concurrent.Await
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.Future
import scala.concurrent.TimeoutException
import scala.concurrent.duration.Duration
import scala.io.Source
import scala.util.control.NonFatal
import scala.util.Success
import scala.util.Failure

/**
 * @author kigi
 */
object FutureTest {
  
  def readFile(file: String): StringBuilder = {
    val ret = new StringBuilder
    
    Source.fromFile(file).getLines() foreach { line =>
      ret ++= (line + "\r\n")
    }
    
    ret
  }
  
  def main(args: Array[String]) {
    
    println("start")
    
    val time = System.currentTimeMillis()

    /*
    val future1 = Future { readFile("ufo_awesome_1.tsv") }
    val future2 = Future { readFile("ufo_awesome_2.tsv") }
    
    val result = Await.result(Future.sequence(Seq(future1, future2)), Duration.Inf)
    */
    
    
    println(s"${System.currentTimeMillis()} - create future")
    //val future = Future { readFile("ufo_awesome_1.tsv"); println(s"${System.currentTimeMillis()} - read complete") }
    
    /* ufo_awesome_3.tsv does not exist */
    val future = Future { readFile("ufo_awesome_3.tsv"); println(s"${System.currentTimeMillis()} - read complete") }
    
    
    println(s"${System.currentTimeMillis()} - register onSuccess")
    future onSuccess {
      case sb => println(s"${System.currentTimeMillis()} - success")
    }
    
    println(s"${System.currentTimeMillis()} - register onFailure")
    future onFailure {
      case ex: Exception => println(s"${System.currentTimeMillis()} - failure")
    }
    
    println(s"${System.currentTimeMillis()} - register onComplete")
    future onComplete {
      case Success(sb) => println(s"${System.currentTimeMillis()} - onComplete - success")
      case Failure(error) => println(s"${System.currentTimeMillis()} - onComplete - failure ${error.toString()}")
    }
    
    println(s"${System.currentTimeMillis()} - await")
    
    val result = Await.result(future, Duration.Inf)
    
    println(s"end and cost: ${System.currentTimeMillis() - time} ms")
  }
}

Output:

start
1436105970847 - create future
1436105971148 - register onSuccess
1436105971151 - register onFailure
1436105971153 - register onComplete
1436105971154 - failure
1436105971155 - await
1436105971155 - onComplete - failure java.io.FileNotFoundException: ufo_awesome_3.tsv (No such file or directory)
Exception in thread "main" java.io.FileNotFoundException: ufo_awesome_3.tsv (No such file or directory)
    at java.io.FileInputStream.open0(Native Method)
    at java.io.FileInputStream.open(FileInputStream.java:195)
    at java.io.FileInputStream.<init>(FileInputStream.java:138)
    at scala.io.Source$.fromFile(Source.scala:91)
    at scala.io.Source$.fromFile(Source.scala:76)
    at scala.io.Source$.fromFile(Source.scala:54)
    at com.example.FutureTest$.readFile(FutureTest.scala:21)
    at com.example.FutureTest$$anonfun$1.apply$mcV$sp(FutureTest.scala:46)
    at com.example.FutureTest$$anonfun$1.apply(FutureTest.scala:46)
    at com.example.FutureTest$$anonfun$1.apply(FutureTest.scala:46)
    at scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
    at scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
    at scala.concurrent.impl.ExecutionContextImpl$AdaptedForkJoinTask.exec(ExecutionContextImpl.scala:121)
    at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
    at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
    at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
    at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)

map and flatMap

Future also supports map and flatMap, so they can be used to process the result further.

eg: read every line of a file, compute the length of each line, and sum the lengths.

package com.example

import scala.concurrent.Await
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.Future
import scala.concurrent.TimeoutException
import scala.concurrent.duration.Duration
import scala.io.Source
import scala.util.control.NonFatal
import scala.util.Success
import scala.util.Failure

object FutureTest {
  
  def readFile(file: String): StringBuilder = {
    val ret = new StringBuilder
    
    Source.fromFile(file).getLines() foreach { line =>
      ret ++= (line + "\r\n")
    }
    
    ret
  }
  
  def main(args: Array[String]) {
    
    println("start")
    
    val time = System.currentTimeMillis()
 
    /* Map Start */
    
    val future1 = Future { Source.fromFile("ufo_awesome_1.tsv").getLines().toSeq }
    
    val future2 = future1 map { seq => 
      seq.map { _.length }
    }
    
    val result = Await.result(future2, Duration.Inf)
    println(s"total: ${result.reduce( _ + _)}")
    /* Map End */
    println(s"end and cost: ${System.currentTimeMillis() - time} ms")
  }
}

Output:

start
total: 75281071
end and cost: 1161 ms

eg: combining the contents of two files

package com.example

import scala.concurrent.Await
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.Future
import scala.concurrent.TimeoutException
import scala.concurrent.duration.Duration
import scala.io.Source
import scala.util.control.NonFatal
import scala.util.Success
import scala.util.Failure

object FutureTest {
  
  def readFile(file: String): StringBuilder = {
    val ret = new StringBuilder
    
    Source.fromFile(file).getLines() foreach { line =>
      ret ++= (line + "\r\n")
    }
    
    ret
  }
  
  def main(args: Array[String]) {
    
    println("start")
    
    val time = System.currentTimeMillis()    
    
    /* flatMap (for) start */
    
    val future1 = Future { readFile("ufo_awesome_1.tsv") }
    val future2 = Future { readFile("ufo_awesome_2.tsv") }
    
    val future3 = for (sb1 <- future1; sb2 <- future2) yield {
      sb1.toString + "\r\n" + sb2.toString
    }
   
    val result = Await.result(future3, Duration.Inf)
    println(s"total: ${result.length()}")
    
    /* flatMap (for) end */
    
    
    println(s"end and cost: ${System.currentTimeMillis() - time} ms")
  }
}
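The for expression over two Futures is translated into flatMap and map. The sketch below writes both forms side by side with small string jobs in place of the file reads; FutureFlatMapDemo is a hypothetical name.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object FutureFlatMapDemo {
  def run(): (String, String) = {
    // Both Futures are created first, so they run concurrently.
    val f1 = Future { "first" }
    val f2 = Future { "second" }

    // for-yield form, as in the example above.
    val viaFor = for (a <- f1; b <- f2) yield a + "+" + b

    // The equivalent explicit form.
    val viaFlatMap = f1.flatMap(a => f2.map(b => a + "+" + b))

    (Await.result(viaFor, 5.seconds), Await.result(viaFlatMap, 5.seconds))
  }
}
```

If the Futures were created inside the for expression instead, the second would start only after the first completed, so creating them beforehand matters when the jobs are independent.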

Scala Error Handle (Session 8)

Scala Error Handle

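In a functional language, an exception is a side effect. In practice, anything involving I/O has to deal with exceptions. Besides keeping Java's try-catch-finally mechanism, Scala provides three further ways to handle exceptions (Try, Catch, and Either); combined with Option, they let a program concentrate on processing data.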

Java: try - catch - finally

The usual form:

try {
  ...
} catch {
    case ex: Exception =>
      ...
} finally {
  ...
}

Never write it like this:

try {
  ...
} catch {
  case _ =>
    ...
} finally {
 ...
}

try {
  ...
} catch {
  case _: Throwable =>
    ...
} finally {
  ...
}

The lazy way: use NonFatal

import scala.util.control.NonFatal

try {
  ...
} catch {
  case NonFatal(_) =>
    ...
} finally {
  ...
}

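Q: Why not use case _ => or case _: Throwable =>?

A: Throwable has two subclasses, Exception and Error. In Java we normally handle Exception; an Error is usually raised when something has gone seriously wrong in the JVM, such as OutOfMemoryError. In that case the JVM should be allowed to terminate rather than carry on.

NonFatal does not match the following:

  • VirtualMachineError (including OutOfMemoryError and other fatal errors)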
  • ThreadDeath
  • InterruptedException
  • LinkageError
  • ControlThrowable
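NonFatal can also be used as an ordinary pattern, which makes the list above easy to check. A minimal sketch, where classify is a hypothetical helper:

```scala
import scala.util.control.NonFatal

object NonFatalDemo {
  // NonFatal matches only the throwables that are considered safe to handle.
  def classify(t: Throwable): String = t match {
    case NonFatal(_) => "recoverable"
    case _           => "fatal"
  }
}
```

For example, an IllegalArgumentException is classified as recoverable, while an InterruptedException or an OutOfMemoryError falls through to the fatal case.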

Try

Try is similar to Option and has two subclasses, Success and Failure. It returns a Success when no exception was thrown and a Failure otherwise.

Example:

scala> import scala.util.Try
import scala.util.Try
scala> def parseInt(value: String) = Try { value.toInt }
parseInt: (value: String)scala.util.Try[Int]

Commonly used Try functions:

  • map
scala> val t1 = parseInt("1000") map { _ * 2 }
t1: scala.util.Try[Int] = Success(2000)

scala> for (t1 <- parseInt("1000")) yield t1
res0: scala.util.Try[Int] = Success(1000)


scala> val t2 = parseInt("abc") map { _ * 2 }
t2: scala.util.Try[Int] = Failure(java.lang.NumberFormatException: For input string: "abc")

scala> for (t2 <- parseInt("abc")) yield t2
res1: scala.util.Try[Int] = Failure(java.lang.NumberFormatException: For input string: "abc")
  • recover
scala> import scala.util.control.NonFatal
import scala.util.control.NonFatal

scala> t1 recover { case NonFatal(_) => -1 }
res5: scala.util.Try[Int] = Success(2000)

scala> t2 recover { case NonFatal(_) => -1 }
res6: scala.util.Try[Int] = Success(-1)
  • toOption
scala> t1.toOption
res7: Option[Int] = Some(2000)

scala> t2.toOption
res8: Option[Int] = None
  • getOrElse
scala> t1.getOrElse(-1)
res9: Int = 2000

scala> t2.getOrElse(-1)
res10: Int = -1
  • Note 1: Try uses NonFatal to decide which exceptions to catch.
  • Note 2: Option and Try are both ADTs (algebraic data types).
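Because Try has map and flatMap, several Try values can be combined in one for expression, and the first Failure short-circuits the rest. A minimal sketch, where add is a hypothetical helper built on the parseInt shown above:

```scala
import scala.util.Try

object TryDemo {
  def parseInt(value: String): Try[Int] = Try(value.toInt)

  // Succeeds only if both inputs parse; otherwise carries the first Failure.
  def add(a: String, b: String): Try[Int] =
    for {
      x <- parseInt(a)
      y <- parseInt(b)
    } yield x + y
}
```

TryDemo.add("1", "2") is a Success holding 3, while TryDemo.add("1", "x") is a Failure holding the NumberFormatException, so getOrElse can supply a default.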

Catch

Catch handles the catch and finally parts, and is used together with Option or Either to deal with exceptions.

scala> import scala.util.control.Exception._
import scala.util.control.Exception._

scala> def parseInt(value: String) = nonFatalCatch[Int] opt { value.toInt }
parseInt: (value: String)Option[Int]

scala> def parseInt(value: String) = nonFatalCatch[Int] either { value.toInt }
parseInt: (value: String)scala.util.Either[Throwable,Int]

scala> def parseInt(value: String) = nonFatalCatch[Int] andFinally { println("finally") } opt { value.toInt }
parseInt: (value: String)Option[Int]

scala> def parseInt(value: String) = nonFatalCatch[Int] andFinally { println("finally") } opt { println("begin"); value.toInt }
parseInt: (value: String)Option[Int]

scala> parseInt("abc")
begin
finally
res2: Option[Int] = None

scala> parseInt("123")
begin
finally
res3: Option[Int] = Some(123)

scala> def parseInt(value: String) = catching(classOf[Exception]) opt { value.toInt }
parseInt: (value: String)Option[Int]

scala> parseInt("456")
res5: Option[Int] = Some(456)

Either

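Either lets a function return values of two different types. Either has two subclasses, Right and Left. Use match-case to check whether a Right or a Left came back, and from that whether the call succeeded or failed (by convention, Right holds the success value).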

scala> def parseInt(value: String) = try { Right(value.toInt) } catch { case ex: Exception => Left(value) } 

parseInt: (value: String)Product with Serializable with scala.util.Either[String,Int]

scala> parseInt("123") match {
     | case Right(v) => println(s"success ${v}")
     | case Left(s) => println(s"failure ${s}")
     | }
success 123

scala> parseInt("abc") match {
     | case Right(v) => println(s"success ${v}")
     | case Left(s) => println(s"failure ${s}")
     | }
failure abc
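Declaring the return type explicitly avoids the inferred Product with Serializable with Either[String,Int] seen above, and fold is an alternative to match-case for handling both sides. A minimal sketch, where describe and the error message text are assumptions for illustration:

```scala
object EitherDemo {
  // The explicit return type keeps the signature as plain Either[String, Int].
  def parseInt(value: String): Either[String, Int] =
    try Right(value.toInt)
    catch { case _: NumberFormatException => Left(s"not a number: $value") }

  // fold takes one function for Left and one for Right.
  def describe(value: String): String =
    parseInt(value).fold(
      err => s"failure ($err)",
      n => s"success ($n)"
    )
}
```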

Map & Reduce in Scala Collection (Session 7)

Map & Reduce in Scala Collection

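What's Map & Reduce

Map applies a given function to every element of a collection to produce a new value (which may itself be a collection).

Properties of Map:

  • Every element is processed independently of the other elements in the collection.
  • The kind of collection returned is the same as the input. For example, map on a List returns a List, although the element type depends on the function applied.
  • The elements of the original collection are left unchanged; a new collection is returned.

Further reading: flatMap

Reduce combines the elements of a collection into a single value whose type is the element type or a supertype of it.

Properties of Reduce:

  • The combining function is first applied to two elements.
  • Each result is then combined with the next element, and so on, until every element has been processed.
  • The order of processing is not guaranteed to be the one you expect, so if the function is not associative the result may not be what you expect.
  • The final result is returned; its type is the element type or a supertype.

Note: associativity means (x + y) + z = x + (y + z). Of the four arithmetic operations, \(+\) and \(\times\) are associative, while \(-\) and \(\div\) are not.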

Further reading: reduceLeft & reduceRight

Example: given a List of strings, compute the total of the string lengths.
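The effect of associativity is easy to see by fixing the order with reduceLeft and reduceRight. A minimal sketch with a hypothetical ReduceOrderDemo object:

```scala
object ReduceOrderDemo {
  val xs: List[Int] = List(1, 2, 3, 4)

  // Addition is associative: both groupings give the same answer.
  val sumLeft: Int = xs.reduceLeft(_ + _)   // ((1 + 2) + 3) + 4
  val sumRight: Int = xs.reduceRight(_ + _) // 1 + (2 + (3 + 4))

  // Subtraction is not: the grouping changes the answer.
  val diffLeft: Int = xs.reduceLeft(_ - _)   // ((1 - 2) - 3) - 4
  val diffRight: Int = xs.reduceRight(_ - _) // 1 - (2 - (3 - 4))
}
```

Both sums are 10, but the differences are -8 and -2, which is why a non-associative function should not be passed to plain reduce.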

scala> val lst = List("ABC", "Hello, World!!!", "Apple", "Microsoft")
lst: List[String] = List(ABC, Hello, World!!!, Apple, Microsoft)

scala> val lenLst = lst map { _.length }
lenLst: List[Int] = List(3, 15, 5, 9)

scala> val total = lenLst reduce { _ + _ }
total: Int = 32

scala> val lst = List("ABC", "Hello, World!!!", "Apple", "Microsoft")
lst: List[String] = List(ABC, Hello, World!!!, Apple, Microsoft)

scala> lst map { _.length } reduce { _ + _ }
res7: Int = 32

The map-reduce concept from functional languages was later applied in Hadoop's MapReduce.

Map, flatMap & for-yield

Scala's for-yield is in fact translated into flatMap and map calls.

Examples:

Single loop

scala> for (i <- 1 to 9) yield i + 1
res5: scala.collection.immutable.IndexedSeq[Int] = Vector(2, 3, 4, 5, 6, 7, 8, 9, 10)

scala> (1 to 9) map { _ + 1 }
res6: scala.collection.immutable.IndexedSeq[Int] = Vector(2, 3, 4, 5, 6, 7, 8, 9, 10)

Nested loop (two levels)

scala> for (i <- 1 to 9; j <- 1 to i) yield i * j
res7: scala.collection.immutable.IndexedSeq[Int] = Vector(1, 2, 4, 3, 6, 9, 4, 8, 12, 16, 5, 10, 15, 20, 25, 6, 12, 18, 24, 30, 36, 7, 14, 21, 28, 35, 42, 49, 8, 16, 24, 32, 40, 48, 56, 64, 9, 18, 27, 36, 45, 54, 63, 72, 81)

scala> (1 to 9) flatMap { i => (1 to i) map { i * _ } }
res8: scala.collection.immutable.IndexedSeq[Int] = Vector(1, 2, 4, 3, 6, 9, 4, 8, 12, 16, 5, 10, 15, 20, 25, 6, 12, 18, 24, 30, 36, 7, 14, 21, 28, 35, 42, 49, 8, 16, 24, 32, 40, 48, 56, 64, 9, 18, 27, 36, 45, 54, 63, 72, 81)

Option handling

Single:

scala> val s1 = Some("s1")
s1: Some[String] = Some(s1)

scala> for (s <- s1) yield s + s
res11: Option[String] = Some(s1s1)

scala> s1 map { s => s + s }
res13: Option[String] = Some(s1s1)

Multiple:

scala> val s1 = Option("s1")
s1: Option[String] = Some(s1)

scala> val s2 = None
s2: None.type = None

scala> for (a <- s1; b <- s2) yield (a, b)
res15: Option[(String, Nothing)] = None

scala> s1 flatMap { a => s2 map { b => (a, b) } }
res17: Option[(String, Nothing)] = None

scala> for (a <- s1; b <- s2) yield a + b
res16: Option[String] = None

scala> s1 flatMap { a => s2 map { b => a + b } }
res18: Option[String] = None

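flatMap and for-yield are well suited to AND situations, where every step must succeed.

Example: every Store belongs to another Store; look up the name of the Store it belongs to.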

scala> case class Store(id: Int, name: String, belong: Int)
defined class Store

scala> val map = Map(0 -> Store(0, "3C", 0))
map: scala.collection.immutable.Map[Int,Store] = Map(0 -> Store(0,3C,0))

if-else version

val s1 = map.get(0)

scala> if (s1.isDefined) {
     | val s2 = map.get(s1.get.belong)
     | if (s2.isDefined) Some(s2.get.name)
     | else None
     | } else None
res6: Option[String] = Some(3C)

for-yield version

scala> for (s1 <- map.get(0); s2 <- map.get(s1.belong)) yield s2.name
res0: Option[String] = Some(3C)

Collection functions

fold, foldLeft, foldRight

fold is very similar to reduce, except that fold also lets you specify an initial value.

scala> val lst = List(1, 2, 3, 4, 5, 6, 7, 8, 9)
lst: List[Int] = List(1, 2, 3, 4, 5, 6, 7, 8, 9)

scala> lst reduce { _ + _ }
res20: Int = 45

scala> lst.fold(100) { _ + _ }
res25: Int = 145

scan, scanLeft, scanRight

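scan takes an initial value. The first element is combined with the initial value, that result is combined with the second element, and so on. It returns a collection of the original kind, with the initial value as its first element. It resembles map, except that map processes each element independently whereas each step of scan depends on the previous result.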

scala> val lst = List(1, 2, 3)
lst: List[Int] = List(1, 2, 3)

scala> lst map { _ + 1 }
res26: List[Int] = List(2, 3, 4)

scala> lst.scan(10) { (a, b) => println(a, b); a + b  }
(10,1)
(11,2)
(13,3)
res30: List[Int] = List(10, 11, 13, 16)
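The Left variants fix the processing order, and foldLeft additionally allows the result type to differ from the element type. A minimal sketch with a hypothetical FoldScanDemo object:

```scala
object FoldScanDemo {
  val words: List[String] = List("ABC", "Apple")

  // foldLeft: the accumulator is an Int although the elements are Strings.
  val totalLength: Int = words.foldLeft(0)((acc, s) => acc + s.length)

  // scanLeft: like foldLeft, but keeps every intermediate result,
  // starting with the initial value.
  val runningTotals: List[Int] = List(1, 2, 3).scanLeft(10)(_ + _)
}
```

Here totalLength is 3 + 5 and runningTotals reproduces the intermediate values printed by the scan example above.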

groupBy

groupBy uses the return value of the given function as a key and groups the elements by that key. It returns a Map whose values have the same collection type as the original.

scala> val lst = List(1, 2, 3, 4, 5, 6, 7, 8, 9)
lst: List[Int] = List(1, 2, 3, 4, 5, 6, 7, 8, 9)

scala> lst groupBy { _ % 3 }
res31: scala.collection.immutable.Map[Int,List[Int]] = Map(2 -> List(2, 5, 8), 1 -> List(1, 4, 7), 0 -> List(3, 6, 9))

scala> val arr = Array(1, 2, 3, 4, 5, 6, 7, 8, 9)
arr: Array[Int] = Array(1, 2, 3, 4, 5, 6, 7, 8, 9)

scala> arr groupBy { _ % 3 }
res32: scala.collection.immutable.Map[Int,Array[Int]] = Map(2 -> Array(2, 5, 8), 1 -> Array(1, 4, 7), 0 -> Array(3, 6, 9))
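A common follow-up is to map over the groups, for example to count them. A minimal sketch with hypothetical sample words:

```scala
object GroupByDemo {
  val words: List[String] =
    List("apple", "banana", "avocado", "blueberry", "cherry")

  // Key: the first character; value: the words starting with it.
  val byInitial: Map[Char, List[String]] = words.groupBy(_.head)

  // Replace each group with its size.
  val counts: Map[Char, Int] = byInitial.map { case (k, v) => k -> v.size }
}
```

Within each group the elements keep their original relative order, although the order of the keys in the resulting Map is not specified.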

Zip

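zip pairs up the elements of two collections one-to-one into two-element tuples, much like a zipper. Elements without a partner in the other collection are dropped.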

scala> val lst1 = List("a", "b", "c", "d")
lst1: List[String] = List(a, b, c, d)

scala> val lst2 = List(1, 2, 3)
lst2: List[Int] = List(1, 2, 3)

scala> lst1 zip lst2
res0: List[(String, Int)] = List((a,1), (b,2), (c,3))

scala> lst1.zipWithIndex
res2: List[(String, Int)] = List((a,0), (b,1), (c,2), (d,3))

Appendix

Scala Variance & Bounds

Variance

Variance addresses the question:

If \(T_1\) is a subclass of \(T\), is Container[\(T_1\)] a subclass of Container[\(T\)]?

Variance        Meaning                                     Scala notation
covariant       C[\(T_1\)] is a subclass of C[\(T\)]        [+T]
contravariant   C[\(T\)] is a subclass of C[\(T_1\)]        [-T]
invariant       C[\(T_1\)] and C[\(T\)] are not related     [T]

Example: Covariant

scala> class Covariant[+A]
defined class Covariant

scala> val cv: Covariant[AnyRef] = new Covariant[String]
cv: Covariant[AnyRef] = Covariant@1fc2b765

scala> val cv: Covariant[String] = new Covariant[AnyRef]
<console>:8: error: type mismatch;
 found   : Covariant[AnyRef]
 required: Covariant[String]
       val cv: Covariant[String] = new Covariant[AnyRef]

Example: Contravariant

scala> class Contravariant[-A]
defined class Contravariant

scala> val cv: Contravariant[String] = new Contravariant[AnyRef]
cv: Contravariant[String] = Contravariant@7bc1a03d

scala> val cv: Contravariant[AnyRef] = new Contravariant[String]
<console>:8: error: type mismatch;
 found   : Contravariant[String]
 required: Contravariant[AnyRef]
       val cv: Contravariant[AnyRef] = new Contravariant[String]

Examples from: Twitter's Scala School - Type & polymorphism basics

Mnemonic:

\[ \forall T_1 \in +T, \text{then }T_1 \text{ is subclass of } T \]

\[ \forall T_1 \in -T, \text{then }T_1 \text{ is superclass of } T \]

\[ \begin{array}{c} -T \\ \uparrow \\ T \\ \uparrow \\ +T \end{array} \]

Contravariant example: Function1[-T, +R]

eg:

scala> class Test[+A] { def test(a: A): String = a.toString }
<console>:7: error: covariant type A occurs in contravariant position in type A of value a
       class Test[+A] { def test(a: A): String = a.toString }

fix:

scala> class Test[+A] { def test[B >: A](b: B): String = b.toString }
defined class Test
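The same lower-bound trick is what lets a covariant container offer methods that accept new elements. The sketch below is an illustration under assumed names (Animal, Cat, Dog, Box, and replace are all hypothetical).

```scala
object VarianceDemo {
  sealed trait Animal { def name: String }
  final case class Cat(name: String) extends Animal
  final case class Dog(name: String) extends Animal

  // Covariant in A: a Box[Cat] can be used where a Box[Animal] is expected.
  final case class Box[+A](value: A) {
    // A may not appear as a parameter type, so accept any supertype B
    // and return a Box of that wider type.
    def replace[B >: A](other: B): Box[B] = Box(other)
  }

  val cats: Box[Cat] = Box(Cat("Tom"))
  val animals: Box[Animal] = cats                    // allowed by covariance
  val mixed: Box[Animal] = cats.replace(Dog("Rex"))  // B is inferred as Animal
}
```

Replacing the content of a Box[Cat] with a Dog is type-safe because the result is typed as Box[Animal], not Box[Cat].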

Bounds

Lower Type Bound

A >: B means A is a superclass of B; B is the lower bound.

Upper Type Bound

A <: B means A is a subclass of B; B is the upper bound.
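An upper bound is typically used when a generic function needs to call methods of the bound. A minimal sketch, where maxOf is a hypothetical helper that relies on java.lang.Comparable:

```scala
object BoundsDemo {
  // A must be a subtype of Comparable[A], so compareTo is available.
  def maxOf[A <: Comparable[A]](x: A, y: A): A =
    if (x.compareTo(y) >= 0) x else y
}
```

It works for any type that implements Comparable of itself, such as String or java.lang.Integer, and is rejected at compile time for types that do not.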


Case Class and Pattern Match (Session 6)

Case Class and Pattern Match

Case Class

Declaration:

case class Person(name: String, age: Int)

After compilation there are two class files: Person.class and Person$.class

scalap -private Person:

case class Person(name: scala.Predef.String, age: scala.Int) extends scala.AnyRef with scala.Product with scala.Serializable {
  val name: scala.Predef.String = { /* compiled code */ }
  val age: scala.Int = { /* compiled code */ }
  def copy(name: scala.Predef.String, age: scala.Int): Person = { /* compiled code */ }
  override def productPrefix: java.lang.String = { /* compiled code */ }
  def productArity: scala.Int = { /* compiled code */ }
  def productElement(x$1: scala.Int): scala.Any = { /* compiled code */ }
  override def productIterator: scala.collection.Iterator[scala.Any] = { /* compiled code */ }
  def canEqual(x$1: scala.Any): scala.Boolean = { /* compiled code */ }
  override def hashCode(): scala.Int = { /* compiled code */ }
  override def toString(): java.lang.String = { /* compiled code */ }
  override def equals(x$1: scala.Any): scala.Boolean = { /* compiled code */ }
}
object Person extends scala.runtime.AbstractFunction2[scala.Predef.String, scala.Int, Person] with scala.Serializable {
  def this() = { /* compiled code */ }
  final override def toString(): java.lang.String = { /* compiled code */ }
  def apply(name: scala.Predef.String, age: scala.Int): Person = { /* compiled code */ }
  def unapply(x$0: Person): scala.Option[scala.Tuple2[scala.Predef.String, scala.Int]] = { /* compiled code */ }
  private def readResolve(): java.lang.Object = { /* compiled code */ }
}

javap -p Person

Compiled from "Person.scala"
public class Person implements scala.Product,scala.Serializable {
  private final java.lang.String name;
  private final int age;
  public static scala.Option<scala.Tuple2<java.lang.String, java.lang.Object>> unapply(Person);
  public static Person apply(java.lang.String, int);
  public static scala.Function1<scala.Tuple2<java.lang.String, java.lang.Object>, Person> tupled();
  public static scala.Function1<java.lang.String, scala.Function1<java.lang.Object, Person>> curried();
  public java.lang.String name();
  public int age();
  public Person copy(java.lang.String, int);
  public java.lang.String copy$default$1();
  public int copy$default$2();
  public java.lang.String productPrefix();
  public int productArity();
  public java.lang.Object productElement(int);
  public scala.collection.Iterator<java.lang.Object> productIterator();
  public boolean canEqual(java.lang.Object);
  public int hashCode();
  public java.lang.String toString();
  public boolean equals(java.lang.Object);
  public Person(java.lang.String, int);
}

javap -p Person$

Compiled from "Person.scala"
public final class Person$ extends scala.runtime.AbstractFunction2<java.lang.String, java.lang.Object, Person> implements scala.Serializable {
  public static final Person$ MODULE$;
  public static {};
  public final java.lang.String toString();
  public Person apply(java.lang.String, int);
  public scala.Option<scala.Tuple2<java.lang.String, java.lang.Object>> unapply(Person);
  private java.lang.Object readResolve();
  public java.lang.Object apply(java.lang.Object, java.lang.Object);
  private Person$();
}

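Key points to note:

  • As the scalap and javap output shows, declaring a case class produces two classes.
  • Creating a case class instance actually calls the apply function of the companion object (singleton).
  • The constructor parameters of a case class automatically become read-only member data.
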
scala> case class Person(name: String, age: Int)
defined class Person

scala> val p1 = Person("abc", 10)
p1: Person = Person(abc,10)

The functions directly related to pattern matching are apply and unapply. Taking Person as an example:

def apply(name: scala.Predef.String, age: scala.Int): Person = { /* compiled code */ }

def unapply(x$0: Person): scala.Option[scala.Tuple2[scala.Predef.String, scala.Int]] = { /* compiled code */ }
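
The following is a minimal sketch of how these generated members are used in practice. The object name CaseClassSketch and the value names are assumptions made for illustration; only Person comes from the example above.

```scala
object CaseClassSketch {
  case class Person(name: String, age: Int)

  // Person("abc", 10) is shorthand for the companion object's apply.
  val p1: Person = Person.apply("abc", 10)

  // Constructor parameters are read-only fields (p1.age = 11 would not compile).
  val name: String = p1.name

  // copy builds a new instance instead of mutating the old one.
  val older: Person = p1.copy(age = 11)

  // A constructor pattern is the inverse of apply: it takes the value apart again.
  val (n, a) = p1 match { case Person(n0, a0) => (n0, a0) }

  def main(args: Array[String]): Unit =
    println(s"$p1 $older $n $a")
}
```

The pattern is used here instead of calling unapply directly because the exact signature of the generated unapply depends on the Scala version.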

Pattern Match

A pattern match example using the Person above:

scala> p1 match {
     | case Person(n, a) => println(n, a)
     | case _ => println("not match")
     | }
(abc,10)

Extractor

A class or object that defines either of the following functions can be called an Extractor:

  • unapply
  • unapplySeq

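Functions of this kind are called extractions; conversely, apply is called an injection.

An Extractor only needs to implement unapply or unapplySeq; apply is optional. When the pattern binds no values (as in UpperCase() below), unapply returns Boolean; otherwise it returns an Option of the extracted values.

unapplySeq is used for a variable number of extracted values, analogous to a varargs parameter such as func(lst: String*).

An Extractor can be an object or a class. A class instance can store the conditions it was created with, whereas an object cannot (an object is a singleton, so it cannot hold different matching conditions for each use).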
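
As a rough illustration of these rules, the sketch below defines one object extractor with unapplySeq and one class extractor whose unapply returns Boolean. The names Words, MultipleOf, firstWord and classify are hypothetical and exist only for this sketch.

```scala
object ExtractorSketch {
  // Object extractor with unapplySeq: splits a string into any number of words.
  object Words {
    def unapplySeq(s: String): Option[Seq[String]] =
      Some(s.trim.split("\\s+").toSeq)
  }

  // Class extractor: each instance remembers its own divisor.
  // It binds nothing, so unapply returns Boolean.
  class MultipleOf(n: Int) {
    def unapply(x: Int): Boolean = x % n == 0
  }

  val even = new MultipleOf(2)
  val triple = new MultipleOf(3)

  def firstWord(s: String): String = s match {
    case Words(first, _*) => first
    case _ => ""
  }

  def classify(x: Int): String = x match {
    case even() => "multiple of 2"
    case triple() => "multiple of 3"
    case _ => "other"
  }
}
```

Because MultipleOf is a class, even and triple carry different conditions; a single object could not do that.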

Pattern, Extractor, and Binding

Extractor only with extraction and binding

package com.example

object EMail {

  /* Injection */
  def apply(u: String, d: String) = s"${u}@${d}"

  /* Extraction */
  def unapply(s: String): Option[(String, String)] = {
    println("EMail.unapply")
    val parts = s.split("@")
    if (parts.length == 2) Some((parts(0), parts(1))) else None
  }
}

/* Extraction */
object UpperCase {
  def unapply(s: String): Boolean = {
    println("UpperCase.unapply")
    s == s.toUpperCase()
  }
}

object PatternTest {

  def main(args: Array[String]): Unit = {
    "Test@test.com" match {
      case EMail(user @ UpperCase(), domain) => println(user, domain) /* Note: the () (parentheses) after UpperCase are required */
      case _ => println("not match")
    }

    "TEST@test.com" match {
      case EMail(user @ UpperCase(), domain) => println(user, domain)
      case _ => println("not match")
    }
  }

}

Output:

EMail.unapply
UpperCase.unapply
not match
EMail.unapply
UpperCase.unapply
(TEST,test.com)

When the pattern match reaches case EMail, it calls EMail.unapply(s: String) to check for a match; if that succeeds, it then calls UpperCase.unapply(s: String) on the extracted user part.

Test@test.com yields not match because UpperCase.unapply returns false, while TEST@test.com yields (TEST,test.com).

Excerpted from: Programming in Scala: A Comprehensive Step-by-Step Guide, 2nd Edition
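
To make the call order concrete, here is a hand-written approximation of what the nested pattern does, with the unapply calls spelled out. It is a sketch of the idea and not the code the compiler actually generates; the object name UnapplySketch and the function check are assumptions.

```scala
object UnapplySketch {
  object EMail {
    def unapply(s: String): Option[(String, String)] = {
      val parts = s.split("@")
      if (parts.length == 2) Some((parts(0), parts(1))) else None
    }
  }

  object UpperCase {
    def unapply(s: String): Boolean = s == s.toUpperCase
  }

  // Roughly equivalent to: case EMail(user @ UpperCase(), domain) => ...
  // EMail.unapply runs first; UpperCase.unapply runs only if it succeeded.
  def check(s: String): String =
    EMail.unapply(s) match {
      case Some((user, domain)) if UpperCase.unapply(user) => s"($user,$domain)"
      case _ => "not match"
    }
}
```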

Extractor with variable arguments

/* Extraction Only*/
class Between(val min: Int, val max: Int) {

  def unapplySeq(value: Int): Option[List[Int]] =
    if (min <= value && value <= max) Some(List(min, value, max))
    else None
}
    
object PatternTest {

  def main(args: Array[String]): Unit = {
    val between5and15 = new Between(5, 15)

    10 match {
      case between5and15(min, value, max) => println(value)
      case _ => println("not match")
    }

    20 match {
      case between5and15(min, value, max) => println(value)
      case _ => println("not match")
    }
  }
}

Output:

10
not match

Because Between.unapplySeq returns List(min, value, max), the pattern must be a sequence pattern over that list, such as (min, value, max), (_, value, max), or (min, _*).
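
The alternative pattern shapes can be tried out with a small sketch like the one below. The function describe and the guard value 12 are assumptions chosen for illustration; Between is repeated so the block stands on its own.

```scala
object BetweenPatterns {
  class Between(val min: Int, val max: Int) {
    def unapplySeq(value: Int): Option[List[Int]] =
      if (min <= value && value <= max) Some(List(min, value, max)) else None
  }

  val between5and15 = new Between(5, 15)

  def describe(x: Int): String = x match {
    // Ignore min, bind value, and require the last element to be the literal 15.
    case between5and15(_, value, 15) if value > 12 => s"upper part: $value"
    // Bind only the first element and ignore the rest of the list.
    case between5and15(min, _*) => s"in range, min = $min"
    case _ => "not match"
  }
}
```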

Extractor with binding

class Between(val min: Int, val max: Int) {

  def unapplySeq(value: Int): Option[List[Int]] =
    if (min <= value && value <= max) Some(List(min, value, max))
    else None
}

object PatternTest {

  def main(args: Array[String]): Unit = {
    val between5and15 = new Between(5, 15)

    (50, 10) match {
      case (n @ between5and15(_*), _) => println("first match " + n)
      case (_, m @ between5and15(_*)) => println("second match " + m)
      case _ => println("not match")
    }
  }

}

Output:

second match 10

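When an Extractor is used with a binding, the pattern after @ must still match (e.g. between5and15(_*)); if it is written incorrectly, the match fails. For example, changing (_, m @ between5and15(_*)) to (_, m @ between5and15()) makes the match fail even though m (m = 10) lies between 5 and 15, because the empty pattern expects an empty list while unapplySeq returns three elements.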

Pattern and Regex

Scala's Regex implements unapplySeq, so regular expressions work very well with pattern matching.

object RegexTest {

  def main(args: Array[String]): Unit = {
    val digits = """(\d+)-(\d+)""".r

    "123-456" match {
      case digits(a, b) => println(a, b)
      case _ => println("not match")
    }

    "123456" match {
      case digits(a, b) => println(a, b)
      case _ => println("not match")
    }

    "abc-456" match {
      case digits(a, b) => println(a, b)
      case _ => println("not match")
    }
  }
}

Output:

(123,456)
not match
not match

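Because digits uses groups, the pattern is digits(a, b). If val digits = """(\d+)-(\d+)""".r is changed to val digits = """\d+-\d+""".r, without groups, the pattern that can match changes as well (digits(a, b) -> digits()), so all three matches above become not match. The program has to be changed as follows to work correctly: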

val digits = """\d+-\d+""".r
  
"123-456" match {
  case digits() => println("ok")
  case _ => println("not match")
}
  

So when using Regex, prefer groups where possible; they give the design more flexibility.
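
Calling unapplySeq on the Regex directly shows why the pattern shape depends on the groups. This is a small sketch; the object and value names are assumptions.

```scala
object RegexUnapplySketch {
  val grouped = """(\d+)-(\d+)""".r
  val ungrouped = """\d+-\d+""".r

  // One list element per capturing group, so the pattern is digits(a, b).
  val withGroups: Option[List[String]] = grouped.unapplySeq("123-456")

  // No groups: a successful match yields an empty list, hence digits().
  val withoutGroups: Option[List[String]] = ungrouped.unapplySeq("123-456")

  // The whole input must match, otherwise the result is None.
  val noMatch: Option[List[String]] = grouped.unapplySeq("abc-456")
}
```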

Regex and Binding

val digits = """(\d+)-(\d+)-(\d+)""".r

("123-abc-789", "123-456-789") match {
  case (_ @ digits(a, _*), _) => println(a)
  case (_, _ @ digits(a, b, c)) => println(a, b, c)
  case _ => println("not match")
}

With a binding, the pattern still has to fit the groups, e.g. digits(a, _*) or digits(a, b, c).

Case Class, Pattern Match and Algebraic Data Type

sealed trait Tree

case object Empty extends Tree
case class Leaf(value: Int) extends Tree
case class Node(left: Tree, right: Tree) extends Tree

object TreeTest {

  val depth: PartialFunction[Tree, Int] = {
    case Empty => 0
    case Leaf(value) => 1
    case Node(l, r) => 1 + Seq(depth(l), depth(r)).max
  }

  def max(tree: Tree): Int = tree match {
    case Empty => Int.MinValue
    case Leaf(value) => value
    case Node(l, r) => Seq(max(r), max(l)).max
  }

  def main(args: Array[String]): Unit = {
    
    val tree = Node(
          Node(
            Leaf(1),
            Node(Leaf(3), Leaf(4))
          ),
          Node(
              Node(Leaf(100), Node(Leaf(6), Leaf(7))),
              Leaf(2)
          )
        )
        
        
     println(depth(tree))
     
     println(max(tree))
  }
}

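Note: when sealed is used, all subclasses must be placed in the same source file as the parent, and the compiler issues a warning if a pattern match omits one of the subclasses.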
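
As a small sketch of the exhaustiveness check, the function below covers every subtype of a sealed Tree. The names SealedSketch, leafCount and sample are assumptions; Empty is written as a case object here, which is a choice of this sketch.

```scala
object SealedSketch {
  sealed trait Tree
  case object Empty extends Tree
  case class Leaf(value: Int) extends Tree
  case class Node(left: Tree, right: Tree) extends Tree

  // All three subtypes are covered. Deleting, say, the Empty case would make
  // the compiler warn that the match may not be exhaustive.
  def leafCount(tree: Tree): Int = tree match {
    case Empty => 0
    case Leaf(_) => 1
    case Node(l, r) => leafCount(l) + leafCount(r)
  }

  val sample: Tree = Node(Node(Leaf(1), Empty), Leaf(2))
}
```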

Example adapted from:

Further reading:

Wiki: Algebraic Data Type

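Introduction to Scala Functional Language (Session 5)

Introduction to Scala Functional Language

Functional languages have been around for a long time, but OOP concepts were easier for people to accept, so functional languages were largely overlooked. Recently, big data has required massive parallel and distributed computation, and functional languages have regained attention.

In short, a functional language: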

Treats computation as the evaluation of mathematical functions and avoids changing-state and mutable data (Side Effects)

Excerpted from Wiki: Functional Language

Terminology

Mathematical Functions

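Simply put, a mathematical function is a relation between two sets in which each input value maps to exactly one output value.

The simplest notation for a mathematical function: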

\[f: X \mapsto Y\]

For example, Square:

\[Square: Int \mapsto Int \] \[f(x) = x^2\]

For example, \(f(3) = 9\) and \(f(-3) = 9\); every evaluation of \(f(3)\) yields 9 and never any other value.

Side Effects

A function in a program is said to have Side Effects when it does any of the following:

  • Reassigning a variable (val vs. var)
  • Modifying a data structure in place (mutable vs. immutable)
  • Setting a field on an object (changing object state)
    • Object here means an OOP object, not a Scala object (singleton)
    • In OOP, modifying an object's values is called changing the object's state in programming-language terms, i.e. the changing-state mentioned above
  • Throwing an exception or halting with an error
  • Printing to the console or reading user input (I/O)
  • Reading from or writing to a file (I/O)
  • Drawing on the screen (I/O)

Excerpted from Functional Language in Scala

In practice, a program cannot avoid I/O entirely.

Pure Functions

Purely functional functions (or expressions) have no side effects (memory or I/O).

Excerpted from Wiki: Pure Function
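
A minimal sketch of the difference, assuming two hypothetical functions square and nextId that are not part of the original text:

```scala
object PuritySketch {
  // Pure: the result depends only on the argument.
  def square(x: Int): Int = x * x

  // Impure: reassigns a var, so the same call returns different values.
  private var counter = 0
  def nextId(): Int = {
    counter += 1
    counter
  }
}
```

Calling square(3) twice gives the same answer both times; calling nextId() twice does not, because each call changes counter.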

Referential Transparency (RT)

An expression is said to be referentially transparent if it can be replaced with its value without changing the behavior of a program

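Excerpted from Wiki: Referential Transparency

Simply put, a variable in the code can be replaced by its value or by the expression that defines it without changing the output.

For example, String is immutable, so every call to reverse returns the same value.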

scala> val x = "Hello, World"
x: String = Hello, World

scala> val r1 = x.reverse
r1: String = dlroW ,olleH

scala> val r2 = x.reverse
r2: String = dlroW ,olleH

Here x can be replaced directly with "Hello, World" and the program's result does not change. This property is Referential Transparency, as shown below:

scala> val r1 = "Hello, World".reverse
r1: String = dlroW ,olleH

scala> val r2 = "Hello, World".reverse
r2: String = dlroW ,olleH

As a counterexample, StringBuilder is mutable: append modifies the value inside the StringBuilder (changing object state).

scala> val x = new StringBuilder("Hello")
x: StringBuilder = Hello

scala> val y = x.append(", World")
y: StringBuilder = Hello, World

scala> val r1 = y.toString
r1: String = Hello, World

scala> val r2 = y.toString
r2: String = Hello, World

When y is replaced with x.append(", World"):

scala> val r1 = x.append(", World").toString
r1: String = Hello, World

scala> val r2 = x.append(", World").toString
r2: String = Hello, World, World

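Now r1 and r2 are not the same, so Referential Transparency does not hold.

Example excerpted from: Functional Language in Scala

Why Referential Transparency Matters

When a program is designed to be referentially transparent, its work can be distributed across different threads, CPU cores, or even different hosts (space), and no matter when each part is processed (time), the output is unaffected.

Referential Transparency is the ultimate goal of program design in a functional language.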
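
The sketch below illustrates the point with Scala's standard Future and its global thread pool: a pure function gives the same results whether it runs sequentially or on several threads. The names ParallelSketch, square and runParallel, and the 10-second timeout, are assumptions for this sketch.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object ParallelSketch {
  def square(x: Int): Int = x * x

  val inputs: List[Int] = List(1, 2, 3, 4, 5)

  // Evaluated one after another on the calling thread.
  val sequential: List[Int] = inputs.map(square)

  // Each call may run on a different thread of the global pool, in any order.
  // This is a method, not a val, so it never blocks during object initialization.
  def runParallel(): List[Int] =
    Await.result(Future.traverse(inputs)(x => Future(square(x))), 10.seconds)
}
```

Because square has no side effects, the scheduling order cannot change the outcome; with an impure function such as one that appends to a shared buffer, that guarantee would be lost.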

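First-Class Function and Higher-Order Function

First-Class Function

A programming language has first-class functions when it treats a function as a data type, so a function can be stored in a variable like any other value.

Scala defines Function classes for this (Function1, Function2, and so on). For example: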

scala> val max = (x: Int, y:Int) => if (x > y) x else y
max: (Int, Int) => Int = <function2>

scala> max(3, 4)
res5: Int = 4

Higher-Order Function

A higher-order function is a function that takes a function as one of its parameters. For example, List's foreach:

scala> List(1, 2, 3, 4) foreach { x => println(x + x) }
2
4
6
8
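
A user-defined higher-order function looks much the same. In this sketch, applyTwice and adder are hypothetical names; adder also shows a function being returned as a value.

```scala
object HigherOrderSketch {
  // Takes a function as a parameter.
  def applyTwice(f: Int => Int, x: Int): Int = f(f(x))

  // Returns a function.
  def adder(n: Int): Int => Int = (x: Int) => x + n

  // adder(3) is applied twice to 10: (10 + 3) + 3
  val result: Int = applyTwice(adder(3), 10)
}
```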

Function Composition

Function composition in mathematics:

\[f: X \mapsto Y\]

\[g: Y \mapsto Z\]

\[ g \circ f: X \mapsto Z\]

\[ (g \circ f )(x) = g(f(x))\]

In Scala this is implemented by compose and andThen.

f andThen g is equivalent to \( g \circ f \)

f compose g is equivalent to \( f \circ g \)

eg:

scala> val f = (x: Int) => x * x
f: Int => Int = <function1>

scala> val g = (x: Int) => x + 1
g: Int => Int = <function1>

scala> val goff = f andThen g
goff: Int => Int = <function1>

scala> goff(10)
res10: Int = 101

scala> val fofg = f compose g
fofg: Int => Int = <function1>

scala> fofg(10)
res11: Int = 121
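
The two equivalences can also be checked directly against the definition \( (g \circ f)(x) = g(f(x)) \). This is a small sketch; andThenAgrees and composeAgrees are hypothetical helper names.

```scala
object CompositionSketch {
  val f = (x: Int) => x * x
  val g = (x: Int) => x + 1

  // f andThen g applies f first, then g.
  def andThenAgrees(x: Int): Boolean = (f andThen g)(x) == g(f(x))

  // f compose g applies g first, then f.
  def composeAgrees(x: Int): Boolean = (f compose g)(x) == f(g(x))
}
```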

Converting to a Function Class

Functions are usually declared with def; a trailing _ converts such a method into a Function class instance, as shown below:

scala> def f(x: Int) = x * x
f: (x: Int)Int

scala> def g(x: Int) = x + 1
g: (x: Int)Int

scala> f andThen g
<console>:10: error: missing arguments for method f;
follow this method with `_' if you want to treat it as a partially applied function
              f andThen g
              ^
<console>:10: error: missing arguments for method g;
follow this method with `_' if you want to treat it as a partially applied function
              f andThen g
                        ^
                        
scala> f _ andThen g _
res1: Int => Int = <function1>
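
The conversion does not always have to be written out. In the sketch below (names are assumptions), the compiler converts the method on its own wherever a function type is expected, for example as an argument to map or when the val has an explicit function type.

```scala
object EtaExpansionSketch {
  def f(x: Int): Int = x * x
  def g(x: Int): Int = x + 1

  // Explicit conversion with the trailing underscore, as in the session above.
  val fFn: Int => Int = f _

  // Where a function type is expected, the compiler converts the method itself.
  val squares: List[Int] = List(1, 2, 3).map(f)
  val gFn: Int => Int = g

  val composed: Int => Int = fFn andThen gFn
}
```

The error in the session above arises because andThen is called on f itself, and a bare method name is not an expression of function type.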

Partially Applied Function

scala> def sum(x: Int, y: Int, z: Int) = x + y + z
sum: (x: Int, y: Int, z: Int)Int

scala> val a = sum _
a: (Int, Int, Int) => Int = <function3>

scala> val b = sum(1, _: Int, 3)
b: Int => Int = <function1>

scala> b(2)
res1: Int = 6
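
More than one argument can be left open, and the resulting function can be passed to other functions. A short sketch, where c and mapped are hypothetical names:

```scala
object PartialApplicationSketch {
  def sum(x: Int, y: Int, z: Int): Int = x + y + z

  // Fix the first and last argument; the middle one stays open.
  val b: Int => Int = sum(1, _: Int, 3)

  // Leave two arguments open: the result is a two-parameter function.
  val c: (Int, Int) => Int = sum(_: Int, 10, _: Int)

  // A partially applied function is an ordinary function value.
  val mapped: List[Int] = List(1, 2, 3).map(b)
}
```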

Closure

A function object that captures free variables, and is said to be “closed” over the variables visible at the time it is created.

Example:

scala> var more = 10
more: Int = 10

scala> val addMore = (x: Int) => x + more
addMore: Int => Int = <function1>

addMore is a closure. The variable more is a free variable; x is a bound variable.
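
A closure captures the variable itself, not a copy of its value at creation time. The sketch below shows this; the demo method is a hypothetical helper added for illustration.

```scala
object ClosureSketch {
  var more = 10
  val addMore = (x: Int) => x + more

  def demo(): (Int, Int) = {
    more = 10
    val before = addMore(1) // reads more = 10
    more = 100
    val after = addMore(1)  // sees the updated value of more
    (before, after)
  }
}
```

Since more is a var, addMore is not referentially transparent: the same call returns different results before and after the reassignment.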

Currying

Suppose a function maps two or more sets to one set:

\[f: X \times Y \mapsto Z\]

For example:

\[f(x, y) = \frac{y}{x}\]

We can define a curried function:

\[h(x) = y \mapsto f(x, y)\]

\(h(x)\) is a function whose input is x and whose return value is itself a function.

For example:

\[h(2) = y \mapsto f(2, y)\]

Here \( h(2) \) is the function:

\[ f(2, y) = \frac {y}{2} \]

We can then define another function \( g \):

\[g = h(2) = y \mapsto f(2, y)\]

That is,

\[g(y) = f(2, y) = \frac {y}{2}\]

The implementation in Scala:

Define \( f: X \times Y \mapsto Z \)

scala> def f(x: Int)(y: Int) = y / x
f: (x: Int)(y: Int)Int

scala> f(4)(2)
res7: Int = 0

scala> f(2)(4)
res8: Int = 2

Define \( g = h(2) = y \mapsto f(2, y) \), i.e. \[g(y) = f(2, y) = \frac {y}{2} \] (the REPL value h below plays the role of \( g \)):

scala> val h = f(2) _
h: Int => Int = <function1>

When \( y = 4 \):

\[g(4) = f(2, 4) = \frac {4} {2} = 2\]

scala> h(4)
res9: Int = 2

Example excerpted from Wiki: Currying
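
An existing two-argument function value can also be curried after the fact with the curried method of Function2. In this sketch the names f, h and g follow the mathematical notation above rather than the REPL session, which is a choice made here for illustration.

```scala
object CurryingSketch {
  // Uncurried form: f(x, y) = y / x
  val f: (Int, Int) => Int = (x, y) => y / x

  // curried turns (Int, Int) => Int into Int => Int => Int.
  val h: Int => Int => Int = f.curried

  // Fixing x = 2 gives g(y) = y / 2.
  val g: Int => Int = h(2)
}
```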

Another use case:

scala> def modN(n: Int)(x: Int) = ((x % n) == 0)
modN: (n: Int)(x: Int)Boolean

scala> val nums = List(1, 2, 3, 4, 5, 6, 7, 8)
nums: List[Int] = List(1, 2, 3, 4, 5, 6, 7, 8)

scala> nums filter { modN(2) }
res10: List[Int] = List(2, 4, 6, 8)

scala> nums filter { modN(3) }
res11: List[Int] = List(3, 6)

The corresponding mathematical notation:

\[modN: Int \times Int \mapsto Boolean\]

\[modN(n, x) \Rightarrow ((x \bmod n ) == 0)\]

\[mod2(x) = modN(2, x) \Rightarrow ((x \bmod 2) == 0)\]

\[mod3(x) = modN(3, x) \Rightarrow ((x \bmod 3) == 0)\]

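Example excerpted from: Scala Document: Currying

Scala Partial Function

A function is normally defined to handle every possible input value. For example:

def plusOne(x: Int) = x + 1

Every Int value passed to plusOne is handled.

A Partial Function, by contrast, handles only certain input values.

Definition: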

scala> val one: PartialFunction[Int, String] = { case 1 => "one" }
one: PartialFunction[Int,String] = <function1>

Usage: if the input is a value the function does not handle, an exception is thrown. Here 1 is defined but 2 is not, so passing 1 works while passing 2 throws a MatchError:

scala> one(1)
res0: String = one

scala> one(2)
scala.MatchError: 2 (of class java.lang.Integer)
  at scala.PartialFunction$$anon$1.apply(PartialFunction.scala:253)
  at scala.PartialFunction$$anon$1.apply(PartialFunction.scala:251)
  at $anonfun$1.applyOrElse(<console>:7)
  at $anonfun$1.applyOrElse(<console>:7)
  at scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:36)
  ... 33 elided

To check whether an input value is within the handled range:

scala> one.isDefinedAt(1)
res2: Boolean = true

scala> one.isDefinedAt(2)
res3: Boolean = false
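
Besides isDefinedAt, the standard library offers a few ways to use a partial function without risking a MatchError. The sketch below uses lift, collect and applyOrElse; the names lifted, collected and nameOf are assumptions.

```scala
object PartialFunctionSketch {
  val one: PartialFunction[Int, String] = { case 1 => "one" }

  // lift turns the partial function into a total one returning Option.
  val lifted: Int => Option[String] = one.lift

  // collect applies the function only to elements where it is defined.
  val collected: List[String] = List(1, 2, 1, 3).collect(one)

  // applyOrElse supplies a fallback for undefined inputs.
  def nameOf(x: Int): String = one.applyOrElse(x, (n: Int) => s"unknown($n)")
}
```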

Composition of Partial Function

Several Partial Functions can be combined into one composite function.

scala> val two: PartialFunction[Int, String] = { case 2 => "two" }
two: PartialFunction[Int,String] = <function1>

scala> val three: PartialFunction[Int, String] = { case 3 => "three" }
three: PartialFunction[Int,String] = <function1>

scala> val wildcard: PartialFunction[Int, String] = { case _ => "something else" }
wildcard: PartialFunction[Int,String] = <function1>

scala> val partial = one orElse two orElse three orElse wildcard
partial: PartialFunction[Int,String] = <function1>

scala> partial(5)
res4: String = something else

scala> partial(3)
res5: String = three

scala> partial(2)
res6: String = two

scala> partial(1)
res7: String = one

scala> partial(0)
res8: String = something else

scala> partial.isDefinedAt(10)
res9: Boolean = true

scala> partial.isDefinedAt(1000)
res10: Boolean = true

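Example excerpted from: Twitter Scala School - Partial Function

Summary

Moving to a functional language requires a change in thinking: designs that used to be handled with OO must be reconsidered in terms of whether they can be split into functions, and in particular those functions should satisfy the definition of a mathematical function or pure function. Design in a functional language relies more heavily on mathematical logic.

Further reading: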