[Golang] Concurrency and Goroutine

This post is a compilation of notes, not original work; the sources are listed in the references at the end.

TL;DR

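Execution mainly switches from one goroutine to another when the running goroutine blocks: once it is blocked, the scheduler lets other goroutines do work. (Since Go 1.14 the runtime can also preempt a goroutine that runs for a long time without blocking.)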

Clarifying Concepts

goroutines vs threads

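A goroutine is a lightweight thread managed by the Go runtime.
Goroutines run in the same address space, so access to shared memory must be synchronized.
When a Go program runs, the Go runtime creates several OS threads. When the thread running a goroutine blocks, the runtime switches to another thread to run other goroutines. This resembles thread scheduling, but it is handled by the Go runtime and is faster.
Take a traditional Apache server: when it has to handle 1000 requests per minute and every request is to run concurrently, it needs 1000 threads or has to spread the requests across processes. If each OS thread uses a 1 MB stack, that takes about 1 GB of memory to sustain the traffic. A goroutine's stack, in contrast, grows dynamically and starts at only 2 KB (since Go 1.4), so 1000 goroutines need roughly 2 MB of stack to begin with.
Since Go 1.5, the default number of CPUs that Go uses (GOMAXPROCS) is the number of logical CPUs available on the machine, as reported by runtime.NumCPU().
Using more CPUs does not necessarily improve performance, because the CPUs need more time to communicate and exchange data. runtime.GOMAXPROCS(n) changes the number of processors the Go runtime uses.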
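The points above about cheap goroutines and synchronized access to shared memory can be sketched as follows. This is a minimal illustration rather than a recommended design: `countWithMutex` is a hypothetical helper name, and a `sync.Mutex` is only one of several ways to synchronize (channels or `sync/atomic` would also work).

```go
package main

import (
	"fmt"
	"sync"
)

// countWithMutex is a hypothetical helper: it starts n goroutines that each
// increment a shared counter once, guarding the counter with a mutex.
func countWithMutex(n int) int {
	var (
		mu      sync.Mutex
		wg      sync.WaitGroup
		counter int
	)

	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			mu.Lock()
			counter++ // shared memory: only touched while holding mu
			mu.Unlock()
		}()
	}

	wg.Wait() // block until every goroutine has called Done
	return counter
}

func main() {
	fmt.Println(countWithMutex(1000)) // prints 1000
}
```

Without the mutex, the increments would race and the final count could come out below 1000.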

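OS thread	goroutine
Managed by the OS kernel and dependent on the hardware	Managed by the Go runtime and independent of the hardware
Usually has a fixed stack size of 1-2 MB	Stack starts at about 8 KB (2 KB since Go 1.4)
Stack size is fixed in advance and cannot grow	Stack size is managed at run time; by allocating and freeing heap storage it can grow up to 1 GB (on 64-bit systems)
No easy communication medium between threads, and communication tends to have high latency	Uses `channels` to communicate with other goroutines, with low latency
Has an identity: the TID distinguishes the threads within a process	Has no identity
Has setup and teardown costs: resources are requested from the OS and returned when the thread finishes	Created and destroyed inside the Go runtime, which is much cheaper than for OS threads because the runtime already maintains a thread pool for goroutines; the OS is not aware of goroutines
Must be scheduled by the OS, and switching between threads is expensive because the scheduler has to save and restore the thread's state	Context switches are very cheap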

Source: threads vs goroutines @ gist
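The growable stack described in the table can be observed with a deep recursion. The sketch below is only an illustration: `sumTo` is a hypothetical helper, and the recursion depth of 1000000 is an arbitrary value chosen to need far more than the initial few kilobytes of stack while staying well below the runtime's limit.

```go
package main

import "fmt"

// sumTo is a hypothetical, deliberately non-tail-recursive helper: each call
// adds a stack frame, so a large n forces the goroutine stack to grow.
func sumTo(n int) int {
	if n == 0 {
		return 0
	}
	return n + sumTo(n-1)
}

func main() {
	result := make(chan int)

	// This goroutine starts with a small stack; the runtime enlarges it
	// as the recursion deepens.
	go func() {
		result <- sumTo(1000000)
	}()

	fmt.Println(<-result) // prints 500000500000
}
```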

concurrency vs parallelism

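Concurrency means having many threads of execution in progress that do not necessarily run at the same instant; on a single CPU they make progress by switching between one another quickly.
Parallelism means many threads running code at the same instant, which requires multiple CPUs.
Concurrency and parallelism are different concepts, but a concurrent program can run in parallel when the hardware supports it.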
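To make the distinction concrete, the sketch below runs the same concurrent code twice: once with the runtime limited to a single CPU (concurrency without parallelism) and once with the previous setting restored. `squareAll` is a hypothetical helper; `runtime.GOMAXPROCS` returns the previous value, which is what makes restoring it possible.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// squareAll is a hypothetical helper: it squares each input in its own
// goroutine and waits for all of them to finish.
func squareAll(inputs []int) []int {
	results := make([]int, len(inputs))
	var wg sync.WaitGroup

	for i, v := range inputs {
		wg.Add(1)
		go func(i, v int) {
			defer wg.Done()
			results[i] = v * v // each goroutine writes only its own slot
		}(i, v)
	}

	wg.Wait()
	return results
}

func main() {
	// Limit the runtime to one CPU: the goroutines are still concurrent,
	// but they take turns instead of running in parallel.
	prev := runtime.GOMAXPROCS(1)
	fmt.Println(squareAll([]int{1, 2, 3, 4})) // prints [1 4 9 16]

	// Restore the previous setting; the goroutines may now run in parallel.
	runtime.GOMAXPROCS(prev)
	fmt.Println(squareAll([]int{1, 2, 3, 4})) // prints [1 4 9 16]
}
```

The result is the same in both runs; only how the work is spread over CPUs changes.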

"concurrency is dealing with multiple things at once, parallelism is doing multiple things at once"（Achieving concurrency in Go）

Goroutines

Anatomy of goroutines in Go -Concurrency in Go @ rungo

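Every Go program starts with one goroutine, called the main goroutine, which runs the body of the main function.
All goroutines are anonymous, because a goroutine has no identity.
In the program below, the main goroutine starts running and never blocks, so the Go scheduler does not hand control to the printHello goroutine. When the main goroutine finishes, the program exits immediately, typically before printHello has had a chance to run.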

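package main

import "fmt"

func printHello() {
	fmt.Println("Hello World")
}

func main() {
	fmt.Println("main execution started")

	// start printHello in a new goroutine
	go printHello()

	// main returns right away, so the program may exit before printHello runs
	fmt.Println("main execution stopped")
}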

As noted, a blocked goroutine gives up control to other goroutines, so we can try blocking the main goroutine with time.Sleep():

package main

import (
	"fmt"
	"time"
)

func printHello() {
	fmt.Println("Hello World")
}

func main() {
	fmt.Println("main execution started")

	// start printHello in a new goroutine
	go printHello()

	// block the main goroutine so the scheduler can run printHello
	time.Sleep(10 * time.Millisecond)
	fmt.Println("main execution stopped")
}
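Sleeping works here only because 10 ms happens to be long enough for printHello to finish. A more dependable way to wait is sync.WaitGroup, sketched below as an alternative and not as the approach the original article uses; `waitForHello` is a hypothetical helper name.

```go
package main

import (
	"fmt"
	"sync"
)

func printHello() {
	fmt.Println("Hello World")
}

// waitForHello is a hypothetical helper: it runs printHello in a goroutine
// and blocks until that goroutine has finished.
func waitForHello() bool {
	var wg sync.WaitGroup
	finished := false

	wg.Add(1) // one goroutine to wait for
	go func() {
		defer wg.Done() // runs when the goroutine returns
		printHello()
		finished = true
	}()

	wg.Wait() // returns only after Done has been called
	return finished
}

func main() {
	fmt.Println("main execution started")
	waitForHello()
	fmt.Println("main execution stopped")
}
```

Because main blocks in wg.Wait(), "Hello World" is always printed before "main execution stopped", regardless of how long printHello takes.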

anonymous goroutine

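package main

import "fmt"

func main() {
	fmt.Println("main() started")

	c := make(chan string)

	// anonymous goroutine: waits for a name on c, then prints a greeting
	go func(c chan string) {
		fmt.Println("Hello " + <-c + "!")
	}(c)

	// blocks until the goroutine has received the value
	c <- "John"
	fmt.Println("main() ended")
}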
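In the example above, main can return as soon as its send completes, so the greeting is not guaranteed to be printed before the program exits. One way to make the hand-off explicit is a second channel that carries the result back. This is a sketch under that assumption; `greet`, `in`, and `out` are hypothetical names.

```go
package main

import "fmt"

// greet is a hypothetical helper: it hands name to an anonymous goroutine
// over one channel and waits for the formatted greeting on another.
func greet(name string) string {
	in := make(chan string)
	out := make(chan string)

	// anonymous goroutine: receive a name, send back the greeting
	go func() {
		out <- "Hello " + <-in + "!"
	}()

	in <- name   // blocks until the goroutine receives the name
	return <-out // blocks until the goroutine sends the greeting
}

func main() {
	fmt.Println("main() started")
	fmt.Println(greet("John")) // prints Hello John!
	fmt.Println("main() ended")
}
```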

Concurrency Pattern

Generator

With a goroutine, a function can produce values concurrently and hand them to the caller through a channel:

package main

import "fmt"

// Source: https://medium.com/rungo/anatomy-of-channels-in-go-concurrency-in-go-1ec336086adb
// fib returns a receive-only channel that yields the Fibonacci numbers
// smaller than length.
func fib(length int) <-chan int {
	c := make(chan int, length)

	// run generation concurrently
	go func() {
		for i, j := 0, 1; i < length; i, j = i+j, i {
			c <- i
		}

		// closing the channel ends the caller's range loop
		close(c)
	}()

	// return channel
	return c
}

func main() {
	for fn := range fib(10) {
		fmt.Println("Current fibonacci number is", fn)
	}
}
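The generator above decides up front when to stop. A common variation lets the consumer decide instead, by closing a done channel. The sketch below is one possible shape of that idea, not part of the original source; `fibUntilDone` and `firstN` are hypothetical names.

```go
package main

import "fmt"

// fibUntilDone is a hypothetical variant of fib: it produces Fibonacci
// numbers on an unbuffered channel until the caller closes done.
func fibUntilDone(done <-chan struct{}) <-chan int {
	c := make(chan int)

	go func() {
		defer close(c)
		for i, j := 0, 1; ; i, j = i+j, i {
			select {
			case c <- i: // the consumer took the value
			case <-done: // the consumer is finished; stop generating
				return
			}
		}
	}()

	return c
}

// firstN collects the first n values and then tells the generator to stop.
func firstN(n int) []int {
	done := make(chan struct{})
	defer close(done) // releases the generator goroutine

	gen := fibUntilDone(done)
	out := make([]int, 0, n)
	for len(out) < n {
		out = append(out, <-gen)
	}
	return out
}

func main() {
	fmt.Println(firstN(7)) // prints [0 1 1 2 3 5 8]
}
```

Because the channel is unbuffered, the generator computes a value only when the consumer is ready for it, and closing done keeps the goroutine from leaking.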
