I found a topic in the Racket group about the performance of channel creating.
I want to write a OCaml's version to test.
let post (c,x) = Event.sync (Event.send c x);;
let accept c = Event.sync (Event.receive c);;
let get_chan c = let n = accept c in print_int n;print_newline ();;
let chan_trans (old_chan, new_chan) =
  let s = accept old_chan in
  post (new_chan,(s+1));;
let rec whisper count init_val =
  let rec aux n chan =
    if n >= count then chan
    else
      let new_chan = Event.new_channel ()
      in Thread.create chan_trans (chan, new_chan);
      aux (n+1) new_chan
  in let leftest_chan = Event.new_channel ()
  in let t0 = Thread.create post (leftest_chan, init_val)
  in let rightest_chan = aux 0 leftest_chan
  in get_chan rightest_chan;;
whisper 10000 1;;
The question is, when I tested for whisper 1000 1, it produced 1001 as expected. However, when I tried to test whisper 10000 1, there's an error as
Fatal error: exception Sys_error("Thread.create: Resource temporarily unavailable")
I used this command to compile and run
ocamlc -thread unix.cma threads.cma -o prog whisper.ml&&./prog -I +threads
OCaml Thread module uses the real system (kernel) threads. The total number of threads is bounded by the kernel:
 cat /proc/sys/kernel/threads-max
 251422
You can increase this of course,
 echo 100000 > /proc/sys/kernel/threads-max
but a better approach would be to treat threads as a resource and manage them correspondingly.
let rec whisper count init_val =
  let rec aux n t chan =
    if n >= count then chan
    else
      let new_chan = Event.new_channel () in
      let t' = Thread.create chan_trans (chan, new_chan) in
      Thread.join t;
      aux (n+1) t' new_chan in
  let leftest_chan = Event.new_channel () in
  let t = Thread.create post (leftest_chan, init_val) in
  let rightest_chan = aux 0 t leftest_chan in
  get_chan rightest_chan
In that case it will run with any size of the pipeline. For example:
$ ocamlbuild -use-ocamlfind -tag thread -pkg threads ev.native
$ time ./ev.native 
100001
real    0m1.581s
But this implementation of Chinese Whispers is very crude and inefficient. You shouldn't use heavyweight native threads for this (and neither go uses them). Instead, you should use cooperative lightweight threads from Lwt or Async libraries. This would be much efficient and nice.
This implementation follows closely the Go implementation from the blog post, but I think that we can do this more efficient and concise in OCaml without using mailboxes (but I'm not sure whether it will conform to the rules of the benchmark).
open Lwt.Infix
let whispers n =
  let rec whisper i p =
    if i < n then
      Lwt_mvar.take p >>= fun x ->
      whisper (i+1) (Lwt_mvar.create (x+1))
    else Lwt_mvar.take p in
  whisper 0 (Lwt_mvar.create 1)
let () = print_int @@ Lwt_main.run (whispers 100000)
The results are:
$ ocamlbuild -use-ocamlfind -tag thread -pkg lwt.unix lev.native --
$ time ./lev.native 
100001
real    0m0.007s
To compare with Go implementation on mine machine:
$ go build whispers.go 
$ time ./whispers 
100001
real    0m0.952s
The code above is a completely honest reimplementation of the original Go version. But one of the reasons why it so fast, is that OCaml and Lwt is very clever, and although it creates 100_000 threads and 100_001 channels, no threads are ever got yielded to a background, since every time the whisper is called the channel already contains data, so the thread is in a ready state. As a result, this is just an efficient loop, that creates threads and channels. It can create a million threads in 50 ms. 
So this is an idiomatic and correct way of doing things. But lets for the sake of true comparison mimick Go behavior. The following implementation will first eagerly create in the heap 100_001 channels, and 100_000 threads, waiting to transfer data from left to right channel. And only afterward it will put a value into the leftmost channel to provoke a chain of reaction. This would basically mimick what is happening in Go underneath the hood.
let whispers n =
  let rec loop i p =
    if i < n then
      let p' = Lwt_mvar.create_empty () in
      let _t =
        Lwt_mvar.take p >>= fun x ->
        Lwt_mvar.put p' (x+1) in
      loop (i+1) p'
    else Lwt_mvar.take p in
  let p0 = Lwt_mvar.create_empty () in
  let t = loop 1 p0 in
  Lwt_mvar.put p0 1 >>= fun () -> t
$ time ./lev.native
100001
real    0m0.111s
So it is slightly slower, in fact it is 20 times slower than the previous implementation (I've used 1 million of threads to compare them), but it is still 10 times faster than the Go.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With