Taking Rust to Task. By James Edward Gray II, updated 2014-09-19. An example of me exploring how to do message passing style multiprocessing in Rust.
<p>Now that I've reached the point where I can get some Rust code running without asking questions in IRC every five minutes, I really wanted to play with some <em>tasks</em>. Tasks are the way Rust handles multiprocessing code. Under the hood they can map one-to-one with operating system threads or you can use a many-to-one mapping that I'm not ready to go into yet.</p>
<p>Probably one of the most exciting aspects of tasks in Rust, in my opinion, is that <a href="http://carol-nichols.com/2014/07/14/ruby-rust-concurrency/">unsafe use of shared memory is rejected outright as a compile error</a>. That led me to want to figure out how you communicate correctly. (Spoiler: the same way you do in Ruby: <a href="http://graysoftinc.com/rubies-in-the-rough/sleepy-programs">just pass messages</a>.)</p>
<p>Ready to dive in, I grossly simplified a recent challenge from work and coded it up in Rust. You can get the idea with a glance at <code>main()</code>:</p>
<div class="highlight highlight-rust"><pre><span class="k">use</span> <span class="n">std</span><span class="o">::</span><span class="n">collections</span><span class="o">::</span><span class="n">HashMap</span><span class="p">;</span>
<span class="c1">// ...</span>
<span class="k">fn</span> <span class="n">string_vec</span><span class="p">(</span><span class="n">strs</span><span class="o">:</span> <span class="o">&</span><span class="p">[</span><span class="o">&</span><span class="err">'</span><span class="k">static</span> <span class="k">str</span><span class="p">])</span> <span class="o">-></span> <span class="n">Vec</span><span class="o"><</span><span class="n">String</span><span class="o">></span> <span class="p">{</span>
<span class="k">let</span> <span class="k">mut</span> <span class="n">v</span> <span class="o">=</span> <span class="n">Vec</span><span class="o">::</span><span class="n">new</span><span class="p">();</span>
<span class="k">for</span> <span class="n">s</span> <span class="n">in</span> <span class="n">strs</span><span class="p">.</span><span class="n">iter</span><span class="p">()</span> <span class="p">{</span>
<span class="n">v</span><span class="p">.</span><span class="n">push</span><span class="p">(</span><span class="n">s</span><span class="p">.</span><span class="n">to_string</span><span class="p">());</span>
<span class="p">}</span>
<span class="n">v</span>
<span class="p">}</span>
<span class="k">fn</span> <span class="n">main</span><span class="p">()</span> <span class="p">{</span>
<span class="k">let</span> <span class="k">mut</span> <span class="n">services</span> <span class="o">=</span> <span class="n">HashMap</span><span class="o">::</span><span class="n">new</span><span class="p">();</span>
<span class="n">services</span><span class="p">.</span><span class="n">insert</span><span class="p">(</span><span class="s">"S1"</span><span class="p">.</span><span class="n">to_string</span><span class="p">(),</span> <span class="n">string_vec</span><span class="p">([</span><span class="s">"A"</span><span class="p">,</span> <span class="s">"B"</span><span class="p">]));</span>
<span class="n">services</span><span class="p">.</span><span class="n">insert</span><span class="p">(</span><span class="s">"S2"</span><span class="p">.</span><span class="n">to_string</span><span class="p">(),</span> <span class="n">string_vec</span><span class="p">([</span><span class="s">"A"</span><span class="p">,</span> <span class="s">"C"</span><span class="p">]));</span>
<span class="n">services</span><span class="p">.</span><span class="n">insert</span><span class="p">(</span><span class="s">"S3"</span><span class="p">.</span><span class="n">to_string</span><span class="p">(),</span> <span class="n">string_vec</span><span class="p">([</span><span class="s">"C"</span><span class="p">,</span> <span class="s">"D"</span><span class="p">,</span> <span class="s">"E"</span><span class="p">,</span> <span class="s">"F"</span><span class="p">]));</span>
<span class="n">services</span><span class="p">.</span><span class="n">insert</span><span class="p">(</span><span class="s">"S4"</span><span class="p">.</span><span class="n">to_string</span><span class="p">(),</span> <span class="n">string_vec</span><span class="p">([</span><span class="s">"D"</span><span class="p">,</span> <span class="s">"B"</span><span class="p">]));</span>
<span class="n">services</span><span class="p">.</span><span class="n">insert</span><span class="p">(</span><span class="s">"S5"</span><span class="p">.</span><span class="n">to_string</span><span class="p">(),</span> <span class="n">string_vec</span><span class="p">([</span><span class="s">"A"</span><span class="p">,</span> <span class="s">"Z"</span><span class="p">]));</span>
<span class="k">let</span> <span class="n">work</span> <span class="o">=</span> <span class="n">Work</span><span class="p">(</span><span class="n">Search</span><span class="o">::</span><span class="n">new</span><span class="p">(</span><span class="s">"A"</span><span class="p">.</span><span class="n">to_string</span><span class="p">(),</span> <span class="s">"B"</span><span class="p">.</span><span class="n">to_string</span><span class="p">()));</span>
<span class="k">let</span> <span class="k">mut</span> <span class="n">task_manager</span> <span class="o">=</span> <span class="n">TaskManager</span><span class="o">::</span><span class="n">new</span><span class="p">(</span><span class="n">services</span><span class="p">);</span>
<span class="n">task_manager</span><span class="p">.</span><span class="n">run</span><span class="p">(</span><span class="n">work</span><span class="p">);</span>
<span class="p">}</span>
</pre></div>
<p>The <code>HashMap</code> sets up the story. Let's assume we have various travel services: <em>S1</em>, <em>S2</em>, etc. Each of those services can reach various stops. For example, <em>S1</em> can reach <em>A</em> and <em>B</em>. Now the <code>work</code> variable tells you what we want to do, which is to search for ways to get from <em>A</em> to <em>B</em>.</p>
<p>The twist is handled by <code>TaskManager</code>. It's going to run each service in its own isolated process (a <em>task</em> in Rust speak) that only knows about its own stops. This means finding non-direct paths must involve inter-process communication.</p>
<p>The <code>string_vec()</code> helper just builds a vector of <code>String</code> objects for me. I'm not using <code>String</code> for its mutability; it's more of an attempt to clarify ownership without scattering lifetime specifications everywhere. (This may not be ideal. Please remember that I'm still a pretty green Rust programmer.)</p>
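<p>The listings in this post use the pre-1.0 Rust syntax of 2014. For readers on current stable Rust, here is a minimal sketch of the same helper and service table. The <code>build_services()</code> function is a hypothetical wrapper added only for illustration; it is not part of the original program.</p>

```rust
use std::collections::HashMap;

/// Turn borrowed string slices into owned `String`s.
fn string_vec(strs: &[&str]) -> Vec<String> {
    strs.iter().map(|s| s.to_string()).collect()
}

/// Hypothetical helper: builds the service table from `main()`.
fn build_services() -> HashMap<String, Vec<String>> {
    let mut services = HashMap::new();
    services.insert("S1".to_string(), string_vec(&["A", "B"]));
    services.insert("S2".to_string(), string_vec(&["A", "C"]));
    services.insert("S3".to_string(), string_vec(&["C", "D", "E", "F"]));
    services.insert("S4".to_string(), string_vec(&["D", "B"]));
    services.insert("S5".to_string(), string_vec(&["A", "Z"]));
    services
}
```

<p>Owning every <code>String</code> means the table can later be cloned and moved into threads without any lifetime annotations, which matches the motivation given above.</p>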
<p>If the code I write works, it will find two possible paths for our search and indeed it does:</p>
<pre><code>$ ./pathfinder
Path: A--S1-->B
Path: A--S2-->C--S3-->D--S4-->B
</code></pre>
<p>The rest of this blog post will examine just how it accomplishes this.</p>
<p>Let's begin with a pretty trivial data structure:</p>
<div class="highlight highlight-rust"><pre><span class="cp">#[deriving(Clone)]</span>
<span class="k">struct</span> <span class="n">Path</span> <span class="p">{</span>
<span class="n">from</span><span class="o">:</span> <span class="n">String</span><span class="p">,</span>
<span class="n">to</span><span class="o">:</span> <span class="n">String</span><span class="p">,</span>
<span class="n">service</span><span class="o">:</span> <span class="n">String</span>
<span class="p">}</span>
<span class="k">impl</span> <span class="n">Path</span> <span class="p">{</span>
<span class="k">fn</span> <span class="n">new</span><span class="p">(</span><span class="n">from</span><span class="o">:</span> <span class="n">String</span><span class="p">,</span> <span class="n">to</span><span class="o">:</span> <span class="n">String</span><span class="p">,</span> <span class="n">service</span><span class="o">:</span> <span class="n">String</span><span class="p">)</span> <span class="o">-></span> <span class="n">Path</span> <span class="p">{</span>
<span class="n">Path</span><span class="p">{</span><span class="n">from</span><span class="o">:</span> <span class="n">from</span><span class="p">,</span> <span class="n">to</span><span class="o">:</span> <span class="n">to</span><span class="p">,</span> <span class="n">service</span><span class="o">:</span> <span class="n">service</span><span class="p">}</span>
<span class="p">}</span>
<span class="k">fn</span> <span class="n">to_string</span><span class="p">(</span><span class="o">&</span><span class="n">self</span><span class="p">)</span> <span class="o">-></span> <span class="n">String</span> <span class="p">{</span>
<span class="n">self</span><span class="p">.</span><span class="n">from</span>
<span class="p">.</span><span class="n">clone</span><span class="p">()</span>
<span class="p">.</span><span class="n">append</span><span class="p">(</span><span class="s">"--"</span><span class="p">)</span>
<span class="p">.</span><span class="n">append</span><span class="p">(</span><span class="n">self</span><span class="p">.</span><span class="n">service</span><span class="p">.</span><span class="n">as_slice</span><span class="p">())</span>
<span class="p">.</span><span class="n">append</span><span class="p">(</span><span class="s">"-->"</span><span class="p">)</span>
<span class="p">.</span><span class="n">append</span><span class="p">(</span><span class="n">self</span><span class="p">.</span><span class="n">to</span><span class="p">.</span><span class="n">as_slice</span><span class="p">())</span>
<span class="p">}</span>
<span class="p">}</span>
</pre></div>
<p>This just defines a <code>Path</code> as two endpoints and the <code>service</code> that connects them. You can see that the <code>to_string()</code> method gives us the simple ASCII arrow output from the solution.</p>
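<p>On current stable Rust the chained <code>append()</code> calls no longer exist. One way to get the same arrow output, sketched here as an assumption rather than as the original code, is to implement <code>std::fmt::Display</code>, which also provides <code>to_string()</code> for free:</p>

```rust
use std::fmt;

#[derive(Clone, Debug, PartialEq)]
struct Path {
    from: String,
    to: String,
    service: String,
}

impl Path {
    fn new(from: String, to: String, service: String) -> Path {
        Path { from, to, service }
    }
}

// Implementing Display gives us `to_string()` through the blanket
// `ToString` implementation in the standard library.
impl fmt::Display for Path {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}--{}-->{}", self.from, self.service, self.to)
    }
}
```

<p>With this in place, <code>Path::new("A".to_string(), "B".to_string(), "S1".to_string()).to_string()</code> yields <code>A--S1--&gt;B</code>.</p>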
<p><code>Search</code> builds on <code>Path</code>:</p>
<div class="highlight highlight-rust"><pre><span class="cp">#[deriving(Clone)]</span>
<span class="k">struct</span> <span class="n">Search</span> <span class="p">{</span>
<span class="n">from</span><span class="o">:</span> <span class="n">String</span><span class="p">,</span>
<span class="n">to</span><span class="o">:</span> <span class="n">String</span><span class="p">,</span>
<span class="n">paths</span><span class="o">:</span> <span class="n">Vec</span><span class="o"><</span><span class="n">Path</span><span class="o">></span>
<span class="p">}</span>
<span class="k">impl</span> <span class="n">Search</span> <span class="p">{</span>
<span class="k">fn</span> <span class="n">new</span><span class="p">(</span><span class="n">from</span><span class="o">:</span> <span class="n">String</span><span class="p">,</span> <span class="n">to</span><span class="o">:</span> <span class="n">String</span><span class="p">)</span> <span class="o">-></span> <span class="n">Search</span> <span class="p">{</span>
<span class="n">Search</span><span class="p">{</span><span class="n">from</span><span class="o">:</span> <span class="n">from</span><span class="p">,</span> <span class="n">to</span><span class="o">:</span> <span class="n">to</span><span class="p">,</span> <span class="n">paths</span><span class="o">:</span> <span class="n">vec</span><span class="o">!</span><span class="p">[]}</span>
<span class="p">}</span>
<span class="k">fn</span> <span class="n">services</span><span class="p">(</span><span class="o">&</span><span class="n">self</span><span class="p">)</span> <span class="o">-></span> <span class="n">Vec</span><span class="o"><</span><span class="n">String</span><span class="o">></span> <span class="p">{</span>
<span class="n">self</span><span class="p">.</span><span class="n">paths</span><span class="p">.</span><span class="n">iter</span><span class="p">().</span><span class="n">map</span><span class="p">(</span><span class="o">|</span><span class="n">path</span><span class="o">|</span> <span class="n">path</span><span class="p">.</span><span class="n">service</span><span class="p">.</span><span class="n">clone</span><span class="p">()).</span><span class="n">collect</span><span class="p">()</span>
<span class="p">}</span>
<span class="k">fn</span> <span class="n">stops</span><span class="p">(</span><span class="o">&</span><span class="n">self</span><span class="p">)</span> <span class="o">-></span> <span class="n">Vec</span><span class="o"><</span><span class="n">String</span><span class="o">></span> <span class="p">{</span>
<span class="k">let</span> <span class="k">mut</span> <span class="n">all</span> <span class="o">=</span> <span class="n">vec</span><span class="o">!</span><span class="p">[];</span>
<span class="n">all</span><span class="p">.</span><span class="n">push</span><span class="p">(</span><span class="n">self</span><span class="p">.</span><span class="n">from</span><span class="p">.</span><span class="n">clone</span><span class="p">());</span>
<span class="n">all</span><span class="p">.</span><span class="n">push</span><span class="p">(</span><span class="n">self</span><span class="p">.</span><span class="n">to</span><span class="p">.</span><span class="n">clone</span><span class="p">());</span>
<span class="k">for</span> <span class="n">path</span> <span class="n">in</span> <span class="n">self</span><span class="p">.</span><span class="n">paths</span><span class="p">.</span><span class="n">iter</span><span class="p">()</span> <span class="p">{</span>
<span class="n">all</span><span class="p">.</span><span class="n">push</span><span class="p">(</span><span class="n">path</span><span class="p">.</span><span class="n">from</span><span class="p">.</span><span class="n">clone</span><span class="p">());</span>
<span class="n">all</span><span class="p">.</span><span class="n">push</span><span class="p">(</span><span class="n">path</span><span class="p">.</span><span class="n">to</span><span class="p">.</span><span class="n">clone</span><span class="p">());</span>
<span class="p">}</span>
<span class="n">all</span>
<span class="p">}</span>
<span class="k">fn</span> <span class="n">add_path</span><span class="p">(</span><span class="o">&</span><span class="n">self</span><span class="p">,</span> <span class="n">path</span><span class="o">:</span> <span class="n">Path</span><span class="p">)</span> <span class="o">-></span> <span class="n">Search</span> <span class="p">{</span>
<span class="n">Search</span><span class="p">{</span> <span class="n">from</span><span class="o">:</span> <span class="n">path</span><span class="p">.</span><span class="n">to</span><span class="p">.</span><span class="n">clone</span><span class="p">(),</span>
<span class="n">to</span><span class="o">:</span> <span class="n">self</span><span class="p">.</span><span class="n">to</span><span class="p">.</span><span class="n">clone</span><span class="p">(),</span>
<span class="n">paths</span><span class="o">:</span> <span class="n">self</span><span class="p">.</span><span class="n">paths</span><span class="p">.</span><span class="n">clone</span><span class="p">().</span><span class="n">append</span><span class="p">([</span><span class="n">path</span><span class="p">])</span> <span class="p">}</span>
<span class="p">}</span>
<span class="p">}</span>
</pre></div>
<p>A <code>Search</code> holds the endpoints we're currently trying to traverse and any <code>paths</code> collected so far in an attempt to reach our goal. The <code>add_path()</code> method is used with partial matches. It returns a new <code>Search</code> for the remaining distance with the partial <code>Path</code> added to the list.</p>
<p>The other methods, <code>services()</code> and <code>stops()</code>, just return lists of what has been used so far in the accumulated paths. These are helpful in avoiding building circular paths.</p>
<p>Note that both data structures so far are cloneable. This is so that copies can be made to send across channels to other tasks.</p>
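<p>To make the behavior of <code>add_path()</code> concrete, here is a rough equivalent in current stable Rust. It is a sketch under the assumption that the semantics match the listing above: the original <code>Search</code> is left untouched and a new one is returned that starts where the added <code>Path</code> ends.</p>

```rust
#[derive(Clone, Debug)]
struct Path {
    from: String,
    to: String,
    service: String,
}

#[derive(Clone, Debug)]
struct Search {
    from: String,
    to: String,
    paths: Vec<Path>,
}

impl Search {
    fn new(from: String, to: String) -> Search {
        Search { from, to, paths: vec![] }
    }

    /// Names of the services used by the paths collected so far.
    fn services(&self) -> Vec<String> {
        self.paths.iter().map(|path| path.service.clone()).collect()
    }

    /// Every stop mentioned by this search and its collected paths.
    fn stops(&self) -> Vec<String> {
        let mut all = vec![self.from.clone(), self.to.clone()];
        for path in &self.paths {
            all.push(path.from.clone());
            all.push(path.to.clone());
        }
        all
    }

    /// Returns a new search for the remaining distance; `self` is unchanged.
    fn add_path(&self, path: Path) -> Search {
        let from = path.to.clone();
        let mut paths = self.paths.clone();
        paths.push(path);
        Search { from, to: self.to.clone(), paths }
    }
}
```

<p>For example, adding the path <em>A</em> to <em>C</em> via <em>S2</em> to a search from <em>A</em> to <em>B</em> produces a new search from <em>C</em> to <em>B</em> that remembers <em>S2</em> as already used.</p>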
<p>We need two more aggregate data structures to be the messages that we will shuttle between tasks:</p>
<div class="highlight highlight-rust"><pre><span class="cp">#[deriving(Clone)]</span>
<span class="k">enum</span> <span class="n">Job</span> <span class="p">{</span>
<span class="n">Work</span><span class="p">(</span><span class="n">Search</span><span class="p">),</span>
<span class="n">Finish</span>
<span class="p">}</span>
<span class="k">enum</span> <span class="n">Event</span> <span class="p">{</span>
<span class="n">Match</span><span class="p">(</span><span class="n">Vec</span><span class="o"><</span><span class="n">Path</span><span class="o">></span><span class="p">),</span>
<span class="n">Partial</span><span class="p">(</span><span class="n">Vec</span><span class="o"><</span><span class="n">Search</span><span class="o">></span><span class="p">),</span>
<span class="n">Done</span><span class="p">(</span><span class="n">String</span><span class="p">)</span>
<span class="p">}</span>
</pre></div>
<p>A <code>Job</code> can be either a wrapped <code>Search</code> we want to perform or the special <code>Finish</code> flag. The latter just tells a task to bust out of its infinite listening-for-searches loop and exit cleanly. This <code>enum</code> lists the messages we can send <strong>to</strong> the service tasks.</p>
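<p>The listen-until-<code>Finish</code> pattern can be sketched on current stable Rust with <code>std::thread</code> and <code>std::sync::mpsc</code>. This is an illustration only: the <code>String</code> payload stands in for a real <code>Search</code>, and <code>run_worker()</code> is a hypothetical function that does not appear in the original program.</p>

```rust
use std::sync::mpsc::channel;
use std::thread;

#[derive(Clone, Debug)]
enum Job {
    Work(String), // stand-in payload; the post wraps a Search here
    Finish,
}

/// Hypothetical helper: feeds `jobs` to one worker thread and returns how
/// many `Work` messages it handled before it stopped listening.
fn run_worker(jobs: Vec<Job>) -> usize {
    let (job_sender, job_receiver) = channel();
    let worker = thread::spawn(move || {
        let mut handled = 0usize;
        loop {
            match job_receiver.recv() {
                Ok(Job::Work(_description)) => handled += 1,
                // Finish, or a closed channel, ends the loop cleanly.
                Ok(Job::Finish) | Err(_) => break,
            }
        }
        handled
    });
    for job in jobs {
        // A failed send means the worker already exited; stop sending.
        if job_sender.send(job).is_err() {
            break;
        }
    }
    // Closing the channel guarantees the worker stops even without Finish.
    drop(job_sender);
    worker.join().expect("worker thread panicked")
}
```

<p>A channel delivers messages in the order they were sent, so a worker given two <code>Work</code> jobs followed by <code>Finish</code> handles exactly two.</p>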
<p>A message <strong>from</strong> a service task (back to our not yet examined <code>TaskManager</code>) is called an <code>Event</code>. There are three possible types. <code>Match</code> means we have a full solution, using the included paths. <code>Partial</code> means that the task matched part of the path and is sending back a list of searches to try for locating the rest of it. Again, <code>Done</code> is a special flag that pretty much means, "I've got nothing." This flag includes the name of the service for a reason that will become obvious after we view this next helper object:</p>
<div class="highlight highlight-rust"><pre><span class="k">struct</span> <span class="n">SearchTracker</span> <span class="p">{</span>
<span class="n">counts</span><span class="o">:</span> <span class="n">HashMap</span><span class="o"><</span><span class="n">String</span><span class="p">,</span> <span class="k">uint</span><span class="o">></span>
<span class="p">}</span>
<span class="k">impl</span> <span class="n">SearchTracker</span> <span class="p">{</span>
<span class="k">fn</span> <span class="n">new</span><span class="o"><</span><span class="err">'</span><span class="n">a</span><span class="p">,</span> <span class="n">I</span><span class="o">:</span> <span class="n">Iterator</span><span class="o"><&</span><span class="err">'</span><span class="n">a</span> <span class="n">String</span><span class="o">>></span><span class="p">(</span><span class="k">mut</span> <span class="n">service_names</span><span class="o">:</span> <span class="n">I</span><span class="p">)</span> <span class="o">-></span> <span class="n">SearchTracker</span> <span class="p">{</span>
<span class="k">let</span> <span class="k">mut</span> <span class="n">tracker</span> <span class="o">=</span> <span class="n">SearchTracker</span><span class="p">{</span><span class="n">counts</span><span class="o">:</span> <span class="n">HashMap</span><span class="o">::</span><span class="n">new</span><span class="p">()};</span>
<span class="k">for</span> <span class="n">name</span> <span class="n">in</span> <span class="n">service_names</span> <span class="p">{</span>
<span class="n">tracker</span><span class="p">.</span><span class="n">counts</span><span class="p">.</span><span class="n">insert</span><span class="p">(</span><span class="n">name</span><span class="p">.</span><span class="n">clone</span><span class="p">(),</span> <span class="m">0</span><span class="p">);</span>
<span class="p">}</span>
<span class="n">tracker</span>
<span class="p">}</span>
<span class="k">fn</span> <span class="n">add_search</span><span class="p">(</span><span class="o">&</span><span class="k">mut</span> <span class="n">self</span><span class="p">)</span> <span class="p">{</span>
<span class="k">for</span> <span class="p">(</span><span class="n">_</span><span class="p">,</span> <span class="n">count</span><span class="p">)</span> <span class="n">in</span> <span class="n">self</span><span class="p">.</span><span class="n">counts</span><span class="p">.</span><span class="n">mut_iter</span><span class="p">()</span> <span class="p">{</span>
<span class="o">*</span><span class="n">count</span> <span class="o">+=</span> <span class="m">1</span><span class="p">;</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="k">fn</span> <span class="n">mark_done</span><span class="p">(</span><span class="o">&</span><span class="k">mut</span> <span class="n">self</span><span class="p">,</span> <span class="n">name</span><span class="o">:</span> <span class="n">String</span><span class="p">)</span> <span class="p">{</span>
<span class="k">let</span> <span class="n">count</span> <span class="o">=</span> <span class="n">self</span><span class="p">.</span><span class="n">counts</span><span class="p">.</span><span class="n">get_mut</span><span class="p">(</span><span class="o">&</span><span class="n">name</span><span class="p">);</span>
<span class="o">*</span><span class="n">count</span> <span class="o">-=</span> <span class="m">1</span><span class="p">;</span>
<span class="p">}</span>
<span class="k">fn</span> <span class="n">is_done</span><span class="p">(</span><span class="o">&</span><span class="n">self</span><span class="p">)</span> <span class="o">-></span> <span class="n">bool</span> <span class="p">{</span>
<span class="n">self</span><span class="p">.</span><span class="n">counts</span><span class="p">.</span><span class="n">values</span><span class="p">().</span><span class="n">all</span><span class="p">(</span><span class="o">|</span><span class="n">n</span><span class="o">|</span> <span class="o">*</span><span class="n">n</span> <span class="o">==</span> <span class="m">0</span><span class="p">)</span>
<span class="p">}</span>
<span class="p">}</span>
</pre></div>
<p>The idea behind <code>SearchTracker</code> is that we want to know when we're done seeing results from the various tasks. We can do that by expecting exactly one response (an <code>Event</code>) from each task for each <code>Search</code> sent.</p>
<p>This object just makes a counter for each service name passed into the constructor. You can then <code>add_search()</code> to bump all counters (because each <code>Search</code> is sent to all service tasks) and <code>mark_done()</code> to reduce a named counter when you receive a response from that service. The <code>is_done()</code> method will return true when all the counts balance out.</p>
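<p>The counting scheme can be sketched in current stable Rust as follows. One deliberate difference, flagged here as an assumption: this version of <code>mark_done()</code> silently ignores unknown service names and never drops a counter below zero, whereas the original would fail in those cases.</p>

```rust
use std::collections::HashMap;

struct SearchTracker {
    counts: HashMap<String, usize>,
}

impl SearchTracker {
    /// Start every named service with zero outstanding searches.
    fn new<'a, I: IntoIterator<Item = &'a String>>(service_names: I) -> SearchTracker {
        let counts: HashMap<String, usize> = service_names
            .into_iter()
            .map(|name| (name.clone(), 0))
            .collect();
        SearchTracker { counts }
    }

    /// A search goes to every service, so every counter is bumped.
    fn add_search(&mut self) {
        for count in self.counts.values_mut() {
            *count += 1;
        }
    }

    /// One service has answered one search.
    fn mark_done(&mut self, name: &str) {
        if let Some(count) = self.counts.get_mut(name) {
            *count = count.saturating_sub(1);
        }
    }

    /// True once every service has answered every search sent so far.
    fn is_done(&self) -> bool {
        self.counts.values().all(|n| *n == 0)
    }
}
```

<p>With two services, one <code>add_search()</code> call leaves the tracker unfinished until both services have been passed to <code>mark_done()</code>.</p>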
<p>We need one last helper before we get into the actual process:</p>
<div class="highlight highlight-rust"><pre><span class="k">struct</span> <span class="n">MultiTaskSender</span><span class="o"><</span><span class="n">T</span><span class="o">></span> <span class="p">{</span>
<span class="n">senders</span><span class="o">:</span> <span class="n">Vec</span><span class="o"><</span><span class="n">Sender</span><span class="o"><</span><span class="n">T</span><span class="o">>></span>
<span class="p">}</span>
<span class="k">impl</span><span class="o"><</span><span class="n">T</span><span class="o">:</span> <span class="n">Clone</span> <span class="o">+</span> <span class="n">Send</span><span class="o">></span> <span class="n">MultiTaskSender</span><span class="o"><</span><span class="n">T</span><span class="o">></span> <span class="p">{</span>
<span class="k">fn</span> <span class="n">new</span><span class="p">()</span> <span class="o">-></span> <span class="n">MultiTaskSender</span><span class="o"><</span><span class="n">T</span><span class="o">></span> <span class="p">{</span>
<span class="n">MultiTaskSender</span><span class="p">{</span><span class="n">senders</span><span class="o">:</span> <span class="n">vec</span><span class="o">!</span><span class="p">[]}</span>
<span class="p">}</span>
<span class="k">fn</span> <span class="n">add_sender</span><span class="p">(</span><span class="o">&</span><span class="k">mut</span> <span class="n">self</span><span class="p">,</span> <span class="n">sender</span><span class="o">:</span> <span class="n">Sender</span><span class="o"><</span><span class="n">T</span><span class="o">></span><span class="p">)</span> <span class="p">{</span>
<span class="n">self</span><span class="p">.</span><span class="n">senders</span><span class="p">.</span><span class="n">push</span><span class="p">(</span><span class="n">sender</span><span class="p">);</span>
<span class="p">}</span>
<span class="k">fn</span> <span class="n">send</span><span class="p">(</span><span class="o">&</span><span class="n">self</span><span class="p">,</span> <span class="n">t</span><span class="o">:</span> <span class="n">T</span><span class="p">)</span> <span class="p">{</span>
<span class="k">for</span> <span class="n">sender</span> <span class="n">in</span> <span class="n">self</span><span class="p">.</span><span class="n">senders</span><span class="p">.</span><span class="n">iter</span><span class="p">()</span> <span class="p">{</span>
<span class="n">sender</span><span class="p">.</span><span class="n">send</span><span class="p">(</span><span class="n">t</span><span class="p">.</span><span class="n">clone</span><span class="p">());</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="p">}</span>
</pre></div>
<p>As I said, each <code>Search</code> will be sent to all tasks. This simple multicaster just allows you to <code>add_sender()</code> as each task is built and later <code>send()</code> to all those channels. The actual object sent is handled as a generic type that we just need to be able to <code>Clone</code> and <code>Send</code>.</p>
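<p>Here is a sketch of the same multicaster on current stable Rust, using <code>std::sync::mpsc</code>. Two things are assumptions of this sketch and not part of the original: <code>send()</code> returns the number of channels that accepted the message (modern <code>Sender::send</code> returns a <code>Result</code>), and <code>broadcast_demo()</code> is a hypothetical function that exists only to show usage.</p>

```rust
use std::sync::mpsc::{channel, Sender};

struct MultiTaskSender<T> {
    senders: Vec<Sender<T>>,
}

impl<T: Clone + Send> MultiTaskSender<T> {
    fn new() -> MultiTaskSender<T> {
        MultiTaskSender { senders: Vec::new() }
    }

    fn add_sender(&mut self, sender: Sender<T>) {
        self.senders.push(sender);
    }

    /// Clone the message once per channel and count the successful sends.
    fn send(&self, t: T) -> usize {
        self.senders
            .iter()
            .filter(|sender| sender.send(t.clone()).is_ok())
            .count()
    }
}

/// Hypothetical demo: broadcast one message to two channels and read both.
fn broadcast_demo() -> (usize, String, String) {
    let (first_sender, first_receiver) = channel();
    let (second_sender, second_receiver) = channel();
    let mut multi = MultiTaskSender::new();
    multi.add_sender(first_sender);
    multi.add_sender(second_sender);
    let delivered = multi.send("search".to_string());
    (
        delivered,
        first_receiver.recv().unwrap(),
        second_receiver.recv().unwrap(),
    )
}
```

<p>Each receiver gets its own clone of the message, which is why the generic type has to be <code>Clone</code>.</p>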
<p>We're finally ready for the main workhorse of this code:</p>
<div class="highlight highlight-rust"><pre><span class="k">struct</span> <span class="n">TaskManager</span> <span class="p">{</span>
<span class="n">services</span><span class="o">:</span> <span class="n">HashMap</span><span class="o"><</span><span class="n">String</span><span class="p">,</span> <span class="n">Vec</span><span class="o"><</span><span class="n">String</span><span class="o">>></span><span class="p">,</span>
<span class="n">tracker</span><span class="o">:</span> <span class="n">SearchTracker</span><span class="p">,</span>
<span class="n">multi_sender</span><span class="o">:</span> <span class="n">MultiTaskSender</span><span class="o"><</span><span class="n">Job</span><span class="o">></span><span class="p">,</span>
<span class="n">event_sender</span><span class="o">:</span> <span class="n">Sender</span><span class="o"><</span><span class="n">Event</span><span class="o">></span><span class="p">,</span>
<span class="n">event_receiver</span><span class="o">:</span> <span class="n">Receiver</span><span class="o"><</span><span class="n">Event</span><span class="o">></span>
<span class="p">}</span>
<span class="k">impl</span> <span class="n">TaskManager</span> <span class="p">{</span>
<span class="k">fn</span> <span class="n">new</span><span class="p">(</span><span class="n">services</span><span class="o">:</span> <span class="n">HashMap</span><span class="o"><</span><span class="n">String</span><span class="p">,</span> <span class="n">Vec</span><span class="o"><</span><span class="n">String</span><span class="o">>></span><span class="p">)</span> <span class="o">-></span> <span class="n">TaskManager</span> <span class="p">{</span>
<span class="k">let</span> <span class="p">(</span><span class="n">sender</span><span class="p">,</span> <span class="n">receiver</span><span class="p">)</span> <span class="o">=</span> <span class="n">channel</span><span class="p">();</span>
<span class="n">TaskManager</span><span class="p">{</span> <span class="n">services</span><span class="o">:</span> <span class="n">services</span><span class="p">.</span><span class="n">clone</span><span class="p">(),</span>
<span class="n">tracker</span><span class="o">:</span> <span class="n">SearchTracker</span><span class="o">::</span><span class="n">new</span><span class="p">(</span><span class="n">services</span><span class="p">.</span><span class="n">keys</span><span class="p">()),</span>
<span class="n">multi_sender</span><span class="o">:</span> <span class="n">MultiTaskSender</span><span class="o">::</span><span class="n">new</span><span class="p">(),</span>
<span class="n">event_sender</span><span class="o">:</span> <span class="n">sender</span><span class="p">,</span>
<span class="n">event_receiver</span><span class="o">:</span> <span class="n">receiver</span> <span class="p">}</span>
<span class="p">}</span>
<span class="k">fn</span> <span class="n">run</span><span class="p">(</span><span class="o">&</span><span class="k">mut</span> <span class="n">self</span><span class="p">,</span> <span class="n">work</span><span class="o">:</span> <span class="n">Job</span><span class="p">)</span> <span class="p">{</span>
<span class="n">self</span><span class="p">.</span><span class="n">launch_services</span><span class="p">();</span>
<span class="n">self</span><span class="p">.</span><span class="n">send_job</span><span class="p">(</span><span class="n">work</span><span class="p">);</span>
<span class="n">self</span><span class="p">.</span><span class="n">wait_for_services</span><span class="p">();</span>
<span class="n">self</span><span class="p">.</span><span class="n">send_job</span><span class="p">(</span><span class="n">Finish</span><span class="p">);</span>
<span class="p">}</span>
<span class="c1">// ...</span>
<span class="p">}</span>
</pre></div>
<p>This shows you what a <code>TaskManager</code> keeps track of, which is just the data for our challenge, the helper objects, and various channels (the pipes of communication between Rust tasks). You can see how this gets set up in <code>new()</code>.</p>
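<p>The single <code>event_sender</code> and <code>event_receiver</code> pair works because a sender can be cloned and handed to many tasks while one receiver collects everything. A minimal sketch of that fan-in on current stable Rust follows. The one-variant <code>Event</code> and the <code>collect_done_events()</code> function are simplifications invented for this illustration.</p>

```rust
use std::sync::mpsc::channel;
use std::thread;

#[derive(Debug, PartialEq)]
enum Event {
    Done(String), // simplified: the post's Event has two more variants
}

/// Hypothetical helper: spawns one thread per service name, has each report
/// `Done` on a shared channel, and returns the reported names sorted.
fn collect_done_events(service_names: &[&str]) -> Vec<String> {
    let (event_sender, event_receiver) = channel();
    let mut workers = Vec::new();
    for name in service_names {
        // Each thread gets its own clone of the one shared sender.
        let task_event_sender = event_sender.clone();
        let name = name.to_string();
        workers.push(thread::spawn(move || {
            task_event_sender
                .send(Event::Done(name))
                .expect("manager hung up");
        }));
    }
    // Drop the original sender so the receiver's iterator ends once
    // every worker has finished and dropped its clone.
    drop(event_sender);
    let mut finished: Vec<String> = event_receiver
        .iter()
        .map(|Event::Done(name)| name)
        .collect();
    for worker in workers {
        worker.join().expect("worker thread panicked");
    }
    // Arrival order depends on scheduling, so sort for a stable result.
    finished.sort();
    finished
}
```

<p>Because arrival order is not deterministic, the manager has to identify who sent each event from the message itself, which is why <code>Done</code> carries the service name.</p>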
<p>Once we have everything we need to track, <code>run()</code> actually does the <code>Job</code>. It will:</p>
<ol>
<li>Launch a task for each service</li>
<li>Send the full search we want to perform</li>
<li>Wait for and respond to reported work from the tasks</li>
<li>Signal all tasks to shut down when the work is done</li>
</ol><p>Two pieces of this process are the beating heart of the system. Here's the first of those:</p>
<div class="highlight highlight-rust"><pre><span class="k">impl</span> <span class="n">TaskManager</span> <span class="p">{</span>
<span class="c1">// ...</span>
<span class="k">fn</span> <span class="n">launch_services</span><span class="p">(</span><span class="o">&</span><span class="k">mut</span> <span class="n">self</span><span class="p">)</span> <span class="p">{</span>
<span class="k">for</span> <span class="p">(</span><span class="n">name</span><span class="p">,</span> <span class="n">stops</span><span class="p">)</span> <span class="n">in</span> <span class="n">self</span><span class="p">.</span><span class="n">services</span><span class="p">.</span><span class="n">clone</span><span class="p">().</span><span class="n">move_iter</span><span class="p">()</span> <span class="p">{</span>
<span class="k">let</span> <span class="n">task_event_sender</span> <span class="o">=</span> <span class="n">self</span><span class="p">.</span><span class="n">event_sender</span><span class="p">.</span><span class="n">clone</span><span class="p">();</span>
<span class="k">let</span> <span class="p">(</span><span class="n">search_sender</span><span class="p">,</span> <span class="n">search_receiver</span><span class="p">)</span> <span class="o">=</span> <span class="n">channel</span><span class="p">();</span>
<span class="n">self</span><span class="p">.</span><span class="n">multi_sender</span><span class="p">.</span><span class="n">add_sender</span><span class="p">(</span><span class="n">search_sender</span><span class="p">.</span><span class="n">clone</span><span class="p">());</span>
<span class="n">spawn</span><span class="p">(</span> <span class="n">proc</span><span class="p">()</span> <span class="p">{</span>
<span class="k">loop</span> <span class="p">{</span>
<span class="k">let</span> <span class="n">job</span> <span class="o">=</span> <span class="n">search_receiver</span><span class="p">.</span><span class="n">recv</span><span class="p">();</span>
<span class="k">match</span> <span class="n">job</span> <span class="p">{</span>
<span class="n">Work</span><span class="p">(</span><span class="n">search</span><span class="p">)</span> <span class="o">=></span> <span class="p">{</span>
<span class="k">if</span> <span class="n">stops</span><span class="p">.</span><span class="n">contains</span><span class="p">(</span><span class="o">&</span><span class="n">search</span><span class="p">.</span><span class="n">from</span><span class="p">)</span> <span class="o">&&</span>
<span class="n">stops</span><span class="p">.</span><span class="n">contains</span><span class="p">(</span><span class="o">&</span><span class="n">search</span><span class="p">.</span><span class="n">to</span><span class="p">)</span> <span class="p">{</span>
<span class="k">let</span> <span class="n">path</span> <span class="o">=</span> <span class="n">Path</span><span class="o">::</span><span class="n">new</span><span class="p">(</span>
<span class="n">search</span><span class="p">.</span><span class="n">from</span><span class="p">,</span>
<span class="n">search</span><span class="p">.</span><span class="n">to</span><span class="p">,</span>
<span class="n">name</span><span class="p">.</span><span class="n">clone</span><span class="p">()</span>
<span class="p">);</span>
<span class="k">let</span> <span class="n">paths</span> <span class="o">=</span> <span class="n">search</span><span class="p">.</span><span class="n">paths</span><span class="p">.</span><span class="n">append</span><span class="p">([</span><span class="n">path</span><span class="p">]);</span>
<span class="n">task_event_sender</span><span class="p">.</span><span class="n">send</span><span class="p">(</span><span class="n">Match</span><span class="p">(</span><span class="n">paths</span><span class="p">))</span>
<span class="p">}</span> <span class="k">else</span> <span class="p">{</span>
<span class="k">let</span> <span class="k">mut</span> <span class="n">tos</span> <span class="o">=</span> <span class="n">stops</span><span class="p">.</span><span class="n">clone</span><span class="p">();</span>
<span class="k">let</span> <span class="n">previous</span> <span class="o">=</span> <span class="n">search</span><span class="p">.</span><span class="n">stops</span><span class="p">();</span>
<span class="n">tos</span><span class="p">.</span><span class="n">retain</span><span class="p">(</span><span class="o">|</span><span class="n">stop</span><span class="o">|</span> <span class="o">!</span><span class="n">previous</span><span class="p">.</span><span class="n">contains</span><span class="p">(</span><span class="n">stop</span><span class="p">));</span>
<span class="k">if</span> <span class="o">!</span><span class="n">search</span><span class="p">.</span><span class="n">services</span><span class="p">().</span><span class="n">contains</span><span class="p">(</span><span class="o">&</span><span class="n">name</span><span class="p">)</span> <span class="o">&&</span>
<span class="n">stops</span><span class="p">.</span><span class="n">contains</span><span class="p">(</span><span class="o">&</span><span class="n">search</span><span class="p">.</span><span class="n">from</span><span class="p">)</span> <span class="o">&&</span>
<span class="o">!</span><span class="n">tos</span><span class="p">.</span><span class="n">is_empty</span><span class="p">()</span> <span class="p">{</span>
<span class="k">let</span> <span class="n">searches</span> <span class="o">=</span> <span class="n">tos</span><span class="p">.</span><span class="n">iter</span><span class="p">().</span><span class="n">map</span><span class="p">(</span> <span class="o">|</span><span class="n">to</span><span class="o">|</span> <span class="p">{</span>
<span class="k">let</span> <span class="n">path</span> <span class="o">=</span> <span class="n">Path</span><span class="o">::</span><span class="n">new</span><span class="p">(</span>
<span class="n">search</span><span class="p">.</span><span class="n">from</span><span class="p">.</span><span class="n">clone</span><span class="p">(),</span>
<span class="n">to</span><span class="p">.</span><span class="n">clone</span><span class="p">(),</span>
<span class="n">name</span><span class="p">.</span><span class="n">clone</span><span class="p">()</span>
<span class="p">);</span>
<span class="n">search</span><span class="p">.</span><span class="n">add_path</span><span class="p">(</span><span class="n">path</span><span class="p">)</span>
<span class="p">}</span> <span class="p">).</span><span class="n">collect</span><span class="p">();</span>
<span class="n">task_event_sender</span><span class="p">.</span><span class="n">send</span><span class="p">(</span><span class="n">Partial</span><span class="p">(</span><span class="n">searches</span><span class="p">));</span>
<span class="p">}</span> <span class="k">else</span> <span class="p">{</span>
<span class="n">task_event_sender</span><span class="p">.</span><span class="n">send</span><span class="p">(</span><span class="n">Done</span><span class="p">(</span><span class="n">name</span><span class="p">.</span><span class="n">clone</span><span class="p">()));</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="n">Finish</span> <span class="o">=></span> <span class="p">{</span> <span class="k">break</span><span class="p">;</span> <span class="p">}</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="p">}</span> <span class="p">);</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="c1">// ...</span>
<span class="p">}</span>
</pre></div>
<p>You're pretty much looking at a service task here (the whole part inside the call to <code>spawn()</code>). Inside, it's just an endless <code>loop</code> calling <code>recv()</code> to get new <code>Work</code>-wrapped <code>Search</code> objects from the channel. The first <code>if</code> branch inside the <code>match</code> of <code>Work(search)</code> handles the simple case of the service matching the <code>Search</code> exactly.</p>
<p>When the <code>else</code> branch is selected, because we don't have a direct match, some work is done to see if a partial match is possible. (This is a foolish algorithm, by the way. It does partial matches if it can directly match the <code>from</code> endpoint and not the <code>to</code>. This rules out some viable scenarios, but it helped to keep this already large example smaller.)</p>
<p>If a partial match is found, it's transformed into a list of new searches to try that may later find direct matches or more partial matches. When no direct or partial match is found, the <code>Done</code> flag is sent back so <code>TaskManager</code> knows to stop waiting on this task.</p>
<p>The <code>Finish</code> <code>match</code> clause just breaks out of the <code>loop</code> as described previously. Outside of <code>spawn()</code> is a simple loop that creates each task and some code that prepares the variables for the task to capture.</p>
<div class="highlight highlight-rust"><pre><span class="k">impl</span> <span class="n">TaskManager</span> <span class="p">{</span>
<span class="c1">// ...</span>
<span class="k">fn</span> <span class="n">wait_for_services</span><span class="p">(</span><span class="o">&</span><span class="k">mut</span> <span class="n">self</span><span class="p">)</span> <span class="p">{</span>
<span class="k">loop</span> <span class="p">{</span>
<span class="k">match</span> <span class="n">self</span><span class="p">.</span><span class="n">event_receiver</span><span class="p">.</span><span class="n">recv</span><span class="p">()</span> <span class="p">{</span>
<span class="n">Match</span><span class="p">(</span><span class="n">paths</span><span class="p">)</span> <span class="o">=></span> <span class="p">{</span>
<span class="k">let</span> <span class="n">name</span> <span class="o">=</span> <span class="n">paths</span><span class="p">.</span><span class="n">last</span><span class="p">().</span><span class="n">expect</span><span class="p">(</span><span class="s">"No path"</span><span class="p">).</span><span class="n">service</span><span class="p">.</span><span class="n">clone</span><span class="p">();</span>
<span class="n">self</span><span class="p">.</span><span class="n">tracker</span><span class="p">.</span><span class="n">mark_done</span><span class="p">(</span><span class="n">name</span><span class="p">);</span>
<span class="k">let</span> <span class="n">path_string</span> <span class="o">=</span> <span class="n">paths</span><span class="p">.</span><span class="n">iter</span><span class="p">().</span><span class="n">skip</span><span class="p">(</span><span class="m">1</span><span class="p">).</span><span class="n">fold</span><span class="p">(</span>
<span class="n">paths</span><span class="p">[</span><span class="m">0</span><span class="p">].</span><span class="n">to_string</span><span class="p">(),</span>
<span class="o">|</span><span class="n">s</span><span class="p">,</span> <span class="n">p</span><span class="o">|</span> <span class="n">s</span><span class="p">.</span><span class="n">append</span><span class="p">(</span><span class="n">p</span><span class="p">.</span><span class="n">to_string</span><span class="p">().</span><span class="n">as_slice</span><span class="p">().</span><span class="n">slice_from</span><span class="p">(</span><span class="m">1</span><span class="p">))</span>
<span class="p">);</span>
<span class="n">println</span><span class="o">!</span><span class="p">(</span><span class="s">"Path: {}"</span><span class="p">,</span> <span class="n">path_string</span><span class="p">);</span>
<span class="p">}</span>
<span class="n">Partial</span><span class="p">(</span><span class="n">searches</span><span class="p">)</span> <span class="o">=></span> <span class="p">{</span>
<span class="k">let</span> <span class="n">name</span> <span class="o">=</span> <span class="n">searches</span><span class="p">.</span><span class="n">last</span><span class="p">()</span>
<span class="p">.</span><span class="n">expect</span><span class="p">(</span><span class="s">"No search"</span><span class="p">)</span>
<span class="p">.</span><span class="n">paths</span>
<span class="p">.</span><span class="n">last</span><span class="p">()</span>
<span class="p">.</span><span class="n">expect</span><span class="p">(</span><span class="s">"No path"</span><span class="p">)</span>
<span class="p">.</span><span class="n">service</span>
<span class="p">.</span><span class="n">clone</span><span class="p">();</span>
<span class="n">self</span><span class="p">.</span><span class="n">tracker</span><span class="p">.</span><span class="n">mark_done</span><span class="p">(</span><span class="n">name</span><span class="p">);</span>
<span class="k">for</span> <span class="n">search</span> <span class="n">in</span> <span class="n">searches</span><span class="p">.</span><span class="n">iter</span><span class="p">()</span> <span class="p">{</span>
<span class="n">self</span><span class="p">.</span><span class="n">send_job</span><span class="p">(</span><span class="n">Work</span><span class="p">(</span><span class="n">search</span><span class="p">.</span><span class="n">clone</span><span class="p">()));</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="n">Done</span><span class="p">(</span><span class="n">name</span><span class="p">)</span> <span class="o">=></span> <span class="p">{</span> <span class="n">self</span><span class="p">.</span><span class="n">tracker</span><span class="p">.</span><span class="n">mark_done</span><span class="p">(</span><span class="n">name</span><span class="p">)</span> <span class="p">}</span>
<span class="p">}</span>
<span class="k">if</span> <span class="n">self</span><span class="p">.</span><span class="n">tracker</span><span class="p">.</span><span class="n">is_done</span><span class="p">()</span> <span class="p">{</span> <span class="k">break</span><span class="p">;</span> <span class="p">}</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="c1">// ...</span>
<span class="p">}</span>
</pre></div>
<p>This chunk of code is the other half of the puzzle. It's another infinite <code>loop</code> listening on the <code>Event</code> channel. A full <code>Match</code> is pretty-printed and a <code>Partial</code> is sent back out to the services in a wave of new searches. Regardless of the <code>Event</code> type, we record the response for the sending service, though where we find the service name varies by case. This allows us to exit this <code>loop</code> and the program when our <code>tracker</code> says we're done.</p>
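<p>For readers more at home in Ruby, here is a rough sketch of the same two-loop shape using threads and queues. It is not a port of the Rust code: <code>Work</code>, <code>Finish</code>, <code>Done</code>, and <code>run_workers</code> are hypothetical names, the "work" is just doubling a number, and jobs are dealt out round-robin instead of being broadcast to every service.</p>

```ruby
require "thread"

# Hypothetical stand-ins for the post's Job and Event messages.
Work   = Struct.new(:number)
Finish = Object.new
Done   = Struct.new(:worker, :result)

def run_workers(worker_count, numbers)
  events  = Queue.new                              # workers -> manager
  inboxes = Array.new(worker_count) { Queue.new }  # manager -> each worker

  threads = inboxes.each_with_index.map do |inbox, id|
    Thread.new do
      loop do
        job = inbox.pop                 # blocks until a message arrives
        break if job.equal?(Finish)     # shutdown signal from the manager
        events << Done.new(id, job.number * 2)
      end
    end
  end

  # Deal the jobs out round-robin (a simplification of the broadcast).
  numbers.each_with_index do |n, i|
    inboxes[i % worker_count] << Work.new(n)
  end

  # The manager's loop: one reply is expected for every job sent.
  results = numbers.size.times.map { events.pop.result }

  inboxes.each { |inbox| inbox << Finish }
  threads.each(&:join)
  results.sort
end

p run_workers(3, [1, 2, 3, 4])
```

<p>The part worth noticing is that the manager counts outstanding replies, which is the job the <code>tracker</code> does above, and only then tells every worker to finish.</p>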
<p>There's only one final method on <code>TaskManager</code> and it's what actually sends the messages to the service tasks, tracking each new <code>Search</code> as it goes out:</p>
<div class="highlight highlight-rust"><pre><span class="k">impl</span> <span class="n">TaskManager</span> <span class="p">{</span>
<span class="c1">// ...</span>
<span class="k">fn</span> <span class="n">send_job</span><span class="p">(</span><span class="o">&</span><span class="k">mut</span> <span class="n">self</span><span class="p">,</span> <span class="n">job</span><span class="o">:</span> <span class="n">Job</span><span class="p">)</span> <span class="p">{</span>
<span class="k">match</span> <span class="n">job</span> <span class="p">{</span>
<span class="n">Work</span><span class="p">(</span><span class="n">_</span><span class="p">)</span> <span class="o">=></span> <span class="p">{</span> <span class="n">self</span><span class="p">.</span><span class="n">tracker</span><span class="p">.</span><span class="n">add_search</span><span class="p">();</span> <span class="p">}</span>
<span class="n">Finish</span> <span class="o">=></span> <span class="p">{</span> <span class="cm">/* do nothing */</span> <span class="p">}</span>
<span class="p">}</span>
<span class="n">self</span><span class="p">.</span><span class="n">multi_sender</span><span class="p">.</span><span class="n">send</span><span class="p">(</span><span class="n">job</span><span class="p">);</span>
<span class="p">}</span>
<span class="p">}</span>
</pre></div>
<p>You can find <a href="https://github.com/JEG2/learning_rust/blob/be1cb6cca05dbc92368073fd6c7d703df4b98350/pathfinder/pathfinder.rs">the full code</a> on GitHub.</p>James Edward Gray IISleepy Programstag:graysoftinc.com,2014-08-22:/posts/1252014-09-05T19:11:03ZCan we do modern message passing multiprocessing in Ruby across processes and threads? Sure can.<p>When we think of real multiprocessing, our thoughts probably drift more towards languages like Erlang, Go, Clojure, or Rust. Such languages really focus on getting separate "processes" to communicate via messages. This makes it a lot easier to know when one process is waiting on another, because calls to receive messages typically block until one is available.</p>
<p>But what about Ruby? Can we do intelligent process coordination in Ruby?</p>
<p>Yes, we can. The tools for it are more awkward though. It's easy to run into tricky edge cases and hard to code your way out of them correctly.</p>
<p>Let's play with an example to see how good we can make things. Here's what we will do:</p>
<ol>
<li>We will start one parent process that will <code>fork()</code> a single child process</li>
<li>The child will push three messages onto a RabbitMQ queue and <code>exit()</code>
</li>
<li>The parent will listen for three messages to arrive, then <code>exit()</code>
</li>
</ol><p>Here's a somewhat sloppy first attempt at solving this:</p>
<div class="highlight highlight-ruby"><pre><span class="c1">#!/usr/bin/env ruby</span>
<span class="nb">require</span> <span class="s2">"benchmark"</span>
<span class="nb">require</span> <span class="s2">"bunny"</span>
<span class="no">QUEUE_NAME</span> <span class="o">=</span> <span class="s2">"example"</span>
<span class="no">MESSAGES</span> <span class="o">=</span> <span class="sx">%w[first second third]</span>
<span class="k">def</span> <span class="nf">send_messages</span><span class="p">(</span><span class="o">*</span><span class="n">messages</span><span class="p">)</span>
<span class="n">connection</span> <span class="o">=</span> <span class="no">Bunny</span><span class="o">.</span><span class="n">new</span><span class="o">.</span><span class="n">tap</span><span class="p">(</span><span class="o">&</span><span class="ss">:start</span><span class="p">)</span>
<span class="n">exchange</span> <span class="o">=</span> <span class="n">connection</span><span class="o">.</span><span class="n">create_channel</span><span class="o">.</span><span class="n">default_exchange</span>
<span class="n">messages</span><span class="o">.</span><span class="n">each</span> <span class="k">do</span> <span class="o">|</span><span class="n">message</span><span class="o">|</span>
<span class="n">exchange</span><span class="o">.</span><span class="n">publish</span><span class="p">(</span><span class="n">message</span><span class="p">,</span> <span class="n">routing_key</span><span class="p">:</span> <span class="no">QUEUE_NAME</span><span class="p">)</span>
<span class="k">end</span>
<span class="n">connection</span><span class="o">.</span><span class="n">close</span>
<span class="k">end</span>
<span class="k">def</span> <span class="nf">listen_for_messages</span><span class="p">(</span><span class="n">received_messages</span><span class="p">)</span>
<span class="n">connection</span> <span class="o">=</span> <span class="no">Bunny</span><span class="o">.</span><span class="n">new</span><span class="o">.</span><span class="n">tap</span><span class="p">(</span><span class="o">&</span><span class="ss">:start</span><span class="p">)</span>
<span class="n">queue</span> <span class="o">=</span> <span class="n">connection</span><span class="o">.</span><span class="n">create_channel</span><span class="o">.</span><span class="n">queue</span><span class="p">(</span><span class="no">QUEUE_NAME</span><span class="p">,</span> <span class="n">auto_delete</span><span class="p">:</span> <span class="kp">true</span><span class="p">)</span>
<span class="n">queue</span><span class="o">.</span><span class="n">subscribe</span> <span class="k">do</span> <span class="o">|</span><span class="n">delivery_info</span><span class="p">,</span> <span class="n">metadata</span><span class="p">,</span> <span class="n">payload</span><span class="o">|</span>
<span class="n">received_messages</span> <span class="o"><<</span> <span class="n">payload</span>
<span class="k">end</span>
<span class="n">time_it</span><span class="p">(</span><span class="s2">"Received </span><span class="si">#{</span><span class="no">MESSAGES</span><span class="o">.</span><span class="n">size</span><span class="si">}</span><span class="s2"> messages"</span><span class="p">)</span> <span class="k">do</span>
<span class="k">yield</span>
<span class="k">end</span>
<span class="n">connection</span><span class="o">.</span><span class="n">close</span>
<span class="k">end</span>
<span class="k">def</span> <span class="nf">time_it</span><span class="p">(</span><span class="nb">name</span><span class="p">)</span>
<span class="n">elapsed</span> <span class="o">=</span> <span class="no">Benchmark</span><span class="o">.</span><span class="n">realtime</span> <span class="k">do</span>
<span class="k">yield</span>
<span class="k">end</span>
<span class="nb">puts</span> <span class="s2">"%s: %.2fs"</span> <span class="o">%</span> <span class="o">[</span><span class="nb">name</span><span class="p">,</span> <span class="n">elapsed</span><span class="o">]</span>
<span class="k">end</span>
<span class="k">def</span> <span class="nf">wait_for_messages</span><span class="p">(</span><span class="n">received_messages</span><span class="p">)</span>
<span class="k">until</span> <span class="n">received_messages</span> <span class="o">==</span> <span class="no">MESSAGES</span>
<span class="nb">sleep</span> <span class="mi">0</span><span class="o">.</span><span class="mi">1</span> <span class="c1"># don't peg the CPU while we wait</span>
<span class="k">end</span>
<span class="k">end</span>
<span class="k">def</span> <span class="nf">send_and_receive</span>
<span class="n">pid</span> <span class="o">=</span> <span class="nb">fork</span> <span class="k">do</span>
<span class="nb">sleep</span> <span class="mi">3</span> <span class="c1"># make sure we're receiving before they are sent</span>
<span class="n">send_messages</span><span class="p">(</span><span class="o">*</span><span class="no">MESSAGES</span><span class="p">)</span>
<span class="k">end</span>
<span class="no">Process</span><span class="o">.</span><span class="n">detach</span><span class="p">(</span><span class="n">pid</span><span class="p">)</span>
<span class="n">received_messages</span> <span class="o">=</span> <span class="o">[</span> <span class="o">]</span>
<span class="n">listen_for_messages</span><span class="p">(</span><span class="n">received_messages</span><span class="p">)</span> <span class="k">do</span>
<span class="n">wait_for_messages</span><span class="p">(</span><span class="n">received_messages</span><span class="p">)</span>
<span class="k">end</span>
<span class="k">end</span>
<span class="n">send_and_receive</span>
</pre></div>
<p>Let's talk about each piece of this code real quick. You can mostly ignore the first two methods, <code>send_messages()</code> and <code>listen_for_messages()</code>. These are just wrappers over RabbitMQ's publish and subscribe process. The only tricky bit is that <code>listen_for_messages()</code> does a <code>yield</code> after subscribing to the queue. The reason for this is that subscribing just spins up a separate <code>Thread</code> which will call the passed block as messages arrive. That's happening in the background, which means the main <code>Thread</code> needs to find some way to wait until we have received the expected messages. The <code>yield</code> gives us a place to insert this waiting code.</p>
<p>The next two methods, <code>time_it()</code> and <code>wait_for_messages()</code>, are simple helpers. I added the first mainly to give us some noticeable output. The latter performs the waiting and checking discussed above.</p>
<p>The real action happens in <code>send_and_receive()</code>. This method should look a lot like the steps we defined earlier: <code>fork()</code> off a child, <code>send_messages()</code>, then <code>listen_for_messages()</code>.</p>
<p>Now this code has a couple of problems. One way to see them is to run it:</p>
<pre><code>$ ruby sleepy.rb
Received 3 messages: 3.03s
</code></pre>
<p>Doesn't three seconds sound a little slow for modern hardware communicating via a super efficient queuing system? Yeah, it is.</p>
<p>I actually put the sleeps in the code manually. Look for these two lines:</p>
<pre><code># ...
sleep 0.1 # don't peg the CPU while we wait
# ...
sleep 3 # make sure we're receiving before they are sent
# ...
</code></pre>
<p>Now it's obvious where the three second delay is coming from, eh? Let's talk about why I added that second <code>sleep()</code>.</p>
<p>The issue is that once we <code>fork()</code> that child process, it's off to the races. The parent process will continue running too, but we don't know who will get to what first. If the child fires off messages before the parent is listening for them, they will be missed. Instead we need the child to wait until the parent is ready to begin the experiment.</p>
<p>My three second sleep is one crude way to sort of handle this. I just delay the child for a significant period of time in computerland. Odds are that the parent will be set up by the time it starts sending. It could still fail though, if my machine was under heavy load at the time and it didn't give my parent process enough attention before the child woke up. Plus, it's slowing our experiment way down. In other words, this is a bad idea all around.</p>
<p>The good news is that we can fix it by making some semi-cryptic changes to just one method:</p>
<div class="highlight highlight-ruby"><pre><span class="k">def</span> <span class="nf">send_and_receive</span>
<span class="n">reader</span><span class="p">,</span> <span class="n">writer</span> <span class="o">=</span> <span class="no">IO</span><span class="o">.</span><span class="n">pipe</span>
<span class="n">pid</span> <span class="o">=</span> <span class="nb">fork</span> <span class="k">do</span>
<span class="n">writer</span><span class="o">.</span><span class="n">close</span>
<span class="n">reader</span><span class="o">.</span><span class="n">read</span>
<span class="n">reader</span><span class="o">.</span><span class="n">close</span>
<span class="n">send_messages</span><span class="p">(</span><span class="o">*</span><span class="no">MESSAGES</span><span class="p">)</span>
<span class="k">end</span>
<span class="no">Process</span><span class="o">.</span><span class="n">detach</span><span class="p">(</span><span class="n">pid</span><span class="p">)</span>
<span class="n">reader</span><span class="o">.</span><span class="n">close</span>
<span class="n">received_messages</span> <span class="o">=</span> <span class="o">[</span> <span class="o">]</span>
<span class="n">listen_for_messages</span><span class="p">(</span><span class="n">received_messages</span><span class="p">)</span> <span class="k">do</span>
<span class="n">writer</span><span class="o">.</span><span class="n">puts</span> <span class="s2">"ready"</span>
<span class="n">writer</span><span class="o">.</span><span class="n">close</span>
<span class="n">wait_for_messages</span><span class="p">(</span><span class="n">received_messages</span><span class="p">)</span>
<span class="k">end</span>
<span class="k">end</span>
</pre></div>
<p>As you can see, I've introduced a pipe. A pipe is a one-way communication channel between processes. You get an endpoint to write to and another to read from. After you <code>fork()</code>, it's good practice to have each side <code>close()</code> the end they're not using. Then I just have the child call <code>read()</code> on the pipe. Called without a length, <code>read()</code> blocks until it reaches end-of-file, which here means until the parent has written its content and closed the write end. The parent completes its setup, including subscribing to the queue, and then it pushes a simple <code>"ready"</code> message down the pipe and closes it. That will get the child unblocked and sending messages.</p>
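<p>If you want to poke at the handshake without RabbitMQ in the way, here's a minimal sketch. The <code>handshake_demo</code> method and the second pipe are my own additions for illustration: the extra pipe just lets the child report back what it read so we can see that it really waited for the parent.</p>

```ruby
# Returns whatever the child read from the "ready" pipe.
def handshake_demo
  ready_reader, ready_writer   = IO.pipe  # parent -> child: "you may start"
  result_reader, result_writer = IO.pipe  # child -> parent: what the child saw

  pid = fork do
    # Close the ends the child doesn't use.
    ready_writer.close
    result_reader.close

    signal = ready_reader.read  # blocks until the parent closes its writer
    ready_reader.close

    result_writer.write(signal)
    result_writer.close
  end

  # Close the ends the parent doesn't use.
  ready_reader.close
  result_writer.close

  # ... any parent setup would happen here ...

  ready_writer.puts "ready"
  ready_writer.close            # end-of-file is what releases the child's read

  seen = result_reader.read
  result_reader.close
  Process.wait(pid)
  seen
end

p handshake_demo
```

<p>Closing the unused ends matters here. If the child kept its copy of the writer open, its own <code>read()</code> would never see end-of-file.</p>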
<p>Does this change help? Yes, a lot:</p>
<pre><code>$ ruby sleepy.rb
Received 3 messages: 0.10s
</code></pre>
<p>We're three seconds faster.</p>
<p>Unfortunately, the remaining delay looks suspiciously like my other call to <code>sleep()</code>. Here's that code to refresh your memory:</p>
<div class="highlight highlight-ruby"><pre><span class="k">def</span> <span class="nf">wait_for_messages</span><span class="p">(</span><span class="n">received_messages</span><span class="p">)</span>
<span class="k">until</span> <span class="n">received_messages</span> <span class="o">==</span> <span class="no">MESSAGES</span>
<span class="nb">sleep</span> <span class="mi">0</span><span class="o">.</span><span class="mi">1</span> <span class="c1"># don't peg the CPU while we wait</span>
<span class="k">end</span>
<span class="k">end</span>
</pre></div>
<p>This loop just periodically checks to see if we have our three messages yet. We could technically remove the call to <code>sleep()</code> here and it would run. However, it would waste a lot of CPU time just checking these messages over and over again as fast as possible. Ironically, that bid for speed might starve the child process of resources and slow things down. So we kind of need the <code>sleep()</code>, or something like it.</p>
<p>But the problem remains that we're likely getting our messages very quickly and then just waiting for a <code>sleep()</code> call to run out so we notice they have arrived. We can do better with one simple change:</p>
<div class="highlight highlight-ruby"><pre><span class="k">def</span> <span class="nf">listen_for_messages</span><span class="p">(</span><span class="n">received_messages</span><span class="p">)</span>
<span class="n">connection</span> <span class="o">=</span> <span class="no">Bunny</span><span class="o">.</span><span class="n">new</span><span class="o">.</span><span class="n">tap</span><span class="p">(</span><span class="o">&</span><span class="ss">:start</span><span class="p">)</span>
<span class="n">queue</span> <span class="o">=</span> <span class="n">connection</span><span class="o">.</span><span class="n">create_channel</span><span class="o">.</span><span class="n">queue</span><span class="p">(</span><span class="no">QUEUE_NAME</span><span class="p">,</span> <span class="n">auto_delete</span><span class="p">:</span> <span class="kp">true</span><span class="p">)</span>
<span class="n">main_thread</span> <span class="o">=</span> <span class="no">Thread</span><span class="o">.</span><span class="n">current</span>
<span class="n">queue</span><span class="o">.</span><span class="n">subscribe</span> <span class="k">do</span> <span class="o">|</span><span class="n">delivery_info</span><span class="p">,</span> <span class="n">metadata</span><span class="p">,</span> <span class="n">payload</span><span class="o">|</span>
<span class="n">received_messages</span> <span class="o"><<</span> <span class="n">payload</span>
<span class="n">main_thread</span><span class="o">.</span><span class="n">wakeup</span>
<span class="k">end</span>
<span class="n">time_it</span><span class="p">(</span><span class="s2">"Received </span><span class="si">#{</span><span class="no">MESSAGES</span><span class="o">.</span><span class="n">size</span><span class="si">}</span><span class="s2"> messages"</span><span class="p">)</span> <span class="k">do</span>
<span class="k">yield</span>
<span class="k">end</span>
<span class="n">connection</span><span class="o">.</span><span class="n">close</span>
<span class="k">end</span>
</pre></div>
<p>The difference here is that I capture the <code>main_thread</code> before I set up my subscription. Remember, that block will be called in a different <code>Thread</code>. Then, each time I receive a message, I cancel any <code>sleep()</code> the <code>main_thread</code> is currently doing with a call to <code>wakeup()</code>. This means it will recheck, when it should, as new messages arrive.</p>
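<p>Here's a small sketch of that effect on its own, with no queue involved. The <code>interrupted_sleep</code> method and its parameters are hypothetical names of mine. The main <code>Thread</code> asks for a long nap and a second <code>Thread</code> cuts it short:</p>

```ruby
require "benchmark"

# Sleeps for up to `nap` seconds, but a helper thread wakes us early.
# Returns how long the sleep actually lasted.
def interrupted_sleep(nap: 5, wake_after: 0.1)
  sleeper = Thread.current

  waker = Thread.new do
    sleep wake_after  # give the sleeper time to actually be asleep
    sleeper.wakeup    # cancel whatever sleep it is in
  end

  elapsed = Benchmark.realtime { sleep nap }
  waker.join
  elapsed
end

puts "Slept for %.2fs" % interrupted_sleep
```

<p>The reported time should land near <code>wake_after</code>, not <code>nap</code>, which is the same reason the listener above notices new messages almost immediately.</p>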
<p>That gives us another significant speed boost:</p>
<pre><code>$ ruby sleepy.rb
Received 3 messages: 0.01s
</code></pre>
<p>I would probably stop here, but I should warn you that my solution isn't perfect. Some might be tempted to take this final step:</p>
<div class="highlight highlight-ruby"><pre><span class="k">def</span> <span class="nf">wait_for_messages</span><span class="p">(</span><span class="n">received_messages</span><span class="p">)</span>
<span class="k">until</span> <span class="n">received_messages</span> <span class="o">==</span> <span class="no">MESSAGES</span>
<span class="nb">sleep</span>
<span class="k">end</span>
<span class="k">end</span>
</pre></div>
<p>Here the short <code>sleep()</code> has been changed into an indefinite one. You would think this is OK, because the other <code>Thread</code> will wake us when the time comes. Sadly, it's not, because my last fix added a race condition. Consider what would happen if the <code>Thread</code>s executed code in this order:</p>
<div class="highlight highlight-ruby"><pre><span class="c1"># ...</span>
<span class="c1"># first the main thread checks, but finds only two of the three messages:</span>
<span class="k">until</span> <span class="n">received_messages</span> <span class="o">==</span> <span class="no">MESSAGES</span>
<span class="c1"># ...</span>
<span class="c1"># then the listening thread queues the final message and wakes the main</span>
<span class="c1"># thread (this has no effect since it isn't currently sleeping):</span>
<span class="n">received_messages</span> <span class="o"><<</span> <span class="n">payload</span>
<span class="n">main_thread</span><span class="o">.</span><span class="n">wakeup</span>
<span class="c1"># ...</span>
<span class="c1"># finally the main thread goes back to sleep, forever:</span>
<span class="nb">sleep</span>
</pre></div>
<p>As long as you leave my short <code>sleep</code>, you'll only pay a small penalty if this edge case does kick in.</p>
<p>Could we ensure it didn't happen though? Yes, with more message passing! Here's the final code:</p>
<div class="highlight highlight-ruby"><pre><span class="c1">#!/usr/bin/env ruby</span>
<span class="nb">require</span> <span class="s2">"benchmark"</span>
<span class="nb">require</span> <span class="s2">"thread"</span>
<span class="nb">require</span> <span class="s2">"bunny"</span>
<span class="no">QUEUE_NAME</span> <span class="o">=</span> <span class="s2">"example"</span>
<span class="no">MESSAGES</span> <span class="o">=</span> <span class="sx">%w[first second third]</span>
<span class="k">def</span> <span class="nf">send_messages</span><span class="p">(</span><span class="o">*</span><span class="n">messages</span><span class="p">)</span>
<span class="n">connection</span> <span class="o">=</span> <span class="no">Bunny</span><span class="o">.</span><span class="n">new</span><span class="o">.</span><span class="n">tap</span><span class="p">(</span><span class="o">&</span><span class="ss">:start</span><span class="p">)</span>
<span class="n">exchange</span> <span class="o">=</span> <span class="n">connection</span><span class="o">.</span><span class="n">create_channel</span><span class="o">.</span><span class="n">default_exchange</span>
<span class="n">messages</span><span class="o">.</span><span class="n">each</span> <span class="k">do</span> <span class="o">|</span><span class="n">message</span><span class="o">|</span>
<span class="n">exchange</span><span class="o">.</span><span class="n">publish</span><span class="p">(</span><span class="n">message</span><span class="p">,</span> <span class="n">routing_key</span><span class="p">:</span> <span class="no">QUEUE_NAME</span><span class="p">)</span>
<span class="k">end</span>
<span class="n">connection</span><span class="o">.</span><span class="n">close</span>
<span class="k">end</span>
<span class="k">def</span> <span class="nf">listen_for_messages</span><span class="p">(</span><span class="n">received_messages</span><span class="p">,</span> <span class="n">check_queue</span><span class="p">,</span> <span class="n">listen_queue</span><span class="p">)</span>
<span class="n">connection</span> <span class="o">=</span> <span class="no">Bunny</span><span class="o">.</span><span class="n">new</span><span class="o">.</span><span class="n">tap</span><span class="p">(</span><span class="o">&</span><span class="ss">:start</span><span class="p">)</span>
<span class="n">queue</span> <span class="o">=</span> <span class="n">connection</span><span class="o">.</span><span class="n">create_channel</span><span class="o">.</span><span class="n">queue</span><span class="p">(</span><span class="no">QUEUE_NAME</span><span class="p">,</span> <span class="n">auto_delete</span><span class="p">:</span> <span class="kp">true</span><span class="p">)</span>
<span class="n">queue</span><span class="o">.</span><span class="n">subscribe</span> <span class="k">do</span> <span class="o">|</span><span class="n">delivery_info</span><span class="p">,</span> <span class="n">metadata</span><span class="p">,</span> <span class="n">payload</span><span class="o">|</span>
<span class="n">received_messages</span> <span class="o"><<</span> <span class="n">payload</span>
<span class="n">check_queue</span> <span class="o"><<</span> <span class="ss">:check</span>
<span class="n">listen_queue</span><span class="o">.</span><span class="n">pop</span>
<span class="k">end</span>
<span class="n">time_it</span><span class="p">(</span><span class="s2">"Received </span><span class="si">#{</span><span class="no">MESSAGES</span><span class="o">.</span><span class="n">size</span><span class="si">}</span><span class="s2"> messages"</span><span class="p">)</span> <span class="k">do</span>
<span class="k">yield</span>
<span class="k">end</span>
<span class="n">connection</span><span class="o">.</span><span class="n">close</span>
<span class="k">end</span>
<span class="k">def</span> <span class="nf">time_it</span><span class="p">(</span><span class="nb">name</span><span class="p">)</span>
<span class="n">elapsed</span> <span class="o">=</span> <span class="no">Benchmark</span><span class="o">.</span><span class="n">realtime</span> <span class="k">do</span>
<span class="k">yield</span>
<span class="k">end</span>
<span class="nb">puts</span> <span class="s2">"%s: %.2fs"</span> <span class="o">%</span> <span class="o">[</span><span class="nb">name</span><span class="p">,</span> <span class="n">elapsed</span><span class="o">]</span>
<span class="k">end</span>
<span class="k">def</span> <span class="nf">wait_for_messages</span><span class="p">(</span><span class="n">received_messages</span><span class="p">,</span> <span class="n">check_queue</span><span class="p">,</span> <span class="n">listen_queue</span><span class="p">)</span>
<span class="kp">loop</span> <span class="k">do</span>
<span class="n">check_queue</span><span class="o">.</span><span class="n">pop</span>
<span class="k">break</span> <span class="k">if</span> <span class="n">received_messages</span> <span class="o">==</span> <span class="no">MESSAGES</span>
<span class="n">listen_queue</span> <span class="o"><<</span> <span class="ss">:listen</span>
<span class="k">end</span>
<span class="k">end</span>
<span class="k">def</span> <span class="nf">send_and_receive</span>
<span class="n">reader</span><span class="p">,</span> <span class="n">writer</span> <span class="o">=</span> <span class="no">IO</span><span class="o">.</span><span class="n">pipe</span>
<span class="n">pid</span> <span class="o">=</span> <span class="nb">fork</span> <span class="k">do</span>
<span class="n">writer</span><span class="o">.</span><span class="n">close</span>
<span class="n">reader</span><span class="o">.</span><span class="n">read</span>
<span class="n">reader</span><span class="o">.</span><span class="n">close</span>
<span class="n">send_messages</span><span class="p">(</span><span class="o">*</span><span class="no">MESSAGES</span><span class="p">)</span>
<span class="k">end</span>
<span class="no">Process</span><span class="o">.</span><span class="n">detach</span><span class="p">(</span><span class="n">pid</span><span class="p">)</span>
<span class="n">reader</span><span class="o">.</span><span class="n">close</span>
<span class="n">received_messages</span> <span class="o">=</span> <span class="o">[</span> <span class="o">]</span>
<span class="n">check_queue</span> <span class="o">=</span> <span class="no">Queue</span><span class="o">.</span><span class="n">new</span>
<span class="n">listen_queue</span> <span class="o">=</span> <span class="no">Queue</span><span class="o">.</span><span class="n">new</span>
<span class="n">listen_for_messages</span><span class="p">(</span><span class="n">received_messages</span><span class="p">,</span> <span class="n">check_queue</span><span class="p">,</span> <span class="n">listen_queue</span><span class="p">)</span> <span class="k">do</span>
<span class="n">writer</span><span class="o">.</span><span class="n">puts</span> <span class="s2">"ready"</span>
<span class="n">writer</span><span class="o">.</span><span class="n">close</span>
<span class="n">wait_for_messages</span><span class="p">(</span><span class="n">received_messages</span><span class="p">,</span> <span class="n">check_queue</span><span class="p">,</span> <span class="n">listen_queue</span><span class="p">)</span>
<span class="k">end</span>
<span class="k">end</span>
<span class="n">send_and_receive</span>
</pre></div>
<p>Look Ma, no <code>sleep()</code>!</p>
<p>My changes here are very similar to the earlier pipe trick, only I used a <code>Thread</code>-safe <code>Queue</code>. The <code>pop()</code> method of a <code>Queue</code> will block waiting just like <code>IO</code>'s <code>read()</code> did. I also had to introduce two <code>Queue</code>s, because I needed two-way communication. The listening <code>Thread</code> now tells the main <code>Thread</code> when it's time to check and it won't resume listening again until the main <code>Thread</code> gives approval.</p>
<p>I think this version is safe from race conditions and it doesn't wake up periodically to check things that haven't changed. It's also still as fast as the unsafe version.</p>
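<p>To see the two-<code>Queue</code> handshake on its own, here is a minimal sketch with the RabbitMQ parts removed. It is not the code from this post: the <code>listener</code> thread below is only a stand-in for Bunny's subscription block, and the <code>EXPECTED</code> constant and the final <code>:done</code> message are names made up for the illustration.</p>
<div class="highlight highlight-ruby"><pre>require "thread"

EXPECTED = %w[first second third]

received     = [ ]
check_queue  = Queue.new  # listener to main: "time to check"
listen_queue = Queue.new  # main to listener: "go ahead and keep listening"

# Hypothetical stand-in for the subscription thread that Bunny would run.
listener = Thread.new do
  EXPECTED.each do |payload|
    received.push(payload)
    check_queue.push(:check)
    listen_queue.pop  # block until the main thread has finished checking
  end
end

loop do
  check_queue.pop  # block until the listener reports a new message
  break if received == EXPECTED
  listen_queue.push(:listen)
end

listen_queue.push(:done)  # release the listener so it can finish cleanly
listener.join
puts "Received #{received.size} messages"
</pre></div>
<p>Because each side blocks in <code>pop()</code> until the other has pushed, only one of the two threads is ever touching <code>received</code> at a time, and there is no wakeup that can be missed.</p>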
<p>If you must do safe multiprocessing, in any language, just pass messages.</p>
James Edward Gray II
Erlang Message Passing
tag:graysoftinc.com,2007-08-13:/posts/36
2014-04-05T14:27:05Z
Let's take a look at something another programming language does well and see how we might bring at least some elements of these features to Ruby.
<p>Like many <a href="http://www.pragmaticprogrammer.com/">Pragmatic Programmer</a> fans, I've been having a look at <a href="http://www.erlang.org/">Erlang</a> recently by working my way through <a href="http://www.pragmaticprogrammer.com/titles/jaerlang/">Programming Erlang</a>. In the book, the author includes a challenge: build a message ring of <em>processes</em> of size M and send a message around the ring N times, timing how long this takes. The author also suggests doing this in other languages and comparing the results. Having now done this, I can tell you that it is an interesting exercise.</p>
<p>First, the Erlang results. Here's a sample run that creates 30,000 processes and sends a message around that ring 1,000 times:</p>
<pre><code>$ erl -noshell -s solution start 30000 1000
Creating 30000 processes (32768 allowed)...
Done.
Timer started.
Sending a message around the ring 1000 times...
Done: success
Time in seconds: 29
</code></pre>
<p>So we see about 30,000,000 message passes there in roughly 30 seconds. I should also note that Erlang creates those processes very, very fast. It's possible to raise the process limit shown there, but I'm more interested in comparing what these languages can do out of the box.</p>
<p>Now Ruby doesn't have an equivalent to Erlang processes, so we need to decide what the proper replacement is. The first thing I tried was <code>fork()</code>ing some Unix processes:</p>
<pre><code>$ ruby forked_mring.rb 100 10000
Creating 100 processes...
Timer started.
Sending a message around the ring 10000 times...
Done.
Done: success.
Time in seconds: 32
</code></pre>
<p>You should notice here the small number of processes I could create using the default limits imposed by my operating system. Again, it's possible to raise this limit, but I don't think I'm going to get it up to 30,000 very easily. As with Erlang, though, these processes were created very quickly.</p>
<p>So here we are passing 1,000,000 messages in about the same amount of time.</p>
<p>In an attempt to bypass the low process limit, I wrote another implementation with Ruby's threads. The results of that aren't too impressive though:</p>
<pre><code>$ ruby threaded_mring.rb 100 1000
Using the standard Ruby thread library.
Creating 100 processes...
Timer started.
Sending a message around the ring 1000 times...
Done: success.
Time in seconds: 32
$ ruby threaded_mring.rb 1000 4
Using the standard Ruby thread library.
Creating 1000 processes...
Timer started.
Sending a message around the ring 4 times...
Done: success.
Time in seconds: 30
</code></pre>
<p>You should see from the second run that it is possible to create quite a few more threads, but I need to mention that creating that many took around 15 seconds. Sadly, both of these runs paint an ugly picture: introducing synchronization just kills performance. Using the fastthread library doesn't help as much as we would like:</p>
<pre><code>$ ruby -rubygems threaded_mring.rb 100 1000
Using the fastthread library.
Creating 100 processes...
Timer started.
Sending a message around the ring 1000 times...
Done: success.
Time in seconds: 28
$ ruby -rubygems threaded_mring.rb 1000 5
Using the fastthread library.
Creating 1000 processes...
Timer started.
Sending a message around the ring 5 times...
Done: success.
Time in seconds: 29
</code></pre>
<p>So at best, we're passing 100,000 and 5,000 messages in our roughly 30 second timeframe, depending on how many processes we need.</p>
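<p>The message counts quoted in this post are just ring size times laps, so the comparison is easy to reproduce. The sketch below copies its figures from the sample runs above; the <code>throughput</code> helper and the row labels are names invented for this illustration, and the per-second numbers are only as precise as the whole-second timings they come from.</p>
<div class="highlight highlight-ruby"><pre># Each row: label, ring size, laps around the ring, elapsed seconds.
# All of the figures are copied from the sample runs quoted in this post.
RUNS = [
  ["Erlang processes",    30_000,  1_000, 29],
  ["fork()",                 100, 10_000, 32],
  ["Thread (standard)",      100,  1_000, 32],
  ["Thread (standard)",    1_000,      4, 30],
  ["Thread (fastthread)",    100,  1_000, 28],
  ["Thread (fastthread)",  1_000,      5, 29]
]

# One lap of a ring costs roughly one message pass per process in the ring.
def throughput(ring_size, laps, seconds)
  messages = ring_size * laps
  [messages, messages / seconds]  # integer division: whole messages per second
end

RUNS.each do |label, ring_size, laps, seconds|
  messages, per_second = throughput(ring_size, laps, seconds)
  puts "%-20s %10d messages, about %d per second" % [label, messages, per_second]
end
</pre></div>
<p>With these inputs the Erlang row works out to a little over a million messages per second, the <code>fork()</code> row to 31,250, and the threaded rows to somewhere between 133 and 3,571.</p>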
<p>Am I suggesting we all switch to Erlang? No. I've enjoyed seeing how the other side lives and I've learned a lot from getting into the functional mindset. Parts of Erlang are very impressive and concurrency is definitely one of them. It hasn't been enough to win me over from Ruby yet though. I couldn't ever see myself doing my day-to-day work without The Red Lady.</p>
<p>What I would love to see is some way to manage Erlang-like concurrency in Ruby. We could have some great fun building servers with that, I think.</p>
<p>I'll share the Erlang code here so the people that know the language better than me can provide corrections. First, here's the code that spawns processes and passes messages:</p>
<div class="highlight highlight-erlang"><pre><span class="p">-</span><span class="ni">module</span><span class="p">(</span><span class="n">mring</span><span class="p">).</span>
<span class="p">-</span><span class="ni">export</span><span class="p">([</span><span class="n">build</span><span class="o">/</span><span class="mi">1</span><span class="p">,</span> <span class="n">send_and_receive</span><span class="o">/</span><span class="mi">2</span><span class="p">,</span> <span class="n">round_and_round</span><span class="o">/</span><span class="mi">3</span><span class="p">]).</span>
<span class="nf">build</span><span class="p">(</span><span class="nv">RingSize</span><span class="p">)</span> <span class="o">-></span>
<span class="nv">ParentPid</span> <span class="o">=</span> <span class="n">self</span><span class="p">(),</span>
<span class="nb">spawn</span><span class="p">(</span><span class="k">fun</span><span class="p">()</span> <span class="o">-></span> <span class="n">build</span><span class="p">(</span><span class="nv">RingSize</span> <span class="o">-</span> <span class="mi">1</span><span class="p">,</span> <span class="nv">ParentPid</span><span class="p">)</span> <span class="k">end</span><span class="p">).</span>
<span class="nf">build</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="nv">StartPid</span><span class="p">)</span> <span class="o">-></span> <span class="n">forward</span><span class="p">(</span><span class="nv">StartPid</span><span class="p">);</span>
<span class="nf">build</span><span class="p">(</span><span class="nv">RingSize</span><span class="p">,</span> <span class="nv">StartPid</span><span class="p">)</span> <span class="o">-></span>
<span class="nv">ChildPid</span> <span class="o">=</span> <span class="nb">spawn</span><span class="p">(</span><span class="k">fun</span><span class="p">()</span> <span class="o">-></span> <span class="n">build</span><span class="p">(</span><span class="nv">RingSize</span> <span class="o">-</span> <span class="mi">1</span><span class="p">,</span> <span class="nv">StartPid</span><span class="p">)</span> <span class="k">end</span><span class="p">),</span>
<span class="n">forward</span><span class="p">(</span><span class="nv">ChildPid</span><span class="p">).</span>
<span class="nf">forward</span><span class="p">(</span><span class="nv">Pid</span><span class="p">)</span> <span class="o">-></span>
<span class="k">receive</span>
<span class="p">{</span><span class="n">message</span><span class="p">,</span> <span class="nv">Text</span><span class="p">,</span> <span class="nv">PassCount</span><span class="p">}</span> <span class="o">-></span>
<span class="nv">Pid</span> <span class="o">!</span> <span class="p">{</span><span class="n">message</span><span class="p">,</span> <span class="nv">Text</span><span class="p">,</span> <span class="nv">PassCount</span> <span class="o">+</span> <span class="mi">1</span><span class="p">},</span>
<span class="n">forward</span><span class="p">(</span><span class="nv">Pid</span><span class="p">)</span>
<span class="k">end</span><span class="p">.</span>
<span class="nf">send_and_receive</span><span class="p">(</span><span class="nv">Ring</span><span class="p">,</span> <span class="nv">Text</span><span class="p">)</span> <span class="o">-></span>
<span class="nv">Ring</span> <span class="o">!</span> <span class="p">{</span><span class="n">message</span><span class="p">,</span> <span class="nv">Text</span><span class="p">,</span> <span class="mi">0</span><span class="p">},</span>
<span class="k">receive</span> <span class="nv">Returned</span> <span class="o">-></span> <span class="nv">Returned</span> <span class="k">end</span><span class="p">.</span>
<span class="nf">round_and_round</span><span class="p">(_,</span> <span class="p">_,</span> <span class="mi">0</span><span class="p">)</span> <span class="o">-></span> <span class="n">success</span><span class="p">;</span>
<span class="nf">round_and_round</span><span class="p">(</span><span class="nv">Ring</span><span class="p">,</span> <span class="nv">ProcessCount</span><span class="p">,</span> <span class="nv">Repeat</span><span class="p">)</span> <span class="o">-></span>
<span class="nv">Check</span> <span class="o">=</span> <span class="s">"Checking the ring..."</span><span class="p">,</span>
<span class="k">case</span> <span class="n">send_and_receive</span><span class="p">(</span><span class="nv">Ring</span><span class="p">,</span> <span class="s">"Checking the ring..."</span><span class="p">)</span> <span class="k">of</span>
<span class="p">{</span><span class="n">message</span><span class="p">,</span> <span class="nv">Check</span><span class="p">,</span> <span class="nv">ProcessCount</span><span class="p">}</span> <span class="o">-></span>
<span class="n">round_and_round</span><span class="p">(</span><span class="nv">Ring</span><span class="p">,</span> <span class="nv">ProcessCount</span><span class="p">,</span> <span class="nv">Repeat</span> <span class="o">-</span> <span class="mi">1</span><span class="p">);</span>
<span class="nv">Unexpected</span> <span class="o">-></span> <span class="p">{</span><span class="n">failure</span><span class="p">,</span> <span class="nv">Unexpected</span><span class="p">}</span>
<span class="k">end</span><span class="p">.</span>
</pre></div>
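<p>For readers more at home in Ruby, the same ring can be approximated with one <code>Thread</code> and one <code>Queue</code> per process. This is only a sketch of the idea, not one of the Ruby Quiz solutions or the code behind the timings above: <code>build_ring</code>, the <code>home</code> queue, and the array-shaped messages are all assumptions made for the illustration, and it expects a ring size of at least one.</p>
<div class="highlight highlight-ruby"><pre>require "thread"

# Build a ring of `size` threads. Each thread owns an inbox and forwards
# whatever arrives to the next inbox, bumping the pass count on the way.
# The last thread in the ring reports back to the `home` queue.
def build_ring(size, home)
  inboxes = Array.new(size) { Queue.new }
  inboxes.each_with_index do |inbox, i|
    target = inboxes[i + 1] || home
    Thread.new do
      loop do
        text, pass_count = inbox.pop
        target.push([text, pass_count + 1])
      end
    end
  end
  inboxes.first  # the entry point, playing the role of Ring in the Erlang code
end

# Send a message around the ring `repeat` times, checking each time that
# it was passed exactly once by every process.
def round_and_round(ring, home, process_count, repeat)
  check = "Checking the ring..."
  repeat.times do
    ring.push([check, 0])
    returned = home.pop
    return [:failure, returned] unless returned == [check, process_count]
  end
  :success
end

home = Queue.new
ring = build_ring(100, home)
puts "Done: #{round_and_round(ring, home, 100, 10)}"
</pre></div>
<p>The forwarding threads loop forever, like the Erlang <code>forward/1</code> function, and are simply discarded when the main thread exits.</p>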
<p>Next, we have a little helper I wrote to time things. I'm pretty confident there must be a better way to do this, but all my attempts to find it failed. Help me Erlang Jedi:</p>
<div class="highlight highlight-erlang"><pre><span class="p">-</span><span class="ni">module</span><span class="p">(</span><span class="n">stopwatch</span><span class="p">).</span>
<span class="p">-</span><span class="ni">export</span><span class="p">([</span><span class="n">time_this</span><span class="o">/</span><span class="mi">1</span><span class="p">,</span> <span class="n">time_and_print</span><span class="o">/</span><span class="mi">1</span><span class="p">]).</span>
<span class="nf">time_this</span><span class="p">(</span><span class="nv">Fun</span><span class="p">)</span> <span class="o">-></span>
<span class="p">{</span><span class="nv">StartMega</span><span class="p">,</span> <span class="nv">StartSec</span><span class="p">,</span> <span class="nv">StartMicro</span><span class="p">}</span> <span class="o">=</span> <span class="n">now</span><span class="p">(),</span>
<span class="nv">Fun</span><span class="p">(),</span>
<span class="p">{</span><span class="nv">EndMega</span><span class="p">,</span> <span class="nv">EndSec</span><span class="p">,</span> <span class="nv">EndMicro</span><span class="p">}</span> <span class="o">=</span> <span class="n">now</span><span class="p">(),</span>
<span class="p">(</span><span class="nv">EndMega</span> <span class="o">*</span> <span class="mi">1000000</span> <span class="o">+</span> <span class="nv">EndSec</span> <span class="o">+</span> <span class="nv">EndMicro</span> <span class="ow">div</span> <span class="mi">1000000</span><span class="p">)</span> <span class="o">-</span>
<span class="p">(</span><span class="nv">StartMega</span> <span class="o">*</span> <span class="mi">1000000</span> <span class="o">+</span> <span class="nv">StartSec</span> <span class="o">+</span> <span class="nv">StartMicro</span> <span class="ow">div</span> <span class="mi">1000000</span><span class="p">).</span>
<span class="nf">time_and_print</span><span class="p">(</span><span class="nv">Fun</span><span class="p">)</span> <span class="o">-></span>
<span class="nn">io</span><span class="p">:</span><span class="nf">format</span><span class="p">(</span><span class="s">"Timer started.</span><span class="si">~n</span><span class="s">"</span><span class="p">),</span>
<span class="nv">Time</span> <span class="o">=</span> <span class="n">time_this</span><span class="p">(</span><span class="nv">Fun</span><span class="p">),</span>
<span class="nn">io</span><span class="p">:</span><span class="nf">format</span><span class="p">(</span><span class="s">"Time in seconds: </span><span class="si">~p~n</span><span class="s">"</span><span class="p">,</span> <span class="p">[</span><span class="nv">Time</span><span class="p">]).</span>
</pre></div>
<p>Finally we have the application code that glues these modules together:</p>
<div class="highlight highlight-erlang"><pre><span class="p">-</span><span class="ni">module</span><span class="p">(</span><span class="n">solution</span><span class="p">).</span>
<span class="p">-</span><span class="ni">export</span><span class="p">([</span><span class="n">start</span><span class="o">/</span><span class="mi">1</span><span class="p">]).</span>
<span class="nf">start</span><span class="p">([</span><span class="nv">ProcessesArg</span><span class="p">,</span> <span class="nv">CyclesArg</span><span class="p">])</span> <span class="o">-></span>
<span class="nv">Processes</span> <span class="o">=</span> <span class="nb">list_to_integer</span><span class="p">(</span><span class="nb">atom_to_list</span><span class="p">(</span><span class="nv">ProcessesArg</span><span class="p">)),</span>
<span class="nv">Cycles</span> <span class="o">=</span> <span class="nb">list_to_integer</span><span class="p">(</span><span class="nb">atom_to_list</span><span class="p">(</span><span class="nv">CyclesArg</span><span class="p">)),</span>
<span class="nn">io</span><span class="p">:</span><span class="nf">format</span><span class="p">(</span> <span class="s">"Creating </span><span class="si">~p</span><span class="s"> processes (</span><span class="si">~p</span><span class="s"> allowed)...</span><span class="si">~n</span><span class="s">"</span><span class="p">,</span>
<span class="p">[</span><span class="nv">Processes</span><span class="p">,</span> <span class="nn">erlang</span><span class="p">:</span><span class="nb">system_info</span><span class="p">(</span><span class="n">process_limit</span><span class="p">)]),</span>
<span class="nv">Ring</span> <span class="o">=</span> <span class="nn">mring</span><span class="p">:</span><span class="nf">build</span><span class="p">(</span><span class="nv">Processes</span><span class="p">),</span>
<span class="nn">io</span><span class="p">:</span><span class="nf">format</span><span class="p">(</span><span class="s">"Done.</span><span class="si">~n</span><span class="s">"</span><span class="p">),</span>
<span class="nn">stopwatch</span><span class="p">:</span><span class="nf">time_and_print</span><span class="p">(</span>
<span class="k">fun</span><span class="p">()</span> <span class="o">-></span>
<span class="nn">io</span><span class="p">:</span><span class="nf">format</span><span class="p">(</span><span class="s">"Sending a message around the ring </span><span class="si">~p</span><span class="s"> times...</span><span class="si">~n</span><span class="s">"</span><span class="p">,</span> <span class="p">[</span><span class="nv">Cycles</span><span class="p">]),</span>
<span class="nv">Result</span> <span class="o">=</span> <span class="nn">mring</span><span class="p">:</span><span class="nf">round_and_round</span><span class="p">(</span><span class="nv">Ring</span><span class="p">,</span> <span class="nv">Processes</span><span class="p">,</span> <span class="nv">Cycles</span><span class="p">),</span>
<span class="nn">io</span><span class="p">:</span><span class="nf">format</span><span class="p">(</span><span class="s">"Done: </span><span class="si">~p~n</span><span class="s">"</span><span class="p">,</span> <span class="p">[</span><span class="nv">Result</span><span class="p">])</span>
<span class="k">end</span>
<span class="p">),</span>
<span class="nn">init</span><span class="p">:</span><span class="nf">stop</span><span class="p">().</span>
</pre></div>
<p>You will see the Ruby solutions in <a href="http://www.rubyquiz.com/quiz135.html">this week's Ruby Quiz</a>.</p>
James Edward Gray II
The Ruby VM: Episode III
tag:graysoftinc.com,2007-04-27:/posts/33
2014-04-04T21:03:39Z
In this interview, we get the scoop on the past and future of Ruby's threading support.
<p><strong>Let's talk a little about threading, since that's a significant change in the new VM. First, can you please explain the old threading model used in Ruby 1.8 and also the new threading model now used in Ruby 1.9?</strong></p>
<dl>
<dt>
<strong>Matz</strong>:</dt>
<dd>
<p>
Old threading model is the green thread, to provide universal
threading on every platform that Ruby runs. I think it was reasonable
decision 14 years ago, when I started developing Ruby. Time has gone by and
the situation has changed. pthread or similar threading libraries are now
available on almost every platform. Even on old platforms, pth
library (a thread library which implements pthread API using setjmp
etc.) can provide green thread implementation.
</p>
<p>
Koichi decided to use native thread for YARV. I honor his decision.
Only regret I have is we couldn't have continuation support that used
our green thread internal structure. Koichi once told me it's not
impossible to implement continuation on YARV (with some restriction),
so I expect to have it again in the future. Although it certainly has
lower priority in 1.9 implementation.
</p>
</dd>
<dt>
<strong>ko1</strong>:</dt>
<dd>
<p>
Matz explained old one, so I show you YARV's thread model.
</p>
<p>
As you know, YARV support native thread. It means that you can run each
Ruby thread on each native thread concurrently.
</p>
<p>
It doesn't mean that every Ruby thread runs in parallel. YARV has
global VM lock (global interpreter lock) which only one running Ruby
thread has. This decision maybe makes us happy because we can run most
of the extensions written in C without any modifications.
</p>
</dd>
</dl><p><strong>Why was this change made? What's wrong with green threads?</strong></p>
<dl>
<dt>
<strong>Matz</strong>:</dt>
<dd>
<p>
Because green threads does not work well with libraries using native
threads. For example, Ruby/Tk has made huge effort to live along with
pthread.
</p>
</dd>
<dt>
<strong>ko1</strong>:</dt>
<dd>
<p>
Ruby's green (userlevel) thread implementation was too naive to run
fast. All machine stacks are copied when thread context switches. And
more important point is it's not easy to re-implement green thread on
YARV :)
</p>
</dd>
</dl><p><strong>What are the downsides to the native threads approach?</strong></p>
<dl>
<dt>
<strong>Matz</strong>:</dt>
<dd>
<p>
It is pretty difficult to implement continuation. Besides that, even
with native thread approach, no real concurrency can be achieved due
to the global interpreter lock. Koichi is going to address this issue
by Multi-VM approach in the (near) future.
</p>
</dd>
<dt>
<strong>ko1</strong>:</dt>
<dd>
<p>
Yes, it has several problems. First is Performance problem (as you
know, I love to discuss about performance). To create a native thread is
too pricey. So you may use thread pool or so. And current trunk (YARV)
is not tuned on native thread, so I believe there are some unknown problems
around threads.
</p>
<p>
Second problem is portability. Even if your environment has pthread library,
there are some differences from other pthread systems in detail.
</p>
<p>
Third problem is absence of callcc (which is implemented with green
thread scheme) ... for some people :)
</p>
<p>
Programming on native thread has own difficulty. For example, on MacOS
X, exec() doesn't work (cause exception) if other threads are running
(one of portability problem). If we find critical problems on native
thread, I will make green thread version on trunk (YARV).
</p>
</dd>
</dl><p><strong>Are there plans to support other threading models in the future?</strong></p>
<dl>
<dt>
<strong>Matz</strong>:</dt>
<dd>
<p>
Other threading model, no. Win32 threads and pthreads are enough
burden for us to support. There might be other features to support
parallelism in the future, for example light-weight process a la
Erlang.
</p>
<p>
Koichi may have other idea(s) about supporting concurrency, such as
Multi-VM since he is the expert on it.
</p>
</dd>
<dt>
<strong>ko1</strong>:</dt>
<dd>
<p>
Parallel computing with Ruby is one of my main concern. There are some
way to do it, but running Ruby threads in parallel (without Giant VM
Lock) on a process is too difficult to support current C extension
libraries because of their synchronization problems.
</p>
<p>
As matz say, if we have multiple VM instance on a process, these VMs can
be run in parallel. I'll work on that theme in the near future (as my
research topic).
</p>
<p>
BTW, I wrote on last question, if there are many many problems on native
threads, I'll implement green thread. As you know, it has some
benefit against native thread (lightweight thread creation, etc). It
will be lovely hack (FYI. my graduation thesis is to implement userlevel
thread library on our specific SMT CPU).
</p>
<p>
... Does anyone have interest to implement it?
</p>
</dd>
</dl>James Edward Gray II