Perl Threading

Written by Nikos Vaggalis

Wednesday, 06 April 2011


Choosing a language

As it was my own project, I was free to choose between two languages: C# or Perl. The winner was Perl, hands down.

Although C# would have given me rich GUI features, crisp graphics, easy threading and the .NET libraries, there was a fundamental issue: interoperating with the unmanaged C++ dll from C# was not easy before C# 4.0. You had to go through P/Invoke, map the corresponding C++ structures to C# ones, cross the boundary of the managed environment and use unsafe pointers, while the GC would cause more harm than good because it can move objects in memory (see the fixed statement for pinning pointers in unsafe code). One of the main reasons the dynamic type was introduced in C# 4.0 was easier interop with unmanaged code, COM libraries in particular.

With Perl, interoperating with the C++ dll was much easier because of Perl's dynamic typing. Perl data can be mapped to the structures of the dll more easily because the 'pack' function serializes the values into the structure's binary layout, which you can then move back and forth between the dll and the Perl client application. And of course I had the power of CPAN at my disposal.
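
As a minimal sketch of this idea, assume a hypothetical C structure made of an unsigned 32-bit flags field followed by a fixed 16-byte name buffer. The layout is invented for illustration and is not taken from the unrar dll. pack serializes Perl values into a byte buffer with that layout and unpack reads the fields back:

```perl
use strict;
use warnings;

# Hypothetical C layout, for illustration only:
#   struct Header { unsigned int flags; char name[16]; };
# 'L'   = unsigned 32-bit integer in native byte order
# 'Z16' = null-padded string occupying exactly 16 bytes
my $template = 'L Z16';

# Serialize Perl values into a raw buffer shaped like the struct
my $buffer = pack($template, 3, 'archive.rar');

# ... the buffer would be handed to the dll at this point ...

# Read the fields back out of the buffer
my ($flags, $name) = unpack($template, $buffer);
printf "%d bytes, flags=%d, name=%s\n", length($buffer), $flags, $name;
```

A real structure may also need padding bytes (the 'x' template code) to match the compiler's alignment rules, so the template has to be checked against the actual header file.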

Although Perl is a dynamic language that is heavily criticized for its performance because it late-binds everything, in this case I believed it would perform better than a static language like C#. As I am already working in an unmanaged environment when I access the C++ dll from within Perl, I don't have to marshal anything or cross the boundary from a managed environment to an unmanaged one.

Another decision was which programming approach to use: OOP or procedural? (I use the term 'procedural' rather than 'functional', since nowadays 'functional' is taken to mean functional programming as in F#.)

OOP in this case would involve more complexity and would be more likely to hinder than to help. Not everything in this application can be wrapped as an object, and as I needed a more direct, low-level approach I chose the procedural one.

The Tk toolkit

Once Perl was chosen as the language, the choice of toolkit for GUI development was an obvious one: the Tk toolkit, whose integration with Perl and ease of use allow for rapid GUI application development, although its graphics look archaic.

Of course GUIs have thread affinity, which is a general property of GUI toolkits and not just of Tk. On top of that, the Tk module is not thread-safe, which means that, for example, you can't start a thread from within an event callback (the "Free to wrong pool ....Perl/site/lib/Tk/Widget.pm during global destruction." error is all too common).

Single threaded - the flaw

Thus the first GUI version was single-threaded, which was all right but had one fundamental flaw: when the module was engaged in a long-running operation such as extracting a big file, the Message Loop of the main window was starved, so events could not be processed and the GUI would freeze and become unresponsive, resulting in a blank screen:

(Figure: the unresponsive GUI)

The workaround to this problem was to refresh the GUI using the Tk update() function (the equivalent of DoEvents() in VB.NET), forcing the Message Loop to handle any queued events. update() had to be called in a loop from within the module's subroutine that does the file processing, creating the impression that the GUI is responsive.

This approach is workable but can hide dangers if used improperly as described in 'Is DoEvents Evil?' on the Coding Horror website.

Avoiding those dangers was the main reason (apart from the HCI principles) that all buttons except 'Pause' and 'Resume' are disabled while file processing is in effect.

An additional issue is that update() does not play nicely with the other threads running on the OS. I've already mentioned that update() had to be called in a loop inside the module. I could have hardcoded the update() call inside the module, but that would defeat its general-purpose intent: the module/library has to stay independent because it can be called from any client application, GUI-driven or not.

There is one very effective programming trick that can help here, and it goes back to the C programming days: make the library call your own user-defined function. This is done with a so-called 'callback', which is a function pointer (in Perl, a code reference) to a user-defined function. The reference is passed to the module's subroutine as a parameter. In this case the callback is a wrapper around the Tk update() function and is called at intervals from within the loop that does the file processing. Thus we keep the library independent and overcome the GUI 'freezing', at least to a degree.

(gui.pl)

my $callback=sub {$gui::top->update()};

(Unrar.pm)
while (($RAR_functions{RARReadHeader}->
  Call($handle,$RARHeaderData_struct )) 
                                == 0 ) {
   $blockencryptedflag="yes";
   $callback->(@_) if defined($callback);
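
The same pattern can be sketched without Tk or the dll. In this minimal example process_items is a hypothetical stand-in for the module's processing subroutine, not the actual Unrar.pm code. It accepts an optional code reference and invokes it once per iteration, so a GUI client could pass a wrapper around update() while a command-line client passes nothing:

```perl
use strict;
use warnings;

# Hypothetical library routine: does some work per item and, if the
# caller supplied a callback, invokes it once per iteration.
sub process_items {
    my ($items, $callback) = @_;
    my $processed = 0;
    for my $item (@$items) {
        $processed++;                            # stand-in for the real work
        $callback->($item) if defined $callback;
    }
    return $processed;
}

# A GUI client would pass something like sub { $top->update() };
# here the callback just counts how often it was called.
my $ticks = 0;
my $done  = process_items([qw(a.rar b.rar c.rar)], sub { $ticks++ });

# A non-GUI client simply omits the callback.
my $done_quietly = process_items([qw(a.rar b.rar)]);

print "with callback: $done items, $ticks ticks; without: $done_quietly items\n";
```

The library never needs to know what the callback does, which is what keeps it usable from both kinds of client.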

An additional advantage of single threading and the tight coupling of the processing and GUI code is that pausing the module's operation was very easy. When the user presses the 'Pause' button, waitVariable is called, which pauses everything except the GUI, which can still process events.

(gui.pl)
sub pause {
 $cont=0;
 $gui::pause->configure(-relief=>"sunken");

  while (1) {
     print "\n...PAUSED...\n";
     # blocks here, still servicing GUI events, until $cont is modified
     $top->waitVariable(\$cont);
     last;
  }
}

Multi-threaded

Although this improved the situation, it was not a complete remedy. I needed to use threads to separate concerns and also to decouple the GUI code from the actual processing code - the GUI had to be able to do its job at all times while a worker thread had to be doing all the file processing.

Note that one worker thread suffices in this case because the work is bound by the hard drive seeking files, and we wouldn't want to stress the drive head by moving it relentlessly around, which is what would happen if we kept spawning threads that each did file processing on their own. Thus we spawn just one worker thread and reuse it.

Of course the boss (GUI thread) and the worker thread had to be coordinated.

In view of the thread safety issues prevailing in Tk, the initial approach was to use user-defined Windows WM_ messages. The worker thread would send a custom Windows message to the GUI thread, which would retrieve it from its Message Loop and fire an event in response. In effect this would resemble a raw form of the .NET BackgroundWorker component, where a callback runs on a worker thread and upon completion fires an event on the GUI thread.

This did not work because Tk uses its own window manager, not a native Win32 one, and because it does not provide low-level access to the window procedure of the Tk window, which would have allowed mapping the custom message to an event handler (much like a C++ MFC MESSAGE_MAP). This could be bypassed by implementing a custom low-level hook to intercept all messages sent to the Tk window and handling them myself, but the complexity made the effort not worthwhile.

Instead, a much cleaner way was chosen in the form of two thread-safe queues; the queues are responsible for coordinating the threads and carrying the messages between them.

(Unrar_Extract_and_Recover.pl)
$boss_to_worker_queue=Thread::Queue->new;
$worker_to_boss_queue=Thread::Queue->new;

The boss/GUI thread passes the information needed to start processing to the worker thread by using the $boss_to_worker_queue:

(gui.pl)
my @messages=($input_dir_path,
              $output_dir_path,
              $radio,$delete_files_var);
$main::boss_to_worker_queue->enqueue(\@messages);

while the worker thread processes the file and communicates the result of the processing back to the boss thread through the $worker_to_boss_queue:

(Unrar_Extract_and_Recover.pl)
my @messages=("update",undef);
$worker_to_boss_queue->
                    enqueue(\@messages);
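
The round trip between the two threads can be sketched as a self-contained script. This is a minimal illustration and not the application's actual code: the job contents, the reply text and the use of undef as a shutdown signal are assumptions made for the example. It shows one reusable worker that blocks on its inbox and answers through the second queue:

```perl
use strict;
use warnings;
use threads;
use Thread::Queue;

my $boss_to_worker_queue = Thread::Queue->new;
my $worker_to_boss_queue = Thread::Queue->new;

# Single reusable worker: blocks on its inbox, handles one job at a
# time, and exits when it receives undef (assumed shutdown signal).
my $worker = threads->create(sub {
    while (defined(my $job = $boss_to_worker_queue->dequeue)) {
        my ($input_dir, $output_dir) = @$job;
        # ... the real file processing would happen here ...
        $worker_to_boss_queue->enqueue(["end", "processed $input_dir"]);
    }
});

# Boss side: submit two jobs to the same worker, then ask it to stop.
$boss_to_worker_queue->enqueue(["in1", "out1"]);
$boss_to_worker_queue->enqueue(["in2", "out2"]);
$boss_to_worker_queue->enqueue(undef);
$worker->join;

# Collect the replies the worker left in the second queue.
my @replies;
while (defined(my $reply = $worker_to_boss_queue->dequeue_nb)) {
    my ($message, $detail) = @$reply;
    push @replies, "$message: $detail";
}
print "$_\n" for @replies;
```

Because each queue has exactly one reader, neither thread ever touches the other's data directly; everything crosses the boundary as a queued message.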

The boss thread then reads the message and acts upon it:

(gui.pl)
if (my $queue_message=$main::worker_to_boss_queue->dequeue_nb) {
 my ($message,$no)=@$queue_message;

 # given/when requires "use feature 'switch';" (Perl 5.10+)
 given ($message) {
  when ("allfiles") {
    # reset the progress and set the bar's upper limit to $no
    $gui::percent_done=0;
    $gui::progress->configure(-to => $no);
  }

  when ("update") {
    $gui::percent_done += 1;
  }

  when ("end") {
    $gui::percent_done += 1;
    &enable_buttons;
  }
 }
}

But how does the boss thread know when there is a message inside the queue waiting to be read while not blocking at the same time?

We certainly cannot do the checking in a busy loop since that would make the GUI freeze. For that reason polling is normally employed: the boss thread checks the queue at a given interval. This is done by using an alarm (a repeating timer):

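my $alarm = $mw->repeat( 200 => sub {
  # drain without blocking; process_message() is a hypothetical dispatcher
  while (defined(my $m=$main::worker_to_boss_queue->dequeue_nb)) { process_message($m) }
});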
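
What such a timer callback might do can be sketched without Tk. In this minimal example poll_queue is a hypothetical name for the callback body, and it is called directly instead of from a timer so the script runs without a GUI. It drains whatever is queued at that moment with dequeue_nb and returns immediately, so it never blocks the caller:

```perl
use strict;
use warnings;
use threads;
use Thread::Queue;

my $worker_to_boss_queue = Thread::Queue->new;
my $percent_done = 0;

# Hypothetical body of the timer callback: handle every message that
# is queued right now and return at once, never waiting for more.
sub poll_queue {
    my $handled = 0;
    while (defined(my $queue_message = $worker_to_boss_queue->dequeue_nb)) {
        my ($message, $no) = @$queue_message;
        $percent_done++ if $message eq 'update' or $message eq 'end';
        $handled++;
    }
    return $handled;
}

my $empty_tick = poll_queue();    # nothing queued yet: returns at once

# Simulate a worker reporting progress between two timer ticks
threads->create(sub {
    $worker_to_boss_queue->enqueue(["update", undef]) for 1 .. 3;
    $worker_to_boss_queue->enqueue(["end", undef]);
})->join;

my $busy_tick = poll_queue();     # drains all four messages
print "first tick: $empty_tick, second tick: $busy_tick, progress: $percent_done\n";
```

In a Tk program the same subroutine could be registered with something like $mw->repeat(200 => \&poll_queue), so the GUI thread only spends time on the queue when the timer fires.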

But I have opted for another approach.


Last Updated ( Monday, 13 March 2017 )