Cheetah C++ Interface Summary - Story so far
--------------------------------------------
February 14, 2000


Cheetah repository
------------------

/home/cheetah/repository/cheetah


Directory structure
-------------------

-rw-r--r--    1 bfh      pooma        823 Nov  4 10:19 INSTALL
-rw-r--r--    1 bfh      pooma       6335 Nov  4 14:25 README
-rw-r--r--    1 bfh      pooma       2435 Nov  9 11:21 README.arg_checking
-rw-r--r--    1 bfh      pooma       3847 Nov  9 11:21 README.polymorphic
-rw-r--r--    1 bfh      pooma        739 Feb 12 19:11 config.mak
-rwxr-xr-x    1 bfh      pooma      12424 Feb 12 19:00 configure*
drwxr-sr-x    3 bfh      pooma       4096 Feb 12 19:16 controllers/
drwxr-sr-x    3 bfh      pooma        153 Feb 14 09:36 html/
drwxr-sr-x    3 bfh      pooma         64 Feb 12 19:16 lib/
-rw-r--r--    1 bfh      pooma        681 Feb 12 19:11 makefile
drwxr-sr-x    3 bfh      pooma       4096 Feb 12 19:16 matching/
drwxr-sr-x    8 bfh      pooma       4096 Feb 12 19:16 mm/
drwxr-sr-x    3 bfh      pooma         94 Feb 12 19:10 mpi/
drwxr-sr-x    3 bfh      pooma        143 Feb 12 19:16 shmem/
drwxr-sr-x    3 bfh      pooma       4096 Feb 12 19:17 tests/
drwxr-sr-x    3 bfh      pooma       4096 Feb 12 19:16 cheetah/
drwxr-sr-x    4 bfh      pooma       4096 Feb 12 19:16 util/


util/ ............ simple utilities used by the rest
cheetah/ ......... C and C++ routines implementing ainvoke, register, etc.
shmem/ ........... C routines implementing shared-memory version of ainvoke
mpi/ ............. C routines implementing MPI version of ainvoke
mm/ .............. External package 'MM' used by shmem/ routines
controllers/ ..... C++ "Controller" class - wrapper around shmem, mpi
matching/ ........ C++ "MatchingHandler" class - queue facility for ainvokes


In POOMA, we will (eventually) use just the Controller and
MatchingHandler classes.  These are a C++ interface to the other
routines in the cheetah package.  The only exception will be the need
to overload cheetah_pack/cheetah_unpack routines for POOMA-related types.

Controller and MatchingHandler are defined within the namespace
'Cheetah'.  So are most of the rest of the C++ classes used in this
package.


Building
--------

Check out the repository, then run "configure --help".  This will list 
the configuration options.  Main selection options are whether to
include or exclude shmem and mpi code.  Right now there is an option
to build a static library, but this will not work at present (see
below for why).

After configure, run 'make'.  If you need to use shmem/, you must
first build the 'mm' package: cd to mm/ and run 'make; make install'.
This will "install" the mm files into bin/, lib/, include/, etc.
directories within the mm/ subdir.

When make is complete, you will have the files:
	lib/libcheetah.so
	lib/Makefile.cheetah

Look at Makefile.cheetah to see what makefile variables it defines.  You 
can include this stub in another package to get -I, -L, etc. compiler
settings.


Controller
----------

controllers/Controller.h defines the Controller class.
Controllers provide the C++ interface to the communication
operations in cheetah, like register, ainvoke, put, get, etc.  This
class is an alternative to calling the cheetah_* routines directly,
like cheetah_register, cheetah_ainvoke, etc.  Those routines are one
interface to the functionality; Controller is another.

Controller uses a Bridge pattern - there is a class
ControllerImpl that defines a virtual interface to the
communication operations, and Controller contains a ref-counted
pointer to an implementation instance.  The important public methods
in Controller are:

  bool isValid();

  int ncontexts() const;

  int mycontext() const;

  int registerHandler(Handler_t handler);

  void ainvoke(int context, int tag, void *buffer, int len,
	       int *local_bell);

  void put(int context, void *remote, void *local, int len,
	   int *local_bell, int *remote_bell);

  void get(int context, void *remote, void *local, int len,
	   int *local_bell, int *remote_bell);

  void poll();

  void wait(volatile int *bell, int value);

  void barrier();
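
The Bridge arrangement described above can be sketched roughly as
follows.  This is only an illustration, not the actual Cheetah source:
the names SketchControllerImpl, SketchSerialImpl and SketchController
are hypothetical, only two of the methods are shown, and
std::shared_ptr stands in for whatever reference-counting scheme the
real Controller uses.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for the virtual implementation interface.
class SketchControllerImpl {
public:
  virtual ~SketchControllerImpl() {}
  virtual int ncontexts() const = 0;
  virtual int mycontext() const = 0;
};

// Hypothetical implementation that looks like a single context.
class SketchSerialImpl : public SketchControllerImpl {
public:
  int ncontexts() const override { return 1; }
  int mycontext() const override { return 0; }
};

// The handle class.  Copying it shares the implementation instance.
class SketchController {
public:
  SketchController() {}
  explicit SketchController(std::shared_ptr<SketchControllerImpl> impl)
    : impl_m(impl) {}

  bool isValid() const { return static_cast<bool>(impl_m); }
  int ncontexts() const { return impl_m->ncontexts(); }
  int mycontext() const { return impl_m->mycontext(); }
  long useCount() const { return impl_m.use_count(); }

private:
  std::shared_ptr<SketchControllerImpl> impl_m;
};

// Self-check: two handles end up sharing one implementation.
inline bool sketchBridgeDemo() {
  SketchController empty;
  SketchController a(std::make_shared<SketchSerialImpl>());
  SketchController b = a;   // copies the pointer, not the implementation
  assert(!empty.isValid());
  return a.isValid() && b.ncontexts() == 1 && b.mycontext() == 0 &&
         a.useCount() == 2;
}
```

The point of the split is that user code only ever holds the handle
class, so implementations can be swapped without touching callers.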

Handler_t is a typedef for the very basic kind of routine that you can 
register with Controller.  It is:

  typedef void (*Handler_t)(int who, int tag, void* buf,int len);
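
As an illustration, a handler of this shape might look like the sketch
below.  SketchHandlerTable is a hypothetical stand-in for the
controller's handler table, and it assumes that the tag is simply the
index in registration order; the real registerHandler may assign tags
differently.  The invoke() method runs the handler locally, where a
real ainvoke would run it on another context.

```cpp
#include <cassert>
#include <cstring>
#include <vector>

// Same shape as the Handler_t typedef above.
typedef void (*Handler_t)(int who, int tag, void *buf, int len);

// Hypothetical stand-in for the controller's handler table.
// Assumption: the tag is the index in registration order.
class SketchHandlerTable {
public:
  int registerHandler(Handler_t handler) {
    handlers_m.push_back(handler);
    return static_cast<int>(handlers_m.size()) - 1;
  }

  // Runs the handler locally, as a remote ainvoke eventually would.
  void invoke(int who, int tag, void *buf, int len) const {
    assert(tag >= 0 && tag < static_cast<int>(handlers_m.size()));
    handlers_m[tag](who, tag, buf, len);
  }

private:
  std::vector<Handler_t> handlers_m;
};

static int sketchLastValue = 0;
static int sketchLastSender = -1;

// A handler that expects the buffer to hold exactly one int.
void sketchIntHandler(int who, int /*tag*/, void *buf, int len) {
  if (len == static_cast<int>(sizeof(int))) {
    std::memcpy(&sketchLastValue, buf, sizeof(int));
    sketchLastSender = who;
  }
}

// Self-check: register one handler, then invoke it with an int payload.
inline bool sketchHandlerDemo() {
  SketchHandlerTable table;
  int tag = table.registerHandler(sketchIntHandler);
  int payload = 42;
  table.invoke(3, tag, &payload, static_cast<int>(sizeof(payload)));
  return tag == 0 && sketchLastValue == 42 && sketchLastSender == 3;
}
```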


The available controller implementations, so far, are (these are
defined in files in the controllers/ subdir):

MPIController .......... uses MPI and routines in cheetah/mpi/ dir
ShmemController ........ uses MM and routines in cheetah/shmem & mm dirs
SerialController ....... does not use any package, makes system look
                         like a single context.


Controller provides a factory mechanism to create specific
implementation object instances when you construct a Controller.
The current design requires you to provide an argc,argv pair to the
Controller constructor, and to give a command-line option that
selects which instance to create.  The options are currently defined as:

-shmem -np <N>
-mpi  (does not need -np since you select in mpirun args)
-serial

Because of this, you should never have to include any
implementation-specific files.  You should only need to use:

#include "CheetahController.h"


The Controller will parse the command-line options, look for one
of these, and construct the right implementation instance.  If an
instance of that implementation already exists, it will just use that
(it is ref-counted).  So you can have any number of Controller
objects around.  Eventually I think we should have options to select
whether the implementations should initialize their library or not, or 
just assume it is initialized already.  We should also have
constructors to select what library to use that does not depend on
command-line options.

One issue with Controllers is how the implementation factory
works.  Each CheetahControllerImpl subclass currently provides a factory 
function, and registers that function with Controller via a
static object, like this:


  static CheetahControllerImpl*
  mpiControllerFactory(int &argc, char ** &argv)
  {
    return new MPIController(argc,argv);
  }

  MPIController::RegisterSubclass
  MPIController::dummy_s("-mpi",mpiControllerFactory);


However, this currently seems to work only if you create a shared
library.  With a static library, these static constructors do not all
get run, so the factory list never gets set up, and you get errors
when you try to create your Controller.  The options that I see here:
	1. Restrict this package to only create shared libs.
	2. Have some kind of initialize() routine for controllers/ in
general that will explicitly register the different controller
implementation factory functions.
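
Option 2 could look something like the sketch below.  All names here
(SketchImpl, sketchFactories, sketchInitializeControllers,
sketchCreate) are hypothetical and are not part of the package.  The
idea is that the application calls one explicit routine that fills in
the factory table, so nothing depends on static objects whose object
files a linker may leave out of the program when pulling from a static
archive.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for the implementation base class.
struct SketchImpl {
  virtual ~SketchImpl() {}
  virtual const char *name() const = 0;
};
struct SketchMpiImpl : SketchImpl {
  const char *name() const override { return "mpi"; }
};
struct SketchSerialOnlyImpl : SketchImpl {
  const char *name() const override { return "serial"; }
};

typedef SketchImpl *(*SketchFactory_t)(int &argc, char **&argv);

// Function-local static: the table is built on first use.
inline std::map<std::string, SketchFactory_t> &sketchFactories() {
  static std::map<std::string, SketchFactory_t> table;
  return table;
}

inline SketchImpl *sketchMakeMpi(int &, char **&) {
  return new SketchMpiImpl;
}
inline SketchImpl *sketchMakeSerial(int &, char **&) {
  return new SketchSerialOnlyImpl;
}

// The explicit initialize routine: registers every known factory.
inline void sketchInitializeControllers() {
  sketchFactories()["-mpi"] = sketchMakeMpi;
  sketchFactories()["-serial"] = sketchMakeSerial;
}

// Scan argv for a registered option; returns 0 if none is found.
inline SketchImpl *sketchCreate(int &argc, char **&argv) {
  for (int i = 1; i < argc; ++i) {
    std::map<std::string, SketchFactory_t>::const_iterator it =
      sketchFactories().find(argv[i]);
    if (it != sketchFactories().end())
      return it->second(argc, argv);
  }
  return 0;
}

// Self-check: "-serial" on the command line selects the serial impl.
inline bool sketchFactoryDemo() {
  sketchInitializeControllers();
  char prog[] = "prog";
  char opt[] = "-serial";
  char *args[] = { prog, opt, 0 };
  int argc = 2;
  char **argv = args;
  SketchImpl *impl = sketchCreate(argc, argv);
  assert(impl != 0);
  bool ok = std::string(impl->name()) == "serial";
  delete impl;
  return ok;
}
```

The cost of this approach is that the initialize routine has to know
about every implementation, which the static-object scheme avoided.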

In POOMA, we will have one Controller object that will be
initialized in our own initialize routine, and this controller will be 
used by the rest of POOMA for inter-context communication.

 
MatchingHandler
---------------

With Controller, you can register a handler function that can be
run via an "ainvoke" call from another context.  Each handler has a
unique tag value, and all the contexts must agree on the tag used for
these handler functions (usually by registering them in the same order 
on all contexts).  Besides the sender and tag, the "arguments" to the
handler function are always the same: a raw buffer, and an integer
describing the buffer length.

There are two limitations here that we need to address for POOMA:

1. We want to be able to pass more than one argument across the wire,
and have handler functions that take these multiple arguments of
different type.

2. We need to deal with the cases where a given context has its
handler functions invoked from other contexts in an unexpected order,
or before it is ready to have its handler run.

MatchingHandler is the class that helps to fix these limitations.  It
provides a queue of different "actions" to take when a message
arrives, and a queue of messages that have arrived but cannot yet be
processed.


You construct a MatchingHandler with the controller it should use.  The
different contexts should have the same number of MatchingHandler
objects and create them in the same order.  Each MatchingHandler
object keeps its own queue of actions and messages.  Then, you use
the MatchingHandler's 'send' methods to ainvoke handlers on
the destination context, and its 'receive' methods to set up handlers
that take arguments of various types.

The important interface for MatchingHandler is:

  Controller controller() const;
  void controller(Controller p);

  int actions(int fromContext);
  int messages(int fromContext);

  //
  // send methods
  //

  void send(int toContext, int matchingTag);

  template<class T1>
  void send(int toContext, int matchingTag, const T1 &x1);

  template<class T1, class T2>
  void send(int toContext, int matchingTag, const T1 &x1, const T2 &x2);

  template<class T1, class T2, class T3>
  void send(int toContext, int matchingTag, const T1 &x1, const T2 &x2,
	    const T3 &x3);

  template<class T1, class T2, class T3, class T4>
  void send(int toContext, int matchingTag, const T1 &x1, const T2 &x2,
	    const T3 &x3, const T4 &x4);

  //
  // receive methods
  //

  template<class Ret>
  inline void receive(int fromContext, int matchingTag,
		      Ret (*handler)());

  template<class Ret, class T1>
  inline void receive(int fromContext, int matchingTag,
		      Ret (*handler)(T1));

  template<class Ret, class TL1>
  inline void receive(int fromContext, int matchingTag,
		      Ret (*handler)(TL1), TL1 &x0);

  template<class Ret, class T1, class T2>
  inline void receive(int fromContext, int matchingTag,
		      Ret (*handler)(T1, T2));

  template<class Ret, class TL1, class T1>
  inline void receive(int fromContext, int matchingTag,
		      Ret (*handler)(TL1, T1), TL1 &x0);

  template<class Ret, class T1, class T2, class T3>
  inline void receive(int fromContext, int matchingTag,
		      Ret (*handler)(T1, T2, T3));

  template<class Ret, class TL1, class TLH, class T1, class T2>
  inline void receive(int fromContext, int matchingTag,
		      Ret (*handler)(TLH, T1, T2), TL1 &x0);

  template<class Ret, class T1, class T2, class T3, class T4>
  inline void receive(int fromContext, int matchingTag,
		      Ret (*handler)(T1, T2, T3, T4));

  template<class Ret, class TL1, class T1, class T2, class T3>
  inline void receive(int fromContext, int matchingTag,
		      Ret (*handler)(TL1, T1, T2, T3), TL1 &x0);


MatchingHandler::send
---------------------

When you 'send' from a MatchingHandler, you will eventually end up
invoking some handler that was registered on the receiving
side.  That handler will be one registered with the corresponding
MatchingHandler on the receiving context.  In the send call you
provide a tag to indicate which handler; this must match the tag value
given when the handler is registered.  These tag values are local to
the MatchingHandler.  Different MatchingHandlers can have the
same sequences of tags and they will not interfere.

Each 'send' will do the following things:

	1. Serialize all the arguments into a single buffer, using
cheetah_pack functions.

	2. controller.ainvoke a handler on the receiving context.  The 
same handler is used by all MatchingHandlers.

'send' is overloaded to take from 0 to 4 arguments.  The args are
templated.  You will need to make sure that the handler registered to
receive this data is of the proper type and is expecting the correct
number of arguments in the same order.
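
The serialization step can be pictured with the sketch below.  The
routines sketchPack, sketchUnpack and sketchSerialize are hypothetical
and only handle types that can be copied byte for byte; the real
cheetah_pack functions may have different signatures and a different
buffer layout.  Putting the matching tag at the front of the buffer is
also an assumption made for the illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

typedef std::vector<char> SketchBuffer;

// Hypothetical pack routine: append the raw bytes of a value.
template<class T>
void sketchPack(SketchBuffer &buf, const T &x) {
  const char *p = reinterpret_cast<const char *>(&x);
  buf.insert(buf.end(), p, p + sizeof(T));
}

// Hypothetical unpack routine: read a value at offset, then advance.
template<class T>
void sketchUnpack(const SketchBuffer &buf, std::size_t &offset, T &x) {
  assert(offset + sizeof(T) <= buf.size());
  std::memcpy(&x, &buf[offset], sizeof(T));
  offset += sizeof(T);
}

// Sender side for two arguments: tag first, then the args in order.
template<class T1, class T2>
SketchBuffer sketchSerialize(int matchingTag, const T1 &x1, const T2 &x2) {
  SketchBuffer buf;
  sketchPack(buf, matchingTag);
  sketchPack(buf, x1);
  sketchPack(buf, x2);
  return buf;
}

// Self-check: unpacking in the same order recovers the same values.
inline bool sketchPackDemo() {
  SketchBuffer buf = sketchSerialize(17, -3, 2.5f);
  int tag = 0;
  int a = 0;
  float b = 0.0f;
  std::size_t offset = 0;
  sketchUnpack(buf, offset, tag);
  sketchUnpack(buf, offset, a);
  sketchUnpack(buf, offset, b);
  return tag == 17 && a == -3 && b == 2.5f && offset == buf.size();
}
```

Unpacking with a different type or order reads the wrong bytes, which
is why the receiving handler's argument list has to agree with the
send call.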


MatchingHandler::receive
------------------------

When you 'receive' from a MatchingHandler, you are not really
receiving data.  You are actually registering your own handler (aka an
"action" in MatchingHandler).  Right now, that handler must be a
function or static method.  It can take from 0 to 4 arguments of
different types.  When you register this "action" function, you
specify what the calling context should be, and the tag value.  If a
message has previously arrived that matches the sender and tag, the
handler is immediately invoked.  If no message is available, the
function is queued up as an action to take when the proper message arrives.
The message must be something that was sent from the same
MatchingHandler on the sending context.

When a message arrives or is found that has an available action to
run, 'receive' will do the following things:

	1. From the buffer of received data, unserialize all the
arguments using cheetah_unpack functions.  The number and type of the
arguments is determined from the type of handler function that you
provided in the call to 'receive'.  Thus, if the types here do not
match the types being sent, the arguments cannot be unpacked correctly.

	2. Call the user-provided handler function with the unpacked
arguments.

	3. Remove the action from the list of registered actions.
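
The matching behaviour described in this section can be sketched as
below.  SketchMatcher is hypothetical and much simpler than the real
class: every message carries a single int, and deliver() stands in for
whatever the shared ainvoke handler does when a message arrives.

```cpp
#include <cassert>
#include <deque>
#include <map>
#include <utility>

typedef void (*SketchAction_t)(int value);
typedef std::pair<int, int> SketchKey;   // (fromContext, matchingTag)

class SketchMatcher {
public:
  // Register an action; run it at once if a message is already waiting.
  void receive(int fromContext, int tag, SketchAction_t action) {
    assert(action != 0);
    SketchKey key(fromContext, tag);
    std::deque<int> &waiting = messages_m[key];
    if (!waiting.empty()) {
      int value = waiting.front();
      waiting.pop_front();
      action(value);
    } else {
      actions_m[key].push_back(action);
    }
  }

  // A message has arrived; run a queued action or save the message.
  void deliver(int fromContext, int tag, int value) {
    SketchKey key(fromContext, tag);
    std::deque<SketchAction_t> &pending = actions_m[key];
    if (!pending.empty()) {
      SketchAction_t action = pending.front();
      pending.pop_front();   // an action is used once, then removed
      action(value);
    } else {
      messages_m[key].push_back(value);
    }
  }

private:
  std::map<SketchKey, std::deque<SketchAction_t> > actions_m;
  std::map<SketchKey, std::deque<int> > messages_m;
};

static int sketchSum = 0;
void sketchAdd(int value) { sketchSum += value; }

// Self-check covering both arrival orders.
inline bool sketchMatcherDemo() {
  sketchSum = 0;
  SketchMatcher m;
  m.deliver(1, 10, 5);            // message arrives early: queued
  bool queued = (sketchSum == 0);
  m.receive(1, 10, sketchAdd);    // matches at once: sum is now 5
  m.receive(2, 11, sketchAdd);    // no message yet: action queued
  m.deliver(2, 11, 7);            // runs the queued action: sum is 12
  m.deliver(2, 11, 100);          // action was removed: message queued
  return queued && sketchSum == 12;
}
```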

If you look at the above interface to MatchingHandler, you will notice 
there are actually two forms of each receive(), for example:

  template<class Ret, class T1, class T2>
  inline void receive(int fromContext, int matchingTag,
		      Ret (*handler)(T1, T2));

  template<class Ret, class TL1, class T1>
  inline void receive(int fromContext, int matchingTag,
		      Ret (*handler)(TL1, T1), TL1 &x0);


The first version takes three arguments: fromContext, tag, handler.
The second takes four arguments: fromContext, tag, handler, x0.  In
both cases you are registering a handler that takes two arguments.
The difference is this:

	o in the first case, the 'send' call must have sent over two
	  arguments, of types T1 and T2.  These are unpacked and used
	  in this order to call the handler, like
		handler(*p1, *p2);

	o in the second case, the 'send' call must have actually sent
	  over just one argument, of type T1.  But the handler takes two
	  arguments.  The fourth argument to 'receive' is then used as 
	  the other value to call the handler.  Since this is provided
	  on the receive side, it is not sent across the wire.  All
	  that happens is that the "action" will save a reference to
	  x0, and use it to call the handler when the message arrives.
	  It is used as the FIRST argument to the handler, like
		handler(x0, *p1);
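
One way to picture the second form is the sketch below.
SketchBoundAction and sketchBind are hypothetical names.  The action
stores a reference to x0 and passes it as the first argument when the
unpacked value shows up.  The separate TLH parameter is a choice made
in this sketch so that the handler can take its first argument by
reference while x0 is deduced as the plain type.

```cpp
#include <cassert>

// Hypothetical action for a handler of the form Ret handler(TLH, T1),
// where the first argument comes from the receive side.
template<class Ret, class TLH, class TL1, class T1>
class SketchBoundAction {
public:
  SketchBoundAction(Ret (*handler)(TLH, T1), TL1 &x0)
    : handler_m(handler), x0_m(x0) {}

  // Called with the one value that was unpacked from the message.
  void run(const T1 &p1) const {
    assert(handler_m != 0);
    handler_m(x0_m, p1);
  }

private:
  Ret (*handler_m)(TLH, T1);
  TL1 &x0_m;   // a reference, so x0 must outlive the action
};

// Helper that deduces the template arguments from the call.
template<class Ret, class TLH, class TL1, class T1>
SketchBoundAction<Ret, TLH, TL1, T1>
sketchBind(Ret (*handler)(TLH, T1), TL1 &x0) {
  return SketchBoundAction<Ret, TLH, TL1, T1>(handler, x0);
}

struct SketchTotals { int count; int sum; };

// Handler with a local first argument and one "sent" argument.
void sketchAccumulate(SketchTotals &totals, int value) {
  ++totals.count;
  totals.sum += value;
}

// Self-check: the local object is updated through the saved reference.
inline bool sketchBoundDemo() {
  SketchTotals totals = { 0, 0 };
  auto action = sketchBind(sketchAccumulate, totals);
  action.run(4);
  action.run(6);
  return totals.count == 2 && totals.sum == 10;
}
```

Because only a reference is saved, the object passed as x0 has to stay
alive until the message has arrived and the handler has run.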


An Example
----------

This is example m2.cc in the cheetah/tests directory, with a few
lines removed.


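#include <Controller/Controller.h>
#include <MatchingHandler/MatchingHandler.h>
#include <iostream>

// Controller and MatchingHandler are defined in namespace Cheetah.
using namespace Cheetah;
using std::cout;
using std::endl;

Controller controller;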

int count1 = 0;
int count2 = 0;
int count3 = 0;
int sum1 = 0;
int sum2 = 0;
int sum3 = 0;


struct TestStruct
{
  int a_m;
  double b_m;
  TestStruct() : a_m(-1), b_m(-1.0) { }
  TestStruct(int a, double b) : a_m(a), b_m(b) { }
};


// A handler function taking one argument

void h1(int val)
{
  count1++;
  sum1 += val;
}


// A handler function taking two arguments

void h2(int val1, float val2)
{
  count2++;
  sum2 += val1;
}


// A handler function taking three arguments

void h3(TestStruct &val3, int val1, float val2)
{
  count3++;
  sum3 += val1;
}


int main(int argc, char **argv)
{
  int i;

  // Initialize the controller, based on the command-line arguments.

  controller = Controller(argc,argv);

  // Initialize our handlers.

  MatchingHandler m1(controller);
  MatchingHandler m2(controller);
  MatchingHandler m3(controller);

  // Send out some data to the other contexts.
  // For m1, send to ourselves.
  // For m2 and m3, do NOT send to ourselves.

  for (i=0; i < controller.ncontexts(); ++i)
    {
      int destination = i;
      int tag = controller.mycontext() + 10;
      float xval = 3.5 + i;

      m1.send(destination, tag, (-tag));

      if (i != controller.mycontext())
	{
	  m2.send(destination, tag, (-destination), xval);
	  m3.send(destination, tag, (-destination), xval);
	}
    }

  // Register receive handlers now.

  TestStruct ts(5, 10.5);

  int checksum1 = 0;
  int checksum2 = 0;
  int checksum3 = 0;

  for (i=0; i < controller.ncontexts(); ++i)
    {
      m1.receive(i, i + 10, h1);
      checksum1 += -(i + 10);

      if (i != controller.mycontext())
	{
	  m2.receive(i, i + 10, h2);
	  checksum2 += (-controller.mycontext());

	  m3.receive(i, i + 10, h3, ts);
	  checksum3 += (-controller.mycontext());
	}
    }

  // Wait for everything to complete.  It is complete when we receive
  // a message from all contexts into m1, and from all other contexts
  // into m2 and m3.

  int nc = controller.ncontexts();

  while (count1 < nc || count2 < (nc-1) || count3 < (nc-1))
    controller.poll();

  bool passed = (count1 == controller.ncontexts() && sum1 == checksum1);
  passed = passed && (count2==(controller.ncontexts() - 1) && sum2==checksum2);
  passed = passed && (count3==(controller.ncontexts() - 1) && sum3==checksum3);

  cout << controller.mycontext() << "> ";
  cout << (passed ? "PASSED" : "FAILED") << endl;
}
