Wire-Cell fans

Brett Viren

2020-12-03 09:50

Of course, everyone is a fan of Wire-Cell. This post isn't about you. It's about a common pattern found in Wire-Cell data flow graphs, how this pattern is realized at the WCT C++ level, and how to form it in WCT configuration.

This pattern is generally called a "fan", and it comes in two variants: fanin and fanout.

fanout : a map of one input to a number of outputs of the same type
fanin : a map of a number of inputs of the same type to one output

Let's look at how fans are represented in Wire-Cell C++ and then in the configuration.

Fans at the C++ level

In Wire-Cell, a fanout node implements the IFanoutNode base and a fanin node implements the IFaninNode base. These ensure that all fan-side ports carry the same type of data and that the fan multiplicity is honored by the graph edges.

Fan multiplicity

The fan multiplicity needs some detail.

By default, it is set by a C++ template argument to IDepoFanout or IDepoFanin. It is then used to set the size of the vectors returned by the output_types() or input_types() methods required by the base INode interface.

But in some cases we want fanout and fanin nodes whose fan multiplicity is set at run time (i.e., by configuration). DepoFanout is one example. IDepoFanout requires it to supply output_types(), and it satisfies this by accepting a "multiplicity" configuration parameter which sets the size of the vector of output ports at run time.
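As a rough illustration, a run-time multiplicity might be given in configuration along these lines. Only the "multiplicity" parameter name comes from the text above. The instance name "depofan" and the value 4 are made up for this sketch, and the layout assumes the usual WCT type/name/data configuration object.

```jsonnet
// Sketch only: the instance name and the value 4 are illustrative.
local nfan = 4;

{
    type: "DepoFanout",
    name: "depofan",
    data: {
        // Sets the size of the vector of output ports at run time.
        multiplicity: nfan,
    },
}
```

Whatever value is given here must agree with the number of edges attached to the fan side of the node in the graph.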

Multiplicity Limitations

There is one last implementation detail.

When Wire-Cell uses the multi-threaded graph execution engine based on TBB, the fanout nodes are mapped via the fanout category to a tbb::flow::split_node type and the fanin nodes are mapped via the fanin category to a tbb::flow::join_node type.

This detail is important as there are two limits placed on the multiplicity.

At the WCT level, a translation between run-time and compile-time multiplicity is done in the TBB fanout category, which exhaustively iterates over allowed sizes up to some maximum (6 as of this writing). This limit can be trivially raised.

At the TBB level, the allowed maximum multiplicity depends on the version and how it is compiled. Older TBB versions may be limited to 15 or fewer. Newer TBB requires modern C++, uses variadic templates and std::tuple, and may have a higher limit.

When pushing beyond the current max fan multiplicity of 6, explicit testing will be needed. At the end we will give some pointers.
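One way to catch an oversized fan early is to check the multiplicity in the configuration itself, before any job runs. The sketch below is plain Jsonnet and does not use any WCT module. The helper fan_cfg() and the instance name are hypothetical, and the value 6 is simply the limit quoted above, which only matters when the TBB engine is used.

```jsonnet
// The limit quoted above; change it if the WCT limit is raised.
local max_fan = 6;

// Hypothetical helper, not part of WCT: build a fan config or fail early.
local fan_cfg(ftype, name, mult) =
    assert mult >= 1 && mult <= max_fan :
        "fan multiplicity %d is outside 1..%d" % [mult, max_fan];
    {
        type: ftype,
        name: name,
        data: { multiplicity: mult },
    };

// Asking for a multiplicity of 7 here would abort with the message above.
fan_cfg("DepoFanout", "depofan", 4)
```

The assert fires when the Jsonnet is compiled, so the failure shows up before wire-cell ever sees the configuration.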

Fans at the config level

Wire-Cell provides a Jsonnet module, pgraph.jsonnet, with functions that help construct the flow graph configuration. In particular, it provides some fan-related functions under the fan object. Below is a tour of these functions.

Caveat

This section presents some WCT configuration examples. While they are syntactically correct, they are not guaranteed to be fleshed out to properly provide all required information to be used to run a wire-cell job.

Atomic fan node

No special help is needed to make a simple, atomic fanout or fanin node. For example, consider:

local g = import "pgraph.jsonnet";
local fanout = g.pnode({
    type: "DepoFanout",
}, nin=1, nout=6);

This defines a 1-to-6 fanout of type DepoFanout. This is simple enough, but the complexity comes in when one wants to attach many nodes to the outputs. There the pgraph.jsonnet functions will help.

Construct a fanout subgraph

A fanout subgraph can be constructed with:

fan.fanout(fantype, pipelines, name="fanout", tag_rules=[])

This function constructs an aggregate pnode around an atomic fanout type, which it also constructs into an atomic pnode in the aggregate. The pipelines array holds pnodes, each of which will be attached to an output of the fan.

It is specifically tested in test-fan-out.jsonnet which can be visualized with:

$ jsonnet -J cfg cfg/test/test-fan-out.jsonnet > test-fan-out.json
$ wirecell-pgraph dotify --jpath=-1 test-fan-out.json test-fan-out.dot
$ dot -Tpng -o test-fan-out.png test-fan-out.dot

[Figure: graph rendered from test-fan-out.jsonnet (test-fan-out.png)]

In this example, the TrackDepos and DumpDepos nodes are added in order to produce a complete graph and are not required, per se, just for the fanout subgraph.
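For orientation, a call to fan.fanout() might look roughly like the following. This is a sketch under assumptions and is not copied from test-fan-out.jsonnet: it assumes fantype is given as the component type name, and the instance names and the multiplicity of 4 are illustrative.

```jsonnet
local g = import "pgraph.jsonnet";

// One sink per fan output. Instance names are illustrative.
local nfan = 4;
local sinks = [
    g.pnode({ type: "DumpDepos", name: "dump%d" % n }, nin=1, nout=0)
    for n in std.range(0, nfan - 1)
];

// Aggregate pnode wrapping a DepoFanout and the sinks attached to it.
// The fan multiplicity follows from the length of the pipelines array.
local fanout = g.fan.fanout("DepoFanout", sinks, name="depofan");

// Returned only for inspection. A runnable job also needs a depo source
// feeding the aggregate and an app configuration.
fanout
```

Compile it with the same `jsonnet -J cfg` invocation shown above so that pgraph.jsonnet is found.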

Construct a fanin subgraph

A fanin subgraph can be constructed with:

fan.fanin(fantype, pipelines, name="fanin", outtags=[])

This function creates the opposite pattern to fan.fanout(). It is tested in test-fan-in.jsonnet which can be visualized similarly to the above:

[Figure: graph rendered from test-fan-in.jsonnet]
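A fan.fanin() call mirrors the fanout case. In the sketch below, "FrameFanin" is assumed to be an available fanin component type and "MyFrameSource" is a purely hypothetical placeholder for whatever produces the data. Neither name comes from this post, so substitute the types your graph actually uses.

```jsonnet
local g = import "pgraph.jsonnet";

// One upstream pipeline per fan input.
// "MyFrameSource" is a hypothetical placeholder type.
local nfan = 3;
local sources = [
    g.pnode({ type: "MyFrameSource", name: "src%d" % n }, nin=0, nout=1)
    for n in std.range(0, nfan - 1)
];

// Aggregate pnode with the sources feeding a single fanin node.
// "FrameFanin" is an assumed fanin type, not taken from this post.
local fanin = g.fan.fanin("FrameFanin", sources, name="framefan");

// Returned only for inspection; the output port still needs a consumer.
fanin
```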

Construct a "fan pipe"

A fan pipe can be constructed with:

fan.pipe(fout, pipelines, fin, name="fanpipe", outtags=[], tag_rules=[])

In the fan.fanout() example above, we had to manually make a fanin, and vice versa for the fan.fanin() example, in order to "cap off" all ports so that the total graph is complete. The fan.pipe() function joins these two patterns into an aggregate with one input and one output (i.e., a pipe).

This pattern is tested in test-fan-pipe.jsonnet which visualizes as:

[Figure: graph rendered from test-fan-pipe.jsonnet]
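A fan pipe call could be sketched as follows. It assumes fout and fin are the fanout and fanin component type names. "FrameFanout" and "FrameFanin" are assumed types and "MyFrameFilter" is a hypothetical one-in, one-out placeholder; none of these names come from this post.

```jsonnet
local g = import "pgraph.jsonnet";

// One single-input, single-output branch per fan arm.
// "MyFrameFilter" is a hypothetical placeholder type.
local nfan = 3;
local branches = [
    g.pnode({ type: "MyFrameFilter", name: "filt%d" % n }, nin=1, nout=1)
    for n in std.range(0, nfan - 1)
];

// Aggregate with one input and one output: fanout, branches, fanin.
// "FrameFanout" and "FrameFanin" are assumed type names.
local fanpipe = g.fan.pipe("FrameFanout", branches, "FrameFanin",
                           name="framepipe");

// Returned only for inspection.
fanpipe
```

Because the result has one input and one output, it can be dropped into a larger graph wherever a simple pipeline element would go.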

General subgraph construction

The special fan functions described above help by creating an aggregate node following some special graph pattern. They all use the intern() function for the "heavy lifting".

If no special fan function or other helper function builds the pattern you need, you may always use intern() directly. This is because the intern() function performs arbitrary connections between a set of "input nodes", "output nodes" and "internal nodes" in order to form a single "aggregate node". Here is its calling interface:

intern(innodes=[], outnodes=[], centernodes=[], 
       edges=[], iports=[], oports=[], name="")

As shown, all arguments are optional. These arguments are defined as:

innodes: list of pnode objects which will automatically (see iports) have their input ports exposed as input ports of the resulting aggregate node.
outnodes: as above but for output ports (see oports).
centernodes: nodes which expose no ports outside the resulting aggregate node.
edges: edges internal to the aggregate, either from output ports of input nodes to input ports of output nodes, or between ports of centernodes and output ports of input nodes or input ports of output nodes.
iports: explicit list of the input ports to expose on the resulting aggregate node. If empty, they are determined automatically from the innodes.
oports: as above but for output ports, determined from the outnodes.
name: the name of the resulting pnode.
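To make the arguments concrete, here is a sketch that builds a 1-to-2 fanout by hand with intern(). It leans on an assumption: that pgraph.jsonnet offers an edge constructor of the form edge(tail, head, tail_port, head_port). Check the module for the exact name and signature before relying on it. The instance names are illustrative.

```jsonnet
local g = import "pgraph.jsonnet";

// The atomic fan node, with multiplicity set to match its two outputs.
local fan = g.pnode({
    type: "DepoFanout",
    name: "byhand",
    data: { multiplicity: 2 },
}, nin=1, nout=2);

// Two sinks, one for each fan output.
local sinks = [
    g.pnode({ type: "DumpDepos", name: "sink%d" % n }, nin=1, nout=0)
    for n in [0, 1]
];

// g.edge(...) is an ASSUMED helper: output port n of the fan goes to
// input port 0 of sink n.
local agg = g.intern(
    innodes=[fan],
    outnodes=sinks,
    edges=[g.edge(fan, sinks[n], n, 0) for n in [0, 1]],
    name="byhandfan"
);

// Returned only for inspection.
agg
```

With iports and oports left empty, the aggregate exposes the single input port of the fan and no output ports, since the sinks have none.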

If you end up needing to use intern() directly, consider whether your subgraph pattern is "specifically generic" enough (contradiction noted!). If so, consider adding a new helper function to pgraph.jsonnet so other people may benefit.

Multi-layer fans

When a fan multiplicity larger than the limit is required, one can produce it from a "fan of fans", each of which keeps a multiplicity within the limit.
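A two-layer fanout might be sketched like this. It is not taken from test-fan-fan.jsonnet and assumes, as in the earlier sections, that an aggregate returned by fan.fanout() exposes one input port and so can itself serve as a pipeline of an outer fanout. The sizes 3 and 4 and all instance names are illustrative.

```jsonnet
local g = import "pgraph.jsonnet";

// 3 x 4 = 12 outputs in total, while each fan stays within a limit of 6.
local nouter = 3;
local ninner = 4;

// Inner layer: fanout number i feeds ninner sinks.
local inner(i) = g.fan.fanout("DepoFanout", [
    g.pnode({ type: "DumpDepos", name: "dump%d_%d" % [i, j] },
            nin=1, nout=0)
    for j in std.range(0, ninner - 1)
], name="inner%d" % i);

// Outer layer: each of its pipelines is one inner fanout aggregate.
local outer = g.fan.fanout("DepoFanout", [
    inner(i)
    for i in std.range(0, nouter - 1)
], name="outer");

// Returned only for inspection.
outer
```

Each fan node gets a distinct instance name, which keeps the inner fans from colliding with one another or with the outer one.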

This pattern is exercised in the test file test-fan-fan.jsonnet and we'll close out this overly long blog post with an overly long visualization of this double-fanout:

[Figure: graph rendered from test-fan-fan.jsonnet]
