The distributed solution process of GPS Dancer requires the computation of vector averages over the entire peer-to-peer network, in such a way that all peers end up with the same global mean vector. This is done by means of the "square dance" algorithm, which gave the system its name. The name GPS Dancer is also a little wordplay on making the reference frame *denser*, although not everybody seems to get that.

A conventional client-server approach would use a central server to which all Dancer peers upload their own vectors. The server then computes the global average and publishes the result to all clients. Especially in large networks, such a central server represents a serious bottleneck: its finite internet bandwidth imposes a limit on the maximum network size that can be handled in this way. The limits of such a centralized approach are the main reason why there are at present only a few hundred GPS stations in the formal ITRF (...the analysis *centres* can only handle a few hundred stations). GPS Dancer avoids any form of central element, primarily to prevent processing bottlenecks.

**Figure 1**: client-server network

A less primitive (less centralized) accumulation method is offered by a kind of pyramid scheme.

In the example of Figure 2, a network of sixteen computers forms eight pairs (A). One computer of each pair downloads the vector from the other computer, and computes the average between its own vector and the vector from the other computer.

The eight computers that now hold the average of two original vectors can form four new exchange pairs (B). Again, one computer of each pair downloads the vector from the other, and the receiving computer can now find the average over four original contributions.

The four remaining computers form two new pairs, and the process is repeated. One computer in each pair ends up with the average over eight original vectors (C).

Finally, the two remaining computers represent a last exchange pair. One computer downloads the vector from the other and computes the average over all sixteen original contributions in the network (D).

Because all sixteen computers want to have this global average over all vectors, the exchange scheme is now reversed. We can imagine that all arrows in Figure 2 then point in the opposite direction, and that the downloads are performed in the order D-C-B-A. After this, all computers have the same global average over all sixteen original vectors.

This approach also works fine for a network size that is not an exact power of two. For any other size, the exchange cycle (A) is simply incomplete: some computers have an exchange partner from which they download a vector, while others have not. This is not a problem, but it now becomes necessary to keep track of the number of original vectors that have been averaged into any intermediate result. A *weighted* average is then performed in every step. For instance, if one vector is the average over four original contributions, and the downloaded vector is based on five original contributions (...as a result of an incomplete cycle "A"), we must compute the *weighted* average as (four times vector 1 plus five times vector 2) divided by (four plus five). The new result is the correct average over the nine original contributions involved.
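The weighted merge of two intermediate results can be sketched in a few lines of Java. This is an illustrative example, not actual GPS Dancer code; the class and method names are invented:

```java
public class WeightedMerge {

    /** Merge two partial averages; n1 and n2 are the numbers of
     *  original contributions behind v1 and v2 respectively. */
    static double[] merge(double[] v1, int n1, double[] v2, int n2) {
        double[] result = new double[v1.length];
        for (int i = 0; i < v1.length; i++) {
            // weighted average: (n1*v1 + n2*v2) / (n1 + n2)
            result[i] = (n1 * v1[i] + n2 * v2[i]) / (double) (n1 + n2);
        }
        return result;
    }

    public static void main(String[] args) {
        double[] v1 = {2.0, 4.0};    // average over 4 original contributions
        double[] v2 = {11.0, 13.0};  // average over 5 original contributions
        double[] avg = merge(v1, 4, v2, 5);
        // (4*2 + 5*11) / 9 = 7.0 ; (4*4 + 5*13) / 9 = 9.0
        System.out.println(avg[0] + " " + avg[1]); // prints "7.0 9.0"
    }
}
```

The merged result must of course also carry its new count (here nine), so that the next merge can again be weighted correctly.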

The workload and data traffic in this pyramid scheme are distributed a bit more evenly over the network than in Figure 1, but the design is still asymmetric (as indicated by the different colors of the computers in Figure 2). The computer that forms the top of the pyramid must do twice as much work as the computer in the second-highest layer of the pyramid. In turn, this computer must do twice as much as those in the third-highest level, and so on.

More worrying is the fact that each computer in the network fully depends on all computers at higher levels in the pyramid. Ultimately the entire network depends on the single computer at the top of the pyramid, just as if it were the central server of Figure 1. If this computer goes off-line for whatever reason, all other computers in the network fail to obtain the correct global result. The top of the pyramid forms a *single point of failure* in the network. Moreover, each computer (apart from the ones at the very bottom) is the top of a sub-pyramid, in which it forms a single point of failure for all computers below it.

In short, this single-pyramid exchange still forms a rather poor design. Too many things can go wrong, and there are many wait states for some computers while other computers must work very hard at the same time.

**Figure 2**: single pyramid accumulation

We can avoid the problems of the single pyramid by changing all data exchanges in Figure 2 from one-way downloads into two-way downloads and uploads. Each computer in any pair downloads the vector from the other computer in its pair, and both computers continue to form new pairs with another computer in the network. Because this regular changing of exchange pairs resembles square dancing, this scheme was soon called the square dance method. It will be explained via Figure 3.

The sixteen computers in our example network again build eight exchange pairs, but now each computer downloads the contribution from the other while uploading its own vector. All sixteen computers compute vector averages between their own vector and the incoming vector. This means that any two computers that form a pair end up with the same average, over the same two original contributions. Instead of sixteen original vectors, there are now only eight different vectors in the network. Each of these eight vectors is present at two different computers in the network (3A). The dotted lines in Figure 3 separate clusters of computers that have the same average vector, which is the average over all original contributions in the cluster.

New pairs are formed in Figure 3B among *all* computers, not just half of them, as in Figure 2B. The new pairs are formed *between* the pairs of Figure 3A, so that there is no risk of downloading information that is already included in a local average vector. Again, all computers upload their own vector while downloading the vector from the remote computer, and both computers in a pair compute the average of their own vector and the incoming vector. Because the two computers in any pair end up with the same vector, there are now only four different vectors left in the entire network. Each of these vectors is present at four different computers (Figure 3B), separated by the dotted lines.

Again, new pairs are formed between the clusters from the previous cycle. Another pairwise exchange is performed, and again all computers compute a new vector average. There are then only two different vectors left in the network (Figure 3C), and each of these vectors is present at eight different computers.

Finally, new pairs are formed between the two groups of eight computers. Vectors are exchanged by each pair, and each computer computes the average between the same two vectors. The result is that there is only one vector left in the entire network, which is the global average over all sixteen original contributions - just what we needed.
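The four exchange cycles described above can be simulated in a few lines. The sketch below (illustrative, not actual Dancer code) uses scalar values instead of vectors, and forms the exchange pairs by toggling one bit of the node number per cycle, as explained later in this article:

```java
public class SquareDance {

    /** Run the square dance exchange; the array length must be a power of two. */
    static double[] run(double[] values) {
        int n = values.length;
        double[] v = values.clone();
        for (int bit = n >> 1; bit >= 1; bit >>= 1) {
            double[] next = new double[n];
            for (int node = 0; node < n; node++) {
                int partner = node ^ bit;  // toggle one bit -> exchange partner
                next[node] = 0.5 * (v[node] + v[partner]);
            }
            v = next;  // after each cycle, clusters of equal vectors double in size
        }
        return v;      // all entries now hold the same global average
    }

    public static void main(String[] args) {
        double[] values = new double[16];
        for (int i = 0; i < 16; i++) values[i] = i;   // mean = 7.5
        double[] result = run(values);
        System.out.println(result[0] + " " + result[15]); // prints "7.5 7.5"
    }
}
```

After log2(16) = 4 cycles, every one of the sixteen nodes holds the same value, which is the average over all sixteen original contributions.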

The advantages of the multiple-pyramid scheme over the single pyramid scheme are obvious:

- The overall duration is shorter, because *each* computer finds the correct global average at the same time, after only four consecutive downloads. In Figure 2, the last eight computers receive the result only after eight consecutive transfers in the network.
- All computers in the network perform exactly the same amount of computations, and have exactly the same amount of incoming and outgoing data traffic.
- There is a healthy level of redundancy in this design. After the first exchange cycle, each vector contribution has arrived at two different computers. If one of them would go off-line at that moment, its information is already present somewhere in the network. After two exchange cycles, there are four computers with the same information, implicitly forming each other's backup in case that one of the four goes off-line; etcetera.
- There are no single points of failure in the exchange, other than in the very first exchange cycle. Even if one computer fails at that moment, only its own vector contribution is lost; all other contributions still reach the rest of the network.

**Figure 3**: Square dance exchange scheme

In the square dance exchange scheme, every computer is the top of its own single-pyramid exchange scheme (as in Figure 2). Every computer is also at the bottom of all sixteen simultaneous pyramid schemes, at the second level in eight pyramids, at the third level in four pyramids, etc. Each computer holds the information from two computers after the first cycle (including its own contribution), from four computers after the second cycle, from eight computers after the third cycle, and so on. Information spreads through the network at an exponential rate, and the number of contributions that each computer receives also grows exponentially.

Only log_{2}(*N*) successive vector exchanges are needed to compute the average over all *N* processes in the network, and to have the result simultaneously at all *N* processes. This allows for very large networks: the data traffic in a network of 10,000 receivers is only twice that of a network with 100 receivers. The internet bandwidth required for processing 1,000,000 receivers in a single solution is only three times more than the bandwidth for 100 receivers.

The square dance approach above is easy to follow if the network size is just a power of two, so that all computers follow exactly the same exchange scheme. In practice, this will be a rare exception. What if there are 23 dancers in the room? Only 16 of them can join the square dance.

In the case of computers exchanging data, this problem can be solved rather easily. Before the start of the exchange process, the seven "spare" computers in the network upload their data contributions to seven of the computers in the remaining network of sixteen. The receiving computers compute the average between their own vector and the incoming vector.

The contributions from all 23 computers are now held by a sub-network of sixteen computers. These sixteen can perform a square dance exchange of four cycles, with a twist: the number of vectors included in an intermediate average must be passed along with the vector itself, so that a correct weighted average can be performed in all exchange cycles. For instance, the correct weighted average in the first cycle may be (two times one vector plus one times the other vector) divided by three. The first vector is in fact the average over two original contributions, and must therefore count twice in the average over (now) three vectors.

That's it - the sixteen computers in the core network now run the square dance accumulation process, while the seven "spare" computers just sit and wait. After completion of the square dance accumulation by the core network, the spare computers download the global average from the same computer to which they uploaded their contribution initially, and end up with the same global average (over all 23 original contributions) as the rest of the network.
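The complete procedure for a network of 23 computers - fold the spare contributions into the core network, run a weighted square dance, fold the result back out - can be sketched as follows. This is a toy simulation with scalar values and invented names, not actual Dancer code; it assumes that spare node s uploads to core node (s - coreSize), matching the bit-toggling pairing described below:

```java
public class SpareNodeDance {

    /** Average the given values over a network of arbitrary size. */
    static double[] run(double[] values) {
        int n = values.length;
        int core = Integer.highestOneBit(n);  // size of the core network
        double[] v = values.clone();
        int[] w = new int[n];                 // contribution counts (weights)
        java.util.Arrays.fill(w, 1);

        // Fold-in: spare node s uploads its vector to core node (s - core)
        for (int s = core; s < n; s++) {
            int t = s - core;
            v[t] = (w[t] * v[t] + w[s] * v[s]) / (double) (w[t] + w[s]);
            w[t] += w[s];
        }

        // Weighted square dance among the core nodes
        for (int bit = core >> 1; bit >= 1; bit >>= 1) {
            double[] nv = new double[core];
            int[] nw = new int[core];
            for (int node = 0; node < core; node++) {
                int p = node ^ bit;           // exchange partner this cycle
                nv[node] = (w[node] * v[node] + w[p] * v[p]) / (double) (w[node] + w[p]);
                nw[node] = w[node] + w[p];
            }
            System.arraycopy(nv, 0, v, 0, core);
            System.arraycopy(nw, 0, w, 0, core);
        }

        // Fold-out: each spare node downloads the result from its core node
        for (int s = core; s < n; s++) v[s] = v[s - core];
        return v;
    }

    public static void main(String[] args) {
        double[] values = new double[23];
        for (int i = 0; i < 23; i++) values[i] = i;       // mean = 11.0
        double[] result = run(values);
        System.out.println(result[0] + " " + result[22]); // prints "11.0 11.0"
    }
}
```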

As explained in the article on network housekeeping, all GPS Dancer processes hold an up-to-date network map that consists of the complete list of peers in the global network (thousands of them, if necessary). From the size of this list, each process also knows exactly how many nodes there are in the entire network. All Dancer peers are numbered consecutively from zero to N-1, where N is the network size.

The square dance processes can only be performed by a subnetwork with a size that is a power of two. Each Dancer process can easily determine the largest power of two that is not larger than N. For this purpose, it looks at the binary representation of the integer number N. For example:

N = 423 = 0b**110100111**

The prefix 0b indicates binary numbers (in Java, and therefore in GPS Dancer). The size of the sub-network that will perform the square dance exchanges is 256 = 2^{8}. We call this the **core network**. This also implies that there will be eight square dance cycles for every *core node* in the core network. The remaining 167 network nodes are called *spare nodes*.

How does each Dancer process know with which other peers it must form exchange pairs, and in which order? This turns out to be remarkably straightforward. Each process knows its own integer node number, which is a number in the range 0 to 422 in our example (N = 423). The spare nodes are those with node numbers at or above the size of the square dance exchange network, i.e. numbers in the range 256 ... 422 (255 is the highest core node number). To find out whether a process is a spare node or a core node, it only needs to look at the ninth bit of its node number: if this bit is "1", the node number is larger than 255, so the process is a spare node. If the ninth bit is "0", the node number is smaller than 256, and the process is a core node.
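Both decisions - finding the core network size and testing the ninth bit - are one-liners in Java. A small sketch with illustrative names:

```java
public class NodeRole {

    /** Largest power of two not exceeding n = the core network size. */
    static int coreSize(int n) {
        return Integer.highestOneBit(n);  // keeps only the highest set bit
    }

    /** A node is a spare node if the bit corresponding to the core size
     *  is set in its node number (node numbers are always < 2 * coreSize). */
    static boolean isSpare(int node, int coreSize) {
        return (node & coreSize) != 0;
    }

    public static void main(String[] args) {
        int n = 423;                             // 0b110100111
        int core = coreSize(n);
        System.out.println(core);                // prints "256"
        System.out.println(n - core);            // prints "167" (spare nodes)
        System.out.println(isSpare(183, core));  // prints "false" (core node)
        System.out.println(isSpare(300, core));  // prints "true"  (spare node)
    }
}
```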

To find the core nodes with which our node must form exchange pairs, each node simply loops over the bits of its node number. Toggling one bit at a time yields nine other integers: the ninth bit identifies the spare node (if any) that will upload to this core node, and the remaining eight bits give the eight exchange partners for the eight square dance cycles.

Core node number m = 183 = 0b**010110111**

Toggling one bit at a time:

0b**1**10110111 = 439 (larger than network size N = 423, so no spare node will connect to this core node)

0b0**0**0110111 = 55

0b01**1**110111 = 247

0b010**0**10111 = 151

0b0101**0**0111 = 167

0b01011**1**111 = 191

0b010110**0**11 = 179

0b0101101**0**1 = 181

0b01011011**0** = 182
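The toggle sequence above amounts to XOR-ing the node number with a single-bit mask per cycle, from the most significant bit downwards. A sketch (class and method names are illustrative):

```java
public class ExchangePartners {

    /** Exchange partners of a core node, one per square dance cycle,
     *  obtained by toggling one bit at a time (most significant first). */
    static int[] partners(int node, int coreBits) {
        int[] p = new int[coreBits];
        for (int i = 0; i < coreBits; i++) {
            p[i] = node ^ (1 << (coreBits - 1 - i));  // toggle a single bit
        }
        return p;
    }

    public static void main(String[] args) {
        int node = 183;                       // 0b010110111
        // The ninth bit links to a (here non-existent) spare node:
        System.out.println(node ^ (1 << 8));  // prints "439"
        // The eight core bits give the eight exchange partners:
        for (int p : partners(node, 8)) System.out.print(p + " ");
        // prints "55 247 151 167 191 179 181 182"
    }
}
```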

The exchanges for node 183 follow exactly the series of node numbers found above, in this precise order. This guarantees that the same information is never included more than once in any average (orthogonality of information), which can easily be seen as follows:

(1) In the first exchange cycle, the network of 256 nodes forms 128 pairs by toggling bit number 8 of each node number. The remaining 7 bits of the node numbers represent integer numbers ranging from zero to 127, and these numbers uniquely identify the 128 separate exchange pairs. After the exchange cycle, the two computers in each of the 128 pairs have found the same vector, which is the average of the two original vectors in the pair.

(2) In the second exchange cycle, we form *pairs between pairs* from the first cycle by toggling the 7th bit of each node number. The remaining 6 bits represent integer numbers from zero to 63, each of which uniquely identifies a group of four computers (64 x 4 = 256). Bits 7 and 8 of each node number represent integer numbers in the range from zero to three, uniquely identifying the four computers within each group; bit 8 is zero for one pair, and one for the other pair. After this exchange cycle, all four computers in the two pairs end up with the same average, which is the average over their four original contributions.

(3) In the third exchange cycle, the five least significant bits represent integer numbers ranging from zero to 31, which uniquely identify 32 groups of eight computers. Within the groups of eight computers, there are four pairs; the bits 7 and 8 of each node number (representing integer numbers from zero to three) identify four pairs, and bit 6 identifies the two different computers within the pair.

In other words, the bit "m" toggled in each exchange cycle uniquely identifies the target computer to form a pair with. The "m-1" less significant bits of the node numbers are the same for both computers in an exchange pair, while the more significant bits are always different. This means that all exchange pairs are "orthogonal": they can never include information that was already included in an earlier exchange cycle.

The GPS Dancer project started in 2007 as a voluntary project of a working group of the International Association of Geodesy.

The GPS Dancer system was named after its "square dance" exchange algorithm. Of course, it also wants to make the GPS reference frame denser.

The Dancer on-line network became immune to internet connection problems by leaving the US marines, and becoming a pirate.