In bioinformatics, neighbor joining is a bottom-up (agglomerative) clustering method for the creation of phylogenetic trees, created by Naruya Saitou and Masatoshi Nei in 1987. Usually based on DNA or protein sequence data, the algorithm requires knowledge of the distance between each pair of taxa (e.g., species or sequences) to create the phylogenetic tree. Neighbor joining takes a distance matrix, which specifies the distance between each pair of taxa, as input. The algorithm starts with a completely unresolved tree, whose topology corresponds to that of a star network, and iterates over the following steps, until the tree is completely resolved, and all branch lengths are known: Based on the current distance matrix, calculate a matrix (defined below). Find the pair of distinct taxa i and j (i.e. with ) for which is smallest. Make a new node that joins the taxa i and j, and connect the new node to the central node. For example, in part (B) of the figure at right, node u is created to join f and g. Calculate the distance from each of the taxa in the pair to this new node. Calculate the distance from each of the taxa outside of this pair to the new node. Start the algorithm again, replacing the pair of joined neighbors with the new node and using the distances calculated in the previous step. Based on a distance matrix relating the taxa, calculate the x matrix as follows: where is the distance between taxa and . For each of the taxa in the pair being joined, use the following formula to calculate the distance to the new node: and: Taxa and are the paired taxa and is the newly created node. The branches joining and and and , and their lengths, and are part of the tree which is gradually being created; they neither affect nor are affected by later neighbor-joining steps. For each taxon not considered in the previous step, we calculate the distance to the new node as follows: where is the new node, is the node which we want to calculate the distance to and and are the members of the pair just joined.
Hervé Bourlard, Afsaneh Asaei, Mohammadjavad Taghizadeh