Correction

Section 5 of this paper mentions an upper bound on the number of elementary trees rooted at a given node. That bound is so incorrect that I have no idea how I ended up with it. Thanks to Keith Hall for asking (on 2004-09-27; I replied the following day).

For example, if m=6, k=3, a given node is the root of at most 59 elementary trees, rather than at most 43 as claimed in the paper.

The correct bound is a generalization of the Catalan numbers. Fix m >= 1. There are precisely f(k) complete m-ary trees with exactly k internal (i.e., non-leaf) nodes, where f is defined by the recurrence

where gj is a helper function that defines a product of j terms: The familiar Catalan numbers are the case m=2, i.e., binary trees. (In that case, at least, f(k) has a closed form that is bounded above by 4k. I haven't studied the question for larger m.)

Thus, a given node in an infinite, complete m-ary tree T is the root of exactly f(0)+f(1)+...+f(k) elementary trees with up to k internal nodes. (Only the internal nodes are freely determined; all children of an internal node must be included as frontier nodes.)

In our situation, T is a finite, not necessarily complete m-ary tree. However, this only reduces the number of elementary trees rooted at each node, so f(0)+f(1)+...+f(k) is still an upper bound. For the example of m=6, k=3, f(0)+f(1)+f(2)+f(3)=59.

Clarification

Shieber (2004) takes issue with a passing remark in this paper:

[I]t has been claimed in passing that synchronous tree-substitution grammars are "equivalent to top-down tree transducers" (Eisner, 2003). This is clearly contravened by the distinction between B(LC,LC) and B(D,M) [respectively].

Shieber's point (in part) is that standard tree transducers can copy subtrees whereas STSGs cannot. Restricting to just the linear tree transducers would remove the ability to copy, which is good, but also kill the ability to accomplish local rotations, which is bad. See Shieber's paper for details.

However, other types of tree transducers can be defined that act like STSGs. In particular, Graehl and Knight (2004) discuss a class of transducers xR that was sketched in Rounds (1970): "xR transducers are similar to (weighted) Synchronous TSG, except that xR can copy input trees but does not model deleted input subtrees." The definition of xR could presumably be modified to eliminate these differences. In particular, the linear tree transducers xRL already cannot copy.


Jason Eisner - jason@cs.jhu.edu (tech correspondence welcome)