[Image: Zellig Harris]
Chris Knight’s explanation for why Noam’s successive models of language are asocial, abstract, and impracticable may have merit. Who can say? Attribution of intent is a dicey business. But there are other sources of the abstractness and uselessness.
Chris, you said “The essence of science is abstraction.” This is wrong. Science proceeds by generalization from particulars to principles. Generalization is not the same as abstraction. To confuse these is a category error.
Now, mathematics in itself is an abstract system, though its applications are not. This is perhaps the source of the confusion that leads to this category error. "Mathematics may be applied in [a] complex situation to figure out" how a principle of science works out in detail, but "very little mathematics is needed for the simple fundamental character of the basic laws." I am quoting here from Richard Feynman’s lecture “The Relation of Mathematics to Physics”, which is the second lecture printed in The Character of Physical Law (1965 London: BBC; 12th printing 1985 Cambridge: MIT Press). Mr. Feynman is, I think, a sound reference point for our physics envy.
The law of gravitation was not found by abstraction; it was found by study of a great deal of data of various kinds and by finding a succinct way to characterize a general property of them. The mathematical statement of that generalization is an application of mathematics (op. cit., the first lecture). The complicated facts of apparent planetary motions were explained by complicated calculations deriving from this principle, again applied mathematics. The statement of the principle itself is quite simple. Followers of Ptolemy had a pretty accurate way of explaining planetary motions by complicated calculations of cycles and epicycles, but this was based on more complicated assumptions involving nested spheres, with less complete and careful observational data. The law of gravity that Newton worked out expresses ('captures', as some folks like to say) a generalization over a very much larger domain of observations and data, from which predictions have been made about things not yet observed and subsequently confirmed. But the law of gravity is not an abstraction; it is a generalization expressed using the abstractions of mathematics. And to the present point, Newton did not posit the principle first and then look for data, nor did he reach into mathematics or logic for a pre-existing system of abstract statements analogous to the movements of planets and then look for anecdotal planetary data to support or refute various candidate mathematical models. Science does not work that way. And Noam is no scientist.
Logicians and mathematicians developed the syntax of logic as a formal system analogous to language. Noam has assumed that those pre-existing systems, with their crystalline tidiness, must be the underlying Real form of language in all its messiness. That assumption leads to abstractness and uselessness.
A little tour of the history
Zellig Harris developed the science of language from a little mathematics: set theory and linear algebra. Descriptive linguistics identifies sets of morphemes and words, generalizing their mutual privileges of occurrence (what can co-occur with what). This is called distributional methodology, and in the eclipsing polemics of the 1960s it was derided as 'taxonomic'. Algebraic symbols may represent these sets, e.g. N for nouns and V for verbs, and subsets of them. The sets and subsets are generalizations of distributional data, and the symbols are not abstractions; they are abbreviatory representations of generalizations. A 'sentence form' is a sequence of these, representing the set of sentences that conform to the given sequence of form-classes (sets of words) and constants (individual words), e.g. the sentence form N t be V-en by N (where t is the set of tense morphology). In linear algebra, 'transformation' is one name for a mapping from one (sub)set to another. Harris's transformations were mappings from sentence-form to sentence-form.
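To make this concrete, here is a minimal sketch in Python. It is my illustration, not Harris's formalism: the word lists and the active-to-passive mapping are invented for the example. A sentence form is a sequence of form-class labels and constants, and a transformation is a mapping from one such form to another:

```python
# A sentence form is a sequence of form-class labels and constants; a token
# sequence belongs to the form if each token is a member of the corresponding
# set (or equals the constant). The tiny word lists are invented examples.

N = {"dog", "cat", "mailman"}   # a form class: nouns
V = {"chase", "bite"}           # a form class: verbs
t = {"-s", "-ed"}               # tense morphology

CLASSES = {"N": N, "V": V, "t": t}

ACTIVE  = ["N", "t", "V", "N"]                      # N t V N
PASSIVE = ["N", "t", "be", "V", "-en", "by", "N"]   # N t be V-en by N

def fits(tokens, form):
    """True if the token sequence conforms to the sentence form."""
    return len(tokens) == len(form) and all(
        tok in CLASSES[lab] if lab in CLASSES else tok == lab
        for tok, lab in zip(tokens, form))

def passivize(tokens):
    """A transformation as a mapping between sentence forms:
       N1 t V N2  ->  N2 t be V-en by N1."""
    assert fits(tokens, ACTIVE)
    n1, tns, v, n2 = tokens
    return [n2, tns, "be", v, "-en", "by", n1]

out = passivize(["dog", "-ed", "chase", "cat"])
print(out)                 # ['cat', '-ed', 'be', 'chase', '-en', 'by', 'dog']
print(fits(out, PASSIVE))  # True
```

The point to notice is that every symbol here abbreviates a set established from distributional data; nothing in the machinery is abstract in the logician's sense.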
Such was the world of transformational analysis that Zellig had been teaching since the late 1930s when Noam became his student in 1945. Zellig was then 36, Noam 17. The Harris and Chomsky families were friends, immigrants from Odessa. According to Naomi Sager, Zellig had been Noam's protector (her word): Noam's father had asked Zellig to take Noam under his wing. Zellig recommended the young undergraduate to the logicians Richard Martin and Nelson Goodman and encouraged their interest in him.
During that period, Zellig was publishing a series of 'structural restatements' of descriptive grammars of languages. This was a test of the comprehensiveness and consistency of his systematization of descriptive methods, which had appeared piecemeal in various papers and had been circulated in the book manuscript titled Methods in Descriptive Linguistics. It was also a demonstration that, beneath their relatively superficial differences of notation and style, these descriptions were commensurate with each other, somewhat analogous, perhaps, to prenex normal form in logic. He indicated ramifications and applications of this which are out of scope for this overview.
It is important to note that ENIAC, the first demonstration of an electronic general-purpose digital computing machine, was right there at Penn in 1945. (Analog computers had been around for some time.) Mathematical logic is foundational to the architecture, functioning, and programming of digital computers. The computational metaphor, the assumption that the human brain is analogous to the hardware and software of a digital computer, very quickly pervaded many fields as a background assumption. It enabled the ‘cognitive revolution’ in psychology very much in tandem with the ‘generative revolution’ in linguistics.
Cognitive™ psychology folks say cognition is a matter of symbolic rule manipulation and point down the hall to their friends the Generative™ linguists, who will tell you how language exemplifies this.
Generative™ linguists point back down the hall to their friends the Cognitive™ psychologists for confirmation of the psychological reality of abstract rule systems for language.
As a student of Martin and Goodman, Noam became deeply involved with the grammar of symbolic logic as developed by Carnap and others. Logic is concerned with the form of inferences that preserve truth value irrespective of lexical or semantic content. The syntax of logical systems employs rules that manipulate abstract symbols. These are not derived from generalizations of data, and to work with them no particular referents for the symbols need be stated. Noam has said that he tried to find a way that Zellig’s distributional methods could lead to a formal grammar of this sort, that he was frustrated that he could not, and that he therefore developed transformational grammar. In the various retellings of the Origin Story, the timing of transitions and influences has become obscure. Indications may be seen in the difference between his descriptive statement of the morphophonemics of Modern Hebrew and the radical revisions of it that he made in the early 1950s under the influence of a new friend, the logician Yehoshua Bar-Hillel.
It would not be surprising if Noam took Zellig’s structural restatements as a model of how one does linguistics. Zellig gave Noam data and descriptive analyses which he had done of both historic and modern stages of Hebrew, some of which were published. Noam’s undergraduate and MA study of Hebrew morphophonemics was a restatement of that material. (Zellig had lived extensively in Israel and was fluent in the language; Noam had not lived there, and was not and is not fluent in Hebrew, but says he did some informant work.)
Noam’s phrase-structure grammar (PSG) is a formalization of (a simplified form of) Leonard Bloomfield's descriptive method called immediate constituent (IC) analysis, using 'rewrite rules', an invention of that tragic figure of mathematics, Emil Post; e.g. S → NP VP. IC analysis derives from the assertion in gestalt psychology that people intuitively know how to analyze a complex situation into components, each of which they might further analyze. In IC analysis a sentence is first divided into a noun phrase and a verb phrase, and these are further subdivided step by step until individual words and morphemes are reached. (Zellig (1946) had reformulated the method as word-expansions, which Noam adapted to PSG formalisms only many years later as X-bar notation.) On the analogy of Zellig’s sequences of form-class labels in descriptive linguistics (labels like N and V for classes of words), Noam provided symbols for classes of phrases, e.g. VP for a verb phrase of indeterminate length. The steps of analysis are turned around to become steps of synthesis or 'generation', beginning with the abstract symbol S for 'sentence'. The sequence of steps is represented atemporally in the form of a directed, rooted tree graph with S at the root, abstract phrase labels at the preterminal nodes, and sequences of form-classes at the terminal ends of the branches. (Or alternatively with labeled brackets, but that notational variant is scarcely ever used.)
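A toy generator makes the mechanics concrete. This is my own sketch, not Noam's formulation; the rule set and lexicon are invented for the example:

```python
import random

# A toy phrase-structure grammar: each rule rewrites one symbol as a
# sequence of symbols, starting from the abstract symbol S. The rules
# and lexicon are illustrative inventions, not a serious fragment.

RULES = {
    "S":  [["NP", "VP"]],
    "NP": [["Det", "N"]],
    "VP": [["V", "NP"], ["V"]],
}
LEXICON = {
    "Det": ["the", "a"],
    "N":   ["dog", "cat"],
    "V":   ["chased", "slept"],
}

def generate(symbol="S"):
    """Expand a symbol top-down into a string of terminal words."""
    if symbol in LEXICON:                     # form-class label: pick a word
        return [random.choice(LEXICON[symbol])]
    expansion = random.choice(RULES[symbol])  # nonterminal: rewrite it
    return [w for child in expansion for w in generate(child)]

print(" ".join(generate()))   # e.g. "the dog chased a cat"
```

Notice that S, NP, and VP do their work without any referents: only when the lexicon is reached does any actual word, and hence any lexical content, appear.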
All of this probably is utterly familiar material to anthropologists and philosophers as well as to linguists. The reason for recapitulating is to exemplify the distinction between generalization and abstraction. In the formalization of IC analysis, each phrase is represented by a symbol which (except in the simplest cases just prior to terminal symbols) is defined as a string of other such phrase-symbols, with no principled limit on the depth of the resulting pseudo-hierarchy.
At this stage of development, PSG could still claim that the symbols for classes of phrases correspond to generalizations over language data. But as with the abstract symbols of mathematical logic, no referents are specified for the preterminal symbols. There is no lexical content and no semantic value until the terminal symbols have words associated with them, particular members of the form-classes designated by the form-class labels on the lowest branches of the tree.
Soon, the requirements of the rule systems led to frank abstraction and the retroactive valorization of abstractness for its own sake.
Noam’s transformations were a restatement of Zellig's transformations, formulated as operations on PSG trees rather than as algebraic mappings from subset to subset of sentences. Language data took a subsidiary role: anecdotal examples supporting or contradicting some particular rule formulation. At center stage instead were the trees themselves and the rules specifying operations on trees. As work on this project of restatement proceeded, transformational mappings were found empirically which such rules could not reach without positing abstract symbols with no direct counterpart in actually encountered or plausible strings of words. I will not illustrate the point; the literature speaks for itself.
For example, an enormous amount of disputation has concerned rules that move branches of trees from one node to another, and metarules constraining such movement. In Zellig’s analysis, transposition plays a very small part, and the appearance of movement falls out naturally from reductions coincident with word entry in the construction of a sentence. I have written about that elsewhere.
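For concreteness, here is a minimal sketch of what a movement rule does, as an operation on a tree. The tree shape and the rule (fronting an auxiliary for a yes/no question) are my inventions for illustration, not drawn from any published analysis:

```python
# A transformation as an operation on a PSG tree rather than a mapping
# between sentence forms: detach the Aux branch and reattach it under S.
# A tree is (label, children...); a leaf is just a word string.

declarative = ("S",
               ("NP", "the", "dog"),
               ("VP", ("Aux", "will"), ("V", "bark")))

def front_aux(tree):
    """Move the Aux node from under VP to the front of S."""
    _, np, vp = tree
    aux, *rest = vp[1:]
    assert aux[0] == "Aux"
    return ("S", aux, np, ("VP", *rest))

def leaves(tree):
    """Read the terminal words off the tree, left to right."""
    if isinstance(tree, str):
        return [tree]
    return [w for child in tree[1:] for w in leaves(child)]

print(" ".join(leaves(front_aux(declarative))))   # "will the dog bark"
```

Contrast this with the mapping between sentence forms sketched earlier, which never manipulates a tree at all.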
Abstractness became a cardinal virtue; the distributional basis of all of this was denied and denigrated as 'taxonomic linguistics'; Noam scoffed at such concerns as 'data bound'. Instead of writing descriptions of languages, students turned to the mining of fragmentary descriptions for anecdotal examples and counter-examples respecting the properties of Universal Grammar. This is why there are no (zero) broad-coverage Generative™ grammars of any language.
Results in corpus linguistics have long cast all this in doubt. The work of Maurice Gross and colleagues is not the earliest, though his 1979 "On the failure of Generative Grammar" is certainly explicit (Language 55.4:859-885). Today's Large Language Models (LLMs) have an exclusively distributional basis, an existence proof of Harrisian methodology. A naive computational metaphor is still foundational in Cognitive™ psychology and Generative™ linguistics, but support for this metaphor in neuroscience, never more than presumptive, has dwindled to virtually zero over the past decade or two. The brain does not manipulate symbolic representations of the world by rules and operations analogous to those that specify the syntax of symbolic logic and programming 'languages'.
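The distributional core of that methodology is easy to demonstrate in miniature. In this sketch the toy corpus and the crude similarity measure are my inventions for the purpose: words with similar privileges of occurrence accumulate similar co-occurrence profiles, and distributional word vectors (and, at vast scale, LLM representations) are elaborations of such profiles:

```python
from collections import Counter

# Harrisian distribution in miniature: characterize each word by the
# words that occur near it. The corpus is a toy, invented for the example.

corpus = ("the dog chased the cat . the cat chased the dog . "
          "a dog bit a mailman . a cat bit a dog .").split()

def context_counts(word, window=1):
    """Count the words occurring within `window` positions of `word`."""
    ctx = Counter()
    for i, w in enumerate(corpus):
        if w == word:
            lo, hi = max(0, i - window), min(len(corpus), i + window + 1)
            ctx.update(corpus[lo:i] + corpus[i + 1:hi])
    return ctx

def overlap(a, b):
    """Crude similarity: shared mass between two context profiles."""
    ca, cb = context_counts(a), context_counts(b)
    return sum(min(ca[w], cb[w]) for w in ca)

print(overlap("dog", "cat"), overlap("dog", "chased"))   # 6 2
# 'dog' and 'cat' share far more contexts than 'dog' and 'chased' do:
# the form class emerges from distribution, with no abstract symbols posited.
```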
Universal Grammar rests on a three-legged stool: paucity of data, cognitive deficiency, and abstract complexity. The great complexity and abstractness of Generative™ grammar is unnecessary, and this has been demonstrated in a number of ways. Infants have been shown to have far greater cognitive capacities than were attributed to them in the 1950s and 1960s, when Noam's ideas set. And children are exposed to forms of language ('motherese', etc.) which explicitly support learning regularities and generalizations and their exceptions, and correcting overgeneralizations and other departures from current norms. Statistical learning theory provides a good account of language (and other) learning, as well as of LLMs. The amazing architecture of the cerebellar system is probably instrumental.
No doubt Captains both military and industrial wanted the Star Trek scenario: "Computah! What is the disposition of the enemy in quadrant three, their likely strategy, and our best response?" The generalization of LLMs from large language corpora to nonverbal data of many kinds may afford them this, if these 'AI' systems can be assured to be truthful. But I don't think a case could be made that Noam's long animosity to Zellig's work was meant to delay acquisition of powers we cannot safely manage. Hilary Putnam once started to tell me he had an idea what was behind it, but then cut himself off, saying he did not want to risk losing Noam's friendship. I have speculated too much about this elsewhere, while trying to avoid psychologizing. Who can say? Attribution of intent is a dicey business. And as my grandfather used to say, people are funny monkeys.
Bruce Nevin, December 21, 2023