Classic Papers in Genetics (1961)
Edited by James A. Peters
Heredity in Populations and Pure Lines
A Contribution to the Solution of the Outstanding Questions in Selection
Translated from Ueber Erblichkeit in Populationen und in reinen Linien,
published by Gustav Fischer, Jena, 1903.


I have translated here only the final summary and discussion from Johannsen's long paper on pure lines, which was written in German. This thorough and meticulous investigation of the true significance of selection was a bombshell to evolutionary thought. The efficacy of selection in the production of new species had been one of the mainstays of Darwin's theory of evolution. Johannsen's studies demonstrated conclusively that selection could not extend the limits of previously established variability. This fact became important in arguments against Darwinism, and led to a period when selection was discredited as evolutionarily significant. The mutation theory became the new basis for explanation of evolutionary phenomena.

As has often happened in biology, the final solution of the problem involved a reconciliation of the two viewpoints. Mutation (as a source of variants) and selection (as a method of elimination of some but not all variants) provide the modern basis for explanation of the process of evolution. We owe to Johannsen our modern viewpoint of selection as a primarily passive process, which eliminates but does not produce variations.

Although Johannsen uses the German word "Typus" throughout his paper with reference to his pure lines, I have substituted his own term, "genotype," invented at a later date. To Johannsen goes the credit as well for inventing the word "gene." It should be noted that he wrestles with the various names that had been proposed for the hereditary particles in this paper, on p. 26, but does not at this time suggest the term gene.


ALL THAT WILL BE DISCUSSED HERE gives at one and the same time a complete confirmation and a total elucidation of Galton's well-known law of regression, which concerns the relationship between parent and offspring. Other regression relationships do not concern us here.

Insofar as my research material is concerned, it agrees very well with Galton's Law. This law states that individuals differing from the average character of the population produce offspring which, on the average, differ to a lesser degree but in the same direction from the average as their parents. Selection within a population causes a greater or lesser shift in the direction of the selection of the average for the characteristics around which the individuals concerned are fluctuating.

While as a consequence of this I am not able to continue to regard the population as completely uniform, nevertheless my material can be broken up into "pure lines." It has been demonstrated that in all cases within the pure lines the retrogression mentioned above has been completed: Selection within the pure lines has produced no new shift in genotype.

The shift in the average for a characteristic, which selection in a population can usually produce, is thus an indication that the total population—at least in my material—consists of different "lines" whose genotypes can be more or less differentiated. In the course of ordinary selection a population would become impure; this result is a consequence of the incomplete isolation of these lines, whose genotypes cause deviation of the average character of the population in their directions.

The typical, well-known results of selection, that is, step-wise progress in the direction of the selection in the course of each generation, therefore depend upon step-wise progression in each generation of the differing lines concerned. It is now easily understood that the action of selection cannot go beyond the known limits: it must stop when the purification, or, practically speaking, the isolation of the most strongly divergent pure line is complete. In this connection it must be pointed out that one can never ascertain with certainty the existence of only a single genotype in a sample solely on the basis of concordance between the table or curve of variation shown by that sample and the numerical proportions of the binomial formula. The variation curve of individuals representing a racially pure population in the ordinary sense, frequently, indeed perhaps in most instances, can be shown to be the result of numerous genotypes representing the various pure lines of the population. The average value thus does not always have the significance of a true genotype. A great deficiency of a purely statistical approach is obvious in this regard.
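The reasoning of this paragraph can be followed in a small numerical sketch. It is an illustration under stated assumptions, not a reconstruction of the method used in the paper: the four genotype values (40, 45, 50, 55) and the common standard deviation (8) are invented, each line is assumed to vary normally around its genotype, and the offspring of a selected individual are assumed to be distributed around the genotype of its line whatever the parent's own value. Selection then acts only by changing the shares of the lines in the population.

```python
import math


def upper_tail(threshold, mean, sd):
    """P(X > threshold) for X ~ Normal(mean, sd)."""
    return 0.5 * math.erfc((threshold - mean) / (sd * math.sqrt(2)))


def population_mean(shares, line_means):
    """Average character of a population made up of pure lines."""
    return sum(p * g for p, g in zip(shares, line_means))


def select_plus_variants(shares, line_means, sd):
    """Keep individuals above the population mean; return the new line shares."""
    cutoff = population_mean(shares, line_means)
    kept = [p * upper_tail(cutoff, g, sd) for p, g in zip(shares, line_means)]
    total = sum(kept)
    return [k / total for k in kept]


def run_selection(line_means, sd, generations):
    """Population mean at the start and after each generation of selection."""
    shares = [1 / len(line_means)] * len(line_means)
    means = [population_mean(shares, line_means)]
    for _ in range(generations):
        shares = select_plus_variants(shares, line_means, sd)
        means.append(population_mean(shares, line_means))
    return means


if __name__ == "__main__":
    # Hypothetical genotype values and spread, chosen only for illustration.
    for gen, mean in enumerate(run_selection([40, 45, 50, 55], 8, 12)):
        print(f"generation {gen:2d}: population mean {mean:.2f}")
```

Under these assumptions the population mean moves step by step toward the value of the most divergent line and stops there. With a single line, `run_selection([50], 8, 5)` returns the same mean in every generation.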

For this reason, I have attempted throughout this paper to distinguish sharply between the concept of the mean (average character, average values, and so on) and the concept of the genotype. The confusion of these two thoroughly different concepts has only too frequently caused misunderstanding and erroneous inferences, perhaps not only in the field of heredity. It must be conceded that it can often be extremely difficult to distinguish between these two concepts without detailed analysis; and in pure lines the two concepts frequently cover the same area. The numerical expression of a genotype is frequently, but by no means always, an average value.

In the case of morphological characters—at least with the entire series of those whose value in systematic investigations has been generally recognized—the distinction between the different genotypes is such that the single individual can usually be recognized, in spite of its variations, as belonging to one or another of the most narrow systematic categories (for example, the "subspecies" of Jordan).

1 Raunkiaer, C., "Kimdannelse uden Befrugtning hos Maelkebotte" (Botan. Tidsskrift, Bd. XXV, Kopenhagen, 1903), pp. 109-119.

These morphological types can usually be organized into precise variation series only with great difficulty, for a mixture of individuals of different genotypes might be combined with a series of individuals which belong to a unique genotype. A mixture such as the Oenothera forms of de Vries or Raunkiaer's Taraxacum "Geschlechter"1 gives with regard to the essential morphological characteristics a different picture than a pure culture of a single form.

With regard to all sorts of characters of a more physiological sort (the non-botanical characters of Hj. Nilsson), such as, for example, most height and other proportions, biochemical properties, reliable numerical relationships, and others, we have a different situation. The distinct, actually existent genotypes, easily demonstrated through isolated cultivation, show only quantitative differences, so that the variation curves of the different genotypes overlap, and one has the transgressive curves of Hugo de Vries. A mixture of individuals, which belong to genotypes clearly distinct with respect to one of these characteristics (compare small and large, as well as narrow and broad beans) can very easily form so continuous a variation series that it is not possible to recognize directly the distinctions between genotypes, and the average value will be erroneously regarded as that of a single genotype. In cases such as these it is impossible to distinguish the genotype to which a single individual belongs. Table 1 is a good illustration of this point.
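How overlapping curves can hide distinct genotypes may be illustrated with a short sketch. The figures are hypothetical and the model is an assumption: two genotypes mixed in equal parts, each varying normally with the same standard deviation. For such a mixture the combined curve has a single peak whenever the two genotype values lie no more than two standard deviations apart. The function names are introduced here for illustration only.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of Normal(mean, sd) at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2 * math.pi))


def mixture_pdf(x, means, sd):
    """Density of an equal-parts mixture of genotypes with a shared sd."""
    return sum(normal_pdf(x, m, sd) for m in means) / len(means)


def count_peaks(means, sd, lo, hi, steps=2000):
    """Count local maxima of the mixture density on an evenly spaced grid."""
    xs = [lo + (hi - lo) * i / steps for i in range(steps + 1)]
    ys = [mixture_pdf(x, means, sd) for x in xs]
    return sum(1 for i in range(1, steps) if ys[i - 1] < ys[i] > ys[i + 1])


def share_from_first(x, means, sd):
    """Chance that an individual of value x belongs to the first genotype."""
    weights = [normal_pdf(x, m, sd) for m in means]
    return weights[0] / sum(weights)


if __name__ == "__main__":
    # Hypothetical genotype values: close together, then far apart.
    print(count_peaks([45, 55], 8, 0, 100))       # overlapping curves
    print(count_peaks([30, 70], 8, 0, 100))       # well separated curves
    print(share_from_first(50, [45, 55], 8))      # an individual in the overlap
```

With genotype values 45 and 55 and a standard deviation of 8, the mixture shows one peak, and an individual of value 50 is equally likely to come from either genotype, so neither the curve nor the individual reveals that two genotypes are present.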

It is for these reasons, which have been more or less clearly recognized or just sensed, that the study of the characters I might call "truly" morphological, described above, has provided the center of gravity for systematics. The more physiological characters have entered into the sphere of interest of the systematist only in recent years, particularly with regard to lower forms. These reasons also explain why the students of mutation have found the mainstay of their researches in true morphological characters. On the other hand, these characteristics, which essentially determine the entire habitus of the plant, either cannot or can only partially be expressed in numerical measures, and this almost invariably reduces their value to within the limits of the exact methods of measurements and calculation.

Biometrical research, that is, the investigation of the laws of variation and heredity, has thus included primarily the more physiological characters or in general the characters Bateson called "meristic," that is, those which can be expressed clearly in numbers, such as size or weight. And here, where comparison of individual with individual does not enable one to distinguish differences in genotype from the manifestations of a fluctuating variability, is the stronghold of the Galton-Pearson concept: Here one must, if one fails to consider pure lines, necessarily come to the conclusion that selection of strongly variant individuals (either plus or minus variants) can effect an actual change in the genotype under consideration. It is obvious that this concept, which has been completely acceptable up to this time and which the biometricians Weldon and Pearson have supported, must hinder the acceptance of mutations as something other than and as important as fluctuating variation. With all of the accumulated statistical knowledge of heredity in populations the acceptance of the mutation theory was perhaps felt to be not necessary for biology. I say "perhaps" here to meet to a certain extent the objections of biometricians. As for myself, the magnificent experiments of Hugo de Vries have proven the existence of mutations beyond the shadow of a doubt.

Weight Groups of Parent Seeds (1901) in Milligrams; Variation in Offspring, Divided into Classes by Centigrams

Parent group    10-15 15-20 20-25 25-30 30-35 35-40 40-45 45-50 50-55 55-60 60-65 65-70 70-75 75-80 80-85 85-90  Total  Standard ±
150-250             -     -     1     3    12    29    61    38    25    11     -     -     -     -     -     -    180        69.6
250-350             -     2    13    37    58   133   189   195   115    71    20     2     -     -     -     -    835        87.0
350-450             5     6    11    36   139   278   498   584   372   213    69    20     4     3     -     -   2238        85.1
450-550             -     -     4    20    37   101   204   287   234   120    76    34    17     3     1     -   1138        91.8
550-650             -     -     1     9    14    51    79   103   127   102    66    34    12     6     5     -    609       102.5
650-750             -     -     -     2     3    16    37    71   104   105    75    45    19    12     3     2    494        97.1
Total Material      5     8    30   107   263   608  1068  1278   977   622   306   135    52    24     9     2   5494        95.3

TABLE 1. The relationship of the weight of offspring to the weight of the parent seed. The seeds produced in 1901 were divided into weight classes and then planted. The seeds produced by the members of each weight class were weighed and classified, with the results shown under "Variation in Offspring."
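The figures of Table 1 can be checked with a little arithmetic. The sketch below is not part of the original paper and rests on two assumptions: the seventeen centigram values in the table head are read as the boundaries of sixteen weight classes, and the empty cells (entered as zeros) are placed so that every column adds up to the "Total Material" row. Means and standard deviations are computed from class midpoints, so they can only approximate values calculated from the individual weighings.

```python
import math

# Lower bounds of the offspring weight classes in centigrams (10-15, ..., 85-90).
CLASS_LOWER = list(range(10, 90, 5))
CLASS_WIDTH = 5

# Offspring counts per parent weight group (milligrams), copied from Table 1.
# Zeros stand for empty cells; their positions are an assumption, chosen so
# that every column adds up to the "Total Material" row.
TABLE_1 = {
    "150-250": [0, 0, 1, 3, 12, 29, 61, 38, 25, 11, 0, 0, 0, 0, 0, 0],
    "250-350": [0, 2, 13, 37, 58, 133, 189, 195, 115, 71, 20, 2, 0, 0, 0, 0],
    "350-450": [5, 6, 11, 36, 139, 278, 498, 584, 372, 213, 69, 20, 4, 3, 0, 0],
    "450-550": [0, 0, 4, 20, 37, 101, 204, 287, 234, 120, 76, 34, 17, 3, 1, 0],
    "550-650": [0, 0, 1, 9, 14, 51, 79, 103, 127, 102, 66, 34, 12, 6, 5, 0],
    "650-750": [0, 0, 0, 2, 3, 16, 37, 71, 104, 105, 75, 45, 19, 12, 3, 2],
}
TOTAL_ROW = [5, 8, 30, 107, 263, 608, 1068, 1278, 977, 622, 306, 135, 52, 24, 9, 2]


def column_totals(table):
    """Sum each offspring class over all parent groups."""
    return [sum(col) for col in zip(*table.values())]


def grouped_mean_sd(counts):
    """Mean and standard deviation (centigrams) computed from class midpoints."""
    mids = [lo + CLASS_WIDTH / 2 for lo in CLASS_LOWER]
    n = sum(counts)
    mean = sum(c * m for c, m in zip(counts, mids)) / n
    var = sum(c * (m - mean) ** 2 for c, m in zip(counts, mids)) / n
    return mean, math.sqrt(var)


if __name__ == "__main__":
    assert column_totals(TABLE_1) == TOTAL_ROW
    for group, counts in TABLE_1.items():
        mean, sd = grouped_mean_sd(counts)
        print(f"parents {group} mg: offspring mean {mean:.1f} cg, sd {sd:.1f} cg")
    mean, sd = grouped_mean_sd(TOTAL_ROW)
    print(f"total material: mean {mean:.1f} cg, sd {sd * 10:.1f} mg")
```

On this reading the standard deviation of the total material comes out at about 95.3 when expressed in milligrams, in agreement with the last column of the table. The offspring means rise with the weight of the parent seed, from roughly 44 to roughly 56 centigrams, while the parent groups range from 15-25 to 65-75 centigrams: the offspring deviate in the same direction as their parents, but to a lesser degree.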

2 Die Mutationstheorie, vol. 1, p. 97.

It appears to be obvious from the research results I have presented here that the basis of the Galton-Pearson Law, concerning the relationship between parents and offspring, is something other than that which has been taken for granted up to now. The individual peculiarities of the parents, grandparents, or any other ancestor, have, insofar as my researches are concerned, no influence on the average characteristics of the offspring. It is the genotype of the line working in intimate conjunction with the external environment of a specific locality at a particular time that determines the average characteristics of an individual. The "line" is accordingly "completely constant and highly variable," as de Vries has so clearly shown in a similar situation, although apparently in a paradoxical fashion.2

At the same time, it must not be implied that the pure line will be absolutely constant.

First there is the possibility that selection of fluctuating variants through very many generations can eventually shift the genotype of a line. This has not been positively demonstrated—the statements of biometricians apply, as has been frequently pointed out, to populations which are not but could be divided into pure lines. The burden of proof for this possibility lies upon those who would wish to verify the efficacy of this kind of selection.

Second we must consider cross breeding—to take part in which the pure lines must forfeit their pure condition! The whole hybrid question is not, however, part of our discussion.

Third we come to mutations, the possibility of irregular changes in genotype. To define them would be premature in the greatest degree. Their existence in a greater diversity of organisms must first be substantiated. That they do so occur cannot be doubted, in my opinion; I hope to present specific, positive proof in a later publication. No more will be said here than that a mutation in a given direction cannot be specifically identified strictly on the basis of offspring of individuals which deviate irregularly in that direction.

At this point I must refer to the ticklish problem of what might explain the statement of de Vries that "minus" variations are frequently observed to be predominant in newly discovered genotypes, an event that not without reason has aroused the skepticism of biometricians. It is to be hoped that the studies presented here will shed some light on this problem, which perhaps only appears to destroy the boundary between fluctuating variation and mutation.

(Postscript: In the final part of his Mutationstheorie—which appeared during the editing of this work—de Vries (l.c., pp. 503-504) has shown in a most ingenious way how in most cases mutations are first expressed. Therein lies an outstandingly important instance for the explanation of the relationship just discussed.)

Hugo de Vries has included in his Mutationstheorie (vol. 1, p. 368 ff.) a separate chapter concerning "Nourishment and Natural Selection" in which the consequences of a rich or scanty nourishment by the maternal plant are discussed. I have no doubt that actual or imaginary differences in nourishment occurring simultaneously with the presence or absence of selection would account for de Vries' example. Also, the phase of ontogeny which de Vries called the "sensitive period" is of particular interest. I could cite a similar phenomenon in my research material only with the greatest difficulty. It should be understood that it is not my intention to try to explain through the "principle of pure lines" and nothing else all the differences in characteristics which selection in conjunction with extreme or experimentally designed habitats might produce. In this regard, which is of strong interest to Neo-Lamarckians, there is still very much research to be done, and certainly with the use of genuinely pure lines as a research material.

My primary purpose is to shed some light on the Galtonian regression between ancestors and descendants, and I believe that my material, which evidently has its natural peculiarities analogous to those of Galton, has its value as a basis for analysis of the Galtonian laws applying to populations. My statements conflict neither with Galton's statements nor with those of de Vries.

If my investigations are sound, and their significance is grasped beyond the special case here discussed, the general results of this work would form a not unimportant support for the modern concepts of Bateson and de Vries on the great significance of "discontinuous" variations, or "mutations," for the theory of evolution. For a selection in cases such as mine is effective only in so far as it selects out representatives of an already existent genotype. These genotypes would not be successively originated through the retention of those individuals which vary in the desired direction; they would merely be found and isolated.

The knowledge that has been gained from studies on pure lines, combined with a knowledge of hybridization, must serve as the starting point in the case of studies on heredity within populations in which pure lines cannot be completely isolated as a consequence of the necessity of constant cross-fertilization or hybridization. This knowledge is, as has been pointed out earlier, in complete agreement with the basic ideas in the great work of de Vries; as has been seen, my concepts have been arrived at via a somewhat different path than that followed by de Vries, and, it is also important to note, are based on a different kind of information.

In addition the important question of correlative variation takes on a somewhat changed character depending upon whether one works with pure lines or with populations. In the latter case a given "ratio of correlation" (Pearson's term) will not necessarily represent a strongly legitimate relationship, as I have sought to demonstrate earlier. An indicated correlative relationship is much more significant within a pure line. My summation table speaks very well for this concept, in that it was not possible to change through selection within pure lines the correlation between length and width of the beans, while it was simple to isolate truly different genotypes, as for example narrow and broad forms, from the original population, which appeared to be entirely homogeneous.

Again, we have to reckon with the possibility of mutation; for thereby even the strongest correlative relationship could be destroyed. I do not wish to take up this question at this time; in a later publication I hope to shed more light on it, using the principle of pure lines as the basis for the research.

I would be sorry indeed if the reader of this work would come to feel that the value of the significant work of Galton, Pearson, and other biometrical research workers were to be placed in doubt. I would not have the audacity to criticize the treatment which Pearson in particular has given to the question of the ancestral influence within a specific population. I do think, however, that the principle of pure lines in the hands of a man such as Pearson can carry biometric studies much farther along than his studies of populations. Obviously the relationships studied by Pearson have great scientific significance and they have considerable practical value as well—but they are not suitable to illuminate completely the fundamental laws of heredity.

3 Galton's theory is known to me from his original paper in Revue Scientifique, vol. 10, 1876, p. 198 ("Théorie de l'hérédité").
4 The position of biometricians on "Weismannism" has been clearly evaluated by Pearson in his characterization of this movement in "Socialism and Natural Selection" (Fortnightly Review, July, 1894. Reprinted in Pearson's Chances of Death, vol. 1, 1897, p. 104). I do not plan to go further into this question at this time.
5 Weismann, Vorträge über die Descendenztheorie, 1902, p. 421.
6 de Vries, "Intracelluläre Pangenesis," Jena, 1889.

And what particularly affects Galton's research, in my estimation, is that the results presented here support in a beautiful way the basic ideas of Galton's "Stirp" theory,3 which was already worked out in 1876. This law includes almost all that is of actual value in the more recent Weismann theory on the "Continuity of the Germplasm." That the speculation of Weismann4 could overshadow the more simply put but no less ingenious and quite original idea of Galton's is perhaps due to some extent to Galton himself, for he has not seen fit in his more recent publications to adhere to his Stirp theory in the light of research progress. The Stirp theory does not correlate too well with Galton's Law of Regression, it is true, but it could scarcely be better supported or illustrated than by the results which I have described: a usually complete regression to the genotype of a pure line seems to me the most beautiful evidence for a slightly modified Stirp concept. It is true that Galton's Stirp concept cannot be maintained unchanged. Although Weismann5 very recently regarded Galton as the "voice" of cellular limitation through "Determinants"—or however one might name these theoretical hereditary corpuscles, de Vries deserves the great credit for having recognized the unitary nature of hereditary particles, which he called "pangenes"—a concept he first published in 18896 and further advanced in the "Mutationstheorie." It seems to me that the Galton-de Vries theory is the only truly useful theory of heredity.

7 de Vries has published a special case concerning heredity in a pure "bimodal" line in his Mutationstheorie (vol. 2, p. 509), based on a short communication of mine.

Should the present publication be successful in bringing the principle of pure lines recognition as an absolutely necessary principle in truly intensive research in the study of heredity, then its highest purpose would be achieved. Later publications will attempt to illuminate the activity of lines which vary polymodally. I have investigated only unimodal variation in this paper,7 in order to present my concept in its simplest instance.

The train of thought which underlies this investigation is expressed in its simplicity most clearly by the often cited words of Goethe:

"Dich im Unendlichen zu finden,
Musst unterscheiden und dann verbinden."

"Before the infinite can be thine
You must first break it down and then combine."

Vilmorin has emphasized the differentiation of the parts, Galton has demonstrated the legitimate basis for recombination; I have tried here to combine the points of view for which these two ingenious investigators are honored.