The origin of life is the origin of heredity as well, says S. Ananthanarayanan.
Over the course of a century, from 1859, when Darwin proposed the theory of evolution, to the discovery of the structure of DNA in the 1950s and the deciphering of the genetic code soon after, the mystery of life and heredity was laid bare. At its core was the code, built of three-letter words, using a four-character alphabet, which helped rebuild millions of proteins, to enable living things to do what sets them apart – to reproduce.
The code is a mathematically elegant construction – it is precise, economical and error-protected, an end product more efficient than any variant that we can suggest. It is universal and unchanged, from the simplest, single-celled organism to the greatest of mammals. By what stages could this code have arisen? Masayori Inouye, Risa Takino, Yojiro Ishida and Keiko Inouye, from the Rutgers-Robert Wood Johnson Medical School, New Jersey, writing in the journal Proceedings of the National Academy of Sciences (PNAS), propose a new look at the question.
In the same way that the most inspiring concept of an architect cannot be realised unless she prepares a blueprint, an organism, no matter how efficient, cannot have a second generation unless it contains within itself the blueprint of its own construction. Living things are essentially their cells and the proteins that those cells produce, proteins which in turn control the way other cells of the organism behave. The cells of all living things hence contain a blueprint, in the form of a long (very long – billions of units long) ticker-tape that carries the code for the proteins. The DNA molecule is the tape, the codes for individual proteins are stretches of DNA, called the genes, and the genes are built up of 3-letter words from an alphabet of 4 kinds of chemical groups, the letters, called the bases.
Now, the structure of proteins has been optimised to consist of a chain, often a very long chain, of components from a set of just 20 different units, called amino acids. Within the DNA, each group of 3 letters, formed out of the 4 letters that are available, is called a codon and specifies one amino acid in the chain. The BOX on this page shows how many 3-letter words we can form with an alphabet of 4 letters, and it works out to be 64. If the word had only 2 letters, there would be only 16 ways that it could be formed, which is not enough to describe 20 amino acids. We hence need at least 3 letters in the word, and if 64 is a lot more than 20, well, 3 codons have special uses, but the remaining 61 provide alternate forms for most of the amino acids, as an insurance against errors when the code in the DNA is transcribed!
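The counting argument can also be checked by simply listing the words. The short Python sketch below is only an illustration: the function name `count_words` is made up for this example, and the four letters are written as U, C, A and G.

```python
from itertools import product

BASES = "UCAG"  # the four letters of the alphabet


def count_words(length: int) -> int:
    """Count the words of a given length by listing every one of them."""
    return sum(1 for _ in product(BASES, repeat=length))


two_letter = count_words(2)    # 4 x 4
three_letter = count_words(3)  # 4 x 4 x 4

print(two_letter, three_letter)  # 16 64
print(two_letter >= 20)          # False: 2-letter words cannot cover 20 amino acids
print(three_letter >= 20)        # True: 3-letter words can
```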
How many words can we create?
With an alphabet of 4 letters at our disposal, we can choose the first of the three letters in any of 4 ways. For each choice that we make, the second letter can again be chosen in 4 ways. There are hence 4 x 4 = 16 ways to choose the first two letters. Now, for the third letter, again, we have 4 choices. The total number of 3-letter words we can form is thus 4 x 4 x 4 = 64.
That living organisms are able to implement this mathematically elegant system, using just chemical combinations within the organisms’ cells, shows the great power of the process of evolution and raises the question of how it may have come about. One theory is that the first amino acids were born from the elements in the stormy and energetic environment of the early earth. Amino acids that have been created in laboratory simulations, and traces found in meteorites, suggest that there may have been ten amino acids at the start of life, and that these grew into ten more, stabilising at the efficient number of twenty. The work done by the authors of the paper, however, finds that there may have been seven amino acids to start with, and more than one route for their development.
The 4 letters, or chemical groups, which form the codons are: U for uracil, C for cytosine, A for adenine and G for guanine. The picture shows how the 20 amino acids (and 3 ‘stop’ codons to separate the genes) are formed by combining U, C, A and G. Significantly, we see that two amino acids are encoded by only one codon, nine are coded by two codons, just one is coded by three codons, five are coded by four codons and three are coded by six codons. The number of redundant forms, however, does not generally correspond to the abundance of the amino acids, the paper says. For example, among the three amino acids coded by six codons (shown in green), arginine and serine are not the most frequently found. It is hence likely that the different forms came about by different processes.
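The tally of codons per amino acid can be reproduced from the standard genetic code table. The sketch below writes the table out compactly, with one-letter amino acid symbols and `*` for a stop codon. The variable names are chosen for this illustration and do not come from the paper.

```python
from collections import Counter
from itertools import product

BASES = "UCAG"
# Standard genetic code, listed in the order UUU, UUC, UUA, UUG, UCU, ...
# One-letter amino acid symbols; '*' marks a stop codon.
AMINO = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# How many codons does each amino acid have? (stop codons left out)
codons_per_aa = Counter(aa for aa in CODE.values() if aa != "*")

# How many amino acids have 1, 2, 3, 4 or 6 codons?
degeneracy = Counter(codons_per_aa.values())

print(len(codons_per_aa))            # 20 amino acids
print(sum(codons_per_aa.values()))   # 61 codons that specify an amino acid
print(sorted(degeneracy.items()))    # [(1, 2), (2, 9), (3, 1), (4, 5), (6, 3)]
```

The three amino acids with six codons are L (leucine), S (serine) and R (arginine).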
In the case of leucine and arginine, the codons share bases in such a way that one codon can transform to another with a change of only one base. This, however, is not true in the case of serine. Here, we have four codons that start with ‘UC’ and two more that start with ‘AG’. It would hence take a change of two bases for a codon in one group to reach a form in the other. Further, the paper notes, single base changes, in the first or second place, lead to six different amino acids that are unrelated to serine. The authors hence suggest that the origin of the two forms which start with ‘AG’ was different from the origin of the forms that start with ‘UC’.
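The difference between serine and the other two six-codon amino acids can be seen by grouping each amino acid's codons so that codons in the same group are linked by a chain of single-base changes. The sketch below is a minimal illustration using the standard genetic code table. The helper names `differ_by_one` and `groups` are invented for this example and are not the authors' method.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code in the order UUU, UUC, UUA, UUG, UCU, ... ('*' = stop)
AMINO = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def differ_by_one(a: str, b: str) -> bool:
    """True if two codons differ in exactly one of their three bases."""
    return sum(x != y for x, y in zip(a, b)) == 1


def groups(amino_acid: str) -> list:
    """Split the codons of one amino acid into groups in which every codon
    can reach every other through single-base changes that stay in the group."""
    remaining = {c for c, aa in CODE.items() if aa == amino_acid}
    result = []
    while remaining:
        stack = [remaining.pop()]
        group = set(stack)
        while stack:
            current = stack.pop()
            linked = {c for c in remaining if differ_by_one(current, c)}
            remaining -= linked
            group |= linked
            stack.extend(linked)
        result.append(sorted(group))
    return sorted(result)


for name, symbol in [("leucine", "L"), ("arginine", "R"), ("serine", "S")]:
    print(name, len(groups(symbol)), groups(symbol))
```

Leucine and arginine each come out as a single group of six codons, while serine splits into two groups, the four ‘UC’ codons and the two ‘AG’ codons.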
To seek evidence for this suggestion, the authors analyse 4,225 protein-coding genes of E. coli, a common intestinal bacterium. What they find is that although serine has, in theory, two ‘AG’ codons to four ‘UC’ codons, the occurrence is not in the ratio of 1:2, but is as high as 3:4. The ‘AG’ codons are thus used disproportionately more often, and again, within the ‘AG’ codons, it is more often the ‘AGC’ codon. And then, there are differences in where the two forms of serine occur or are used.
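To see what the two ratios mean as shares of all serine codons, the arithmetic can be worked out exactly. The sketch below takes the 1:2 and 3:4 figures quoted above at face value, and the function name `ag_share` is only illustrative.

```python
from fractions import Fraction


def ag_share(ag: int, uc: int) -> Fraction:
    """Share of serine codons that are 'AG' codons, given an AG:UC ratio."""
    return Fraction(ag, ag + uc)


expected = ag_share(2, 4)  # two 'AG' codons to four 'UC' codons, i.e. 1:2
observed = ag_share(3, 4)  # the 3:4 ratio reported for E. coli

print(expected)             # 1/3
print(observed)             # 3/7
print(observed / expected)  # 9/7
```

On these figures, the ‘AG’ codons make up about 43 per cent of serine codons rather than the 33 per cent that equal usage would give, or 9/7 times the expected share.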
This fits in, the paper says, with the idea, which further analysis brings forward, that ‘AGC’ was evolutionarily one of the most primitive codons for serine, itself having descended from ‘GGC’, a codon for glycine. The analysis leads to the hypothesis that the codon for the first amino acid had the form ‘GG’ and that from this the first seven amino acids arose. The remaining thirteen arose from these seven, but the alternate form, ‘AG’, of serine came through an independent route.
More work on the genomes of other bacteria and other life forms, and the roles that the two forms of serine play, could further illuminate the path by which they came to be, the paper says.
Do respond to: response@simplescience.in