This is the second lecture on DNA. In the last lecture, we talked about how many bases there are. We saw that the cell and its DNA contain many more bases than just the four standard bases. In this lecture, I want to talk about pairing. How is pairing controlled? And also, how does DNA replication achieve the amazingly low error rate of ten to the minus ten? So, usually when we talk about base pairs, they're presented in this way, where T really fits only A and G fits only C. The shapes simply don't fit any other way. And that would then be what matters, which would be translated to T and A having simply two hydrogen bonds and G and C having three. And that's what allows the DNA to replicate with such high fidelity. DNA replication is really an amazing mechanism. Of course, no one has managed to videotape how DNA replicates, but it would look about like this - not with colors. And I will not talk about this whole machinery; I will just talk a little about this DNA polymerase, which allows DNA replication to achieve a very low error rate. Let's look at the reaction. So, usually in any reaction we have reactants and products. And we're interested in the rate of going from reactants to products, or in both rates, going forward and backward. The forward rate is usually calculated with the Arrhenius equation, which makes it proportional to e to the minus EA - the activation energy - divided by kT. So, the higher the activation energy, the slower the reaction will be. And activation energy is actually often what enzymes affect and also, in this case, what DNA polymerase affects. The reverse reaction will often include the same activation energy - so you will have to go over the same peak. But sometimes, if reactants and products have different free energies, then there's also Delta G. So, the activation energy going backwards includes another term, Delta G, and the rate going back could be slower or faster.
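Written out, the forward and backward rates might look as follows. This is only a sketch: it assumes both directions share the same prefactor A, which the lecture does not state, and it takes Delta G as the free energy of the products minus that of the reactants, so that the backward barrier is EA minus Delta G.

```latex
\begin{align}
k_{\text{forward}}  &= A \, e^{-E_A / kT} \\
k_{\text{backward}} &= A \, e^{-(E_A - \Delta G) / kT} \\
\frac{k_{\text{forward}}}{k_{\text{backward}}} &= e^{-\Delta G / kT}
\end{align}
```

With this sign convention, a negative Delta G (products lower in free energy) raises the backward barrier and favors the forward direction.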
The ratio between the two then gives us e to the minus Delta G over kT. And when we have a case where we are talking about the right base versus the wrong base, we're looking at two different reactions. And then, what we're interested in is the Delta Delta G, the difference between the two Delta Gs, which gives us the ratio of the products of the right and the wrong type. One can also talk about the difference in the activation energy, which is important in DNA polymerase and which we'll talk about later. But at first, we'll start with this Delta Delta G. So, does Delta Delta G, the difference in binding strength - the difference in the hydrogen bonds - explain how well DNA picks the right base? We start with a certain ratio of right to wrong bases, and then we go through a reaction where the two differ in Delta G by Delta Delta G. Then, the initial ratio is amplified by e to the power of Delta Delta G over kT. Now, when we look at the bases, the base pairing energy actually depends on the neighboring bases. And here I just show you the Delta Gs for when the previous base pair was CG. So, when the previous base pair is CG - and here you see in blue the Watson-Crick base pairs, and in red, the other ones - you see that the biggest difference between right and wrong would be about 2.5 kilocalories per mole. And if we want the amplified ratio that we get out of such a difference in Delta G, it's very easy to calculate. We simply take the 2.5, multiply it by about 4000 to convert it to joules per mole, then divide by Avogadro's number, k and T. We can put k and Avogadro's number together to get the gas constant, which is 8.31 joules per mole per kelvin. We take e to the power of the result and, in the end, we get about 60. So, the biggest difference that we would get just out of the binding energies of base pairs would be a sixty-fold difference, which is very far from ten to the ten. How can we get a better...
effective rate. One proposal that was brought forth is something called "kinetic proofreading," whereby, by having non-reversible reactions, we can actually enhance this ratio. Let's see how it works. So, we said that if we start with a certain ratio of right to wrong bases and do a reaction where there is a Delta Delta G, the ratio is amplified by e to the power of Delta Delta G over kT. If we add several more reactions into this, where we still have the same Delta Delta G or a smaller one, then at equilibrium the amplification is still the same - nothing changes. As long as everything is reversible, the ratios will simply be set by the Delta Gs. However, if we add an irreversible reaction here, then something different happens. Now, the first reaction will have an amplification of e to the Delta Delta G over kT. The second reaction by itself will also have such an amplification, so that overall we get an amplification of about the square of e to the power of Delta Delta G over kT. So, we took the ratio that we have and put it to the power of two. So, from sixty you would get something like 3600, still far from ten to the ten but a bit closer. In the case of DNA polymerase, this is probably not what happens, because we don't really know of an irreversible reaction that uses energy that goes on when we attach another base. Instead, it seems that, in the case of DNA polymerase, what's important is not really the difference in the free energies but the shape of the bases. So, let's look at what DNA polymerase does. This is a schematic representation of DNA polymerase. Usually, it's thought of as something like a hand with fingers and a thumb, and at the bottom you see that some DNA polymerases also have the exonuclease activity, shown in gray, that we'll talk about in a second. So, a dNTP arrives - a dNTP is a single DNA base that still has the three phosphates attached. These three phosphates carry the energy that will be used to attach this base.
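The two estimates from the binding-energy argument, the roughly sixty-fold amplification from a 2.5 kilocalorie per mole difference and its square under kinetic proofreading, can be checked with a few lines of Python. This is a minimal sketch, not part of the lecture: the temperature of 310 K and the function name `amplification` are assumptions for illustration, and the exact conversion of 4184 joules per kilocalorie is used instead of the rounded 4000.

```python
import math

R = 8.314           # gas constant in J/(mol*K), i.e. k times Avogadro's number
J_PER_KCAL = 4184   # joules per kilocalorie
T = 310.0           # ASSUMED temperature in kelvin (the lecture gives no value)


def amplification(ddg_kcal_per_mol, temperature=T):
    """Return exp(ddG / RT), the factor by which a free energy difference
    amplifies the ratio of right to wrong bases (hypothetical helper)."""
    ddg_joule_per_mol = ddg_kcal_per_mol * J_PER_KCAL
    return math.exp(ddg_joule_per_mol / (R * temperature))


# One reversible step with the largest difference quoted in the lecture.
single_step = amplification(2.5)

# Two steps separated by an irreversible reaction: the factor is squared.
proofread = single_step ** 2

print(f"single step: about {single_step:.0f}-fold")
print(f"with one proofreading step: about {proofread:.0f}-fold")
```

With these assumed values, the single-step factor lands between 50 and 70, in line with the figure of about sixty, and the proofreading factor is simply its square. Either way, it stays many orders of magnitude short of ten to the ten.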
Going forward, it seems like the ratio of attaching the right base to attaching the wrong one is about ten to the four. Again, we don't exactly understand why that is. Now, once that base is attached, there are two possibilities in the case where there is an exonuclease activity. We can either go forward and attach the next base, or we can go a bit backward, turn the double-stranded DNA into single-stranded DNA, and move that to the exonuclease part, where the last base is removed. And here, the wrong base pair actually has an advantage: this happens about a hundred times more often when the wrong base pair is there than when the right one is. This together seems to give us a rate that is close to ten to the ten. We still don't exactly understand how all these steps work. But what we see is that, when the base is in the pocket of the DNA polymerase, the shape of the base, and exactly where every one of its atoms is, seems to determine whether the reaction will go forward or go towards the exonuclease activity. So, the exact shape of the DNA polymerase - here I show Taq polymerase - is important. What kinds of shapes will the DNA polymerase encounter? Of course, first there are the four regular Watson-Crick pairs. So, there's C, A, T and G, and their pairings. This gives us a total of four pairs that DNA polymerase needs to be able to recognize. As we said, just the binding energy gives us about a sixty-fold enrichment for the right base versus the wrong one. But, in addition, the right one has a three-fold disadvantage versus the wrong types, because there are three wrong bases for every right one. In addition to the four regular bases, we said that at least two more bases, but probably many others, are recognized by DNA polymerase, so that DNA can replicate even if they're there. So, that means that DNA polymerase has at least six pairs that it can recognize, probably many more. In addition, in the cell, we also have many NTPs - not dNTPs.
So, NTPs don't have the oxygen removed. These are the RNA bases. They might go into the pocket but must not be attached to the DNA. So, the pocket has to exclude them, and it supposedly has a small part that doesn't allow RNA bases to go in. In addition, we talked about all the non-standard bases that can be around in the cell. There are specific mechanisms that lower their concentration to a very low level so that they don't interfere with DNA replication. Otherwise, they could either stop replication or be introduced into the DNA. If we disable these mechanisms, we do see that they then enter the DNA. So, this is what a DNA polymerase encounters when it replicates. It needs to be able to recognize several shapes and reject others. In addition to the activity of DNA polymerase, there is the activity of correcting errors on the DNA. These errors might be introduced both by the DNA polymerase itself during replication and by UV and chemical reactions that change the DNA itself. The first of these mutation correction mechanisms is DNA mismatch repair, also called "MMR," where this amazing molecule called "MutS" is able to recognize areas where the DNA doesn't seem to have the right shape - maybe it has a nick or something like this. Because the wrong base was inserted, it recognizes these places. And then another molecule, MutL, recognizes - at least in prokaryotes - a location where there's a CTAG with 6mA, so a methylated adenine. When DNA replicates, this methylation is not replicated, and therefore it's possible to recognize which is the old strand and which is the new strand - the new strand will not have the A methylated. And then, at that place, the new strand is nicked, this whole new strand is removed up to the mismatch, and then it is repolymerized by DNA polymerase - hopefully without an error. So, this is one thing that removes errors, mainly during DNA replication. Two other mechanisms are direct reversal and base excision.
So, you see here 24 bases that are pretty much specifically recognized by these mechanisms. Direct reversal - you reverse exactly the high-frequency errors that can be introduced by chemistry or UV, and you revert them to the original base. Base excision repair - again you recognize an error, remove it - or a region around it - and then DNA polymerase replicates that part again. What we see in all these mechanisms is that DNA actually does error correction through the fact that every bit of information appears twice: once on one strand and once on the other strand. So, in the case of MutS, if you know which strand is old and which is new, you can copy again the more correct information, which doesn't have the replication error. And in these cases of reversal or base excision, again, you recognize the error and through that are able to correct it. So, the error correction here uses the fact that DNA is double-stranded. Let's summarize. The point of these two lectures was to show that what is important in DNA isn't really the bases. It isn't the bases themselves that explain how amazing DNA is. We saw that, in addition to the four bases, there are many more bases that appear in the DNA, carrying information or not carrying information, and many more bases appear in the cell. We saw that the concentrations of the dNTPs - the bases with their triphosphate groups - are tightly controlled by the cell so that they will not interfere with DNA replication. If more bases were around in the triphosphate state, the DNA would not replicate as well. We saw that it is mainly DNA polymerase that controls pairing and recognizes what is the right pair versus what is the wrong pair, and it isn't really the energy of the binding between the pairs that does it. And finally, we saw that there is a large cloud, a large number of specific or not so specific enzymes, that recognize wrong pairs or wrong bases and excise them or correct them.
And only this whole machinery of all these correction mechanisms allows DNA to work as we see it, with four bases that pair with each other and replicate with high fidelity. Thank you.