We chose to use the BAC to better mimic physiologically relevant expression conditions. Sequence information (Supplementary Table S2) and additional details about the cloning are available in the Supplementary Data. Unless otherwise specified, all liquid cell cultures were grown under these conditions.
We used non-expressing control strains lacking a reporter gene to measure cellular autofluorescence or autoluminescence for each experiment.
We used a different construct to create non-expressing control strains for each of the four reporter-backbone pairs used in this experiment. For measurements of GFP expressed from the p15A plasmid, we used a p15A plasmid that carried a silicatein gene. For measurements of nanoluciferase expressed from the p15A plasmid, we used a p15A plasmid that carried a GFP gene. All control vectors had the same resistance gene as the vectors expressing the reporter gene, and were transformed into the same E. coli strain.
The background signal measured from these control strains was used to determine the fluorescence or luminescence signal above which we considered expression to be significant. No IPTG was added to the non-expressing control cultures because we observed in previous experiments that inducing these cells inhibits cell growth. We suspect this is because the cloning vector in the non-expressing control cells contains a base transcript transcribed by the T7 promoter, including a pelB leader sequence.
Overexpression of this sequence appears to be toxic to the cell. Before reading, plates were placed in the shaking incubator for 5 min to resuspend any cells that may have aggregated. All raw measurements are available online (Supplementary Table S5). Cultures were analyzed on a BioTek Synergy H4 plate reader. Measured events were triggered on a side-scattering threshold and 30 events were measured from each cell culture.
Two measurement sets were acquired through the FITC channel, one at high gain and one at low gain. At low gain, the mean signals from the highest-expressing samples were within the measurable range, but the means for the lowest-expressing samples were off-scale, below the lower detection limit. Flow cytometry data were processed in R, using the flowCore package (54) to remove non-expressing, near-baseline noise values and log-transform the data, and ggplot2 (55) to generate violin plots.
OD was measured as described above. Nanoluciferase luminescence measurements were performed using the Nano-Glo Luciferase Assay System kit Promega, N , following the manufacturer's protocol. Lysis buffer was prepared by dissolving Assay buffer was prepared by mixing 9. During this part of the experiment, one of the cell solutions of the AUC codon in the BAC was not transferred from the lysis plate to the assay plate, so only two biological replicates were measured for this codon.
Luminescence was measured on a BioTek Synergy H4 plate reader for all visible wavelengths for 1 s at a gain of Raw fluorescence and optical density measurements for each cell culture were imported into R. Wells filled only with cell culture media were used to subtract background optical density and fluorescence.
For the T7-GFP measurements from LB culture, we created a calibration curve relating the fluorescence measured at the two gain settings (Supplementary Figure S2).
This curve was used to calibrate fluorescence measured at the low gain setting to an equivalent value of fluorescence measured at the high gain setting, which allowed for comparison of fluorescence measured at the two gain settings. All measurements except for the three canonical start codons were within the linear dynamic range at the high gain setting.
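The calibration step described above can be sketched in a few lines. This is a minimal illustration, assuming the calibration curve is a straight line fitted by ordinary least squares to samples that are on-scale at both gain settings; the text does not state the functional form used, and all readings and names below are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def low_to_high_gain(low_reading, slope, intercept):
    """Convert a low-gain reading to its expected high-gain equivalent."""
    return slope * low_reading + intercept


# Hypothetical paired readings from samples measurable at both gain settings.
low_gain = [10.0, 20.0, 40.0, 80.0]
high_gain = [105.0, 205.0, 405.0, 805.0]

slope, intercept = fit_line(low_gain, high_gain)

# A bright sample that saturates at high gain is read at low gain and converted.
estimated_high = low_to_high_gain(500.0, slope, intercept)
```

If the real relationship is linear only on a log scale, the same fit could be applied to log-transformed readings instead.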
For the three canonical start codons, we used the calibration curve (Supplementary Figure S2) to convert the low gain reading to the expected high gain reading. High gain fluorescence measurements were carried forward for downstream analysis. Per-cell fluorescence or luminescence was calculated by dividing the fluorescence by the optical density. Mean per-cell fluorescence or luminescence for each start codon in each expression strain was calculated by averaging the per-cell measurements from three biological replicates.
Per-cell expression was normalized by the per-cell expression from the AUG start codon to facilitate comparison of relative expression from different expression systems.
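The arithmetic of this normalization can be illustrated with a short sketch. It assumes media-only blank values are subtracted from both signal and optical density before dividing, as described for the raw measurements; the function names and all numbers are hypothetical and are not taken from the supplementary data.

```python
from statistics import mean


def per_cell_signal(signal, od, blank_signal, blank_od):
    """Background-subtracted signal divided by background-subtracted OD."""
    return (signal - blank_signal) / (od - blank_od)


def mean_per_cell(replicates, blank_signal, blank_od):
    """Average the per-cell signal over (signal, OD) pairs from replicate cultures."""
    return mean(per_cell_signal(s, od, blank_signal, blank_od) for s, od in replicates)


def normalize_to_aug(mean_by_codon):
    """Express each codon's mean per-cell signal relative to the AUG construct."""
    aug = mean_by_codon["AUG"]
    return {codon: value / aug for codon, value in mean_by_codon.items()}


# Hypothetical media-only blanks and triplicate (signal, OD) readings.
BLANK_SIGNAL, BLANK_OD = 50.0, 0.05
readings = {
    "AUG": [(10050.0, 0.55), (9050.0, 0.50), (11050.0, 0.60)],
    "GUG": [(2050.0, 0.55), (1850.0, 0.50), (2250.0, 0.60)],
}

means = {c: mean_per_cell(r, BLANK_SIGNAL, BLANK_OD) for c, r in readings.items()}
normalized = normalize_to_aug(means)
```

Dividing by the AUG value makes constructs with different promoters, copy numbers and reporters comparable on one relative scale.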
The significance of the translation initiated from each codon was determined by comparing the expression measured from all of the codons in each experiment to the non-expressing control using Dunnett's test (56), a statistical method for comparing multiple treatments to a single control, as implemented in the R multcomp package. The cells were resuspended and lysed in 1 ml BPER with 0.
DNase (one unit) was added, and the mixture was further incubated with frequent vortexing for 10 min. Start codons extracted from annotated features of 69 bacterial genome and plasmid sequences.
The supernatant was decanted and the protein pellet was left to dry under the chemical hood for 20 min at room temperature. Peptide pools were reconstituted and injected onto a C18 reversed-phase analytical column, 10 cm in length (New Objective). Mobile phase A consisted of 0. Data were first analyzed in Preview to verify calibration criteria and identify likely post-translational modifications prior to Byonic analysis.
All analyses used a custom. Byonic searches were performed using 12 ppm mass tolerances for precursor and fragment ions, allowing for semi-specific N-ragged tryptic digestion.
The resulting identified peptide spectral matches were then exported for further analysis. To analyze initiation codon annotations from model bacterial species, 69 complete bacterial chromosome and plasmid sequences were collected from the National Center for Biotechnology Information databases (Supplementary Table S3).
After removal of entries due to pseudogenes and misannotations, a set of 84 entries remained (Supplementary Table S6) for analysis of initiation codon frequencies across the replicons of model bacterial species. The BioCyc database was consulted extensively during this process. We were first motivated to explore non-canonical start codons when we attempted to silence translation of a dihydrofolate reductase (DHFR) gene.
Surprisingly, we detected significant DHFR expression in recombinant bacterial extract (61) from all five codons (Supplementary Figure S1). We initially wondered whether the observed protein synthesis was merely an artifact of using an in vitro translation system, as similar results had been reported using rabbit reticulocyte lysate. We analyzed 69 well-annotated bacterial chromosomes and endogenous plasmids (63) to determine which of the 64 codons have been annotated as start codons (Supplementary Table S3).
Our approach was similar to previous efforts (5) but with a focus on well-annotated genomes. We designed a set of four plasmids with different copy numbers, promoters and reporters to experimentally quantify translation initiated from all 64 codons in E. coli.
First, we measured the translation of GFP initiated from all 64 codons. The spacer sequence between the RBS and the start codon (UAAAUAC) was designed to be the optimal length for promoting translation initiation (51), and also to prevent the inadvertent creation of an in-frame or out-of-frame canonical start codon.
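The spacer design constraint can be checked computationally. The sketch below assumes the three canonical start codons are AUG, GUG and UUG, and it only examines triplets that lie within the spacer or span the spacer/codon junction; it does not look at the reporter sequence downstream of the tested codon. The function name is hypothetical.

```python
from itertools import product

# Assumption: the "canonical" start codons are AUG, GUG and UUG.
CANONICAL_STARTS = {"AUG", "GUG", "UUG"}
SPACER = "UAAAUAC"  # spacer between the RBS and the tested start codon


def upstream_canonical_starts(spacer, codon):
    """List (offset, triplet) for canonical start codons in spacer + codon,
    skipping the window occupied by the tested codon itself."""
    seq = spacer + codon
    hits = []
    for i in range(len(seq) - 2):
        if i == len(spacer):
            continue  # this window is the tested codon, not an inadvertent one
        triplet = seq[i:i + 3]
        if triplet in CANONICAL_STARTS:
            hits.append((i, triplet))
    return hits


all_codons = ["".join(p) for p in product("UCAG", repeat=3)]
codons_with_hits = [c for c in all_codons if upstream_canonical_starts(SPACER, c)]
```

Under these assumptions no tested codon yields a hit: the two junction-spanning windows have the forms ACN and CNN, and neither can match a canonical start codon.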
Plasmid sets used to measure translation initiation from non-canonical start codons. Plasmids varied in origin of replication (copy number), promoter and reporter gene characteristics.
We measured fluorescence and absorbance via a plate reader from two different growth conditions. Measurements in the second condition had a larger dynamic range than those in the first condition, but were otherwise similar (Supplementary Figure S4). In both cases, a strain carrying an empty cloning vector was used as a control for measuring background fluorescence.
We calculated mean per-cell expression (fluorescence divided by OD) for three biological replicates of strains expressing GFP with each of the 64 codons inserted in place of the start codon in the GFP coding sequence. The expression of GFP initiated from each start codon across the triplicate cultures was compared to the expression of the control cells using Dunnett's test, a method for comparing multiple treatments to a single control (56), assuming equal variance. Of the 64 start codons tested, translation initiated from 47 at a level significantly greater than the control cells.
Translation initiation from all 64 codons. Normalized per-cell fluorescence measured from three replicate cultures, grown in LB and resuspended in PBS before measurement, with each of the 64 codons as the start codon in the GFP coding sequence. We created a codon table heat map to better visualize trends in the translation initiation strength of each of the 64 codons (Supplementary Figure S5). This group is not typically annotated as start codons in bacterial genomes.
We wanted to confirm that the bulk fluorescence measurements we obtained on the plate reader were not arising from a small number of highly expressing cells. We measured the distribution of fluorescence within the population of each culture on a flow cytometer (Supplementary Figure S6).
Additionally, the geometric means of the fluorescence of these populations were well correlated with the mean per-cell fluorescence measured on the plate reader (Supplementary Figure S7).
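A small numerical sketch shows why a bulk average alone cannot rule out a few very bright cells, and why the geometric mean of single-cell values is informative. The two populations below are invented for illustration and are not flow cytometry data from this work.

```python
import math


def arithmetic_mean(values):
    return sum(values) / len(values)


def geometric_mean(values):
    """Geometric mean of strictly positive values, computed via logarithms."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Hypothetical populations of 1000 cells with the same bulk (arithmetic) mean.
uniform_population = [100.0] * 1000                 # every cell moderately bright
skewed_population = [10.0] * 990 + [9010.0] * 10    # a few extremely bright cells

bulk_uniform = arithmetic_mean(uniform_population)
bulk_skewed = arithmetic_mean(skewed_population)
geo_uniform = geometric_mean(uniform_population)
geo_skewed = geometric_mean(skewed_population)
```

Both populations give the same bulk signal, but the geometric mean of the skewed population stays close to the dim majority, so agreement between geometric means and bulk per-cell values argues against a signal dominated by rare outliers.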
We examined the GFP transcript (Supplementary Table S2) for in-frame upstream and downstream start codons as a possible explanation for the observed fluorescence.
We found no canonical in-frame start codons upstream of the GFP coding sequence. There is an in-frame GUG at the 16th codon in the GFP coding sequence, but any resulting protein would be truncated and non-fluorescent given that the minimal sequence needed for GFP fluorescence begins at the sixth codon. We cloned a 6x-His tag into the C-terminus of these five genes and, following expression and purification, recovered significant amounts of protein.
Little to no protein was recovered from the CGC culture, as expected. We digested proteins with AspN and analyzed the mixture via mass spectrometry. Each expressed protein released peptides of intact N-termini that included an N-terminal methionine Supplementary Tables S7— Other researchers have also observed methionine in the N-terminal position of proteins whose translation initiates from GUG or UUG start codons 4 , 65 , We next explored translation initiated from non-canonical start codons under more physiologically relevant conditions.
GFP expression was quantified by measuring mean per-cell fluorescence. Nanoluciferase expression was quantified by measuring mean per-cell luminescence emitted from the nanoluciferase-catalyzed conversion of furimazine to furimamide. All measurements were repeated in triplicate.
Measurements from serial dilutions of nanoluciferase-expressing cells indicated that, over the range of concentrations used in this work, luminescence was linear with nanoluciferase concentration (Supplementary Figure S7).
We therefore transitioned to an expression system with lower background signal to detect non-canonical translation initiation under more biologically relevant expression conditions.
Nanoluciferase was a good reporter for this purpose because cell cultures emit negligible background luminescence and the nanoluciferase assay has a linear dynamic range greater than six orders of magnitude. Translation initiation from a subset of 12 codons spanning the expression range.
Of particular note was that translation initiated from UAG (a canonical stop codon) was the lowest of the 12 codons 0. We next measured nanoluciferase expression from the very-low-copy BAC. As expected from the lower copy number, the absolute luminescence measured from constructs on the BAC was more than an order of magnitude lower than the absolute luminescence measured from the same genes on the p15A plasmid.
The lowest nanoluciferase expression was again initiated from the canonical stop codon UAG 0. N-terminal RNA structure is known to impact translation initiation. We simulated RNA secondary structure around the initiation codon for the four reporter plasmids used in this work to evaluate whether RNA secondary structure might contribute to translation initiation from non-canonical start codons. Additionally, there was no correlation between initiation codon GC-content and reporter expression (Supplementary Figure S10 and Table S4).
These data suggest that differences in translation initiation from the start codons measured in this study were likely not caused by changes in RNA structure around the initiation codon or the GC-content of the initiation codon. We observed translation initiation in E. Most of these codons have never before been identified as start codons. The rate of mRNA codon misreading i.
For single base mismatches, error rates during translation elongation have been estimated between 3. For multiple base mismatches, error rates have been estimated around 3. Given the above, we took 3. We calculated the expected expression due to these errors by multiplying these error rates by the expression from the AUG start codon, and compared the values to the expression level we measured for each codon Supplementary Figure S The translation initiation rates we observed from non-canonical start codons were greater than the reported translation error rates for 17 non-AUG start codons from the T7-GFP plasmids.
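The comparison described above is a single multiplication followed by a threshold test. The sketch below uses a placeholder error rate and invented expression values purely to show the arithmetic; neither is taken from this text, whose numerical error rates are not reproduced here.

```python
def expected_from_misreading(aug_expression, error_rate):
    """Expression expected if a codon were only ever misread as AUG."""
    return aug_expression * error_rate


def exceeds_error_expectation(measured_by_codon, error_rate):
    """Flag non-AUG codons whose measured expression exceeds the misreading expectation."""
    threshold = expected_from_misreading(measured_by_codon["AUG"], error_rate)
    return {
        codon: value > threshold
        for codon, value in measured_by_codon.items()
        if codon != "AUG"
    }


# Placeholder values for illustration only (not from the source).
ILLUSTRATIVE_ERROR_RATE = 1e-4
measured = {"AUG": 1.0, "GUG": 0.2, "CGC": 1e-5}

flags = exceeds_error_expectation(measured, ILLUSTRATIVE_ERROR_RATE)
```

A codon flagged True initiates translation more often than misreading at the assumed rate would predict.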
Specifically, we observed translation initiation from codons with single base mismatches from AUG at a rate ranging between 0. The translation initiation rates we observed were also strongly correlated to the identity of the non-canonical start codon across different genes, promoters, plasmid copy number and expression strains Supplementary Figure S Wobble base pairing occurs between mRNA codons and tRNA anticodons during translation, and may be part of the mechanistic explanation for the observations that we report in this paper.
Wobble pairs occur at the third position of many codons during elongation. We did not perform comprehensive mass spectrometry experiments to identify the N-terminal amino acid from the remaining codons, so we cannot be certain from which codon, with which tRNA and with which amino acid, translation is initiating.
Almost all E. By this same logic, we argue that translation initiation of genes with other non-AUG codons, in which methionine is observed as the N-terminal amino acid, should also not be considered an error. However, those wishing a strict interpretation of the central dogma could consider such events to be errors in translation initiation. All biological processes are governed by processes that imply a certain rate of unlikely events, and such unlikely events are often referred to as errors, failures or leaks.
However, focusing only on the statistically likely outcome risks overlooking any advantageous aspects of rare but purposeful possibilities (89). Viewing non-canonical start codons without confinement to traditional dogma may reveal them as a potential feature, rather than an error, in gene expression. For example, there may be evolutionary utility to translation initiation from non-canonical start codons.
Research with yeast has shown gradual transitions of genetic sequences between genes and non-genic ORFs in related species. We can imagine a scenario wherein, over evolutionary time scales, point mutations could create a weak non-canonical initiation codon downstream of an RBS.
The small amounts of protein produced from such an ORF, if beneficial to the organism, could select for further mutations that increased translation efficiency up to a point where the gene product more directly impacted organismal fitness. Further mutations could then be selected that tune for optimal expression dynamics in a given genetic context.
There may also be regulatory utility to translation initiation from non-canonical start codons. The AUU start codon of infC regulates its translation (93), and a proposed mechanism for the utility of the GUG start codon is its ability to form stronger transcript secondary structures. Average per-cell abundances of proteins in bacteria and mammalian cells span five to seven orders of magnitude (94). Given that the non-canonical translation initiation shown in this paper spans about four orders of magnitude, it is possible that this level of expression could be physiologically significant and may serve as an additional mechanism for controlling protein synthesis.
Exploring changes in non-canonical translation initiation under different experimental conditions could indicate whether non-canonical translation initiation arises from error, or whether it confers an advantage through conditional regulation of gene expression. Moderate to extensive post-translational modification is sometimes required before a protein is complete.
For example, some polypeptide chains require the addition of other molecules before they are considered "finished" proteins. Still other polypeptides must have specific sections removed through a process called proteolysis. Often, this involves the excision of the first amino acid in the chain (usually methionine, as this is the particular amino acid indicated by the start codon).
Once a protein is complete, it has a job to perform. Some proteins are enzymes that catalyze biochemical reactions. Other proteins play roles in DNA replication and transcription. Yet other proteins provide structural support for the cell, create channels through the cell membrane, or carry out one of many other important cellular support functions.
Figure 1: In mRNA, three-nucleotide units called codons dictate a particular amino acid.
For example, AUG codes for the amino acid methionine (beige sphere). The codon GUC codes for the amino acid valine (dark blue sphere). The codon AGU codes for the amino acid serine (orange sphere). The codon CCA codes for the amino acid proline (light blue sphere).
The codon UAA is a stop signal that terminates the translation process. The idea of codons was first proposed by Francis Crick and his colleagues in During that same year, Marshall Nirenberg and Heinrich Matthaei began deciphering the genetic code, and they determined that the codon UUU specifically represented the amino acid phenylalanine. Following this discovery, Nirenberg, Philip Leder, and Har Gobind Khorana eventually identified the rest of the genetic code and fully described which codons corresponded to which amino acids.
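Reading codons in order can be sketched as a lookup into the standard genetic code. The example mRNA below simply concatenates the five codons named for Figure 1 (an assumption about how they are arranged), and the function uses the textbook simplification that reading starts at the first AUG and ends at the first in-frame stop codon.

```python
BASES = "UCAG"
# Standard genetic code, one letter per codon in UCAG order; '*' marks a stop.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(mrna):
    """Translate from the first AUG to the first in-frame stop codon."""
    start = mrna.find("AUG")
    if start == -1:
        return ""
    peptide = []
    for pos in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[pos:pos + 3]]
        if amino_acid == "*":
            break
        peptide.append(amino_acid)
    return "".join(peptide)


# Codons from the figure description: AUG, GUC, AGU, CCA, then the stop codon UAA.
peptide = translate("AUGGUCAGUCCAUAA")
```

Here `peptide` holds the one-letter codes for methionine, valine, serine and proline, in that order.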
Reading the genetic code. Redundancy in the genetic code means that most amino acids are specified by more than one mRNA codon. Methionine is specified by the codon AUG, which is also known as the start codon. Consequently, methionine is the first amino acid to dock in the ribosome during the synthesis of proteins. Tryptophan, like methionine, is specified by a single codon (UGG). The remaining 18 amino acids are specified by between two and six codons each.
Figure 2 shows the 64 codon combinations and the amino acids or stop signals they specify. Figure 2: The amino acids specified by each mRNA codon. Multiple codons can code for the same amino acid. What role do ribosomes play in translation? As previously mentioned, ribosomes are the specialized cellular structures in which translation takes place. This means that ribosomes are the sites at which the genetic code is actually read by a cell.
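The redundancy of the code can be tallied directly from the standard codon table. The sketch below rebuilds that table from its usual compact one-letter form and counts codons per amino acid; variable names are illustrative.

```python
from collections import Counter

BASES = "UCAG"
# Standard genetic code, one letter per codon in UCAG order; '*' marks a stop.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}

# Number of codons that specify each amino acid (stop signals excluded).
codons_per_amino_acid = Counter(aa for aa in CODON_TABLE.values() if aa != "*")

single_codon_amino_acids = sorted(
    aa for aa, count in codons_per_amino_acid.items() if count == 1
)
stop_codons = sorted(codon for codon, aa in CODON_TABLE.items() if aa == "*")
```

In the standard table, methionine (M) and tryptophan (W) are the only amino acids with one codon each, leucine, serine and arginine have six, and three codons (UAA, UAG, UGA) are stop signals.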
Figure 3: A tRNA molecule combines an anticodon sequence with an amino acid. These nucleotides represent the anticodon sequence.
The nucleotides are composed of a ribose sugar, which is represented by grey cylinders, attached to a nucleotide base, which is represented by a colored, vertical rectangle extending down from the ribose sugar. The color of the rectangle represents the chemical identity of the base: here, the anticodon sequence is composed of a yellow, green, and orange nucleotide. At the top of the T-shaped molecule, an orange sphere, representing an amino acid, is attached to the amino acid attachment site at one end of the red tube.
During translation, ribosomes move along an mRNA strand, and with the help of proteins called initiation factors, elongation factors, and release factors, they assemble the sequence of amino acids indicated by the mRNA, thereby forming a protein. In order for this assembly to occur, however, the ribosomes must be surrounded by small but critical molecules called transfer RNA (tRNA).
Each tRNA molecule consists of two distinct ends, one of which binds to a specific amino acid, and the other of which binds to a specific codon in the mRNA sequence because it carries a series of nucleotides called an anticodon (Figure 3). In this way, tRNA functions as an adapter between the genetic message and the protein product. The first letter of a codon is shown vertically on the left, the second letter of a codon is shown horizontally across the top, and the third letter of a codon is shown vertically on the right.
In addition to the mRNA template, many molecules and macromolecules contribute to the process of translation. The composition of each component varies across taxa; for instance, ribosomes may consist of different numbers of ribosomal RNAs (rRNAs) and polypeptides depending on the organism. However, the general structures and functions of the protein synthesis machinery are comparable from bacteria to human cells. A ribosome is a complex macromolecule composed of catalytic rRNAs (called ribozymes) and structural rRNAs, as well as many distinct polypeptides.
Prokaryotes have 70S ribosomes, whereas eukaryotes have 80S ribosomes in the cytoplasm and rough endoplasmic reticulum, and 70S ribosomes in mitochondria and chloroplasts.
Ribosomes dissociate into large and small subunits when they are not synthesizing proteins and reassociate during the initiation of translation. The small subunit is responsible for binding the mRNA template, whereas the large subunit binds tRNAs (discussed in the next subsection).
The complete structure containing an mRNA with multiple associated ribosomes is called a polyribosome or polysome. In both bacteria and archaea, before transcriptional termination occurs, each protein-encoding transcript is already being used to begin synthesis of numerous copies of the encoded polypeptide(s) because the processes of transcription and translation can occur concurrently, forming polyribosomes (Figure 2).
This allows a prokaryotic cell to respond to an environmental signal requiring new proteins very quickly. In contrast, in eukaryotic cells, simultaneous transcription and translation is not possible. Although polyribosomes also form in eukaryotes, they cannot do so until RNA synthesis is complete and the RNA molecule has been modified and transported out of the nucleus.
Figure 2. In prokaryotes, multiple RNA polymerases can transcribe a single bacterial gene while numerous ribosomes concurrently translate the mRNA transcripts into polypeptides. In this way, a specific protein can rapidly reach a high concentration in the bacterial cell. Bacterial species typically have between 60 and 90 types of tRNA. Serving as adaptors, each tRNA type binds to a specific codon on the mRNA template and adds the corresponding amino acid to the polypeptide chain.
It is surprising that tRNAs, the adaptor molecules of translation, can fit so much specificity into such a small package. Mature tRNAs take on a three-dimensional structure when complementary bases exposed in the single-stranded RNA molecule hydrogen bond with each other (Figure 3). The anticodon is a three-nucleotide sequence that bonds with an mRNA codon through complementary base pairing.
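Complementary pairing between codon and anticodon can be written as a reverse complement, because the two strands pair in antiparallel orientation. This sketch assumes strict Watson-Crick pairing only and ignores wobble pairing and modified bases; the function name is illustrative.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def anticodon_for(codon):
    """Return the Watson-Crick anticodon (written 5' to 3') for a codon written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(codon))


met_anticodon = anticodon_for("AUG")  # codon for methionine
trp_anticodon = anticodon_for("UGG")  # codon for tryptophan
```

Applying the function twice returns the original codon, which is a quick sanity check on the pairing rule.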
At least one type of aminoacyl tRNA synthetase exists for each of the 20 amino acids. Figure 3. Translation is similar in prokaryotes and eukaryotes. Here we will explore how translation occurs in E. coli. The initiation of protein synthesis begins with the formation of an initiation complex.
Because of its involvement in initiation, fMet is inserted at the beginning (N terminus) of every polypeptide chain synthesized by E. coli. This interaction anchors the 30S ribosomal subunit at the correct location on the mRNA template. At this point, the 50S ribosomal subunit then binds to the initiation complex, forming an intact ribosome. Figure 4. Translation in bacteria begins with the formation of the initiation complex, which includes the small ribosomal subunit, the mRNA, the initiator tRNA carrying N-formyl-methionine, and initiation factors.
Then the 50S subunit binds, forming an intact ribosome. In prokaryotes and eukaryotes, the basics of elongation of translation are the same. The P (peptidyl) site binds charged tRNAs carrying amino acids that have formed peptide bonds with the growing polypeptide chain but have not yet dissociated from their corresponding tRNA.
The E (exit) site releases dissociated tRNAs so that they can be recharged with free amino acids. Elongation proceeds with single-codon movements of the ribosome, each called a translocation event. During each translocation event, the charged tRNAs enter at the A site, then shift to the P site, and then finally to the E site for removal. Peptide bonds form between the amino group of the amino acid attached to the A-site tRNA and the carboxyl group of the amino acid attached to the P-site tRNA.
The formation of each peptide bond is catalyzed by peptidyl transferase, an RNA-based ribozyme that is integrated into the 50S ribosomal subunit. The amino acid bound to the P-site tRNA is also linked to the growing polypeptide chain. Several of the steps during elongation, including binding of a charged aminoacyl tRNA to the A site and translocation, require energy derived from GTP hydrolysis, which is catalyzed by specific elongation factors.