Strategy for Designing the Synthetic Gene Encoding Human papillomavirus Major Capsid L1 Protein for Heterologous Expression in Escherichia coli System
Abstract
Synthetic DNA is widely used to construct genes for heterologous expression. Adapting codon usage to the host organism is necessary to ensure sufficient protein production. GC content, codon identity, and the mRNA structure around the translation initiation site are also important in designing the gene construct. This study describes a strategy for designing a synthetic gene encoding the HPV52 L1 protein, together with several gene-level analyses, to optimize its expression in the Escherichia coli BL21(DE3) host. To select the model sequence for codon optimization, 75 HPV52 L1 protein sequences were collected from the NCBI database and analyzed by global multiple sequence alignment using the Clustal Omega web server. Once the model sequence was determined, codon optimization was performed using OPTIMIZER and the IDT codon optimization web tool, based on the E. coli B strain. The resulting open reading frame (ORF) sequence was analyzed using the Restriction mapper web server to choose restriction sites that facilitate cloning into the pJExpress414 expression vector. To maximize protein expression, the mRNA secondary structure around the ribosome binding site (RBS) was analyzed. A slight modification at the 5’-terminal end was made to obtain a more accessible RBS and a higher mRNA folding free energy. Finally, the synthetic gene construct was checked to confirm that it introduces no mutation in the encoded protein, and its Codon Adaptation Index (CAI) and GC content were calculated. This strategy yielded an ORF sequence with an mRNA folding free energy around the RBS of -5.5 kcal/mol, a CAI of 0.787, and a GC content of 49.5%. These values are much better than those of the native gene. In theory, this synthetic gene construct could give high-level protein expression in E. coli BL21(DE3) under the control of the T7 promoter.
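The final checks described in the abstract (no change in the encoded protein, CAI, and GC content) can be illustrated with a minimal Python sketch. This is not the authors' pipeline, which relied on web servers; it only shows the underlying calculations. The two short sequences and the `TOY_WEIGHTS` table are hypothetical placeholders, not HPV52 L1 data or real E. coli codon usage; in practice the weights would be relative adaptiveness values derived from a reference codon usage table for the host.

```python
from itertools import product
from math import exp, log

# Standard genetic code (NCBI translation table 1), codons ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def codons(orf):
    """Split an ORF into codons; the length must be a multiple of three."""
    orf = orf.upper()
    if len(orf) % 3:
        raise ValueError("ORF length must be a multiple of 3")
    return [orf[i:i + 3] for i in range(0, len(orf), 3)]


def translate(orf):
    """Translate an ORF into a one-letter protein string ('*' marks stop)."""
    return "".join(CODON_TABLE[c] for c in codons(orf))


def gc_content(seq):
    """Percentage of G and C bases in the sequence."""
    seq = seq.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def cai(orf, weights):
    """Geometric mean of the relative adaptiveness of each weighted codon.

    Codons missing from `weights` (for example ATG, which has no synonyms)
    are skipped, as is conventional when computing CAI.
    """
    logs = [log(weights[c]) for c in codons(orf) if c in weights]
    return exp(sum(logs) / len(logs))


# Hypothetical toy data: both sequences encode Met-Arg-Leu.
NATIVE = "ATGAGACTA"
OPTIMIZED = "ATGCGTCTG"
# Assumed illustrative weights, NOT real E. coli values.
TOY_WEIGHTS = {"CGT": 1.0, "AGA": 0.1, "CTG": 1.0, "CTA": 0.1}

if __name__ == "__main__":
    # Synonymous recoding must leave the protein sequence unchanged.
    print("same protein:", translate(NATIVE) == translate(OPTIMIZED))
    print("GC% native/optimized:", gc_content(NATIVE), gc_content(OPTIMIZED))
    print("CAI native/optimized:",
          cai(NATIVE, TOY_WEIGHTS), cai(OPTIMIZED, TOY_WEIGHTS))
```

Comparing the translations before and after optimization is the simplest guard against an accidental non-synonymous change, for instance one introduced while editing the 5’-terminal end or restriction sites by hand.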
Copyright (c) 2020 Kartika Sari Dewi, Wien Kusharyoto
This work is licensed under a Creative Commons Attribution 4.0 International License.
COPYRIGHT AND LICENSE STATEMENT
COPYRIGHT
Biogenesis: Jurnal Ilmiah Biologi is published under the terms of the Creative Commons Attribution license. Authors hold the copyright and retain publishing rights without restriction to their work. Users may read, download, copy, distribute, and print the work in any medium, provided the original work is properly cited.