Strategy for Designing the Synthetic Gene Encoding Human papillomavirus Major Capsid L1 Protein for Heterologous Expression in Escherichia coli System

Abstract

Synthetic DNA is widely used to construct genes for heterologous expression. Adapting the codons to the host organism is necessary to ensure sufficient protein production. The GC content, codon identity, and mRNA secondary structure near the translation initiation site are also important in the design of the gene construct. This study presents a strategy for designing a synthetic gene encoding the HPV52 L1 protein, together with several gene-level analyses to optimize its expression in the Escherichia coli BL21(DE3) host. To determine the model sequence for codon optimization, 75 HPV52 L1 protein sequences were collected from the NCBI database and analyzed by global multiple sequence alignment using the Clustal Omega web server. Once the model sequence was determined, codon optimization was performed using OPTIMIZER and the IDT codon optimization web tool, based on E. coli B codon usage. The resulting open reading frame (ORF) sequence was analyzed using the Restriction mapper web server to choose restriction sites that facilitate cloning into the pJExpress414 expression vector. To maximize the protein expression level, the mRNA secondary structure around the ribosome binding site (rbs) was analyzed. A slight modification at the 5’-terminal end was carried out to make the rbs more accessible and to increase the mRNA folding free energy. Finally, the synthetic gene construct was checked to ensure that no mutation occurred in the encoded protein and to calculate its Codon Adaptation Index (CAI) and GC content. This strategy yielded an ORF sequence with an mRNA folding free energy around the rbs of -5.5 kcal/mol, a CAI of 0.787, and a GC content of 49.5%. These values are much better than those of the native gene. In theory, this synthetic gene construct could generate a high level of protein expression in E. coli BL21(DE3) under the regulation of the T7 promoter.
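
The final verification step described above (same protein, CAI, GC content) can be sketched in a few lines of Python. This is a minimal illustration and not the tools used in the study: the 4-codon sequences and the reference codon counts are hypothetical toy values, not HPV52 L1 or the E. coli B codon usage table, and codons absent from the reference counts are simply skipped when computing CAI.

```python
import math

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}

def codons(seq):
    """Split a DNA sequence into complete codons."""
    seq = seq.upper()
    return [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]

def translate(seq):
    """Translate a DNA sequence into a one-letter protein string."""
    return "".join(CODON_TABLE[c] for c in codons(seq))

def gc_content(seq):
    """Percentage of G and C bases in the sequence."""
    seq = seq.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)

def adaptiveness_weights(reference_counts):
    """Relative adaptiveness: codon count / highest count among its synonyms."""
    weights = {}
    for codon, aa in CODON_TABLE.items():
        family = [c for c, a in CODON_TABLE.items() if a == aa]
        best = max(reference_counts.get(c, 0) for c in family)
        count = reference_counts.get(codon, 0)
        if best > 0 and count > 0:
            weights[codon] = count / best
    return weights

def cai(seq, weights):
    """Geometric mean of codon weights, ignoring Met, Trp and stop codons."""
    skip = {"ATG", "TGG", "TAA", "TAG", "TGA"}
    logs = [math.log(weights[c]) for c in codons(seq)
            if c not in skip and c in weights]
    return math.exp(sum(logs) / len(logs))

# Hypothetical toy data (assumptions for illustration only).
native = "ATGAGAGGGCTA"      # M R G L with codons rare in the toy reference
synthetic = "ATGCGTGGTCTG"   # M R G L with codons frequent in the toy reference
toy_counts = {"CGT": 10, "AGA": 1, "GGT": 10, "GGG": 2, "CTG": 10, "CTA": 1}
weights = adaptiveness_weights(toy_counts)

same_protein = translate(native) == translate(synthetic)
print("protein unchanged:", same_protein)
print("CAI native / synthetic:", cai(native, weights), cai(synthetic, weights))
print("GC% native / synthetic:", gc_content(native), gc_content(synthetic))
```

In practice the weights would come from a host-specific codon usage table, and the mRNA folding free energy around the rbs would be computed separately with a dedicated RNA secondary structure tool.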

Published
2020-12-30
Section
Research Articles