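Stanford University School of Medicine and the Veterans Affairs Palo Alto Medical Center in the United States, together with Moderna, have published the vaccine sequence. What does this mean? It means that, in order to reduce the number of lives lost to pneumonia around the world, the company was willing to earn more than ten billion dollars less and to release its experimental mRNA sequence on GitHub. These are the values of liberty, equality, and fraternity.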
Content of the note:
Departments of Pathology, Genetics, Pediatrics, and Medicine, Stanford University School of Medicine and Veterans Affairs Palo Alto Medical Center *Correspondence:
Abstract: RNA vaccines have become a key tool in moving forward through the challenges raised both in the current pandemic and in numerous other public health and medical challenges. With the rollout of vaccines for COVID-19, these synthetic mRNAs have become broadly distributed RNA species in numerous human populations. Despite their ubiquity, sequences are not always available for such RNAs. Standard methods facilitate such sequencing. In this note, we provide experimental sequence information for the RNA components of the initial Moderna (https://pubmed.ncbi.nlm.nih.gov/32756549/) and Pfizer/BioNTech (https://pubmed.ncbi.nlm.nih.gov/33301246/) COVID-19 vaccines, allowing a working assembly of the former and a confirmation of previously reported sequence information for the latter RNA.
Objective: Sharing of sequence information for broadly used therapeutics has substantial benefit in design of improved clinical tools and precise diagnostics. As examples of such applications, (i) any medical or public health analysis relying on high-throughput sequencing data to track SARS-CoV-2 and its variants would benefit from knowledge of vaccine sequences in order to distinguish RNA sequencing reads coming from the vaccine from those of viral origin, and (ii) diagnostic labs designing nucleic acid surveillance tests (like PCR or LAMP assays) can benefit from vaccine sequence information to avoid confusion between vaccinated and infected test subjects when analyzing assay results.
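To make application (i) concrete, here is a minimal sketch of how reads might be sorted into vaccine-derived and virus-derived bins once both sequences are known. It is not the method used in the note: the k-mer voting approach, the function names, the value of `K`, and the two short reference strings are all assumptions for illustration. The toy strings encode the same peptide with different codons and should not be treated as real reference sequences.

```python
def kmers(seq, k):
    """Return the set of all k-mers in seq (upper-cased)."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def classify_read(read, vaccine_kmers, virus_kmers, k):
    """Label a read by the reference with which it shares more k-mers."""
    read_kmers = kmers(read, k)
    vaccine_hits = len(read_kmers & vaccine_kmers)
    virus_hits = len(read_kmers & virus_kmers)
    if vaccine_hits > virus_hits:
        return "vaccine"
    if virus_hits > vaccine_hits:
        return "virus"
    return "ambiguous"


# Hypothetical toy references: same amino acids, different codon choices.
K = 8
vaccine_ref = "ATGTTCGTGTTCCTGGTGCTGCTGCCTCTG"
virus_ref = "ATGTTTGTTTTTCTTGTTTTATTGCCACTA"
VACCINE_KMERS = kmers(vaccine_ref, K)
VIRUS_KMERS = kmers(virus_ref, K)

if __name__ == "__main__":
    for read in ["TTCGTGTTCCTGGTG", "TTTGTTTTTCTTGTT", "ACGTACGTACGT"]:
        print(read, classify_read(read, VACCINE_KMERS, VIRUS_KMERS, K))
```

A real pipeline would more likely align reads to both references with an established aligner and handle sequencing errors; the point here is only that knowing the vaccine sequence is what makes the two sources separable.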
Description: For this work, RNAs were obtained as discards from the small portions of vaccine doses that remained in vials after immunization; such portions would have been required to be otherwise discarded and were analyzed under FDA authorization for research use. To obtain the small amounts of RNA needed for characterization, vaccine remnants were phenol-chloroform extracted using TRIzol Reagent (Invitrogen), with intactness assessed by Agilent 2100 Bioanalyzer before and after extraction.
Although our analysis mainly focused on RNAs obtained as soon as possible following discard, we also analyzed samples which had been refrigerated (~4 ℃) for up to 42 days with and without the addition of EDTA. Interestingly, a substantial fraction of the RNA remained intact in these preparations. We note that the formulation of the vaccines includes numerous key chemical components which are quite possibly unstable under these conditions, so these data certainly do not suggest that the vaccine as a biological agent is stable. But it is of interest that chemical stability of RNA itself is not sufficient to preclude eventual development of vaccines with a much less involved cold-chain storage and transportation.
For further analysis, the initial RNAs were fragmented by heating to 94 ℃, primed with a random hexamer-tailed adaptor, amplified through a template-switch protocol (Takara SMARTer Stranded RNA-seq kit), and sequenced using a MiSeq instrument (Illumina) with paired-end sequencing, 78 bases per end. As a reference material in specific assays, we included RNA of known concentration and sequence (from bacteriophage MS2).
From these data, we obtained partial information on strandedness and a set of segments that could be used for assembly. This was particularly useful for the Moderna vaccine, for which the original vaccine RNA sequence was not available at the time our study was carried out. Contigs encoding full-length spikes were assembled from the Moderna and Pfizer datasets. The Pfizer/BioNTech data [Figure 1] verified the reported sequence for that vaccine (https://berthub.eu/articles/posts/reverse-engineering-source-code-of-the-biontech-pfizer-vaccine/), while the Moderna sequence [Figure 2] could not be checked against a published reference.
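The verification step described above, checking an assembled contig against a previously reported sequence, can be sketched as a simple ungapped comparison. This is an illustrative assumption rather than the authors' procedure: the function name and the toy sequences are hypothetical, and the sketch ignores insertions and deletions, which a real comparison would need to handle with a proper aligner.

```python
def best_ungapped_placement(contig, reference):
    """Slide contig along reference without gaps.

    Returns (offset, mismatch_positions) for the placement with the fewest
    mismatches; the first such placement wins ties. An empty mismatch list
    means the contig matches the reference perfectly at that offset.
    """
    contig, reference = contig.upper(), reference.upper()
    if len(contig) > len(reference):
        raise ValueError("contig is longer than the reference")
    best = None
    for offset in range(len(reference) - len(contig) + 1):
        window = reference[offset:offset + len(contig)]
        mismatches = [i for i, (a, b) in enumerate(zip(contig, window)) if a != b]
        if best is None or len(mismatches) < len(best[1]):
            best = (offset, mismatches)
    return best


# Hypothetical toy data, not real vaccine sequence.
reference = "TTGACCATGGCTAGCAAGGTCCTGA"
perfect_contig = "CCATGGCTAGCAAGGT"   # copied from reference[4:20]
variant_contig = "CCATGGCTAGCTAGGT"   # same, with one substituted base

if __name__ == "__main__":
    print(best_ungapped_placement(perfect_contig, reference))
    print(best_ungapped_placement(variant_contig, reference))
```

For the Moderna contig no such comparison was possible at the time, since no published reference existed to compare against.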
RNA preparations lacking dsRNA are desirable in generating vaccine formulations as these will minimize an otherwise dramatic biological (and nonspecific) response that vertebrates have to double-stranded character in RNA (https://www.nature.com/articles/nrd.2017.243). Numerous recent advances have resulted in approaches to minimize dsRNA (e.g. https://pubmed.ncbi.nlm.nih.gov/30933724/, https://pubmed.ncbi.nlm.nih.gov/31900329/); nonetheless, measurement remains a continued necessity. In the sequence data that we analyzed, we found that the vast majority of reads were from the expected sense strand. In addition, the minority of antisense reads appeared different from sense reads in lacking the characteristic extensions expected from the template switching protocol. Examining only the reads with an evident template switch (as an indicator for strand-of-origin), we observed that both vaccines overwhelmingly yielded sense reads (>99.99%). Independent sequencing assays and other experimental measurements are ongoing and will be needed to determine whether this template-switched sense read fraction in the SmarterSeq protocol indeed represents actual dsRNA content in the original material.
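The sense-read fraction reported above is a simple ratio once each read has been assigned a strand. The sketch below shows that arithmetic under a loud simplification: it assigns strand by exact matching of the read, or its reverse complement, against a reference, whereas the note uses the presence of a template-switch extension as the strand indicator. The function names and toy sequences are hypothetical, and library-specific read orientation conventions are ignored.

```python
from collections import Counter

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def strand_of_read(read, reference):
    """Assign 'sense', 'antisense', or 'unassigned' by exact matching."""
    read, reference = read.upper(), reference.upper()
    if read in reference:
        return "sense"
    if reverse_complement(read) in reference:
        return "antisense"
    return "unassigned"


def sense_fraction(reads, reference):
    """Fraction of strand-assigned reads that are sense; None if none assigned."""
    counts = Counter(strand_of_read(r, reference) for r in reads)
    assigned = counts["sense"] + counts["antisense"]
    if assigned == 0:
        return None
    return counts["sense"] / assigned


# Hypothetical toy data, not real vaccine sequence.
reference = "ATGGCTAGCAAGGTCCTGACC"
reads = ["GGCTAGCA", "AAGGTCCT", "CAGGACCT", "TTTTTTTT"]

if __name__ == "__main__":
    print(sense_fraction(reads, reference))
```

As the note itself cautions, a high sense fraction in one library protocol is not by itself a measurement of dsRNA content in the original material.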
This work provides an initial assessment of two RNAs that are now a part of the human ecosystem and that are likely to appear in numerous other high throughput RNA-seq studies in which a fraction of the individuals may have previously been vaccinated.
ProtoAcknowledgements: We thank our colleagues here for their help and suggestions (Nimit Jain, Emily Greenwald, Nelson Hall, Lamia Wahba, William Wang, Amisha Kumar, Sameer Sundrani, David Lipman, Marc Salit), and additionally acknowledge numerous colleagues who have discussed and educated us (directly and indirectly) in areas of RNA synthesis enzymology, vaccine design, and software engineering.
Figure 1: Spike-encoding contig assembled from BioNTech/Pfizer BNT-162b2 vaccine. Although the full coding region is included, the nature of the methodology used for sequencing and assembly is such that the assembled cDNA-derived contig could lack some sequence from the ends of the vaccine RNA and would lack any indication of base modifications. Within the assembled sequence, this hypothetical sequence shows a perfect match to the corresponding sequence from documents available online derived from manufacturer communications with the World Health Organization [as reported by https://berthub.eu/articles/posts/reverse-engineering-source-code-of-the-biontech-pfizer-vaccine/]. The 5' end for the assembly matches the start site noted in these documents, while the read-based assembly lacks an interrupted polyA tail (A30(GCATATGACT)A70) that is expected to be present in the mRNA.
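The tail notation A30(GCATATGACT)A70 can be spelled out in a few lines, reading it as 30 adenosines, the 10-base linker, then 70 adenosines. The helper names below are hypothetical, and the check is only a sketch of how one might test whether a sequence ends with that tail; it is not part of the note's analysis.

```python
import re

LINKER = "GCATATGACT"


def interrupted_polya(first=30, linker=LINKER, second=70):
    """Build the interrupted poly(A) tail: A x first, linker, A x second."""
    return "A" * first + linker + "A" * second


# Matches a sequence whose 3' end is exactly A30 + linker + A70.
TAIL_PATTERN = re.compile(r"A{30}GCATATGACTA{70}$")


def ends_with_interrupted_polya(seq):
    """Return True if seq ends with the A30(GCATATGACT)A70 tail."""
    return TAIL_PATTERN.search(seq.upper()) is not None


if __name__ == "__main__":
    tail = interrupted_polya()
    print(len(tail))
    print(ends_with_interrupted_polya("ATGCCC" + tail))
    print(ends_with_interrupted_polya("ATGCCC" + "A" * 100))
```

The full tail is 110 bases long, longer than a single 78-base read, and it is made up almost entirely of one repeated base. Both are plausible, though unconfirmed here, reasons a read-based assembly would not recover it.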
Figure 2: Spike-encoding contig assembled from Moderna mRNA-1273 vaccine. This is a partial sequence of the vaccine RNA. Although the full coding region is included, the assembled contig could lack some sequence from the ends of the vaccine RNA.