Low homology between 2019-nCoV Orf8 protein and its SARS-CoV counterparts questions their identical function

The SARS-CoV accessory protein Orf8b is involved in suppressing the interferon-mediated immune response of the infected cell, which might lead to the supposition that the corresponding 2019-nCoV protein, Orf8, plays the same role. However, the tertiary structures of these proteins are still unknown, and their primary structures show very low homology and different computed parameters. At the same time, both proteins are under stabilizing selection and do not tend to be deleted in natural viral populations. The present data therefore raise the question of whether such different proteins could share the same function.

So-called accessory proteins of coronaviruses are in many cases involved in blocking the interferon-mediated immune response of the infected cell [13]. Among them, the polymorphic protein Orf8 is present in 2019-nCoV and SARS-CoV. In the COVID-19 pathogen and in early strains of SARS coronavirus it exists as a single protein of 121-122 aa. In many other SARS strains it is split into two different proteins, Orf8a and Orf8b. These are still encoded by one mRNA, and their coding sequences partially overlap (Fig. 1).
In several different models it was shown that the SARS proteins Orf8b and joint Orf8ab are involved in blocking interferon induction [15]. They interact directly with interferon regulatory factor 3 (IRF3) [16], leading to its ubiquitination and degradation, which prevents the activated form of IRF3 from inducing interferon synthesis. Orf8a, in turn, prevents the rapid degradation of Orf8b. In addition, in one of the models it somehow prevents expression of Orf8b [15], which is itself difficult to interpret.
Fig. 1. The nucleotide sequence absent in the strains with split Orf8 is marked by arrows. The figure is from the work of Keng and Tan [14]. 2019-nCoV resembles the upper variant but lacks Orf9b and possesses Orf10 to the right of gene N.
It might be expected that comparing the primary sequences of coronaviral proteins similar to Orf8b would reveal high homology and some invariant structural features that ensure a function which is important and maintained by stabilizing selection. Otherwise, we would be looking at an example of very fast evolution that changes either the function or the structural basis on which it is performed.

Materials and methods
The sequences of the Orf8 and Orf8a proteins were taken from the public database of the U.S. National Institutes of Health. Information about genomic polymorphisms of 2019-nCoV is from the database described in the work of von Dorp and coauthors [17].
For multiple sequence alignment, the program Clustal Omega was used [18]. The program Protein BLAST [19] was used to search for sequences similar to the chosen query and to align them pairwise. The standard settings were applied. To assess amino acid composition and to determine theoretical values of computed protein parameters, such as the isoelectric point and the grand average of hydropathicity, the program ProtParam [20] was applied.
*Corresponding author: viktori-y-99@mail.ru

Results and discussion
Among 11 Orf8b protein sequences from SARS-CoV isolated from humans, only 2 polymorphic positions were detected. Both are close to the C-terminus: the substitutions Lys81Asn and Thr83Ile. They tend to occur together and are found in a minority of the sequences. At position 81 the positively charged lysine is replaced by asparagine, which is polar but uncharged; at position 83 the properties of the residue also change, from polar threonine to hydrophobic isoleucine.
All other positions appear to be monomorphic in this sample.
All 14 sequences of the joint SARS-CoV Orf8 protein were found to be identical. This sample obviously does not capture all possible variation, but it suggests that the sequence is relatively conserved in human samples.
Joint Orf8 sequences of SARS-like bat coronaviruses fall into three completely different types. The most numerous cluster includes 10 closely homologous sequences with 17 polymorphic positions, 11 of them in the N-terminal region, taken here as the 84 N-terminal residues on the basis of the length of the presumably homologous Orf8b.
In an alignment of 13 2019-nCoV Orf8 sequences with 2 closely related bat sequences and 2 pangolin ones, 15 polymorphic positions were found in the arbitrarily defined 84 aa N-terminal region, in addition to a 16 aa deletion in one of the pangolin sequences.
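The counting of polymorphic positions described above can be illustrated with a minimal sketch. It assumes that a column is polymorphic when more than one residue type occurs in it and that gaps are ignored, so that a deletion alone does not count; the function name `polymorphic_columns` and the toy sequences are hypothetical and are not the Orf8 data analysed here.

```python
from collections import Counter


def polymorphic_columns(aligned, region=None):
    """Return 1-based alignment columns containing more than one residue type.

    Gap characters ('-') are ignored, so a deletion by itself does not make
    a column polymorphic. `region` is an optional (start, end) pair of
    1-based, inclusive column numbers.
    """
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("all aligned sequences must have the same length")
    start, end = region if region else (1, length)
    result = []
    for col in range(start, end + 1):
        residues = Counter(seq[col - 1] for seq in aligned if seq[col - 1] != "-")
        if len(residues) > 1:
            result.append(col)
    return result


# Toy alignment (hypothetical sequences, for illustration only).
toy = [
    "MKFLVFLGII",
    "MKFLVFLGII",
    "MKLLVFLGTI",
    "MKFLV--GII",
]
print(polymorphic_columns(toy))
print(polymorphic_columns(toy, region=(1, 5)))
```

Restricting the count with `region` corresponds to looking only at an arbitrarily defined stretch of the alignment, such as the 84 aa region used in the text.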
These four groups of protein sequences, each aligned and roughly evaluated in terms of variability, were then aligned together to visualize common features of the regions presumably corresponding to SARS-CoV Orf8b. The most characteristic sequences were selected to give an idea of the overall picture (Fig. 2).
Thus, the multiple alignment included sequences of joint Orf8 from SARS-CoV and SARS-CoV-like viruses sampled from human (hS), civet (cS) and bat (bS); human SARS Orf8b (8b); and Orf8 from 2019-nCoV and nCoV-like viruses from human (hC), bat (bC) and pangolin (pC).
SARS-CoV Orf8b has strong homology with the corresponding proteins of some SARS-like viruses from bats, weak homology with Orf8 from 2019-nCoV and related coronaviruses, and very little similarity to the joint Orf8 (Orf8ab) of some human SARS-CoV forms and their close relatives from bat and civet. The positions it shares with the joint Orf8 of SARS-CoV are essentially different from those it shares with 2019-nCoV Orf8. The proteins in our sample universally share only two amino acid positions: Cys40 (Cys83 in 2019-nCoV numbering) and Leu56 (Leu98).
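A small sketch can show how universally shared positions are found in a multiple alignment and reported in the numbering of each protein, which is why one column can be Cys40 in one sequence and Cys83 in another. The function names, sequence labels and toy alignment below are hypothetical and only illustrate the bookkeeping.

```python
def residue_numbers(aligned_seq):
    """Map each alignment column to a 1-based residue number (None at a gap)."""
    numbers, count = [], 0
    for ch in aligned_seq:
        if ch == "-":
            numbers.append(None)
        else:
            count += 1
            numbers.append(count)
    return numbers


def invariant_positions(aligned, names):
    """List columns where all sequences carry the same residue.

    Each hit is (residue, {name: residue number in that sequence}).
    """
    maps = {name: residue_numbers(seq) for name, seq in zip(names, aligned)}
    hits = []
    for col in range(len(aligned[0])):
        column = {seq[col] for seq in aligned}
        if len(column) == 1 and "-" not in column:
            hits.append((aligned[0][col], {name: maps[name][col] for name in names}))
    return hits


# Toy alignment: a shorter protein aligned to the end of a longer one
# (hypothetical sequences, for illustration only).
toy_names = ["short", "long"]
toy_aln = [
    "---ACDLK",
    "MKTGCELR",
]
for residue, numbering in invariant_positions(toy_aln, toy_names):
    print(residue, numbering)
```

In this toy case the shared cysteine is residue 2 of the shorter sequence and residue 5 of the longer one, the same kind of offset as between the Orf8b and 2019-nCoV Orf8 numbering.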
Thus, the Orf8 proteins of SARS pathogens diverge deeply not only through the splitting of the Orf8 coding sequence but also in the history of this sequence. The protein may be closer to 2019-nCoV Orf8 in the split variant and very far from it in the joint one. However, in both cases the similarities do not seem to be completely random.
The percentage of identity between Orf8b and the parts of SARS-CoV and 2019-nCoV Orf8 aligned with it is only 17% for the SARS fused form and 25% for the 2019-nCoV protein (Table 1). Most striking is the completely different electric charge of both types of SARS-CoV molecules and of the 2019-nCoV molecules, which is reflected in the isoelectric point (pI) values. The indexes related to hydrophobicity also differ strongly. These data argue against functional homology between the SARS-CoV and 2019-nCoV Orf8 proteins even more strongly than the low levels of identity do.
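The quantities compared here can be approximated with a short sketch. It makes three assumptions: percent identity is taken over columns where both sequences have a residue (BLAST and other tools may use a different denominator); GRAVY is the mean Kyte-Doolittle hydropathy value per residue; and charge is only a crude count of Lys and Arg minus Asp and Glu. It does not reproduce the pI calculation of ProtParam, which depends on a specific set of pKa values. The function names and toy sequences are hypothetical.

```python
# Kyte-Doolittle hydropathy values, the scale on which GRAVY is based.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def percent_identity(a, b):
    """Percent of identical residues over columns where neither sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have the same length")
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    return 100.0 * sum(x == y for x, y in pairs) / len(pairs)


def gravy(seq):
    """Grand average of hydropathicity: mean Kyte-Doolittle value per residue."""
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


def crude_net_charge(seq):
    """Rough charge estimate: (Lys + Arg) minus (Asp + Glu); His and termini ignored."""
    return seq.count("K") + seq.count("R") - seq.count("D") - seq.count("E")


# Toy aligned pair (hypothetical sequences, for illustration only).
a = "MKL-CDE"
b = "MRLACDK"
print(round(percent_identity(a, b), 1))
print(round(gravy(a.replace("-", "")), 2), round(gravy(b), 2))
print(crude_net_charge(a), crude_net_charge(b))
```

Even in this toy pair, two sequences with a fairly high identity can carry opposite net charges, which is the kind of contrast the text points to.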
It is known that the joint Orf8 and the split proteins differ in their ability to bind other viral proteins [21]. From these data, Keng and Tan infer a significant conformational rearrangement resulting from the transition from a single protein to split proteins [14]. However, only data on interactions between SARS-CoV proteins were used there. If we adopt the hypothesis of functional homology between 2019-nCoV Orf8 and its SARS-CoV counterparts, we should turn to the conclusions of Wong and coauthors, who postulate a direct interaction of SARS-CoV Orf8b and Orf8ab with human IRF3 [15]. We might therefore expect that their counterpart, 2019-nCoV Orf8, shares this function, but the strongly differing protein parameters call this into question.
The data obtained do not claim to be a final answer to this question, but they draw attention to significant differences among coronaviruses designated as SARS-CoV, as well as to the deep divergence of Orf8 proteins. In this regard, attention should be paid to the independent adaptation of SARS-CoV and 2019-nCoV to reproduction in human cells. If their functions are the same, we have an example of convergent evolution that took significantly different paths. If the functions have changed, experimental studies can help clarify this complex picture. In this context, structural information about the proteins considered is awaited with even greater interest.