Polymorphism of CSN1S1 (g.12164G>A) and CSN2 (g.8913C>A) genes in pure and cross dairy goats

Casein genes directly control the milk protein of animals. The CSN1S1 (αS1-casein) and CSN2 (β-casein) genes influence milk protein fractions. Genetic polymorphisms of the CSN1S1 gene at the g.12164G>A locus and the CSN2 gene at the g.8913C>A locus were identified by the PCR-RFLP technique. Animal samples were pure dairy goats, comprising PE (5 head) and Saanen (8 head), and their crosses, comprising Sapera (50% Saanen, 50% PE; 51 head) and SaanPE (75% Saanen, 25% PE; 3 head), from the IRIAP dairy goat station. Allele frequency, genotype frequency, heterozygosity, and Hardy-Weinberg (H-W) equilibrium were analyzed with the POPGENE32 program. The CSN1S1_g.12164G>A locus yielded two alleles, i.e. the G allele (192 bp, 145 bp, and 101 bp) and the A allele (337 bp and 101 bp). The G allele frequency, from the highest, was successively Saanen (0.625), Sapera (0.578), PE (0.400), and SaanPE (0.333). Most dairy goats were heterozygous (Ho>He) and in H-W equilibrium (χ² count < χ² 0.05). In contrast, the CSN2_g.8913C>A locus was monomorphic, possessing only the C allele (233 bp and 162 bp) without the A allele (416 bp). The g.12164G>A SNP of the CSN1S1 gene could be a potential molecular selection marker for milk protein content in dairy goats.


Introduction
Milk yield and quality are essential factors for national dairy development. The dairy goat is a small ruminant that adapts well to tropical climatic conditions. Besides dairy cattle, dairy goats can be positioned as good milk producers [1]. Milk protein is the main precursor of bioactive peptides, which play an essential role in biological regulation through their antihypertensive, antioxidant, antimicrobial, immunomodulatory and mineral-binding functions [2]. Milk protein from ruminants consists of two main components, namely casein and whey [3,4,5]. Casein represents about 80% of the total milk protein content [6]. Milk of dairy cattle contains about 3.5% protein, of which casein makes up 78-82% and whey 18-22% [5]. The casein fraction plays an essential role as a source of bioactive peptides. Casein in milk is found as suspended particles in the form of casein micelles, and in dairy products such as yogurt, cheese and ice cream [7].
Milk casein fractions are determined by four casein genes, namely αS1-casein (CSN1S1), β-casein (CSN2), αS2-casein (CSN1S2), and κ-casein (CSN3) [3,5,8,9]. These casein genes are clustered within about 250 kb on chromosome 6 and are very commonly inherited as a haplotype [8,10,11,12]. The αS1-casein gene (CSN1S1) in goat has a transcription unit of 16.5 kb consisting of 19 exons, varying in length from 24 bp to 358 bp, and 18 introns [13]. The CSN1S1 gene shows the highest level of polymorphism among the four casein genes [14]. This gene has a direct effect on the quality, character, and composition of milk [15]. A base mutation in exon 12 of the CSN1S1 gene at the g.12164G>A locus results in the B3 allele (Cozenza et al, 2008). The β-casein gene (CSN2) is about 10.7 kb long and consists of 9 exons [16]. β-casein is a calcium-sensitive phosphoprotein [17]. In goat, the autosomal alleles A, A1, C, C1, E, and 0 have been characterized at the DNA level, while the B and D alleles have been characterized at the protein level [18]. A base mutation in exon 7 of the CSN2 gene at the g.8913C>A locus results in the E allele, which is responsible for the amino acid change Ser166 to Tyr166 in the complete protein [19].
Genetic variants of the CSN1S1 gene have been explored in Saanen goat [6,7], while genetic polymorphisms of the CSN2 gene have been reported in, among others, cattle [14,20], sheep, buffalo, and yaks [21]. Genetic polymorphisms of the CSN1S1 and CSN2 genes need to be explored in the local PE dairy goat and its crosses with the exotic Saanen dairy goat. Such genetic information could be considered as a molecular selection tool for the genetic improvement of milk production and milk protein content in these pure and cross dairy goats.

Location and Period of Research
Observation and collection of blood samples from the dairy goats were conducted at the IRIAP (Indonesian Research Institute for Animal Production) dairy goat station, Ciawi, Bogor. Extraction and genotyping of DNA samples were carried out at the Laboratory of Animal Molecular Genetics, Department of Animal Production and Technology, Faculty of Animal Husbandry, Bogor Agricultural University. This research took place from November-December 2019 to January-February 2020.

Materials
Equipment for collecting blood from the dairy goats consisted of needles, vacutainer tubes, and syringes. DNA extraction was carried out according to a standard phenol-chloroform procedure. The PCR-RFLP tools consisted of Eppendorf tubes and a tube rack, 1 micropipette unit with tips, a vortex machine, a microcentrifuge (spin down), and an incubator. Genotyping tools were a GeneAmp® PCR ESCO thermocycler, magnetic stirrer, pestle, microwave, 1 set of gel trays, electrophoresis power supply, and UV transilluminator.
The materials used in the PCR and electrophoresis processes consisted of extracted DNA, PROMEGA Green PCR Master Mix, Thermo Scientific GeneRuler 100 bp DNA Ladder, forward and reverse primers, nuclease-free water (NFW), amplicon products, 1.5% agarose gel, TBE buffer, and PeqGreen fluorosafe. Materials used for RFLP and genotyping were PCR products, enzyme buffer, and the DdeI restriction enzyme.

Primer Design
Primers were designed independently based on DNA sequence data from the National Center for Biotechnology Information (NCBI) for the goat species (Capra hircus), with GenBank accession code AJ504710.2 for the CSN1S1 gene at the g.12164G>A locus and GenBank accession code AJ011018.3 for the CSN2 gene at the g.8913C>A locus, using the Molecular Evolutionary Genetics Analysis (MEGA7) program.
The designed primers were checked using web-based software, including Bioinformatics and the Thermo Fisher Scientific multiple primers tool. The single nucleotide polymorphism (SNP) at the CSN1S1_g.12164G>A locus was framed using the primers F: 5'-GGGCTAATCCAAATTCTCTG-3' and R: 3'-GACTCTGTAGAAGGAATCAG-5'. The CSN2_g.8913C>A locus was framed by the primers F: 5'-GCACAAAGAAATGCCCTTCC-3' and R: 5'-GCTATGCTTATTTTGGAACCATTC-3'.

Blood collection
About 5 mL of blood (20%) was taken from the jugular vein of each animal using a needle and a vacutainer tube that already contained 80% absolute ethanol (EDTA). Blood samples were then stored at ±4 ºC.

DNA extraction
DNA was extracted following the standard phenol-chloroform method [22]. The stages consisted of sample preparation, protein degradation, organic matter degradation, and DNA precipitation. For sample preparation, the blood sample was put into a 1.5 mL tube, made up to 1,000 μL with 0.2% NaCl solution, vortexed, and incubated for 5 minutes. The solution was then centrifuged at 8,000 rpm for 5 minutes and the supernatant was removed. For protein degradation, 350 μL of sodium tris-EDTA (1×STE), 40 μL of 10% SDS, and 10 μL of proteinase K (5 mg mL-1) were added to the sample. For degradation of organic matter, the mixture was incubated for 2 hours at 55 ºC under slow shaking. For DNA precipitation, 400 μL of phenol solution, 400 μL of CIAA, and 40 μL of 5 M NaCl were added to the mixture, which was then shaken at room temperature for 1 hour. Samples were centrifuged at 12,000 rpm for 5 minutes. Then 400 μL of supernatant was transferred to a new tube, and 800 μL of absolute alcohol and 40 μL of 5 M NaCl were added. The mixture was stored at -20 ºC overnight and centrifuged at 12,000 rpm for 5 minutes. The precipitate obtained was washed with 800 µL of 70% alcohol, precipitated again, and dried until the alcohol had evaporated. The DNA precipitate was dissolved in 100 μL of 80% TE, stored at -20 ºC, and was then ready for use.

Gene Amplification
Gene amplification used an ESCO GeneAmp® PCR thermal cycler. A total of 1 µL of the extracted DNA template was placed in a PCR tube and 14 µL of premix solution was added. The PCR premix consisted of 0.2 µL each of forward and reverse primer, 7.5 µL of PROMEGA Green Master Mix and 6.1 µL of nuclease-free water (NFW). The mixture was incubated in the thermal cycler for amplification. Amplification started with a predenaturation step at 95 ºC for 5 minutes. The second stage consisted of 35 cycles, each comprising denaturation at 95 ºC for 10 seconds, primer annealing at 56 ºC for 20 seconds and primer elongation at 72 ºC for 30 seconds. The last step was a final elongation at 72 ºC for 5 minutes.
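As a rough consistency check on the protocol above, the hold times and reaction volumes can be tallied in a short Python sketch. The variable and function names are illustrative only. The sketch assumes that 0.2 µL refers to each primer separately (which makes the premix sum to the stated 14 µL) and it ignores the ramp time of the thermal cycler between temperatures, so the real run would be longer.

```python
# Cycling programme as described in the text; ramp times are ignored (assumption).
PREDENATURATION_S = 5 * 60      # 95 °C for 5 minutes
CYCLES = 35
CYCLE_STEPS_S = {"denaturation": 10, "annealing": 20, "elongation": 30}
FINAL_ELONGATION_S = 5 * 60     # 72 °C for 5 minutes

# Reaction components in µL; 0.2 µL of *each* primer is an assumption.
REACTION_UL = {
    "dna_template": 1.0,
    "forward_primer": 0.2,
    "reverse_primer": 0.2,
    "green_master_mix": 7.5,
    "nuclease_free_water": 6.1,
}


def total_hold_time_s():
    """Sum of all programmed hold times, in seconds."""
    per_cycle = sum(CYCLE_STEPS_S.values())
    return PREDENATURATION_S + CYCLES * per_cycle + FINAL_ELONGATION_S


def reaction_volume_ul():
    """Total reaction volume in µL, rounded to one decimal place."""
    return round(sum(REACTION_UL.values()), 1)


print(total_hold_time_s() / 60, "minutes of programmed hold time")
print(reaction_volume_ul(), "µL per reaction")
```

Under these assumptions the programmed hold time is 45 minutes and each reaction holds 15 µL.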

Electrophoresis
The amplified sample was electrophoresed on a 1.5% agarose gel. The gel was made of 0.45 g of agarose and 30 mL of 0.5×TBE, heated in a microwave at a medium-high setting (±3 minutes). The solution was homogenized with a magnetic stirrer at 50 rpm (±1 minute), then 1 μL of peqGreen was added and the solution was homogenized for a further 60 seconds.
The gel was cast in the gel tray and allowed to harden. The hardened gel was transferred to the electrophoresis device and loaded with 5 µL of amplicon and 2 µL of Thermo Scientific 100 bp marker. The amplicons were electrophoresed at 100 V for 35 minutes so that the DNA fragments migrated through the gel. The resulting bands were visualized with a UV transilluminator.

Genotyping
A total of 5 μL of PCR product was mixed with 0.9 μL of NFW, 0.7 μL of buffer and 0.4 μL of DdeI restriction enzyme, then incubated at 37 ºC for 4 hours. The 7 μL of digested DNA was electrophoresed at 100 V for 35 minutes on a 2% agarose gel. The electrophoresed DNA sample was visualized under UV light. The DNA fragments on the gel were compared with the marker to determine the fragment lengths and genotypes.

Data analysis
Genotype frequency, allele frequency, Hardy-Weinberg equilibrium, and observed (Ho) and expected (He) heterozygosity values were calculated from the genotyping results using the POPGENE32 software version 1.32. The genotype frequency was the ratio of the number of individuals with a certain genotype to the number of individuals in the observed population. Genotype and allele frequencies were calculated following the formula of [23].
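The frequency calculations described above can be illustrated with a minimal Python sketch. It is not the POPGENE32 implementation: it assumes simple gene counting at a biallelic locus with genotypes GG, GA and AA, and an expected heterozygosity of one minus the sum of squared allele frequencies. POPGENE32 may additionally report a sample-size-corrected expectation, which is not shown here. The function names and the example counts are hypothetical.

```python
def genotype_frequencies(counts):
    """Return genotype frequencies from a dict of genotype counts."""
    n = sum(counts.values())
    return {genotype: c / n for genotype, c in counts.items()}


def allele_frequencies(counts):
    """Gene counting at a biallelic locus with genotypes GG, GA and AA."""
    n = sum(counts.values())
    freq_g = (2 * counts["GG"] + counts["GA"]) / (2 * n)
    return {"G": freq_g, "A": 1.0 - freq_g}


def heterozygosity(counts):
    """Return (Ho, He): observed and expected heterozygosity."""
    n = sum(counts.values())
    ho = counts["GA"] / n
    he = 1.0 - sum(p ** 2 for p in allele_frequencies(counts).values())
    return ho, he


# Hypothetical counts, for illustration only (not data from this study).
example = {"GG": 4, "GA": 10, "AA": 6}
print(genotype_frequencies(example))
print(allele_frequencies(example))
print(heterozygosity(example))
```

Each homozygote contributes two copies of its allele and each heterozygote one copy of each, which is why the heterozygote count enters the allele frequency only once.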
Hardy-Weinberg equilibrium was tested using the Chi-square (χ²) calculation.
Amplification of the base fragment of the αS1-casein gene or CSN1S1 gene resulted in a PCR product with a length of 438 bp, while amplification of the base fragment of the β-casein gene or CSN2 gene resulted in a PCR product of 416 bp. In the PCR-RFLP method, the amplicons of both the CSN1S1 gene at the g.12164G>A locus and the CSN2 gene at the g.8913C>A locus were digested with the DdeI enzyme to identify genotypic variants. DdeI is an endonuclease purified from an Escherichia coli strain carrying the DdeI gene of Desulfovibrio desulfuricans. This enzyme acts at a temperature of 37 °C. The DdeI restriction enzyme recognizes the cutting site 5'-C|TNAG-3' in both the CSN1S1 and CSN2 genes.
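How a single base change can remove a DdeI site and alter the fragment pattern can be sketched in Python as follows. The two sequences are short hypothetical toy sequences, not the CSN1S1 or CSN2 amplicons, and the function name is illustrative. Because the recognition sequence CTNAG reads the same on both strands, scanning one strand is sufficient in this sketch.

```python
import re

# DdeI recognition sequence 5'-C|TNAG-3', where N is any base.
DDEI_SITE = re.compile(r"CT[ACGT]AG")


def ddei_fragments(seq):
    """Return fragment lengths after a complete in-silico DdeI digest."""
    seq = seq.upper()
    # The enzyme cuts between C and T, i.e. one base after the site start.
    cuts = [m.start() + 1 for m in DDEI_SITE.finditer(seq)]
    bounds = [0] + cuts + [len(seq)]
    return [end - start for start, end in zip(bounds, bounds[1:])]


# Hypothetical toy sequences: the second differs by one G>A change
# that destroys the first recognition site.
toy_with_two_sites = "AAAACTGAGTTTTTTCTAAGGG"
toy_with_one_site = "AAAACTGAATTTTTTCTAAGGG"

print(ddei_fragments(toy_with_two_sites))  # three fragments
print(ddei_fragments(toy_with_one_site))   # two fragments
```

In the toy example the loss of one site merges two neighbouring fragments into one longer fragment, which is the same logic by which one allele of an RFLP locus shows a longer band than the other.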
Genotyping of the CSN1S1_g.12164G>A locus in the observed dairy goats yielded three genotypes, namely GG, GA, and AA (Figure 1). The homozygous GG genotype presented three fragments (192 bp, 145 bp and 101 bp), the heterozygous GA genotype four fragments (337 bp, 192 bp, 145 bp and 101 bp), and the AA genotype two fragments (337 bp and 101 bp). The AA genotype arises from a base transition of guanine to adenine (G>A) at base 14 in exon 12 of the CSN1S1 gene [16,24]. A transition replaces a purine base with another purine base (adenine and guanine) or a pyrimidine base with another pyrimidine base (thymine and cytosine). Base substitution mutations in the coding sequence (CDS) region can be classified according to their effect on amino acids. A mutation is defined as non-synonymous when a base mutation at a specific point changes the original codon and thereby the amino acid [25]. The A allele, which in this case is a mutation in the CDS at the g.12164G>A locus, changes the amino acid arginine (Arg100) to lysine (Lys) [16]. The non-synonymous mutation at this base point caused a casein deficit in cheese yield of 5-7% in the heterozygous condition [7].
Digestion by the DdeI enzyme at the 5'-C|TNAG-3' sequences in exon 7 of the β-casein gene or CSN2 gene at the g.8913C>A locus generated two visible fragments (233 bp and 162 bp), known as the CC genotype. A third fragment of 21 bp ran out of the gel, so that band was not seen on the electrophoresis gel. Genotyping of the CSN2_g.8913C>A locus in all dairy goats yielded only the CC genotype, without the two remaining genotypes AA (416 bp) and AC (416 bp, 233 bp, and 162 bp). The CSN2*E allele was previously identified as a base mutation at the g.8913 locus in exon 7 of this gene and reported as a transversion mutation changing the codon TCT to TAT. This base mutation is responsible for the change of amino acid Ser166 to Tyr166 in the complete protein [12,19].
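The band patterns described above amount to a lookup from fragment sets to genotypes, which can be sketched in Python as follows. The fragment sizes are those reported in the text. The function name and the tolerance of 10 bp for reading band sizes against the ladder are hypothetical choices for illustration, not part of the original protocol. The 21 bp fragment of CSN2 is left out because it was not visible on the gel.

```python
# Expected visible fragments (bp) per genotype, as reported in the text.
CSN1S1_PATTERNS = {
    frozenset({192, 145, 101}): "GG",
    frozenset({337, 101}): "AA",
    frozenset({337, 192, 145, 101}): "GA",
}
CSN2_PATTERNS = {
    frozenset({233, 162}): "CC",
    frozenset({416}): "AA",
    frozenset({416, 233, 162}): "AC",
}


def call_genotype(observed_bp, patterns, tolerance_bp=10):
    """Map estimated band sizes to a genotype, or None if no pattern fits.

    Each observed band is snapped to the nearest expected fragment size;
    a band further than tolerance_bp from every expected size is rejected.
    """
    expected_sizes = sorted(set().union(*patterns.keys()))
    snapped = set()
    for band in observed_bp:
        nearest = min(expected_sizes, key=lambda size: abs(size - band))
        if abs(nearest - band) > tolerance_bp:
            return None
        snapped.add(nearest)
    return patterns.get(frozenset(snapped))


print(call_genotype([190, 147, 100], CSN1S1_PATTERNS))
print(call_genotype([335, 195, 143, 102], CSN1S1_PATTERNS))
print(call_genotype([230, 160], CSN2_PATTERNS))
```

Returning None for unexpected bands keeps ambiguous lanes, for example partial digests, out of the frequency calculations rather than forcing them into a genotype class.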

Genotype and Allele Frequencies
Genotyping results for the 5'-C|TNAG-3' sequences of the CSN1S1 gene at the g.12164G>A locus in all dairy goats are presented in Table 1. Pure PE goats showed equal frequencies of the AA and GA genotypes and a lower frequency of the GG genotype. Pure Saanen goats showed a higher frequency of the GA genotype than of the GG genotype, and no AA genotype. The Sapera and SaanPE cross goats, however, presented genotype frequencies quite different from those of the two pure dairy goats. Sapera goats (50% Saanen, 50% PE) had the highest frequency for the GA genotype, followed by the GG genotype, and the lowest for the AA genotype. SaanPE goats (75% Saanen, 25% PE), by contrast, had a high frequency of the AA genotype, a low frequency of the GG genotype, and no GA genotype. Genetic polymorphism at the CSN1S1_g.12164G>A locus was caused by a transitional mutation, the g.12164G>A SNP [16]. Genetic diversity can be determined by the probability that an allele changes to a different form through mutation within one generation [26]. According to [23], a population is diverse if it carries two or more alleles. The frequencies of the AA, GA, and GG genotypes at the g.12164G>A locus of the CSN1S1 gene were diverse across the dairy goats. The GG genotype frequency, from the highest, was successively Sapera (0.353), SaanPE (0.333), Saanen (0.250) and PE (0.200). The GA genotype frequency, from the highest, was successively Saanen (0.750), Sapera (0.451), PE (0.400), and SaanPE (0.000). The AA genotype frequency, from the highest, was successively SaanPE (0.667), PE (0.400), Sapera (0.196) and Saanen (0.000). Genotype frequencies appeared to differ depending on the number of samples tested [27].
The frequencies of alleles and genotypes can change due to natural selection and other evolutionary forces, including mutation, migration (gene flow), non-random mating, and genetic drift.
By contrast, at the g.8913C>A locus of the CSN2 gene all goats had only the C allele (100%), without the A allele. The g.8913C>A locus of this gene was therefore monomorphic across the dairy goats. According to [21], a population is polymorphic when it carries multiple alleles (more than 1 allele) at one locus, with the frequency of the most common allele below 0.99 (99%). According to [28], β-casein, as the main casein fraction in dairy goat milk, is known to be more monomorphic. However, in some cases polymorphisms have been found, such as the A, C, and C1 alleles in the Girgentana goat breed [10]. CSN2 genetic polymorphisms were also found in the Banat White and Carpatina goat breeds [18], while variation in the E allele occurred in the Frisa goat breed, with an allelic frequency of 7.9% and a heterozygosity value of 0.0534 [12]. E allele polymorphisms of the CSN2 gene were found in the Piemontese cattle breed [29].

Heterozygosity and Hardy-Weinberg Equilibrium Values
Heterozygosity is one parameter that can be used to estimate the level of genetic variation in a population. An observed heterozygosity (Ho) value close to 0 indicates a low degree of heterozygosity, whereas a value close to 1 indicates a high degree. The observed heterozygosity (Ho) and expected heterozygosity (He) values of the CSN1S1 gene at the g.12164G>A locus are presented in Table 2. In contrast, at the g.8913C>A locus of the CSN2 gene all dairy goats had only the CC genotype (100%) and hence only the C allele (100%). The observed heterozygosity at this locus was therefore 0 (monomorphic) for all of the dairy goats observed. Table 2 shows that the heterozygosity level, from the highest, was successively Saanen (0.75), Sapera (0.450), and PE (0.400), whilst no heterozygotes were observed in SaanPE goats (0.000). This indicated that the SaanPE goats had a very close genetic relationship. However, the heterozygosity values, especially in SaanPE, PE, and Saanen dairy goats, were calculated from very limited sample numbers.
The Hardy-Weinberg equilibrium test by Chi-square (χ²) at the g.12164G>A locus showed that Sapera (0.370), Saanen (2.333) and PE (0.400) goats were in Hardy-Weinberg equilibrium, with the χ² count smaller than the χ² table value (P0.05; df = 1) of 3.84. By contrast, SaanPE goats were in Hardy-Weinberg disequilibrium, with the χ² count larger than the χ² table value (P0.05; df = 1). According to [30], one of the conditions causing Hardy-Weinberg disequilibrium is natural selection and assortative mating. The H-W equilibrium of the genotype frequencies in Sapera, Saanen and PE dairy goats could indicate that there has been no selection on the g.12164G>A SNP of the CSN1S1 gene within these dairy goats. Equilibrium can occur in the absence of mutation, natural selection, genetic drift, migration and assortative mating.
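The Chi-square test described above can be sketched in Python as follows. Two assumptions are made and should be read as such. First, the genotype counts are not listed in the text; they are inferred here from the reported genotype frequencies and sample sizes (for example 18, 23 and 10 of 51 Sapera goats). Second, the expected genotype numbers use a small-sample correction in the style of Levene, with denominator 2N - 1, which is not stated in the text but is chosen because it gives values close to the reported statistics. The function name is illustrative.

```python
CHI2_TABLE_DF1 = 3.84  # table value at P0.05 with df = 1, as quoted in the text

# Genotype counts inferred from reported frequencies and sample sizes (assumption).
COUNTS = {
    "Sapera": {"GG": 18, "GA": 23, "AA": 10},
    "Saanen": {"GG": 2, "GA": 6, "AA": 0},
    "PE": {"GG": 1, "GA": 2, "AA": 2},
    "SaanPE": {"GG": 1, "GA": 0, "AA": 2},
}


def hw_chi_square(counts):
    """Chi-square statistic for Hardy-Weinberg equilibrium at a biallelic locus.

    Expected genotype numbers use a small-sample correction (assumption):
    homozygote = n_i * (n_i - 1) / (2 * (2N - 1)),
    heterozygote = n_G * n_A / (2N - 1), with n_i the allele copy numbers.
    """
    n = sum(counts.values())
    n_g = 2 * counts["GG"] + counts["GA"]
    n_a = 2 * counts["AA"] + counts["GA"]
    denom = 2 * n - 1
    expected = {
        "GG": n_g * (n_g - 1) / (2 * denom),
        "GA": n_g * n_a / denom,
        "AA": n_a * (n_a - 1) / (2 * denom),
    }
    # Classes with an expected number of zero are skipped to avoid division by zero.
    return sum((counts[g] - e) ** 2 / e for g, e in expected.items() if e > 0)


for breed, counts in COUNTS.items():
    chi2 = hw_chi_square(counts)
    status = "equilibrium" if chi2 < CHI2_TABLE_DF1 else "disequilibrium"
    print(f"{breed}: chi-square = {chi2:.3f} ({status})")
```

Under these assumptions the statistic comes out close to the values quoted above for Sapera, Saanen and PE, and exceeds 3.84 for SaanPE. With expected numbers this small, the Chi-square approximation is rough, which is consistent with the caution about limited sample numbers.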

Conclusion
The CSN1S1 gene at the g.12164G>A locus in pure and cross dairy goats yielded three genotypes, AA (0.196-0.667), GA (0.400-0.750), and GG (0.200-0.353), but the AA genotype was absent in Saanen and the GA genotype was absent in SaanPE. Saanen goats had the highest degree of heterozygosity, while SaanPE goats were in Hardy-Weinberg disequilibrium. By contrast, the CSN2 gene at the g.8913C>A locus was monomorphic in all dairy goats, possessing only the CC genotype and the C allele. The g.12164G>A SNP of the CSN1S1 gene has potential as a marker in marker-assisted selection for the genetic improvement of protein fractions in dairy goats.
The authors express their high appreciation to the Division of Animal Breeding and Genetics, Department of Animal Production Science and Technology, Faculty of Animal Science IPB, which provided in-kind assistance in this research collaboration as stated in the Decree of the Head of IRIAP No. 143/Kpts/OT 210/ H.5.2/ 04/2020.