Human alpha 2(VI) collagen gene. Heterogeneity at the 5'-untranslated region generated by an alternate exon.

The Journal of biological chemistry (1992-03-25)
B Saitta, R Timpl, M L Chu
Abstract

Cosmid clones containing the 5' region of the human alpha 2(VI) collagen gene have been isolated and characterized. DNA sequencing indicates that the signal peptide and the amino-globular domain are encoded by four exons of 142, 596, 21, and 66 base pairs (bp). However, S1 nuclease and primer extension analyses show that the transcription start site is not present in the 142-bp exon. Two different 5' cDNA clones are generated by the anchored polymerase chain reaction. Using the 5' cDNA clones as probes, two untranslated exons (1, 1A) are found 12 kilobase pairs upstream of the first coding exon. These two exons are alternatively used in human fibroblasts, and most transcripts contain exon 1 sequence. Exon 1 shows, by primer extension and S1 nuclease protection assay, two major and several minor transcription start sites. The promoter region contains a canonical TATA box, seven GGGCGG sequences, two possible CAAT boxes, and two sequences resembling AP2 binding sites. Exon 1A contains three alternative splice donor sites and is located 650 bp downstream of exon 1. The most 3' splice donor site of exon 1A is found within an Alu repeat sequence. Exon 1A is preceded by five GGGCGG sequences and one sequence resembling an AP2 binding site, although neither TATA nor CAAT boxes are found. Two additional GGGCGG sequences are located at the beginning of exon 1A. This study establishes that the human alpha 2(VI) collagen gene is 36 kilobase pairs long and contains 30 exons. The 5'-untranslated and promoter regions are significantly different from the corresponding segments of the chicken gene. Through alternative processing, the human gene produces multiple mRNAs that differ in the 5'-untranslated region as well as in the 3'-coding and noncoding sequences.