The nucleotide sequence of the gene for human protein C

Proc Natl Acad Sci U S A. 1985 Jul;82(14):4673-7. doi: 10.1073/pnas.82.14.4673.


A human genomic DNA library was screened for the protein C gene by using a cDNA probe coding for the human protein. Three different overlapping lambda Charon 4A phage were isolated that contained inserts for the gene. The complete sequence of the gene was determined by the dideoxy method and shown to span about 11 kilobases of DNA. The coding and 3' noncoding portions of the gene consist of eight exons and seven introns. The eight exons code for a preproleader sequence of 42 amino acids, a light chain of 155 amino acids, a connecting dipeptide of Lys-Arg, and a heavy chain of 262 amino acids. The preproleader sequence and the connecting dipeptide are removed during processing, resulting in the mature protein composed of a heavy and a light chain held together by a disulfide bond. The heavy chain contains the catalytic region of the serine protease. Two Alu sequences and two homologous repeats of about 160 nucleotides were found in intron E. The seven introns in the protein C gene are located in essentially the same positions in the amino acid sequence as the seven introns in the gene for human factor IX, while the first three introns in the protein C gene are located in the same positions as the first three in the gene for human prothrombin.
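
The segment lengths given in the abstract can be tallied to get the length of the full translation product and of the mature two-chain protein. The following is a minimal sketch, not taken from the paper: the segment order follows the abstract, but the 1-based residue numbering (counted from the first residue of the preproleader) and the names `SEGMENTS`, `segment_ranges`, `PRECURSOR_LENGTH`, and `MATURE_LENGTH` are assumptions made for illustration.

```python
from itertools import accumulate

# Segment lengths (amino acids) as stated in the abstract,
# listed in N- to C-terminal order.
SEGMENTS = [
    ("preproleader", 42),
    ("light chain", 155),
    ("connecting dipeptide (Lys-Arg)", 2),
    ("heavy chain", 262),
]


def segment_ranges(segments):
    """Return {name: (start, end)} with 1-based inclusive positions.

    Assumption: positions are counted from the first residue of the
    preproleader; the paper may use a different numbering scheme.
    """
    ends = accumulate(length for _, length in segments)
    return {
        name: (end - length + 1, end)
        for (name, length), end in zip(segments, ends)
    }


RANGES = segment_ranges(SEGMENTS)
LENGTHS = dict(SEGMENTS)

# Full translation product: all four segments.
PRECURSOR_LENGTH = sum(LENGTHS.values())

# Processing removes the preproleader and the connecting dipeptide,
# leaving the light and heavy chains.
MATURE_LENGTH = LENGTHS["light chain"] + LENGTHS["heavy chain"]

if __name__ == "__main__":
    for name, (start, end) in RANGES.items():
        print(f"{name}: residues {start}-{end}")
    print(f"precursor: {PRECURSOR_LENGTH} aa, mature: {MATURE_LENGTH} aa")
```

Under these assumptions the precursor comes to 42 + 155 + 2 + 262 = 461 amino acids, and the mature protein to 155 + 262 = 417 amino acids across the two chains.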

Publication types

  • Comparative Study
  • Research Support, U.S. Gov't, P.H.S.

MeSH terms

  • Amino Acid Sequence
  • Base Sequence
  • Blood Coagulation Factors / genetics*
  • DNA Restriction Enzymes
  • Factor IX / genetics
  • Genes*
  • Glycoproteins / genetics*
  • Humans
  • Nucleic Acid Hybridization
  • Protein C
  • Protein Conformation


Substances

  • Blood Coagulation Factors
  • Glycoproteins
  • Protein C
  • Factor IX
  • DNA Restriction Enzymes

Associated data

  • GENBANK/M11228