Genes for intermediate filament proteins and the draft sequence of the human genome: novel keratin genes and a surprisingly high number of pseudogenes related to keratin genes 8 and 18

J Cell Sci. 2001 Jul;114(Pt 14):2569-75. doi: 10.1242/jcs.114.14.2569.


We screened the draft sequence of the human genome for genes that encode intermediate filament (IF) proteins in general, and keratins in particular. The draft covers nearly all previously established IF genes, including the recent cDNA and gene additions, such as pancreatic keratin 23, synemin and the novel muscle protein syncoilin. In the draft, seven novel type II keratins were identified, presumably expressed in the hair follicle/epidermal appendages. In summary, 65 IF genes were detected, placing IF among the 100 largest gene families in humans. All functional keratin genes map to the two known keratin clusters on chromosomes 12 (type II plus keratin 18) and 17 (type I), whereas other IF genes are not clustered. Of the 208 keratin-related DNA sequences, only 49 reflect true keratin genes, whereas the majority describe inactive gene fragments and processed pseudogenes. Surprisingly, nearly 90% of these inactive genes relate specifically to the genes of keratins 8 and 18. Other keratin genes, as well as those that encode non-keratin IF proteins, either lack gene fragments/pseudogenes or have only a few derivatives. As parasitic derivatives of mature mRNAs, the processed pseudogenes of keratins 8 and 18 have invaded most chromosomes, often at several positions. We describe the limits of our analysis and discuss the striking unevenness of pseudogene derivation in the IF multigene family. Finally, we propose to extend the nomenclature of Moll and colleagues to any novel keratin.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Amino Acid Sequence
  • Genome, Human*
  • Humans
  • Intermediate Filament Proteins / genetics*
  • Keratin-1
  • Keratin-8
  • Keratins / genetics*
  • Lamins
  • Molecular Sequence Data
  • Multigene Family / genetics
  • Neurofilament Proteins / genetics
  • Nuclear Proteins / genetics
  • Phylogeny
  • Pseudogenes*
  • Terminology as Topic

Substances

  • Intermediate Filament Proteins
  • KRT77 protein, human
  • KRT8 protein, human
  • Keratin-1
  • Keratin-8
  • Lamins
  • Neurofilament Proteins
  • Nuclear Proteins
  • Keratins