A method for prediction of the locations of linker regions within large multifunctional proteins, and application to a type I polyketide synthase

J Mol Biol. 2002 Oct 25;323(3):585-98. doi: 10.1016/s0022-2836(02)00972-5.

Abstract

Multifunctional proteins often appear to result from the fusion of smaller proteins and, in such cases, can typically be separated into their ancestral components simply by cleaving the linker regions that separate the domains. Although sequence alignment, structural evidence, or light proteolysis may guide the search, locating linker regions remains empirical. We have developed an algorithm, named UMA, that predicts the locations of linker regions in multifunctional proteins by quantifying the conservation of several properties within protein families; its predictions agree well with structurally characterized proteins. This technique has been applied to a family of fungal type I iterative polyketide synthases (PKS), allowing prediction of the locations of all of the standard PKS domains, as well as two previously unidentified domains. Using these predictions, we report the cloning of the first fragment from the PKS norsolorinic acid synthase, which is responsible for biosynthesis of the first isolatable intermediate in aflatoxin production. The expression, light proteolysis and catalytic abilities of this acyl carrier protein-thioesterase didomain are discussed.
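The general idea of locating linkers from conservation within a protein family can be illustrated with a minimal Python sketch. This is not the UMA algorithm itself, whose properties, weights and scoring the abstract does not specify. The two properties scored here (residue identity and a coarse hydrophobic class), their equal weighting, the moving-average smoothing, and every function name, threshold and window size are assumptions chosen only for illustration. The input is assumed to be a pre-computed multiple sequence alignment given as equal-length strings with "-" for gaps.

```python
from collections import Counter

# Coarse hydrophobic class; an illustrative assumption, not taken from the paper.
HYDROPHOBIC = set("AVILMFWC")


def column_agreement(column, key=lambda r: r):
    """Fraction of all sequences sharing the most common non-gap value in a column."""
    values = [key(r) for r in column if r != "-"]
    if not values:
        return 0.0
    return Counter(values).most_common(1)[0][1] / len(column)


def conservation_profile(alignment):
    """Per-column score: mean of identity agreement and hydrophobic-class agreement."""
    if len({len(seq) for seq in alignment}) != 1:
        raise ValueError("aligned sequences must have equal length")
    scores = []
    for column in zip(*alignment):
        identity = column_agreement(column)
        hydro = column_agreement(column, key=lambda r: r in HYDROPHOBIC)
        scores.append((identity + hydro) / 2)  # equal weights are an assumption
    return scores


def smooth(scores, window=5):
    """Centered moving average; the window is truncated at both ends."""
    half = window // 2
    smoothed = []
    for i in range(len(scores)):
        chunk = scores[max(0, i - half): i + half + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


def candidate_linkers(alignment, threshold=0.65, window=5, min_length=4):
    """Return (start, end) column ranges, 0-based and end-exclusive, whose
    smoothed conservation stays below `threshold` for at least `min_length` columns."""
    smoothed = smooth(conservation_profile(alignment), window)
    regions, start = [], None
    # The infinite sentinel closes a low-conservation run that reaches the last column.
    for i, score in enumerate(smoothed + [float("inf")]):
        if score < threshold and start is None:
            start = i
        elif score >= threshold and start is not None:
            if i - start >= min_length:
                regions.append((start, i))
            start = None
    return regions


if __name__ == "__main__":
    # Toy alignment: two fully conserved blocks flanking a variable, gapped stretch.
    variable = ["GSDTK-NQEP", "PEA--GSTKD", "KT-GSPDENQ", "--SPNEKGTS"]
    toy = ["AVILMFWCAV" + v + "GSTNQDEKRH" for v in variable]
    print(candidate_linkers(toy))
```

In this sketch a candidate linker is simply a sufficiently long run of poorly conserved alignment columns. Smoothing pulls the reported boundaries slightly inward from the true edges of the variable stretch, so the ranges are best read as approximate starting points for experimental construct design rather than exact cleavage sites.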

Publication types

  • Research Support, U.S. Gov't, P.H.S.

MeSH terms

  • 5-Methyltetrahydrofolate-Homocysteine S-Methyltransferase / chemistry
  • 5-Methyltetrahydrofolate-Homocysteine S-Methyltransferase / genetics
  • Algorithms*
  • Animals
  • Anthraquinones / metabolism
  • Bacterial Proteins / chemistry
  • Bacterial Proteins / genetics
  • DNA Polymerase I / chemistry
  • DNA Polymerase I / genetics
  • Fatty Acid Synthases / chemistry
  • Fatty Acid Synthases / genetics
  • Fungal Proteins / chemistry
  • Fungal Proteins / genetics
  • Models, Molecular
  • Models, Statistical
  • Molecular Sequence Data
  • Multienzyme Complexes / chemistry*
  • Multienzyme Complexes / genetics
  • Protein Structure, Tertiary*
  • Sulfurtransferases / chemistry
  • Sulfurtransferases / genetics

Substances

  • Anthraquinones
  • Bacterial Proteins
  • Fungal Proteins
  • Multienzyme Complexes
  • 5-Methyltetrahydrofolate-Homocysteine S-Methyltransferase
  • Fatty Acid Synthases
  • DNA Polymerase I
  • Sulfurtransferases
  • ThiI protein, bacteria
  • norsolorinic acid

Associated data

  • GENBANK/L42766