Arenicola marina possesses cuticular and interstitial collagens, which are mostly synthesised by its epidermis. A cDNA library constructed from the body wall of this annelid was screened with a sea urchin collagen cDNA probe, and several overlapping clones were isolated. Nucleotide sequencing of these clones revealed an open reading frame of 2052 nucleotides. The translation product comprises a triple helical domain of 138 Gly-Xaa-Yaa repeats followed by a 269-residue C-terminal non-collagenous domain (C-propeptide). The triple helical domain contains an imperfection that was previously described in a peptide produced by cyanogen bromide digestion (CNBr peptide) of A. marina interstitial collagen, and that occurs at the same position in the interstitial collagen of the vestimentiferan Riftia pachyptila. This correspondence identifies the clone as coding for the C-terminal part of a fibrillar collagen chain, which was named FAm1alpha, for fibrillar collagen 1alpha chain of A. marina. The non-collagenous domain has a structure similar to that of the carboxy-terminal propeptides of fibrillar pro-alpha chains. However, only six conserved cysteine residues are observed in A. marina, compared with seven or eight in all other known C-propeptides. This difference provides information on the importance of disulfide bonds in C-propeptide interactions and in the collagen-assembly process. Phylogenetic studies indicate that the fibrillar collagen 1alpha chain of A. marina is homologous to the R. pachyptila interstitial collagen and that the FAm1alpha gene evolved independently from the other alpha-chain genes. Complementary analyses indicate that the vertebrate fibrillar collagen family is composed of two monophyletic subgroups, with the collagen type-V chains occupying a specific position.
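
The domain sizes quoted above can be checked against the reported open reading frame length with a little arithmetic. The sketch below is only a consistency check, not part of the original analysis. It rests on two assumptions that the text does not state: that the 2052-nucleotide figure includes one terminal stop codon, and that the imperfection in the triple helix does not change the residue count implied by 138 complete Gly-Xaa-Yaa triplets. The constant and function names are illustrative.

```python
# Figures quoted in the text.
GXY_REPEATS = 138              # Gly-Xaa-Yaa repeats in the triple helical domain
C_PROPEPTIDE_RESIDUES = 269    # residues in the C-terminal non-collagenous domain
REPORTED_ORF_NT = 2052         # reported open reading frame length (nucleotides)

# Assumptions, not stated in the text:
#  - the reported ORF length includes exactly one stop codon;
#  - the triple-helix imperfection does not add or remove residues.
STOP_CODONS = 1
RESIDUES_PER_REPEAT = 3
NT_PER_CODON = 3


def triple_helix_residues(repeats: int = GXY_REPEATS) -> int:
    """Residues in the triple helical domain, three per Gly-Xaa-Yaa repeat."""
    return repeats * RESIDUES_PER_REPEAT


def expected_orf_nt() -> int:
    """ORF length implied by the two domains plus the assumed stop codon."""
    codons = triple_helix_residues() + C_PROPEPTIDE_RESIDUES + STOP_CODONS
    return codons * NT_PER_CODON


if __name__ == "__main__":
    total_residues = triple_helix_residues() + C_PROPEPTIDE_RESIDUES
    print("triple helix residues:", triple_helix_residues())
    print("translated residues:", total_residues)
    print("expected ORF (nt):", expected_orf_nt())
    print("matches reported ORF:", expected_orf_nt() == REPORTED_ORF_NT)
```

Under these assumptions the two domains account for the full reported reading frame, which is consistent with the clone covering only the C-terminal part of the chain and no additional N-terminal sequence.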