Prediction of eye, hair and skin colour in Latin Americans

Sagnik Palmal; Kaustubh Adhikari; Javier Mendoza-Revilla; Macarena Fuentes-Guajardo; Caio Cesar Silva de Cerqueira; Betty Bonfante; Juan Camilo Chacón-Duque; Anood Sohail; Malena Hurtado; Valeria Villegas; Vanessa Granja; Claudia Jaramillo; William Arias; Rodrigo Barquera Lozano; Paola Everardo-Martínez; Jorge Gómez-Valdés; Hugo Villamil-Ramírez; Tábita Hünemeier; Virginia Ramallo; Maria-Laura Parolin; Rolando Gonzalez-José; Lavinia Schüler-Faccini; Maria-Cátira Bortolini; Victor Acuña-Alonzo; Samuel Canizales-Quinteros; Carla Gallo; Giovanni Poletti; Gabriel Bedoya; Francisco Rothhammer; David Balding; Pierre Faux; Andrés Ruiz-Linares

doi:10.1016/j.fsigen.2021.102517

Forensic Sci Int Genet. 2021 Jul:53:102517. doi: 10.1016/j.fsigen.2021.102517. Epub 2021 Apr 6.

Authors

Sagnik Palmal¹, Kaustubh Adhikari², Javier Mendoza-Revilla³, Macarena Fuentes-Guajardo⁴, Caio Cesar Silva de Cerqueira⁵, Betty Bonfante¹, Juan Camilo Chacón-Duque⁶, Anood Sohail⁷, Malena Hurtado⁸, Valeria Villegas⁸, Vanessa Granja⁸, Claudia Jaramillo⁹, William Arias¹⁰, Rodrigo Barquera Lozano¹¹, Paola Everardo-Martínez¹², Jorge Gómez-Valdés¹², Hugo Villamil-Ramírez¹³, Tábita Hünemeier¹⁴, Virginia Ramallo¹⁵, Maria-Laura Parolin¹⁶, Rolando Gonzalez-José¹⁷, Lavinia Schüler-Faccini¹⁸, Maria-Cátira Bortolini¹⁸, Victor Acuña-Alonzo¹², Samuel Canizales-Quinteros¹³, Carla Gallo⁸, Giovanni Poletti⁸, Gabriel Bedoya¹⁰, Francisco Rothhammer¹⁹, David Balding²⁰, Pierre Faux²¹, Andrés Ruiz-Linares²²

Affiliations

¹ UMR 7268 ADES, CNRS, Aix-Marseille Université, EFS, Faculté de Médecine Timone, Marseille 13005, France.
² School of Mathematics and Statistics, Faculty of Science, Technology, Engineering and Mathematics, The Open University, Milton Keynes MK7 6AA, UK; Department of Genetics, Evolution and Environment, and UCL Genetics Institute, University College London, London WC1E 6BT, UK.
³ Laboratorios de Investigación y Desarrollo, Facultad de Ciencias y Filosofía, Universidad Peruana Cayetano Heredia, Lima 31, Perú; Unit of Human Evolutionary Genetics, Institut Pasteur, Paris 75015, France.
⁴ Departamento de Tecnología Médica, Facultad de Ciencias de la Salud, Universidad de Tarapacá, Arica 1000000, Chile.
⁵ Scientific Police of São Paulo State, Ourinhos, SP 19900-109, Brazil.
⁶ Division of Vertebrates and Anthropology, Department of Earth Sciences, Natural History Museum, London SW7 5BD, UK.
⁷ Department of Biotechnology, Kinnaird College for Women, 93 - Jail Road, Lahore 54000, Pakistan.
⁸ Laboratorios de Investigación y Desarrollo, Facultad de Ciencias y Filosofía, Universidad Peruana Cayetano Heredia, Lima 31, Perú.
⁹ Department of Biotechnology, Kinnaird College for Women, 93 - Jail Road, Lahore 54000, Pakistan; GENMOL (Genética Molecular), Universidad de Antioquia, Medellín 5001000, Colombia.
¹⁰ GENMOL (Genética Molecular), Universidad de Antioquia, Medellín 5001000, Colombia.
¹¹ National Institute of Anthropology and History, Mexico City 6600, Mexico; Department of Archaeogenetics, Max Planck Institute for the Science of Human History (MPI-SHH), Jena 07745, Germany.
¹² National Institute of Anthropology and History, Mexico City 6600, Mexico.
¹³ Unidad de Genomica de Poblaciones Aplicada a la Salud, Facultad de Química, UNAM-Instituto Nacional de Medicina Genómica, Mexico City 4510, Mexico.
¹⁴ Departamento de Genética e Biologia Evolutiva, Instituto de Biociências, Universidade de São Paulo, São Paulo, SP 05508-090, Brazil.
¹⁵ Departamento de Genética, Universidade Federal do Rio Grande do Sul, Porto Alegre 90040-060, Brazil; Instituto Patagónico de Ciencias Sociales y Humanas, Centro Nacional Patagónico, CONICET, Puerto Madryn U9129ACD, Argentina.
¹⁶ Instituto de Diversidad y Evolución Austral (IDEAus), Centro Nacional Patagónico, CONICET, Puerto Madryn, Argentina.
¹⁷ Instituto Patagónico de Ciencias Sociales y Humanas, Centro Nacional Patagónico, CONICET, Puerto Madryn U9129ACD, Argentina.
¹⁸ Departamento de Genética, Universidade Federal do Rio Grande do Sul, Porto Alegre 90040-060, Brazil.
¹⁹ Instituto de Alta Investigación, Universidad de Tarapacá, Arica 1000000, Chile; Programa de Genetica Humana, ICBM, Facultad de Medicina, Universidad de Chile, Santiago, Arica 1000000, Chile.
²⁰ Department of Genetics, Evolution and Environment, and UCL Genetics Institute, University College London, London WC1E 6BT, UK; Melbourne Integrative Genomics, Schools of BioSciences and Mathematics & Statistics, University of Melbourne, Melbourne, VIC 3010, Australia.
²¹ UMR 7268 ADES, CNRS, Aix-Marseille Université, EFS, Faculté de Médecine Timone, Marseille 13005, France. Electronic address: pierre.faux@univ-amu.fr.
²² UMR 7268 ADES, CNRS, Aix-Marseille Université, EFS, Faculté de Médecine Timone, Marseille 13005, France; Department of Genetics, Evolution and Environment, and UCL Genetics Institute, University College London, London WC1E 6BT, UK; Ministry of Education Key Laboratory of Contemporary Anthropology and Collaborative Innovation Center of Genetics and Development, School of Life Sciences and Human Phenome Institute, Fudan University, Yangpu District, Shanghai, China. Electronic address: andresruiz@fudan.edu.cn.

PMID: 33865096
DOI: 10.1016/j.fsigen.2021.102517

Abstract

Here we evaluate the accuracy of prediction for eye, hair and skin pigmentation in a dataset of >6500 individuals from Mexico, Colombia, Peru, Chile and Brazil (the CANDELA dataset, CAN, which includes genome-wide SNP data and quantitative/categorical pigmentation phenotypes). We evaluate accuracy in relation to different analytical methods and various phenotypic predictors. As expected from statistical principles, we observe that quantitative traits are more sensitive to changes in the prediction models than categorical traits. We find that Random Forest and Linear Regression are generally the best-performing methods. We also compare the prediction accuracy of SNP sets defined in the CAN dataset (including 56, 101 and 120 SNPs for eye, hair and skin colour prediction, respectively) to the well-established HIrisPlex-S SNP set (including 6, 22 and 36 SNPs for eye, hair and skin colour prediction, respectively). When training prediction models on the CAN data, we observe remarkably similar performances for HIrisPlex-S and the larger CAN SNP sets for the prediction of hair (categorical) and eye (both categorical and quantitative) colour, while the CAN sets outperform HIrisPlex-S for quantitative, but not for categorical, skin pigmentation prediction. The performance of HIrisPlex-S, when models are trained in a world-wide sample (although consisting of 80% Europeans, https://hirisplex.erasmusmc.nl), is lower than when models are trained in the CAN data (particularly for hair and skin colour). Altogether, our observations are consistent with common variation of eye and hair colour having a relatively simple genetic architecture, which is well captured by HIrisPlex-S, even in admixed Latin Americans (with partial European ancestry). By contrast, since skin pigmentation is a more polygenic trait, accuracy is more sensitive to prediction SNP set size, although here this effect was only apparent for a quantitative measure of skin pigmentation. Our results support the use of HIrisPlex-S in the prediction of categorical pigmentation traits for forensic purposes in Latin America, while illustrating the impact of training datasets on its accuracy.
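
The dependence of accuracy on SNP set size for a polygenic quantitative trait can be illustrated with a toy simulation. The sketch below is not the authors' pipeline and uses no CANDELA or HIrisPlex-S data: the genotypes, allele frequency (0.3), effect sizes (0.5 per allele), noise level and sample sizes are all invented assumptions, and the model is a plain least-squares linear regression written with the Python standard library only. It fits the same trait with a small and a large subset of the simulated SNPs and compares held-out R².

```python
import random

def simulate(n, effects, noise_sd, rng, freq=0.3):
    """Simulate additive genotypes (0/1/2 allele counts) and a quantitative trait."""
    X, y = [], []
    for _ in range(n):
        g = [(rng.random() < freq) + (rng.random() < freq) for _ in effects]
        X.append(g)
        y.append(sum(b * x for b, x in zip(effects, g)) + rng.gauss(0.0, noise_sd))
    return X, y

def fit_ols(X, y):
    """Ordinary least squares with intercept, via the normal equations."""
    A = [[1.0] + [float(v) for v in row] for row in X]
    p = len(A[0])
    # Augmented matrix [A^T A | A^T y]
    M = [[sum(r[i] * r[j] for r in A) for j in range(p)]
         + [sum(r[i] * t for r, t in zip(A, y))] for i in range(p)]
    # Gauss-Jordan elimination with partial pivoting
    for c in range(p):
        piv = max(range(c, p), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(p):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [a - f * b for a, b in zip(M[r], M[c])]
    return [M[i][p] / M[i][i] for i in range(p)]

def predict(beta, row):
    return beta[0] + sum(b * x for b, x in zip(beta[1:], row))

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

rng = random.Random(0)                 # fixed seed for reproducibility
effects = [0.5] * 20                   # hypothetical: 20 SNPs, equal effects
X, y = simulate(600, effects, 1.0, rng)
X_train, y_train = X[:400], y[:400]    # training split
X_test, y_test = X[400:], y[400:]      # held-out split

def evaluate(cols):
    """Fit on the chosen SNP columns and return held-out R^2."""
    cols = list(cols)
    subset = lambda rows: [[row[c] for c in cols] for row in rows]
    beta = fit_ols(subset(X_train), y_train)
    return r_squared(y_test, [predict(beta, r) for r in subset(X_test)])

r2_small = evaluate(range(4))          # small SNP set (4 of 20)
r2_large = evaluate(range(20))         # full SNP set
print(f"held-out R^2, 4 SNPs: {r2_small:.2f}; 20 SNPs: {r2_large:.2f}")
```

Because every simulated SNP contributes equally here, the larger set necessarily explains more variance; for traits dominated by a few large-effect variants, as the abstract suggests for eye and hair colour, the gain from adding SNPs would be expected to be much smaller.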

Keywords: Admixture; DNA phenotyping; Eye-colour; Hair-colour; Latin Americans; Pigmentation prediction; Skin-colour.

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Datasets as Topic
Eye Color / genetics*
Genetics, Population
Genotype
Hair Color / genetics*
Humans
Latin America
Logistic Models
Phenotype
Polymorphism, Single Nucleotide*
Skin Pigmentation / genetics*