Objective: To develop an R script that efficiently and accurately filters genome-wide association studies (GWASs) from the public GWAS Catalog database.

Methods: Selection principles for GWASs were established with reference to previous studies. The manual filtering process in the GWAS Catalog was abstracted into standard algorithms. Two programmers jointly wrote the R script (gwasfilter.R), which was then tested repeatedly by others.

Results: Filtering GWASs with gwasfilter.R takes six steps. The script defines five main functions and can filter GWASs simultaneously by several conditions, including "whether the GWAS has been replicated", "sample size", and "ethnicity of the study population". Filtering the GWASs of a single trait takes no more than 1 second.

Conclusions: gwasfilter.R is easy to use, provides an efficient and standardized filtering process, and can be applied flexibly to GWAS selection. The source code is available on GitHub (https://github.com/lab319/gwas_filter).
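To make the multi-condition filtering described in the Results concrete, the following is a minimal sketch in R. It is not the code of gwasfilter.R: the toy table, its column names, and the helper `filter_studies` with its parameters are all hypothetical and chosen only for illustration. The sketch assumes that each study is one row and that a missing or zero replication sample means the study was not replicated.

```r
# Toy study table. The column names and values are illustrative assumptions,
# not the GWAS Catalog schema and not the interface of gwasfilter.R.
studies <- data.frame(
  accession     = c("S1", "S2", "S3", "S4"),
  trait         = c("asthma", "asthma", "asthma", "height"),
  initial_n     = c(12000, 800, 25000, 50000),
  replication_n = c(3000, 0, NA, 10000),
  ancestry      = c("European", "European", "East Asian", "European"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: keep the studies of one trait that meet every criterion.
filter_studies <- function(df, trait, min_n = 0, ancestry = NULL,
                           require_replication = FALSE) {
  # Start from the trait and the minimum discovery sample size.
  keep <- df$trait == trait & df$initial_n >= min_n

  # Optionally restrict to one or more ancestries.
  if (!is.null(ancestry)) {
    keep <- keep & df$ancestry %in% ancestry
  }

  # Optionally require a replication stage; a missing or zero replication
  # sample is treated as "not replicated" (an assumption of this sketch).
  if (require_replication) {
    keep <- keep & !is.na(df$replication_n) & df$replication_n > 0
  }

  df[keep, , drop = FALSE]
}

# Replicated European-ancestry asthma studies with at least 5000 participants.
selected <- filter_studies(studies, trait = "asthma", min_n = 5000,
                           ancestry = "European", require_replication = TRUE)
print(selected$accession)
```

Combining the conditions into a single logical vector keeps each criterion optional and independent, which is one simple way to obtain the kind of flexibility the abstract describes. Real GWAS Catalog exports store sample sizes and ancestry as free text, so a practical script would need a parsing step before a filter like this could be applied.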