The Validate Workflow 

This page is designed to aid users in navigating the Validate Workflow.

What is Validate?

"The purpose of Validate is to provide information on both SNP effect size estimation and identifying SNP capability performance for various GWAS and QTL tools. The eventual goal is two-fold: 1 Publish information about the performance of different tools for different types of simulation parameters (such as population structure and different levels of heritability) somewhere easily viewable for iPlant users. Essentially, we hope to show researchers when best to use a tool as compared with another. 2 Provide a pipeline or workflow for testing installed tools. This is to encourage iterations of the first goal."

- Dustin Landers (The architect of the original Validate program)

The workflow consists of several genome-wide association study (GWAS) tools, plus software that analyzes how well those GWAS tools perform.

More specifically, the workflow includes:

How to get started

  1. It is highly recommended that you watch the webinar “Getting Started with iPlant”, given monthly by Jason Williams, as an introduction to iPlant and some of its features.

  2. Several accounts need to be set up and some software needs to be downloaded before getting started. Follow this link and return after you have followed the instructions for setting up accounts.

  3. You can acquaint yourself with Atmosphere here. In general, Atmosphere gives you access to a virtual machine on which all of the programs needed to run the workflow are already installed. This lab has launched several Atmosphere images, although you will likely need only the Validate image unless you want to work with a specific tool. Validate 0.9 is available as an Atmosphere image under the name Validate Workflow v0.9.

  4. It can also be helpful to check here to learn more about Stampede and here to learn more about the Agave API. Stampede is housed at TACC and is the world's largest supercomputer dedicated to science. The Agave API is a tool for creating apps and deploying them on Stampede. The workflow can be operated exclusively on Stampede; however, this process is under development, and the following pages are written for use in Atmosphere.


Learn about CyVerse's allocation policies here.

Next Steps

After completing the steps above, you are ready to begin. You can either start at the simulate page, found here, or continue scrolling through this page to learn more about the Validate project and find links to useful information.

To learn more about the various offerings of the iPlant Collaborative, please check out the main page for getting started with the iPlant Collaborative.

If you are interested in further developing the Validate workflow, check here, or here for a more statistically oriented guide.

When working in an Atmosphere image's terminal, it can be helpful to know how to use iCommands. These are essentially commands for moving data to and from the Data Store.
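As a rough illustration of moving data with iCommands, the sketch below builds `iput` (upload) and `iget` (download) command lines from Python and prints them instead of running them. It assumes iCommands is installed on the instance and that `iinit` has already been run once to log in. The helper names, the file names, and the `/iplant/home/your_username/...` paths are hypothetical placeholders, not part of the Validate workflow itself.

```python
import shlex
import subprocess


def build_upload(local_path, remote_collection, recursive=False):
    """Build an `iput` command that copies a local file or folder to the Data Store."""
    cmd = ["iput"]
    if recursive:
        cmd.append("-r")  # -r copies a whole directory
    cmd += [local_path, remote_collection]
    return cmd


def build_download(remote_path, local_dir=".", recursive=False):
    """Build an `iget` command that copies data from the Data Store to the instance."""
    cmd = ["iget"]
    if recursive:
        cmd.append("-r")
    cmd += [remote_path, local_dir]
    return cmd


def run(cmd, dry_run=True):
    """Print the command; execute it only when dry_run is False."""
    print(shlex.join(cmd))
    if dry_run:
        return None
    return subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Hypothetical paths: replace with your own username and folders.
    home = "/iplant/home/your_username"
    run(build_upload("results", home + "/validate_results", recursive=True))
    run(build_download(home + "/genotypes.ped", "."))
```

Calling `run(..., dry_run=False)` would execute the command on the instance; the dry run is the default so the commands can be inspected first.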

You may also want to learn how to install R packages in Atmosphere; instructions can be found here.

To view the source code and additional information on any of these programs, please check the main GitHub repository.

Many other tools for comparing statistical methods exist, such as DSCR. A basic comparison of the function and intent of DSCR and the Validate workflow can be found here.

The Validate workflow is still in development and is currently being tested. If you notice any issues or have any comments, we would greatly appreciate hearing them!
Please contact us at Thank you for using our tools!

Particularly large datasets may require instances with more memory. Some GWAS tools are computationally demanding, and if an instance lacks sufficient memory, the process may be "killed" mid-computation. In our experience, file sets larger than 1 GB need at least 4 GB of memory to process reliably.
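The rule of thumb above can be turned into a quick check before choosing an instance size. The following is a minimal sketch, not part of the Validate workflow: the function names are hypothetical, and it assumes that "1 GB" and "4 GB" can be read as 1024^3-byte units. It only flags the case the guideline covers (file sets over 1 GB on instances with less than 4 GB) and says nothing about smaller file sets.

```python
import os

GIB = 1024 ** 3                    # assumption: GB is treated as 2**30 bytes
LARGE_FILESET_BYTES = 1 * GIB      # threshold from the guideline above
MIN_MEMORY_FOR_LARGE = 4 * GIB     # memory the guideline suggests for such file sets


def total_size_bytes(paths):
    """Sum the on-disk sizes of the given input files."""
    return sum(os.path.getsize(p) for p in paths)


def violates_memory_guideline(fileset_bytes, instance_memory_bytes):
    """Return True only when a >1 GB file set is paired with <4 GB of memory."""
    if fileset_bytes <= LARGE_FILESET_BYTES:
        return False  # the guideline does not cover file sets this small
    return instance_memory_bytes < MIN_MEMORY_FOR_LARGE


if __name__ == "__main__":
    # Example: a 2.5 GB file set on an instance with 2 GB of memory.
    if violates_memory_guideline(int(2.5 * GIB), 2 * GIB):
        print("Consider an instance with at least 4 GB of memory.")
```

In practice, `total_size_bytes` would be given the paths of the genotype and phenotype files intended for a GWAS tool, and the instance memory would come from the instance size chosen in Atmosphere.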

Change Log

Version 0.9

Version 0.7

Version 0.5

Version 0.3