
Team Name

Reetu and Friends

Team Members

Reetu Tuteja - Created the Sequenceserver2.0 app on Discovery Environment, presentation, video tutorial 

Jennen Maryniak - Presentation, plots

Jiatian Wang - Initial Makeflow and Work Queue scripts

Hayden Dunn

Erika Tapia - Benchmarking and documentation

Andy Garcia

Nick Reppe - Benchmarking, Demo/Procedure

Project Summary:

CyVerse provides life scientists with a powerful computational infrastructure to handle huge datasets and complex analyses that enable data-driven discovery. CyVerse's extensible platforms provide data storage, bioinformatics tools, image analyses, cloud services, APIs, and more.

 

The Discovery Environment (DE) is a key product of the CyVerse cyberinfrastructure, providing a modern web interface for powerful computing. The Visual and Interactive Computing Environment (VICE) is a recently introduced feature of the DE for running interactive apps.

 

Our midterm project uses CyVerse's Discovery Environment to run BLAST searches through a SequenceServer app that we created with VICE.

Project Description



Code Availability

Initial Scripts:

https://github.com/KartinJulia/SequenceServer

The sequences used were provided by Team BlastEasy:

https://github.com/raptorslab/blastEasy/tree/master/queryseq

Project Timeline/Plan

We did our best to learn and implement distributed computing using HPC systems and the Work Queue platform, but ultimately decided on CyVerse's Discovery Environment as a possible solution.


Installing/Running Instructions

Start by creating a CyVerse account; once logged in, you can access the Discovery Environment. An existing app, "Create BLAST database-2.6.0+", can be used to generate a database. We created an app called "sequenceserver" that uses the output of that app as the database to run queries against. If you already have a database, you do not need to run "Create BLAST database-2.6.0+".

Step-by-step instructions are in the PDF: Procedures for DE sequence analysis.pdf
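For readers who want a rough command-line picture of the two DE steps, here is a minimal Python sketch. It assumes the "Create BLAST database-2.6.0+" app corresponds to the standard BLAST+ tool makeblastdb and that a protein query corresponds to blastp; we have not confirmed exactly what the DE apps run internally. The file names (proteins.fasta, query.fasta, mydb, results.tsv) are hypothetical placeholders. The sketch only builds and prints the commands; it does not execute them.

```python
import shlex


def makeblastdb_cmd(fasta, db_name):
    """Build a makeblastdb command for a protein database (assumed db type)."""
    return ["makeblastdb", "-in", fasta, "-dbtype", "prot", "-out", db_name]


def blastp_cmd(query, db_name, threads=1, out="results.tsv"):
    """Build a blastp command; threads mirrors the 'Number of threads' field."""
    if threads < 1:
        raise ValueError("threads must be at least 1")
    return [
        "blastp",
        "-query", query,
        "-db", db_name,
        "-num_threads", str(threads),
        "-outfmt", "6",  # tabular output
        "-out", out,
    ]


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    for cmd in (
        makeblastdb_cmd("proteins.fasta", "mydb"),
        blastp_cmd("query.fasta", "mydb", threads=4),
    ):
        print(shlex.join(cmd))
```

If you already have a database, only the second command is relevant, which matches the note above about skipping the database-creation app.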

 

Benchmarking

Benchmarking was run using the sequenceserver app we created in the Discovery Environment (DE) within CyVerse. Before launching the app, we specified the number of cores in the drop-down field "Number of threads".


Once the analysis was running, we opened it and pasted the protein sequences into Sequence Server to run BLAST.


All protein sequences run through Sequence Server were 100 residues in length, and the number of cores was selected before launching the app. Run time decreased substantially as the number of cores increased from 1 to 4; for 500 sequences, 4 cores took 80.74 seconds compared with 427.94 seconds on 1 core. Running queries on 8 cores was always faster than running on 1 core, but it was slower than 4 cores in four of the six tests.

Run time in seconds by number of cores:

Number of Protein Sequences    1 core     2 cores    4 cores    8 cores
1                              3.28       0.55       0.45       0.75
5                              8.38       3.22       1.86       2.23
10                             13.29      5.23       3.47       3.24
50                             45.28      19.41      13.59      10.00
100                            84.86      34.58      18.39      20.41
500                            427.94     156.72     80.74      119.46
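To make the scaling easier to compare across rows, the timings can be converted into speedup (1-core time divided by n-core time) and parallel efficiency (speedup divided by the number of cores). The following is a minimal sketch using the numbers from the benchmarking table; the function and variable names are our own and are not part of the project scripts.

```python
# Run times in seconds copied from the benchmarking table:
# {number of protein sequences: {cores: seconds}}
TIMINGS = {
    1:   {1: 3.28,   2: 0.55,   4: 0.45,  8: 0.75},
    5:   {1: 8.38,   2: 3.22,   4: 1.86,  8: 2.23},
    10:  {1: 13.29,  2: 5.23,   4: 3.47,  8: 3.24},
    50:  {1: 45.28,  2: 19.41,  4: 13.59, 8: 10.00},
    100: {1: 84.86,  2: 34.58,  4: 18.39, 8: 20.41},
    500: {1: 427.94, 2: 156.72, 4: 80.74, 8: 119.46},
}


def speedup(times, cores):
    """Speedup relative to the 1-core run."""
    return times[1] / times[cores]


def efficiency(times, cores):
    """Parallel efficiency: speedup divided by the number of cores."""
    return speedup(times, cores) / cores


def fastest_core_count(times):
    """Core count with the shortest run time for one row of the table."""
    return min(times, key=times.get)


if __name__ == "__main__":
    for n_seq, times in TIMINGS.items():
        cells = ", ".join(
            f"{c} cores: {speedup(times, c):.2f}x" for c in (2, 4, 8)
        )
        print(f"{n_seq:>3} sequences -> {cells}; "
              f"fastest: {fastest_core_count(times)} cores")
```

For the 500-sequence row, this gives a speedup of about 5.3x on 4 cores, and 4 cores is the fastest setting in that row. Each timing is a single measurement, so small differences between core counts should be read with caution.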


Presentation

https://docs.google.com/presentation/d/1D755s013X6MVVlTszQXayMuy0A3CpYIpEdLfHAsX_ig/edit?usp=sharing

https://www.loom.com/share/aac933be1c2d45b7ba504870915de1d7

 
