Also check the CyVerse YouTube channel.
Goals, Steps, Apps Used
Annotating genomes with MAKER-P
|Subramaniam||DE||App|| ||Highlights from some iPlant session presentations at the International Plant and Animal Genome Meeting XXIII (January 2015, San Diego). YouTube, pub 2/2015.|
|Applied Concepts in Cyberinfrastructure: Exoplanets video||Lyons, Merchant||DE||YouTube|| ||The Applied Concepts in Cyberinfrastructure course tackles new problems by collaborating with researchers across the University of Arizona. This video features Dr. Jared Males, Department of Astronomy and Steward Observatory and NASA Sagan Fellow, whose big data challenge was how to process tens of thousands of images in order to find the small cluster of pixels that represents a real exoplanet. (If the link no longer works, go to CyVerse YouTube channel and search on Transcriptomes.)|
Commonly used procedure for de novo whole genome assembly of Illumina reads using the DE: Assemble reads, Assess assembly
- SOAPdenovo 2.0.4
- Assess assembly vs. whole genome
Assemble and Annotate Brassica rapa Transcriptome in the Cloud through the iPlant Collaborative and XSEDE
|Devisetty||DE||App||RNA-Seq||Report a hybrid approach that combines the transcripts generated from de novo and reference-based strategies to generate a transcriptome assembly, and subsequently annotate it.|
Atmosphere Cloud -- Data transfer, Volumes, and Imaging
|Williams||Atmo||App||Images||YouTube, pub 2/2014.|
|Atmosphere: Launching, connecting, suspending, terminating instances||Williams||Atmo||App||Images||YouTube, pub 2/2014.|
Use next-generation sequence data produced from Reduced Representation Libraries (RRL), such as restriction-site-associated DNA (RAD) tags.
All necessary Python modules are already installed on the instance.
Prepare raw RAD Illumina data for analysis by removing low quality reads and demultiplexing a set of barcoded samples. Use Stacks to assemble RAD tags de novo from parents and progeny of an F1 mapping cross. Call SNPs, genotypes, and haplotypes of these individuals within Stacks.
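The read preparation described above (dropping low-quality reads, then demultiplexing barcoded samples) can be sketched in a few lines of Python. This is only an illustration of the idea, not how Stacks implements it: the function names, the mean-quality rule, the Phred+33 encoding, and the sample names are all assumptions made for the example.

```python
from collections import defaultdict


def mean_quality(quality_string, offset=33):
    """Mean Phred score of a FASTQ quality string (Phred+33 encoding assumed)."""
    return sum(ord(ch) - offset for ch in quality_string) / len(quality_string)


def demultiplex(reads, barcodes, min_mean_quality=20):
    """Group reads by their leading barcode.

    reads:    iterable of (name, sequence, quality) tuples
    barcodes: dict mapping barcode sequence -> sample name (hypothetical names)
    Reads with low mean quality or no known barcode are dropped.
    """
    by_sample = defaultdict(list)
    for name, seq, qual in reads:
        if not qual or mean_quality(qual) < min_mean_quality:
            continue  # low-quality read
        for barcode, sample in barcodes.items():
            if seq.startswith(barcode):
                # Strip the barcode from both the sequence and its qualities.
                trimmed = (name, seq[len(barcode):], qual[len(barcode):])
                by_sample[sample].append(trimmed)
                break
    return dict(by_sample)


reads = [
    ("r1", "ACGTTTGCA", "IIIIIIIII"),  # barcode ACGT, high quality
    ("r2", "TGCAGGGAC", "IIIIIIIII"),  # barcode TGCA, high quality
    ("r3", "ACGTCCCCC", "!!!!!!!!!"),  # barcode ACGT, Phred 0 throughout
    ("r4", "GGGGAAAAA", "IIIIIIIII"),  # no known barcode
]
barcodes = {"ACGT": "parent_1", "TGCA": "progeny_1"}
samples = demultiplex(reads, barcodes)
```

Here `r3` is discarded for quality and `r4` for lacking a barcode, so each sample ends up with one barcode-trimmed read.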
Introduce new users to BATools and the BATools Wrapper Script.
BATools 0.0.1 is an Atmosphere image that has R version 3.0.1 installed. BATools, an R package for Whole Genome Prediction, is also installed on this image.
|BisQue Analysis Module: NuclearDetector3D||Narro||BisQue||YouTube||Image Analysis||Demo of using the NuclearDetector3D analysis module to detect and quantify nuclei. September 2016. (If the link no longer works, go to CyVerse YouTube channel and search on Transcriptomes.)|
Bisque bioimaging platform
|Merchant||BisQue||YouTube||Image Data||7 minute Flash talk that provides overview of key Bisque features. iPlant presentation at the International Plant and Animal Genome Meeting XXIII (January 2015, San Diego). YouTube, pub 2/2015.|
|BisQue Features||Narro||BisQue||YouTube||Image Data and Analysis||Video describes the major features of the Bisque Image Analysis Environment. September 2016. (If the link no longer works, go to CyVerse YouTube channel and search on Transcriptomes.)|
|BisQue Overview and Upload||iPlant||BisQue||App||Image Data||Video tutorial. (If the link no longer works, go to CyVerse YouTube channel and search on Transcriptomes.)|
|BisQue Overview and Upload Tutorial||CyVerse||BisQue||App||Image Data||(If the link no longer works, go to CyVerse YouTube channel and search on Transcriptomes.)|
|Bisque Overview Modules |
(CLICK TO DOWNLOAD PDF)
High-level overview of the Bisque system, from the perspective of a module developer.
Key concepts involved in the process of integrating a new image analysis module into the Bisque platform. The scope of this document is restricted to batch-mode binary image analysis programs running in a Unix-style environment. Bisque offers a robust way to augment such a program with a graphical user interface, and share it with others. The present document focuses on an approach to such an augmentation using Python, a popular scripting language. Python offers many facilities that support this task, and the Bisque developers have created a convenient Python API wrapping the REST interface of Bisque.
|BLAST a Transcriptome||Hilgert||DE||Workflow||NGS|
Reduce number of transcripts and level of redundancy in an assembled transcriptome, and identify coding sequences that can be submitted to BLASTP searches.
Eliminate small transcripts, Reduce transcript redundancy, Identify and translate coding sequences, Submit translated transcriptome to BLASTP or Submit translated transcriptome to Delta-BLAST
- Select contigs
- CD-HIT-est 4.6.1
- Transcript decoder 1.0
- Blastp-2.2.29 or DeltaBLAST-2.2.29
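The first step of this workflow, eliminating small transcripts, amounts to a length filter over FASTA records. The sketch below is a minimal stand-alone illustration; the helper names and the 8-base cutoff are assumptions for the example and do not describe the Select contigs app itself.

```python
def parse_fasta(text):
    """Parse FASTA-formatted text into a list of (header, sequence) tuples."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def drop_short(records, min_length):
    """Keep only records whose sequence is at least min_length long."""
    return [(h, s) for h, s in records if len(s) >= min_length]


example = """>contig_1
ACGTACGTAC
GT
>contig_2
ACG
>contig_3
ACGTACGT
"""
kept = drop_short(parse_fasta(example), min_length=8)
```

With this input, `contig_2` (3 bases) is removed, while `contig_1` (12 bases) and `contig_3` (8 bases) are kept.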
Bringing Authentic Genomics Research into the Classroom: Analysis of Maize Stress Response (Green Line)
|Makarevitch||DE, DNA Subway||App||RNA-Seq|
Analysis of plant response to abiotic stress. YouTube, pub 2/2015. (If the link no longer works, go to CyVerse YouTube channel and search on Transcriptomes.)
Create a large RNA-seq dataset that includes several maize inbreds subjected to various environmental stresses that can be interrogated to answer a variety of questions. Investigating maize global genome expression in response to abiotic stress provides students with opportunities to ask interesting, novel, and relevant questions, while developing their skills in big data analysis.
|Bulk metadata upload video tutorials (DE)||Walls||DE|| ||Metadata||Watch this video tutorial to see how to apply metadata in bulk to one or more files in the CyVerse Data Store, using a specially designed tool in the DE. This process is useful if your metadata is already entered into a spreadsheet. It is particularly helpful if you have many files that have the same attributes but the same or different values for each attribute. (If the link no longer works, go to CyVerse YouTube channel and search on Transcriptomes.)|
Identify changes in gene expression levels between at least two sequenced transcriptome samples (18 separate tutorials).
Tutorials: Eliminate small transcripts, Reduce transcript redundancy, Identify coding sequences, Rename transcripts, Split RefSeq file, Map transcripts, Combine mapping outputs, Identify best matches, Reformat Blat results, Annotate transcripts, Map RNA-Seq reads to transcripts, Reformat mapping output, Count mapped reads, Trim count tables, Combine counts, Determine differential expression, Separate transcripts by type, Generate transcript lists.
- Select contigs, CD-HIT-est 4.6.1, Transcript decoder 1.0, Linux stream editor, Split FASTA file, Blat (with options), Concatenate Multiple Files, Best Hit for Blat Output, Cut Columns, Rename contigs 2.0, Bowtie-2.2.1--Build-and-Map, SAM to sorted BAM, Index BAM and get stats, Join multiple tab-delimited files, DESeq, Numeric Evaluation of a Data Column, Cut Columns
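As an illustration of the "Combine counts" step, the sketch below merges per-sample read counts into one table keyed by transcript, filling in zero where a transcript was not observed in a sample. The data layout, function name, and sample names are assumptions for the example; the DE apps listed above operate on tab-delimited files instead.

```python
def combine_counts(tables):
    """Merge per-sample count tables into one table.

    tables: dict mapping sample name -> dict of transcript ID -> read count
    Returns (samples, rows), where rows maps each transcript ID to a list of
    counts ordered like samples. Missing transcripts are counted as 0.
    """
    samples = sorted(tables)
    transcripts = sorted(set().union(*(t.keys() for t in tables.values())))
    rows = {
        tid: [tables[sample].get(tid, 0) for sample in samples]
        for tid in transcripts
    }
    return samples, rows


tables = {
    "control": {"t1": 10, "t2": 0},
    "stress": {"t1": 40, "t3": 5},
}
samples, rows = combine_counts(tables)
```

A combined table of this shape (one row per transcript, one column per sample) is the usual input to a differential expression tool such as DESeq.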
|Cluster Orthologs and Paralogs and Assemble Custom Gene Sets||DeBarry||DE||Workflow|| |
Input entire protein-encoding gene or transcript repertoires from genomes of interest, and cluster homologs (orthologs and paralogs), then query clusters to assemble gene sets based on presence/absence and copy number
- Translation of CDS from Transcript Data (app: Transcript Decoder 1.0)
- Rename Sequences and Prepare Input Files (app: fastaRename)
- All-by-All BLASTp and Parse
- Cluster Homologs; optionally, add unclustered sequences to OrthoMCL output and generate reports on the number of clusters in and between species
- Query Clusters and Assemble Custom Gene Sets with queryOrthoMCL
- Map Fasta Headers to clusterReport and/or queryOrthoMCL output with flattenClusters
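Querying clusters by presence/absence and copy number can be pictured as a filter over per-species gene counts. The following sketch assumes a simple in-memory representation of clusters; the function names, the species codes, and the criteria format are hypothetical and are not the queryOrthoMCL interface.

```python
from collections import Counter


def copy_numbers(cluster):
    """Count genes per species in one cluster of (species, gene_id) pairs."""
    return Counter(species for species, _ in cluster)


def query_clusters(clusters, required, absent=()):
    """Return IDs of clusters that satisfy the copy-number criteria.

    required: dict mapping species -> (min_copies, max_copies)
    absent:   species that must have no member in the cluster
    """
    hits = []
    for cluster_id, members in clusters.items():
        counts = copy_numbers(members)  # missing species count as 0
        in_range = all(
            low <= counts[sp] <= high for sp, (low, high) in required.items()
        )
        if in_range and all(counts[sp] == 0 for sp in absent):
            hits.append(cluster_id)
    return hits


clusters = {
    "c1": [("ath", "g1"), ("osa", "g2")],
    "c2": [("ath", "g3"), ("ath", "g4"), ("osa", "g5")],
    "c3": [("ath", "g6"), ("zma", "g7")],
}
# Clusters with exactly one copy in each of two species.
single_copy = query_clusters(clusters, {"ath": (1, 1), "osa": (1, 1)})
# Clusters present in one species and absent from another.
without_zma = query_clusters(clusters, {"ath": (1, 99)}, absent=("zma",))
```

The gene IDs collected from the matching clusters would then form the custom gene set.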
|CoGe1||Lyons||CoGe||App|| ||Intro to CoGe, a Powered by iPlant platform for comparative genomics (genomes of plants and animals). YouTube, pub 10/2014.|
|CoGe2||Lyons||CoGe|| || ||Intro to CoGe, a Powered by iPlant platform for comparative genomics (genomes of plants and animals). YouTube, pub 10/2014.|
|Collaborating with Image Data in Bisque||Fedorov, Kvilekval, Walls||Bisque||Webinar||Image Data||Overview of how to manage data and metadata, share data, annotate data, and search it in BisQue. Webinar on April 22, 2016.|
Data Store Intro 1 – Mod 1
|Lyons||DE, Data Store||App|| ||Intro to iPlant and Data Store for BIO5 researchers, Part 1. YouTube, pub 9/2014.|
Data Store Intro 2 – Mod 1
|Lyons||DE, Data Store||App|| ||Intro to iPlant and Data Store for BIO5 researchers, Part 2. YouTube, pub 9/2014.|
Data Store Intro 3 – Mod 1
|Lyons||DE, Data Store||App|| ||Intro to iPlant and Data Store for BIO5 researchers, Part 3. YouTube, pub 9/2014.|
Data Store Intro 4 – Mod 1
|Lyons||DE, Data Store||App|| ||Intro to iPlant and Data Store for BIO5 researchers, Part 4. YouTube, pub 9/2014.|
|DE Quick Start||Williams||DE||DE||Manual||Become familiar with the Discovery Environment by learning how to create a multiple sequence alignment.|
Detect and call variants from sequence reads using Bowtie and SAMtools.
Align reads, Reformat file, Identify variants, Verify variants.
- SAM to sorted BAM
- Calling SNPs INDELs with SAMtools BCFtools
- SAMTOOLS-0.1.19 VCF-Utilities varFilter
|Characterizing Differential Expression with RNA-Seq (Tuxedo Method)||McKay||DE||Workflow||RNA-Seq||Learn to identify all changes in gene expression levels between two or more sequenced transcriptome samples.|
Epigenetics Part I – Bisulfite Sequence Analysis and Adenosine to Inosine Modifications
|Song, Lu, Barthelson||DE||Workflow|| |
This CyVerse Focus Forum webinar is the first of two webinars on epigenetics analysis in CyVerse. It will provide biologists with an overview of how to use applications available in the Discovery Environment to carry out epigenetics analyses on their datasets. Published 7/22/16.
The webinar will include mapping bisulfite sequencing reads using Bismark and ZED-align, and getting methylation ratios of individual cytosines across the genome. We will also discuss getting Differentially Methylated Regions (DMRs) using aligned reads from the outputs of popular bisulfite sequence aligners, Bismark as well as ZED-align. Finally, we will briefly describe a third aligner, GSNAP, which is useful for bisulfite sequencing and for adenosine to inosine modification.
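For readers unfamiliar with the term, the methylation ratio of a cytosine is conventionally the fraction of reads covering that position that report it as methylated. The sketch below works from that definition only; the data layout and function name are assumptions and do not reflect the output format of Bismark, ZED-align, or GSNAP.

```python
def methylation_ratio(methylated, unmethylated):
    """Fraction of covering reads that call the cytosine methylated.

    Returns None when the position has no coverage.
    """
    covered = methylated + unmethylated
    return methylated / covered if covered else None


# Hypothetical per-cytosine read counts: (methylated, unmethylated)
calls = {
    ("chr1", 101): (8, 2),
    ("chr1", 157): (0, 10),
    ("chr1", 203): (0, 0),
}
ratios = {site: methylation_ratio(m, u) for site, (m, u) in calls.items()}
```

Positions without coverage are reported as `None` rather than 0, so they are not mistaken for unmethylated sites.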
|Evaluate and Pre-Process Sequencing Reads||Barthelson||DE||Workflow||NGS|
Clean and filter Illumina reads using DE apps.
Evaluate the quality of reads in a set of sequence files, Remove adapter sequences, Filter the sequences by their quality, Reevaluate the cleaned reads, Evaluate the cleaned reads using a different method
- FastQC 0.10.1 (multi-file)
- FastQC 0.10.1 (multi-file)
- Prinseq-Graph-noPCA evaluate reads
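The adapter-removal and quality-filtering steps can be illustrated with a small stand-alone sketch. It assumes exact adapter matches, Phred+33 qualities, and simple 3' trimming; the function names, the adapter sequence, and the thresholds are example choices and do not describe the DE apps listed above.

```python
def remove_adapter(seq, qual, adapter):
    """Cut the read at the first exact occurrence of the adapter, if any."""
    idx = seq.find(adapter)
    if idx == -1:
        return seq, qual
    return seq[:idx], qual[:idx]


def trim_low_quality_tail(seq, qual, min_quality=20, offset=33):
    """Remove bases from the 3' end while their Phred score is below the cutoff."""
    end = len(qual)
    while end > 0 and ord(qual[end - 1]) - offset < min_quality:
        end -= 1
    return seq[:end], qual[:end]


def clean_read(seq, qual, adapter, min_quality=20, min_length=5):
    """Apply adapter removal, then tail trimming; drop reads that end up too short."""
    seq, qual = remove_adapter(seq, qual, adapter)
    seq, qual = trim_low_quality_tail(seq, qual, min_quality)
    return (seq, qual) if len(seq) >= min_length else None


adapter = "AGATCGG"  # example adapter prefix, chosen for illustration
cleaned = clean_read("ACGTACGTAGATCGG", "IIIIII##IIIIIII", adapter)
too_short = clean_read("ACGT", "IIII", adapter)
```

In the first read the adapter is cut at position 8 and the two low-quality bases before it are then trimmed, leaving six bases; the second read is discarded for being shorter than `min_length`.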
Evolinc is a two-part pipeline to identify lincRNAs from an assembled transcriptome file (.gtf output from Cufflinks) and then determine the extent to which those lincRNAs are conserved in the genome and transcriptome of other species.
The first part of the pipeline is lincRNA identification. Note that Evolinc currently identifies only intergenic non-coding RNAs. We will incorporate identification of all lncRNAs (including natural antisense, overlapping, and those of intragenic/intronic origin) in a later version. The second part is the comparative genomics and transcriptomics analysis, which takes the output of the first part as its input. The pipelines were kept separate in case users do not want to perform an evolutionary analysis on the identified lincRNAs.
|External Scripting for BisQue Workflows||Fedorov, Kvilekval||BisQue||Webinar||Image Analysis||Provides software developers with an overview of how to use the BisQue API to script processing tools for BisQue data, including uploading images and datasets with external metadata from CSV and other sources, analyzing and annotating images with external tools, and collecting and summarizing information stored in BisQue. Webinar on May 25, 2016.|
Introduce new users to the FaST-LMM software for GWAS analysis.
|fastStructure||Devisetty||Atmo||App||Images||fastStructure is a fast algorithm for inferring population structure from large SNP genotype data. It is based on a variational Bayesian framework for posterior inference and is written in Python 2.x (Anil Raj et al., Genetics Jan 2014).|
|fRNAkenseq (HTseq-with-BAM-input) Manual||Hubbard||CoGe||App|| |
Utilize fRNAkenseq, affectionately abbreviated as fRNAk.
To complete the first two steps of RNA-Seq analysis (mapping and transcript quantification), navigate from the main page to MapCount. Select the libraries for which you want to quantify gene expression, then choose the genome representing the organism your samples are from. This genome will be pulled from the databank of over 20,000 FASTA and annotation pairs available to fRNAk. fRNAk processes these genomes with Bowtie2 to enable use of the TopHat2 mapping algorithm, which requires indexed FASTA files (Langmead et al., 2012). Also choose the number of processors to devote to the mapping algorithms in order to parallelize their operations.
Functional Analysis of Your RNAseq Data
Describe a set of tools for functional modeling of RNA-Seq data, including Gene Ontology and pathway enrichment. YouTube, pub 2/2015.
Describe how researchers can use iPlant tools to rapidly add functional information to their own transcript data, providing an initial set of annotations that can subsequently be used for functional modeling. Use “real-life”, publicly available RNA-Seq data sets to demonstrate the applicability of these functional modeling tools; attendees are encouraged to bring their own data sets. Discuss future plans for improving functional modeling tools on iPlant; community feedback is welcome.
|Genome Annotation using MAKER||Subramaniam|| ||App|| ||YouTube, pub 10/2015.|
|Genome-wide Association Study (GWAS) Using a Genotyping-by-sequencing Approach||Wang, Noutsos||DE||Workflow||GWAS||Learn to identify genetic variants that are associated with a trait.|
Get Started with CyVerse webinar
|Williams||DE||App||Overview||YouTube, pub 8/2016.|
|GWAS / GTL Apps Overview||Stapleton||DE||App|| ||Available tools for GWAS within the iPlant cyberinfrastructure as of July 2014. YouTube, pub 7/2014.|
|iCommands||Lyons||Data Store|| || || |
|Installing R packages on Atmosphere||Kling||Atmo||App||Images||Install R packages on Atmosphere: Launch instance, transfer files to instance, install R package, request imaging.|
|Integrating an Analysis Module into Bisque-1 |
(CLICK TO DOWNLOAD PDF)
|Kharitonovam, Predoehl||BisQue||App|| |
Specifics of using Bisque to run and share an analysis module. Extends and builds on the material of a related document, “Bisque module integration.” Intended for users who would like to integrate their analysis code into the Bisque system. Assuming that you already have that code and the input required to run it (files and parameters), this overview outlines steps needed to get started with Bisque.
|Devisetty||DE||App||RNA-Seq||Kallisto is a program for quantifying abundances of transcripts from RNA-Seq data, or more generally of target sequences using high-throughput sequencing reads. It is based on the novel idea of pseudoalignment for rapidly determining the compatibility of reads with targets, without the need for alignment. |
Learn how to annotate genes and identify enriched pathways using KOBAS 2.0.
Image is a build of KOBAS 2.0. Identify statistically enriched pathways, diseases, and GO terms for a set of genes or proteins, using pathway, disease, and GO knowledge from multiple well-known databases.
|MAKER-P Genome Annotation using Atmosphere||Stein||DE||App||Images|
Learn how to run MAKER-P on an Atmosphere image using a small example genome assembly.
Launch MAKER-P Atmosphere image, Run MAKER-P on a small example genome assembly.
MAKER-P Genome Annotation using cctools and Atmosphere
Learn how to run MAKER-P with cctools on an Atmosphere image using an example genome assembly.
Launch MAKER-P Atmosphere image, Run MAKER-P with cctools on an example genome assembly.
|Metadata in BisQue||Narro||BisQue||YouTube|
|Series of videos describing BisQue's support for metadata. Demos the basics of creating a metadata template, applying it to an image, annotating the image using the metadata template, and searching and browsing a dataset of images that have been annotated using a metadata template. September 2016.|
|mini SOAPdenovo||Williams||DE||App|| |
Gain familiarity with a commonly used procedure for de novo whole genome assembly of Illumina reads using the DE.
Assembly of paired and unpaired Illumina reads (app: Soapdenovo2)
Analysis of assembly quality for comparison to what was accomplished in one of the Assemblathon procedures that used Soapdenovo1
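A common summary statistic when comparing assemblies is N50: the length of the contig at which the running total of contig lengths, taken from longest to shortest, first reaches half of the total assembly length. The sketch below computes it from a plain list of contig lengths; the function names and example lengths are illustrative and are not taken from the tutorial.

```python
def n50(lengths):
    """Return the N50 of a list of contig lengths (0 for an empty list)."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    return 0


def assembly_summary(lengths):
    """Basic assembly statistics for a list of contig lengths."""
    return {
        "contigs": len(lengths),
        "total_length": sum(lengths),
        "largest": max(lengths, default=0),
        "n50": n50(lengths),
    }


contig_lengths = [80, 70, 50, 40, 30, 20]
summary = assembly_summary(contig_lengths)
# Total is 290; 80 + 70 = 150 reaches half of it, so N50 is 70.
```

Because N50 depends on the total assembly length, it is most meaningful when the assemblies being compared are of similar total size.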
|NCBI Sequence Read Archive (SRA) Submission||DeBarry||DE||Workflow|| |
Make submissions to the NCBI Sequence Read Archive (SRA), including compressed FASTQ and an XML metadata file, organized into a submission package.
Upload compressed sequence files into DE; Create submission package folders and add compressed sequence files; Add metadata to every folder in submission package; Validate submission package and submit to SRA; Submission of package and validation; If necessary, correct errors and resubmit.
Next Generation Sequencing: Getting Started, Read Evaluation, and Cleanup
|Barthelson||DE||App||NGS||Webinar on next-gen sequencing read cleanup (the most important thing of all!). YouTube, pub 4/2015.|
NGS Eclipse Plugin
|Duitama||DE||App||NGS||YouTube, pub 2/2014.|
Overview of GWAS Theory
|Lorenz||DE||App|| ||Basic science of GWAS. YouTube, pub 7/2014.|
Overview of the iPlant Collaborative
|Vaughn||All|| || ||Highlights from some iPlant session presentations at the International Plant and Animal Genome Meeting XXIII (January 2015, San Diego). YouTube, pub 2/2015.|
Power and Limitations of GWAS
|Lorenz||DE||App|| ||Power and limitations of a GWAS approach to understanding genetics and breeding potential. YouTube, pub 7/2014.|
QIIME is an open-source bioinformatics pipeline for performing microbiome analysis from raw DNA sequencing data. QIIME is designed to take users from raw sequencing data generated on the Illumina or other platforms through publication quality graphics and statistics. QIIME has been applied to studies based on billions of sequences from tens of thousands of samples.
QIIME is an open-source bioinformatics pipeline for performing microbiome analysis from raw DNA sequencing data. QIIME is designed to take users from raw sequencing data generated on the Illumina or other platforms through publication-quality graphics and statistics. This includes demultiplexing and quality filtering, OTU picking, taxonomic assignment, phylogenetic reconstruction, and diversity analyses and visualizations. QIIME has been applied to studies based on billions of sequences from tens of thousands of samples.
|QUAST 4.0||Devisetty||Atmo||Image||NGS||QUAST is a tool for evaluating genome assemblies by computing various metrics.|
Gurevich, A., Saveliev, V., Vyahhi, N., and Tesler, G. (2013) QUAST: quality assessment tool for genome assemblies. Bioinformatics 29, 1072-1075
|RNA-Seq 1||Barthelson||DE||App||RNA-Seq||Basic steps to RNA-seq analyses using iPlant. YouTube, pub 10/2014.|
|RNA-Seq 2||Barthelson||DE||App||RNA-Seq||Review steps in RNA-seq analysis using iPlant. YouTube, pub 10/2014.|
|RNA-Seq 3||Barthelson||DE||App||RNA-Seq||Step-by-step instructions on how to do RNA-Seq analysis using iPlant. YouTube, pub 10/2014.|
|RNA-Seq Methods and Algorithms (Part I)||Pimentel||DE||App||NGS||Part I (Intro and Overview of RNA-Seq) of webinar on Kallisto and Sleuth, new tools for working with RNA-Seq datasets. YouTube, pub 10/2015.|
RNA-Seq Methods and Algorithms (Part II)
|Pimentel||DE||App||NGS||Part II (Alignment Algorithms) of webinar on Kallisto and Sleuth, new tools for working with RNA-Seq datasets. YouTube, pub 10/2015.|
RNA-Seq Methods and Algorithms (Part III)
|Pimentel||DE||App||NGS||Part III (Quantification) of webinar on Kallisto and Sleuth, new tools for working with RNA-Seq datasets. YouTube, pub 10/2015.|
RNA-Seq Methods and Algorithms (Part IV)
|Pimentel||DE||App||NGS||Part IV (Differential Expression) of webinar on Kallisto and Sleuth, new tools for working with RNA-Seq datasets. YouTube, pub 10/2015.|
RNA-Seq Methods and Algorithms (Part V)
|Pimentel||DE||App||NGS||Part V (Live Kallisto Shell Demo) of webinar on Kallisto and Sleuth, new tools for working with RNA-Seq datasets. YouTube, pub 10/2015.|
RNA-Seq Methods and Algorithms (Part VI)
|Pimentel||DE||App||NGS||Part VI (Live Sleuth Demo in R) of webinar on Kallisto and Sleuth, new tools for working with RNA-Seq datasets. YouTube, pub 10/2015.|
|rnaQUAST 1.1.0||Devisetty||Atmo||Image||RNA-Seq||rnaQUAST is a tool for evaluating RNA-Seq assemblies using a reference genome and gene database. In addition, rnaQUAST is capable of estimating gene database coverage by raw reads and of de novo quality assessment using third-party software (STAR, TopHat, GMAP, etc.).|
|rnaQUAST 1.2.0||Devisetty||Atmo||Image||RNA-Seq||rnaQUAST is a tool for evaluating RNA-Seq assemblies using a reference genome and gene database. In addition, rnaQUAST is capable of estimating gene database coverage by raw reads and of de novo quality assessment using third-party software (STAR, TopHat, GMAP, etc.).|
|Taxonomic Name Resolution Service (TNRS)||Hilgert||TNRS||App||App|
Become familiar with TNRS to identify, correct, and update scientific names of plants.
Compile and submit a list of names, process names, download, and examine results.
|Transcriptome Assemblies: Approaches||Barthelson||DE||App||Transcriptomes||YouTube, pub 4/2015. (If the link no longer works, go to CyVerse YouTube channel and search on Transcriptomes.)|
|Transcriptome Assembly (De Novo)||Barthelson||DE||Workflow||Transcriptomes||Learn to assemble a transcriptome without a reference genome and to evaluate the assembly.|
Transposable Elements, Gene Discovery, and DNA Barcoding (Yellow and Blue Lines)
|Burnette||DE, DNA Subway||App|| ||Identify a TE family in the genome of an organism. YouTube, pub 2/2015.|
Using iPlant Tools and Plastome Sequencing As a Springboard into Comparative Genomics
Use iPlant Tools and Plastome Sequencing As a Springboard into Comparative Genomics. YouTube, pub 2/2015.
Plastome Organization and Sequence for the Mimosoid Legume Leucaena trichandra: Sequenced the plastome of L. trichandra, one of the first plastome sequences from the diverse Mimosoideae and a species involved in the origin of four tetraploid species of Leucaena. De novo assembly of a 300 bp insert paired-end Illumina HiSeq library generated a 164,692 bp plastome containing 112 unique genes arranged in the typical large single copy (LSC), inverted repeat, and small single copy regions.
Utilizing iPlant to Unearth Long Non-Coding RNAs
Utilize iPlant to Unearth Long Non-Coding RNAs and Characterize Their Evolution in the Plant Family Brassicaceae. YouTube, pub 2/2015.
Address issues in identification of biologically relevant lncRNAs by using comparative genomics and transcriptomics to recover and curate lncRNAs in the plant family Brassicaceae, using CyVerse resources and CoGe.
|Validate Workflow v0.9||Carpenter||Atmo||App||Images|
Learn to navigate the Validate Workflow.
- Simulate Demonstrate