2009 Tech Talk

Date

Presenter

Contact

Title

Host

URL/Link

Abstract/Notes

Aug 19 2009

Andres Varon

 

POY (Phylogenetic Analysis of DNA and other Data using Dynamic Homology)

Andy Lenard

POY 4 is an open source phylogenetic analysis program for molecular and morphological data. Version 4 supports Maximum Parsimony as its optimality criterion. It analyzes the standard non-additive, additive, and matrix characters commonly found in other phylogenetic analysis programs and, most importantly, dynamic homology (DH) characters, which allow the use of unaligned sequences as characters.

Sep 2 2009

Andy, Edwin, Sriram, Nirav, Sonya

Nirav

Topics of Interest to iPC from OSCON 2009

Nirav

OSCON is the "Open Source Convention" and has an impressive lineup of speakers from various open source projects, along with opportunities to learn best practices from tutorials and talks.
We will explore ways to incorporate what was learned into iPC projects.

Sept 30 2009

Stephen Kobourov

Nirav

GMap, Putting Data on the Map

Nirav

 

Information visualization can be invaluable in making sense out of large
data sets. However, traditional graph visualization methods often fail
to capture the underlying structural information, clustering, and
neighborhoods. GMap, an algorithm for visualizing graphs as maps,
provides a way to overcome some of the shortcomings with the help of the
geographic map metaphor. While graphs, charts, and tables often require
considerable effort to comprehend, a map representation is more
intuitive, as most people are very familiar with maps and even enjoy
carefully examining maps. The effectiveness of GMap is illustrated with
examples from several domains, namely TV shows and Amazon books.

Oct 14 2009

Rutger Vos

Nirav

NeXML, Treebase, PhyloWS

 

 

Discussion with the iPC team on the use of triple stores. We will also review the roadmap and progress of Treebase, NeXML, and PhyloWS.

Dec 2nd 2009

Sheldon McKay

Nirav

A Survey of Genome and Comparative Genome Browsers

Nirav

The need to visualize genome-scale data has been addressed by genome
browser applications, which typically present a graphical rendering of
a reference sequence along with annotations such as gene models,
experimental data from expressed sequence tags, microarray
experiments, etc. Increasing availability of newly sequenced genomes
has also led to growth in the field of comparative genomics and, with
it, an emerging class of software known as comparative genome or
synteny browsers. There is currently an embarrassment of riches in
web-based software for visualizing genome annotations, alignment and
co-linearity, with attendant heterogeneity in approaches to processing
and displaying the data. I will review examples of commonly used
genome and comparative genome browsers, with an emphasis on Generic
Model Organism Database (GMOD) supported software and recent
improvements to deal with very dense information from high throughput
microarray and next-generation sequencing experiments.

Dec 9th 2009

Damian Gessler

Nirav

An Introduction to the Semantic Web

 

 

Dr. Damian Gessler, iPlant Semantic Web Architect, will present an introduction to the semantic web. He will discuss what it is, how it differs from web services, how it fits into NSF's multi-$100 million efforts in data and service persistence, access, and integration, and how it fits into iPlant's unique set of challenges. Today, the only thing greater than the plethora of technology choices available to us is the gap between any single technology and its ability to solve the challenging data and service integration problems ahead. Semantic web technologies offer unique assets by allowing internet-scalability over semantically difficult problems. Many problems in plant science resist solution via lexical and syntactic aggregation. These problems require that contextual information be made amenable to high-throughput reasoning and discovery. Dr. Gessler will present the results of research aimed specifically at addressing this problem.

Dec 16th 2009

Eric Lyons

Nirav

CoGe: A new kind of Comparative Genomics

 

 

Transforming genomes of information into knowledge continues to present a
significant challenge. This transformative process often requires the trained brain of a
biologist relying on pre-built computational systems to access, analyze, and visualize
genomic data. At least four steps are involved: data acquisition, analysis, data and
results visualization, and experimental validation and refinement. Equally daunting are
two chronic, computational infrastructure challenges: updating existing genomic
resources with new data, and deploying new analytical tools.
CoGe is a web-based software system designed to meet all of these challenges.
CoGe currently stores genomes from over 7,000 organisms, comprising over 130
billion basepairs of genomic sequence data and their overlying annotations. It uses a
novel genomic visualization system and a suite of interconnected and interactive tools
permitting researchers anywhere in the world to quickly identify and validate genomes
and genomic regions of interest, and characterize many patterns of genome evolution
including synteny, whole genome duplication events, post-polyploid fractionation,
subfunctionalization, deletions, local duplications, inversions, translocations, misannotations,
motif patterns, and conserved noncoding sequence. By using a web-based
system, it is trivial to link into CoGe’s analytical subsystems. This has proven to
be an efficient way to visually proof and validate large datasets derived from automated
computational pipelines, thus avoiding lists of data that cannot easily be evaluated by
an end-user. Additionally, CoGe’s computational infrastructure substantially decreases
the time for professionals to integrate new genomic data and analytical tools. By
utilizing a database schema that is theoretically scalable for many hundreds of
thousands of genomes, deployment of new genomic information is seamless with tool
integration. Likewise, when new analytical tools are developed for studying a
particular set of genomes, they are seamlessly integrated with all genomes residing in
CoGe.
CoGe’s ability to allow researchers to rapidly identify genes and genomic regions
of interest, and visualize their evolution in comparison to any number of other genomes
and genomic regions, constitutes a powerful new tool for any biologist. CoGe is
publicly available at: http://synteny.cnr.berkeley.edu/CoGe
Current Database Statistics:
Organisms: 7,400
Genomes: 7,950
Nucleotides: 135,000,000,000