
GitHub: https://github.com/JLHonors/ACIC2019-Midterm.git


List of client's needs:

  • Convenience

- Easy to use (a web interface instead of the local command line)

- Able to add/update database

- Makeflow integration (mentioned in the Word doc in Dropbox)

  • Scalability/Performance

- Able to support many users concurrently

 


Project plan draft

 

Major Components:

  • Multiple Server Instances [Primary]

  1.  Multiple instances of SequenceServer with a load balancer 

    Spin up more instances (Docker containers running the SequenceServer image) and place a load balancer in front of them to distribute the load.


  2.  A web server and multiple BLAST worker servers

    Split the component that interfaces with BLAST out of SequenceServer and place that workload on worker servers, of which there can be many. A web server (the rest of SequenceServer) will be responsible for handling incoming search requests and other management tasks.
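
As a rough illustration of how a load balancer in front of several instances could pick a target, here is a minimal Python sketch of a least-connections policy. The policy choice, the class names, and the instance URLs are assumptions made for illustration and are not part of the plan above; in practice an existing load balancer would most likely do this for us.

```python
from dataclasses import dataclass


@dataclass
class Backend:
    """One SequenceServer instance behind the balancer (the URL is hypothetical)."""
    url: str
    active: int = 0  # searches currently running on this instance


class LeastConnectionsBalancer:
    """Send each new search to the instance with the fewest running searches."""

    def __init__(self, urls):
        self.backends = [Backend(url) for url in urls]

    def acquire(self):
        # min() returns the first backend with the lowest count,
        # so ties are resolved by list order.
        backend = min(self.backends, key=lambda b: b.active)
        backend.active += 1
        return backend

    def release(self, backend):
        # Call this when the search finishes, whether it succeeded or failed.
        backend.active = max(0, backend.active - 1)
```

Least-connections is used here instead of plain round-robin on the assumption that BLAST searches vary a lot in run time, so counting running searches should spread long jobs more evenly.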

  • Caching [Primary]

  1. A (shared or non-shared) web cache at the HTTP level in front of all the server instances, possibly using an existing solution (e.g. memcached)

  2. A cache at the application level that prevents repeated BLAST runs for identical parameters, with results stored in an existing solution or an SQL database
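
The application-level cache could be keyed on everything that affects a BLAST result. The Python sketch below shows one way to do that; the function and class names, the `database_version` tag, and the in-memory dictionary (standing in for memcached or an SQL table) are all assumptions for illustration. It is only valid if BLAST returns the same result for the same parameters and an unchanged database.

```python
import hashlib
import json


def cache_key(program, query, database, database_version, options):
    """Build a stable key from everything that could change the BLAST result."""
    payload = json.dumps(
        {
            "program": program,
            "query": query,
            "database": database,
            # Hypothetical version tag, e.g. the time of the last database update.
            "database_version": database_version,
            "options": options,
        },
        sort_keys=True,  # the same options in a different order give the same key
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class ResultCache:
    """In-memory stand-in for the real store (memcached, SQL, ...)."""

    def __init__(self):
        self._results = {}

    def get_or_run(self, key, run_blast):
        # Only call run_blast when this exact search has not been seen before.
        if key not in self._results:
            self._results[key] = run_blast()
        return self._results[key]
```

Including a database version in the key means that adding to or updating a database invalidates old entries without having to flush the cache explicitly.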

  • Optimize SequenceServer (docker image) [Primary]

  1. Switch web server

    The Docker image defaults to WEBrick; we should switch to Nginx or Apache

  2. Faster storage for database

    Store the databases on faster storage, such as a RAM disk (ramfs, tmpfs)

  3. Profile performance for bottlenecks

    Check whether the Ruby codebase of SequenceServer has any bottlenecks, since there is not much we can do about the BLAST codebase
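
Before profiling the Ruby code itself, it may help to have a repeatable end-to-end timing to compare against. The Python sketch below times a search several times and summarizes the results; `run_search` is a hypothetical callable that would submit one sample search and wait for it to finish, and the function name and repeat count are assumptions for illustration.

```python
import statistics
import time


def measure(run_search, repeats=5):
    """Time run_search() several times and summarize the wall-clock durations."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        run_search()  # hypothetical: submit one sample search and wait for it
        durations.append(time.perf_counter() - start)
    return {
        "min": min(durations),
        "median": statistics.median(durations),
        "max": max(durations),
    }
```

Running the same measurement before and after each optimization (web server switch, RAM disk, caching) would show which change actually moves the numbers.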

  • Web interface for database management

  1. Additional web UI

  2. Authentication required for permission to add/update

  • Web interface for Makeflow integration

  1. Web UI for submitting a Makeflow file and its corresponding input files.

  2. Authentication required for permission to run.


Presentation: https://docs.google.com/presentation/d/1AVyCtN7EsDBOcPszcIXVKQP-UoyA3Ghw_60neRHekj0/edit?usp=sharing

 


Concept Map:


Questions for Wilson:

  1. A sample search that represents a certain use case (used to get a baseline for performance)

  2. Do we need to implement a web interface for adding/updating BLAST databases? Is this a priority task?

  3. Do we need to support Makeflow? Do we need to have a web interface for it (e.g. for submitting a Makeflow file and input files)? Is this a priority task?

  4. What authentication should we use if we implement a web interface for managing BLAST databases and support for Makeflow?

  5. Duration of search spikes (e.g. for in-class usage): 20 minutes? 1 hour?

  6. Are the search results from BLAST constant, given that the parameters passed to BLAST do not change and the corresponding database is not updated? Are there any other factors that may affect the results?

     


Development Plan: 

Other Links:

HW4 Objectives: https://docs.google.com/spreadsheets/d/1jSQXwG1s-UfqROXjei97_1Oj37CTCvrbwJ_M9jhP5rE/edit#gid=0

Project Needs: https://docs.google.com/document/d/14OMVOZHsw7HmFnZzlOO5WWh33hFD-fOnskBeu-R5aRA/edit

General Notes: https://docs.google.com/document/d/1DVjNuqDKj7I_2RISIJr6gtqSKn9b3kct7pK391DAHkg/edit?ts=5d7a7a96

 


Members:

Rafael Barreda

Michael Burman

Jackson Lindsay

Kyle Strokes

John Xu

 
