List of client's needs:
- Easy to use (a web interface instead of a local command line)
- Able to add/update databases
- Makeflow integration (mentioned in the Word doc in Dropbox)
- Able to support many users concurrently
Project plan draft
Multiple Instances of Server [Primary]
Multiple instances of SequenceServer with a load balancer
Spin up more instances (Docker containers running the SequenceServer image) and place a load balancer in front of them to distribute the load.
- A web server and multiple BLAST worker servers
Split out the component of SequenceServer that interfaces with BLAST and run it on worker servers, of which there can be many. A web server (the rest of SequenceServer) will be responsible for handling incoming search requests and other management tasks.
A (shared or non-shared) web cache in front of all the server instances at the HTTP level, possibly using an existing solution (e.g. memcached)
A cache at the application level to prevent repeated BLAST runs for identical parameters, with results stored in an existing caching solution or an SQL database
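A rough sketch of the application-level cache idea, in Python for illustration only (SequenceServer itself is Ruby). It assumes BLAST results are deterministic for a given query, program, database version and option set, which is still an open question for Wilson. The names `cache_key`, `SearchCache`, `run_search` and `db_version` are hypothetical, and the in-memory dict stands in for whatever store we end up choosing.

```python
import hashlib
import json


def cache_key(sequence, program, database, db_version, options):
    """Build a deterministic key from everything assumed to affect the result."""
    payload = {
        "sequence": sequence.strip(),  # ignore leading/trailing whitespace only
        "program": program,            # e.g. "blastn"
        "database": database,
        "db_version": db_version,      # hypothetical tag, bumped on every database update
        "options": options,
    }
    # sort_keys makes the key independent of the order options were given in
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class SearchCache:
    """Wraps a search function and skips BLAST when an identical request was seen."""

    def __init__(self, run_search):
        self._run_search = run_search  # callable that actually runs BLAST
        self._store = {}               # stand-in for memcached / SQL table
        self.hits = 0
        self.misses = 0

    def search(self, sequence, program, database, db_version, options=None):
        options = options or {}
        key = cache_key(sequence, program, database, db_version, options)
        if key in self._store:
            self.hits += 1
            return self._store[key]
        self.misses += 1
        result = self._run_search(sequence, program, database, options)
        self._store[key] = result
        return result
```

Including a database version in the key means an updated database never serves stale results; old entries simply stop being hit and can be evicted later.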
Optimize SequenceServer (docker image) [Primary]
Switch web server
The default for the Docker image is WEBrick; we should opt for Nginx or Apache
Faster storage for database
Store the databases on faster storage, such as a ramdisk (ramfs, tmpfs)
Profile performance for bottlenecks
Check for bottlenecks in the Ruby codebase of SequenceServer, since there is not much we can do about the BLAST codebase
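Before profiling inside the Ruby code, a black-box baseline may help: time a batch of sample searches at a given concurrency and compare the numbers before and after each optimization. The sketch below is only that outside measurement, not a Ruby profiler. `submit_search` is a hypothetical callable that sends one search to the server and blocks until the result is ready; `measure_latencies` and `summarize` are names made up for this note.

```python
import statistics
import time
from concurrent.futures import ThreadPoolExecutor


def measure_latencies(submit_search, queries, concurrency):
    """Run submit_search(query) for each query on `concurrency` threads.

    Returns the wall-clock seconds per request, in the same order as `queries`.
    """
    def timed(query):
        start = time.perf_counter()
        submit_search(query)
        return time.perf_counter() - start

    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        return list(pool.map(timed, queries))


def summarize(latencies):
    """Summarize a non-empty list of latencies (p95 uses the nearest-rank method)."""
    ordered = sorted(latencies)
    rank = (95 * len(ordered) + 99) // 100  # ceil(0.95 * n) in integer arithmetic
    return {
        "count": len(ordered),
        "mean": statistics.mean(ordered),
        "median": statistics.median(ordered),
        "p95": ordered[rank - 1],
        "max": ordered[-1],
    }
```

Running the same query set at a concurrency of 1 and at the expected class size should show whether latency degrades under load, which is the case the client cares about.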
Web interface for database management
Additional web UI
Authentication required for permission to add/update
Web interface for Makeflow integration
Web UI for submitting a Makeflow file and its corresponding input files.
Authentication required for permission to run.
Questions for Wilson:
Can we get a sample search that represents a typical use case (to establish a performance baseline)?
Do we need to implement a web interface for adding/updating BLAST databases? Is this a priority task?
Do we need to support Makeflow? Do we need to have a web interface for it (e.g. submitting a Makeflow file and input files)? Is this a priority task?
What authentication should we use if we implement a web interface for managing BLAST databases and for Makeflow support?
How long do search spikes last (e.g. for in-class usage)? 20 minutes? 1 hour?
Are the search results from BLAST constant, given that the parameters passed to BLAST do not change and the corresponding database is not updated? Is there any other factor that may affect the result?