A lot has changed in the SLIC interface since last year. We are happy to present to you an updated SLIC Portal, which includes a lot of new features:
- Browsing the categories is now much easier thanks to the hierarchical "tree" that lists talks and courses
- We will be uploading a lot more course lectures from previous years!
- Searching slide words now takes into account the state of the categories' checkboxes
- users can search globally or narrow their search down to a specific category
- the search terms stay in effect when the user opens a video page (effectively searching within the selected presentation)
- search results can be further sorted based on various criteria (e.g. by displaying the most recently presented talks first)
- Video page and the corresponding presentation information have been updated to provide more details
- Slides' timing information has been added to show when a slide has been shown multiple times
For a more detailed report about the new features keep reading this post. Also, you can check the SLIC Help page for a quick overview of the site's features.
As usual, don't hesitate to contact us with feedback.
(Click on the thumbnail above to see the full-resolution image).
Notes from Alexander Danehy, the SLIC Portal interface developer:
Upon laying eyes on the development branch of SLIC Portal, many new features are instantly noticeable. To keep things clear, we'll start at the top of each page and make our way to the bottom.
- INDEX -
This landing page is internally known as "browse", so you may see me refer to the front page with that nomenclature, occasionally.
We took many of the frequently accessed features and put them in the search bar, to minimize click-through requirements. Such features include narrowing down by presentation year, the ability to sort by various numerical criteria, and a new feature that lets the user filter out slides that do not appear in a video. Alongside these features is a new button, specific to the next section in this post.
Since the last release of SLIC Portal, we have gone through a great deal of discussion centered on the notion of browsing. As a long-time user of computer and Internet technology, I've grown accustomed to the widely accepted methods for finding information. Along the way, Google seems to have been the authority on what's "best" for users, in this context. As such, discretizing "Search" and "Browse" has been a complete non-issue. Usually, a user has a general idea of what she wants, so she types in some relevant keywords then browses through the results that remain.
However, in SLIC, we've decided to employ "Searching" separately from "Browsing". That's not to say that the traditional, Bing-type search-then-browse methodology is not present, in our system. On the new SLIC Portal, you will find a file tree, similar to that found in many operating systems. As long as I've been a part of SLIC, the notion of a "Course" has always been separate from "everything else", otherwise known as a "Talk". As such, there are two roots to the tree.
The "Talks" tree is organized by "who held the presentation" (organization), "what kind of presentation it is" (category), and "the event containing the presentation" (event). Multiple talks can occur at an event. We feel that this structure is sensible since it essentially equates to "largest to smallest element". Think of it as a date format: 2013-08-01. In this format, the largest time period is listed first (year), followed by the second-largest (month) then the smallest is last (the day). The same basic structure holds for "Courses". "Institution" -> "Department" -> "Course Title" -> "Semester". Similarly, multiple lectures occur in a single course.
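To make the largest-to-smallest nesting concrete, here is the shape of the two trees as nested data (the entries are invented examples, not actual SLIC categories):

```python
# Largest-to-smallest nesting, like the 2013-08-01 date format.
talks_tree = {
    "University of Arizona": {            # organization: who held the presentation
        "Colloquium": {                   # category: what kind of presentation
            "Fall 2013 Series": [         # event: multiple talks per event
                "Talk on vision",
                "Talk on tracking",
            ],
        },
    },
}

courses_tree = {
    "University of Arizona": {            # institution
        "Computer Science": {             # department
            "Intro to Programming": {     # course title
                "Fall 2013": [            # semester: multiple lectures per course
                    "Lecture 1",
                    "Lecture 2",
                ],
            },
        },
    },
}
```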
Arriving at such a structure hasn't been easy. The SLIC team sank weeks into deciding what the "best" approach can be. Truthfully, I found that process fascinating. We all use user interfaces founded in one paradigm or another and we all just assume that such an implementation is obviously the "best". But throughout the aforementioned process, I discovered that everybody has an opinion on what would be most functional and practical. Stranger yet: everybody provides infallible reasoning for his or her opinion.
Having nothing selected is equivalent to having everything selected. Selecting any single node in the tree will then narrow the list of visible presentations to those associated with the selected node and its parents. The tree is a straightforward, one-to-one projection of the database and can theoretically grow infinitely, as the database does.
Along with the tree are buttons to make large selection changes easy and quick. "Check All" and "Uncheck All" do precisely what they say they do. We also have added the ability to hide the tree. As the collection of talks and courses grows, so will the tree. After a while, it is likely to become unwieldy and hiding it will clear up a lot of real estate on one's screen.
This is largely a new home for data displaced by the newly available and visible features. It shows the number of presentations relevant to one's current filter set (text search, year range and tree node selection) and the slide count, therein. This also houses the "Reset" button. Clicking this will clear the search terms, reset the tree to "nothing selected" and reset the year range to be all-inclusive. It fully-resets the state of the page as if the user has just arrived.
Going forward, we will refer to a collection of slides on the page as a "slide scroller".
If a slide appears in a video more than once, there is now a visual indicator overlaid on the slide. Clicking such a slide will present the user with a list containing each time that slide appears. When the user clicks one of those times, the page will then act accordingly (if on the VIDEO page, it will seek, if on the front page, it will load the VIDEO page then seek).
- Video -
The same way that the application will remember a user's display type choice (slide bars or just cards), the application remembers where a user left off, in a video. If a user watches a video partially then leaves, ANY TIME she returns, the video will seek to the last-seen point and activate the current slide.
If a user drags the playhead (seeks in the video), the application will now be aware of the currently-shown slide and seek to it. When doing this, the application will center the currently active slide on the page to make it apparent which slide is considered "active".
Lest we forget: we're tracking video views, now.
SLIC is fully portable and will work on ANY LAMP or WIMP stack (it will probably work anywhere, for that matter, but it hasn't been tested anywhere else). It also fully supports SSL, should an administrator choose to enable it. ALL of SLIC's front-end code amounts to approximately three megabytes, and that's only if a user hits EVERY file along the way, which doesn't happen under normal circumstances.
I think we can at last call the multithreaded spot detector finished. In the last month of her Honors Thesis work, Salika Dunatunga and I labored hard to accelerate the convolution step, and it is working great. This speedup is in addition to the earlier-reported speedup from using multiple threads to parallelize the detection task. The final result is, I have to say, really nice -- in Bisque we get in the neighborhood of a 20x acceleration of the detector phase. What once took 25-30 minutes now takes about one minute.
Convolution is really central to the spot detector, and accelerating that portion turned out to be relatively complicated. But let me first give some context. Convolution is a fundamental signal-processing operation; in our case, we use it to deliberately blur the image. (Why? Because an appropriate amount of blur makes tiny sprinkles of noise disappear. It hides irrelevant jitter we wish to ignore.) Computationally, convolution is like a number of famous processes (for example, sorting) in that there are dumb and smart ways to do it. The original version of the software was not just single-threaded, it was doing convolution the dumb way -- the most straightforward algorithmic interpretation of the mathematical definition -- and consequently, it was slow. The abovementioned time was mostly spent on convolution.
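To illustrate what "the dumb way" looks like, here is a direct evaluation of the definition, one output pixel at a time (a NumPy sketch for illustration, not the detector's actual code). It assumes a symmetric kernel, such as a Gaussian, for which correlation and convolution coincide:

```python
import numpy as np

def convolve2d_naive(image, kernel):
    """The most straightforward reading of the definition: for every output
    pixel, multiply the kernel against the neighborhood and sum.
    Cost per pixel grows with the kernel's area."""
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(image, ((ph, ph), (pw, pw)))  # zero-pad the borders
    out = np.zeros(image.shape, dtype=float)
    for y in range(image.shape[0]):
        for x in range(image.shape[1]):
            out[y, x] = np.sum(padded[y:y + kh, x:x + kw] * kernel)
    return out
```

With a megapixel image and a large blur kernel, those nested loops are exactly where the original 25-30 minutes went.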
The first smart way to convolve is to exploit the natural separability of the Gaussian blurring operation. For light blur, this method works great. It doesn't work with all forms of convolution, but it does apply to Gaussian blur. Unfortunately, as the size of the blurring kernel increases (i.e., when you want an even more unfocused result), the computational demands grow quadratically. That's what the blue line in the graph below shows: sigma represents the "heaviness" of the blur. If you want a really heavy blur, it's not the best algorithm (in this graph, up is bad and down is good).
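A sketch of the separable trick in NumPy: two 1D passes replace one 2D pass, so the per-pixel cost tracks the kernel's width rather than its area (again, illustrative code, not the detector's implementation):

```python
import numpy as np

def gaussian_kernel_1d(sigma):
    """Normalized 1D Gaussian, truncated at about three sigma."""
    radius = max(int(3 * sigma), 1)
    x = np.arange(-radius, radius + 1)
    k = np.exp(-x**2 / (2.0 * sigma**2))
    return k / k.sum()

def gaussian_blur_separable(image, sigma):
    """Because the 2D Gaussian factors into two 1D Gaussians, we can blur
    all rows, then all columns, instead of doing one full 2D convolution."""
    k = gaussian_kernel_1d(sigma)
    rows = np.apply_along_axis(lambda r: np.convolve(r, k, mode="same"), 1, image)
    return np.apply_along_axis(lambda c: np.convolve(c, k, mode="same"), 0, rows)
```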
There's another way to perform convolution, and it revolutionized the field of signal processing when it was discovered (or, rediscovered) in 1965 by Cooley and Tukey. It turns out there is a fast way to compute something called the discrete Fourier transform. That was a revolutionary discovery because if you so "transform" an image and a convolutional kernel into their Fourier representations, some very simple (read: fast) math yields a signal equal to . . . the transform of the convolved image! Ok, I realize that doesn't sound very exciting, but trust me, it is a big deal. Thus, by a somewhat byzantine-seeming series of operations, one can perform convolution. The tortuous path one must follow means there is extra overhead cost, but for heavy blur it is worth it: fast Fourier transform (FFT) convolution gets more costly with more blur, but the growth rate is smaller than that of a separable implementation. The orange line in the above graph shows the speed of Fourier-transform-based convolution with different quantities of blur.
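That byzantine-seeming path is short when written down. A minimal NumPy sketch of FFT-based convolution -- transform both signals, multiply pointwise, transform back, then crop to the image size (the real detector uses FFTW, not NumPy):

```python
import numpy as np

def fft_convolve(image, kernel):
    """Convolution via the FFT. Padding both signals to the full output size
    makes the circular convolution equal the ordinary (linear) one."""
    h = image.shape[0] + kernel.shape[0] - 1
    w = image.shape[1] + kernel.shape[1] - 1
    F = np.fft.rfft2(image, s=(h, w))
    G = np.fft.rfft2(kernel, s=(h, w))
    full = np.fft.irfft2(F * G, s=(h, w))
    # crop the "full" result back to the image's size ("same" convolution)
    py, px = kernel.shape[0] // 2, kernel.shape[1] // 2
    return full[py:py + image.shape[0], px:px + image.shape[1]]
```

The cost is dominated by the transforms themselves, which is why it barely notices how big the kernel is.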
As you can see on the above graph, the crossover point between the two methods is around a 24-pixel value of sigma in the Gaussian blur kernel, for our sort of system inputs. Around twenty-four pixels -- that is a very useful quantity to know! When the user requests a certain amount of blur, we can now select the faster of the two available methods: is sigma over 24 or not? The graph itself represents a great deal of work by Salika. That 24-pixel threshold depends on a number of factors and reasonable assumptions, and it might need to be revisited one day. But even if it is not exact, the two methods perform roughly equally in the neighborhood of 24 pixels, and that's all the empirical evidence we need to choose a good convolution method.
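The resulting dispatch is tiny. A sketch, with the 24-pixel empirical crossover baked in as a constant (it would need re-measuring on different hardware or inputs):

```python
SIGMA_CROSSOVER = 24.0  # empirical crossover from the graph; revisit if hardware changes

def choose_method(sigma):
    """Pick the cheaper convolution implementation for a requested blur."""
    return "separable" if sigma < SIGMA_CROSSOVER else "fft"
```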
Thread-safe FFT convolution
While Salika was investigating the above phenomena, we painfully discovered that our FFT convolution code was incompatible with our parallelized code. In other words, the FFT convolver we planned to use was not thread-safe. I've written before about empirically discovering that code is thread-unsafe, and it's always a disappointment. This threatened our grand plan of choosing the best convolution method based on blur size. I set about seeking a thread-safe alternative. I found that the well-known FFTW library does offer limited re-entrant behavior, but only in its transform methods -- none of its support methods! So I wrapped up the support methods in a nice class, figured out a good re-entrant interface for convolution using FFTW, tested it thoroughly, and integrated it into the spot detector. This was a lot of work, but the details aren't so interesting to recount. Anyway, the current revision of the spot detector supports multithreading for both convolution implementations, which I think is impressive. One nettlesome unanswered question concerns the crossover point: as you might have perceived, the current libFFTW-based convolution is somewhat different from the implementation used to generate the above graph -- but I trust the performance difference is modest.
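The general pattern for wrapping a partially re-entrant library is to serialize only the unsafe setup calls behind a lock, while leaving the thread-safe work parallel. A toy Python sketch of that idea (the names are illustrative, not FFTW's actual API):

```python
import threading

_plan_lock = threading.Lock()
_plan_registry = []   # stands in for the library's global, non-thread-safe state

def unsafe_create_plan(shape):
    """Stand-in for a non-reentrant setup call (like a planner): it reads and
    writes shared global state, so concurrent calls could corrupt it."""
    _plan_registry.append(shape)
    return len(_plan_registry) - 1   # the new plan's id

def create_plan(shape):
    """Thread-safe wrapper: only one thread may plan at a time; executing an
    already-created plan (not shown) can still run fully in parallel."""
    with _plan_lock:
        return unsafe_create_plan(shape)
```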
The multithreaded spot detector also needed a bit of spring cleaning. The source code had just undergone some significant remodeling, and there was the software equivalent of debris scattered here and there for me to tidy up. Another analogy would be that of a book editor. The manuscript was finished, but some paragraphs were too long, some didn't make perfect sense anymore in the context of other changes, and so on. For the past few days I've been identifying trouble-spots: places where the program could fall into an infinite loop (threads are tricky this way), clunky blocks of logic to be streamlined, gaps where potential error codes could go unnoticed -- cracks in the hull, you might say. I think the program is much more water-tight and seaworthy than it was a month ago. I've been testing it continuously while making changes, of course, and it seems ready to be tried out in production.
During the spring 2013 semester, we were hard at work adding a lot of improvements, including better search and navigation. A power failure that took the main server offline in the middle of the semester tested our backups and our preparedness to handle unexpected outages.
In my previous update (https://pods.iplantcollaborative.org/wiki/display/ipg2p/2013/05/01/SLIC+Update+for+the+end+of+fall+2012+-+beginning+of+spring+2013), I mentioned the three new members who joined the SLIC team. Here's an overview of what the SLIC team focused on, and who was in charge of each of the areas:
- Break up a video into segments by matching slides to the frames in which they appear
- Create continuous segments, each featuring a single slide or denoting an interval where no slide was shown
- measure how well the automatic video-to-slide matching algorithm works
- compare its results to the human-generated output
- measure the accuracy to adjust the algorithm to perform better
Click on the image above to see the groundtruth interface layout.
- Input: two segmentation files with the frame and slide match numbers
- Compute the percentage of correct matches between the two files
- Show the visual differences in a webpage: the frame image and the slide matches from each file
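The accuracy computation at the core of the comparison is simple; a sketch, assuming each file has been parsed into a frame-to-slide mapping:

```python
def match_accuracy(auto_matches, truth_matches):
    """Percentage of frames where the automatic slide match agrees with the
    human-generated one. Each argument maps frame number -> slide number."""
    frames = set(auto_matches) & set(truth_matches)  # frames present in both files
    if not frames:
        return 0.0
    agree = sum(1 for f in frames if auto_matches[f] == truth_matches[f])
    return 100.0 * agree / len(frames)
```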
The purpose of this work is almost the same as with the segmentation groundtruth work. We also adapted the script to work with the legacy segmentation format.
- a web-based administration interface to allow for management of various database elements
- upload the required video and presentation files
- monitor the progress of a presentation through the processing stages
Adding a new presentation.
Listing all available presentations.
SLIC Portal web interface
I described most of the work on the SLIC interface in my previous blog post. Adding the navigation tree was the major accomplishment of the spring semester. We also spent time writing documentation and end-user help features for the SLIC page.
Tiled view of the presentations.
Sorting presentations by various criteria.
Slide matching code
I am continuously researching and implementing new approaches to improve the heart of the SLIC project -- the slide-to-video matching code. The transformations that I calculate in my code not only increase the accuracy of matching slides to the corresponding video frames, but also contribute to the aesthetic quality of the resulting videos. Some of the improvements will include backprojecting high-quality slides into the video, controlling the transparency of a speaker if a slide has been projected over them, and automatically brightening and color-correcting the scene. Ultimately, the results of this work will give the user much better control over the overall presentation-viewing experience.
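Backprojecting a slide into a video frame boils down to applying a projective transformation (a homography) to slide coordinates. A small NumPy sketch of that mapping (the matrices used below are made-up examples, not ones estimated from real footage):

```python
import numpy as np

def project_points(H, pts):
    """Apply a 3x3 projective transform H to an Nx2 array of points:
    lift to homogeneous coordinates, multiply, divide by the last coordinate.
    This is the mapping that places a slide's corners into the video frame."""
    homog = np.hstack([pts, np.ones((len(pts), 1))])
    mapped = homog @ H.T
    return mapped[:, :2] / mapped[:, 2:3]
```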
These features and updates will allow students to incorporate the SLIC Portal into their studies. We are ready to ask students in the ISTA / CS classes to test-drive the new interface and provide their feedback. In the near future, it would be possible to conduct a study to scientifically show the effectiveness of this system in helping students learn and review the material.
We have implemented a lot of changes to the web interface, making it more robust and user-friendly. Look for a new release of the SLIC Portal interface soon!
This is just a short note to say that I've ported Salika's multithreaded spot detector onto iPlant machines, so now it runs in Bisque. Thus the module we call the Pollen Tube Tracker now runs the multithreaded code.
It turns out that the iPlant machine used for Bisque is not quite as abundant with thread resources as our development systems. In retrospect this is unsurprising: the development systems we use are heavy-duty CPU servers with no other purpose in life except to run Computer Vision programs. Whereas iPlant's Bisque server is not as beefy (less CPU and less memory) and is juggling a lot of tasks, including a fairly complicated Nginx webserver, the Bisque code, and everybody else's Bisque modules. So, although we could achieve something like a twenty-fold speedup in the lab, on the Bisque server we will have to content ourselves with less. I've configured the spot detector to use eight threads, and my tests indicate we get approximately a 4.5-fold speedup, i.e., spot detection formerly took about thirty minutes, and now it takes a bit less than seven minutes. I'd call that a huge improvement.
As I said last month, even in a lab setting, one of the bottlenecks is file IO. There's an irreducible time expense just to read the images off the disk. I believe on the Bisque server, there is more than one hardware constraint that limits performance, and the second big one is RAM. If I crank up the number of threads on the Bisque server (say, to 40), the detection process uses up all its RAM allowance, and the system swaps out detector RAM pages back onto the disk. In other words, it's like the data has to commute through the IO bottleneck numerous times. That really hurts performance. So, more threads can actually be worse. Again, in retrospect this is not surprising, but I did not see it coming; I didn't realize that RAM was quite as tight as all that. We might want to do some tuning to find the best choice for the number of threads.
However, it's probably not worth it to glean much more performance from this first stage of the module, the detector. The second stage of the module, the tracker, is now the one we probably should try to accelerate. Improvements here depend on work done by Ernesto Brau and Jinyan Guan, who've done lots of tracker work and, I think, have found ways to accelerate the tracker used by our module. I hope to be able to say more about this soon.
As I mentioned last time, an undergrad and I have been working on a faster version of the spot detector. I'm glad to say that code is working really well, and offers something in the neighborhood of a 20-fold speedup on typical input datasets. We launch one thread per z-depth in a timeseries of a z-stack, and process them all in parallel. A typical dataset has around 35 or so increments in the Z direction, so we can achieve some very nice parallelism. You might be asking yourself, if you launch 35 or so threads, why not get a 35-times speedup? We can't quite achieve that because you can never parallelize everything. The big sequential factor in our case is reading the file input: the hard drive (or whatever) doesn't sprout new output cables on demand, unfortunately.
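The structure of the parallelism is simple: one worker per z-slice, all joined at the end. A toy Python sketch of the pattern (the "detection" below is a stand-in, and Python threads are used for illustration -- the actual detector uses pthreads in our C/C++ library):

```python
import threading

def detect_slice(z, image, results):
    """Stand-in for per-slice spot detection; the real work is independent
    per z-depth, so slices can run concurrently without coordination."""
    results[z] = image.count("*")  # toy "detection": count marker pixels

def detect_stack(stack):
    """Launch one thread per z-depth, as in the multithreaded detector,
    then wait for all of them before assembling the output."""
    results = {}
    threads = [threading.Thread(target=detect_slice, args=(z, img, results))
               for z, img in enumerate(stack)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```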
Actually, it's a little worse than that. The custom library that we use in our research group was begun in the '80s, back when parallel computing was only for the very elite. Its file IO functions employ what we call "static variables," in a thin software layer to make sure that library users are using and releasing IO resources responsibly. It's sort of a "nanny," and it really does help rapid software development to have one's library automatically check up on your application code. Unfortunately, the nanny cannot tolerate multiple threads -- the conceptual equivalent of a babysitter assuming they're supposed to watch over one kid, when in fact there are several indistinguishable clone children in the same house. Confusion would ensue.
Code like this is called "thread-unsafe," although that's a term with no strict definition. It took a surprisingly long time for us to remember that our library's file IO was not thread-safe, since (so to speak) the nanny never said "I'm confused," but instead would just occasionally, yet rarely, blow up. Human metaphors fail; the program would crash and we didn't know why. Eventually we realized we were calling thread-unsafe code when we loaded the images.
There was still a little nagging uncertainty in my mind that our program could, somehow, call other thread-unsafe code elsewhere in the library, but after a few weeks of stress-testing (for example, using FAR too many threads, and running the program on ALL datasets), I'm feeling more confident that we've got it solid now.
Salika is eager to try to push the spot detector even farther, and now we are exploring a signal processing technique known as separable kernels, which might offer a performance improvement. This one is not a slam-dunk; the potential for performance improvement depends on one parameter choice from the module end user. We honestly don't know whether any speedup is possible, let alone how much. So, she is going to have to do some investigation. However, we are following the principle of optimizing the bottlenecks: the signal-processing step in question really does take a significant chunk of the overall spot-detection time. If we can speed it up by a factor of 2 (or ten or twenty) then you'll really notice the effect. I'll have to let you know what we find.
From early March until now, I've been collaborating with an undergraduate researcher (Salika Dunatunga) on improving the performance of our pollen tube tracker. Specifically, we are focusing on speeding up part of the inference engine (the brown box in this diagram), using one particular technology, threads. There is ample room for this kind of improvement! The module currently takes hours to produce high-quality tracks, which is annoying to anyone in a hurry (i.e., everyone), but fortunately we know a good strategy for making it better.
As I've mentioned before, the inference engine works in two stages, a detector stage followed by a tracker stage. The detector stage takes images as input -- messy, noisy grids of millions of pixels, each pixel having an X, Y, Z location and time coordinate plus a corresponding image intensity there. From that input, it produces a few neat text files of numbers. Ideally, the text file output catalogs the positions of the pollen tube tips. (It inevitably makes a few errors of omission and hallucination too, but those mistakes are not in our scope today.) We are focused on speeding up the detector.
The detector stage, thus, has to wade through hundreds of images, to glean a relative handful of interesting information. The good news is that the images can be safely treated as independent data. In other words, the conclusions we draw from image #42 depend solely on its pixels, and not on the pixels of any other image. Perhaps that sounds obvious, but it need not be true. Imagine, for example, that image 42 and image 43 are adjacent in Z-depth, and a pollen tube tip is perfectly between them. The tip would fluoresce equally brightly in both, but an ideal detector would "know" to detect a tube tip once, neither missing the tip nor doubly-detecting it. To do so, it might have to consider neighboring images. Ravi's lab avoids this problem by setting the Z-stack spacing to a relatively large value, so the tips tend to appear at most once. Our tracker (stage 2) also copes by presuming the fallibility of the detector -- it "knows" the detector will occasionally miss its target.
Another piece of good news is that modern computers are trending towards greater and greater parallelism. It's currently easier to make an eight-core processor than make one processor eight times faster. And there's a programming technology called multi-threading which lets one in essence multi-execute one program simultaneously -- run multiple inputs through one set of instructions, at the same time. The details are hairy, and often confusing. It's dangerously easy for the parallel paths to interfere with each other. Salika and I are both fairly new to this style of programming, and we are learning it at the same time. We are using basic pthreads, and have encountered and overcome a fair share of enigmatic bugs (in both our own and in others' code).
We are not done yet, but a couple of weeks ago I think we turned a corner -- we finally have two threads working pretty well. I think if we can do two, we probably can do twenty with not too much more work (yet more than you might expect -- just like it takes more work to plan and organize a dinner for twenty than for two). So I predict we will be able to boast of some significant speedups soon.
Last semester (fall 2012), the SLIC team released a brand new version of the SLIC website (http://slic.arizona.edu/).
From Oct 25, 2012 to approximately Feb 8, 2013, Alexander Danehy, the main student developer in charge of the SLIC Portal, added a lot of tweaks to the main interface. One of the big changes was the inclusion of course lectures, and the subsequent ability to browse our video collection by different categories.
We've installed jsTree as the primary browsing/filtering system. The tree allows fine-granularity selection from our predetermined hierarchy. It's worth noting that we've discussed, tried and tested a lot of options along the way:
a. drop-down checkboxes
b. walking menu
c. walking checkboxes
d. walking, no-click menu
e. single-selection drop-down lists (previously, user could only select from categories)
A few changes were made to the video page as well:
- added the ability to jump to a slide by its number
- added information about the current number of slides shown
- show segmentation data for slides - whether a slide is active, its time in the video (or multiple times, if it was shown repeatedly), or whether it isn't shown at all
- added a visual indicator of a slide not being shown in a video (on the video, as well as in the SLIC view on the main page)
- recruited a new team member to work on the multiple slide appearances
- modified advanced search - no longer an arrow but a full-text button; added sorting
In an effort to make searching even more powerful, we looked into automatic speech extraction and transcript generation. One of the students, Kavinfranco Devadhas, who joined the project in the fall, began researching the CMU Sphinx project -- an open-source speech recognition system. Getting it to work and modifying it to suit our needs proved to be a challenging exercise. However, he was able to make some progress: he successfully added slide words to the dictionary to improve recognition by about 10-20%. He also worked on a publication idea, which required using slide words to automatically tune the Sphinx parameters. The preliminary results showed that this could be a promising direction. Unfortunately, due to a job offer, he was not able to continue this work in the spring semester. The project is in a state at which the next student can continue to build it up and test new ideas.
Another focus of the SLIC team was the development of a dedicated mobile app. Benjamin Dicken joined the SLIC project at the very end of the spring 2012 semester. He spent the summer trying to revive the Android app written by former SLIC member Steven Gregory. Unable to get it to work (due to driver incompatibilities and a variety of other problems), he continued developing the mobile API originally written for the Android app and later extended by another former student, Derek Leverenz. Instead of starting a new Android app from scratch, we decided to leverage Ben's Objective-C knowledge, and by the start of the fall semester we had a proof-of-concept prototype interface for iOS.
At the end of November, we met with Wayne Peterson, the UA assistant director of Web / Mobile Services. We discussed the possibility of putting the SLIC app in the university enterprise development repository to allow for internal testing and the future release of the app. We got a very enthusiastic response. Unfortunately, due to Ben's graduation in the spring of 2013 and an overwhelming number of other commitments, he was not able to continue working on the app in the spring. The project was put on hold until the team finds a new developer. In the meantime, all design decisions related to the main website are made with mobile users in mind, so the site will be fully usable on mobile devices as soon as we switch to an HTML5-based video player.
At the end of the fall semester, the SLIC team got 3 new members: Haziel Zuniga, Mark Fischer and Matthew Burns. Stay tuned to learn about their projects and contributions.
This is a minor item but I'm proud of it. Part of my research support is to advance vision algorithms like my work on locally-linear structures, for example the trail finding algorithm that Scott Morris, Kobus Barnard and I have developed. This algorithm is, we believe, widely applicable to stringy-looking things (in the absence of tip growth), and one domain we've looked to cross into is that of neuron phenotyping. So the IVI lab has a bit of collaboration with the Restifo Lab in the Department of Neuroscience. We've developed image-inference software for them before, and we still provide a bit of support.
Recently they've decided they would like to run their custom software themselves. As I've said before, we research people often (unintentionally) make that somewhat difficult: we run our own code, so we know just how it has to be configured and run. In this case, the configuration was the snag: they want to run on a Macintosh (I hear that's a popular brand nowadays). Our code for them required Gnuplot, and specifically an older output option for Gnuplot. Well, you can't get that option on a Macintosh.
You can't -- until now! Although I had little experience developing software on a Mac, I found out about Macports, which is a framework for installing things like Gnuplot on Macintosh, and I learned enough to modify the Gnuplot port, stuff in the missing option and get it working. Furthermore, I submitted a patch and ticket to Macports.org. Six days ago my patch was incorporated into the trunk Portfile. So, this little support request has the wider consequence of advancing (in a small way) the evolution of Macintosh open-source software.
So this is from a few weeks back, but still worth a mention. For the month of February, I worked with a small group (Martha Narro, Ravi Palanivelu, Nirav Merchant, Salika Dunatunga, and my advisor Kobus Barnard) to develop a good demo of iPlant's Bisque platform in action. Martha and I worked hard writing a clear, focused tutorial, reminiscent of the "lab protocol" format that Ravi recommended as a good way to speak to biologists in a way that builds on their existing knowledge.
During that same period, I made a large number of minor changes to tweak and streamline the Pollen Tube Tracker. For example, the input parameters were rather cryptically named, at least from the point of view of a plant biologist. I added more help text and altered the wording to communicate their meanings better. One instance of that is the "spot size" parameter. It was formerly specified in pixels; we changed it to microns, and the module extracts the pixel-to-micron conversion factor from the image metadata. Sometimes these changes have unanticipated consequences. When I made the above change, the tracker soon started to take days and days to run. Why was that? It turns out if the user sets the minimum spot size smaller than the equivalent of 2 pixels, then the spot detector goes bonkers -- it sees one-pixel spots everywhere. That's not really a problem for the spot detector, but its excessive output consequently clogs up the tracker.
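The unit conversion plus guard is tiny; a sketch, with invented names, of the kind of check that prevents the sub-2-pixel blowup:

```python
MIN_SPOT_PIXELS = 2.0  # below ~2 px the detector fires on single-pixel noise

def spot_size_in_pixels(spot_size_microns, microns_per_pixel):
    """Convert the user-facing spot size (microns) to pixels, using the
    conversion factor from the image metadata, and clamp to the smallest
    size the detector handles sensibly. Names here are illustrative."""
    pixels = spot_size_microns / microns_per_pixel
    return max(pixels, MIN_SPOT_PIXELS)
```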
So that's why user testing is important. Research people like me tend to write interfaces for a single user (oneself), and it's a bit humbling to have to face again the challenge of HCI, when the user cannot literally read (i.e., share) the mind of the developer.
The goal was the March 2 RCN meeting, and iPlant had, I think, a good showing. I think all the iPlant representatives talked to a ton of people interested in various computing requirements. Many of the questions I received had the common theme of "That looks similar to what I need, but . . ." and in all cases, the specifics of user data, or image models, were very significant. One colleague of Ravi's had color images, and wanted to track both pollen tube tips and pollen tube nuclei. Another wanted to track locally-linear structures (actin fibers) that are moving in all kinds of crazy ways: squirming, twisting, growing and breaking. I bet in 100 years these sorts of adaptations will be easy, but we aren't there yet.
So I've been working on my Bisque module -- specifically, I'm trying to make the output of the pollen-tube-tracker module immediately useful, at least a bit. (The module generates comprehensive output in CSV or XML form, but no one can read all that at a glance.) What our pollen expert requested was a plot of pollen tube velocities versus time. That sounds like a very reasonable request, no? He sketched out for me a simple cartoon, with one trace per tube, with tube velocity as the ordinate against time as the abscissa:
It turns out, that's not so easy.
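Just to be concrete about what the chart needs: the velocity trace for one tube is easy to compute from tracked tip positions. This is only a sketch with made-up data layout (a list of per-frame (x, y) positions); the real tracker's CSV/XML columns differ:

```python
# Approximate tube velocity as displacement between consecutive frames
# divided by the frame interval. One such trace per tube gives the
# requested chart.
from math import hypot

def velocities(track, dt=1.0):
    """track: list of (x, y) tip positions, one per frame.
    Returns one speed per frame interval."""
    return [hypot(x2 - x1, y2 - y1) / dt
            for (x1, y1), (x2, y2) in zip(track, track[1:])]

# With matplotlib at hand, plotting would then be roughly:
#   for tube_id, track in tracks.items():
#       plt.plot(times[1:], velocities(track), label=tube_id)
```

The computation isn't the hard part; the hard part, as the next section explains, is where such a plot could live in the Bisque architecture.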
How Bisque Works
One reason it isn't easy is that a Bisque module doesn't get to speak directly to the users. I want to explain what I mean, so here's a little sketch of how the architecture looks from my perspective:
You, the biologist user, are running a browser and working through the interface Bisque provides. In this diagram, I'm "mod.py," a Python script that gets invoked every time the user picks my module to perform an analysis. My job is to mediate between Bisque and the binary executables (represented by engine.bin) that do the hard work. In the pollen-tube case, there are three binary executables, and they expect to run in a Unix environment, with their input in the form of named files (named just so) in a local filesystem. The engines need command-line arguments, and if anything goes wrong they send a message in English to standard output. Bisque, on the other hand, gives me the URLs of the input files. It gives me user parameters in a slightly different format (e.g., sizes could be in microns rather than pixels). It doesn't understand English error messages.
My job (as mod.py) is to be the bridge: download the images from the URLs and save them locally. Adapt the parameters. Run the engines and interpret their outputs. Detect errors. Translate engine output into a Bisque-acceptable response. Like any translator, mod.py has to understand both parties' "languages" and sometimes do a bit of extra explaining, but it's not that complicated. The Bisque developers have provided code to help out with the hardest parts, like downloading images and uploading XML output. Thus mod.py is roughly 700 lines of Python -- not very long.
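The "run the engine, detect errors, translate" part of that bridge might look something like this. To be clear, this is a toy sketch, not the actual mod.py: the engine command and the XML tag names are invented for illustration:

```python
# Minimal sketch of the bridge role: run an engine binary, capture its
# English error text, and wrap the outcome in Bisque-style XML.
import subprocess
import xml.etree.ElementTree as ET

def run_engine(cmd):
    """Run one engine binary; return (ok, text), where text is stdout on
    success or the engine's error message on failure."""
    proc = subprocess.run(cmd, capture_output=True, text=True)
    if proc.returncode != 0:
        return False, proc.stderr.strip() or proc.stdout.strip()
    return True, proc.stdout.strip()

def to_bisque_response(ok, text):
    """Translate the engine outcome into an XML status tag (hypothetical
    tag names; the real Bisque response schema is Bisque's own)."""
    tag = ET.Element("tag", name="status", value="FINISHED" if ok else "FAILED")
    ET.SubElement(tag, "tag", name="message", value=text)
    return ET.tostring(tag, encoding="unicode")
```

The real script also handles downloading inputs and uploading outputs, which is where the Bisque-provided helper code earns its keep.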
The high-level interface to the module is defined by a module-definition file, which is mdef.xml in the above diagram. It defines a module's inputs and outputs. This is how, for example, you the user can give me, the script, a bunch of numeric parameters to control the tracker. Inside mdef.xml there is a list of input parameters; when you choose my module, Bisque reads mdef.xml and lets you tweak the parameters before clicking the "run" button. When "run" fires, Bisque shunts those parameters over to mod.py.
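To make that flow concrete, here is a made-up miniature of what such a definition could contain and how mod.py might read the parameters Bisque shunts over. The element names, attributes, and parameter names are all hypothetical -- the real module-definition schema is Bisque's own:

```python
# A toy "mdef.xml" fragment embedded as a string, plus the kind of
# parsing mod.py would do to recover the user's input parameters.
import xml.etree.ElementTree as ET

MDEF = """
<module name="PollenTubeTracker">
  <tag name="inputs">
    <tag name="spot_size" type="number" value="3.0"/>
    <tag name="max_gap"   type="number" value="5"/>
  </tag>
</module>
"""

def read_params(mdef_xml):
    """Collect the numeric input parameters declared in the definition."""
    root = ET.fromstring(mdef_xml)
    inputs = root.find("tag[@name='inputs']")
    return {t.get("name"): float(t.get("value")) for t in inputs}

params = read_params(MDEF)
```

In the real system the values would be whatever the user set before clicking "run," not the declared defaults.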
So Where's My Chart?
It's not enough to say that mdef.xml spells out the input parameters for mod.py; more accurately, it acts as a contract between Bisque and mod.py. The output from mod.py, which is exclusively XML, must conform to the structure delineated in mdef.xml. So it defines the output interface (from mod.py and to Bisque), too. Bisque by default supports a few kinds of output. The output XML can define an overlay of aggregate graphical primitives (called GObjects) on the input image. You can produce some text-based summary output. And Bisque will always let you have the raw XML, if you ask for it.
If you want a chart, it gets a bit tricky. You can make simple charts by specifying a single xpath, which will extract a single vector of variables from the output. You can plot them, or draw a histogram, or a few other things, but mdef.xml is essentially an interface. An interface should answer the question "what is the output?" rather than "how do we make the output?"
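The single-xpath mechanism is easy to picture: one xpath pulls one flat vector of values out of the output XML. The output structure and attribute names below are invented for illustration:

```python
# One xpath, one vector: the shape of Bisque's simple charting support.
import xml.etree.ElementTree as ET

OUTPUT = """
<resource>
  <tube id="1"><point t="0" v="1.2"/><point t="1" v="1.5"/></tube>
  <tube id="2"><point t="0" v="0.9"/></tube>
</resource>
"""

def extract_vector(xml_text, xpath, attr):
    """Apply one xpath, collect one attribute per matched element."""
    root = ET.fromstring(xml_text)
    return [float(el.get(attr)) for el in root.findall(xpath)]

vs = extract_vector(OUTPUT, ".//point", "v")
```

Note what's lost: the single vector flattens away the per-tube grouping, which is exactly why one xpath can't reproduce the one-trace-per-tube chart the pollen expert sketched.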
Furthermore, mod.py cannot really generate a plot easily. Its job is to bridge between Bisque and the engine, not to offer reinterpretations of the engine output. (In fact I violate my own advice here a bit, because my module does augment the engine output with a reinterpretation that lacks the time dimension. I call the outputs "time-indexed" and "time-collapsed," because it's easier to visualize the time-collapsed output in Bisque, although the time-indexed one is more complete. It's kind of a hack. If possible I'd like to eliminate that behavior.)
Not Quite There
As I mentioned earlier, we are partway along building a generative model for flatbed-scanner seed images. We have a silhouette component to the model, but we also want to use edge information, since edges are going to provide essential clues to where one kernel ends and another begins (even if they touch). To that end, I've got a Canny edge detector working on seed images. Here's a picture:
The input to the edge detector was a silhouette of seed kernels. The picture here shows randomly colored edge pixels, i.e., the edge detector has identified which pixels are edges of shapes, and (though you cannot see it), it has stored the edge direction, as well as the location, at each colorful point.
One difficulty here is that there is no perfect edge detector; this one includes a number of parameters (thresholds and a scale factor) that help it distinguish true edges from image noise. Those factors depend on image characteristics like seed size. If one tries different seeds, or a seed sample with more dirt, or if one changes the scan resolution, the edge detector might not do as well. We will have to pay attention to how to make the system robust enough to be useful.
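To see why those parameters matter, here is the thresholding idea at the core of edge detection, stripped down to a few lines. This is not the full Canny detector (no smoothing, non-maximum suppression, or hysteresis), just finite-difference gradients compared against a threshold:

```python
# Gradient-magnitude thresholding: the step whose threshold parameter
# separates true edges from noise. On a clean silhouette a wide range
# of thresholds works; noise and resolution changes shrink that range.
def edge_pixels(image, threshold):
    """image: 2D list of intensities. Returns the set of (row, col)
    positions whose gradient magnitude exceeds the threshold."""
    edges = set()
    for r in range(len(image) - 1):
        for c in range(len(image[0]) - 1):
            gx = image[r][c + 1] - image[r][c]  # horizontal difference
            gy = image[r + 1][c] - image[r][c]  # vertical difference
            if (gx * gx + gy * gy) ** 0.5 > threshold:
                edges.add((r, c))
    return edges
```

The gradient direction (from gx and gy) is what gets stored alongside the location in the colorful picture above; raise the threshold too far and real kernel boundaries vanish along with the noise.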
The next step is to integrate the edge information into the image model, i.e., into the likelihood function. That's more a problem of modeling than programming, but fortunately it's an issue our lab has been working on for quite awhile.
Update (17:58): Here's a visualization of the edge direction of those edge pixels, as a tangent line, in case the above was not clear. Each edge point has not only a location, but also a direction.
The pollen-tracking module for Bisque is good enough to push into the world. I believe it works well enough to be useful, and I hope the end users will concur. I look forward to their feedback about how it can be improved. I myself know a number of improvements I'd like to see. But software can eternally be improved, and I know iPlant wants to launch Bisque on November 1, so let's call version 1 complete. Kobus and I have talked, and we need to coordinate with Nirav and Ravi on how to wrap up v1 nicely.
I have noticed that the browser used for Bisque matters. The experience is nicer on Chrome than Firefox. In particular, the graphical rendering and the user interface are a bit quirky in Firefox, but not in Chrome (which I believe is the developers' primary dev platform).
OK, the Bisque situation is much improved since I last posted. It seems obvious now that the "right" approach all along was to install a separate Bisque engine on bovary itself. I wish I'd perceived that earlier. Anyway, now I can develop modules there without disrupting anyone else, and I don't have to wrestle with firewalls or configuration riddles-of-the-sphinx.
I've got the 4D module alive there, i.e., it starts; but I've encountered (or re-encountered?) a bug in the C++ inference code. So we begin another round of getting the CVPR code fixed up, with the original developer. Hope that doesn't take too long.
The past couple of weeks the SLIC team has been busy increasing the security of the interface and ensuring its robustness against SQL injections and other malicious attacks. In the process, we have modularized some of the components, fixed minor bugs and added new features. As soon as we can reliably receive feedback from the site, we will be ready to release this new version to the world.
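The SLIC Portal's actual stack isn't shown in this post, but the core defense against SQL injection is the same everywhere: parameterized queries, where user input is bound as data rather than spliced into the SQL text. A minimal sketch using Python's sqlite3, with a toy table standing in for the real slide-word index:

```python
# Parameterized queries: the driver escapes the bound value, so a
# malicious search term is treated as a literal string, never as SQL.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE slides (id INTEGER, words TEXT)")
conn.execute("INSERT INTO slides VALUES (1, 'bayesian inference')")

def search_slides(conn, term):
    # The "?" placeholder binds the term as data; a term like
    # "'; DROP TABLE slides; --" cannot break out of the string.
    cur = conn.execute("SELECT id FROM slides WHERE words LIKE ?",
                       ("%" + term + "%",))
    return [row[0] for row in cur]
```

Whatever language the portal itself is written in, its database layer almost certainly offers the equivalent placeholder mechanism.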