Big data can mean big headaches for scientists. A new library of software tools from Janelia speeds analysis of data sets so large and complex they would take days or weeks to analyze on a single workstation—if a single workstation could do it at all.
In an age of “big data,” a single computer cannot always deliver the answer a user wants. Computational tasks must instead be distributed across a cluster of computers that analyze a massive data set together. This is how Facebook and Google mine your web history to present you with targeted ads, and how Amazon and Netflix recommend your next favorite book or movie. But big data is about more than just marketing.
New technologies for monitoring brain activity are generating unprecedented quantities of information. That data may hold new insights into how the brain works—but only if researchers can interpret it. To help make sense of the data, neuroscientists can now harness the power of distributed computing with Thunder, a library of tools developed at the Howard Hughes Medical Institute's Janelia Research Campus.
Thunder speeds the analysis of data sets that are so large and complex they would take days or weeks to analyze on a single workstation, if a single workstation could do it at all. Janelia group leaders Jeremy Freeman and Misha Ahrens, together with colleagues at Janelia and the University of California, Berkeley, report in the July 27, 2014, issue of the journal Nature Methods that they have used Thunder to quickly find patterns in high-resolution images collected with multiple imaging techniques from the brains of active zebrafish and mice.
Importantly, they have used Thunder to analyze imaging data from a new microscope that Ahrens and colleagues developed to monitor the activity of nearly every individual cell in the brain of a zebrafish as it behaves in response to visual stimuli. That technology is described in a companion paper published in the same issue of Nature Methods.