Scientific researchers, financial analysts, and pharmaceutical firms have long used very large datasets to answer complex questions. Large datasets, especially when analyzed alongside other sets, can reveal patterns and relationships that would otherwise remain hidden. In this sense, big data analysis is akin to a microscope or telescope: it makes visible things within the data that standard data processing simply cannot reveal.
As a product manager within the Global Market Data group at NYSE Technologies, I was consistently surprised and impressed by the ways in which both customers and partners analyzed and used the enormous sets of market trade, quote, and order book data produced each day.
On the sell side, clients often analyzed data spanning many years to find patterns and relationships that could help fund portfolio managers construct and maintain strategies suited to long-term investment activity. On the buy side, clients were most interested in the previous week or two of data, as analysts looked closely at the trade and quote behavior of disparate assets. University and college clients often sought data spanning decades. Regardless of the specific use case, clients needed technology capable of processing and analyzing substantial, unwieldy amounts of data.
Various technologies are employed to meet these various use cases. For historical analysis, high-powered data warehouses such as those offered by 1010data, ParAccel, EMC, and others are incredible tools. Unlike transactional databases, which are designed for simple storage and retrieval, data warehouses are optimized for analysis. In other words, data warehouses are built, from the ground up, to answer questions using stored data. As a result, they are much faster and more efficient at performing complex analysis than a plain SQL database.
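The storage-versus-analysis distinction comes down largely to layout. A toy sketch of the idea (not tied to any vendor's engine): the same trade records held row-wise, as a transactional database stores them, and column-wise, as an analytical warehouse stores them. All names and values here are illustrative.

```python
# Row store: one complete record per trade, as an OLTP database keeps it.
rows = [
    {"symbol": "IBM",  "price": 181.25, "size": 300},
    {"symbol": "MSFT", "price": 29.10,  "size": 500},
    {"symbol": "IBM",  "price": 181.40, "size": 200},
]

# Column store: one array per field, as an analytical warehouse keeps it.
columns = {
    "symbol": ["IBM", "MSFT", "IBM"],
    "price":  [181.25, 29.10, 181.40],
    "size":   [300, 500, 200],
}

# Retrieving one full record is natural in the row store...
assert rows[1]["symbol"] == "MSFT"

# ...but an aggregate query must walk every record, field by field.
row_volume = sum(r["size"] for r in rows)

# The column store answers the same aggregate by scanning a single
# contiguous array -- the property that lets a warehouse sweep through
# billions of values quickly.
col_volume = sum(columns["size"])

assert row_volume == col_volume == 1000
```

Real warehouses add compression, partitioning, and parallel scans on top of this layout, but the column-at-a-time access pattern is the core of why they outrun general-purpose databases on analytical queries.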
Complex event processors such as those from OneMarketData, Kx Systems (kdb+), and Sybase give high-frequency and other algorithmic traders the ability to analyze market activity, across a wide array of financial instruments and markets, at any given microsecond of the trading day.
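The essence of this style of processing is incremental computation over a moving window of events. A minimal sketch of the idea, not any vendor's API, using made-up tick data: maintain a rolling one-second window over a tick stream and emit a volume-weighted average price (VWAP) as each tick arrives, updating running sums rather than recomputing from scratch.

```python
from collections import deque

def rolling_vwap(ticks, window_us=1_000_000):
    """ticks: iterable of (timestamp_us, price, size) tuples, in time order.
    Yields (timestamp_us, vwap) for the trailing window after each tick."""
    window = deque()   # ticks currently inside the time window
    notional = 0.0     # running sum of price * size
    volume = 0         # running sum of size
    for ts, price, size in ticks:
        window.append((ts, price, size))
        notional += price * size
        volume += size
        # Evict ticks that have aged out of the window from the front;
        # each tick is added and removed exactly once, so the work per
        # event stays constant regardless of stream length.
        while window and ts - window[0][0] > window_us:
            old_ts, old_price, old_size = window.popleft()
            notional -= old_price * old_size
            volume -= old_size
        yield ts, notional / volume

# Hypothetical stream: (microsecond timestamp, price, size)
stream = [(0, 10.0, 100), (400_000, 10.2, 200), (1_500_000, 10.4, 100)]
results = list(rolling_vwap(stream))
```

By the third tick, the first two have aged out of the one-second window, so the VWAP reflects only the most recent trade. Production engines apply the same add-on-arrival, subtract-on-expiry pattern across thousands of instruments simultaneously.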
The aforementioned technologies are now being deployed within industries that seek to benefit from the answers and insights big data can provide. Even business intelligence tools such as those offered by Tableau and MicroStrategy have the capacity to deal with very large and highly complex datasets. To a lesser extent, Microsoft Excel has been retooled to handle big data, with newly architected pivot tables and support for billions of rows of data within a single workbook.
But big data is only useful if analysts ask the right questions and have at least a general idea of the relationships and patterns big data analysis may illuminate.
Read the full article at ReadWriteWeb: How Cloud Computing Democratizes Big Data