Big Data Market Leaders
Big Data investments and IT budget allocations continue to grow. Some analysts estimate that the Big Data market is currently worth $5 billion and project that it will grow beyond $50 billion in the next five years. So who is poised to capitalize? Both traditional vendors and some newly named pure plays are competing for market share.
Approximately five percent of current vendors are pure-play companies that have recently started promoting their Big Data tool sets, many of them based on Hadoop. The rest of the field includes the usual big players in enterprise technology such as IBM, Intel, HP, Oracle, Teradata, Fujitsu, and others. They currently account for about 95 percent of overall Big Data revenue.
The choice facing many IT decision makers and purchasers is complicated. As the CIO, do I stick my neck out and take a chance on nascent technology from a startup that appears to have developed the most innovative tools but carries the baggage of an uncertain financial future? Will this company and its product exist in two years? Can I get the maintenance and support needed for a successful deployment? Or do I play it safe and look to IBM or another large vendor for a homogeneous, integrated solution that may, in the end, not perform as well as one of the pure-play options?
Understanding the Big Data Vendor Selection Process
Ten or more years ago, when we were building data warehouses rather than Big Data applications, understanding and defining the ETL (extract, transform, load) process was a critical success factor in deployment. Building the data warehouse was only part of the problem; using it after construction was an equal challenge. Was the data structured in a meaningful way, so that it could be queried efficiently without waiting 24 or more hours for the answer?
Today the issues are the same, but the tools are better. When we build Big Data systems, we face storage requirements that dwarf those of systems from the last decade. In addition, the speed requirements of Big Data analysis, which call for crunching data with in-memory processing, have driven vendors to innovate widely in the development of their tool sets.
How do I make the best-informed Big Data decision for my organization, and who are the newly emerging Big Data vendors?
The four leading vendors that have emerged in the Big Data space in the last two years are Vertica, Teradata Aster (formerly Aster Data), Greenplum, and Splunk.
Vertica, Teradata Aster, and Greenplum offer massively parallel, columnar analytic databases that deliver highly optimized, fast data loading with near real-time query capabilities. The three were innovating in the data warehouse space and offered Big Data products long before Hadoop emerged as the mainstream open source Big Data tool of choice.
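To see why a columnar layout suits analytic queries, consider the following minimal Python sketch. It is an illustration only and does not reflect how any of the three products is implemented; the sample orders and the helper names (to_columns, total_row_store, total_column_store) are hypothetical. The same data is held once as rows and once as columns, and the aggregate over the column layout reads a single list instead of walking every full record.

```python
# Hypothetical sample data, invented for illustration only.
rows = [
    {"order_id": 1, "region": "east", "amount": 120.0},
    {"order_id": 2, "region": "west", "amount": 75.5},
    {"order_id": 3, "region": "east", "amount": 42.0},
]


def to_columns(rows):
    """Pivot a list of row dicts into one list of values per column."""
    columns = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns


def total_row_store(rows, field):
    """Row layout: every whole record is visited to read one field."""
    return sum(row[field] for row in rows)


def total_column_store(columns, field):
    """Column layout: only the values of the requested column are read."""
    return sum(columns[field])


columns = to_columns(rows)
print(total_row_store(rows, "amount"))
print(total_column_store(columns, "amount"))
```

Both functions return the same total. In a real columnar database the benefit comes from reading far less data from disk and compressing each column well, which this in-memory sketch does not measure.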
Vertica is now owned by HP. According to its website, the company offers “high-speed, self-tuning column-oriented SQL database management software for data warehousing and business intelligence.” Among the more notable features of its latest Vertica Analytic Platform are new elasticity capabilities for expanding or contracting deployments and several new in-database analytic functions.
Vertica 5.1 includes a revamped client framework for easier integration with third-party BI, ETL, analytics, and other ecosystem solutions such as Hadoop distributions based on Apache Hadoop Common release 1.0.0, including Hortonworks Data Platform v1.
Teradata Aster (known as Aster Data before the Teradata acquisition) has introduced the SQL-MapReduce framework, which aims to combine the best of both data processing approaches. The Teradata Aster MapReduce Platform pairs MapReduce, the language of Big Data analytics, with SQL, the language of business analytics.
This makes it easier to analyze large volumes of complex data such as Web logs, machine data, and text, and to perform richer analysis than is possible with traditional SQL technology. Aster Database 5.0 offers greater development flexibility and includes pre-built MapReduce modules for behavioral clickstream interpretation, marketing attribution, decision tree analysis, and other analyses. This is just one example of the special tools needed to rapidly deploy a new system.
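For readers new to MapReduce, the pattern itself can be sketched in a few lines of plain Python. This is not Aster's SQL-MapReduce API; the clickstream records and the function names (map_phase, shuffle, reduce_phase) are assumptions made for illustration. The sketch counts page views per user, which is roughly what a SQL GROUP BY with COUNT would express.

```python
from collections import defaultdict

# Hypothetical clickstream records: (user, page visited).
clicks = [
    ("alice", "/home"),
    ("bob", "/home"),
    ("alice", "/pricing"),
    ("alice", "/signup"),
]


def map_phase(records):
    """Map: emit a (key, value) pair for each input record."""
    for user, _page in records:
        yield user, 1


def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce: collapse each key's list of values into one result."""
    return {key: sum(values) for key, values in groups.items()}


page_views = reduce_phase(shuffle(map_phase(clicks)))
print(page_views)
```

In a distributed system the map and reduce steps run in parallel across many machines and the shuffle moves data between them; here everything runs in one process so the three stages are easy to follow.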
Greenplum is now owned by EMC. Greenplum’s collaborative analytic platform, Chorus, provides a social environment for data scientists to experiment with Big Data. Chorus resembles Facebook in that it is designed specifically for social collaboration, but among data scientists rather than consumers.
Other products include Greenplum Unified Analytics Platform, Greenplum Data Computing Appliance, Greenplum Database, Greenplum Analytics Lab, and Greenplum HD.
Splunk specializes in processing and analyzing log file data so that administrators can monitor IT infrastructure performance and identify bottlenecks and other disruptions to service. The company recently went public and immediately grew to a $3 billion market value. Erik Swan, the company’s co-founder and CTO, described Splunk as “Google for machine data” in a recent interview.
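The kind of log analysis described above can be illustrated with a small Python sketch. It does not use Splunk or its search language; the log format, the sample lines, and the function names (average_latency_by_host, slowest_host) are hypothetical. The script parses key=value log lines and reports which host has the highest average latency, a simple stand-in for spotting a bottleneck.

```python
import re
from collections import defaultdict

# Hypothetical log lines in an assumed key=value format.
LOG_LINES = [
    "2012-06-01T10:00:01 host=web01 status=200 latency_ms=35",
    "2012-06-01T10:00:02 host=web02 status=500 latency_ms=910",
    "2012-06-01T10:00:03 host=web01 status=200 latency_ms=41",
    "2012-06-01T10:00:04 host=web02 status=200 latency_ms=870",
]

PATTERN = re.compile(
    r"host=(?P<host>\S+) status=(?P<status>\d+) latency_ms=(?P<latency>\d+)"
)


def average_latency_by_host(lines):
    """Return the mean latency in milliseconds for each host."""
    totals = defaultdict(lambda: [0, 0])  # host -> [latency sum, line count]
    for line in lines:
        match = PATTERN.search(line)
        if match is None:
            continue  # skip lines that do not fit the assumed format
        entry = totals[match.group("host")]
        entry[0] += int(match.group("latency"))
        entry[1] += 1
    return {host: total / count for host, (total, count) in totals.items()}


def slowest_host(lines):
    """Return the host with the highest average latency."""
    averages = average_latency_by_host(lines)
    return max(averages, key=averages.get)


print(average_latency_by_host(LOG_LINES))
print(slowest_host(LOG_LINES))
```

A production tool would index the data and handle many formats and far larger volumes; the sketch only shows the parse-then-aggregate idea.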