To many people, "Big Data" is just a buzzword. To others, it's a source of considerable concern and stress. For those with the right tools, capabilities and mind-set, Big Data is a huge and ever-expanding opportunity.
Big Data arises when you collect so much input from your [web traffic / sales / scientific research / financial / geographic / demographic] systems that you can't see the wood for the trees. But you still need to be able to aggregate, analyse, and report on that data.
When your relational database is maxing out your massive RAID array and is really starting to groan under the strain, it's probably time to consider spreading the load. Don't worry - there are tools that can help.
While the likes of Oracle and Teradata offer some pretty heavyweight Big Data and analytics solutions, there is an open-source framework called Hadoop, released through the Apache Software Foundation, which has an impressive user base. Although it only hit v1.0 in December last year, for years Hadoop has been underpinning the huge flow of data for outfits such as Amazon, eBay, Facebook, Last.fm, LinkedIn, Rackspace, and Spotify. Even Microsoft recognises the power and reach of Hadoop, and is making the platform available through its Azure cloud services.
Hadoop is designed to run across large clusters of distributed nodes. It owes much of its existence to technologies originating from Google (notably its published MapReduce and Google File System designs), but Yahoo! has subsequently contributed a lot of the code and just happens to be running Hadoop across more than 100,000 CPUs.
You don't have to be a search engine, massive media streaming outlet, or government-funded research lab to benefit from Hadoop. Thanks to cloud providers such as Amazon Web Services and Azure offering scalable solutions, you don't need your own underground bunker full of servers to start making use of the information locked away in your huge databases.
If you are looking to gain a broader view of Big Data solutions, for instance to help get buy-in from project stakeholders who aren't so deeply involved in the technical minutiae of database management or business intelligence (e.g. the Board of Directors, Project Managers, Business Analysts), we would be more than happy to deliver an objective overview of Big Data at your offices. The session is aimed at laying bare the pros and cons of investing time and resources into processing large data sets, with real-world case studies and ample discussion.
Otherwise, if you'd like to know more about Hadoop, you might be interested in our 2-day Big Data with Apache Hadoop training course, which is aimed at people directly involved in data collection, warehousing, analytics, and reporting. We cover topics such as Hadoop Architecture & Common Utilities, the Hadoop Distributed File System (HDFS), MapReduce, and more.
Please drop us a line or give us a call on 020 3137 3920 if you'd like to have a chat about what you need to get out of Big Data.
Scribbled by Tom