According to a recent TDWI report on big data, there are three Vs of big
data: volume, which runs to multiple terabytes or even petabytes; variety,
which covers numbers, audio, video, text, streams, weblogs, social media, and
so on; and velocity, which is the speed at which the data is collected.
Today, enterprises are exploring big data to discover facts they
didn’t know before. This is an important task right now, because the recent
economic recession forced deep changes on most businesses, especially those
that depend on mass consumers.
Using advanced analytics, businesses can study
big data to understand the current state of the business and track customer
behavior.
Here are a few examples of big data to give the idea:
- Twitter produces over 90 million tweets per day
- Wal-Mart is logging one million transactions per hour
- Facebook creates over 30 billion pieces of content every day, including web links, news, blog posts, and photos
- 72 hours of video are added to YouTube every minute
Where big data analytics is useful: think about the possibilities of
real-time location data for promoting coupons or customized offers
to consumers who pass by a retailer’s location. Insurance companies can analyze
the data collected by electronic toll transponders to accurately determine a
driver’s speed, location, and mileage, and adjust insurance rates accordingly.
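As a rough illustration of the toll transponder idea, the sketch below derives mileage and average speed between consecutive toll points. The `TollReading` record, its `mile_marker` field, and the helper names are hypothetical and chosen only for this example; real transponder data will look different, and the speed computed here is an average between two gantries, not an instantaneous reading.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class TollReading:
    """Hypothetical record: one pass of a transponder under a toll gantry."""
    timestamp: datetime
    mile_marker: float  # assumed position of the gantry along the road, in miles


def trip_segments(readings):
    """Return (miles, average_mph) for each consecutive pair of readings."""
    ordered = sorted(readings, key=lambda r: r.timestamp)
    segments = []
    for prev, cur in zip(ordered, ordered[1:]):
        hours = (cur.timestamp - prev.timestamp).total_seconds() / 3600
        miles = abs(cur.mile_marker - prev.mile_marker)
        if hours > 0:  # skip duplicate timestamps to avoid dividing by zero
            segments.append((miles, miles / hours))
    return segments


def summarize(readings):
    """Return (total_miles, highest_average_mph) across all segments."""
    segments = trip_segments(readings)
    total_miles = sum(miles for miles, _ in segments)
    max_mph = max((mph for _, mph in segments), default=0.0)
    return total_miles, max_mph


if __name__ == "__main__":
    readings = [
        TollReading(datetime(2012, 6, 1, 8, 0), 0.0),
        TollReading(datetime(2012, 6, 1, 8, 30), 30.0),
        TollReading(datetime(2012, 6, 1, 9, 30), 80.0),
    ]
    print(summarize(readings))
```

An insurer could feed figures like these into a rating model, but how rates are actually adjusted is outside the scope of this sketch.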
Because it is still early days, big data technologies are evolving
and have not yet reached full product maturity.
Discovery analytics against big data can be enabled by different
types of analytic tools, including those based on SQL queries, data mining,
statistical analysis, fact clustering, data visualization, natural language
processing, text analytics, artificial intelligence, and so on.
Solutions that benefit most from big data analytics:
- Customer analytics: churn, segmentation, cross-sell, and behavior analytics
- Fraud detection
- Risk analytics
- Advanced data visualization
Today, various technology platforms are becoming available for big
data analytics, including Hadoop MapReduce, Teradata, Greenplum, and Kognitio.
Hadoop has become the most popular of these tools because it is open
source, has a lower total cost of ownership, and allows data of any form
to be combined without data types or schemas being defined up front.
With massively parallel processing through MapReduce, it can deliver
results quickly. It can scale up and out by adding new nodes, which also
provides a fail-safe mechanism and round-the-clock availability.
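To make the MapReduce idea concrete, here is a minimal single-process sketch of its three stages (map, shuffle, reduce) applied to a word count. It is plain Python, not Hadoop's API; the function names are invented for this example, and a real cluster would run the map and reduce steps in parallel across many nodes.

```python
from collections import defaultdict
from itertools import chain


def map_phase(record):
    """Map step: emit a (word, 1) pair for every word in one line of text."""
    return [(word.lower(), 1) for word in record.split()]


def shuffle(pairs):
    """Shuffle step: group all emitted values by key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce step: collapse the values for one key into a single count."""
    return key, sum(values)


def word_count(records):
    """Run map, shuffle, and reduce in sequence over the input records."""
    mapped = chain.from_iterable(map_phase(r) for r in records)
    grouped = shuffle(mapped)
    return dict(reduce_phase(k, v) for k, v in grouped.items())


if __name__ == "__main__":
    lines = ["big data big insight", "data at scale"]
    print(word_count(lines))
```

Because each line is mapped independently and each key is reduced independently, the work can be split across machines, which is what lets this style of processing grow by adding nodes.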
Big players like Google, Yahoo, Facebook, and LinkedIn have already proven
Hadoop's usability.