Big Data Defined
What is Big Data? Big Data means all data, including both transaction and interaction data, in sets whose size or complexity exceeds the ability of commonly used technologies to capture, manage, and process them at a reasonable cost and within a reasonable timeframe.
In fact, Big Data is the confluence of three major technology trends:
• Big Transaction Data: Traditional relational data continues to grow in on-line transactional processing (OLTP) and analytic systems, from ERP applications to data warehouse appliances, along with unstructured and semistructured information. The landscape is complicated as enterprises move more data and business processes to public and private clouds.
• Big Interaction Data: This emerging force consists of social media data from Facebook, Twitter, LinkedIn, and other sources. It includes call detail records (CDRs), device and sensor information, GPS and geolocational mapping data, large image files moved through Managed File Transfer, Web text and clickstream data, scientific information, emails, and more.
• Big Data Processing: The growth of Big Data has given rise to frameworks geared for data-intensive processing, such as the open-source Apache Hadoop, which runs on clusters of commodity hardware. The challenge for enterprises is to get data into and out of Hadoop rapidly, reliably, and cost-effectively.
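To make the processing trend concrete: Hadoop's best-known programming model is MapReduce, in which a map step emits key/value pairs, the framework groups them by key, and a reduce step collapses each group. The sketch below imitates that pattern in plain Python on a single machine. It does not use any Hadoop API, and the clickstream records and function names are hypothetical, chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical clickstream records: (user_id, page) pairs.
CLICKS = [
    ("u1", "/home"),
    ("u2", "/home"),
    ("u1", "/cart"),
    ("u3", "/home"),
    ("u2", "/cart"),
    ("u1", "/checkout"),
]


def map_phase(record):
    """Emit one (page, 1) pair for each click record."""
    _user_id, page = record
    yield page, 1


def shuffle(pairs):
    """Group emitted values by key, as a framework would between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Collapse all values for one key into a single count."""
    return key, sum(values)


def run_job(records):
    """Run map, shuffle, and reduce in sequence on an in-memory list."""
    mapped = (pair for record in records for pair in map_phase(record))
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


page_views = run_job(CLICKS)
print(page_views)
```

On a real cluster the map and reduce steps would run in parallel across many machines, each working on its own slice of the data; here they run sequentially only to show the shape of the computation.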
How Big Is Big?
While experts agree that Big Data is big, exactly how big is a matter of debate. IDC forecasts a roughly 50 percent annual growth rate for what it calls the world’s “digital universe,” more than 70 percent of which IDC estimates is generated by consumers and over 20 percent by enterprises.
Between 2009 and 2020, the digital universe will swell by a factor of 44 to 35 zettabytes.
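The forecast figures can be sanity-checked with a little arithmetic. The sketch below derives the implied 2009 starting size and the implied compound annual growth rate from the two numbers quoted above. Treating the period as 11 equal annual compounding steps is an assumption made here for illustration, not something the forecast states.

```python
# Figures taken from the forecast quoted above.
GROWTH_FACTOR = 44      # total growth between 2009 and 2020
SIZE_2020_ZB = 35.0     # projected size in 2020, in zettabytes

# Assumption: growth compounds once a year over the whole period.
YEARS = 2020 - 2009

size_2009_zb = SIZE_2020_ZB / GROWTH_FACTOR
implied_annual_growth = GROWTH_FACTOR ** (1 / YEARS) - 1

print(f"Implied 2009 size: {size_2009_zb:.2f} ZB")
print(f"Implied compound annual growth: {implied_annual_growth:.0%}")
```

Under that assumption the starting point works out to roughly 0.8 zettabytes and the implied growth rate to roughly 41 percent a year, in the same general range as the annual growth figure cited above.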
What can your organization do with Big Data? How can you take advantage of its big opportunities? How can you avoid its risks? An increasing number of organizations tackling Big Data are deploying more advanced massively parallel processing (MPP) databases, Hadoop distributed file systems, MapReduce algorithms, cloud computing, and archival storage. It’s crucial for organizations to give the business access to all data so that it can be applied across Big Data infrastructures.
Data integration enables your organization to hit the Big Data sweet spot—combining traditional transaction data with new interaction data to generate insights and value otherwise unachievable.
A prime example is enriching customer profiles with likes and dislikes culled from social media to improve targeted marketing. Without data integration, Big Data amounts to lots of Big Data silos.
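As a minimal sketch of that enrichment step, the Python below merges likes and dislikes from an interaction-side feed into transaction-side customer profiles. All names and records are hypothetical, and the sketch assumes that social media accounts have already been matched to customer IDs, which in practice is a substantial integration problem of its own. It is not a depiction of any particular product's API.

```python
from dataclasses import dataclass, field


@dataclass
class CustomerProfile:
    """Transaction-side view of a customer (hypothetical fields)."""
    customer_id: str
    name: str
    lifetime_spend: float
    likes: set = field(default_factory=set)
    dislikes: set = field(default_factory=set)


# Hypothetical transaction data, keyed by customer ID.
profiles = {
    "c1": CustomerProfile("c1", "Ana", 1250.0),
    "c2": CustomerProfile("c2", "Ben", 310.5),
}

# Hypothetical interaction data: (customer_id, sentiment, topic).
social_signals = [
    ("c1", "like", "Caribbean cruises"),
    ("c1", "dislike", "red-eye flights"),
    ("c2", "like", "hybrid cars"),
    ("c9", "like", "camping"),  # no matching customer record
]


def enrich(profiles, signals):
    """Attach likes and dislikes to matching profiles; return unmatched signals."""
    unmatched = []
    for customer_id, sentiment, topic in signals:
        profile = profiles.get(customer_id)
        if profile is None:
            unmatched.append((customer_id, sentiment, topic))
        elif sentiment == "like":
            profile.likes.add(topic)
        elif sentiment == "dislike":
            profile.dislikes.add(topic)
    return unmatched


unmatched = enrich(profiles, social_signals)
for profile in profiles.values():
    print(profile)
```

Signals that cannot be tied to a known customer are returned rather than dropped, so they can be reviewed or matched later instead of silently disappearing into a silo.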
As Big Data comes into focus, it’s capturing the attention of CIOs, VPs of information management (IM), enterprise architects, line-of-business owners, and business executives who recognize the vital role that data plays in performance.
according to a 2011 Gartner survey of CEOs and senior executives. [7]
Big Data is relevant to virtually every industry:
•Consumer industries: From retail to travel and hospitality, organizations can capture Facebook posts, Twitter tweets, YouTube videos, blog commentary, and other social media content to better understand, sell to, and service customers, manage brand reputation, and leverage word-of-mouth marketing.
•Financial services: Banks, insurers, brokerages, and diversified financial services companies are looking to Big Data integration and analytics to better attract and retain customers and enable targeted cross-sell, as well as to strengthen fraud detection, risk management, and compliance.
•Public sector: The Federal Networking and Information Technology Research and Development (NITRD) working group announced the Designing a Digital Future report. The report declared that “every federal agency needs a Big Data strategy,” supporting science, medicine, commerce, national security, and other areas; state and local agencies are coping with similar increases in data volumes in such diverse areas as environmental reviews, counterterrorism, and constituent relations.
•Manufacturing and supply chain: Managing large real-time flows of radio frequency identification (RFID) data can help companies optimize logistics, inventory, and production while swiftly pinpointing manufacturing defects; GPS and mapping data can streamline supply chain efficiency.
•E-commerce: Harnessing enormous quantities of B2B and B2C clickstream, text, and image data and integrating them with transactional data (such as customer profiles) can improve e-commerce efficiency and precision while enabling a seamless customer experience across multiple channels.
•Healthcare: The industry’s transition to electronic medical records and sharing of medical research data among entities is generating vast data volumes and posing acute data management challenges; biotech and pharmaceutical firms are focusing on Big Data in such areas as genomic research and drug discovery.
•Telecommunications: Ceaseless streams of CDRs, text messages, and mobile Web access both jeopardize telco profitability and offer opportunities for network optimization. Firms are looking to Big Data for insights to tune product and service delivery to fast-changing customer demands using social network analysis and influence maps.
[7] Gartner, “CEO Advisory: ‘Big Data’ Equals Big Opportunity,” March 31, 2011.
Article Big Data Unleashed: Turning Big Data into Big Opportunities with the Informatica Platform
Overcoming the Obstacles of Existing Data Infrastructures
Traditional approaches to managing data are insufficient to deliver the value of business insight from Big Data sources. The growth of Big Data stands to exacerbate pain points that many enterprises suffer in their information management practices:
•Lack of business/IT agility: The IM organization is perceived as too slow and too expensive in delivering solutions that the business needs for data-driven initiatives and decision making.
•Compromised business performance: IM constantly deals with complaints from business users about the timeliness, reliability, and accuracy of data while lacking standards to ensure enterprise-wide data quality.
•Overreliance on IM: The business has limited abilities to directly access the information it needs, requiring time-consuming involvement of IM and introducing delays into critical business processes.
•High costs and complexity: The enterprise suffers escalating costs due to data growth and application sprawl, as well as degradation of systems performance, leaving it poorly positioned for the Big Data onslaught.
•Delays and IT re-engineering: Costly architectural rework is necessary when requirements change even slightly, with little reuse of data integration logic across projects and groups.
•Lost customer opportunities: Sales and service lack a complete view of the customer, undercutting revenue generation and missing opportunities to leverage behavioral and social media data.
Of these problems, two are of keen interest to executives: addressing the limitations of existing CRM systems, and exploiting Big Data from social media sources to attract and retain customers and improve cross-sell effectiveness. Organizations are transitioning to CRM 2.0, which depends fundamentally on a complete and accurate customer view from large and diverse data sources.
The latest release of the Informatica Platform, Informatica 9.1, was developed with the express purpose of turning Big Data challenges into big opportunities.
Informatica 9.1 is engineered to empower the data-centric enterprise to unleash the business potential of Big Data in four areas:
•Big Data integration to gain business value from Big Data
•Authoritative and trustworthy data to increase business insight and consistency by delivering trusted data for all purposes
•Self-service to empower all users to obtain relevant information while IT remains in control
•Adaptive data services to deliver relevant data adapted to the business needs of all projects
The next section outlines capabilities in Informatica 9.1 and how it enables your organization to tackle Big Data opportunities.
Big Data Integration
Informatica 9.1 delivers innovations and new features in the three areas of Big Data integration:
Connectivity to Big Transaction Data. Informatica 9.1 provides access to high volumes of transaction data, up to a petabyte in scale, with native connectivity to OLTP and on-line analytical processing (OLAP) data stores. A new relational/data warehouse appliance package available in Informatica 9.1 extends this connectivity to solutions purpose-built for Big Data.
•Maximize the availability and performance of large-scale transaction data from any source
•Reduce the cost and risk of managing connectivity with a single platform supporting all database and processing types
•Uncover new areas for growth and efficiency by leveraging transaction data in a scalable, cost-effective way
Connectivity to Big Interaction Data. Access new sources such as social media data on Facebook, Twitter, LinkedIn, and other media with new social media connectors available in Informatica. Extend your data reach into emerging data sets of value in your industry, including devices and sensors, CDRs, large image files, or healthcare-related information for biotech, pharmaceutical, and medical companies.
• Gain new insights into customer relationships and influences enabled by social media data
• Access and integrate other types of Big Interaction Data and combine it with transaction data to sharpen insights and identify new opportunities
• Reduce the time, cost, and risk of incorporating new data sets and making them available to enterprise users
These capabilities open new possibilities for enterprises combining transaction and interaction data either inside or outside of Hadoop.
•Confidently deploy the Hadoop platform for Big Data processing with seamless source and target data integration
•Integrate insights from Hadoop Big Data analytics into traditional enterprise systems to improve business processes and decision making
•Leverage petabyte-scale performance to process large data sets of virtually any type and origin
Big Data integration involves the ability to harness Big Transaction Data, Big Interaction Data, and Big
Big Data Integration in Action
Every new data source is a new business opportunity. Whether it’s social media data posted by your Facebook fans, sensor-based RFID information in your product supply chain, or the enterprise applications of a newly acquired company, your ability to harness this information bears directly on your bottom line.
Unleashing the potential of Big Data requires the ability to access and integrate information of any scale, from any source. In many cases, it means combining interaction data with transaction data to enable insights not possible any other way. One example is using social media data to drive revenue by attracting and retaining customers.
With 50 million tweets on Twitter and 60 million updates on Facebook every day, and those numbers rising, consumers are sharing insights into what they like and don’t like. Suppose your company could learn from a Facebook fan that her son is looking for colleges, that she’s shopping for a new car, and that she likes Caribbean cruises. That’s invaluable intelligence for targeted marketing and customer loyalty projects.
Informatica can harness social media data to enrich customer profiles in CRM applications with customer likes, dislikes, interests, business and household information, and other details. Support for Hadoop gives you data interoperability between the distributed processing framework and your transactional systems, with flexibility for bidirectional data movement to meet your business objectives.