Coke vs. Pepsi of Data Warehousing: ETL vs. ELT

Coke and Pepsi are always fighting for a bigger slice of the international drinks market.
Both are present in 180+ countries, aggressively pursuing market share.

Data warehouses are a different animal. They are databases, but they are not fully normalized and do not follow all 12 of Codd's rules, even though both source and target are usually RDBMSs.
The structure in which the data is stored, whether a star schema or a snowflake schema, is kept as denormalized as possible, close to a flat-file structure, because more constraints slow down joins.
There are also less constrained, much faster file-based alternatives to databases, which emerged from the need to store unstructured data and handle the 5 Vs of big data (massive volume, variety, velocity, and so on).
These have also found favour in the ETL world through Hadoop. Nearly every ETL tool now offers a Hadoop connector or adapter to extract data from HDFS and similar stores.
ETL process

ETL: Extract, Transform, Load
In ETL, the transformation happens in a staging area.
Extract data from the sources, put it in a staging area, cleanse and transform it there, and then load it into the target data warehouse. Popular tools such as Informatica, DataStage, and Ab Initio use this approach. In Informatica, for example, the extract phase can use the fast Source Qualifier transformation, or a Joiner transformation when the data comes from several different databases such as SQL Server and Oracle. The Joiner may be slower but accepts inputs from both, whereas the Source Qualifier is fast but may require a single vendor.
After extracting, we can use a Filter transformation to filter out unwanted rows in the staging area, and then load the result into the target database.
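
To make the ETL order of steps concrete, here is a minimal sketch in Python. It is not how Informatica or any other tool works internally. The source rows, the table name fact_orders, and the cleansing rules are all assumptions made up for illustration, and an in-memory SQLite database stands in for the target warehouse. The point is only that cleansing and filtering happen before anything touches the target.

```python
import sqlite3

# Hypothetical source rows, standing in for data pulled from operational systems.
SOURCE_ROWS = [
    {"order_id": 1, "customer": " alice ", "amount": "120.50"},
    {"order_id": 2, "customer": "BOB", "amount": "-5.00"},  # unwanted row
    {"order_id": 3, "customer": "carol", "amount": "80.00"},
]


def extract():
    """Extract: read rows from the (hypothetical) source."""
    return list(SOURCE_ROWS)


def transform(rows):
    """Transform in the staging area: cleanse names, cast amounts, filter rows."""
    staged = []
    for row in rows:
        amount = float(row["amount"])
        if amount <= 0:  # plays the role of a filter step (assumed rule)
            continue
        staged.append((row["order_id"], row["customer"].strip().title(), amount))
    return staged


def load(conn, rows):
    """Load: write only the already-clean rows into the target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS fact_orders "
        "(order_id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO fact_orders VALUES (?, ?, ?)", rows)
    conn.commit()


def run_etl(conn):
    """Run extract, transform, load in that order and return the target rows."""
    load(conn, transform(extract()))
    return conn.execute(
        "SELECT order_id, customer, amount FROM fact_orders ORDER BY order_id"
    ).fetchall()
```

Calling run_etl(sqlite3.connect(":memory:")) leaves only the two valid orders in the target table; the negative-amount row never reaches the database.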

ELT: Extract, Load, and then Transform
Extract data from disparate sources and load it into the RDBMS engine first. Then use the RDBMS's own facilities to cleanse and transform the data. This approach was popularised by Oracle, which already owned database intellectual property and was motivated to increase its usage. The argument is: why do cleansing and transformation outside the RDBMS in a staging area rather than within the RDBMS engine? Oracle Data Integrator (ODI) uses this ELT concept instead of ETL, a small reversal of the usual routine.
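
The ELT order can be sketched the same way. This is only an illustration, not how ODI is implemented: an in-memory SQLite database stands in for the RDBMS engine, and the tables raw_orders and fact_orders, the sample rows, and the cleansing rules are assumptions. The raw data is landed untouched first, and the engine's own SQL then does the cleansing and filtering.

```python
import sqlite3

# Hypothetical raw rows, loaded as-is before any cleansing.
RAW_ROWS = [
    (1, " alice ", "120.50"),
    (2, "BOB", "-5.00"),
    (3, "carol", "80.00"),
]


def run_elt(conn):
    """Load raw rows first, then transform them inside the database."""
    # Load: land the raw data in the database untouched.
    conn.execute(
        "CREATE TABLE raw_orders (order_id INTEGER, customer TEXT, amount TEXT)"
    )
    conn.executemany("INSERT INTO raw_orders VALUES (?, ?, ?)", RAW_ROWS)

    # Transform: cleanse and filter using the engine's own SQL.
    conn.execute(
        """
        CREATE TABLE fact_orders AS
        SELECT order_id,
               UPPER(TRIM(customer)) AS customer,
               CAST(amount AS REAL)  AS amount
        FROM raw_orders
        WHERE CAST(amount AS REAL) > 0
        """
    )
    conn.commit()
    return conn.execute(
        "SELECT order_id, customer, amount FROM fact_orders ORDER BY order_id"
    ).fetchall()
```

The difference from ETL is where the work happens: here the bad row does reach the database (in raw_orders) and is filtered out by a SQL statement rather than by code running in a separate staging layer.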

So, like the Pepsi vs. Coke waves of advertising and guerrilla marketing, where each side showcases its own product's strengths and hides its weaknesses, the games continue in the ETL world of data warehousing too. Each approach has its own merits and demerits.

Cloud Computing's Relation to Business Intelligence and Data Warehousing
Cloud Computing and Unstructured Data Analysis Using Apache Hadoop Hive
It also compares the architecture of 2 popular BI tools.

Cloud Data warehouse Architecture:

Future of BI
No one can predict the future, but these are the directions in which BI is moving.

Next generation Application development

Next-generation application development will not only take advantage of the 50 or 100+ processors that will be available in your laptop, desktop, or mobile device, but will also use the parallel processing available at the clients.
I covered 7 points in the last article; this is part 2.
Also read the Next Generation ERP article first.
8. More pervasive BI eating into applications: Business intelligence application development will go deeper into the organisation hierarchy, from strategic-level and middle-management-level BI to more pervasive BI at the transaction processing level and the office automation system level (shown in the diagram as the knowledge level or operational level).

For how this will affect the architecture of enterprise products, read about SAP HANA.
From a management perspective there is a slightly contrary but related view: there will also be a need for deeper strategic information systems to support more unstructured decision making.

Pervasive BI is bound to eat into the application development market, fuelled by in-memory products like Cognos TM1 and SAP HANA, and also by the changes and cross-functional innovation happening at the enterprise level.
With these products there is no need for separate databases for the data warehouse and for operational systems. This unifies the operational data store (ODS) and the data warehouse (DW). At the reporting level, both business intelligence (BI) and operational reporting will access the same database, and that database will use in-memory technology.

9. Big data, as everyone knows, is hot: there is more unstructured data than structured data today, and it is like an open laboratory to experiment in. More of it will find a place in strategic management systems and management information systems.
10. Application security will be important as never before; the need is already here.
The intensity can be gauged from the fact that the OWASP Top 10 list is changing as never before, with positions shifting in the ranking of top risks.

In 2010, A2 was Cross-Site Scripting (XSS), but in 2013 the number two perceived risk is Broken Authentication and Session Management. Changes do happen, but here the rankings and the number of incidents are changing fast because the momentum is fast.
11. More will follow when I next find time.

Ubiquitous Computing is where everyone is moving now

Ubiquity is the next frontier toward which software is moving. What are the important characteristics of ubiquity?

Consider how different stacks have been built up over time. For instance, the Oracle stack: storage using Sun technology, the Oracle database, Oracle Fusion Middleware as middleware, the Solaris operating system, and ERP solutions such as PeopleSoft, Siebel, Oracle Financials, and retail apps. Solutions should work across all these areas. What was missing was the communication piece, for which Oracle also acquired a number of communication companies. In the same way:

Microsoft stack: Windows Server OS and networking, the Hyper-V hypervisor, the SQL Server database, BizTalk middleware, MSBI for BI, and Dynamics as ERP with financial, CRM, and other modules. There is also a PaaS that can leverage all of this across the stack. Such software is cutting across these boundaries.

If we take the definition of ubiquitous computing, it is the collective move toward miniaturized, inexpensive, seamlessly integrated, wirelessly networked devices embedded in everyday items and objects, from watches to fridges. It is the same vision that was set out long ago.

All models of ubiquitous computing share a vision of small, inexpensive, robust networked processing devices, distributed at all scales throughout everyday life and generally turned to distinctly common-place ends. We have ambient intelligence, which is aware of people's needs and unifies telecom, networking, and computing to create context-aware pervasive computing. At the back end, where all the data is stored in cloud storage, we have the integrated stack. Not every component of the stack needs to talk to these new ubiquitous computing devices and software.

What technologies are colliding there:

Data communications and wireless networking technologies: moving towards new forms of devices that are sensitive to their environment and self-adjusting, connecting to each other without wires to create mesh networks. The drive towards ubiquitous computing goes hand in hand with the network's drive towards wireless networking.
Middleware: We have PaaS (Platform as a Service) in the cloud, where miniaturized devices with limited storage will store their data. To leverage this data and to work across the virtualized stack, we have platforms such as Microsoft Azure and the Oracle Fusion Middleware discussed above.
Real-time and embedded systems: all real-time messages need to be captured using a real-time OS (RTOS) and passed to devices so that interaction with the outside world stays dynamic.
Sensors and vision technologies: sensors sense and pass on information, an important part of ubiquitous computing. A sensor in the fridge senses that the milk has run out and interacts with a mobile phone to ask the retail store to send a delivery (a typical example).
Context awareness and machine learning: the device is aware of whether it is near a bank, an office, or a police station and reacts with the relevant application; this is geolocation. Going deeper, a watch that goes under water starts showing the depth, and when it comes out of the river it shows the time again on the same display. That is context awareness. Likewise, when it comes near heat, a heat sensor sends the temperature to the display.
Information architecture: huge amounts of data will be generated from this network, and this data needs to be analysed. Its storage and retrieval architecture varies with its type; big data will not be stored the same way RDBMS data is stored.
Image processing and synthesis: biometric devices need to capture an image in order to authenticate and send information. Image processing algorithms such as edge detection will run over this huge data. For example, captured satellite data can be fed into an edge detection algorithm to find water bodies, using the large variation in reflectance level as we move from sand to water.
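
As a rough illustration of the sand-to-water idea, here is a deliberately simplified sketch in Python. It is a one-dimensional first-difference threshold, not a full edge detector such as Sobel or Canny, and the reflectance values and the threshold are made-up assumptions. It only shows how a large jump in reflectance between neighbouring pixels can mark a boundary.

```python
def find_edges(reflectance, threshold):
    """Return indices i where the jump between pixel i and i + 1 exceeds threshold."""
    edges = []
    for i in range(len(reflectance) - 1):
        if abs(reflectance[i + 1] - reflectance[i]) > threshold:
            edges.append(i)
    return edges


# Hypothetical scan line: sand (high reflectance), water (low), then sand again.
SCAN_LINE = [0.42, 0.40, 0.41, 0.08, 0.07, 0.09, 0.39, 0.41]

# Boundaries are reported at the last pixel before each large jump.
BOUNDARIES = find_edges(SCAN_LINE, threshold=0.2)
```

On this scan line the two reported boundaries bracket the low-reflectance run, which is the stretch a real pipeline would go on to classify as water.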

There would be huge usage of these in next-generation BI systems.

So tools like uBIquity will make a difference in the future:

As BI becomes pervasive, everyone will surely want to use it. It is a natural evolution for an end user to be attracted to a BI system in which he can create his own queries. As BI becomes pervasive it will enter every device, and that is where it will start interacting with ubiquity. Ubiquity is the future of BI.

Master Data Management Tools in market.

MDM: What does it do?

MDM seeks to ensure that an organization does not use multiple (potentially inconsistent) versions of the same master data in different parts of its operations, which can occur in large organizations. Without it, CRM, DW/BI, sales, production, and finance each have their own way of representing things.
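
To show what "multiple versions of the same master data" looks like in practice, here is a small Python sketch. It is not the method of any MDM product. The three systems, the field names, the choice of tax_id as the match key, and the "first non-empty value wins" survivorship rule are all assumptions for illustration.

```python
from collections import defaultdict

# Hypothetical records for the same customers held by different systems.
CRM = [{"name": "Acme Corp.", "tax_id": "12-345", "phone": "555-0100"}]
FINANCE = [{"name": "ACME CORPORATION", "tax_id": "12345", "phone": None}]
SALES = [{"name": "Globex", "tax_id": "98-765", "phone": "555-0199"}]


def match_key(record):
    """Normalize the identifier so different spellings match (digits only)."""
    return "".join(ch for ch in record["tax_id"] if ch.isdigit())


def build_master(*systems):
    """Group records by match key and merge each group into one master record."""
    groups = defaultdict(list)
    for system in systems:
        for record in system:
            groups[match_key(record)].append(record)

    master = {}
    for key, records in groups.items():
        merged = {}
        for record in records:
            for field, value in record.items():
                # Survivorship rule (an assumption): first non-empty value wins.
                if merged.get(field) in (None, "") and value not in (None, ""):
                    merged[field] = value
        merged["tax_id"] = key  # store the normalized identifier
        master[key] = merged
    return master


MASTER = build_master(CRM, FINANCE, SALES)
```

Here the CRM and finance spellings of Acme collapse into one master record, while Globex stays separate. Real tools use far richer matching and survivorship rules than a single normalized key.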

There are a lot of products in the MDM space. Those with a good presence in the market are:

TIBCO, a leader in information collaboration tools

Collaborative Information Manager.

– works to standardize data across ERP, CRM, DW, and PLM

– cleansing and aggregation

– distributes ownership to the natural business users of the data (sales, logistics, finance, HR, publishing)

– automated business processes to collaborate on maintaining information assets and data governance policy

– built-in data models that can be extended (industry templates, validation rules)

– built-in processes to manage change, eliminate confusion while managing change, and establish a clear audit and governance trail for reporting

– syncs the relevant subset of information to downstream applications, trading partners, and exchanges; uses SOA to pass data as web services to composite applications

IBM MDM: InfoSphere MDM Server

This is still incomplete; I will continue to add to it.

Short notes below are taken from the source, plus my comments on them.

Informatica MDM capabilities:

Informatica 9.1 supplies master data management (MDM) and data quality technologies to enable your organization to achieve better business outcomes by delivering authoritative, trusted data to business processes, applications, and analytics, regardless of the diversity or scope of Big


Single platform for all MDM architectural styles and data domains: Universal MDM capabilities in Informatica 9.1 enable your organization to manage, consolidate, and reconcile all master data, no matter its type or location, in a single, unified solution. Universal MDM is defined by four


• Multi-domain: Master data on customers, suppliers, products, assets, and locations can be managed, consolidated, and accessed.

• Multi-style: A flexible solution may be used in any style: registry, analytical, transactional, or


• Multi-deployment: The solution may be used as a single-instance hub, or in federated, cloud, or service architectures.

• Multi-use: The MDM solution interoperates seamlessly with data integration and data quality technologies as part of a single platform.

Universal MDM eliminates the risk of standalone, single MDM instances—in effect, a set of data silos meant to solve problems with other data silos.

• Flexibly adapt to different data architectures and changing business needs

• Start small in a single domain and extend the solution to other enterprise domains, using any style

• Cost-effectively reuse skill sets and data logic by repurposing the MDM solution

“No data is discarded anymore! U.S. xPress leverages a large scale of transaction data and a diversity of interaction data, now extended to perform big data processing like Hadoop with Informatica 9.1. We assess driver performance with image files and pick up customer behaviors from texts by customer service reps. U.S. xPress saved millions of dollars per year by reducing fuels and optimizing routes augmenting our enterprise data with sensor, meter, RFID tags, and geospatial data.” Tim Leonard, Chief Technology Officer

Source: U.S. xPress Big Data Unleashed: Turning Big Data into Big Opportunities with the Informatica 9.1 Platform.

Reusable data quality policies across all project types: Interoperability among the MDM, data quality, and data integration capabilities in Informatica 9.1 ensures that data quality rules can be reused and applied to all data throughout an implementation lifecycle, across both MDM and data integration projects (see Figure 3).

• Seamlessly and efficiently apply data quality rules regardless of project type, improving data accuracy

• Maximize reuse of skills and resources while increasing ROI on existing investments

• Centrally author, implement, and maintain data quality rules within source applications and propagate downstream

Proactive data quality assurance: Informatica 9.1 delivers technology that enables both business and IT users to proactively monitor and profile data as it becomes available, from internal applications or external Big Data sources. You can continuously check for completeness, conformity, and anomalies and receive alerts via multiple channels when data quality issues are


• Receive “early warnings” and proactively identify and correct data quality problems before they happen

• Prevent data quality problems from affecting downstream applications and business processes

• Shorten testing cycles by as much as 80 percent
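
My comment: checks for completeness and conformity are easy to picture with a few lines of Python. This is only a sketch of the idea, not Informatica's profiling engine. The sample rows, the e-mail pattern, and the 0.9 minimum ratio are assumptions for illustration.

```python
import re

# Hypothetical batch of customer rows arriving from a source system.
ROWS = [
    {"email": "a@example.com", "country": "US"},
    {"email": "", "country": "US"},
    {"email": "not-an-email", "country": None},
    {"email": "b@example.com", "country": "IN"},
]

# A deliberately loose pattern, used only for this illustration.
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def profile(rows, column, pattern=None):
    """Return completeness and conformity ratios for one column."""
    total = len(rows)
    filled = [r[column] for r in rows if r.get(column) not in (None, "")]
    completeness = len(filled) / total if total else 0.0
    if pattern is None:
        conformity = 1.0  # no format rule defined for this column
    elif filled:
        conformity = sum(1 for v in filled if pattern.match(v)) / len(filled)
    else:
        conformity = 0.0
    return {"completeness": completeness, "conformity": conformity}


def quality_alerts(rows, rules, min_ratio=0.9):
    """Return one message per column whose ratios fall below min_ratio."""
    messages = []
    for column, pattern in rules.items():
        stats = profile(rows, column, pattern)
        for metric, value in stats.items():
            if value < min_ratio:
                messages.append(f"{column}: {metric} {value:.2f} below {min_ratio}")
    return messages


ALERTS = quality_alerts(ROWS, {"email": EMAIL_PATTERN, "country": None})
```

Running this on each incoming batch, before the data is loaded downstream, is the "early warning" idea in miniature.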

Putting Authoritative and Trustworthy Data to Work

The diversity and complexity of Big Data can worsen the data quality problems that exist in many organizations. Standalone, ad hoc data quality tools are ill equipped to handle large-scale streams from multiple sources and cannot generate the reliable, accurate data that enterprises need. Bad data inevitably means bad business. In fact, according to a CIO Insight report, 46 percent of survey respondents say they’ve made an inaccurate business decision based on bad or outdated data.

MDM and data quality are prerequisites for making the most of the Big Data opportunity. Here are two examples:

Using social media data to attract and retain customers: For some organizations, tapping social media data to enrich customer profiles can be putting the cart before the horse. Many companies lack a single, complete view of their customers, ranging from reliable and consistent names and contact information to the products and services in place. Customer data is often fragmented across CRM, ERP, marketing automation, service, and other applications. Informatica 9.1 MDM and data quality enable you to build a complete customer profile from multiple sources. With that authoritative view in place, you’re poised to augment it with the intelligence you glean from social media.

Data-driven response to business issues: Let’s say you’re a Fortune 500 manufacturer and a supplier informs you that a part it sold you is faulty and needs to be replaced. You need answers fast to critical questions: In which products did we use the faulty part? Which customers bought those products and where are they? Do we have substitute parts in stock? Do we have an alternate supplier?

But the answers are sprawled across multiple domains of your enterprise: your procurement system, CRM, inventory, ERP, maybe others in multiple countries. How can you respond swiftly and precisely to a problem that could escalate into a business crisis? Business issues often span multiple domains, exerting a domino effect across the enterprise and confounding an easy solution. Addressing them depends on seamlessly orchestrating interdependent processes, and the data that drives them.

With the universal MDM capabilities in Informatica 9.1, our manufacturer could quickly locate reliable, authoritative master data to answer its pressing business questions, regardless of where the data resided or whether multiple MDM styles and deployments were in place.


Big Data’s value is limited if the business depends on IT to deliver it. Informatica 9.1 enables your organization to go beyond business/IT collaboration to empower business analysts, data stewards, and project owners to do more themselves without IT involvement with the following capabilities:

Analysts and data stewards can assume a greater role in defining specifications, promoting a better understanding of the data, and improving productivity for business and IT.

• Empower business users to access data based on business terms and semantic metadata

• Accelerate data integration projects through reuse, automation, and collaboration

• Minimize errors and ensure consistency by accurately translating business requirements into data integration mappings and quality rules

Application-aware accelerators for project owners: empowers project owners to rapidly understand and access data for data warehousing, data migration, test data management, and other projects. Project owners can source business entities within applications instead of specifying individual tables that require deep knowledge of the data models and relational schemas.

• Reduce data integration project delivery time

• Ensure data is complete and maintains referential integrity

• Adapt to meet business-specific and compliance requirements

Informatica 9.1 introduces complex event processing (CEP) technology into data quality and integration monitoring to alert business users and IT of issues in real time. For instance, it will notify an analyst if a data quality key performance indicator exceeds a threshold, or if integration processes differ from the norm by a predefined percentage.

• Enable business users to define monitoring criteria by using prebuilt templates

• Alert business users on data quality and integration issues as they arise

• Identify and correct problems before they impact performance and operational systems
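
My comment: the two alert rules described above (a KPI crossing a threshold, and a process deviating from the norm by a set percentage) can be sketched in plain Python. This is not a CEP engine and not Informatica's API. The event shapes, names, baseline, and thresholds are all assumptions for illustration.

```python
def check_events(events, kpi_threshold, baseline_rows, max_deviation_pct):
    """Return alert messages for a sequence of monitoring events."""
    alerts = []
    for event in events:
        if event["type"] == "dq_kpi":
            # Rule 1: a data quality KPI exceeds its threshold.
            if event["value"] > kpi_threshold:
                alerts.append(
                    f"KPI {event['name']} = {event['value']} exceeds {kpi_threshold}"
                )
        elif event["type"] == "load":
            # Rule 2: a load differs from the norm by more than a set percentage.
            deviation = abs(event["rows"] - baseline_rows) / baseline_rows * 100
            if deviation > max_deviation_pct:
                alerts.append(
                    f"Load {event['name']} deviates {deviation:.0f}% from the norm"
                )
    return alerts


# Hypothetical event stream from data quality and integration monitoring.
EVENTS = [
    {"type": "dq_kpi", "name": "null_rate", "value": 0.02},
    {"type": "dq_kpi", "name": "null_rate", "value": 0.08},
    {"type": "load", "name": "orders", "rows": 980},
    {"type": "load", "name": "orders", "rows": 600},
]

ALERTS = check_events(EVENTS, kpi_threshold=0.05, baseline_rows=1000, max_deviation_pct=20)
```

Only the second KPI reading and the 600-row load raise alerts. A real CEP engine would also correlate events over time windows, which this sketch leaves out.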

• Speeding and strengthening business effectiveness: Informatica 9.1 makes “MDM-aware” everyday business applications such as Oracle, Siebel, SAP for CRM, ERP, and others by presenting reconciled master data directly within those applications. For example, Informatica’s MDM solution will advise a salesperson creating a new account for “John Jones” that a customer named Jonathan Jones, with the same address, already exists. Through the Salesforce interface, the user can access complete, reliable customer information that Informatica MDM has consolidated from disparate applications.

She can see the products and services that John has in place and that he follows her company’s Twitter tweets and is a Facebook fan. She has visibility into his household and business relationships and can make relevant cross-sell offers. In both B2B and B2C scenarios, MDM-aware applications spare the sales force from hunting for data or engaging IT while substantially increasing productivity.
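
My comment: the "John Jones" versus "Jonathan Jones" check can be approximated in Python with the standard library's difflib. This is only a sketch under stated assumptions, not how Informatica MDM matches records. The sample customers, the rule "same normalized address plus similar name", and the 0.75 similarity cut-off are all made up for illustration.

```python
from difflib import SequenceMatcher

# Hypothetical customers already held in the master data.
EXISTING = [
    {"id": 101, "name": "Jonathan Jones", "address": "12 Oak St, Springfield"},
    {"id": 102, "name": "Mary Smith", "address": "9 Elm Rd, Shelbyville"},
]


def normalize(text):
    """Lowercase and collapse whitespace so trivial differences do not matter."""
    return " ".join(text.lower().split())


def likely_duplicates(new_record, existing, min_name_similarity=0.75):
    """Return ids of existing records with the same address and a similar name."""
    matches = []
    for record in existing:
        same_address = normalize(record["address"]) == normalize(new_record["address"])
        similarity = SequenceMatcher(
            None, normalize(new_record["name"]), normalize(record["name"])
        ).ratio()
        if same_address and similarity >= min_name_similarity:
            matches.append(record["id"])
    return matches


NEW_ACCOUNT = {"name": "John Jones", "address": "12 Oak St,  Springfield"}
WARNINGS = likely_duplicates(NEW_ACCOUNT, EXISTING)
```

With these sample records the new account is flagged against customer 101, which is the kind of warning the salesperson in the example would see before creating a duplicate.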

• Giving business users a hands-on role in data integration and quality: Long delays and high costs are typical when the business attempts to communicate data specifications to IT in spreadsheets. Part of the problem has been the lack of tools that promote business/IT collaboration and make data integration and quality accessible to the business user.

As Big Data unfolds, Informatica 9.1 gives analysts and data stewards a hands-on role. Let’s say your company has acquired a competitor and needs to migrate and merge new Big Data into your operational systems. A data steward can browse a data quality scorecard and identify anomalies in how certain customers were identified and share a sample specification with IT. Once validated, the steward can propagate the specification across affected applications. A role-based interface also enables the steward to view data integration logic in semantic terms and create data integration mappings that can be readily understood and reused by other business users or IT.