Business Intelligence (BI) is used to analyse data so that resources can be directed towards the areas that are most productive or profitable. This is done by slicing and dicing the data available in a cube (a multi-dimensional data structure) along the critical areas of the business, which are defined by the dimensions along which the analysis has to be performed. Traditional database tables are just two-dimensional, so to move beyond two-dimensional analysis we have to use analytical queries. Analytical queries that use functions like rank, ntile, etc. are not very easy to conceptualize for complex requirements. BI tools take the data and generate these analytical queries in the background, seamlessly, without the user being aware of it.
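To make the idea of analytical functions a little more concrete, here is a minimal sketch in plain Python of what SQL-style rank and ntile compute. The region names and sales figures are made up for illustration, and the helper names `rank` and `ntile` are my own, not the API of any BI tool.

```python
# Hypothetical sales per region; the figures are invented for illustration.
sales = {"North": 500, "South": 300, "East": 500, "West": 100}


def rank(values):
    """SQL-style RANK(): tied rows share a rank and the next rank is skipped."""
    ordered = sorted(values.items(), key=lambda kv: kv[1], reverse=True)
    ranks = {}
    for position, (key, value) in enumerate(ordered, start=1):
        previous = ordered[position - 2] if position > 1 else None
        if previous is not None and previous[1] == value:
            # Same value as the previous row: reuse its rank.
            ranks[key] = ranks[previous[0]]
        else:
            ranks[key] = position
    return ranks


def ntile(values, buckets):
    """SQL-style NTILE(n): split the ordered rows into n nearly equal buckets."""
    ordered = sorted(values.items(), key=lambda kv: kv[1], reverse=True)
    size, extra = divmod(len(ordered), buckets)
    tiles = {}
    index = 0
    for bucket in range(1, buckets + 1):
        # The first `extra` buckets receive one additional row.
        count = size + (1 if bucket <= extra else 0)
        for key, _ in ordered[index:index + count]:
            tiles[key] = bucket
        index += count
    return tiles


sales_rank = rank(sales)
sales_halves = ntile(sales, 2)
print(sales_rank)
print(sales_halves)
```

A BI tool hides this kind of logic behind a drag-and-drop interface; the user only picks "top regions" or "top half" and the query is generated for them.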
The Need for Data Warehouse Data Modelling
The data from disparate sources is accumulated into a single data warehouse, or into subject-specific data warehouses (also called data marts). The data warehouse consists of a fact table surrounded by dimension tables (with a ratio of about 80:20). Most of the data is found in the fact table. If there are more dimensions to analyse beyond this 20 limit, then we can say "Too much analysis becomes paralysis": there would be so many combinations of dimensions on which the data could be analysed that it leads to lots of complex choices, which may not be worth the effort.
So after proper business analysis, the key drivers of the business are chosen, and around those drivers the parameters, key performance indicators (KPIs), or facts are created. The dimensions that are critical to the business are also chosen and represented as dimension tables.
For example, for a clinical research company the most critical driver is the number of patients, because drug testing can be performed only if patients are recruited by the investigator (doctor).
So the facts inside the fact table used for analysis are measures centred on patients, like the number of patients enrolled, the number of patients who suffered adverse effects, the number of patients in the randomized sample population, the cost of enrolling a patient, etc. These are then analysed across dimensions like geographic area (countries), time (year 2012, 2013, etc.) and study (study-wise analysis).
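The clinical research example can be sketched as a tiny star schema using Python's built-in sqlite3 module. All table names, column names and figures below are hypothetical and chosen only for illustration; a real warehouse would have far more rows and dimensions.

```python
import sqlite3

# In-memory database holding a hypothetical star schema:
# one fact table keyed by three dimension tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_country (country_id INTEGER PRIMARY KEY, country_name TEXT);
CREATE TABLE dim_time    (time_id    INTEGER PRIMARY KEY, year INTEGER);
CREATE TABLE dim_study   (study_id   INTEGER PRIMARY KEY, study_name TEXT);
CREATE TABLE fact_enrolment (
    country_id        INTEGER REFERENCES dim_country(country_id),
    time_id           INTEGER REFERENCES dim_time(time_id),
    study_id          INTEGER REFERENCES dim_study(study_id),
    patients_enrolled INTEGER,   -- numeric fact
    adverse_events    INTEGER    -- numeric fact
);
INSERT INTO dim_country VALUES (1, 'India'), (2, 'Germany');
INSERT INTO dim_time    VALUES (1, 2012), (2, 2013);
INSERT INTO dim_study   VALUES (1, 'Study A'), (2, 'Study B');
INSERT INTO fact_enrolment VALUES
    (1, 1, 1, 40, 2),
    (1, 2, 1, 55, 3),
    (2, 1, 2, 30, 1),
    (2, 2, 2, 25, 4),
    (1, 2, 2, 10, 0);
""")

# "Dice": patients enrolled analysed across country and year.
by_country_year = conn.execute("""
    SELECT c.country_name, t.year, SUM(f.patients_enrolled)
    FROM fact_enrolment f
    JOIN dim_country c ON c.country_id = f.country_id
    JOIN dim_time    t ON t.time_id    = f.time_id
    GROUP BY c.country_name, t.year
    ORDER BY c.country_name, t.year
""").fetchall()

# "Slice": restrict the cube to a single study.
study_a_total = conn.execute("""
    SELECT SUM(f.patients_enrolled)
    FROM fact_enrolment f
    JOIN dim_study s ON s.study_id = f.study_id
    WHERE s.study_name = ?
""", ("Study A",)).fetchone()[0]

print(by_country_year)
print(study_a_total)
```

The fact table carries only numbers and dimension keys; every descriptive label used in the report comes from a join to a dimension table.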
The fact table holds the numeric facts and the keys of the dimension tables. In a star schema the dimension tables are in the second normal form of RDBMS (every non-prime attribute depends on the whole primary key) but may still have transitive dependencies. Transitive dependencies are removed by the third normal form: if A -> B (read: B depends on A) and B -> C, then the indirect dependency A -> C through B should not remain in the same table.
When brought into perfect third normal form, a star schema becomes a snowflake schema, with dimensions decomposed further into sub-dimensions.
In the snowflake schema example, you can see that the item dimension is further broken out into a supplier table.
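As a rough sketch of that decomposition, the snippet below takes a denormalized item dimension, where item_id -> supplier_id -> supplier_name is a transitive dependency, and splits it into an item table plus a supplier sub-dimension. The rows, column names and the `snowflake` helper are hypothetical and only illustrate the idea.

```python
# Denormalized star-schema item dimension (hypothetical rows).
# supplier_name depends on supplier_id, which depends on item_id:
# a transitive dependency that third normal form removes.
item_dim = [
    {"item_id": 1, "item_name": "Syringe", "supplier_id": 10, "supplier_name": "Acme"},
    {"item_id": 2, "item_name": "Gloves", "supplier_id": 10, "supplier_name": "Acme"},
    {"item_id": 3, "item_name": "Mask", "supplier_id": 20, "supplier_name": "Medico"},
]


def snowflake(rows):
    """Split the item dimension into an item table and a supplier table."""
    items = [
        {"item_id": r["item_id"], "item_name": r["item_name"], "supplier_id": r["supplier_id"]}
        for r in rows
    ]
    # Each supplier is stored once, keyed by supplier_id.
    suppliers = {r["supplier_id"]: r["supplier_name"] for r in rows}
    return items, suppliers


def rejoin(items, suppliers):
    """Rebuild the denormalized rows by looking up each item's supplier."""
    return [dict(item, supplier_name=suppliers[item["supplier_id"]]) for item in items]


items, suppliers = snowflake(item_dim)
print(items)
print(suppliers)
```

The supplier name is now stored once per supplier instead of once per item, at the cost of an extra join when the report needs it.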
Cloud Computing in Relation to Business Intelligence and Data Warehousing
Cloud Computing and Unstructured Data Analysis Using Apache Hadoop Hive
It also compares the architecture of two popular BI tools.
Cloud Data Warehouse Architecture:
Future of BI
No one can predict the future, but these are the directions in which BI is moving.
A Day's Time Schedule of a BI Engineer:
There is a lifecycle flow in BI projects: requirement gathering, High Level Design (HLD), Low Level Design (LLD), and then using the design to create the report.
Choose the Right Tool for the Report
Types of Reports:
1. Ad-hoc reports, slice-and-dice (cube) reports,
2. Event-based alert reports, scheduling options,
3. Operational reports,
4. Complex logic reports (highly customized),
5. BI app reports embedded with 3rd-party APIs,
6. Program-generated reports, reports exposed as web services,
7. In-memory system based reports, like IBM Cognos TM1 or SAP HANA,
8. Reports from ERP (these have totally different dynamics, like BEx Analyzer or SAP Sales and Distribution reports), which go deep into the domain and fetch domain-specific as well as cross-domain functionality. More domain reports in SAP are fetched using SAP ABAP reports; other options are SAP BEx Analyzer, Web Dynpro, SAP BO Crystal Reports, or Web Intelligence. In this way more than 20 flavours of reporting software exist within a single report development product line like SAP BO and SAP.
Figure: Some reporting flavours of SAP BO. The list is still not complete: 9 of 14 reporting tools are shown here.
We will see this in more detail later. (You can read the link above for the architectural differences between the two systems.)
9. Enterprise search: this is also part of the BI ecosystem. Microsoft FAST or Endeca can search an enterprise repository using related indexes that point to the right data, not just to documents. SharePoint CMS search, by comparison, brings back indexed documents, sets rights to view or edit, and sets user profiles, but it does not point to the right answers based on data.
Even a single unified metadata layer using Customer Data Integration (CDI) can correlate equivalent entities across disparate ERPs. Enterprise search can use this intelligence and maintain a repository to return answers to user questions in a search-like interface.
Let's look at each of these options in one product, like SAP BO/BI:
2. Crystal Reports
6. OutlookSoft (planning; now BPS, SAP Business Planning and Simulation)
7. Business Explorer
Similarly, 14 report options exist in BO itself. IBM Cognos (8 software) has its own options,
and MicroStrategy also has a set of some 5-7 reporting options.
For the first few days you should understand the business; otherwise you cannot create effective reports.
9:00-10:00 Meet the customer to understand the key facts which affect the business.
10:00-12:00 Prepare the HLD (High Level Document) containing a 10,000-foot view of the requirement.
This is version 1; it may be refined in subsequent days.
12:00-1:30 Attend the scrum meeting to update status to the rest of the team. Co-ordinate with the team lead, architect and project manager for assignment of new activities for new reports.
Usually a person handling one domain area of the business is given that domain's reports, since the resource already acquired the domain knowledge during the last report development
and does not need to learn a new domain, unless the work is becoming monotonous and the person wants to move to a new area. (For example, a sales domain report for chip manufacturers may contain demand planning, etc.)
1:30-2:00 Document the new reports to be worked on today.
2:30-3:30 Look at the LLD and HLD of the new reports. Find the sources if they exist; otherwise the semantic layer needs to be modified.
3:30-4:00 Co-ordinate other resources' report requirements with the architect to modify the semantic layer, along with other reporting requirements.
4:00-5:00 Develop/code reports, apply conditional formatting, set scheduling options, verify the data set.
5:00-5:30 Look at old defects and rectify issues. (If there is a separate team for defect handling, devote this time to report development.)
5:30-6:00 Attend the defect management call and present resolved defects and pending issues to the testing team.
6:00-6:30 Document the work done and the status of the work assigned.
6:30-7:30 Look at pending report issues. Code or research workarounds.
7:30-8:00 Report optimisation/research.
8:00-8:30 Dinner, return home.
Of course, you have to look at the bigger picture, and hence need to see what reports others have worked on.
You also need to understand the ETL design and the design rules/transformations used for the project, and try to develop frameworks and generic reports/code which can be reused.
Look at integration of these reports with ERP (SAP, PeopleSoft, Oracle Apps, etc.), CMS (Joomla, SharePoint), scheduling options, cloud enablement, Ajax-ifying report web interfaces using a third-party library or report SDK, integration into web portals, and portal creation for reports.
So these tasks do take time as and when they arrive.