Architecting Analysis of Structured and Unstructured Data

Structured data is maintained and analysed in a data warehouse. Data warehouse (structured) data has the following characteristics:

Subject-oriented, Integrated, Non-volatile (data does not change once loaded), Time-variant (historical data varies over time).
Read more definitions in the Oracle concepts guide: https://docs.oracle.com/cd/B10500_01/server.920/a96520/concept.htm

Unstructured data, on the other hand, is characterised by the 5 Vs: volume, variety, variability, value and velocity. There is no structure to the data in the form of fixed rows and columns, so the data warehouse needs to incorporate and drive intelligence while taking on this 5V challenge. Read more:
https://sandyclassic.wordpress.com/2013/07/02/data-warehousing-business-intelligence-and-cloud-computing/

For unstructured data we have technology like Hadoop, with Hive acting as the data warehouse.

The Apache Hadoop ecosystem covers most unstructured data analysis tools and technologies, namely:

Apache Hive: data warehouse for unstructured data on top of Hadoop, queried with SQL-like HiveQL.
Apache HBase: distributed, column-oriented NoSQL store for random, real-time access to data stored in Hadoop.
Apache Hadoop HDFS: the Hadoop distributed file system.
Apache Mahout: machine-learning and analytics engine on top of the Hadoop ecosystem.

The trend favours real-time data with quick feedback rather than batch processing; Hadoop is good for large, parallel batch loads.

Cloud Computing relation to Business Intelligence and Datawarehousing
Read:
1. https://sandyclassic.wordpress.com/2013/07/02/data-warehousing-business-intelligence-and-cloud-computing/
2. https://sandyclassic.wordpress.com/2013/06/18/bigdatacloud-business-intelligence-and-analytics/

Cloud Computing and Unstructured Data Analysis Using Apache Hadoop Hive
Read: https://sandyclassic.wordpress.com/2013/10/02/architecture-difference-between-sap-business-objects-and-ibm-cognos/
(It also compares the architecture of two popular BI tools.)

Cloud Data warehouse Architecture:
https://sandyclassic.wordpress.com/2011/10/19/hadoop-its-relation-to-new-architecture-enterprise-datawarehouse/

Future of BI
No one can predict the future, but these are the directions in which BI is moving:
https://sandyclassic.wordpress.com/2012/10/23/future-cloud-will-convergence-bisoaapp-dev-and-security/

Data Integration, the MapReduce Algorithm and Virtualisation: Relation and Trends

In 2011 I posted this reply to a discussion; I would later structure it into a proper article.

As of 2010 data virtualization had begun to advance ETL processing. The application of data virtualization to ETL allowed solving the most common ETL tasks of data migration and application integration for multiple dispersed data sources. So-called Virtual ETL operates with the abstracted representation of the objects or entities gathered from the variety of relational, semi-structured and unstructured data sources. ETL tools can leverage object-oriented modeling and work with entities’ representations persistently stored in a centrally located hub-and-spoke architecture. Such a collection that contains representations of the entities or objects gathered from the data sources for ETL processing is called a metadata repository and it can reside in memory[1] or be made persistent. By using a persistent metadata repository, ETL tools can transition from one-time projects to persistent middleware, performing data harmonization and data profiling consistently and in near-real time.
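To make the "metadata repository" idea a bit more concrete, here is a minimal sketch in Python, with hypothetical entity and source names (it illustrates the hub-and-spoke idea only, not any vendor's actual API):

```python
from dataclasses import dataclass, field

@dataclass
class EntityRepresentation:
    """Abstracted view of an entity gathered from one source system."""
    name: str                       # logical entity name, e.g. "Customer"
    source: str                     # originating system, e.g. "oracle_crm"
    attributes: dict = field(default_factory=dict)

class MetadataRepository:
    """In-memory hub that stores entity representations for virtual ETL."""
    def __init__(self):
        self._entities = {}

    def register(self, entity: EntityRepresentation):
        # keep one list of representations per logical entity name
        self._entities.setdefault(entity.name, []).append(entity)

    def harmonize(self, name: str) -> dict:
        """Merge attribute maps from all sources into one harmonized view."""
        merged = {}
        for rep in self._entities.get(name, []):
            merged.update(rep.attributes)
        return merged

repo = MetadataRepository()
repo.register(EntityRepresentation("Customer", "oracle_crm", {"id": 1, "name": "Acme"}))
repo.register(EntityRepresentation("Customer", "sap_erp", {"id": 1, "segment": "SMB"}))
print(repo.harmonize("Customer"))   # {'id': 1, 'name': 'Acme', 'segment': 'SMB'}
```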

———————————————————————————————————————————————-
– More than columnar databases, I see probabilistic databases: http://en.wikipedia.org/wiki/Probabilistic_database

A probabilistic database is an uncertain database in which the possible worlds have associated probabilities. Probabilistic database management systems are currently an active area of research. “While there are currently no commercial probabilistic database systems, several research prototypes exist…”[1]

Probabilistic databases distinguish between the logical data model and the physical representation of the data much like relational databases do in the ANSI-SPARC Architecture. In probabilistic databases this is even more crucial since such databases have to represent very large numbers of possible worlds, often exponential in the size of one world (a classical database), succinctly.
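A toy sketch of the possible-worlds idea (purely illustrative, not a real probabilistic DBMS): each uncertain tuple is assumed independently present with a given probability, and every possible world gets the product of the corresponding probabilities:

```python
from itertools import product

# Uncertain relation: each tuple exists independently with the given probability.
uncertain_rows = [
    ("alice", 0.9),   # (tuple, probability of being present)
    ("bob",   0.4),
]

worlds = []
for choices in product([True, False], repeat=len(uncertain_rows)):
    world = [row for (row, _), keep in zip(uncertain_rows, choices) if keep]
    prob = 1.0
    for (_, p), keep in zip(uncertain_rows, choices):
        prob *= p if keep else (1 - p)
    worlds.append((world, prob))

for world, prob in worlds:
    print(world, round(prob, 3))
# The number of worlds is exponential (2^n), which is why real systems need
# succinct representations instead of explicit enumeration.
print("total probability:", sum(p for _, p in worlds))  # 1.0
```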

————————————————————————————————————————————————
For big data analysis, the software that is getting popular today is IBM's big data analytics.
I am writing about this too; I have already written some possible case studies on where and how to implement it.
"Understanding Big Data" PDF attached.
———————————————————————————————————————————————–
There are a lot of other vendors that are also moving their products towards cloud computing; in the next release of SSIS, a Hadoop feed will be available as a source.
— MicroStrategy and Informatica already have it.
— This whole concept is based on the MapReduce algorithm from Google. There are online tutorials on MapReduce (PPT attached).
—————————————————————————————————————————————–

Without a doubt, data analysts have a powerful new tool in the “map/reduce” development model, which has recently surged in popularity as open source solutions such as Hadoop have helped raise awareness.

You may be surprised to learn that the map/reduce pattern dates back to pioneering work in the 1980s which originally demonstrated the power of data-parallel computing. Having proven its value to accelerate “time to insight,” map/reduce takes many forms and is now offered in several competing frameworks.
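Independent of Hadoop or any vendor framework, the pattern itself can be sketched in a few lines of Python: a map step emits (word, 1) pairs, a shuffle groups them by key, and a reduce step sums each group (the classic word-count illustration):

```python
from collections import defaultdict

def map_phase(document):
    # emit (key, value) pairs
    for word in document.split():
        yield word.lower(), 1

def shuffle(pairs):
    # group values by key, as the framework would between map and reduce
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    # one reducer call per key: sum the grouped values
    return {key: sum(values) for key, values in groups.items()}

docs = ["Hadoop is good for batch processing",
        "MapReduce is good for large parallel loads"]
pairs = [pair for doc in docs for pair in map_phase(doc)]
print(reduce_phase(shuffle(pairs)))
```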

If you are interested in adopting map/reduce within your organization, why not choose the easiest and best performing solution? ScaleOut StateServer’s in-memory data grid offers important advantages, such as industry-leading map/reduce performance and an extremely easy to use programming model that minimizes development time.

Here’s how ScaleOut map/reduce can give your data analysis the ideal map/reduce framework:

Industry-Leading Performance

  • ScaleOut StateServer’s in-memory data grids provide extremely fast data access for map/reduce. This avoids the overhead of staging data from disk and keeps the network from becoming a bottleneck.
  • ScaleOut StateServer eliminates unnecessary data motion by load-balancing the distributed data grid and accessing data in place. This gives your map/reduce consistently fast data access.
  • Automatic parallel speed-up takes full advantage of all servers, processors, and cores.
  • Integrated, easy-to-use APIs enable on-demand analytics; there’s no need to wait for batch jobs.


A Day in the Life of a Data Warehouse Consultant

The consultant analyses the business in depth to come up with the star-schema design and, further, the ETL load design.
Working as a data warehouse consultant, the most important task is to fix the granularity of the fact across the dimensions to be analysed in the fact-dimension star schema design.
Granularity depends on the business requirement and on the key drivers of the business to be analysed for their impact on the top line and bottom line of the company. For clinical research the key driver is the number of patients enrolled; for banking the key driver is the cost of adding a new customer.
Now the patient count is analysed across the geography dimension and against the time dimension, but at what level of granularity?
(# of patients, day)   OR
(# of patients, year)  OR
(# of patients, hour)
This depends on the business need and on how critical time is. For stock trading every second is crucial, but not for clinical trials; however, if a trial involves enrolment of a large population it may require a drill-down to per-day figures in the BI reports, hence provisions must be there in the star schema.
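A small sketch of what this granularity choice means in practice, on made-up enrolment events: the same fact rows can be rolled up per day or per year, and the schema has to keep the grain fine enough for the drill-down the business needs:

```python
from collections import defaultdict

# Hypothetical enrolment events: (date, country, patients_enrolled)
events = [
    ("2013-05-01", "IN", 12),
    ("2013-05-01", "US", 7),
    ("2013-05-02", "IN", 9),
    ("2014-01-15", "US", 20),
]

def aggregate(events, grain):
    """grain = number of leading characters of the date key to keep
    (10 = per day, 7 = per month, 4 = per year)."""
    totals = defaultdict(int)
    for date, country, patients in events:
        totals[(date[:grain], country)] += patients
    return dict(totals)

print(aggregate(events, 10))  # per-day grain
print(aggregate(events, 4))   # per-year grain
```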
Besides this, the other tasks for the day can be taken up based on the stage of the project:
https://sandyclassic.wordpress.com/2014/02/19/a-day-in-life-of-datawarehouse-architect-part-1/
For a data warehouse engineer involved with these tasks, the day looks like this:
https://sandyclassic.wordpress.com/2014/02/19/a-day-in-life-of-datawarehousing-engineer/
For unstructured data analysis you can look at:
https://sandyclassic.wordpress.com/2011/10/19/hadoop-its-relation-to-new-architecture-enterprise-datawarehouse/

Then data transformations are applied.
Examples in Informatica and SSIS:

https://sandyclassic.wordpress.com/2014/01/15/eaten-tv-from-partly-eaten-apple-part-2-artificial-intelligence/

There are two sets of documents, the LLD and the HLD, to look at for which transformations need to be applied.
For example, in Informatica the transformation types are:

http://www.techtiks.com/informatica/beginners-guide/transformations/transformation-types/

Look at all transformations available in Informatica version 9

http://www.folkstalk.com/2011/12/transformations-in-informatica-9.html

These can be customized according to the logic required.
The next step is loading into the data warehouse dimension tables and then into the fact table.
Read: https://sandyclassic.wordpress.com/2014/02/06/coke-vs-pepsi-of-datawarehousing-etl-vs-elt/
And more

https://sandyclassic.wordpress.com/2013/07/02/data-warehousing-business-intelligence-and-cloud-computing/

Security Risk Management in Healthcare on Cloud

A quick presentation made in one hour.


Security Risk Management in Healthcare on Cloud using NIST guidelines

Case Study: Artificial Intelligence, ETL and Data Warehousing Examples, Part 1

Read : IPTV and Augmented Reality using Artificial Intelligence.
https://sandyclassic.wordpress.com/2012/06/27/future-of-flex-flash-gamification-of-erp-enterprise-software-augmented-reality-on-mobile-apps-iptv/

AI is present in many places; for example, for one area of AI, fuzzy sets, there has been a fuzzy transformation in SQL Server Integration Services since 2010.
What does the fuzzy logic transformation achieve?
When we match two records, normally we do it by checking each character using exact matches. But when we use fuzzy logic it brings out similar-sounding and combination matches even though the characters may not be the same, and it also checks whether the meaning is the same. It can even override spelling mistakes to get the right results. How?
Example of fuzzy logic in SSIS:
USA, us, united states – for the country, any person may enter any of these combinations.
Usually this is taken up for data cleansing.
If the data is not cleaned using de-duplication, many of these records may not show up in the results for matches.
With fuzzy logic, from all the records we create a fuzzy set of records of the form
Set A = { ElementA, membershipOfElementA }
where membershipOfElementA defines, in percentage terms, the possibility of the element being in the similarly grouped set.
{us, 0.97} {united states, 0.98} {usa, 0.99} {united states of america, 1} – so if we set the tolerance level to 3%, all of these matches appear in the result.
Code you can see at http://www.codeproject.com/Tips/528243/SSIS-Fuzzy-lookup-for-cleaning-dirty-data
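The SSIS Fuzzy Lookup internals are proprietary, but the membership idea can be sketched with Python's standard difflib: compute a similarity score per candidate and accept everything above a tolerance threshold (pure character similarity, so only a rough stand-in):

```python
from difflib import SequenceMatcher

reference = "united states"
candidates = ["united  states", "untied states",
              "united states of america", "usa", "canada"]

def membership(value: str, ref: str) -> float:
    """Character-level similarity in [0, 1], used here as a fuzzy membership score."""
    return SequenceMatcher(None, value.lower(), ref.lower()).ratio()

tolerance = 0.35   # accept anything within 35% of a perfect match
for value in candidates:
    score = membership(value, reference)
    status = "match" if score >= 1 - tolerance else "reject"
    print(f"{value!r:30} membership={score:.2f} -> {status}")

# Note: pure character similarity will not catch abbreviations like "usa";
# real fuzzy lookups also use token- and phonetic-level features for that.
```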
SIRI: the speech-recognition search introduced in the iPhone long back takes speech as input.
Speech input to a pressure sensor –> generate a waveform –> then compare waveforms.
That is the process, but:
AI in SIRI
The waveform may be amplitude-modulated yet represent the same thing. Suppose we say "Apple": the two waveforms compared may have boundary-level aberrations, which can be described by a membership function; then, within the same tolerance limit, the result can be deemed to be similar. This membership can be calculated dynamically each time a person does a search and says something into the mike, repeating the same process again.
Lots of image processing and AI search algorithms, like A* search, can be built in to make it better.
Whether words are linked can already be understood by a neural network; in a similar way, neural networks are used to predict traffic congestion by aggregating data paths from street-light sensors in Tokyo, Japan.
Aggregation of words can be achieved by a neural network in a not exact but similar way, to some extent, thus completing the search.
This aggregation may be used on text, on covariance matrices of images, or on the covariance of sound scores for speech search, using cross-correlation. Read: http://en.wikipedia.org/wiki/Cross-correlation
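As a rough illustration of comparing two waveforms with cross-correlation (plain NumPy, not any speech-recognition library), the normalized correlation peak stays high when one signal is just an amplitude-scaled, shifted copy of the other:

```python
import numpy as np

t = np.linspace(0, 1, 500)
template = np.sin(2 * np.pi * 5 * t)                    # reference waveform
spoken = 0.6 * np.roll(template, 25)                    # amplitude-modulated, shifted copy
noise = np.random.default_rng(0).normal(size=t.size)   # unrelated signal

def similarity(a, b):
    """Peak of the normalized cross-correlation between two signals."""
    a = a - a.mean()
    b = b - b.mean()
    a = a / (np.linalg.norm(a) + 1e-12)
    b = b / (np.linalg.norm(b) + 1e-12)
    return float(np.max(np.correlate(a, b, mode="full")))

print("same word, different amplitude:", round(similarity(template, spoken), 2))
print("unrelated noise:               ", round(similarity(template, noise), 2))
```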
Now, TV is a large platform, just like the difference between watching a movie on a laptop or TV versus on a 70 mm screen; each of those has its own market.
Costly Miniaturisation
The effects you can provide on a TV may not be possible on a mobile until there is a technical breakthrough in miniaturisation. I am not saying it cannot be provided, but it will require a relatively bigger, or more costly, breakthrough in miniaturised chips.
Second, TV is like the last-mile connectivity we have in telecom.
So when you have something to watch on any storage device you can just throw it onto the TV ubiquitously; as a TV is there in every house, you need not carry a screen to watch, just like last-mile wireless connectivity using a hotspot.

Telecom Technology Stack – Part 2: Strategy (Oracle Stack + CRM for ARPU)

Part 1 read: https://sandyclassic.wordpress.com/2013/10/26/telecom-technology-stack/
The Enhanced Telecom Operations Map, e-TOM for short, gives the complete landscape of software products used by a telecom vendor.
Figure: the level-zero e-TOM software landscape.
More detail: read the previous blog.
Complementary read: http://en.wikipedia.org/wiki/Enhanced_Telecom_Operations_Map
The two major categories it is divided into are operations support systems (OSS) and business support systems (BSS). (Read the last article for more detail.)
https://sandyclassic.wordpress.com/2013/10/26/telecom-technology-stack/
Oracle has been trying to build a complete stack of OSS and BSS bundled into one product offering by acquiring companies, for example the acquisition of Portal Software in 2006 for billing:
http://en.wikipedia.org/wiki/Portal_Software
Convergin: a telecom service broker.
Figure: the Oracle Communications stack.
See the complete list of acquisitions:
http://en.wikipedia.org/wiki/List_of_acquisitions_by_Oracle
A software-based communication stack is also being defined by the Object Management Group (OMG), which maintains specifications for UML, CORBA (http://www.omg.org/spec/CORBA/) and other IDLs. You can read the complete list of specifications maintained by OMG at http://www.omg.org/spec/
In OSS, activation is the most important component; in BSS, mediation and billing are.
Want to be profitable? Focus on CRM analytics.
A telecom services company's profitability depends on the following.
These days, to maintain a good ARPU (Average Revenue Per User), CRM is most critical,
as tailoring of plans and a greater understanding of consumer behaviour can be achieved by studying the customer data inside the CRM.
CRM (Customer Relationship Management) software was the first set of ERP modules to go through a reversal in approach: while in ERP an analyst feeds in the customer's data (with a high probability of data errors), in CRM it is self-service automation, giving users themselves access to the forms where data can be entered.
Do you remember, when you take a Vodafone card, the seller tells you not to forget to enter your details in the portal so you get an extra top-up free? That is the same self-service automation, which generates forms and takes the data into the CRM system. So there is not only less work on data entry, and thus fewer errors and wrong bills, but also fewer wrongly targeted offerings by vendors, which would defeat the purpose itself. So CRM is very crucial.

Changes in CRM ecosystem?
CRM was the first set of software to embrace open source, with products like SugarCRM based on PHP and the LAMP stack. Why?
Reasoning: CRM is the one ERP module required not only by small-scale vendors but also by SMBs (small and medium businesses) as well as large vendors. Unlike other modules, say SAP Financials (which was hard for a small vendor not only to purchase, but where many of its sub-modules would remain redundant, or such detail is simply not required by SME vendors), many application vendors started adding features for SME CRM requirements. Out of this, SugarCRM, completely PHP based, was born.

Siebel dominated the CRM market as a vendor focused only on CRM, not on other modules.
It is highly customizable like an ERP and integrates with data management products like Informatica and DataStage, and with BI products like Business Objects or Siebel Analytics for reporting.

CRM was also the first set of software to enter the cloud. Why?
For precisely the same reason spelt out above. Also, the benefit of pay-per-use is greater for a small vendor, turning its capital expenditure (CAPEX) into operating expenditure (OPEX).
For SMEs too, CAPEX to OPEX makes more sense, rather than blocking money in expensive software purchases, maintenance and implementation.
Cloud-based Salesforce CRM was the hot technology in the cloud and has made perfect sense over the last 3-4 years.
To the extent that one Oracle Develop conference even had an inaugural address from Salesforce CEO Marc Benioff.
See this News: http://www.salesforce.com/ca/company/news-press/press-releases/2010/09/100913.jsp

The Presence in Every Basket Strategy
But for the next conference in 2011, since cloud tech was hot, it was a matter of speculation whether Marc Benioff would speak at Oracle Develop or not. Anyhow, Ellison had investments in both companies. It was like the P&G marketing strategy:
If you are in the high-income group, I have soap X for you.
If you are in the medium-income group, I have soap Y for you.
If you are in the low-income group, I have soap Z for you.
So every segment was covered.
Oracle was already present in non-cloud CRM through Oracle CRM and acquisitions (Oracle PeopleSoft CRM, Oracle Siebel CRM), and had an investment in cloud CRM through Salesforce.
CRM Analytics
The most crucial CRM analysis is churn analysis, which shows where customers are moving.
Showing the number of customers moving out of a particular plan can help in improving retention in that plan, or in improving the plan. You can also find which assets are moving via turnover ratios and offer discounts on the non-moving ones, to monetise assets that are holding up money circulation.
Market basket analysis will show the number of baskets into which customers can be grouped; use this for a targeted plan for each group/basket. Segmenting depends on a number of variables related to the consumer and on market conditions, like inflection points. Using analytics we can simulate to test hypotheses, variance and trends, extrapolate data based on induced conditions, and predictive analytics can further refine the trends and predict success factors.
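A toy sketch of the two analyses mentioned above, on made-up subscriber data: churn rate per plan, and a simple market-basket pair count (real CRM analytics would of course run on the warehouse, not on Python lists):

```python
from collections import Counter
from itertools import combinations

# (subscriber, plan, churned?)
subscribers = [
    ("s1", "unlimited", False), ("s2", "unlimited", True),
    ("s3", "prepaid",   True),  ("s4", "prepaid",   True),
    ("s5", "prepaid",   False),
]

plan_total, plan_churn = Counter(), Counter()
for _, plan, churned in subscribers:
    plan_total[plan] += 1
    plan_churn[plan] += churned
for plan in plan_total:
    print(plan, "churn rate:", plan_churn[plan] / plan_total[plan])

# Market basket: which add-on services are bought together?
baskets = [{"data_pack", "roaming"}, {"data_pack", "caller_tune"},
           {"data_pack", "roaming"}, {"roaming"}]
pair_counts = Counter()
for basket in baskets:
    for pair in combinations(sorted(basket), 2):
        pair_counts[pair] += 1
print(pair_counts.most_common(2))
```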

Coke vs Pepsi of Data Warehousing: ETL vs ELT

Coke and Pepsi are always fighting to have the bigger slice of the international drinks market.
Both are present in 180+ countries, aggressively pursuing market share.

Data warehouses are a different animal on the block. They are databases, but they are not normalized; they do not follow all 12 of Codd's rules, yet both source and target are RDBMSs.
The structure in which the data is saved, whether star schema or snowflake, is as denormalized as possible, like flat-file structures, because more constraints slow down the join process.
Read more: https://sandyclassic.wordpress.com/2014/01/26/a-day-in-life-of-business-intelligence-engineer/
So there emerged less constrained, much faster file-based alternatives to databases, driven by the need to store unstructured data and handle the 5 Vs: massive volume, variety, velocity etc. Read the links below:
https://sandyclassic.wordpress.com/2013/07/02/data-warehousing-business-intelligence-and-cloud-computing/
These have also found favour in the ETL world with Hadoop: now every ETL tool offers a Hadoop connector or adapter to extract data from Hadoop HDFS and similar services.
https://sandyclassic.wordpress.com/2013/06/18/bigdatacloud-business-intelligence-and-analytics/
(For an adapter use-case for a product offering, read: https://sandyclassic.wordpress.com/2014/02/05/design-pattern-in-real-world/)
The ETL Process

ETL: Extract-Transform-Load
In ETL the transformation happens in the staging area.
Extract data from the sources, put it in the staging area, cleanse it, transform it and then load it into the target data warehouse. Popular tools like Informatica, DataStage or Ab Initio use this approach. In Informatica, for the fetch or extract phase, we can use the fast Source Qualifier transformation, or the Joiner transformation when we have multiple different databases (say both SQL Server and Oracle); the Joiner may be slower but can take both inputs, whereas the Source Qualifier requires a single vendor but is fast.
After extracting, we can use the Filter transformation to filter out unwanted rows in the staging area, and then load into the target database.
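A minimal, tool-agnostic sketch of that extract → filter (staging) → load flow, with made-up cleansing rules; in Informatica these steps would correspond roughly to the Source Qualifier/Joiner, the Filter and the target load:

```python
def extract(*sources):
    """Pull rows from several source systems into the staging area."""
    for source in sources:
        yield from source

def transform(rows):
    """Cleanse and filter unwanted rows in staging."""
    for row in rows:
        if row.get("amount") is not None and row["amount"] >= 0:   # drop bad rows
            row["country"] = row["country"].strip().upper()        # cleanse
            yield row

def load(rows, target):
    """Write the cleansed rows into the target structure."""
    target.extend(rows)

oracle_rows = [{"country": " in ", "amount": 120}, {"country": "US", "amount": None}]
sqlserver_rows = [{"country": "us", "amount": -5}, {"country": "UK", "amount": 40}]

warehouse = []
load(transform(extract(oracle_rows, sqlserver_rows)), warehouse)
print(warehouse)   # only the cleansed, valid rows reach the target
```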

ELT: Extract, Load and then Transform
Extract data from disparate sources and load it into the RDBMS engine first; then use the RDBMS facilities to cleanse and transform the data. This approach was popularised by Oracle, because Oracle already had database intellectual property and was motivated to increase its usage: why do cleansing and transformation outside the RDBMS in a staging area rather than within the RDBMS engine? Oracle Data Integrator (ODI) uses this ELT concept rather than ETL, a bit of a reversal from routine.
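The same flow sketched as ELT, with the standard library's sqlite3 standing in for the target RDBMS: raw rows are loaded first and the transformation is expressed as SQL inside the engine (this illustrates the ODI-style idea only, it is not ODI itself):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stg_sales (country TEXT, amount REAL)")
conn.execute("CREATE TABLE dw_sales  (country TEXT, amount REAL)")

# 1. Extract + Load: dump the raw rows straight into the database.
raw = [(" in ", 120.0), ("US", None), ("us", -5.0), ("UK", 40.0)]
conn.executemany("INSERT INTO stg_sales VALUES (?, ?)", raw)

# 2. Transform inside the engine: cleansing and filtering as set-based SQL.
conn.execute("""
    INSERT INTO dw_sales (country, amount)
    SELECT UPPER(TRIM(country)), amount
    FROM stg_sales
    WHERE amount IS NOT NULL AND amount >= 0
""")

print(conn.execute("SELECT * FROM dw_sales").fetchall())  # [('IN', 120.0), ('UK', 40.0)]
```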

So, like the Pepsi vs Coke waves of advertising and guerrilla marketing, the games of showcasing each other's product strengths and hiding weaknesses continue here too, in the ETL world of data warehousing. Each approach has its own merits and demerits.


A Day in the Life of a Business Intelligence (BI) Architect – Part 1

The BI architect's most important responsibility is maintaining the semantic layer between the data warehouse and the BI reports.
There are basically two architect roles in data warehousing and BI: the BI architect and the ETL architect (the ETL architect will be covered in future posts).
Semantic Layer Creation
Once the data warehouse is built and BI reports need to be created, the HLD (High Level Design) and LLD (Low Level Design) are made in the requirement-gathering phase.
Using the HLD and LLD the BI semantic layer is built: in SAP BO it is called a Universe; in IBM Cognos it is created using Framework Manager (in old versions it was called a catalogue); in MicroStrategy it is called a project.
This semantic layer is built according to the SQL data requirements of the reports.
Note: using a semantic layer saves a lot of time in adjusting changed business logic in future change requests.
Real issues, for example problems in semantic layer creation in SAP BO, read:
https://sandyclassic.wordpress.com/2013/09/18/how-to-solve-fan-trap-and-chasm-trap/
Report Development:
Reports are created using the objects created in the semantic layer. Complex reporting requirements involve:
1. UI: requires a decision on the flavour of reporting tool, for example in IBM Cognos choosing from Query Studio, Report Studio, Event Studio, Analysis Studio or Metric Studio.
2. Tool modification: when SDK features are not enough, you need to modify using the Java/.NET or VC++ APIs, at the HTML level using AJAX/JavaScript APIs, or by integrating with third-party APIs.
3. Report-level macros/APIs for a better UI.
4. Most important are the data requirements, which may require coding procedures at the database or consolidation of various databases, joining Excel data with RDBMS and unstructured data using report-level features. The data features may be more complex than the UI.
5. User/data-level security and LDAP integration.
6. Complex scheduling or bursting of reports may require modification, rarely using shell scripts and mostly a scheduling tool.
The list is endless.
Read more details at:
https://sandyclassic.wordpress.com/2014/01/26/a-day-in-life-of-bi-engineer-part-2/

Integration with Third party and Security

After this, the BI UI has to be fixed to reflect the customer requirements. There might be integration with other products, and seamless integration of users via LDAP, and hence object-level security and user-level security of report data according to user roles.
For example, a manager sees a report with data that may not be visible to a clerk when he opens the same report, due to filtering of data by user role using user-level security.

BI over Cloud
For setting up BI over the cloud, read the blogs below.
Cloud Computing relation to Business Intelligence and Datawarehousing

Read :
1. https://sandyclassic.wordpress.com/2013/07/02/data-warehousing-business-intelligence-and-cloud-computing/

2. https://sandyclassic.wordpress.com/2013/06/18/bigdatacloud-business-intelligence-and-analytics/


A Day in the Life of a Business Intelligence Engineer

BI Requirements
Business Intelligence (BI) is used to analyse data in order to direct resources towards the areas that are most productive or profitable, by slicing and dicing the data available in a cube (a multi-dimensional data structure) along the critical areas of the business, defined by the dimensions along which the analysis has to be performed. Traditional database tables are just two-dimensional, but to move beyond two-dimensional analysis we have to use analytical queries. Analytical queries using functions like RANK, NTILE etc. are not very easy to conceptualize for complex requirements, so BI tools take the data and generate the analytical queries in the background, seamlessly, without the user being aware of it.
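As a rough illustration of the kind of analytical query a BI tool generates behind the scenes, here is a rank-within-group computation on made-up data; the SQL equivalent would be something like RANK() OVER (PARTITION BY region ORDER BY revenue DESC):

```python
from collections import defaultdict

# (region, product, revenue)
rows = [("EU", "A", 120), ("EU", "B", 200), ("US", "A", 90), ("US", "C", 300)]

by_region = defaultdict(list)
for region, product, revenue in rows:
    by_region[region].append((product, revenue))

# Rank products by revenue within each region (1 = best seller in that region).
for region, items in by_region.items():
    ranked = sorted(items, key=lambda item: -item[1])
    for rank, (product, revenue) in enumerate(ranked, start=1):
        print(region, product, revenue, "rank", rank)
```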
The Need for Data Warehouse Data Modelling
The data from disparate sources is accumulated into a single data warehouse, or into subject-specific data warehouses (also called data marts). The data warehouse consists of a fact table surrounded by dimension tables (in a ratio of about 80:20); most of the data is found in the fact table. If there are many more dimensions to analyse beyond this 20% limit, then we can say "too much analysis becomes paralysis": there would be so many combinations of dimensions on which the data could be analysed that it leads to lots of complex choices which may not be worth the effort.
So after proper business analysis the key drivers of the business are chosen, and around those drivers the parameters, key performance indicators (KPIs) or facts are created. The dimensions that are critical to the business are chosen to become the dimension tables.
For example, for a clinical research company the most critical driver is the number of patients, because only if patients are recruited by an investigator (doctor) can you perform drug testing.
So the facts inside the fact table used for analysis are measures centred around things like the number of patients enrolled, the number of patients who suffered an adverse effect, the number of patients in the randomized sample population, the cost of enrolling a patient, and so on. These are then analysed across dimensions like geographic area (countries), time (year 2012, 2013, etc.), study (analysed study-wise),
and so on.
The fact table will have the numeric fact values and the keys of the dimension tables. A star schema is roughly in 2nd Normal Form (all non-prime attributes depend on the prime attribute) but may have transitive dependencies. A transitive dependency is removed by 3rd Normal Form: if A–>B (read: B depends on A) and B–>C, then A–>C should not be there.

When in perfect 3rd Normal Form, the star schema becomes a snowflake schema, with dimensions decomposed further into sub-dimensions.
In the example snowflake schema (see figure) you can see that the Item dimension is broken out further into a Supplier table.
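A tiny illustrative sketch (made-up tables, not any real schema) of the difference just described: in the star form the Item dimension carries the supplier name directly (the transitive dependency item –> supplier –> supplier name), while snowflaking moves it into a separate Supplier table reached by one extra join:

```python
# Star schema: supplier attributes denormalized into the Item dimension.
item_star = {
    1: {"item_name": "syringe", "supplier_key": 10, "supplier_name": "MedCo"},
}

# Snowflake schema: the transitive dependency is moved into its own table.
item_snow = {1: {"item_name": "syringe", "supplier_key": 10}}
supplier = {10: {"supplier_name": "MedCo"}}

fact_row = {"item_key": 1, "patients_enrolled": 12}

# Resolving the supplier name requires one extra join in the snowflake form.
star_name = item_star[fact_row["item_key"]]["supplier_name"]
snow_name = supplier[item_snow[fact_row["item_key"]]["supplier_key"]]["supplier_name"]
print(star_name, snow_name)   # MedCo MedCo
```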

—————————————————————————————–
A Day's Time Schedule of a BI Engineer:
There is a lifecycle flow in BI projects: from requirement gathering to the High Level Design (HLD), the Low Level Design (LLD), and then using the design to create the reports.
Choose the right tool for the report.
Types of reports:
1. Ad-hoc reports, Slice-dice (cube),
2. Event based alerts reports, Scheduling options,
3. Operational reports,
4. Complex logic report (highly customized),
5. BI App report embedded with 3rd party API,
6. Program generated report, report exposed as web service,
7. In-memory system based report like IBM Cognos TM1 or SAP HANA
8. Reports from ERP (which have totally different dynamics, like BEx Analyzer for SAP Sales and Distribution reports). These go deep into the domain and fetch domain-specific as well as cross-domain functionality. Most domain reports in SAP are fetched using SAP ABAP reports; other options are SAP BEx Analyzer, Web Dynpro, SAP BO Crystal Reports or Web Intelligence. Like this, more than 20 flavours of reporting software exist within one report development product family like SAP BO and SAP.
Figure: some reporting flavours of SAP BO (9 of the 14 reporting tools are shown; the list is still not complete).
We will look at this in more detail later. (You can read the link above for the architectural differences between the two systems.)
9. Enterprise search: this is also part of the BI ecosystem. Microsoft FAST or Endeca can search an enterprise repository with indexes that point to the right data, not just documents. SharePoint CMS search, for example, brings back indexed documents, sets rights to view or edit, and sets user profiles, but it does not point to the right answers based on data.
Even a single unified metadata layer using Customer Data Integration (CDI) can correlate equivalences between entities across disparate ERPs. Enterprise search can use this intelligence and maintain a repository to throw up answers to the user's question in a search-like interface.
Let's look at each of these options in one product, like SAP BO/BI:
1. Web Intelligence / Desktop Intelligence
2. Crystal Reports
3. Polestar
4. QaaWS (Query as a Web Service)
5. Xcelsius
6. OutlookSoft (planning, now SAP BPS Business Planning and Simulation)
7. Business Explorer
8. Accelerator
Similarly, around 14 report options exist in BO itself; IBM Cognos (8 tools) has its own options, and MicroStrategy also has a set (5-7) of reporting options.
———–
Web Dynpro
BEx Analyzer
————————————————————————————
For the first few days one should understand the business, otherwise one cannot create effective reports.
9:00-10:00 Meet the customer to understand the key facts which affect the business.
10:00-12:00 Prepare the HLD (High Level Design) document containing the 10,000-foot view of the requirements,
version 1; it may be refined over subsequent days.
12:00-1:30 Attend the scrum meeting to update status to the rest of the team; coordinate with the team lead, architect and project manager on new activity assignments for new reports.
Usually the person handling one domain area of the business is given that domain's reports, since during earlier report development the resource has already acquired the domain knowledge
and does not need to learn a new domain; otherwise, if it is becoming monotonous, they may want to move to a new area (for example, sales-domain reports for chip manufacturers may contain demand planning etc.).
1:30-2:00 Document the new reports to be worked on today.
2:00-2:30 Lunch.
2:30-3:30 Look at the LLD and HLD of the new reports; find sources if they exist, otherwise the semantic layer needs to be modified.
3:30-4:00 Coordinate the report requirements of other resources with the architect to modify the semantic layer, plus other reporting requirements.
4:00-5:00 Develop/code reports, apply conditional formatting, set scheduling options, verify the data set.
5:00-5:30 Look at old defects and rectify issues (if there is a separate team for defect handling, then devote this time to report development).
5:30-6:00 Attend the defect management call and present resolved defects and pending issues with the testing team.
6:00-6:30 Document the work done and the status of assigned work.
6:30-7:30 Look at pending report issues; code or research workarounds.
7:30-8:00 Report optimisation/research.
8:00-8:30 Dinner, return home.
Of course one has to look at the bigger picture, hence the need to see which reports others have worked on.
One also needs to understand the ETL design and the design rules/transformations used for the project, and to try to develop frameworks and generic reports/code which can be reused.
Look at integration of these reports with ERP (SAP, PeopleSoft, Oracle Apps etc.), CMS (Joomla, SharePoint), scheduling options, cloud enablement, AJAX-ifying report web interfaces using third-party libraries or the report SDK, integration with web portals, and portal creation for reports.
These tasks take time as and when they arrive.

This article ranks at the top of the Google search results page for semantic web, OWL and the Internet of Things.
– Ontology can be represented by OWL (Web Ontology Language), which also refines and defines the agents used to search the personalized, behavioural web for you (also called the semantic web). These agents understand your behaviour and help to find better recommendations and search lists in the semantic space.
OWL:
Figure: OWL 2 structure.
– Semantic Web:
– Augmented reality: used in gaming and multimedia applications
(read the article linked below).
Perceived reality vs augmented reality:
augmented reality is fuelled by ontology + perceived reality.
READ: how augmented reality is transforming the gamification of software (like ERP):
https://sandyclassic.wordpress.com/2012/06/27/future-of-flex-flash-gamification-of-erp-enterprise-software-augmented-reality-on-mobile-apps-iptv/
– New age software development
https://sandyclassic.wordpress.com/2013/09/18/new-breed-of-app-development-is-here/
Ontology can integrate many tasks into one uniform task, which was not possible earlier.

Read the discussion on reality vs actuality on the Wikipedia page for ontology:
http://en.wikipedia.org/wiki/Ontology

This is an ongoing article; I am going to complete the pieces with examples for the topics below.
– CDI, Customer Data Integration (a single version of the truth for data): a single person can be an employee in the PeopleSoft ERP, the same person can be a customer in SAP CRM, and can be represented in a different way in Oracle Financials. But when we develop reports on some parameters across functional areas, categorisation into a single entity can be achieved through CDI.
How this is linked to ontology I will explain further on.
– MDM, Master Data Management (managing the data (metadata) about the data).
– Federated data management: several data marts leading to one universal data warehouse is one design, but in data federation the data from the various data marts is integrated virtually to create a single view of data from disparate sources.
This relationship will be expanded further; it is not complete now.

Mathematical Modelling of the Sensor Network

 

Modelling Wireless Sensor Networks

1. Go through the slides about modelling the wireless sensor network and the Internet of Things:

  • Slides 1-4 – 10 Project Goals: 1. Routing algorithms: SPIN, CTP. 2. Measure the energy consumed. 3. Validate the PPECEM model. 4. Improve the existing model for efficiency, reliability and availability. 5. New model: ERAECEM (Efficiency, Reliability, Availability Energy Consumption Estimation Model). 6. ERAQP, a new energy-aware routing algorithm for WSN based on the ERAECEM model. 7. A configurable routing-algorithm approach proposed for WSN motes utilizing user-defined QoS parameters. 8. Models for WSN: the leader-follower model and the directed diffusion model. 9. A fuzzy routing algorithm. 10. A fuzzy-information neural-network representation of a wireless sensor network.
  • Slide 5 – Motivation.
  • Slide 6 – 1.1 SPIN.
  • Slide 7 – 1.2 CTP: Collection Tree Protocol.
  • Slide 8 – 2. Energy measurement: an Agilent 33522B waveform generator was used to measure the current and voltage graphs. The graph measurements were then converted to numerical power: Power = Voltage × Current = V × I. The power consumed during mote routing with SPIN and CTP is added up to give the power consumption, and the values are applied to PPECEM.
  • Slide 9 – 1.3 WSN security.
  • Slide 10 – 3.1 Cost of security: the cost of security in a WSN can only be estimated by looking at the extra burden of the secure algorithm and its energy consumption, as energy is the key driver or critical resource in the design of a WSN; the design is completely dominated by the size of the battery supplying power to the mote.
  • Slide 11 – 3.2 PPECEM: QCPU = PCPU × TCPU = PCPU × (BEnc × TBEnc + BDec × TBDec + BMac × TBMac + TRadioActive)   (Eq. 2)
  • Slide 12 – 4. ERA: Efficiency = Ptr × Prc × Pcry; Reliability: Rnode1 = Ftr × Frc × Fcry; Availability: TFnode1 = Ftr + Frc + Fcry.
  • Slide 13 – 5. Improving the existing model: efficiency of the energy model QEff = QCPU × Eff (improvement #1 in the Zang model).
  • Slide 14 – ERAECEM: Etotal = average of efficiency, reliability and availability = (E + R + A)/3; efficiency of the energy model QEff = QCPU × Etotal (improvement #1 in the Zang model).
  • Slide 15 – 6. ERAQP: the Efficiency, Reliability, Availability QoS-prioritized routing algorithm. Nodes are ranked by ERA, and routing is based on the ranking cost with Dijkstra's algorithm to find the most suitable path (a rough sketch follows after this list).
  • Slide 16 – 7. Configurable routing: with q1, q2, q3 as QoS parameters, the algorithm ranks motes/nodes based on a combined score of these parameters; based on this rank we apply Dijkstra's algorithm to arrive at the least-cost path or to elect a cluster head. Thus q1, q2, q3 can be added or deleted.
  • Slide 17 – 8. Mathematical models: Leader-Follower, where each node shares a defined diffusion rate (given by a slider control on the UI) which tells the quantity it is diffusing to its neighbours; since it is a directed graph, node B may give data towards node A while traffic from A towards B is non-existent. Directed Diffusion, a mathematical model representing the diffusion of a quantity over a directed network; it helps in understanding the topology, density and stability of the network and is a starting point for designing a complex, realistic network model.
  • Slide 18 – 9. Fuzzy routing: fuzzy set A = {MoteA, p(A)}, where p(A) is the probability of data usage, or the percentage load (as a fraction) compared with the global load.
  • Slide 19 – 10. Fuzzy topology: based on this utilization p(A), nodes can be ranked in ascending order to find the least-loaded node at the top; then we can apply Dijkstra's algorithm on the network to find the best route based on the weight of each node, represented by its rank.
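A rough sketch of the ranking-plus-Dijkstra idea from the slides (goals 6, 7 and 10), with made-up QoS scores: each mote gets a combined score, the score becomes a node cost, and Dijkstra picks the cheapest path. This only illustrates the approach; it is not the actual ERAQP implementation:

```python
import heapq

# Hypothetical combined QoS score per mote (higher = better node).
qos = {"A": 0.9, "B": 0.4, "C": 0.8, "D": 0.7}
# Undirected links between motes.
links = {"A": ["B", "C"], "B": ["A", "D"], "C": ["A", "D"], "D": ["B", "C"]}

def node_cost(node):
    # Turn the QoS score into a routing cost: better nodes are cheaper to route through.
    return 1.0 - qos[node]

def cheapest_path(source, sink):
    """Dijkstra over node costs instead of edge weights."""
    best = {source: 0.0}
    queue = [(0.0, source, [source])]
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == sink:
            return cost, path
        for nxt in links[node]:
            new_cost = cost + node_cost(nxt)
            if new_cost < best.get(nxt, float("inf")):
                best[nxt] = new_cost
                heapq.heappush(queue, (new_cost, nxt, path + [nxt]))
    return float("inf"), []

print(cheapest_path("A", "D"))   # routes through C, the better-scoring neighbour
```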

2. WSN and BPEL and Internet Of Things (IoT)
https://sandyclassic.wordpress.com/2013/10/06/bpm-bpel-and-internet-of-things/

3. Internet Of Things (IoT) and effects on other device ecosystem.
The Changing Landscape:
https://sandyclassic.wordpress.com/2013/10/01/internet-of-things/

4. How application development changes with IoT, Bigdata, parallel computing, HPC High performance computing.
https://sandyclassic.wordpress.com/2013/09/18/new-breed-of-app-development-is-here/

 

Future of the Web: More Processors + More AJAX + More Web Services (Flexibility vs Security)

Asynchronous JavaScript and XML (AJAX) was first introduced through the XMLHttp object in the IE browser by Microsoft. Since then every browser and website runs on AJAX, not waiting for a synchronous reply from the server before sending another request; the client got the freedom to send any number of requests at any time without waiting for a reply to arrive.
AJAX shifted the web development balance towards web scripting languages like JavaScript and VBScript, away from the server-centric world of Java/ASP/servlets.
AJAX APIs have come into being for just about anything: mapping APIs from Google, charting and UI provided by jQuery, better UI through the Adobe Flex API using ActionScript, the ExtJS library, and so on.
The server-centric world has shifted more towards interoperability of web services with WS-I and JSR 21, for better utilisation: components written as web services can be consumed by Java or .NET or COBOL or SAP or Oracle Apps.
With web services defined by WSDL (Web Services Description Language), and the three roles of service broker, service requester and service provider interacting with each other to expose a web service, we get not only interoperability but also portability to other platforms, since all web service properties are in XML: a Java web service can talk to a .NET web service, passing property details in XML format.
Now if data fetched from a database is put into XML format, the data also becomes portable to another database: you can take data out of Oracle as XML and import it into SQL Server, and so on.
PHP was not only among the first to popularize AJAX but also among the first to introduce RESTful web services.
Now, instead of sending a SOAP envelope to the server, the client sends a plain HTTP request, so the web service request is just like a URL. This provided a lot of flexibility and, in some ways, security: earlier firewalls were made to inspect HTTP header requests but not SOAP header requests, and there were things like schema poisoning, SOAP injection etc.; but today's firewalls can track all of these in as much detail as we want.
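A minimal sketch of that point: a RESTful call is just an HTTP request on a URL (GitHub's public API is used here only as a convenient example endpoint):

```python
import json
from urllib import request

# Example public REST endpoint; the whole request is just an HTTP GET on a URL.
url = "https://api.github.com/users/octocat"
req = request.Request(url, headers={"Accept": "application/json"})
with request.urlopen(req) as resp:      # plain HTTP, no SOAP envelope
    data = json.load(resp)
print(data.get("name"), data.get("public_repos"))
```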
————————————————————————————
So the front end saw an AJAX API revolution and the server side moved towards RESTful web services. This flexibility in AJAX also raised a lot of security concerns, like cross-site scripting (XSS) attacks, injection of JavaScript, and CSRF (cross-site request forgery), along with cryptographic security at the web server through services like JAAS (Java Authentication and Authorization Service) and the JCA (Java Cryptography Architecture); .NET also has similar APIs.
————————————————————————————

Within a few years code shifted more from the server side to the client side with AJAX, JSON etc., so from a 90:10 ratio to the present 60:40 the movement has been relentless. But it has been marred by all the security concerns of code present on the client side.
Now we even have server-side JavaScript as well, with Node.js.
With flexible scripting libraries like Dojo, which provide many more options compared with plain JavaScript, we can now even talk the reverse way, by which JavaScript functions can call server-side code.
Usually the user's requests go through JavaScript to a server-side JSP/ASP/servlet.
But in the reverse way too, through Java Web Remoting (JWR), JavaScript can use methods implemented in server-side code directly and can request data, while the server, tightly coupled with the client, can refresh the client code at any time, as on cricket-score websites.
The server can refresh the client with new code, and in the same way the client can request new data; client JavaScript can use Java server-side functions directly using JWR.
https://sandyclassic.wordpress.com/2013/09/20/next-generation-application-developement/
More and more code will move towards the client; the AJAX-to-server code ratio of 40:60 will go to 50:50, maybe 60:40, as the number of cores available on a client machine's motherboard continues to increase, with an i7 having 1 master core and 6 slave cores (like a complete binary tree of 3 levels, so 2^3 - 1 = 7 CPUs).
In future there might be 23 CPUs in your system, so we can execute more code in parallel.
Read how this will affect programming:
https://sandyclassic.wordpress.com/2012/11/11/parallel-programming-take-advantage-of-multi-core-processors-using-parallel-studio/
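A small sketch of taking advantage of those extra cores from code, using Python's standard multiprocessing pool to spread independent chunks of work across CPUs:

```python
from multiprocessing import Pool, cpu_count

def heavy(n):
    """Stand-in for CPU-bound work on one chunk of data."""
    return sum(i * i for i in range(n))

if __name__ == "__main__":
    chunks = [200_000] * 8
    with Pool(processes=cpu_count()) as pool:      # one worker per available core
        results = pool.map(heavy, chunks)          # chunks are processed in parallel
    print(cpu_count(), "cores,", len(results), "chunks processed")
```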

BPM and its Influence on the Cloud: Infrastructure as a Service

Cisco, Microsoft and NetApp jointly produced a system called Opalis (a workflow system) in 2012.

Data centre system process interactions can be configured in Opalis depending on user needs, and rules can be set up for those interactions. Read the previous blog for more about BPM and the Internet of Things:

https://sandyclassic.wordpress.com/2013/10/06/bpm-bpel-and-internet-of-things/
Opalis essentially provides a workflow to dynamically create, monitor and deploy a machine instance and allocate an OS instance (just like in Nebula or Eucalyptus), and the user can also request a specific machine with a given amount of RAM, CPU and storage space.
Microsoft provides all the OS/software instances, NetApp provides the SAN and storage required, and Cisco provides the servers and Nexus switch boxes.
http://technet.microsoft.com/en-us/systemcenter/hh913943.aspx
It is integrated with Microsoft SCMM (System Centre Manager), used for creating a private cloud on Microsoft technologies, with a single user interface to administer the whole cloud.
Orchestration was discussed in the previous blog; in the case of Opalis its architecture looks like this:
Figure: Opalis orchestration architecture. Read: http://technet.microsoft.com/en-gb/library/hh420377.aspx
Read: opalis blog
http://blogs.technet.com/b/opalis/

Read: http://contoso.se/blog/?p=1665

If all of these exist, then they can be configured for a user using the Opalis BPM workflow.