A Day in the Life of a Data Warehouse Architect, Part 1

A data warehouse architect generally helps design the data warehouse: requirement gathering, ETL low-level design (LLD) and high-level design (HLD), and database infrastructure design for the warehouse, such as Storage Area Network (SAN) requirements and Real Application Clusters (RAC) for the warehouse database. More details are in the articles linked below.
Data warehousing consists of three main areas:
1. ETL (data migration, data cleansing, data scrubbing, data loading)
2. Data warehouse design
3. Business Intelligence (BI) reporting infrastructure
BI
Read this two-part article on the BI engineer role:
– https://sandyclassic.wordpress.com/2014/01/26/a-day-in-life-of-bi-engineer-part-2/
– https://sandyclassic.wordpress.com/2014/01/26/a-day-in-life-of-business-intelligence-engineer/
And on the BI architect:
https://sandyclassic.wordpress.com/2014/02/02/a-day-in-life-of-business-intelligence-bi-architect-part-1/

Design: now coming to area 2, data warehouse design (generally the work of the data warehouse architect).
Some details are below; more will be covered in future articles:
https://sandyclassic.wordpress.com/2013/07/02/data-warehousing-business-intelligence-and-cloud-computing/
9:00-9:30 Read and reply to mails.
9:30-10:30 Scrum Meeting
10:30-11:30 Update documents according to the Scrum meeting (burndown chart, etc.) and update all stakeholders.
11:30-12:00 Meeting with the client to understand new requirements; create/update the design specification from the requirements gathered.
12:00-13:30 Create the HLD/LLD from the user stories, according to the customer's technology landscape.
13:30-14:00 Lunch Break.
14:00-14:30 Update the estimations, coding standards, and best practices for the project.
14:30-15:30 Code walk-through; update the team on coding standards.
15:30-16:30 Defect call with the testing and development teams to understand defects, their root causes, and scope creep; discuss defect issues with the defect manager and review the issue/defect register.
16:30-17:30 Work on the data warehouse modelling specification: star or snowflake schema design according to business and granularity requirements (a star schema sketch appears in the ETL vs. ELT discussion below).
17:30-18:30 Look at technical challenges requiring out-of-the-box thinking and thought leadership; assess proof-of-concept fitment of leading-edge and bleeding-edge technologies from the project's perspective.
18:30-19:30 Code the POC and look at ways of tweaking it to achieve the technology goals.
19:30-20:30 Forward thinking about issues that might be faced ahead with a particular technology. This is a continuous, never-ending process: multiple combinations can achieve the same goal, and a chosen component or technology should not create vendor lock-in, cost issues (make/buy decisions), or problems with usability, scalability, or security, such as SQL or PL/SQL injection via AJAX or web services, XSS attacks, or web service schema poisoning (see the sketch after this schedule), plus environmental network scalability issues and the effect of upcoming technologies on existing code.
20:30 Dinner
Available on call for any deployment or production emergencies.
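
Since SQL injection comes up in the 19:30 slot above, here is a minimal sketch of the difference between a vulnerable concatenated query and a safe parameterized one. It uses Python with sqlite3 standing in for any RDBMS; the table, columns, and payload are hypothetical.

```python
import sqlite3

# Minimal illustration of SQL injection vs. a parameterized query.
# Table and column names are hypothetical; sqlite3 stands in for any RDBMS.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, role TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 'admin'), ('bob', 'analyst')")

user_input = "x' OR '1'='1"  # a classic injection payload

# VULNERABLE: string concatenation lets the payload rewrite the WHERE clause.
unsafe = conn.execute(
    "SELECT name, role FROM users WHERE name = '" + user_input + "'"
).fetchall()
print("unsafe query returned:", unsafe)  # returns every row in the table

# SAFE: a bound parameter is treated as data, never as SQL.
safe = conn.execute(
    "SELECT name, role FROM users WHERE name = ?", (user_input,)
).fetchall()
print("parameterized query returned:", safe)  # returns no rows
```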

The Coke vs. Pepsi of Data Warehousing: ETL vs. ELT

Coke and Pepsi are always fighting for a bigger slice of the international drinks market. Both are present in 180+ countries, aggressively pursuing market share.

Data warehouses are different animals on the block. They are databases, but they are not normalized and do not follow all of Codd's 12 rules, even though both source and target are RDBMSs.
The structure in which the data is saved, whether a star schema or a snowflake, is as denormalized as possible, closer to flat-file structures, because more constraints slow down the join process.
Read more: https://sandyclassic.wordpress.com/2014/01/26/a-day-in-life-of-business-intelligence-engineer/
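
To make the denormalization point concrete, here is a minimal star-schema sketch in Python using sqlite3. The fact and dimension tables and all column names are hypothetical, not from any particular project.

```python
import sqlite3

# Minimal star-schema sketch: one fact table surrounded by denormalized
# dimensions. Table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimensions are deliberately denormalized (flat): city and country live
-- directly on dim_store instead of in separate normalized tables.
CREATE TABLE dim_date  (date_key INTEGER PRIMARY KEY, full_date TEXT, month TEXT, year INTEGER);
CREATE TABLE dim_store (store_key INTEGER PRIMARY KEY, store_name TEXT, city TEXT, country TEXT);
-- The fact table holds measures at the chosen grain (one row per store per day).
CREATE TABLE fact_sales (
    date_key   INTEGER REFERENCES dim_date(date_key),
    store_key  INTEGER REFERENCES dim_store(store_key),
    units_sold INTEGER,
    revenue    REAL
);
""")

conn.execute("INSERT INTO dim_date VALUES (20240101, '2024-01-01', 'Jan', 2024)")
conn.execute("INSERT INTO dim_store VALUES (1, 'Downtown', 'Pune', 'India')")
conn.execute("INSERT INTO fact_sales VALUES (20240101, 1, 42, 1260.0)")

# A typical star join: the fact table joined to its dimensions in one step.
for row in conn.execute("""
    SELECT d.year, s.country, SUM(f.revenue)
    FROM fact_sales f
    JOIN dim_date  d ON f.date_key  = d.date_key
    JOIN dim_store s ON f.store_key = s.store_key
    GROUP BY d.year, s.country
"""):
    print(row)
```
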
Beyond denormalized RDBMS schemas, there are even less constrained, much faster file-based alternatives to databases, which emerged from the need to store unstructured data and achieve the 5 Vs of big data: massive volume, variety, velocity, etc. Read the links below:
https://sandyclassic.wordpress.com/2013/07/02/data-warehousing-business-intelligence-and-cloud-computing/
These have also found favour in the ETL world through Hadoop: now every major ETL tool offers a Hadoop connector or adapter to extract data from Hadoop HDFS and similar stores (a minimal extraction sketch follows below).
https://sandyclassic.wordpress.com/2013/06/18/bigdatacloud-business-intelligence-and-analytics/
(For an adapter use-case in a product offering, read: https://sandyclassic.wordpress.com/2014/02/05/design-pattern-in-real-world/)
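
As a rough idea of what such an adapter does underneath, here is a minimal Python sketch that extracts a file from HDFS over the WebHDFS REST interface. The namenode host, port, and file path are assumptions for illustration, and WebHDFS must be enabled on the cluster.

```python
from urllib.request import urlopen

# A minimal sketch of extracting a file from HDFS via the WebHDFS REST API,
# the kind of access an ETL connector wraps. Hostname, port, and path are
# hypothetical; WebHDFS must be enabled on the cluster.
NAMENODE = "http://namenode.example.com:9870"   # assumed WebHDFS endpoint
HDFS_PATH = "/warehouse/staging/sales.csv"      # assumed file on HDFS

url = f"{NAMENODE}/webhdfs/v1{HDFS_PATH}?op=OPEN"
with urlopen(url) as response:                  # follows the datanode redirect
    for raw_line in response:
        record = raw_line.decode("utf-8").rstrip("\n").split(",")
        print(record)  # hand each record to the ETL pipeline from here
```
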
ETL process

ETL: Extract-Transform-Load
In ETL, transformation happens in a staging area: extract data from the sources, put it in the staging area, cleanse and transform it, and then load it into the target data warehouse. Popular tools like Informatica, DataStage, and Ab Initio use this approach. In Informatica, for example, the extract phase can use the fast Source Qualifier transformation, which requires a single-vendor source, or the Joiner transformation when there are multiple different databases, such as SQL Server and Oracle together; the Joiner may be slower, but it can take both inputs.
After extracting, we can use a Filter transformation to drop unwanted rows in the staging area, then load into the target database.
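
Here is a minimal Python sketch of that ETL flow, with plain lists and sqlite3 standing in for the Informatica transformations and the source/target databases; the rows and table names are hypothetical.

```python
import sqlite3

# Minimal ETL sketch: Extract -> Transform (in staging) -> Load.
# Rows and table names are hypothetical; sqlite3 stands in for the RDBMSs.

# Extract: pull raw rows from the source (think Source Qualifier).
source_rows = [
    ("alice", "2024-01-01", 120.0),
    ("bob",   None,         -5.0),   # bad row: missing date, negative amount
    ("carol", "2024-01-02", 80.5),
]

# Transform in the staging area: cleanse and filter out unwanted rows
# (think Filter transformation) before touching the target.
staging = [
    (name.upper(), sale_date, amount)
    for name, sale_date, amount in source_rows
    if sale_date is not None and amount >= 0
]

# Load: write only the cleansed rows into the target warehouse table.
target = sqlite3.connect(":memory:")
target.execute("CREATE TABLE fact_sales (customer TEXT, sale_date TEXT, amount REAL)")
target.executemany("INSERT INTO fact_sales VALUES (?, ?, ?)", staging)
print(target.execute("SELECT * FROM fact_sales").fetchall())
```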

ELT: Extract, Load, and then Transform
Extract data from disparate sources and load it into the RDBMS engine first; then use the RDBMS's own facilities to cleanse and transform the data. This approach was popularised by Oracle: it already owned the database intellectual property and was motivated to increase its usage, so why do cleansing and transformation outside the RDBMS in a staging area rather than within the engine? Oracle Data Integrator (ODI) uses this ELT concept, a bit of a reversal from the routine.
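
For contrast, a minimal Python sketch of the ELT flow over the same hypothetical data: the raw rows are loaded first, and set-based SQL inside the engine does the cleansing that ETL would have done in a staging area. Again, sqlite3 stands in for Oracle.

```python
import sqlite3

# Minimal ELT sketch: Extract -> Load raw -> Transform inside the engine.
# Table names are hypothetical; sqlite3 stands in for Oracle.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE raw_sales (customer TEXT, sale_date TEXT, amount REAL)")
db.execute("CREATE TABLE fact_sales (customer TEXT, sale_date TEXT, amount REAL)")

# Load first: raw, uncleansed rows go straight into the database.
db.executemany("INSERT INTO raw_sales VALUES (?, ?, ?)", [
    ("alice", "2024-01-01", 120.0),
    ("bob",   None,         -5.0),   # bad row, loaded as-is
    ("carol", "2024-01-02", 80.5),
])

# Transform afterwards, inside the engine: set-based SQL does the cleansing
# and filtering that ETL would have done outside, in the staging area.
db.execute("""
    INSERT INTO fact_sales
    SELECT UPPER(customer), sale_date, amount
    FROM raw_sales
    WHERE sale_date IS NOT NULL AND amount >= 0
""")
print(db.execute("SELECT * FROM fact_sales").fetchall())
```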

So, like the Pepsi vs. Coke waves of advertising and guerrilla marketing, the game of showcasing one's own product's strengths while hiding its weaknesses continues here in the ETL world of data warehousing as well. Each approach has its own merits and demerits.

Cloud Computing's Relation to Business Intelligence and Data Warehousing
Read:
1. https://sandyclassic.wordpress.com/2013/07/02/data-warehousing-business-intelligence-and-cloud-computing/
2. https://sandyclassic.wordpress.com/2013/06/18/bigdatacloud-business-intelligence-and-analytics/

Cloud Computing and Unstructured Data Analysis Using Apache Hadoop Hive
Read:
https://sandyclassic.wordpress.com/2013/10/02/architecture-difference-between-sap-business-objects-and-ibm-cognos/
It also compares the architecture of two popular BI tools.

Cloud Data Warehouse Architecture:
https://sandyclassic.wordpress.com/2011/10/19/hadoop-its-relation-to-new-architecture-enterprise-datawarehouse/

Future of BI
No one can predict the future, but these are the directions in which BI is moving:
https://sandyclassic.wordpress.com/2012/10/23/future-cloud-will-convergence-bisoaapp-dev-and-security/