Hive details in brief

Hive code-up: why Hive + HBase, and where does Pig come in?

The Big Data Trends

Hive in a nutshell:

Cons of Hive:
- not real time (suited to batch analytics and aggregation over large datasets)
- high latency
- schema on read (fast load and flexibility, but slow query time)
- no row-level insert/update/delete (before ACID tables arrived in Hive 0.14)
- no transaction support, and only limited subquery support

HBase: row-level updates, rapid queries, row-level transactions
Pig: a data flow language; great for ETL but not good for ad-hoc querying
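
To make the schema-on-read trade-off concrete, here is a minimal Python sketch. It is a toy, not Hive code: the `RawTable` class and the sample rows are made up for illustration. Loading only appends raw lines, which is why loads are fast and flexible. The schema is applied each time the data is scanned, which is where the query-time cost comes from.

```python
# Toy illustration of schema-on-read; this is NOT how Hive is implemented.
# RawTable and the sample data are hypothetical names for this sketch.

class RawTable:
    def __init__(self, columns, delimiter=","):
        # columns: list of (name, converter) pairs, i.e. the "schema"
        self.columns = columns
        self.delimiter = delimiter
        self.lines = []  # raw, unparsed lines

    def load(self, lines):
        # Load is cheap: no parsing, no validation, just keep the raw text.
        self.lines.extend(lines)

    def scan(self):
        # The schema is applied here, at read time, on every scan.
        for line in self.lines:
            fields = line.split(self.delimiter)
            row = {}
            for i, (name, convert) in enumerate(self.columns):
                try:
                    row[name] = convert(fields[i])
                except (IndexError, ValueError):
                    # Missing or malformed fields become None,
                    # similar in spirit to a NULL in a query result.
                    row[name] = None
            yield row


table = RawTable([("user", str), ("clicks", int)])
table.load(["alice,3", "bob,oops", "carol"])  # bad rows load without complaint
rows = list(table.scan())
total_clicks = sum(r["clicks"] for r in rows if r["clicks"] is not None)
```

Note that the malformed rows are only noticed when `scan()` runs. A schema-on-write system would have rejected them at load time instead.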

Beeline (the CLI client) –> JDBC –> HiveServer2 (Thrift) –> Hive
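
As a rough sketch of the client side of that path, the Python helper below builds a Beeline command line for a HiveServer2 connection. The host, user, database, and query are placeholders, not values from these notes. Port 10000 is assumed as the usual HiveServer2 default and may differ on your cluster.

```python
# Sketch: build the beeline command line for a HiveServer2 connection.
# Host, database, user and query below are placeholders for illustration.

def hive2_jdbc_url(host, port=10000, database="default"):
    # 10000 is assumed as the usual HiveServer2 port; adjust as needed.
    return f"jdbc:hive2://{host}:{port}/{database}"

def beeline_command(host, user, query, port=10000, database="default"):
    # -u: JDBC URL, -n: user name, -e: query to execute
    url = hive2_jdbc_url(host, port, database)
    return ["beeline", "-u", url, "-n", user, "-e", query]

cmd = beeline_command("localhost", "hive", "SHOW TABLES;")
```

The resulting list can be passed to `subprocess.run(cmd)` on a machine where Beeline is installed.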
Hive metastore: the system catalog (tables, schemas, columns, partitions).
The metastore maps the file structure to tabular form in Hive.
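
The idea of mapping a file structure to a table can be sketched with a toy catalog in Python. It only shows the kind of information involved, since the real metastore is a relational database behind a service. The table name, columns, and paths are invented for the example. The `key=value` directory naming follows Hive's partition layout.

```python
# Toy model of what a metastore records; all names and paths are made up.

metastore = {
    "page_views": {
        "location": "/user/hive/warehouse/page_views",
        "columns": [("user_id", "string"), ("url", "string")],
        "partition_keys": ["dt", "country"],
    }
}

def partition_path(table, **partition_values):
    meta = metastore[table]
    # Partitions are laid out as nested key=value directories, in key order.
    parts = [f"{key}={partition_values[key]}" for key in meta["partition_keys"]]
    return "/".join([meta["location"]] + parts)

path = partition_path("page_views", dt="2015-01-01", country="US")
```

A query that filters on `dt` and `country` only needs to read files under that one directory, which is why the catalog tracks partitions as well as columns.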

HCatalog
Built on the Hive metastore: Hive –> metastore –> HCatalog (Hive DDL)
HCatalog (Hive CLI) –> read/write interface –> Pig/MapReduce

Directory:
• bin – executables to start/stop/configure/check the status of Hive, plus various scripts
• conf – Hive environment, metastore, security, and log configuration files
• doc – Hive documentation and Hive examples
• lib – server’s JAR files
• man – man page information
• scripts – scripts for upgrading Derby and MySQL metastores from one version of Hive to the next

Hive metadata: services/datastore
configuration:
1. embedded (for test environment)

