Integrating Hive with HBase

A brief introduction to Hive: Apache Hive is data warehouse software that facilitates querying and managing large datasets residing in distributed storage. Hive provides a SQL-like language called HiveQL for querying the data. Hive is considered friendlier and more familiar to users who are accustomed to querying data with SQL. Hive is best […]

Beginner’s Guide for Spark

In this blog we will discuss the basics of Spark’s functionality and its installation. Apache Spark is a cluster computing framework that can run on top of the Hadoop ecosystem and handles different types of data. It is a one-stop solution to many problems. Spark has rich resources for handling the data and most […]