Pig Script in Local Mode

Step 1: Writing a Script. Open an editor (e.g. gedit) in your Cloudera Demo VM environment. Write the following command to create the ‘sample.pig’ file inside the home directory of the cloudera user: Command: gedit sample.pig. Let’s write a few Pig commands in the sample script! Let’s say our task is to read data from a […]


Hadoop – the solution for deciphering the avalanche of Big Data – has come a long way from the time Google published its paper on Google File System in 2003 and MapReduce in 2004. It created waves with its scale-out and not scale-up strategy. Inroads from Doug Cutting and team at Yahoo and Apache Hadoop […]

HiveQL – Word Count

Word Count in Hive. In this post I am going to discuss how to write a word count program in Hive. Assume we have data in our table like below: This is a Hadoop Post and Hadoop is a big data technology. We want to generate a word count like below: a 2 and 1 Big 1 […]

Pig Latin – Word Count

Word Count in Pig Latin. In this post, we learn how to write a word count program using Pig Latin. Assume we have data in the file like below: This is a hadoop post hadoop is a bigdata technology. We want to generate output with the count of each word like below: (a,2) (is,2) (This,1) (class,1) […]

Linux Interview Questions For Beginners: Top Questions You Must Prepare For In 2016

System Administrator, Storage Administrator, Web Applications Expert, Database Administrator – these are just a handful of job titles that have seen an upsurge since October 2015 (according to Indeed.com). Job opportunities are skyrocketing, and with organizations adopting Linux far and wide, Linux Administrator roles are getting hard to fill. The signal has never been clearer […]

Working With Hive Complex Data Types

In this blog, we will discuss how complex data types work in Hive. Before we move ahead, you can go through the blogs linked below to gain more knowledge of Hive and how it works: Beginners Guide For Hive; Perform Word Count Job Using Hive; Pokemon Data Analysis Using Hive; Bucketing in Hive – Let’s […]

Java – Wrapper Class vs Primitive Class

Each of Java’s eight primitive data types has a class dedicated to it. These are known as wrapper classes because they “wrap” the primitive data type into an object of that class. The wrapper classes are part of the java.lang package, which is imported by default into all Java programs. The wrapper classes in Java […]
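As a minimal sketch of the "wrapping" described above: the class below boxes primitive `int` values into `Integer` objects so they can be stored in a collection, then unboxes them again to sum them. The class name `WrapperDemo` and its method names are hypothetical, chosen only for illustration; `Integer`, `List`, and `ArrayList` come from the standard library.

```java
import java.util.ArrayList;
import java.util.List;

public class WrapperDemo {
    // Boxes each primitive int into an Integer so it can be stored in a List,
    // which can only hold objects, not primitives.
    public static List<Integer> boxAll(int[] values) {
        List<Integer> boxed = new ArrayList<>();
        for (int v : values) {
            boxed.add(v); // autoboxing: int -> Integer
        }
        return boxed;
    }

    // Sums the list; each Integer is unboxed back to an int on the way.
    public static int sum(List<Integer> boxed) {
        int total = 0;
        for (Integer b : boxed) {
            total += b; // auto-unboxing: Integer -> int
        }
        return total;
    }

    public static void main(String[] args) {
        List<Integer> boxed = boxAll(new int[] {1, 2, 3});
        System.out.println(sum(boxed));
        // Wrapper classes also carry utility methods, such as parsing text.
        System.out.println(Integer.parseInt("42") + 1);
    }
}
```

No import is needed for `Integer` itself because it lives in java.lang; only the collection types require explicit imports.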

UPI in Boosting a Cashless Indian Economy

After the demonetization hullabaloo last year, India is slowly inching its way to a cashless economy, and UPI is a buzzword that is resonating through the air. So what is UPI, or Unified Payments Interface? UPI is a payment system that allows money transfer between any two bank accounts via smartphone. The mechanism allows a […]

How to Run Hive Scripts?

Being a data warehousing package built on top of Hadoop, Apache Hive is increasingly used for data analysis, data mining and predictive modeling. In this post, let’s look at how to run Hive scripts. In general, we use scripts to execute a set of statements at once. Hive Scripts are used pretty much […]

Difference Between RHEL & Debian – Linux

There are hundreds of Linux distributions available for free (in the other sense). Every Linux enthusiast has a special taste for a certain distribution at some point in time. The taste for a specific distribution largely depends upon the intended area of application. Some of the famous Linux distributions and their areas of application are listed below. Fedora: Cutting Edge […]

Sentiment Analysis on Demonetization – Pig Use Case

Let us find out the views of different people on demonetization by analysing tweets from Twitter. Here is the dataset, where the tweets are gathered in CSV format. You can download the dataset from the link below: https://drive.google.com/open?id=0B2nmxAJLHEE8amhpbTl5STEzZTQ Now we will load the data into Pig using PigStorage as follows: load_tweets = […]

Importance of Big Data in The Banking (BFSI) Segment

If in 2015 banking and financial markets firms were infants in utilizing Big Data to effectively transform their processes and organizations, they have turned out to be toddlers in 2016 as they inch forward from various stages of their activity with Big Data. It is encouraging to see banks continuing to make progress on drafting […]