Pig Script in Local Mode

Step 1: Writing a Script. Open an editor (e.g. gedit) in your Cloudera Demo VM environment. Run the following command to create the ‘sample.pig’ file in the home directory of the cloudera user: Command: gedit sample.pig Let’s write a few Pig commands in the sample script! Let’s say our task is to read data from a […]


Hadoop – the solution for deciphering the avalanche of Big Data – has come a long way since Google published its papers on the Google File System in 2003 and MapReduce in 2004. It created waves with its strategy of scaling out rather than scaling up. Inroads from Doug Cutting and team at Yahoo and Apache Hadoop […]

HIVE-QL -word count

Word Count in Hive. In this post I am going to discuss how to write a word count program in Hive. Assume we have data in our table like the following: “This is a Hadoop Post and Hadoop is a big data technology”, and we want to generate a word count like the following: a 2, and 1, Big 1 […]

PIG-LATIN – word Count

Word Count in Pig Latin. In this post, we learn how to write a word count program using Pig Latin. Assume we have data in the file like the following: “This is a hadoop post hadoop is a bigdata technology”, and we want to generate a count of each word like the following: (a,2) (is,2) (This,1) (class,1) […]

Linux Interview Questions For Beginners: Top Questions You Must Prepare For In 2016

System Administrator, Storage Administrator, Web Applications Expert, Database Administrator – these are just a handful of job titles that have seen an upsurge since October 2015 (according to Indeed.com). Job opportunities are skyrocketing, and with organizations adopting Linux far and wide, Linux Administrator roles are getting hard to fill. The signal has never been clearer […]

PIG – Relational Operators

In this instructional post, we will explore a few important relational operators in Pig, which is widely used in the big data industry. Before we look at relational operators, let us see what Pig is. Apache Pig, developed by Yahoo!, helps in analyzing large datasets while spending less time writing mapper and reducer programs. Pig […]

SQL vs No-SQL Database

Do you have lots of questions flooding your mind when you think about SQL and NoSQL? Then this article gives you a complete rundown of the major differences between the two, and their benefits. What are the key differences between SQL and NoSQL? SQL vs. NoSQL: SQL databases are called relational databases or RDBMS – […]

Working With Hive Complex Data Types

In this blog, we will discuss how complex data types work in Hive. Before we move ahead, you can go through the blogs linked below to learn more about Hive and how it works: Beginners Guide For Hive; Perform Word Count Job Using Hive; Pokemon Data Analysis Using Hive; Bucketing in Hive – Let’s […]

Java – Wrapper Class vs Primitive Class

Each of Java’s eight primitive data types has a class dedicated to it. These are known as wrapper classes because they “wrap” the primitive data type into an object of that class. The wrapper classes are part of the java.lang package, which is imported by default into all Java programs. The wrapper classes in Java […]
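To make the idea of “wrapping” concrete, here is a minimal sketch of boxing and unboxing with Integer, the wrapper for int. The class name WrapperDemo and its helper methods are hypothetical names chosen for this illustration and do not come from the original post.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; the names here are illustrative only.
public class WrapperDemo {

    // Explicitly wrap a primitive int into an Integer object.
    public static Integer box(int value) {
        return Integer.valueOf(value);
    }

    // Explicitly unwrap an Integer back into a primitive int.
    public static int unbox(Integer value) {
        return value.intValue();
    }

    // Collections store objects, so each int is autoboxed to Integer
    // when added and auto-unboxed when used in arithmetic.
    public static int sum(int... values) {
        List<Integer> boxed = new ArrayList<>();
        for (int v : values) {
            boxed.add(v);      // autoboxing: int -> Integer
        }
        int total = 0;
        for (Integer b : boxed) {
            total += b;        // auto-unboxing: Integer -> int
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(box(42));
        System.out.println(unbox(Integer.valueOf(7)));
        System.out.println(sum(1, 2, 3));
    }
}
```

No import is needed for Integer itself because it lives in java.lang; only the collection types are imported.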

UPI in Boosting a Cashless Indian Economy

After the demonetization hullabaloo last year, India is slowly inching its way toward a cashless economy, and UPI is a buzzword that is resonating through the air. So what is UPI, or Unified Payments Interface? UPI is a payment system that allows money transfer between any two bank accounts via smartphone. The mechanism allows a […]

How to Run Hive Scripts?

As a data warehousing package built on top of Hadoop, Apache Hive is increasingly used for data analysis, data mining and predictive modeling. In this post, let’s look at how to run Hive scripts. In general, we use scripts to execute a set of statements at once. Hive Scripts are used pretty much […]

Difference b/w RHEL & DEBIAN – LINUX

There are hundreds of Linux distributions available, for free (in the other sense). Every Linux enthusiast has a special taste for a certain distribution at some point in time. The taste for a specific distribution largely depends on the intended area of application. Some of the famous Linux distributions and their areas of application are listed below. Fedora: Cutting Edge […]