Beginner’s Guide for Spark

In this blog, we discuss the basics of Spark’s functionality and its installation. Apache Spark is a cluster computing framework that can run on top of the Hadoop ecosystem and handles different types of data. It is a one-stop solution to many problems. Spark has rich resources for handling data and most […]

Importance of Big Data in the Banking (BFSI) Segment

If banking and financial markets firms were infants in 2015 in using Big Data to transform their processes and organizations, they became toddlers in 2016, inching forward from various stages of their activity with Big Data. It is encouraging to see banks continuing to make progress on drafting […]

MapReduce Use Case – Uber Data Analysis

In this post, we analyze the Uber dataset in Hadoop using MapReduce in Java. The Uber dataset consists of four columns: dispatching_base_number, date, active_vehicles, and trips. You can download the dataset from here. Problem Statement 1: In this problem statement, we will find the days on which each […]

Scala Tutorial Part 2 – Basics of Scala

In our previous blog, we covered the introduction to functional programming. In this blog, we discuss the control structures and functions in Scala. We explain the arithmetic, relational, and logical operators, if-else statements, while loops, for loops, and methods. Arithmetic Operators: As in other languages, […]

Run Your MapReduce Code Locally

In this blog, we explain in detail how to run your MapReduce code locally in Eclipse on any Linux machine. After reading this blog, you can easily run your MapReduce code in Eclipse without starting any of your Hadoop daemons. Before getting started, let us learn something about local mode […]

Beginner’s Guide for Sqoop

In this blog, we discuss the basics of Sqoop. You will also learn how to import data from an RDBMS to HDFS and export data from HDFS into an RDBMS using Sqoop. Note: We assume that Sqoop is already installed on your system and all the required connectors and jar files are imported […]