Pig Script in Local Mode
Step 1: Writing a Script
- Open an editor (e.g. gedit) in your Cloudera Demo VM environment.
- Run the following command to create the file ‘sample.pig’ in the home directory of the cloudera user:
Command: gedit sample.pig
Let’s write a few Pig commands in the sample script.
Let’s say our task is to read data from a data file and to display the required contents on the console as output.
The sample data file contains the following data:
Save the text file with the name ‘information.txt’.
The file contains five tab-separated columns: FirstName, LastName, MobileNo, City, and Profession. Our task is to read the contents of this file and display the first name, mobile number, and profession of each contact.
Because we are running Pig in local mode, this file must be present on the local file system.
Edit the Pig script (sample.pig) to include the following commands:
Here, the data file ‘information.txt’ is in the cloudera user’s home directory, so the script specifies the file path ‘/home/cloudera/information.txt’.
Save and close the file.
- The first command loads the file information.txt into variable A and assigns it the schema (FName, LName, MobileNo, City, Profession).
- The second command selects the required fields from variable A and stores them in variable B.
- The third command displays the contents of variable B on the terminal/console.
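The three commands described above might look like the following. This is a minimal sketch rather than the original listing: it assumes the fields are tab-delimited, as stated earlier, and declares every field as chararray, which is an assumption about the types.

```pig
-- Load the tab-separated file from the local file system and name the five fields.
-- Declaring each field as chararray is an assumption; adjust the types as needed.
A = LOAD '/home/cloudera/information.txt' USING PigStorage('\t')
    AS (FName:chararray, LName:chararray, MobileNo:chararray, City:chararray, Profession:chararray);

-- Keep only the first name, mobile number, and profession of each contact.
B = FOREACH A GENERATE FName, MobileNo, Profession;

-- Print the contents of B to the console.
DUMP B;
```

PigStorage uses tab as its default delimiter, so the USING clause could be omitted; it is spelled out here to make the file format explicit.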
Step 2: Execute the Pig Script
To execute the Pig script in local mode, run the following command:
Command: pig -x local sample.pig
Review the result.
Congratulations on successfully executing the Pig script and getting a step ahead in Pig programming!