Sunday, December 9, 2012

Oozie Installation

I installed Oozie and ran a workflow from the bundled samples. The process was a little cumbersome, so I am posting the steps on my blog in case they help someone.

Prerequisites:-  I use Mac OS X Lion, and this is what I had in place beforehand.
 a) Hadoop installed at /Users/hadoop/hadoop-0.20.2, referred to below as HADOOP_HOME
 b) Java version 1.6.0_33 installed
 c) Maven 3.0.x installed under /Users/hadoop/apache-maven-3.0.3
 d) For readability, when I mention the Hadoop config, I mean the files under $HADOOP_HOME/conf.
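
Before starting, it can help to confirm that the prerequisite tools are actually on the PATH. The following is only a minimal sketch: the helper names have_tool and check_prereqs are made up for illustration, and the tool names you pass in depend on your own setup.

```shell
# Succeed if the given command can be found on PATH.
have_tool() {
    command -v "$1" >/dev/null 2>&1
}

# Print one status line per tool; return non-zero if any tool is missing.
check_prereqs() {
    missing=0
    for tool in "$@"; do
        if have_tool "$tool"; then
            echo "ok: $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# Example call for the setup in this post (not run here):
#   check_prereqs java mvn hadoop
```

This only checks that the commands exist. It does not check their versions.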

1) Download Oozie from the site:-  Oozie Download

2) Extract the downloaded tar to /Users/hadoop/. The extracted directory is referred to below as OOZIE_HOME.

3) Open a terminal and go to the directory $OOZIE_HOME/bin.

4) Run the script mkdistro.sh -DskipTests:- This step downloads all the jars required for Oozie and builds the incubating distribution under $OOZIE_HOME/distro/target.

5) Download the zip ext-2.2.zip, the ExtJS library that Oozie uses for its web console, where you can look at the jobs that are scheduled, running, etc. You can download this from the link:- ExtJs2.2.zip

6) Create the oozie.war by running the following command:-  $OOZIE_HOME/distro/target/oozie-3.2.0-incubating-distro/oozie-3.2.0-incubating/bin/oozie-setup.sh -hadoop 0.20.2 / -extjs /Users/hadoop/ext-2.2.zip

7) The above step builds the oozie.war from the incubating distribution, bundles the required Hadoop jars into it, and adds ExtJS to the oozie-server.

8) Now copy the newly created oozie.war to the webapps location where Tomcat runs.
cp $OOZIE_HOME/distro/target/oozie-3.2.0-incubating-distro/oozie-3.2.0-incubating/oozie-server/webapps/oozie.war $OOZIE_HOME/webapp/src/main/webapp/oozie.war

9) Edit the configuration file $OOZIE_HOME/distro/target/oozie-3.2.0-incubating-distro/oozie-3.2.0-incubating/conf/oozie-site.xml and set the following property to true, so that Oozie creates its DB schema if it does not exist.

     <property>
        <name>oozie.service.JPAService.create.db.schema</name>
        <value>true</value>
        <description>
            Creates Oozie DB.
            If set to true, it creates the DB schema if it does not exist. If the DB schema exists is a NOP.
            If set to false, it does not create the DB schema. If the DB schema does not exist it fails start up.
        </description>
    </property>

10) Edit the Hadoop config core-site.xml and add the lines below, which allow Oozie to connect to Hadoop. Replace {$user.name} with the name of the user that runs the Oozie server.
<property>
    <name>hadoop.proxyuser.{$user.name}.hosts</name>
    <value>*</value>
</property>

<property>
    <name>hadoop.proxyuser.{$user.name}.groups</name>
    <value>*</value>
</property>
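
To avoid typos when substituting the user name into the two property names, the snippet can be generated. This is just a sketch: proxyuser_properties is a made-up helper, and the user name hadoop in the usage line is my assumption based on the /Users/hadoop paths in this post.

```shell
# Print the two proxyuser <property> blocks for the user name given as $1.
proxyuser_properties() {
    user="$1"
    for kind in hosts groups; do
        cat <<EOF
<property>
    <name>hadoop.proxyuser.${user}.${kind}</name>
    <value>*</value>
</property>
EOF
    done
}

# Example call (assumed user name; paste the output into core-site.xml):
#   proxyuser_properties hadoop
```

The value * is the wide-open setting used above. You may want something narrower outside a local test machine.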

11) Start Hadoop now as well.
12) Start the Oozie server by running the following command:- $OOZIE_HOME/distro/target/oozie-3.2.0-incubating-distro/oozie-3.2.0-incubating/bin/oozie-start.sh. You can check that it is running by opening this in the browser:- http://localhost:11000/oozie.
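
The server can take a little while to come up, so rather than refreshing the browser you can poll the URL from the terminal. The sketch below uses a small retry helper; wait_until is a made-up name, and using curl as the probe is my assumption (any command that fails while the server is down would do).

```shell
# Retry a command up to $1 times, sleeping $2 seconds between attempts.
# Succeeds as soon as the command succeeds; fails if every attempt fails.
wait_until() {
    attempts="$1"
    delay="$2"
    shift 2
    i=0
    while [ "$i" -lt "$attempts" ]; do
        if "$@" >/dev/null 2>&1; then
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    return 1
}

# Example call (not run here): poll the console URL for up to a minute.
#   wait_until 30 2 curl -sf http://localhost:11000/oozie && echo "Oozie is up"
```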

Now, if everything runs smoothly, let's see if we can run an example.

Steps to run an example on Oozie.
1)  Extract the examples.jar and place it in $OOZIE_HOME/examples/target/.
2) Inside the above directory you will see the different apps. Modify the job.properties of the application you want to run, and then put the examples on Hadoop.
 2.1) For example, open $OOZIE_HOME/examples/target/examples/apps/map-reduce/job.properties and set:
    nameNode=hdfs://localhost:8020 (This information can be found in the Hadoop config, core-site.xml)
    jobTracker=localhost:8021 (This information can be found in the Hadoop config, mapred-site.xml)
    queueName=default
    examplesRoot=examples
    oozie.wf.application.path=${nameNode}/user/hadoop/${examplesRoot}/apps/map-reduce
    outputDir=map-reduce
 2.2) hadoop fs -put $OOZIE_HOME/examples/target/examples /user/hadoop/examples
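
If you switch between machines or ports, the job.properties above can be written out by a small function instead of being edited by hand. This is only a sketch: write_job_properties is a made-up helper, and the values in the usage line are the ones from my setup.

```shell
# Write a map-reduce job.properties to file $1,
# using name node URL $2 and job tracker address $3.
write_job_properties() {
    out="$1"
    name_node="$2"
    job_tracker="$3"
    {
        printf 'nameNode=%s\n' "$name_node"
        printf 'jobTracker=%s\n' "$job_tracker"
        printf 'queueName=default\n'
        printf 'examplesRoot=examples\n'
        # Single quotes keep ${nameNode} and ${examplesRoot} literal,
        # so that Oozie (not the shell) resolves them.
        printf '%s\n' 'oozie.wf.application.path=${nameNode}/user/hadoop/${examplesRoot}/apps/map-reduce'
        printf 'outputDir=map-reduce\n'
    } > "$out"
}

# Example call with the values from this post:
#   write_job_properties job.properties hdfs://localhost:8020 localhost:8021
```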

3) Run the example:- $OOZIE_HOME/bin/oozie job -oozie http://localhost:11000/oozie -config $OOZIE_HOME/examples/target/examples/apps/map-reduce/job.properties -run

4) If your job is submitted successfully, you can see it running on the Oozie console.
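
If you prefer the terminal to the console, you can capture the job id from the submit output and ask the CLI about it. A minimal sketch, assuming the CLI prints a line of the form "job: <id>" on a successful submit; extract_job_id is a made-up helper name.

```shell
# Read CLI output on stdin and print the id from a line like "job: <id>".
extract_job_id() {
    sed -n 's/^job: *//p'
}

# Example calls (not run here; the URL is the one used in this post):
#   job_id=$(oozie job -oozie http://localhost:11000/oozie -config job.properties -run | extract_job_id)
#   oozie job -oozie http://localhost:11000/oozie -info "$job_id"
```

If the submit fails, no "job:" line is printed and job_id stays empty, so check it before using it.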

This is where my troubles started.
1) 500 Internal Server Error:- When I checked oozie.log, I saw that Oozie requires all the services defined in oozie-default.xml to be listed in oozie-site.xml as well, as below.

<property>
        <name>oozie.services</name>
        <value>
            org.apache.oozie.service.SchedulerService,
            org.apache.oozie.service.InstrumentationService,
            org.apache.oozie.service.CallableQueueService,
            org.apache.oozie.service.UUIDService,
            org.apache.oozie.service.ELService,
            org.apache.oozie.service.AuthorizationService,
            org.apache.oozie.service.HadoopAccessorService,
            org.apache.oozie.service.MemoryLocksService,
            org.apache.oozie.service.DagXLogInfoService,
            org.apache.oozie.service.SchemaService,
            org.apache.oozie.service.LiteWorkflowAppService,
            org.apache.oozie.service.JPAService,
            org.apache.oozie.service.StoreService,
            org.apache.oozie.service.CoordinatorStoreService,
            org.apache.oozie.service.SLAStoreService,
            org.apache.oozie.service.DBLiteWorkflowStoreService,
            org.apache.oozie.service.CallbackService,
            org.apache.oozie.service.ActionService,
            org.apache.oozie.service.ActionCheckerService,
            org.apache.oozie.service.RecoveryService,
            org.apache.oozie.service.PurgeService,
            org.apache.oozie.service.CoordinatorEngineService,
            org.apache.oozie.service.BundleEngineService,
            org.apache.oozie.service.DagEngineService,
            org.apache.oozie.service.CoordMaterializeTriggerService,
            org.apache.oozie.service.StatusTransitService,
            org.apache.oozie.service.PauseTransitService,
            org.apache.oozie.service.GroupsService,
            org.apache.oozie.service.ProxyUserService
        </value>
        <description>
            All services to be created and managed by Oozie Services singleton.
            Class names must be separated by commas.
        </description>
    </property>
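
With a list this long it is easy to drop an entry when copying. The following sketch reports which service class names are absent from a config file. missing_services is a made-up helper, and it is a plain text search: it does not parse the XML, so a name that only appears inside a comment would still count as present.

```shell
# Print every class name from the arguments that does not appear in file $1.
# Pass fully qualified names so that one name cannot match inside another.
missing_services() {
    site_file="$1"
    shift
    for svc in "$@"; do
        grep -qF "$svc" "$site_file" || echo "$svc"
    done
}

# Example call (not run here; path shortened to the conf directory):
#   missing_services conf/oozie-site.xml \
#       org.apache.oozie.service.JPAService \
#       org.apache.oozie.service.ProxyUserService
```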

2) This solved my problem to some extent. Then I saw some NPE errors, which made me realize that the oozie.war had not been built properly when I ran oozie-setup.sh, specifically at the point where the Hadoop libs are copied into the Oozie lib directory. I had to copy the jar guava-r09-jarjar.jar myself and re-create the oozie.war.
    2.1) Copy the jar from Hadoop to $OOZIE_HOME/webapp/src/main/webapp/WEB-INF/lib.
    2.2) jar cf oozie.war *
    2.3) Restart Hadoop and Oozie and run the example with the command:-
            $OOZIE_HOME/bin/oozie job -oozie http://localhost:11000/oozie -config $OOZIE_HOME/examples/target/examples/apps/map-reduce/job.properties -run
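
Before re-creating the war, you can check that the jar really landed in the lib directory. A small sketch; lib_has_jar is a made-up helper, and the glob in the usage line is just one way to match the guava jar.

```shell
# Succeed if directory $1 contains at least one file matching glob $2.
lib_has_jar() {
    # $2 is left unquoted on purpose so that the shell expands the glob.
    for f in "$1"/$2; do
        [ -e "$f" ] && return 0
    done
    return 1
}

# Example call (not run here):
#   lib_has_jar "$OOZIE_HOME/webapp/src/main/webapp/WEB-INF/lib" 'guava-*.jar' \
#       || echo "guava jar is missing"
```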

If the job is submitted, you can see it and its progress on the console.



Have fun OOZEIIIINNNNGGGG!!!!!


1 comment:

  1. I am able to get the Oozie UI and CLI working on Lion 10.7.5, but when I try to execute the job using the Oozie CLI I get the following error. Any idea what needs to be done to get this working? I am using Hadoop 1.2.1 and tried Oozie 3.5 and 3.2. In both cases I get this error on my Mac.
    mymacbook:demo hadoop$ oozie job -oozie http://mymacbook.local:11000/oozie -config job.properties -run
    Error: E0501 : E0501: Could not perform authorization operation, Failed on local exception: java.io.EOFException; Host Details : local host is: "mymacbook.local/127.0.0.1"; destination host is: ""mymacbook.local":8020;
