Viewing 3 posts - 1 through 3 (of 3 total)

    I made a .java file. It executes properly in Eclipse, reading data from a local folder and writing out to a separate folder after creating it. However, when I make a jar file (I also tried making a runnable jar file) and try to run it on Hadoop
    (command: hadoop jar [local-jar_file-path] [input folder] [output folder])
    I get the following error.

    How do I run a custom executable jar in Hadoop?

    ERROR security.UserGroupInformation: PriviledgedActionException as:user cause:org.apache.hadoop.mapred.FileAlreadyExistsException: Output directory hdfs://localhost:8020/data/test_txt2 already exists
    Exception in thread "main" org.apache.hadoop.mapred.FileAlreadyExistsException: Output directory hdfs://localhost:8020/data/test_txt2 already exists
    at org.apache.hadoop.mapred.FileOutputFormat.checkOutputSpecs(
    at org.apache.hadoop.mapred.JobClient$
    at org.apache.hadoop.mapred.JobClient$
    at Method)
    at org.apache.hadoop.mapred.JobClient.submitJobInternal(
    at org.apache.hadoop.mapred.JobClient.submitJob(
    at org.apache.hadoop.mapred.JobClient.runJob(
    at init_code.SampleWordCount.main(
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(
    at java.lang.reflect.Method.invoke(
    at org.apache.hadoop.util.RunJar.main(



    Just to also mention, the inputs to the program are:
    (1) /data/test_txt2, the location on HDFS where the input file is
    (2) /data/test_txt2/out, the location where I need the output file. The "out" folder does not exist.

    The code the jar file is built from is the same as WordCount. The jar file runs properly when I use the "java -jar …" command on the local file system.

    If I use the jar file provided by Hadoop, "hadoop-example-1.2.0.jar", and provide input in a similar manner, then things run properly!


    You are getting the above exception because the output directory the job resolved (hdfs://localhost:8020/data/test_txt2, per the exception message) already exists in the HDFS file system.

    Just remember: when running a MapReduce job, do not point the output at a directory that already exists in HDFS. The following instructions should help you resolve this exception.

    To run a MapReduce job, use a command similar to the one below:

    $ hadoop jar {name_of_the_jar_file.jar} {fully_qualified_main_class} {hdfs_input_path} {output_directory_path}

    Example: hadoop jar facebookCrawler.jar com.wagh.wordcountjob.WordCount /home/facebook/facebook-cocacola-page.txt /home/facebook/crawler-output

    Pay attention to the {output_directory_path}, i.e. /home/facebook/crawler-output. If this directory already exists in your HDFS, Hadoop will throw "org.apache.hadoop.mapred.FileAlreadyExistsException".

    Solution: always specify a fresh output directory name at run time. Hadoop will create the directory for you automatically, so you need not create it yourself. As in the example above, the same command can be run in the following manner:

    hadoop jar facebookCrawler.jar com.wagh.wordcountjob.WordCount /home/facebook/facebook-cocacola-page.txt /home/facebook/crawler-output-1

    So the output directory {crawler-output-1} will be created at runtime by Hadoop.
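    Alternatively, the driver itself can clear a stale output directory before submitting the job, so re-runs never hit FileAlreadyExistsException. Below is a minimal sketch using the classic org.apache.hadoop.mapred API (to match the stack trace above); note this is an assumption about the poster's driver, not their actual code — the class name WordCountDriver is illustrative, and the mapper/reducer wiring is elided. Also be aware that deleting the output path destroys the results of the previous run.

    ```java
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.mapred.FileInputFormat;
    import org.apache.hadoop.mapred.FileOutputFormat;
    import org.apache.hadoop.mapred.JobClient;
    import org.apache.hadoop.mapred.JobConf;

    // Illustrative driver class; substitute your own job's main class.
    public class WordCountDriver {
        public static void main(String[] args) throws Exception {
            JobConf conf = new JobConf(WordCountDriver.class);
            conf.setJobName("wordcount");

            Path in = new Path(args[0]);
            Path out = new Path(args[1]);

            // Guard against FileAlreadyExistsException: remove a stale
            // output directory left over from a previous run before
            // the job's output spec is checked.
            FileSystem fs = FileSystem.get(conf);
            if (fs.exists(out)) {
                fs.delete(out, true); // true = recursive delete
            }

            FileInputFormat.setInputPaths(conf, in);
            FileOutputFormat.setOutputPath(conf, out);
            // ... conf.setMapperClass(...) / conf.setReducerClass(...)
            // as in the standard WordCount example ...

            JobClient.runJob(conf);
        }
    }
    ```

    The same cleanup can also be done from the shell before re-running, with hadoop fs -rm -r on the output path.
    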

