This topic contains 0 replies, has 1 voice, and was last updated by  hena 8 months, 2 weeks ago.



    Answer) If the tables are bucketed on a particular column and are joined on that column, we can enable the bucket map join to improve performance. Sort-merge-bucket (SMB) joins are used when the tables are both sorted and bucketed. The join then boils down to merging the already sorted tables, which makes it faster than an ordinary map join.

    When the data is bucketed by the join keys, you can use the bucket map join. For that, the number of buckets in one table must be a multiple of the number of buckets in the other table. It is activated by executing set hive.optimize.bucketmapjoin=true; before the query. If the tables don’t meet the conditions, Hive simply performs a normal join.
    If both tables have the same number of buckets and the data is sorted by the bucket keys, Hive can perform the faster sort-merge join. To activate it, execute the following commands:

    set hive.optimize.bucketmapjoin=true;
    set hive.optimize.bucketmapjoin.sortedmerge=true;
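
To make the conditions above concrete, here is a minimal sketch of two tables that would qualify for the sort-merge join. The table and column names (users_b, orders_b, user_id and so on) are made up for illustration. The hive.enforce.* settings and the BucketizedHiveInputFormat line are assumptions that apply to older Hive releases such as the one shipped with CDH4, so check them against your version.

```sql
-- Hypothetical tables: same bucket count, bucketed and sorted on the join key.
CREATE TABLE users_b (user_id INT, name STRING)
CLUSTERED BY (user_id) SORTED BY (user_id ASC) INTO 16 BUCKETS;

CREATE TABLE orders_b (user_id INT, amount DOUBLE)
CLUSTERED BY (user_id) SORTED BY (user_id ASC) INTO 16 BUCKETS;

-- Assumption: on older Hive releases these make INSERTs honor the
-- bucketing and sorting declared above. Set them before loading the tables.
SET hive.enforce.bucketing=true;
SET hive.enforce.sorting=true;

-- Assumption: the Hive wiki lists this input format alongside the two
-- bucketmapjoin settings for sort-merge joins.
SET hive.input.format=org.apache.hadoop.hive.ql.io.BucketizedHiveInputFormat;
SET hive.optimize.bucketmapjoin=true;
SET hive.optimize.bucketmapjoin.sortedmerge=true;

-- The MAPJOIN hint names the table to be joined bucket by bucket.
SELECT /*+ MAPJOIN(u) */ o.user_id, u.name, o.amount
FROM orders_b o
JOIN users_b u ON o.user_id = u.user_id;
```

If one table had 16 buckets and the other 32, only the plain bucket map join would apply, since the counts are multiples of each other but not equal.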

    Answer) The maximum number of index entries to read during a query that uses the compact index. A negative value is equivalent to infinity.

    Answer) How many rows in the right-most join operand Hive should buffer before emitting the join result.


    5) hive.zookeeper.quorum localhost.localdomain
    List of ZooKeeper servers to talk to. JDBC/ODBC clients use it in the connection string instead of the URI of a specific HiveServer2 instance.

    Input formats play a very important role in Hive performance. The primary choices of input format are Text, SequenceFile, RCFile and ORC. The default is CombineHiveInputFormat.
    InputFormat In Hive
    There are two places where we can specify the InputFormat in Hive: when creating a table, and before executing an HQL query.

    For the first case, we can specify the InputFormat and OutputFormat when creating a Hive table:
    CREATE TABLE example_tbl (
      id int,
      name string
    )
    STORED AS INPUTFORMAT 'org.apache.hadoop.mapred.TextInputFormat' OUTPUTFORMAT '';
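
The statement above leaves the OUTPUTFORMAT class name empty, so it will not run as posted. As a sketch, assuming a plain text table is what was intended, the class that Hive normally pairs with TextInputFormat is HiveIgnoreKeyTextOutputFormat. That choice is my assumption, not something stated in the post.

```sql
-- Assumed OUTPUTFORMAT: the usual partner of TextInputFormat for text tables.
CREATE TABLE example_tbl (
  id INT,
  name STRING
)
STORED AS
  INPUTFORMAT 'org.apache.hadoop.mapred.TextInputFormat'
  OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat';
```

For this particular pair of classes, the shorthand STORED AS TEXTFILE gives the same result.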

    For the second case, we can set hive.input.format before running an HQL query:
    hive> set;
    hive> select * from example_tbl where id > 10000;
    If we set this parameter in hive-site.xml, it becomes the default Hive InputFormat whenever hive.input.format is not set explicitly before the query.
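
As a rough illustration of the session-level override, the sketch below sets the property, prints it back, and then runs the query. The value HiveInputFormat is only an example I picked; the post does not say which class was used.

```sql
-- Example value only: switch this session to the non-combining input format.
SET hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat;

-- SET with a key and no value prints the current setting.
SET hive.input.format;

SELECT * FROM example_tbl WHERE id > 10000;
```

The override lasts for the current session only; a new session goes back to the value from hive-site.xml, or to the built-in default if none is configured.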

    Maximum number of worker threads in the Hive Metastore Server’s thread pool.
    The hive.default.fileformat configuration parameter determines the file format to use when none is specified in a CREATE TABLE or ALTER TABLE statement. TextFile is the default value.
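
A small sketch of how hive.default.fileformat behaves, using made-up table names (events_seq, events_txt). SequenceFile is chosen here only as an example value.

```sql
-- Change the default format for this session.
SET hive.default.fileformat=SequenceFile;

-- No STORED AS clause, so the session default is used.
CREATE TABLE events_seq (id INT, payload STRING);

-- An explicit STORED AS clause overrides the default.
CREATE TABLE events_txt (id INT, payload STRING) STORED AS TEXTFILE;

-- The InputFormat and OutputFormat rows in the output show which format was applied.
DESCRIBE FORMATTED events_seq;
```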

    Minimum number of worker threads in the Thrift server’s pool.

    The authenticator manager class name to be used in the metastore for authentication. The user-defined authenticator class should implement interface

Reply To: CDH4 hive properties (40-50)