Hadoop Hive Web Interface

I’ve been playing with Hive recently and liking what I’ve found. In theory, at least, it provides a very nice, simple way of getting into analysing large data sets. To make it even easier to show other people what you’re up to, Hive has a nascent web interface, with a little documentation on the wiki.

image of hive web ui

On the one hand it’s rather simple at this point, but that should be easy enough to prettify given a bit of time. The bigger problem was getting it working in the first place. What follows worked for me using the latest Cloudera packages on Debian testing. I’m assuming you already have Hive and Hadoop installed; the basic packages worked fine for me here.

Next up you’ll need the JDK (not just the JRE), as there is some compilation that goes on the first time you run the web interface.

<% syntax_colorize :bash, type=:coderay do %> apt-get install ant sun-java6-jdk <% end %>
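If you want a quick sanity check that the compiler and Ant really ended up on your PATH, something along these lines should do. This is only a sketch: check_tools is a helper name I’ve made up for illustration, not something shipped with Hive or the Cloudera packages.

```shell
# check_tools is a hypothetical helper, not part of Hive or any package.
# It prints one line per tool and returns non-zero if any tool is missing.
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found: $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# Example call for this setup:
#   check_tools javac ant
```

If javac is reported as missing, you most likely have only the JRE installed.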

Next up I had to modify the installed /etc/hive/conf/hive-site.xml file as follows:

I changed this:

<% syntax_colorize :xml, type=:coderay do %>
<property>
  <name>hive.metastore.uris</name>
  <value>file:///var/lib/hivevar/metastore/metadb/</value>
  <description>Comma separated list of URIs of metastore servers. The first server that can be connected to will be used.</description>
</property>
<% end %>

To this. Note that the hivevar path doesn’t exist, so I’m not sure whether this was a typo in the source.

<% syntax_colorize :xml, type=:coderay do %>
<property>
  <name>hive.metastore.uris</name>
  <value>file:///var/lib/hive/var/metastore/metadb/</value>
  <description>Comma separated list of URIs of metastore servers. The first server that can be connected to will be used.</description>
</property>
<% end %>

I also changed the following section regarding the metastore name:

<% syntax_colorize :xml, type=:coderay do %>
<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:derby:;databaseName=/var/lib/hive/metastore/${user.name}_db;create=true</value>
  <description>JDBC connect string for a JDBC metastore</description>
</property>
<% end %>

To this, with a fixed name. With the above configuration the file was literally called ${user.name}, rather than having my username substituted in. Elsewhere this seems to work fine.

<% syntax_colorize :xml, type=:coderay do %>
<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:derby:;databaseName=/var/lib/hive/metastore/metastore_db;create=true</value>
  <description>JDBC connect string for a JDBC metastore</description>
</property>
<% end %>

I’m not convinced the above two changes are needed but have left them here just in case. The main tricky part is making sure a load of environment variables are correctly set. The following worked for me:

<% syntax_colorize :bash, type=:coderay do %>
export ANT_LIB=/usr/share/ant/lib
export HIVE_HOME=/usr/lib/hive
export HADOOP_HOME=/usr/lib/hadoop
export PATH=$PATH:$HADOOP_HOME/bin
export JAVA_HOME=/usr/lib/jvm/java-6-sun
<% end %>
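Since a wrong path here tends to fail in unhelpful ways, it can be worth checking that each variable actually points at a directory before starting anything. The following is a rough sketch rather than anything official; check_env_dirs is a hypothetical helper I’ve invented for this post, and the paths themselves will differ on other distributions.

```shell
# check_env_dirs is a hypothetical helper, not part of Hive or Hadoop.
# For each variable name given, it reports whether the variable is set
# and whether its value is an existing directory.
check_env_dirs() {
    status=0
    for name in "$@"; do
        # Look up the value of the variable whose name is in $name.
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "$name is not set"
            status=1
        elif [ ! -d "$value" ]; then
            echo "$name points to a missing directory: $value"
            status=1
        else
            echo "$name ok: $value"
        fi
    done
    return "$status"
}

# Example call for the variables used in this post:
#   check_env_dirs ANT_LIB HIVE_HOME HADOOP_HOME JAVA_HOME
```

Only pass it variable names you trust, as the lookup relies on eval.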

All being well that should allow you to run the hive command with the web interface like so:

<% syntax_colorize :bash, type=:coderay do %> hive --service hwi <% end %>

That should bring up a webserver on port 9999 where you should see something similar to the screenshot above.
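Because of the first-run compilation, the server can take a little while to start answering. If you’d rather script the wait than keep refreshing a browser, a small retry loop is enough. This is a sketch only: wait_until is a made-up helper, and the /hwi/ path in the example call is my assumption about where the interface is served, so adjust it to whatever your install uses.

```shell
# wait_until is a hypothetical helper, not part of Hive.
# Usage: wait_until ATTEMPTS COMMAND [ARGS...]
# Runs COMMAND up to ATTEMPTS times, one second apart, and returns 0
# as soon as it succeeds, or 1 if it never does.
wait_until() {
    attempts=$1
    shift
    i=0
    while [ "$i" -lt "$attempts" ]; do
        if "$@" >/dev/null 2>&1; then
            return 0
        fi
        i=$((i + 1))
        sleep 1
    done
    return 1
}

# Example call (requires curl; the URL path is an assumption):
#   wait_until 30 curl -sf http://localhost:9999/hwi/ && echo "web interface is up"
```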