Showing posts with label mahout. Show all posts

Monday, April 18, 2011

Simple Java Wicket using Mahout

While studying Mahout, sooner or later you have to write code that uses the algorithms that ship with it. I asked a colleague for help, and one suggestion was to start with a Java wicket found here: http://blog.jteam.nl/wp-content/uploads/2010/04/taste-getting-started.zip

Using it is quite straightforward. Download the zip file and unzip it to some folder (say, taste-getting-started on the Mahout machine). In Ubuntu’s Terminal application, go to that folder and run mvn package. After packaging is complete, run mvn jetty:run-war. Once the Jetty server is up, you can use the wicket by browsing to http://localhost:8080/taste-getting-started/movies.
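The build-and-run steps above can be sketched as a small shell helper. This is only a sketch: run_cmd and build_and_serve are hypothetical names of my own, and the DRY_RUN switch is an assumption added so the commands can be previewed without starting Maven.

```shell
#!/usr/bin/env bash
# Sketch only: run_cmd and build_and_serve are hypothetical helper names.

# Print the command when DRY_RUN=1, otherwise execute it.
run_cmd() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# $1: path to the unzipped taste-getting-started folder.
build_and_serve() {
  local project_dir="$1"
  if [ ! -d "$project_dir" ]; then
    echo "no such folder: $project_dir" >&2
    return 1
  fi
  (
    # Work inside a subshell so the caller's directory is left unchanged.
    cd "$project_dir" || exit 1
    # Package first; start Jetty only if packaging succeeded.
    run_cmd mvn package && run_cmd mvn jetty:run-war
  )
}
```

To preview the commands, call DRY_RUN=1 build_and_serve ~/taste-getting-started (the folder path here is just an example). Without DRY_RUN, the second command keeps running until the Jetty server is stopped.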

Although using the wicket is straightforward, setting it up takes some effort:

  • Downloading the MovieLens 100K Ratings Data Set: http://grouplens.org/system/files/ml-data_0.zip
  • Unzipping it to some other folder.
  • Copying u.data to taste-getting-started/src/main/resources/grouplens/100K/ratings
  • Copying u.item to taste-getting-started/src/main/resources/grouplens/100K/data
  • Executing initialize_movielens_db.sql from the taste-getting-started/src/main/resources/sql folder with the following command: mysql --user=mysqluser --password=mysqlpassword < initialize_movielens_db.sql

Oh, and if you do not have MySQL yet, install it with the following command: sudo apt-get install mysql-server mysql-client.
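The two copy steps in the setup list can be sketched as a shell function. stage_movielens is a hypothetical name, and the sketch assumes that ratings and data are folders (it creates them if they are missing); adjust it if your copy of the project lays things out differently.

```shell
#!/usr/bin/env bash
# Sketch only: stage_movielens is a hypothetical helper name.
# Assumption: "ratings" and "data" are folders under grouplens/100K.
# $1: folder where the MovieLens zip was extracted (holds u.data and u.item)
# $2: path to the taste-getting-started folder
stage_movielens() {
  local data_dir="$1"
  local base="$2/src/main/resources/grouplens/100K"
  local f
  # Stop early if either input file is missing.
  for f in u.data u.item; do
    if [ ! -f "$data_dir/$f" ]; then
      echo "missing $data_dir/$f" >&2
      return 1
    fi
  done
  # Create the target folders if needed, then copy each file into place.
  mkdir -p "$base/ratings" "$base/data"
  cp "$data_dir/u.data" "$base/ratings/" &&
    cp "$data_dir/u.item" "$base/data/"
}
```

For example, stage_movielens ~/ml-data ~/taste-getting-started (both paths are placeholders). The mysql command from the list still has to be run separately from the sql folder.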

Thursday, March 31, 2011

Journey to Mahout Land

 

Problem: Research Mahout.

Solution: Get Mahout code running, which requires a VirtualBox instance with Ubuntu on my Windows machine.

Here are the steps I followed:

  1. Download Oracle VM VirtualBox for Windows host.
  2. Install VirtualBox on your Windows machine.
  3. Download ISO of Ubuntu.
  4. Burn ISO to CD/DVD.
  5. Before starting the VirtualBox instance, click Settings and set the boot device to CD/DVD. Click OK to save the settings.
  6. Click the Devices menu, select the CD/DVD Devices of the Host.
  7. Click Start to turn on the VirtualBox instance and initiate Ubuntu installation.

 

After Ubuntu has been installed, open Applications > Accessories > Terminal:

  • type java and see the suggested <java-package>
  • type sudo apt-get install <java-package>
  • type javac and see the suggested <javac-package>
  • type sudo apt-get install <javac-package>
  • type mvn and see the suggested <maven-package>
  • type sudo apt-get install <maven-package>
  • type svn and see the suggested <subversion-package>
  • type sudo apt-get install <subversion-package>
  • type cd /etc
  • type sudo chmod 777 bash.bashrc
  • type vi bash.bashrc
  • at the end of the file, append the path to your Java installation:

JAVA_HOME=/usr/lib/jvm/java-1.6.0-openjdk
export JAVA_HOME
PATH=$PATH:$JAVA_HOME/bin
export PATH

  • reboot the VirtualBox instance
  • type cd ~/Documents
  • type mkdir mahoutcode
  • type cd mahoutcode
  • type svn co http://svn.apache.org/repos/asf/mahout/trunk
  • type cd trunk
  • type mvn install
  • type cd core
  • type mvn compile (or mvn install)
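Before running the checkout and build steps above, it can help to confirm that every tool is actually on the PATH. Here is a minimal sketch; missing_tools is a hypothetical helper name, not something Ubuntu or Mahout provides.

```shell
#!/usr/bin/env bash
# Sketch only: missing_tools is a hypothetical helper name.
# Prints one line per command that is not found on the PATH and
# returns non-zero if at least one is missing.
missing_tools() {
  local status=0
  local tool
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing: $tool"
      status=1
    fi
  done
  return "$status"
}
```

For the tools used in this post, that would be missing_tools java javac mvn svn. Anything it reports can then be installed with sudo apt-get install as described above.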

 

Whew! I am just barely starting the journey… more to follow!

sudo apt-get install made my day

 

I had just installed Ubuntu 10.10 on an Oracle VM VirtualBox instance on my Windows 7 machine when I tried to install the prerequisites of Mahout (http://mahout.apache.org/). I opened the Terminal app and keyed in java, and Ubuntu suggested that I use sudo apt-get install <package> to start the installation, which I did.

The next step of the Mahout install procedure needed Subversion. So I keyed in svn, and again Ubuntu suggested sudo apt-get install <package>.

This suggestion saved me the time it takes to find the correct installers, download them, and fire up the installation.