Monday, June 27, 2011

Installing Cassandra on Ubuntu and Debian via the Apache apt repository


Cassandra is a distributed database with a BigTable data model running on a Dynamo-like infrastructure. It is column-oriented and allows for the storage of relatively structured data. It has a fully decentralized model; every node is identical and there is no single point of failure. It's also extremely fault tolerant; data is replicated to multiple nodes and across data centers. Cassandra is also very elastic; read and write throughput increase linearly as new machines are added.
Cassandra was designed at Facebook by Avinash Lakshman (one of the authors of Amazon's Dynamo) and Prashant Malik (a Facebook engineer), and was open sourced in 2008. It is now in production use at Rackspace, Digg, Facebook, Twitter, Cisco, Mahalo, Ooyala, and other companies that have large, active data sets. The largest production cluster has over 100 TB of data on over 150 machines.

To install Cassandra on Debian or on Debian derivatives such as Ubuntu or Linux Mint, use the following steps:
1- First refresh the package lists and upgrade your software:
sudo apt-get update && sudo apt-get upgrade
2- Now open sources.list:
sudo vi /etc/apt/sources.list
3- Add the following lines to your sources.list (if the "unstable" distribution is not available, replace it with a release series name such as 10x, as reported in the comments below):
deb http://www.apache.org/dist/cassandra/debian unstable main
deb-src http://www.apache.org/dist/cassandra/debian unstable main
4- Run update:
sudo apt-get update
Now you will see an error similar to this:

GPG error: http://www.apache.org unstable Release: The following signatures couldn't be verified because the public key is not available: NO_PUBKEY F758CE318D77295D

This simply means that the repository's public key is missing from your keyring. Import it like this:

gpg --keyserver wwwkeys.eu.pgp.net --recv-keys F758CE318D77295D
gpg --export --armor F758CE318D77295D | sudo apt-key add -

5- Run update again and install Cassandra:
sudo apt-get update && sudo apt-get install cassandra
6- Now start Cassandra:
sudo /etc/init.d/cassandra start
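
The steps above can also be scripted. What follows is only a sketch, not an official installer: the function names and the file /etc/apt/sources.list.d/cassandra.list are illustrative choices of mine, and the series name passed in ("unstable", "10x", ...) is something you must check against what the repository actually offers. The script only previews the apt lines when run; nothing is installed until you call install_cassandra yourself.

```shell
#!/bin/sh
# Print the two apt source lines for a given Cassandra series.
# The series defaults to "unstable", as in step 3; this default is an assumption.
cassandra_apt_lines() {
    series="${1:-unstable}"
    printf 'deb http://www.apache.org/dist/cassandra/debian %s main\n' "$series"
    printf 'deb-src http://www.apache.org/dist/cassandra/debian %s main\n' "$series"
}

# Hypothetical helper that repeats steps 3 to 6 without opening an editor.
# It writes to a separate file under sources.list.d (illustrative name)
# instead of editing /etc/apt/sources.list directly.
install_cassandra() {
    cassandra_apt_lines "$1" | sudo tee /etc/apt/sources.list.d/cassandra.list >/dev/null &&
    gpg --keyserver wwwkeys.eu.pgp.net --recv-keys F758CE318D77295D &&
    gpg --export --armor F758CE318D77295D | sudo apt-key add - &&
    sudo apt-get update &&
    sudo apt-get install cassandra &&
    sudo /etc/init.d/cassandra start
}

# Preview the lines without changing anything on the system.
cassandra_apt_lines unstable
```

To run the whole installation, call for example: install_cassandra unstable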

Notes:
- By default, Cassandra uses port 7000 for cluster communication, 9160 for clients (Thrift), and 8080 for JMX. All of these can be changed in the configuration file or in bin/cassandra.in.sh (for JVM options). All ports are TCP.
- The configuration files are located in /etc/cassandra.
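
To check whether the default ports listed above are actually open after starting Cassandra, a small filter over the output of netstat can help. This is a rough sketch: the function name is made up for illustration, the port list is simply the three defaults from the note, and it assumes output in the style of "netstat -tln" (one listening socket per line, with the local address ending in :PORT).

```shell
#!/bin/sh
# Read "netstat -tln" style output on stdin and report, for each default
# port from the note above, whether a listening socket was found.
listening_cassandra_ports() {
    input="$(cat)"
    for port in 7000 9160 8080; do
        # Match the port at the end of an address column, e.g. "0.0.0.0:9160 ".
        if printf '%s\n' "$input" | grep -Eq "[:.]${port}[[:space:]]"; then
            echo "$port listening"
        else
            echo "$port not listening"
        fi
    done
}

# Usage (not run automatically):
#   netstat -tln | listening_cassandra_ports
```

If you changed any port in the configuration, adjust the list in the loop accordingly.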

3 comments:

  1. This doesn't work. There is no directory "unstable" inside http://www.apache.org/dist/cassandra/debian/dists/
    I get a 403 forbidden error.

  2. yeah I too get the 403 forbidden error.

  3. I got the same 403 error, then I cracked it.
    Try these:
    deb http://www.apache.org/dist/cassandra/debian 10x main
    deb-src http://www.apache.org/dist/cassandra/debian 10x main

    I'm assuming 10x is the version number; if that is the case, there are four other versions:
    06x
    07x
    08x and
    sid (don't know what this one is)
