Sunday, June 26, 2011

Setting up Cassandra in Windows


I was surprised how easy it is to set up and run the Apache key-value store Cassandra on Windows. There is no self-contained Windows-based installer, but I got it running from the extracted archive without trouble. All I wanted was to get it up and running quickly and do some basic testing. No fancy config: I used the out-of-the-box configuration to set up a single node on my Windows Vista dev machine. Here is how it went.

Download and unzip the archive

Just go to the Cassandra download page and get the latest binary archive from there. I downloaded the 0.6.1 archive, apache-cassandra-0.6.1-bin.tar.gz, and extracted it to c:\cassandra.

Quick setup

The storage-conf.xml file contains some important config information, including the data schema. Let’s not change the schema for now. All we need to do is tell Cassandra where to put the commit logs and the data on the hard disk. I created a new directory c:\cassandra-data with two sub-directories in it: data and commitlog. Now let’s edit the config file located at c:\cassandra\conf\storage-conf.xml and update the corresponding elements:
<CommitLogDirectory>c:\cassandra-data\commitlog</CommitLogDirectory>
<DataFileDirectories>
  <DataFileDirectory>c:\cassandra-data\data</DataFileDirectory>
</DataFileDirectories>
That’s it! Time to run the server :-)
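The two directories referenced in the config can also be created with a short script instead of by hand. This is only a minimal sketch, assuming Python 3 is installed on the machine; the function name create_cassandra_dirs is made up for illustration and is not part of Cassandra.

```python
from pathlib import Path


def create_cassandra_dirs(base):
    """Create the 'data' and 'commitlog' sub-directories under base.

    Hypothetical helper: the directory names match the ones used in
    storage-conf.xml in this post. Existing directories are left as-is.
    """
    base = Path(base)
    created = []
    for name in ("data", "commitlog"):
        target = base / name
        # parents=True also creates the base directory if it is missing
        target.mkdir(parents=True, exist_ok=True)
        created.append(target)
    return created
```

With the layout used here, the call would be create_cassandra_dirs(r"c:\cassandra-data").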

Running the server

First of all, we need to check that the environment variable JAVA_HOME is set correctly, because the Cassandra scripts rely on it.
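On Windows, echo %JAVA_HOME% is usually enough for this check. For a scripted check, something like the following could work. It is a rough sketch, assuming Python 3 is available; check_java_home is a hypothetical helper, and it only verifies that the variable is set and points to an existing directory, not that a working JVM lives there.

```python
import os
from pathlib import Path


def check_java_home(env=None):
    """Return (ok, message) describing whether JAVA_HOME looks usable.

    env defaults to the real process environment; passing a dict makes
    the function easy to try out without touching the system settings.
    """
    env = os.environ if env is None else env
    java_home = env.get("JAVA_HOME")
    if not java_home:
        return False, "JAVA_HOME is not set"
    if not Path(java_home).is_dir():
        return False, f"JAVA_HOME points to a missing directory: {java_home}"
    return True, f"JAVA_HOME = {java_home}"
```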
I switched to the bin directory and launched cassandra.bat with the -f switch (-f keeps the server in “foreground” mode, in order to have the logs in stdout).
C:\cassandra\bin>cassandra.bat -f
Starting Cassandra Server
Listening for transport dt_socket at address: 8888
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/cassandra/thrift/CassandraDaemon
Caused by: java.lang.ClassNotFoundException: org.apache.cassandra.thrift.CassandraDaemon
at java.net.URLClassLoader$1.run(URLClassLoader.java:200)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:188)
at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:276)
at java.lang.ClassLoader.loadClass(ClassLoader.java:251)
at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:319)
Ouch! Something went wrong. After having a quick look at the cassandra.bat script, I figured out that it assumes that the current directory is the Cassandra “home directory”, i.e. c:\cassandra in my case. Let’s give it a second try:
c:\cassandra>bin\cassandra.bat -f
Starting Cassandra Server
Listening for transport dt_socket at address: 8888
INFO 16:27:12,627 Auto DiskAccessMode determined to be standard
INFO 16:27:13,123 Saved Token not found. Using 47823312181423841445299406225505462239
INFO 16:27:13,124 Saved ClusterName not found. Using Test Cluster
INFO 16:27:13,134 Creating new commitlog segment c:\cassandra-data\commitlog\CommitLog-1272205633134.log
INFO 16:27:13,256 Starting up server gossip
INFO 16:27:13,369 Binding thrift service to localhost/127.0.0.1:9160
INFO 16:27:13,418 Cassandra starting up...
Hurray! Seems to be working. It says it’s listening on localhost port 9160; let’s verify this.
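A quick way to confirm that something is accepting connections on that port is a plain TCP connect. The sketch below assumes Python 3 is available; is_port_open is a hypothetical helper name. A successful connect only shows that some process is listening on the port, not that it is actually Cassandra speaking Thrift.

```python
import socket


def is_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        # create_connection raises OSError (refused, timed out, ...) on failure
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For the server started above, the call would be is_port_open("localhost", 9160).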

Hello World!

Cassandra comes with a nice test utility: Cassandra CLI. It’s an interactive command line that you can use to put/get values in Cassandra. Let’s open a second shell window and run it.
c:\cassandra>bin\cassandra-cli.bat
Starting Cassandra Client
Welcome to cassandra CLI.
Type 'help' or '?' for help. Type 'quit' or 'exit' to quit.
Cool! It even has a friendly help command. First, let’s connect to our server using the connect command.
cassandra> connect localhost/9160
Connected to: "Test Cluster" on localhost/9160
We successfully connected to the Cassandra server. Before going further, let’s have a quick look at the schema. Remember, it is defined in the same file we modified earlier in the setup step: storage-conf.xml.
<Keyspace Name="Keyspace1">
  <ColumnFamily Name="Standard1" CompareWith="BytesType" />
...
</Keyspace>
Cassandra’s data model nomenclature is not obvious. In a nutshell: Keyspace1 is the name of the schema (a keyspace), Standard1 is a column family, i.e. a collection of rows, and each row holds an ordered set of key-value pairs (columns). Let’s put some data there:
cassandra> set Keyspace1.Standard1['0']['greeting'] = 'Hello World!'
Value inserted.
Cassandra obeyed and stored our data: in column family Standard1, at row '0', we inserted the key-value pair ('greeting', 'Hello World!'). Let’s query Cassandra and see if the data is actually there.
cassandra> get Keyspace1.Standard1['0']['greeting']
=> (column=6772656574696e67, value=Hello World!, timestamp=1272208335786000)
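The column name comes back as 6772656574696e67, which is simply 'greeting' written as hex bytes, presumably because Standard1 is declared with CompareWith="BytesType". The timestamp looks like microseconds since the Unix epoch. The toy model below is only an illustration of the keyspace / column family / row / column nesting using plain Python dicts; it does not talk to Cassandra, and set_column and get_column are made-up names.

```python
from datetime import datetime, timezone

# Toy stand-in for the layout: keyspace -> column family -> row key -> columns.
store = {"Keyspace1": {"Standard1": {}}}


def set_column(keyspace, family, row_key, name, value, timestamp_us):
    """Store one (name, value) pair in a row, keyed by the raw name bytes."""
    row = store[keyspace][family].setdefault(row_key, {})
    row[name.encode("ascii")] = (value, timestamp_us)


def get_column(keyspace, family, row_key, name):
    """Return (column name as hex, value, timestamp), similar to the CLI output."""
    raw_name = name.encode("ascii")
    value, timestamp_us = store[keyspace][family][row_key][raw_name]
    return raw_name.hex(), value, timestamp_us


set_column("Keyspace1", "Standard1", "0", "greeting", "Hello World!", 1272208335786000)
column_hex, value, ts = get_column("Keyspace1", "Standard1", "0", "greeting")

print(column_hex)                                 # 6772656574696e67
print(bytes.fromhex(column_hex).decode("ascii"))  # greeting
# Assumption: the timestamp is in microseconds since the Unix epoch
print(datetime.fromtimestamp(ts / 1_000_000, tz=timezone.utc))
```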
