Tuesday, April 26, 2011

Key trouble with KeyspaceService.multiGetSlice

I've added some additional debugging code and am confused by the
result. It appears that the result is being changed after the
public Map<ByteBuffer, List<Column>> execute(Cassandra.Client cassandra)
method of multigetSlice is called.

diff --git a/core/src/main/java/me/prettyprint/cassandra/service/KeyspaceServiceImpl.java b/core/src/main/java/me/prettyprint/cassandra/service/KeyspaceServiceImpl.java
index 601d487..0ee8301 100644
--- a/core/src/main/java/me/prettyprint/cassandra/service/KeyspaceServiceImpl.java
+++ b/core/src/main/java/me/prettyprint/cassandra/service/KeyspaceServiceImpl.java
@@ -390,8 +390,11 @@ public class KeyspaceServiceImpl implements KeyspaceService {

Map<ByteBuffer, List<Column>> result = new HashMap<ByteBuffer, List<Column>>();
for (Map.Entry<ByteBuffer, List<ColumnOrSuperColumn>> entry : cfmap.entrySet()) {
- result.put(entry.getKey(), getColumnList(entry.getValue()));
+ System.out.println("hector raw: " + entry.getKey());
+ System.out.println("hector str: " + StringSerializer.get().fromByteBuffer(entry.getKey()));
+ result.put(entry.getKey(), getColumnList(entry.getValue()));
}
+
return result;
} catch (Exception e) {
throw xtrans.translate(e);
@@ -399,7 +402,12 @@ public class KeyspaceServiceImpl implements KeyspaceService {
}
};
operateWithFailover(getCount);
- return getCount.getResult();
+ Map<ByteBuffer, List<Column>> res = getCount.getResult();
+ for (Map.Entry<ByteBuffer, List<Column>> entry : res.entrySet()) {
+ System.out.println("final hector raw: " + entry.getKey());
+ System.out.println("final hector str: " + StringSerializer.get().fromByteBuffer(entry.getKey()));
+ }
+ return res;

}

2011-04-13 14:51:42,456 | INFO |
STDOUT | [example1.com/,
example2.com/]
2011-04-13 14:51:42,456 | INFO |
STDOUT | query keys
2011-04-13 14:51:42,456 | INFO |
STDOUT | [example1.com/,
example2.com/]
2011-04-13 14:51:42,457 | DEBUG |
me.prettyprint.cassandra.connection.HThriftClient | Transport open
status true for client CassandraClient
2011-04-13 14:51:42,457 | DEBUG |
me.prettyprint.cassandra.connection.HThriftClient | keyspace reset
from unicron to unicron
2011-04-13 14:51:42,463 | INFO |
STDOUT | hector raw:
java.nio.HeapByteBuffer[pos=61 lim=74 cap=80]
2011-04-13 14:51:42,463 | INFO |
STDOUT | hector str: example2.com/
2011-04-13 14:51:42,463 | INFO |
STDOUT | hector raw:
java.nio.HeapByteBuffer[pos=39 lim=52 cap=80]
2011-04-13 14:51:42,463 | INFO |
STDOUT | hector str: example1.com/
2011-04-13 14:51:42,464 | DEBUG |
me.prettyprint.cassandra.connection.HThriftClient | Transport open
status true for client CassandraClient
2011-04-13 14:51:42,464 | DEBUG |
me.prettyprint.cassandra.connection.ConcurrentHClientPool | Status of
releaseClient CassandraClient to queue: true
2011-04-13 14:51:42,464 | INFO |
STDOUT | final hector raw:
java.nio.HeapByteBuffer[pos=74 lim=74 cap=80]
2011-04-13 14:51:42,464 | INFO |
STDOUT | final hector str:
2011-04-13 14:51:42,464 | INFO |
STDOUT | slice keyset:
2011-04-13 14:51:42,464 | INFO |
STDOUT | [?multiget_slice
example1.com/
example2.com/
]
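
One possible reading of the log above, offered as an assumption rather than a confirmed diagnosis: the "final" key shows pos=74 lim=74 and decodes to an empty string, which is what you would see if the decoding call in the debug println read the buffer with relative operations and left its position at the limit. Two drained buffers compare equal, so they would also collapse into a single HashMap entry. The sketch below reproduces that effect with plain java.nio only; ConsumedKeyDemo and its helper names are made up for illustration and are not Hector code.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

public class ConsumedKeyDemo {

    // Hypothetical helper: wraps a string's UTF-8 bytes in a fresh buffer.
    static ByteBuffer keyOf(String s) {
        return ByteBuffer.wrap(s.getBytes(StandardCharsets.UTF_8));
    }

    // Decodes the buffer itself, which advances its position to its limit.
    static String decodeConsuming(ByteBuffer buf) {
        return StandardCharsets.UTF_8.decode(buf).toString();
    }

    // Decodes a duplicate, so the caller's position and limit are untouched.
    static String decodePreserving(ByteBuffer buf) {
        return StandardCharsets.UTF_8.decode(buf.duplicate()).toString();
    }

    // Mimics "print the key, then put it in the map" for two distinct keys
    // and reports how many entries survive.
    static int distinctKeysAfter(boolean consume) {
        Map<ByteBuffer, String> result = new HashMap<ByteBuffer, String>();
        for (String s : new String[] {"example1.com/", "example2.com/"}) {
            ByteBuffer key = keyOf(s);
            String printed = consume ? decodeConsuming(key) : decodePreserving(key);
            result.put(key, printed);
        }
        return result.size();
    }

    public static void main(String[] args) {
        System.out.println("consuming decode:  " + distinctKeysAfter(true));
        System.out.println("preserving decode: " + distinctKeysAfter(false));
    }
}
```

If this is what is happening, the debugging output itself alters the keys, and decoding a duplicate() of each key in the println calls would leave the map untouched.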

On Apr 13, 1:59 pm, cburroughs wrote:
> Thanks Nate. The code was written with a bit too much enthusiasm for
> the guava library, but I think it is similar. The major difference I
> saw was keySerializer.fromBytesMap, but for ByteBufferSerializer
> that's a no-op. The keys of the "slice" variable below have the "?
> multiget_sliceUNPRINTABLEexample1.com/UNPRINTABLE"example2.com/"
> form. So I don't think it could be a problem with later code. If I
> follow the code from m.p.c.model.thrift.ThriftMultigetSliceQuery I
> see entry.getKey() used on thriftRet without anything special done to
> it.
>
> Example code. Keys are all urls.
>
> List<ByteBuffer> keys =
>     Lists.newArrayList(Iterables.transform(urls,
>         EncodingUtils.STRING_BYTE_BUFFERER));
> Map<ByteBuffer, List<Column>> slice =
>     ks.multigetSlice(keys, COLUMN_FAMILY, COLS_PREDICATE);
> return Iterables.transform(slice.entrySet(), RECORD_MAKER);
>
> -----
> static private final Function<Entry<ByteBuffer, List<Column>>, IUrlRecord> RECORD_MAKER
>     = new Function<Entry<ByteBuffer, List<Column>>, IUrlRecord>() {
>
>   @Override
>   public IUrlRecord apply(Entry<ByteBuffer, List<Column>> row) {
>     return makeRecord(row);
>   }
>
> };
>
> -----
> static UrlRecord makeRecord(Entry<ByteBuffer, List<Column>> row) {
>   return makeRecord(EncodingUtils.bbString(row.getKey()),
>       row.getValue());
> }
>
> On Apr 13, 1:26 pm, Nate McCall wrote:
>
> > Take a look at:
> > m.p.c.model.thrift.ThriftMultigetSliceQuery, particularly lines 61
> > through 70 and see if this approach is similar to how you are dealing
> > with the results from KeyspaceService.
>
> > On Wed, Apr 13, 2011 at 12:16 PM, cburroughs wrote:
> > > I'm having trouble with Hector's KeyspaceService.multiGetSlice (yes I
> > > would rather be using the v2 api, but old code needs maintenance
> > > without large changes )
>
> > > Map<ByteBuffer, List<Column>> multigetSlice(List<ByteBuffer> keys,
> > > ColumnParent columnParent,
> > > SlicePredicate predicate) throws HectorException;
>
> > > The List<Column> value seems correct as far as I can tell and works
> > > for keys that are both present and missing. However, the key of the
> > > Map is not the key requested. The ByteBuffer key I get back looks like
> > > a serialization of the multi_get_slice itself (see below). It looks
> > > like RowImpl doesn't do anything special with the result it gets back
> > > and I don't see a "special case multi_get_slice de-serializer"
> > > anywhere.
>
> > > For example, a request for the key "example.com" (a row key that
> > > exists) returns this as a key :
>
> > > \u0001\u0000\u0002\u0000\u0000\u0000\u000Emultiget_slice
> > > \u0000\u0000\u0000\u0004\r\u0000\u0000\u000B\u000F
> > > \u0000\u0000\u0000\u0001\u0000\u0000\u0000\fexample.com/\f
> > > \u0000\u0000\u0000\u0003\f\u0000\u0001\u000B
> > > \u0000\u0001\u0000\u0000\u0000\u0006shares\u000B
> > > \u0000\u0002\u0000\u0000\u0000\b
> > > \u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0015\n
> > > \u0000\u0003\u0000\u0000\u0001. Ê\u0000\u0000\f\u0000\u0001\u000B
> > > \u0000\u0001\u0000\u0000\u0000\u0005title\u000B
> > > \u0000\u0002\u0000\u0000\u0000\u001CIANA — Example domains\n
> > > \u0000\u0003\u0000\u0000\u0001. Ê\u0000\u0000\f\u0000\u0001\u000B
> > > \u0000\u0001\u0000\u0000\u0000\u0003url\u000B
> > > \u0000\u0002\u0000\u0000\u0000\u0013http://example.com/\n
> > > \u0000\u0003\u0000\u0000\u0001. Ê\u0000\u0000\u0000
>
> > > And an example for two row keys that don't:
>
> > > [{"key":" \u0001\u0000\u0002\u0000\u0000\u0000\u000Emultiget_slice
> > > \u0000\u0000\u0000\u0002\r\u0000\u0000\u000B\u000F
> > > \u0000\u0000\u0000\u0002\u0000\u0000\u0000\rexample1.com/\f
> > > \u0000\u0000\u0000\u0000\u0000\u0000\u0000\rexample2.com/\f
> > > \u0000\u0000\u0000\u0000\u0000"},{"key":"
> > > \u0001\u0000\u0002\u0000\u0000\u0000\u000Emultiget_slice
> > > \u0000\u0000\u0000\u0002\r\u0000\u0000\u000B\u000F
> > > \u0000\u0000\u0000\u0002\u0000\u0000\u0000\rexample1.com/\f
> > > \u0000\u0000\u0000\u0000\u0000\u0000\u0000\rexample2.com/\f
> > > \u0000\u0000\u0000\u0000\u0000"}]
>
> > > This is with 0.7.0-28.
>
>

1 comment:

  1. I had this problem when using the raw thrift client.
    The problem was using new String(ByteBuffer.array()) which returns the entire array without regard to offset.

    Charset.forName(UTF_8).decode(byteBuffer).toString() works.
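
    The difference the comment describes can be sketched with plain java.nio. This is only an illustration under stated assumptions: the class name, helper names and the "multiget_slice|...|rest" frame are invented stand-ins for a Thrift response buffer, not Hector or Thrift code.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class ByteBufferKeyDecoding {

    // Problematic: array() exposes the whole backing array and ignores
    // the buffer's position and limit.
    static String decodeWholeArray(ByteBuffer buf) {
        return new String(buf.array(), StandardCharsets.UTF_8);
    }

    // Decodes only the bytes between position and limit. Working on a
    // duplicate leaves the caller's position untouched.
    static String decodeSlice(ByteBuffer buf) {
        return StandardCharsets.UTF_8.decode(buf.duplicate()).toString();
    }

    public static void main(String[] args) {
        // Made-up frame: the key is a 13-byte window starting at offset 15
        // inside a larger shared array.
        byte[] frame = "multiget_slice|example1.com/|rest".getBytes(StandardCharsets.UTF_8);
        ByteBuffer key = ByteBuffer.wrap(frame, 15, 13);

        System.out.println(decodeWholeArray(key)); // whole frame, not just the key
        System.out.println(decodeSlice(key));      // only the key's window
    }
}
```

    A key that is a view into a larger response buffer decodes correctly only when position and limit are honored, which matches keys that appear to contain the whole multiget_slice message.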
