Adventures in the Land of Binaries: Simplifying Cassandra interactions with Helena

Introduction

In the previous article, "Talking to Cassandra using Java and DAO pattern" [1], we introduced NoSQL databases, and Apache Cassandra [2] in particular. After a quick explanation of its data model, we looked at what it takes to write Java code that reads and modifies key-value pairs stored in a Cassandra keyspace. At that time we used the low-level Thrift API.

You might have noticed that a large amount of code was needed to build even a simple persistence class, namely one based on the Data Access Object design pattern. Compared with the persistence technologies and frameworks most used in Java today (e.g. JPA and Hibernate), it is enough to drive anyone crazy!

Is this the price we pay for high availability and scalability when moving from a traditional relational database to a newer key-value data store? Poor developers... Fortunately, that's not the end of the story! Several high-level clients for Cassandra are already available in multiple programming languages (see [3]).

One of these clients, HelenaORM [4] (created by Marcus Thiesen), is our subject this time. Since this library targets the Java language, our aim will be to persist a simple object with Helena's support.

Creating the entity

As in the other article, we'll use the "group" entity to illustrate our code. This time, in addition to the plain old Java class elements, the Group class needs the @HelenaBean and @KeyProperty annotations.

The @HelenaBean annotation specifies the keyspace and column family for the class, whereas @KeyProperty indicates which property of the class should be used as the row key in Cassandra. Currently, @KeyProperty only works on getter or setter methods, not on the field itself. The two annotations are analogous to JPA's @Entity and @Id, respectively.

Take a look at the Group entity class, prepared for use with HelenaORM:


@HelenaBean(keyspace="ContactList", columnFamily="Groups")
public class Group {

  private Integer id;
  private String name;

  public Group() {
  }

  public Group(Integer id, String name) {
      this.id = id;
      this.name = name;
  }

  @KeyProperty
  public Integer getId() {
      return id;
  }
  public String getName() {
      return name;
  }
  public void setId(Integer id) {
      this.id = id;
  }
  public void setName(String name) {
      this.name = name;
  }

  @Override
  public String toString() {
      return "Group [id=" + id + ", name=" + name + "]";
  }

}
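
To make the role of the two annotations more concrete, here is a minimal sketch of how a mapper could read them through reflection. It is not Helena's actual implementation: the annotations below are local stand-ins that only mirror the names and attributes used in the article, and the helper methods location() and rowKey() are hypothetical.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;

class AnnotationSketch {

    // Local stand-ins (assumption): declared here only so the sketch compiles
    // on its own. They are NOT the annotations shipped with HelenaORM.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.TYPE)
    @interface HelenaBean {
        String keyspace();
        String columnFamily();
    }

    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    @interface KeyProperty {
    }

    @HelenaBean(keyspace = "ContactList", columnFamily = "Groups")
    public static class Group {
        private final Integer id;
        private final String name;

        public Group(Integer id, String name) {
            this.id = id;
            this.name = name;
        }

        @KeyProperty
        public Integer getId() { return id; }
        public String getName() { return name; }
    }

    /** Returns "keyspace.columnFamily" as declared on the class (hypothetical helper). */
    static String location(Class<?> type) {
        HelenaBean bean = type.getAnnotation(HelenaBean.class);
        if (bean == null) {
            throw new IllegalArgumentException(type + " is not annotated with @HelenaBean");
        }
        return bean.keyspace() + "." + bean.columnFamily();
    }

    /** Invokes the getter marked with @KeyProperty and returns its value as a string. */
    static String rowKey(Object entity) {
        for (Method method : entity.getClass().getMethods()) {
            if (method.isAnnotationPresent(KeyProperty.class)) {
                try {
                    return String.valueOf(method.invoke(entity));
                } catch (ReflectiveOperationException e) {
                    throw new IllegalStateException(e);
                }
            }
        }
        throw new IllegalArgumentException("no @KeyProperty getter on " + entity.getClass());
    }

    public static void main(String[] args) {
        Group group = new Group(123, "Test Group");
        // Prints: ContactList.Groups['123']
        System.out.println(location(Group.class) + "['" + rowKey(group) + "']");
    }
}
```

The printed address has the same shape as the one used later when inspecting rows from Cassandra's client.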

Creating the tests

The most interesting news is yet to come: with Helena there's no need to build the traditional DAO interface and class!

Helena provides a factory, HelenaORMDAOFactory, designed to create ready-to-use DAO instances that are type-safely bound to a given entity. The objects it produces, of type HelenaDAO, provide the methods insert(), delete(), and get(). These are out-of-the-box implementations for, respectively, inserting (or updating), removing, and retrieving entire object instances in Cassandra.

So, here is the corresponding JUnit test class, which inserts, deletes, and retrieves Java object instances in Cassandra with Helena's support:


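import java.util.List;

import org.junit.After;
import org.junit.AfterClass;
import org.junit.Assert;
import org.junit.Before;
import org.junit.BeforeClass;
import org.junit.Test;

// HelenaORM imports (HelenaORMDAOFactory, HelenaDAO, SerializeUnknownClasses)
// are omitted here; take them from the Helena version you are using.

public class GroupTest {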

    static HelenaORMDAOFactory factory;
    private HelenaDAO<Group> dao;

    private static final Integer GROUP_ID = 123;
    private static final String GROUP_NAME = "Test Group";

    @BeforeClass
    public static void setUpBeforeClass() throws Exception {
        factory = HelenaORMDAOFactory.withConfig(
                "localhost", 9160, SerializeUnknownClasses.YES);
    }

    @AfterClass
    public static void tearDownAfterClass() throws Exception {
        factory = null;
    }

    @Before
    public void setUp() throws Exception {
        dao = factory.makeDaoForClass(Group.class);
    }

    @After
    public void tearDown() throws Exception {
        dao = null;
    }
    
    @Test
    public void testSave() {

        System.out.println("GroupTest.testSave()");

        Group group = new Group();
        group.setId(GROUP_ID);
        group.setName(GROUP_NAME);

        System.out.println("Saving group: " + group);
        dao.insert(group);
        // insert() returns nothing to assert on; the retrieval below verifies it

        Group retrieved = dao.get(GROUP_ID.toString());
        System.out.println("Retrieved group: " + retrieved);
        Assert.assertNotNull(retrieved);
        Assert.assertEquals(GROUP_ID, retrieved.getId());
        Assert.assertEquals(GROUP_NAME, retrieved.getName());
    }

    @Test
    public void testRetrieve() {

        System.out.println("GroupTest.testRetrieve()");

        System.out.println("Saving groups");
        for (int i = 1; i <= 10; i++) {
            Group group = new Group();
            group.setId(GROUP_ID * 100 + i);
            group.setName(GROUP_NAME + " " + i);
            dao.insert(group);
        }

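        System.out.println("Retrieving groups");
        // In Cassandra's range queries, empty start and end keys mean an
        // unbounded key range; the last argument caps the result at 10 rows
        List<Group> list = dao.getRange("", "", 10);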
        Assert.assertNotNull(list);
        Assert.assertFalse(list.isEmpty());
        Assert.assertTrue(list.size() >= 10);

        System.out.println("Retrieved list:");
        for (Group group : list) {
            System.out.println("- " + group);
        }
    }

    @Test
    public void testRemove() {

        System.out.println("GroupTest.testRemove()");

        Group group = new Group();
        group.setId(GROUP_ID);
        group.setName(GROUP_NAME);

        System.out.println("Saving group: " + group);
        dao.insert(group);

        System.out.println("Removing group: " + group);
        dao.delete(group);
        // delete() returns nothing to assert on; the retrieval below verifies it

        Group retrieved = dao.get(GROUP_ID.toString());
        System.out.println("Retrieved group: " + retrieved);
        Assert.assertNull(retrieved);
    }

}
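
The idea of a factory that hands out DAOs bound to one entity type can be pictured with a small, self-contained sketch. Everything in it is hypothetical: SimpleDao and InMemoryDaoFactory are illustrative names, a HashMap stands in for Cassandra, and the key-extractor argument replaces whatever annotation lookup the real library performs. It only shows why the test above can call insert(), delete(), and get() without a hand-written DAO class.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

class DaoFactorySketch {

    /** Minimal typed DAO with the three operations named in the article. */
    interface SimpleDao<T> {
        void insert(T entity);
        void delete(T entity);
        T get(String key);
    }

    /** Hypothetical factory: one in-memory "column family" per entity class. */
    static class InMemoryDaoFactory {
        private final Map<Class<?>, Map<String, Object>> tables = new HashMap<>();

        // Assumption: the caller supplies the key extractor explicitly,
        // whereas Helena is configured through @KeyProperty.
        <T> SimpleDao<T> makeDaoForClass(Class<T> type, Function<T, String> keyOf) {
            Map<String, Object> rows = tables.computeIfAbsent(type, t -> new HashMap<>());
            return new SimpleDao<T>() {
                @Override public void insert(T entity) { rows.put(keyOf.apply(entity), entity); }
                @Override public void delete(T entity) { rows.remove(keyOf.apply(entity)); }
                // Class.cast keeps the lookup type-safe; a missing key yields null
                @Override public T get(String key) { return type.cast(rows.get(key)); }
            };
        }
    }

    static class Group {
        private final Integer id;
        private final String name;

        Group(Integer id, String name) {
            this.id = id;
            this.name = name;
        }

        Integer getId() { return id; }
        String getName() { return name; }
    }

    public static void main(String[] args) {
        InMemoryDaoFactory factory = new InMemoryDaoFactory();
        SimpleDao<Group> dao =
                factory.makeDaoForClass(Group.class, g -> String.valueOf(g.getId()));

        Group group = new Group(123, "Test Group");
        dao.insert(group);
        System.out.println(dao.get("123").getName()); // prints: Test Group

        dao.delete(group);
        System.out.println(dao.get("123"));           // prints: null
    }
}
```

Because the DAO is parameterized with the entity type, get() returns a Group directly and no cast is needed in the calling code, which is the same convenience HelenaDAO<Group> offers in the test class.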

Checking the results

After running the tests, you can check the keyspace contents inside Cassandra. To do that, use Cassandra's command-line client (cassandra-cli) and issue the commands below. The row key '12305' belongs to the fifth group inserted by testRetrieve(), since GROUP_ID * 100 + 5 = 12305:


cassandra> count ContactList.Groups['12305']    
1 column

cassandra> get ContactList.Groups['12305']      
=> (column=name, value=Test Group 5, timestamp=1283287882927)
Returned 1 result.

cassandra> get ContactList.Groups['12305']['name']
=> (column=name, value=Test Group 5, timestamp=1283287882927)
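
The output above suggests that the key property becomes the row key while the remaining properties become columns, which would explain why the row holds a single column. The following sketch reproduces that mapping with plain reflection. It is an assumption about the layout drawn from the CLI output, not Helena's code, and columnsOf() is a hypothetical helper.

```java
import java.lang.reflect.Method;
import java.util.Map;
import java.util.TreeMap;

class RowMappingSketch {

    public static class Group {
        private final Integer id;
        private final String name;

        public Group(Integer id, String name) {
            this.id = id;
            this.name = name;
        }

        public Integer getId() { return id; }
        public String getName() { return name; }
    }

    /**
     * Collects every getXxx() property except the key property
     * as a map of column name to string value.
     */
    static Map<String, String> columnsOf(Object entity, String keyProperty) {
        Map<String, String> columns = new TreeMap<>();
        for (Method method : entity.getClass().getMethods()) {
            String methodName = method.getName();
            boolean isGetter = methodName.startsWith("get")
                    && methodName.length() > 3
                    && method.getParameterCount() == 0
                    && !methodName.equals("getClass");
            if (!isGetter) {
                continue;
            }
            // getName -> name
            String property = Character.toLowerCase(methodName.charAt(3)) + methodName.substring(4);
            if (property.equals(keyProperty)) {
                continue; // the key is the row key, not a column
            }
            try {
                columns.put(property, String.valueOf(method.invoke(entity)));
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException(e);
            }
        }
        return columns;
    }

    public static void main(String[] args) {
        Group group = new Group(12305, "Test Group 5");
        System.out.println("row key: " + group.getId());           // prints: row key: 12305
        System.out.println("columns: " + columnsOf(group, "id"));  // prints: columns: {name=Test Group 5}
    }
}
```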

Conclusions

Traditional relational databases took more than 20 years of evolution to reach their current state. Meanwhile, object-oriented programming languages and techniques won over companies and developers, forcing the creation of persistence frameworks to link both worlds. As Internet usage grew, RDBMSs could no longer scale efficiently, so a new paradigm was conceived (or reborn): the distributed key-value database model!

Cassandra, one of those distributed databases, is not trivial to talk to from a developer's perspective. As an alternative to a low-level API such as Thrift, many high-level clients have been created by individual developers (thanks to open source initiatives!), and HelenaORM is one of them.

In this article we saw how HelenaORM simplifies the Java code that handles persistence in Cassandra. That saved a lot of work, didn't it? :D

References

[1] Talking to Cassandra using Java and DAO pattern
[2] Apache Cassandra
[3] Cassandra high level clients
[4] HelenaORM

Thursday, September 2, 2010
