
Java - time ticks in NoSQL table as a range key

Posted at 2014-08-04

One year ago, I was working on a content web site that used Azure Table. To take advantage of NoSQL queries, we designed the table like this:

PartitionKey	RangeKey	Some_other attr
user_id	ticks_to_judgementday	values

The point of using this pattern is that many websites nowadays need to query contents ordered by the time they were created. Both Azure Table and DynamoDB have a PartitionKey (HashKey in DynamoDB) and a RangeKey (called RowKey in Azure Table), and the two together form the identity of an item. All the items with the same PartitionKey are sorted by RangeKey. So if you query items by a specific PartitionKey, the results are already ordered by RangeKey.

Here are some really good posts about these concepts:

http://blog.maartenballiauw.be/post/2012/10/08/What-PartitionKey-and-RowKey-are-for-in-Windows-Azure-Table-Storage.aspx

http://www.allthingsdistributed.com/2013/12/dynamodb-global-secondary-indexes.html

By using this pattern, we can query the items of one user in order of creation time without any extra query or extra index.
The RangeKey is defined like this, subtracting the ticks from the maximum value so that newer items sort first:

// reverse the ticks so that newer items get smaller keys
long rangeKey = long.MaxValue - DateTime.Now.Ticks;

Most importantly, what makes this pattern work is that the range key (Ticks) is effectively unique for every user: one tick is 100 nanoseconds, and we can assume that one user will never get two feeds within the same 100 nanoseconds.
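
To make the ordering concrete, here is a minimal sketch in plain Java. It assumes a TreeMap is a fair stand-in for one partition, since both keep entries sorted by key in ascending order. The class name ReversedKeyDemo, the helper names and the sample timestamps are made up for illustration.

```java
import java.util.Map;
import java.util.TreeMap;

public class ReversedKeyDemo {
    // Reversed key: a larger (newer) timestamp maps to a smaller key.
    public static long reversedKey(long timestamp) {
        return Long.MAX_VALUE - timestamp;
    }

    // Stand-in for one partition: a TreeMap iterates in ascending key order,
    // the same order a range key gives within one partition key.
    public static TreeMap<Long, String> samplePartition() {
        TreeMap<Long, String> partition = new TreeMap<>();
        partition.put(reversedKey(1_000L), "oldest");
        partition.put(reversedKey(3_000L), "newest");
        partition.put(reversedKey(2_000L), "middle");
        return partition;
    }

    public static void main(String[] args) {
        // Iterates newest first even though the items were inserted out of order.
        for (Map.Entry<Long, String> entry : samplePartition().entrySet()) {
            System.out.println(entry.getKey() + " -> " + entry.getValue());
        }
    }
}
```

Iterating the map visits "newest", then "middle", then "oldest", which is the order a feed page wants without any sorting on the client.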

Ticks in Java

new Date().getTime()
System.nanoTime()

I left the world of C# a long time ago (really miss it, really), and now I am working with Java. In a recent project I walked into a very similar scenario: saving things in DynamoDB and fetching items in time order.

So, I want to play the old trick.

The first thing I have to do is find the Java equivalent of Ticks from the C# world.

Date.getTime() seems very natural, because it is the Ticks in milliseconds. In some scenarios it works like a charm with this pattern. The point is whether it is unique within each PartitionKey.

To me, milliseconds are not enough, so I looked at System.nanoTime(), which has nanosecond precision.
One caveat: unlike Date.getTime(), it does not count from 1970/1/1. It measures elapsed time from an arbitrary origin that differs between JVM instances, so its values only sort correctly when they all come from the same running JVM.
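
If the keys have to sort correctly across servers and restarts, one option is to derive the ticks from wall-clock time instead. The sketch below is only an illustration under a few assumptions: Java 8 or later (java.time), a hypothetical class name EpochTicks, and ticks counted from 1970/1/1 rather than from year 0001 as .NET does. The real resolution of Instant.now() depends on the JVM and the OS, so a counter keeps values unique inside one JVM; it does nothing about collisions between different machines.

```java
import java.time.Instant;
import java.util.concurrent.atomic.AtomicLong;

public class EpochTicks {
    private static final long TICKS_PER_SECOND = 10_000_000L; // 1 tick = 100 ns
    private static final AtomicLong lastIssued = new AtomicLong();

    // Converts an Instant to 100 ns ticks since 1970-01-01T00:00:00Z.
    public static long toTicks(Instant instant) {
        return Math.addExact(
            Math.multiplyExact(instant.getEpochSecond(), TICKS_PER_SECOND),
            instant.getNano() / 100);
    }

    // Returns the current ticks, bumped by one when the clock has not advanced,
    // so every value handed out by this JVM is strictly increasing.
    public static long nextTicks() {
        long now = toTicks(Instant.now());
        return lastIssued.updateAndGet(prev -> Math.max(prev + 1, now));
    }

    // Reversed key, so that newer items sort first.
    public static long reversedKey(long ticks) {
        return Long.MAX_VALUE - ticks;
    }

    public static void main(String[] args) {
        long ticks = nextTicks();
        System.out.println("ticks = " + ticks + ", range key = " + reversedKey(ticks));
    }
}
```

For example, toTicks(Instant.ofEpochSecond(1, 500)) is 10000005: one second is 10,000,000 ticks, plus 500 ns is 5 more.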

Use ticks in DynamoDB

DynamoDB shares a lot of similarity with Azure Table. It has HashKey and RangeKey, where Azure Table has PartitionKey and RowKey. The pattern is used in almost the same way as in C#.

Now let us see how this pattern benefits a content web site.

// create a post

public static Result CreatePost(String editorId, String content) {
    // validation...validation...validation...

    List<User> followers = MysqlHelper.getFollowers(editorId);
    
    // feed to self
    FeedTo(editorId, content);

    // feed to followers
    for (User follower : followers) {
        FeedTo(follower.id, content);
    }

    return SuccResult();
}



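public static void FeedTo(String userId, String content) {
    Map<String, AttributeValue> feed = new HashMap<>();
    feed.put("partition_key", new AttributeValue().withS(userId));
    // Reversed timestamp, so the newest feed gets the smallest range key.
    // Note: System.nanoTime() counts from an arbitrary origin, not from 1970/1/1.
    feed.put("range_key", new AttributeValue().withN(Long.toString(Long.MAX_VALUE - System.nanoTime())));
    feed.put("content", new AttributeValue().withS(content));
    // Write the item, using the same client and table as GetFeeds.
    testDynamoClient.putItem(new PutItemRequest()
        .withTableName("ticks_test_table")
        .withItem(feed));
}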

Then you can fetch feeds in bulk by:

public static FeedResult GetFeeds(String userId, Long lastRangeKey) {
    // check stimulus pool
    
    Condition hashCondition = new Condition()
        .withComparisonOperator(ComparisonOperator.EQ)
        .withAttributeValueList(new AttributeValue().withS(userId));
    
    // Range keys grow as items get older, so the next page holds keys greater than the last one seen
    Condition rangeCondition = new Condition()
        .withComparisonOperator(ComparisonOperator.GT)
        .withAttributeValueList(new AttributeValue().withN(Long.toString(lastRangeKey)));
    
    Map<String, Condition> keyConditions = new HashMap<>();
    keyConditions.put("partition_key", hashCondition);
    keyConditions.put("range_key", rangeCondition);
    
    QueryRequest queryRequest = new QueryRequest()
        .withTableName("ticks_test_table")
        .withKeyConditions(keyConditions);
    
    QueryResult queryResult = testDynamoClient.query(queryRequest);
    List<Map<String, AttributeValue>> selectOuts = queryResult.getItems();
    
    FeedResult result = WrapFeedResult(selectOuts); // note to return the max range key in result to perform a pagination
    return result;
}

To implement pagination, return the max range key of the current result, or leave it to the client to work out. The client then passes it as the lastRangeKey param when requesting the next page.
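
The paging idea can be tried without a database. This is a rough sketch that assumes an in-memory NavigableMap stands in for one user's partition and that the next page is everything strictly greater than the cursor. FeedPager, Page and fetchPage are hypothetical names, not part of any SDK.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

public class FeedPager {
    // One page of feeds plus the cursor to pass to the next call.
    public static final class Page {
        public final List<String> items;
        public final Long lastRangeKey; // max range key in this page, null if the page is empty

        Page(List<String> items, Long lastRangeKey) {
            this.items = items;
            this.lastRangeKey = lastRangeKey;
        }
    }

    // Returns up to pageSize items whose range key is strictly greater than
    // lastRangeKey. With reversed keys, "greater" means "older".
    public static Page fetchPage(NavigableMap<Long, String> partition, long lastRangeKey, int pageSize) {
        List<String> items = new ArrayList<>();
        Long cursor = null;
        for (Map.Entry<Long, String> entry : partition.tailMap(lastRangeKey, false).entrySet()) {
            if (items.size() == pageSize) {
                break;
            }
            items.add(entry.getValue());
            cursor = entry.getKey(); // ascending iteration, so the last key seen is the max
        }
        return new Page(items, cursor);
    }

    public static void main(String[] args) {
        NavigableMap<Long, String> partition = new TreeMap<>();
        for (long t = 1; t <= 5; t++) {
            partition.put(Long.MAX_VALUE - t, "post" + t);
        }
        // Long.MIN_VALUE as the first cursor means "start from the newest item".
        Page page = fetchPage(partition, Long.MIN_VALUE, 2);
        while (!page.items.isEmpty()) {
            System.out.println(page.items);
            page = fetchPage(partition, page.lastRangeKey, 2);
        }
    }
}
```

With five posts and a page size of two, the pages come back as [post5, post4], [post3, post2] and [post1], and the loop stops on the empty page.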

Improvement

Holding the connection while feeding every follower during post creation is not ideal. We can improve on that: if we have a message bus and a worker program, we can hand this part of the work to the worker and let it finish the job by itself.

public static Result CreatePost(String editorId, String content) {
    // validation...validation...validation...

    // feed to self
    FeedTo(editorId, content);

    SendPostCreateWork(editorId, content);

    return SuccResult();
}

In terms of user experience, the editor can see the post he has just written as soon as the server returns. His followers may see it a little later, while the worker is still busy delivering it, and that small delay does not matter much.
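
To show the shape of this hand-off without a real message bus, here is a small sketch that swaps in an in-process single-thread executor for the bus and the worker, and a map for the feed table. Everything in it (FanOutSketch, feeds, worker, the method signatures) is made up for illustration. A real deployment would use a durable queue so that pending work survives a restart.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class FanOutSketch {
    // In-memory stand-in for the feed table: user id -> feed contents.
    public static final Map<String, List<String>> feeds = new ConcurrentHashMap<>();

    // Stand-in for the message bus plus worker: one daemon background thread.
    public static final ExecutorService worker = Executors.newSingleThreadExecutor(runnable -> {
        Thread thread = new Thread(runnable, "feed-worker");
        thread.setDaemon(true);
        return thread;
    });

    public static void feedTo(String userId, String content) {
        feeds.computeIfAbsent(userId, id -> new CopyOnWriteArrayList<>()).add(content);
    }

    // Feeds the editor right away and hands the follower fan-out to the worker.
    // The returned future completes once every follower has been fed.
    public static CompletableFuture<Void> createPost(String editorId, List<String> followerIds, String content) {
        feedTo(editorId, content); // synchronous: visible to the editor on return
        return CompletableFuture.runAsync(() -> {
            for (String followerId : followerIds) {
                feedTo(followerId, content);
            }
        }, worker);
    }

    public static void main(String[] args) {
        CompletableFuture<Void> fanOut = createPost("editor", Arrays.asList("alice", "bob"), "hello");
        System.out.println("editor sees: " + feeds.get("editor"));
        fanOut.join(); // only the demo waits; a request handler would return immediately
        System.out.println("alice sees: " + feeds.get("alice"));
    }
}
```

The caller gets control back as soon as the editor's own feed is written, which is the whole point of the improvement.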
