The Rain and The Shade

July 6, 2011

The keys are Kool again

Filed under: NoSQL Databases,Windows Azure — ovaisakhter @ 11:35 pm

(This post is highly inspired by my mail correspondence with Thomas Jespersen who works at www.spiir.dk)

In the good old days when SQL was king (it kind of still is), we were in love with the “identity column”, and the key of a record was insignificant in the design. You would normally design a database in which every table had a single primary key whose value was either a GUID or an auto-incrementing number (identity in MSSQL).

With the rise of NoSQL and high-performance databases, which revolutionized many other things, the record key got its due importance back. Here is what the tutorial on the Redis site says about Redis:

“Redis is what is called a key-value store, often referred to as a NoSQL database. The essence of a key-value store is the ability to store some data, called a value, inside a key. This data can later be retrieved only if we know the exact key used to store it”

This seems to be the theme across most NoSQL databases: Redis, Azure Table Storage, RavenDB, Cassandra and many more use a key to access huge amounts of data. These systems index the keys for very fast retrieval. I am not saying that querying the data is impossible (RavenDB, for example, provides amazing possibilities for creating indexes on the data, but more on that later), but the fastest way to get or set data in these systems is still “if you can somehow know the key”.

Now the question is: how do you know the keys without getting them from the store? One answer is that you should be able to generate the keys from the context and the type of query you want to run. Let us take Twitter as an example. A request comes in asking “who follows me”. From the context we know the current user (ovaisakhter in my case), and he wants to know who follows him, so we can keep a list in the database under the key “ovaisakhter-followers”. Now we can get all of this information in one request.
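To make the idea concrete, here is a minimal sketch in Python. It is not tied to any real database client: a plain dict stands in for the key-value store, and the helper names (followers_key, add_follower, who_follows) and the follower names are made up for illustration.

```python
# A plain dict stands in for the key-value store (an assumption for illustration).
store = {}


def followers_key(user_id: str) -> str:
    """Derive the key from the request context, without asking the store."""
    return f"{user_id}-followers"


def add_follower(user_id: str, follower_id: str) -> None:
    """Append a follower to the list kept under the user's followers key."""
    store.setdefault(followers_key(user_id), []).append(follower_id)


def who_follows(user_id: str) -> list:
    """Answer "who follows me" with a single lookup by key."""
    return store.get(followers_key(user_id), [])


# Hypothetical follower names, just to exercise the functions.
add_follower("ovaisakhter", "follower_a")
add_follower("ovaisakhter", "follower_b")
print(who_follows("ovaisakhter"))
```

The point is that who_follows never searches: it rebuilds the key from what it already knows and does one read.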

Let us take another example. We need to save a user’s tweets. One way of doing that is to maintain one list of tweets per month (the granularity depends on how the data will be accessed), so the key can be “UserId-mmyyyy-Tweets”. Now if someone asks for a user’s tweets, you know exactly where to find them quickly.
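A rough sketch of how such monthly keys could be generated, again in Python and with hypothetical helper names (tweets_key, keys_for_range). It only builds the key strings; what you do with them depends on your store.

```python
from datetime import date


def tweets_key(user_id: str, day: date) -> str:
    """Build the "UserId-mmyyyy-Tweets" key for the month containing `day`."""
    return f"{user_id}-{day:%m%Y}-Tweets"


def keys_for_range(user_id: str, start: date, end: date) -> list:
    """List the monthly keys covering every month from `start` to `end`, inclusive."""
    keys = []
    year, month = start.year, start.month
    while (year, month) <= (end.year, end.month):
        keys.append(tweets_key(user_id, date(year, month, 1)))
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return keys


print(tweets_key("ovaisakhter", date(2011, 7, 6)))
print(keys_for_range("ovaisakhter", date(2011, 11, 15), date(2012, 1, 3)))
```

Because the keys are computed rather than looked up, a request for "tweets from November to January" turns into three direct reads.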

Azure Table Storage is a bit different from the other NoSQL offerings. It provides less opportunity to play with the structure of the document, as the document has to be a set of name-value pairs (maximum 255) with a total size of at most 1 MB. In return it gives you a further categorization possibility: you have two keys to play with, PartitionKey and RowKey. PartitionKey plays a very important role in the scaling of your datastore (more on that later), and it can also be used as a categorization point for fast data retrieval. Let us see how our Twitter example could look when modeled on Table Storage. We can use “UserId-mmyyyy-Tweets” as the PartitionKey and store all the tweets for that month under it. For the RowKey you can use reverse ticks (a reversed timestamp) plus some identifier, so that the newest tweets sort first. Remember, once you know how to generate the key, you can use Parallel.ForEach to get tweets for multiple months in parallel.
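Here is a sketch of the PartitionKey and RowKey idea in Python. It does not use the Azure SDK: a nested dict stands in for the table, a thread pool plays the role of Parallel.ForEach, and all function names are hypothetical. The tick arithmetic mirrors .NET, where a tick is 100 nanoseconds counted from 0001-01-01 and MAX_TICKS is the value of DateTime.MaxValue.Ticks.

```python
from concurrent.futures import ThreadPoolExecutor
from datetime import datetime

MAX_TICKS = 3155378975999999999  # .NET DateTime.MaxValue.Ticks
TICKS_PER_SECOND = 10_000_000    # one tick is 100 nanoseconds

# Stand-in for the table: PartitionKey -> {RowKey: entity} (an assumption).
table = {}


def to_ticks(moment: datetime) -> int:
    """Convert a naive datetime to .NET-style ticks since 0001-01-01."""
    delta = moment - datetime(1, 1, 1)
    seconds = delta.days * 86_400 + delta.seconds
    return seconds * TICKS_PER_SECOND + delta.microseconds * 10


def row_key(moment: datetime, tweet_id: str) -> str:
    """Reverse ticks, zero-padded so string order matches numeric order."""
    return f"{MAX_TICKS - to_ticks(moment):019d}-{tweet_id}"


def save_tweet(user_id: str, moment: datetime, tweet_id: str, text: str) -> None:
    partition = f"{user_id}-{moment:%m%Y}-Tweets"
    table.setdefault(partition, {})[row_key(moment, tweet_id)] = text


def fetch_partition(partition: str) -> list:
    """Return a partition's tweets in ascending RowKey order (newest first)."""
    rows = table.get(partition, {})
    return [rows[key] for key in sorted(rows)]


def fetch_months(partitions: list) -> list:
    """Fetch several partitions concurrently, one result list per partition."""
    with ThreadPoolExecutor() as pool:
        return list(pool.map(fetch_partition, partitions))


save_tweet("ovaisakhter", datetime(2011, 7, 1, 10, 0), "t1", "older tweet")
save_tweet("ovaisakhter", datetime(2011, 7, 6, 23, 35), "t2", "newer tweet")
print(fetch_months(["ovaisakhter-072011-Tweets", "ovaisakhter-062011-Tweets"]))
```

Subtracting from the maximum makes later timestamps produce smaller keys, so a store that returns rows in ascending RowKey order hands back the most recent tweets first.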

So in the new era of software development keys are back in fashion, and due time should be spent on designing what your keys should be, based on things like your data structures, data retrieval requirements, scaling requirements and other things.
