Wednesday, April 21, 2010

NoSQL & Document databases

I am really loving the NoSQL movement at the moment; however, there seems to be a lot of confusion as to when it is appropriate to use these databases.

Significant things to consider are:
  • Atomic transactions are only for single operations. You can't really have long-running or nested transactions (the ACID rules are different).
  • Joins don't make sense, so don't expect to use them when designing the system.
  • Document databases are typically schema-less. This means adding new properties is much less of a hassle than in SQL-land, especially once you are in production.
  • The notion of an aggregate root fits perfectly with a document, so the idea of using a DocDB for DDD is appealing (assuming transactions are not required).
  • DocDBs tend to scale horizontally very well, unlike their SQL counterparts, which tend to only scale vertically without huge headaches.
  • Read and write performance is possibly the opposite of what you would expect, with very fast writes (as I understand it) being the norm.
  • Queries are done differently. MongoDB, for example, uses JSON-style documents to express queries and JavaScript in its shell (this does not mean it is used in the web tier!).
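
To make a few of these points concrete, here is a minimal sketch in Python. It is not a real driver API: `orders`, `insert` and `find` are hypothetical names for an in-memory stand-in for a document collection, and the order data is made up. It only illustrates the ideas above: the aggregate root is stored whole in one document (so no join is needed), a new property can appear without a migration, and a query is a document of criteria rather than a SQL string.

```python
# Hypothetical in-memory stand-in for a document collection; not a real driver API.
orders = []


def insert(collection, document):
    """Store one whole document; a single insert is the unit of atomicity."""
    collection.append(dict(document))


def find(collection, criteria):
    """Return documents whose top-level fields equal every value in criteria."""
    return [doc for doc in collection
            if all(doc.get(key) == value for key, value in criteria.items())]


# The aggregate root (the order) and its children (the lines) live in one
# document, so reading the order back needs no join.
insert(orders, {
    "order_id": 1,
    "customer": "alice",
    "lines": [{"sku": "A-1", "qty": 2}, {"sku": "B-7", "qty": 1}],
})

# A later document can carry a new property without any schema migration.
insert(orders, {
    "order_id": 2,
    "customer": "bob",
    "lines": [{"sku": "A-1", "qty": 5}],
    "gift_wrap": True,
})

# Queries are expressed as a document of criteria.
alice_orders = find(orders, {"customer": "alice"})
wrapped = find(orders, {"gift_wrap": True})
```

Note that nothing here spans two documents atomically; if an operation must update several aggregates together, that is exactly where the lack of multi-operation transactions starts to hurt.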

Several uses for document databases come to mind:

  • High-volume, low-value writes, e.g. user data entry on social sites. This is not business critical but potentially requires easy scaling options, i.e. no one is going to die if your last Facebook update doesn't go through to all the DB servers instantaneously.
  • Auditing. One area I'm keen on is command persistence. I like the idea of having a trail of all commands sent to a component; it becomes a self-documenting timeline of what users were trying to do to the system. When a command is received by a component, it can just write the whole serialized object to a DocDB, thereby capturing all the info without being bound to a schema (the audit is version agnostic). The command can then be processed by the component. I will admit that an object DB is also suitable for this.
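
The command persistence idea can be sketched roughly as follows, again in Python with an in-memory list standing in for the DocDB. Everything named here (`audit_log`, `RenameCustomer`, `record_command`, `handle`) is hypothetical and only for illustration; the point is that the whole command is serialized and stored before it is processed, with no schema tying the audit trail to a particular version of the command.

```python
import json
from dataclasses import dataclass, asdict
from datetime import datetime, timezone

# Hypothetical stand-in for a document collection holding the audit trail.
audit_log = []


@dataclass
class RenameCustomer:
    """Hypothetical command sent to a component."""
    customer_id: int
    new_name: str


def record_command(command):
    """Serialize the whole command and append it to the audit trail."""
    entry = {
        "type": type(command).__name__,
        "received_at": datetime.now(timezone.utc).isoformat(),
        "payload": asdict(command),
    }
    audit_log.append(json.dumps(entry))
    return entry


def handle(command, customers):
    """Record the command first, then process it."""
    record_command(command)
    customers[command.customer_id] = command.new_name


customers = {42: "Old Name"}
handle(RenameCustomer(customer_id=42, new_name="New Name"), customers)
```

Because each entry is just a serialized document, a newer version of `RenameCustomer` with extra fields can be written to the same trail without touching the older entries.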

Things not to use a document database for are high-value, transaction-heavy workloads, e.g. banking transactions, or things that inherently require SQL... whatever use case that may be.

Hope this helps :)
