Voldemort support for Avro schema evolution
We are introducing support for evolving schemas for people using Avro in Voldemort! This will help Voldemort users to use the data in existing stores and add new fields as their application logic changes.
This is achieved by adding a new serializer type, "avro-generic-versioned" (see the voldemort.serialization.avro.versioned.AvroVersionedGenericSerializer class).
The "avro-generic-versioned" serializer can only be used with new Voldemort stores, not with existing "avro-generic" stores, because it is not backwards compatible with the old "avro-generic" format.
You can define a store with avro-generic-versioned as the key/value serializer type and then use version numbers to identify each revision of the evolving schema.
A stores.xml example of this would look like the following:
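A minimal sketch of such a store definition follows. The store name, persistence and routing settings, and the record and field names (`myrec`, `original`, `newField`) are placeholders chosen for illustration; only the serializer type and the versioned `schema-info` entries relate to the feature described here.

```xml
<stores>
  <store>
    <!-- Placeholder store settings; adjust to your cluster. -->
    <name>test</name>
    <persistence>bdb</persistence>
    <routing>client</routing>
    <replication-factor>1</replication-factor>
    <required-reads>1</required-reads>
    <required-writes>1</required-writes>
    <key-serializer>
      <type>avro-generic-versioned</type>
      <!-- The key schema has a single version and is never evolved. -->
      <schema-info version="0">"string"</schema-info>
    </key-serializer>
    <value-serializer>
      <type>avro-generic-versioned</type>
      <!-- Version 0: the original value schema (hypothetical record). -->
      <schema-info version="0">{"type": "record", "name": "myrec", "fields": [{"name": "original", "type": "string"}]}</schema-info>
      <!-- Version 1: adds a field with a default so older records stay readable. -->
      <schema-info version="1">{"type": "record", "name": "myrec", "fields": [{"name": "original", "type": "string"}, {"name": "newField", "type": "string", "default": ""}]}</schema-info>
    </value-serializer>
  </store>
</stores>
```

Each new revision of the value schema gets the next version number, and the earlier versions stay in place so that data already written with them can still be read.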
- Do ALL clients that read data need to be bounced before we can start writes in the new schema?
Not anymore! The Voldemort client has a new auto-rebootstrap mechanism that picks up the new schema from the server. This happens asynchronously, so writes of objects created with the new schema may fail during this window (~5 seconds). You can also bounce the clients manually to pick up the change immediately.
- Does the client that writes the data need to start writing in the new schema immediately once Voldemort is updated and the client is bounced?
The Voldemort client always tries to serialize data with the newest schema it has. If the record you supply was created with an old schema, it is serialized using that old schema. On a get call, however, objects written with an old schema are returned with defaults filled in for the fields added in the new schema. This feature supports rolling upgrades!
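The read-side behaviour described above (old records coming back with defaults for fields added later) can be illustrated with a toy model. This is only a sketch in plain Java: it does not use the Avro or Voldemort APIs, and the class name `SchemaDefaultsSketch`, the method `resolve`, and the field names are all hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Toy model only: plain maps stand in for Avro records and schemas.
class SchemaDefaultsSketch {

    /**
     * Returns a copy of {@code stored} shaped like the newest schema.
     * {@code readerDefaults} lists every field of the newest schema mapped
     * to its default value; a null default means "no default declared".
     */
    static Map<String, Object> resolve(Map<String, Object> stored,
                                       Map<String, Object> readerDefaults) {
        Map<String, Object> result = new LinkedHashMap<>();
        for (Map.Entry<String, Object> field : readerDefaults.entrySet()) {
            String name = field.getKey();
            if (stored.containsKey(name)) {
                // Field was present when the record was written: keep its value.
                result.put(name, stored.get(name));
            } else if (field.getValue() != null) {
                // Field was added later: fill in the declared default.
                result.put(name, field.getValue());
            } else {
                throw new IllegalStateException("No value or default for field: " + name);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Newest schema (hypothetical): "original" has no default, "newField" defaults to "".
        Map<String, Object> newestSchema = new LinkedHashMap<>();
        newestSchema.put("original", null);
        newestSchema.put("newField", "");

        // A record written before "newField" existed.
        Map<String, Object> oldRecord = new LinkedHashMap<>();
        oldRecord.put("original", "hello");

        System.out.println(resolve(oldRecord, newestSchema));
    }
}
```

The point of the sketch is that a reader holding the newest schema can still consume older records as long as every added field carries a default, which is what makes a rolling upgrade of writers possible.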
- Should I evolve my key schema?
No, never. Changing the key schema can cause things to break.
How Do I Query Using a Java Program
This is a simple example to get you started.
If you already have an Avro object for the key, you do not need the JsonDecoder step: just pass the object directly to the get call. If you only have a JSON string representation of the key, you need to decode it into an Avro object first.
The get call returns a Versioned object. Call getValue() on it to get the actual Avro object.
You should always use the admin tool to update the stores.xml when adding a new schema. If you want to do this on a production store, the Voldemort team will do it.
If the schema change is backwards compatible, the tool updates the stores.xml; otherwise it fails. This protects you from corrupting the store by serializing data with a bad schema.
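To give a feel for what "backwards compatible" means for the common case of adding fields, here is a toy check in plain Java. It is not the admin tool's actual validation, which is not shown here, and it ignores type changes and other resolution rules. The class name `CompatibilityCheckSketch`, its methods, and the field names are hypothetical, and each schema is reduced to a map from field name to "has a default".

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model only: a schema is a map of field name -> whether it declares a default.
class CompatibilityCheckSketch {

    /** Lists the fields that were added in newSchema without a default value. */
    static List<String> addedFieldsWithoutDefault(Map<String, Boolean> oldSchema,
                                                  Map<String, Boolean> newSchema) {
        List<String> problems = new ArrayList<>();
        for (Map.Entry<String, Boolean> field : newSchema.entrySet()) {
            boolean isNew = !oldSchema.containsKey(field.getKey());
            boolean hasDefault = field.getValue();
            if (isNew && !hasDefault) {
                // Old records carry no value for this field and nothing can be filled in.
                problems.add(field.getKey());
            }
        }
        return problems;
    }

    /** In this simplified model, a change is acceptable if every added field has a default. */
    static boolean isBackwardsCompatible(Map<String, Boolean> oldSchema,
                                         Map<String, Boolean> newSchema) {
        return addedFieldsWithoutDefault(oldSchema, newSchema).isEmpty();
    }

    public static void main(String[] args) {
        Map<String, Boolean> v0 = new LinkedHashMap<>();
        v0.put("original", false);

        Map<String, Boolean> good = new LinkedHashMap<>(v0);
        good.put("newField", true);   // added with a default

        Map<String, Boolean> bad = new LinkedHashMap<>(v0);
        bad.put("newField", false);   // added without a default

        System.out.println("good change accepted: " + isBackwardsCompatible(v0, good));
        System.out.println("bad change accepted: " + isBackwardsCompatible(v0, bad));
    }
}
```

A change that fails a check like this would leave existing records unreadable under the new schema, which is the kind of corruption the tool is meant to prevent.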
Auto Rebootstrap when updating the schema
Please use the "--stores" option when you update