Reverse Data Synchronization

Synchronization of data from Production to Development environment is merely a task of taking backup of Production DB, restoring it to Development environment and running to cleanup scripts to hide/overwrite user-specific critical information. This is pretty straight forward and does not even require domain knowledge.

However, reverse is tricky, risky & surely need good insight of application knowledge. Here are some critical scenarios & pointers to address them.

Scenario 1: Consider a situation where bulk of binary data is added to master tables on development environment during testing. Now, while pushing things on Production, either we update the same tables on Production for this bulk data manually or look for some alternative approach that does same without intervention. Running bulk scripts for binary data is not a good solution.

Again, automated process cannot rely on auto-incremented primary keys here since same key value can exist in same table across environment and we can’t compare complete record to identify new or existing record.

So, the handy solution to address this issue is GUID (Globally Unique Identifier). NEWID() & sys_guid() are respectively MS SQL & Oracle version of this. The beauty of GUID is that they Unique across every table, every database, every server. Most of modern data synchronization & replication processes relies on this GUID. Being unique across database, it grantees that records either exist or not. There would not be the case partial match in the record.

Scenario 2: Referential constraints is another area of concern her. Tables containing reference/foreign keys can’t be synchronized prior to master table since their can be a case the record contain a key value does not exists in master table.

So, thumb rule here is , scripts to synchronize tables must be in a particular order where master tables be synchronized first prior to reference tables. This can also be complementing with disabling all constraints followed by enabling post synchronization.

Scenario 3: If auto incremented identity columns are used, then it would need to reset the identity column value & use Identity Insert during synchronization.