Tuesday, 27 August 2013

Synchronization algorithm for exchanging data in the “Client – Server” model via REST API

37 comments
Many mobile applications require to sync data with a server if they operate in the client – server model to exchange data with the central repository.

If the server serves up resources through a REST API, then all sync logic can be handled on the client side.
The sync logic is able to handle bi-directional sync between central server and multiple clients where only incremental changes apply on both sides with some conflict detection.

Each table (both server and client) participating in the sync process should include two additional fields: “Ts” as Number, “Deleted” as Boolean.

The Ts field is maintained by the server side. For example SQL Server automatically generates a unique incremental value on each insert or update. The Ts field is used to determine whether one record in the table was modified more recently than another so to download only the incremental changes. Also it will help to identify the new records created on the client as they will have no Ts value.
Please note: Ts in this algorithm is not an instant in time, typically expressed as a date and time of day. Rather, the Ts should be considered as an Update Sequence Number e.g. the number of data element that identifies the order in which the element was last modified. The Ts within a table start at “1” and then increase monotonically every time a record is created, modified, or deleted.

The Deleted field is used to mark records for deletion so the clients would remove such records from local repositories, whereas clients with initial sync would simply skip them during download.
Additionally the client data storage should include the “Dirty” field as Boolean.
The Dirty field designates the records changed locally. Once those records sent successfully to the server the Dirty flag reset.

The sync process will be initiated by the client either manually (pressing some Sync button) or periodically or by combination of push notifications about server changes and client change triggers.
The sync process first download the server changes then upload the client changes.
The sync procedure will process the table changes sequentially and in correct order (where parent child relationship exist) to avoid referential integrity conflicts.

Below is the sync algorithm:
  1. Initialize the list of tables required to sync
  2. /******* Download server changes **********/
    For each sync table:
  3. Get the latest Ts, e.g. using sqlite “select ifnull(ts, 0) from Table order by ts desc limit 1”
  4. Call the relevant RESTful service method to get server changes as collection for the table based on the provided latest timestamp (Ts)
  5. /* Insert or update the received changes into local storage */
  6. Iterate through each record in the server changes collection
    • Check if the server record with the same id already exists in the local storage
      • If so then check if the existing local record is dirty
        • If so then call some conflict resolution function
        • Reset dirty flag
      • Else replace the local record with the server record and reset the dirty flag
    • If the local storage doesn’t have the server record, then create it
    /******* Upload client changes **********/
    For each sync table:
  7. Get the list of local storage changes for inserts on the server
  8. Iterate through each record in the local changes for inserts
    • Insert the record on the server (using POST)
    • Get the Ts for the inserted record from the response
    • Update the record in the local storage with new Ts
    • Reset the dirty flag
  9. Get the list of local storage changes for updates on the server
  10. Iterate through each record in the local changes for updates
    • Update the record on the server (using PUT)
    • Get the updated Ts for the updated record from the response
    • Update the record in the local storage with new Ts
    • Reset the dirty flag
    /******* Post sync operation ************/
    For each sync table:
  11. Delete records where Deleted=true and Dirty=false

For the conflict resolution the easiest way is to update the records with the server version (server always wins) and at the end of sync show the conflict log.

37 comments:

  1. Even if your conflict resolution will always take the server versions, you will likely end up having inconsistencies.

    It seems, you are assuming that the device has *exclusively* access to the database for the duration of the sync process on the client side.

    If other users are allowed to make changes during the sync process, your algorithm will not work. The server would need to resolve conflicts itself, or simply refuse to make changes. Either way, your algorithm would need to take care about this. Implementing this correctly and reliable is one of the more hard software problems.

    Furthermore, your sync algorithm doesn't allow to make changes to the database without doing it exclusively through synching. That is, a user cannot just update a particular record - since this would destroy your prerequisite about the "age" of the table.

    There are other caveats, but you may try to find a solution for those problems mentioned first. It's going to be hard.

    ReplyDelete
    Replies
    1. Hi CouchDeveloper,

      Thanks for pointing this out.
      It is true that the concurrency should be taken into consideration when building the server side services.
      But the server side is not going to be locked for the duration of the whole sync process. The communication is based on REST Api, e.g. clients post their requests one by one for each table change. The pessimistic locking can be applied on the modifying record in order to write the change and then read the updated timestamp. Then the record is unlocked and can be modified by the next client. It is up to developer how to apply the transaction on the post or put requests.
      Once the write / read operation on a single record is atomic there shouldn’t be any problem with inconsistencies.

      Delete
  2. Changes with existing data in the server will not be detectable and assuming you have checksum for each row it will be computational heavy to iterate each rows for large dataset.

    ReplyDelete
    Replies
    1. Hi Christopher,

      Any record created on the server should have the corresponding timestamp field. Even if the synchronization functionality would be applied on the existing database the timestamp field should be added and populated with value. So all clients (regardless of their sync state) will be able to detect the changes as the timestamp value (incrementing number: http://technet.microsoft.com/en-us/library/ms182776.aspx ) is created only on the server side.

      Related to performance – you might be right and there is a need to test the algorithm on big data sets. The main goal of this algorithm is to make the sync functionality is easy to implement. But there is certainly a need for improvement. Any suggestions are welcome.

      Delete
  3. There is a slight modification for this algorithm that makes better in syncing during the upload phase. The problem is that there might be few modifications on the server records during the uploading phase. Thus, getting the latest timestamp in step 2, will ignore those modifications cause the uploaded modification(5-8) will have higher timestamps for sure. I suggest to store the latest timestamp after the download phase. This comes after step 4.

    There will be redundant work of course for the already uploaded documents in the previous sync. But you grantee no loss of data will happen.

    ReplyDelete
    Replies
    1. Hi khelll,

      Thanks for your suggestion.
      It is true that the timestamp on the client must be updated after new downloaded changes.
      And it is actually considered in the algorithm in the step 4. The timestamp will be updated as the whole record will be replaced with the server one.
      This is happening in Else clause:
      - Else replace the local record with the server record and reset the dirty flag

      I think I need to mention that in the algorithm specifically.

      Delete
  4. I implemented the algorithm in Python with little improvements. Check out here: https://github.com/melug/ClientServerSyncAlgorithm

    ReplyDelete
    Replies
    1. Hi Tulga,

      That's cool!

      Thanks for mentioning about demo here.

      Delete
  5. Hi Sergey,

    It seems to be that deletion by client side are not supported. I only see deletion in step 4 as some records marked "deleted" by the server. But there is only a mention to update and insert in step 5-8 and not a possible delete operation by client.
    Isn't it?

    Thank you

    ReplyDelete
    Replies
    1. Hi Alex,

      You are right! That was missed.

      There should be the step 9 to do the post sync clean-up:
      For each sync table:
      9. Delete records where Deleted=true and Dirty=false.

      So the deletion of records in the step 4 can be skipped as the step 9 will take care of that.
      I am going to update the algorithm to include the step 9.
      Let me know if you have any suggestions about that.

      Thanks for asking.

      Delete
    2. Thank you for sharing this post Sergey.
      I think AlexDev was talking about deleting the records on the server, something like this, after step 8:

      9. Get the list of local storage changes for deletes on the server
      10. Iterate through each record in the local changes for deletes
      - Delete the record on the server (using DELETE)
      - Get the confirmation (maybe a code 200-OK)
      - Reset the dirty flag

      Then continues with your Post sync operation, which will become step 11.

      Delete
    3. Hi Denis,

      Delete operation during sync in this type of sync topology (multi-client with central server) is not allowed. As that might lead to the client / server data inconsistency.

      For example: client A and client B has the same record 1 received from the server after sync.
      Client A decides to delete the record 1 and sync with the server. If the server deletes the record 1, then it will have no such record anymore but client B still does. Then client B might decide to update the record 1 and call sync. The result is the sync exception as there is no record 1 on the server.

      The work-around in this algorithm is to mark (not delete) the records for delete on the server (see step 8 with “Update the record on the server”). While the clients are able to safely remove the deleted records at the end of sync (step 9).

      There will be a need to remove those ghost (deleted) records from the server (they will be accumulated) at some time. One solution is to run some periodic job, which will check if the “deleted” records are old enough and if so remove them.

      Any suggestions are welcome.

      Delete
  6. Hi,

    are you using this algorithm in one of production apps? If yes, do you have any issues with it that you would like to discuss?

    ReplyDelete
    Replies
    1. Hi Ivan,

      This algorithm was designed and implemented for the project (cloud-based mobile system), which is in production now. Unfortunately, I am not a part of that project anymore and cannot comment on any issues.

      Delete
  7. Hello Sergey,

    Thanks for sharing the beautiful algorithm with everybody.

    I am quarries to know if we can merge the step 5-6 with 7-8 as we can identify the new and client updated records where timestamp is null and dirty is true respectively. That will save a call to server.

    I am implementing this algorithm in android in production. i would love discuss the production issues(if any).

    ReplyDelete
    Replies
    1. Hi Piyush,

      I think the wordings in step 5 and 7 “Get the list of local storage changes for inserts on the server” misled you.
      There is no server call in those steps. What it says is collecting the dirty records from the local storage to be send to the server.
      Thus merging steps 5,6 and 7,8 will not make a big difference. For each dirty record, there is a need to invoke either Insert or Update REST api method.

      We can improve the performance if we reduce the server calls by sending records for inserts or updates in batches, e.g. arrays of records.
      Thus, in the steps 6 and 8 we can first create the collection of records and then POST / PUT that collection in a single call. Accordingly, the server will send the array of new time stamps on success.

      Hope that helps.

      Delete
  8. we should not reset dirty flag if the result of a conflict resolution differs from remote side

    ReplyDelete
    Replies
    1. That depends on the resolution conflict implementation.

      I assumed the conflict resolution function could be one that would resolve the conflict and keep the sync without any halt.
      For example, “server always win” which would override the client changes with the server changes and display some report to the user. In this case the reset flag is required.

      Delete
    2. I understand your point. Just wanted to point out that tricky place.

      Thanks for great sync algo, pal!

      Delete
  9. Wouldn't a better conflict resolution be when the record was updated last? I.e. whichever is later: the server last modified stamp, or the client's last modified stamp?

    ReplyDelete
    Replies
    1. I am not sure if the record’s timestamp difference would make the conflict resolution any better.

      The ts field is used to identify the records (delta) to sync. It is only updated on the server and will always be higher than on the clients.

      Let’s say the record 1 is created by the client A and after sync the server will assign the ts=1000 to that record. The client B gets the record 1, then updates the record 1, e.g. set dirty=1 (while ts is unchanged and equals to 1000) and then sync. The server will update the ts and set it to 1001.
      At the same time the client A updates the record 1 again (while ts=1000) and sync. Here is the conflict and the server record’s ts is higher.

      Delete
  10. If client A's update is after client B, as a user I would expect my changes to be committed, not client B's. Ideally you want to check for updates from the server before doing any updates but sometimes the network is not available, so client A will create a dirty update, then later (could be the next day) when it can sync with the backend his changes will be overridden by B even though he made the changes after. An extra column for the client data called "lastUpdatedLocallyUTC" which is updated anytime the commit is dirty (i.e. local only) would be able to be compared upon conflict resolution. Nice article, btw.

    ReplyDelete
  11. Hi Sergey,

    first of all thanks for sharing your sync process!
    I'm currently trying to implement your approach for an mobile app and are now faced with the problem that comparing local records with the id generated on the server side is not
    as easy as it sounds.
    Consider following situation:

    A user can create new database records which are not public available but always mapped to his/her exclusive account id.

    Client/User A creates a new record in a table called for example 'Book'. Because this is the very first record in the local database the autoincrement column will use the id 1.Now the sync process is triggered so this change can be communicated to the server. Assuming that there are no changes to download to the client the client then starts to upload the local changes to the server. On the server this new record is created in the appropriate table where the autoincrement column decides also to use the id 1 (because it's the very first record on server side).

    Now client/user B creates also a new Book record. Because this is also the first record being created in the table 'Book' the id in this local database instance will be also 1. Client/User B sends now also this local change to the server but this time the used id for the new record in the table 'Book' on server-side will be 2.

    Let's assume now that client/user B will do some changes on the previously created record with the local id 1. While triggering now again the sync process the client will get in the response the record from the server side with the id 2 (because only this record belongs to this client/user -> the record with id 1 belongs to client/user A).

    In this case it is not really possible to check if the server record with the same id already exists in the local storage (see step 4 in your sync description) because on the client side we have the record with the id 1 and the record from the server side has id 2.

    Any ideas and tips how to solve this problem are really appreciated!

    ReplyDelete
    Replies
    1. Hi Panos,

      The record Ids for the tables used in the sync process should be unique across every database.
      I can suggest two options for you:
      1. Use GUID / UUID for your record Ids (primary keys) instead of auto-increment number. Thus only clients will be creating Ids (not the server).
      2. Include an additional field “ServerId” of GUID / UUID type in each table and populate it (only once) on the server. Then the ServerId field can be used for the record comparisons ( step 4 ).

      I hope that explains.

      Delete
    2. Thanks for your reply Sergey!

      Damn, UUID's of course :-) Thanks again for pointing me in the right direction!

      Delete
  12. hello can you provide a link for exapmle or code sample it will a great help

    ReplyDelete
  13. Currently I don't have my own code sample.
    However, there is an implementation of the sync algorithm by Tulga in Python: https://github.com/melug/ClientServerSyncAlgorithm .

    I hope that helps.

    ReplyDelete
  14. This a good article. But however i have been reading up on sync situations. and when you rely on timestamps for syncing there are a lot of scenarios where it breaks down.

    e.g. When the tables are being updated ( but the records are NOT yet commited ) at this tiem if you are inbetween a sync operation. You can actually miss the uncommited changes. refer : http://forums.mysql.com/read.php?27,243736,243736 for more details.

    Timestamps only work if the timestamp providing server is a single server. But what happens in a multi server architecture ? The timestamps / clock cycles will never be exactly the same. So how can the timestamps actually be compared ? We could miss out on a lot of records to be synced even if there is a second difference between the 2 servers.

    ReplyDelete
    Replies
    1. I have updated the post to clarify what Ts mean in the context of this algorithm.

      Ts in this algorithm is not an instant in time, typically expressed as a date and time of day. Rather, the Ts should be considered as an Update Sequence Number (USN) e.g. the number of data element that identifies the order in which the element was last modified. The USNs within a table start at “1” and then increase monotonically every time a record is created, modified, or deleted.

      Delete
  15. This comment has been removed by a blog administrator.

    ReplyDelete
  16. Thank you for this great algo.
    In step 3 when you wrote "Time Stamp" you mean the TS field?
    Thank you.

    ReplyDelete
  17. This comment has been removed by the author.

    ReplyDelete
  18. What if a table contains records for different users (only some records are relevant for a user)?
    The Update Sequence Number will be not constant over all user records and there might be gaps in the Update Sequence Number for one user (for example -> 1,2 .... 10,11 and 3 to 9 belongs to another user).

    Do you see any problem in that case?

    ReplyDelete
    Replies
    1. Hi Panos,

      I don't see any problem with that. The USN is updated (within table) on a global basis, e.g. without filtering. So the users will be able to get deltas correctly. For example, assigning the records 3-9 to the user who has access to records 1,2,10,11 will update the USN of those records, so the sync will include them and another user will be able to download them.

      Delete
  19. Thanks for great article and lots of discussion.

    Can any clear my doubt. how the deleted entries on server can also be deleted on client.

    ReplyDelete
    Replies
    1. Hi ATUL,

      Have a look at Post sync operation (step 9). That step is used for deleting client records.

      Delete