
WPF Client Synchronization with Backend

gcpeters
Newbie

Joined: 12-Nov-2010
Location: Atlanta, Ga
Posts: 6
    Posted: 12-Nov-2010 at 9:46am

I am in the process of building a WPF application that will be licensed/sold to companies that work in a specific industry.  Each company can have between 5 and 50 sales reps depending on their size.  Each company can have anywhere from 10 to several thousand accounts, which are assigned to one or more of the sales reps / service technicians.

Each sales rep/service tech will install a copy of this WPF client application on their laptop.  The WPF client app has a local SQL Express 2008 db instance that contains all of the tables/lookup information required to allow the sales rep/tech to work offline (they can be offline for a period of 1-3 days).  There are also other tables with information specific to the rep's company.
 
Once connectivity to the backend is available, all record changes (new, deleted, updated) are synchronized through a web service to a backend SQL Server instance.  This web service will later be used by a Silverlight application that performs a different function (but shares a subset of these entities).  These changes are then synchronized to the other sales reps/techs within the same company as they perform their own synchronization process.
 
I can provide more details as to why a local database instance is being used as the primary repository for the client application (much of it has to do with some large datasets) and how the application is used.
 
============ Implementation  ===========
Taking into consideration that record creation/updates are distributed remotely across several clients that may or may not have internet connectivity for short periods of time, I've attempted to simplify my implementation flow.
 
My WPF application's main instance of its repositories will always persist to the local database instance. 
 
A separate background process will monitor for connectivity and then perform data synchronization if the user has indicated it's OK to continue with synchronization.  This background process will use an instance of the repositories that's configured to use a backend web service, similar to the future Silverlight app mentioned above.
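To make that flow concrete, here's a rough sketch of what I have in mind for the background step (ISyncService, PushLocalChanges and PullServerChanges are placeholder names of my own, not DevForce types):

// Rough sketch of the background sync monitor (placeholder types, not DevForce API).
using System;
using System.Net.NetworkInformation;
using System.Threading;

public interface ISyncService
{
    void PushLocalChanges();   // wraps the "webservice" repository instance
    void PullServerChanges();  // refresh the local db with other reps' changes
}

public class SyncMonitor
{
    private readonly ISyncService _syncService;
    private readonly Func<bool> _userApprovedSync; // asks the user if it's OK to synchronize
    private readonly Timer _timer;

    public SyncMonitor(ISyncService syncService, Func<bool> userApprovedSync)
    {
        _syncService = syncService;
        _userApprovedSync = userApprovedSync;
        // Poll every five minutes; could also listen to
        // NetworkChange.NetworkAvailabilityChanged instead of polling.
        _timer = new Timer(_ => TrySync(), null, TimeSpan.Zero, TimeSpan.FromMinutes(5));
    }

    private void TrySync()
    {
        if (!NetworkInterface.GetIsNetworkAvailable()) return; // still offline
        if (!_userApprovedSync()) return;                      // user said "not now"

        _syncService.PushLocalChanges();
        _syncService.PullServerChanges();
    }
}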
 
To avoid key collisions:
 
The client db schema is set up to have any IDENTITY columns seeded with (-1,-1).  This allows all new records that are created by each client to receive an identity value that won't collide with data being merged from the server.
 
The server db schema is set up to have any IDENTITY columns seeded with (1,1).  These become the permanent primary keys, which are then used to update the records created by the client. 
 
============ Questions ===========
  • My plan was to have the background sync process use the "local db" repository instance when loading the entities that will participate in the synchronization process.  These entities would then be persisted through the "webservice" repository instance.  Do I need to (and if so, how do I) manually force these DevForce entities to be recognized as "new" vs. "updates" so that they're correctly handled through the service call?  Since I just loaded them, they will be marked unchanged.
    • Should I consider implementing my own webservice "sync" process and simply pass the DevForce entities back and forth within my own wrapper payloads?  I could then use DevForce on both ends applying the behavior that's appropriate for each side.
    • I know that I can toggle DevForce between using its own caching while offline and then toggle it back so that it pushes the changes to the webservice, but I'm not sure if I could take advantage of this in my case. 
    • Since my local db instance is basically a persisted local cache that's managed outside of DevForce, I believe I've bypassed DevForce's internal offline cache support.
    • I'm not sure if I can use multiple Entity Managers (one to load the "changeset" and another to push it to the webservice making it look like they came from a cache so that new records are treated as inserts while changes are treated as updates.)
  • When I persist the records created by the client (identities will be negative values) on the server, can I somehow use DevForce's ability to cascade the new permanent primary key down through all the other entities that have a foreign key to it?  This would make sure that the records with the foreign keys don't still contain the negative values generated on the client.  (There's a rough sketch of the kind of fix-up I mean just after this list.)
    • I wasn't sure if I'm able to do this since I created the "temporary" primary key from the database on the client so as far as DevForce is concerned, it's actually an "update" because I read it from the client database.
    • If question #1 can be solved by going through the DevForce offline caching feature, this might be a moot question.
  • When coming back from the server to the client, once I have the new permanent keys, would it be best to simply perform new INSERTs with IDENTITY_INSERT ON?
    • Since I'll know which "new" records were included during the sync session, I can either mark the original records with the negative value as deleted or simply purge them.
      • I could save off the original negative identity value in the Entity so that I can find the source records again after the pk field was given the permanent value.
      • OR I could select from the table where all of the other column values matched and the identity value was < 0, but that might be too inefficient.
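To illustrate the fix-up I'm trying to avoid writing by hand, here's a rough sketch (plain illustrative types of my own, not DevForce API; the keyMap would be built from whatever the server save returns):

// Hand-rolled key fix-up (illustrative types only; ideally DevForce handles this for me).
using System.Collections.Generic;
using System.Linq;

public class Account        { public long AccountId { get; set; } }
public class AccountAddress { public long AccountAddressId { get; set; } public long AccountId { get; set; } }

public static class KeyFixup
{
    // keyMap: temporary negative client id -> permanent server-assigned id,
    // captured from the save result when the new Accounts were pushed to the server.
    public static void ApplyPermanentKeys(
        IEnumerable<Account> accounts,
        IEnumerable<AccountAddress> addresses,
        IDictionary<long, long> keyMap)
    {
        foreach (var account in accounts.Where(a => a.AccountId < 0))
            account.AccountId = keyMap[account.AccountId];

        // Cascade the new keys to every child row that still points at a negative id.
        foreach (var address in addresses.Where(a => a.AccountId < 0))
            address.AccountId = keyMap[address.AccountId];
    }
}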

I'm not sure if these questions make sense.  I'm kind of thinking through this as I try to explain it.

Thanks for any suggestions / input.  All suggestions and questions are welcome.  I have some flexibility in changing parts of the implementation but I'd like to avoid a huge shift in design.
 
Thanks,
George
smi-mark
DevForce MVP

Joined: 24-Feb-2009
Location: Dallas, Texas
Posts: 343
Posted: 12-Nov-2010 at 9:54am
Just a thought... Rather than having a local SQL instance, could you not save/restore the IdeaBlade cache? If it's just lookup tables and a small amount of data, it seems that would be a much easier implementation.

The first time the app is run it can populate the cache, and when the app closes, it simply saves the cache. This way IdeaBlade is responsible for generating all the PKs, and you can choose how to handle concurrency conflicts according to your needs.
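Something along these lines, as a minimal sketch (the CacheStateManager method names/overloads here are from memory, so check them against your DevForce version):

// Sketch only - verify the exact CacheStateManager overloads for your DevForce version.
using System.IO;
using IdeaBlade.EntityModel;

public static class LocalCache
{
    private const string CacheFile = @"offline.cache";

    // On startup: restore the previously saved cache if we have one;
    // otherwise hit the server and populate it for the first time.
    public static void Restore(EntityManager manager)
    {
        if (File.Exists(CacheFile))
            manager.CacheStateManager.RestoreCacheState(CacheFile);
    }

    // On shutdown (or after a successful sync): persist the whole cache,
    // including any pending changes, so the app can resume offline.
    public static void Save(EntityManager manager)
    {
        manager.CacheStateManager.SaveCacheState(CacheFile);
    }
}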
WardBell
IdeaBlade

Joined: 31-Mar-2009
Location: Emeryville, CA
Posts: 338
Posted: 12-Nov-2010 at 12:13pm

I would like to make a plea for simplicity. I don't think you need to have a special synchronization web service or play games with the IDENTITY seeds. I hope you are still early in the game and are open to rethinking your implementation plan.

The Alexandrian solution  ("cut the Gordian Knot") to the id portability problem is to use Sequential Guids as keys and thus avoid the problem altogether. Examine closely all the reasons you favored auto-generated integers instead of Guids. Are they really valid? Because if you can use Sequential Guids (aka GuidCombs), this particular problem just disappears.
 
'k ... you insist on integer keys (p.s.: please make them "BigInt"/longs !)
 
I echo smi-mark's suggestion about using the DevForce serialized entity cache files ... for simplicity reasons again.
 
Now you may suppose that you have too much data to maintain as one (or more) serialized files. For example, you wouldn't want to keep your entire Product catalog as serialized data on each client.
 
Others have been here before you. They separate data that change from data that don't. The data that don't change reside in a local database; the data that do change are preserved in serialized cache files ... until the application can reconnect with the server and post the changes (adds/changes/deletes).
 
This process works well ... even when there is overlap of changed and base data. Suppose that we start with a Customer's information in the local database. The user pulls up that Customer's information and modifies it before shutting down ... without being able to save to the server. Well, the app can save that customer editing session as a serialized cache file.
 
You only have to preserve the session changes (add/modify/delete) in the serialized cache file. You already have the unchanged data in a local database so you don't need them in the serialized cache. Thus your serialized session cache files can be small.
 
When the user revives the app - assume she does so offline - and the user revisits that Customer editing session, the app restores the in-memory cache from the serialized session file. Of course you'll also bring in some of the other unchanged data from the local database to flesh out the screens. Fortunately, the pending changes continue to trump the information from the local database. Her pending customer changes will remain visible in the UI and are ready to be saved to the server when a connection becomes available. With a little attention to detail, you can make the Customer edit session look exactly as it did when the user suspended it.
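In code, the "save only the session's changes" idea is roughly the following. I'm writing this from memory, so treat HasChanges/GetChanges/ImportEntities/CacheStateManager as the shape of the thing rather than exact signatures:

// Rough shape only - confirm the exact DevForce signatures before relying on this.
using IdeaBlade.EntityModel;

public static class EditSession
{
    // Persist just the adds/modifies/deletes from this session's EntityManager.
    public static void Suspend(EntityManager sessionManager, string sessionFile)
    {
        if (!sessionManager.HasChanges()) return;   // nothing worth saving

        // Copy only the changed entities into a scratch manager so the
        // serialized file stays small; the unchanged data lives in the local db.
        var scratch = new EntityManager(sessionManager);
        scratch.ImportEntities(sessionManager.GetChanges());
        scratch.CacheStateManager.SaveCacheState(sessionFile);
    }

    // Later (possibly offline): bring the pending changes back into a fresh manager,
    // then re-query the local database for the unchanged data. The restored
    // pending changes continue to win in the cache.
    public static void Resume(EntityManager sessionManager, string sessionFile)
    {
        sessionManager.CacheStateManager.RestoreCacheState(sessionFile);
    }
}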
 
--
 
Important: you want to separate edit sessions ("sandboxed sessions") into separate serialized cache files. That will take some bookkeeping. 
 
If your application lets you edit Customer 'A' in one dialog and edit Customer 'B' in another dialog, you want to keep these dialog sessions independent. Later the user can save (or abandon) edits to 'A' without disturbing pending edits to 'B'.
 
You should structure your application to maintain 'A'/'B' isolation BOTH while the app is running AND when you shut down and preserve pending changes locally for later resurrection. Accordingly, you should keep the two edit sessions in separate serialized cache files.
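The bookkeeping I have in mind is nothing fancier than one sandbox EntityManager per edit session plus a file name that identifies the session; something like this sketch:

// One sandbox per edit session; the file name carries the session identity. Sketch only.
using System;
using System.IO;
using IdeaBlade.EntityModel;

public class CustomerEditSession
{
    public EntityManager Sandbox { get; private set; }
    public string SessionFile { get; private set; }

    public CustomerEditSession(long customerId)
    {
        // Each dialog gets its own EntityManager, so saving or abandoning
        // Customer 'A' never disturbs the pending edits to Customer 'B'.
        Sandbox = new EntityManager();
        SessionFile = Path.Combine(
            Environment.GetFolderPath(Environment.SpecialFolder.LocalApplicationData),
            "MyApp.Sessions.customer-" + customerId + ".cache");
    }
}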
 
I know this sounds like a PITA. Well it is a PITA. But I don't see that you have a choice.
 
These complications are present regardless of your ID strategy (integer or Guid). They are present whether you store all, some, or none of your provisional data in a local database. If you pursued your present course of keeping changes in the database you would still have to (a) identify the unsaved data and (b) separate them into separate "transactions". Such challenges are intrinsic to your requirements.
 
I happen to believe that bundling pending "transactions" into separate serialized cache files is the easiest way to skin all the cats, the id cat and the session boundary cats in particular.
 
--
 
Now ask yourself: "Will my edit sessions be too big to store as serialized cache files?"  I hope the answer is "no." If it is "no" then you don't have to worry about your ID manipulations. DevForce will preserve and restore the temporary ids for you.
 
If you still feel you have to save the temporary data in a database and don't want to use serialized cache files ... we can pick up the conversation from there.
 
There is a way to go down the path you started. You still have to figure out how you will remember which SQL data are provisional changes and you will have to figure out how to draw transaction boundaries. Assuming you have a good plan for that, I think I could work out a procedure that lets DevForce do the FK fix-up for you.  Let's not go there until you are convinced (and can convince me) that you really have to do so.
gcpeters
Newbie

Joined: 12-Nov-2010
Location: Atlanta, Ga
Posts: 6
Posted: 15-Nov-2010 at 9:35am
@smi-mark - Thank you for the response and valuable feedback.  If you don't mind, I'll provide some additional information about the application below to see if your recommendations would still apply.  I left it out originally because when I previewed my original post I was thinking "OMG, many may not want to read an entire 4-page novel to get the background on my issues."  :)  I thought it would be best to provide it as the conversation evolved, but I appreciate your thoughts.
 
@Ward - I am grateful for your response as well.  When I read your blogs and your responses to others I find them extremely valuable, because your ideas, approach and the detailed reasoning behind your answers help me learn a tremendous amount each time.  Thank you for the suggestion on the BigInts; I completely forgot about using that as my identity type.  In my initial conversations with Derrick Chung and Albert Wang, my original table structures used Guids because I wouldn't have to worry about the collisions/identity synchronization. 
 
I probably spooked myself while doing some research when I read a number of never-ending debates about using ints vs. Guids.  Many pointed to the size of Guids and potential speed issues that may arise as the data grows, in addition to being able to identify some type of order (sequential Guids would help, but since I'm creating records from many distributed sources, that wouldn't work for me); however, my simple dteCreated column tells me the creation sequence (for the application's needs) regardless of what the keys say.
 
------ Some additional Data Information -----
Equipment, Parts, RemanParts, Kits - This subset of data, which is used by all aspects of the application (much of it referenced and provides details), is global across all companies (my customers).  Using Ward's concept, this information makes up the "Equipment Catalogue" (specs, market pricing, etc.) along with all the OEM parts (parts) and aftermarket parts (remanparts) available for each piece of equipment.  There might be somewhere between 25K and 30K different equipment records, along with any parts/remanparts/kits associated with each one.  The parts/remanparts/kits can be shared across multiple equipment records that belong to the same family/series.
 
Rate of Change: Maintained by my group.  Initially there will be a lot of new records, but over time this will settle down to a few changes every 3-6 months.  The WPF client application will primarily update its local database with this information so that offline references can be done.
 
Dealer Profile - At a high level, a dealer (my customer, originally called a Company above) will create a profile where they select all the equipment/parts/remanparts/kits that they service/support/offer supplies for.  This profile contains additional information such as their cost of service, baseline pricing models and min/max service thresholds.  It is basically a subset of the Equipment information.
 
Rate of Change: Typically created initially by my group but then adjusted by the managers at the dealer (company).  Changes may happen frequently and will be done through a module that is exposed on certain deployments of the WPF client application.  The ability to make changes to the dealer profile while offline is a goal.
 
Account Profile - A dealer will build a "profile/map" of each account.  This typically happens as the first step in trying to acquire the account.  An account profile is similar to the equipment portion of a Dealer Profile, but it contains a list of all the equipment (referenced to our master catalogue) that an account currently has/uses across all of its physical locations.
 
The only difference is that this list is per physical instance of a piece of equipment, i.e., an account might have 3 installations of Equipment Model A, so there would be 3 AccountEquipment records all referencing Model A but with additional attributes unique to each install (serial #, location, usage stats, etc.).
 
There are several other supporting records that will be created (potentially offline, in addition to the list of equipment) such as locations, contacts, pricing, orders, service requests and "stat" captures.  The stats for each piece of equipment contain about 20-30 additional attributes such as consumable levels, total runtime, errors, etc., determined by a third-party vendor.  This information is 1-to-1 against each account equipment record.  Pricing at this level is not always 1-to-1; pricing only exists if the sales rep/dealership arranged a special pricing model to help "seal the deal" for this account.  This overrides the dealer's baseline pricing model.
 
Rate of Change: Any change to an account's profile (location, equipment list, pricing, stats, etc) can be done remotely while off-line by a sales rep who has access to this account.  During an "assessment", practically all the account equipment records, stats records, locations, contacts, etc will be created in one session (possibly offline).
 
 
@Ward - The idea of separating data that will change more often from the base data that doesn't change as often makes perfect sense.
 
The more I think about this, the more I believe I can utilize the DevForce caching system and serialized cache files for practically all of my local client changes.  I don't think the data will become too large while they are waiting for connectivity.
 
Not knowing when someone could be offline, I would like to make sure that sales reps/techs could still access account information/assessments/proposals for any account they have access to, even if they have not previously requested it through the user interface.  Let's say they're offline and another sales rep calls to ask them to review it (since they are a secondary rep or something).  I don't want to implement a "local cache database" based only on what they have explicitly requested previously through the application.
 
It seems like I want to implement something similar to Outlook using an Exchange Server running with Offline Cache support enabled.  I might have only looked at the Inbox, but Outlook has also sync'd any changes to my calendar, notes, etc during the connected session.
 
Of course I could be missing the boat (and maybe you've presented it in the form of a huge tanker and I still missed it), but could the background "sync" process simply be a "puller" that runs after the DevForce change caches have been posted to the backend?  I'm not sure if this is what you were suggesting, or if you were suggesting that my local database only store the fairly static data and not all of the data (my thought).
 
Using this approach, I would eliminate the local identity synchronization issues and let DevForce handle all changesets via serialized cache files.  Once they are pushed up, I could locate all server changes (which would include my most recent ones committed) via the Change Tracking ability in SQL Server, or query for any creates/modifications on the server since my last sync date/time.  I can then refresh my local database instance with changes / new records introduced by other sales reps (including myself, since those records would have never existed in my local database).  This would meet the goal of having access to all accounts while offline at any point in time (all of the data, including assessments, stats, etc.).
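To describe the "puller" a little more concretely, I'm picturing something like this (AccountEquipment and its ModifiedOn audit column come from my own schema; this isn't a DevForce-prescribed pattern, and I'd still have to confirm the right way to issue the query):

// Pull-side sketch using my own ModifiedOn audit column; illustrative only.
using System;
using System.Collections.Generic;
using System.Linq;

public class AccountEquipment
{
    public long AccountEquipmentId { get; set; }
    public DateTime ModifiedOn { get; set; }
}

public class ServerChangePuller
{
    // Everything created or modified on the server since the last sync,
    // which includes my own just-committed changes plus other reps' changes.
    public IList<AccountEquipment> PullSince(
        IQueryable<AccountEquipment> serverEquipment, DateTime lastSyncUtc)
    {
        return serverEquipment
            .Where(e => e.ModifiedOn > lastSyncUtc)
            .ToList();
    }

    // After merging the pulled rows into the local SQL Express instance,
    // record the new high-water mark for the next sync session.
    public DateTime NextHighWaterMark(IList<AccountEquipment> pulled, DateTime previous)
    {
        return pulled.Count > 0 ? pulled.Max(e => e.ModifiedOn) : previous;
    }
}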
 
I feel like I've taken a couple of steps toward what you guys were suggesting, but I'm still not going all the way because of my desire to store much of the data in the local database and not treat it as "temporary" storage.  I think my suggestion would eliminate my need to keep track of my local changes and worry about merging them with incoming changes, because they would be handled on two separate paths.
 
Would I have the new id generation done in the application and remove the newid() defaults in the db schemas, so that whatever is generated is maintained as the PK?  Or have DevForce create a temp Guid for the cached entities, allowing the server's DB schema to apply a permanent sequential Guid?
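If I go the "generate in the application" route, I assume it would be something like the well-known GuidComb approach shown below (this is the generic algorithm, not anything DevForce-specific; DevForce may well have its own sequential-Guid support I should use instead):

// Generic "comb" Guid: random bytes up front, time-based bytes at the end so
// SQL Server's uniqueidentifier ordering keeps new rows roughly sequential.
using System;

public static class GuidComb
{
    public static Guid NewGuid()
    {
        byte[] guidBytes = Guid.NewGuid().ToByteArray();

        DateTime baseDate = new DateTime(1900, 1, 1);
        DateTime now = DateTime.UtcNow;
        byte[] daysBytes = BitConverter.GetBytes(new TimeSpan(now.Ticks - baseDate.Ticks).Days);
        // Divide by 3.333333 to match SQL Server datetime's 1/300-second resolution.
        byte[] msecsBytes = BitConverter.GetBytes((long)(now.TimeOfDay.TotalMilliseconds / 3.333333));

        // SQL Server compares uniqueidentifiers starting from the last byte group,
        // so the timestamp goes into the final six bytes (big-endian).
        Array.Reverse(daysBytes);
        Array.Reverse(msecsBytes);
        Array.Copy(daysBytes, daysBytes.Length - 2, guidBytes, guidBytes.Length - 6, 2);
        Array.Copy(msecsBytes, msecsBytes.Length - 4, guidBytes, guidBytes.Length - 4, 4);

        return new Guid(guidBytes);
    }
}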
 
Thanks again,
George
 
 
 
 
gcpeters
Newbie

Joined: 12-Nov-2010
Location: Atlanta, Ga
Posts: 6
Posted: 15-Nov-2010 at 1:08pm
Two quick comments:
 
One:
 
I forgot to touch on a scenario Ward mentioned about edits to Customer A and Customer B: maintaining them separately so that one change set could be rolled back without impacting the other.
 
I built a class called a Unit Of Work which provides access to an Entity Manager.  An initial Unit of Work is created and stored in the Unity Container which provides a "common" Entity Manager to use, typically for my lookup view models where changes won't be an issue.  An abstract base view model manages the Unit Of Work for each View Model.
 
When an edit view is displayed, its view model can either be injected with an existing Unit Of Work for batch operations or request a new one.  A new Unit Of Work has a new instance of the Entity Manager.  All view models that are required to handle a specific functional area will share this same UoW through injection.  When the view models work with various repositories, they provide a "non-saving interface" to their unit of work to remind me not to have my repositories fire any saves.  Only the view models, typically the parent one, will call Save on the Unit of Work or Abandon(Cancel/Abort/Change my Mind) any changes performed by rejecting the changes in the Entity Manager.  Since many View Models and repositories may have participated in that functional area, only the controlling view model knows the wishes of the user.
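For context, the shape of that class is roughly this (heavily trimmed for the post; the real one also hands out the repositories and the "non-saving interface"):

// Trimmed-down shape of my Unit Of Work wrapper (simplified for the post).
using IdeaBlade.EntityModel;

public class UnitOfWork
{
    public EntityManager Manager { get; private set; }

    public UnitOfWork()
    {
        Manager = new EntityManager();
    }

    // Only the controlling (usually parent) view model calls these;
    // repositories only ever see a "non-saving" view of the unit of work.
    public void Save()
    {
        Manager.SaveChanges();
    }

    public void Abandon()
    {
        Manager.RejectChanges();
    }
}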
 
Second:
 
I was going through a dry run of switching my PKs to sequential Guids and I wasn't sure if there would be any performance issues with a structure like the following (assuming I switched everything over to a Guid if it could be created remotely in a disconnected state):
 
Account - Guid Pk
AccountAddress - Guid Pk, with a FK to Account (a Guid).  This table contains a global list of addresses for an account because a single address could be reused one or more times by one or more Locations (ie: branch locations share the same billing address (parent's physical address), or regional shipping address (warehouse), etc...)
AccountEquipment - Guid Pk, with a FK to Account (a Guid).
AccountLocation - Guid Pk, with a FK to Account (a Guid).
AccountLocationAddress - Guid Pk, with a FK to AccountLocation (a Guid), a FK to AccountAddress (another Guid), AddressTypeId (int).  (Ex: A single location may have a billing address, shipping address and physical address.  The billing address and shipping address may be the same one used by all the locations within a specific region while the physical addresses would each be different.)
 
Just looking at the AccountLocationAddress relationship table, there would be three Guid columns.  At a glance, without running a bunch of tests, it seems like there could be a lot more overhead processing tables like this than with three BigInt columns. 
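Doing the quick math: three uniqueidentifier key columns are 3 × 16 = 48 bytes of key data per row, versus 3 × 8 = 24 bytes for three BigInts (or 12 bytes for plain ints), and those keys get repeated in every index that carries them, so the width difference compounds.  Whether that actually matters at my row counts is really what I'm asking.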
 
It was just a pattern that I was noticing while I was testing the option of switching over to Guids.
 
Thanks,
George