2

I have a data-intensive iOS app that does not use Core Data and does not support iCloud syncing (yet). All of my objects are created with unique keys. I use a simple long long initialized with the current time. Then, each time I need a new key, I increment the value by 1. This has all worked well for a few years with the app running isolated on a single device.

Now I want to add support for automatic data sync across devices using iCloud. As my app is written, there is the possibility that two objects created on two different devices could end up with the same key. I need to avoid this possibility.

I'm looking for ideas for solving this issue. I have a few requirements that the solution must meet:

1) The key needs to remain a single integral data type. Converting all existing keys to a compound key or to a string or other type would affect the entire code base and likely result in more bugs than it's worth.

2) The solution can't depend on an Internet connection. A user must be able to run the app and add data even with no Internet connection. The data should still resolve properly later when the data syncs through iCloud once a connection is available. I'll accept one exception to this rule. If no other option is available, I may be open to requiring an Internet connection the first time the app's data is initialized.

One idea I have been toying around with in my head is logically splitting the integer key into two parts. The high 4 or 5 bits could be used as some sort of device id while the rest represents the actual key. The fuzzy part is figuring out how to come up with non-conflicting device ids that fit in a few bits. This should be viable since I don't need to deal with millions of devices. I just need to deal with the few devices that would be shared by a given iCloud account.
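A rough sketch of what that split could look like in C. The layout is an assumption, not something the app already does: the sign bit is left at 0 so keys stay positive, the next 5 bits hold the device id, and the low 58 bits hold the per-device counter. The names `make_key`, `key_device`, and `key_counter` are made up for illustration, and the sketch does not solve the hard part, which is handing out non-conflicting device ids.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout (hypothetical): bit 63 stays 0 so the key remains a
   positive signed 64-bit value, bits 58..62 hold the device id, and
   bits 0..57 hold the per-device counter. */
#define DEVICE_BITS  5
#define COUNTER_BITS 58
#define COUNTER_MASK ((INT64_C(1) << COUNTER_BITS) - 1)

/* Combine a device id (0..31) and a local counter into one key. */
static int64_t make_key(unsigned device_id, int64_t counter)
{
    assert(device_id < (1u << DEVICE_BITS));
    assert(counter >= 0 && counter <= COUNTER_MASK);
    return ((int64_t)device_id << COUNTER_BITS) | counter;
}

/* Extract the device id from a key built by make_key(). */
static unsigned key_device(int64_t key)
{
    return (unsigned)(key >> COUNTER_BITS) & ((1u << DEVICE_BITS) - 1);
}

/* Extract the per-device counter from a key built by make_key(). */
static int64_t key_counter(int64_t key)
{
    return key & COUNTER_MASK;
}
```

With this layout, any existing key smaller than 2^58 would decode as device id 0, so id 0 could be reserved for data created before the change. That only holds if the current keys really are that small, which depends on the time unit used to seed them.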

I'm open to suggestions. Thanks.

Update:

After giving this some more thought, I've decided to take the short term hit and do this the right way. Using a GUID is definitely the best long term option for this. Using this approach eliminates the need to do any of the various options to deal with handing out key ranges or translating keys once an Internet connection is made.

My original requirement ruled out the GUID option because neither SQLite nor iOS supports a simple, 128-bit integer type. Since I have a large existing code base that is written such that my keys are simple integer types, it will require a major refactoring of both the code and the database schema.

In the end, I've decided that the long-term issues of dealing with a clunky solution outweigh the short-term pain of doing this refactoring. Taking the hit now gives me a much simpler and less error-prone solution that will last for the long term.

7
  • I agree with the GUID approach. Ranges or anything sequential in nature would give you headaches for data that is distributed. But make sure any indexes on GUIDs in the DB are non-clustered.
    – mike30
    Commented Dec 4, 2012 at 20:03
  • @mike I'm using a GUID create method that uses a timestamp based implementation which is supposed to work much better with clustered indexes. Specifically it is the uuid_generate_time() function.
    – rmaddy
    Commented Dec 4, 2012 at 20:07
  • Will that function be run on client computers? Will insertions from multiple clients reach your central server in a different order than the time created? If so then a clustered index will wreck you.
    – mike30
    Commented Dec 4, 2012 at 20:10
  • Yes, this will be run on multiple client devices. I guess I need to research the proper approach and see what is best for my SQLite indexes. Thanks.
    – rmaddy
    Commented Dec 4, 2012 at 20:12
  • Just use a regular index (i.e., the B-tree kind). Also, I would not use the "time"-based function because it introduces a sequential nature into the GUID. I'd store time in a separate field.
    – mike30
    Commented Dec 4, 2012 at 20:14

4 Answers

3

As Lars Viklund points out in his comment, you might look into UUIDs -- they're 128-bit numbers used for unique IDs and have good stats for uniqueness. If you can't find a function to generate one (though I'm sure Apple has one), Version 4 is pretty easy to implement.
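To give an idea of how little a Version 4 UUID involves, here is a minimal sketch in C. It assumes the caller has already filled 16 bytes from a good random source (on iOS, `arc4random_buf()` would be one candidate); the function names `uuid_v4_from_random` and `uuid_format` are invented for this example. All it does is stamp the version and variant bits over the random bytes and print the usual 8-4-4-4-12 text form.

```c
#include <stdint.h>

/* Turn 16 caller-supplied random bytes into a version 4 UUID, in place.
   Only the version and variant bits are fixed; the other 122 bits stay
   random. */
static void uuid_v4_from_random(uint8_t bytes[16])
{
    bytes[6] = (uint8_t)((bytes[6] & 0x0F) | 0x40); /* version 4 in the high nibble */
    bytes[8] = (uint8_t)((bytes[8] & 0x3F) | 0x80); /* variant bits 10xxxxxx */
}

/* Write the canonical 36-character lowercase text form plus a NUL. */
static void uuid_format(const uint8_t bytes[16], char out[37])
{
    static const char hex[] = "0123456789abcdef";
    char *p = out;
    for (int i = 0; i < 16; i++) {
        /* Dashes separate the 8-4-4-4-12 hex digit groups. */
        if (i == 4 || i == 6 || i == 8 || i == 10)
            *p++ = '-';
        *p++ = hex[bytes[i] >> 4];
        *p++ = hex[bytes[i] & 0x0F];
    }
    *p = '\0';
}
```

The quality of the result depends entirely on the random bytes passed in, so a weak generator would undermine the uniqueness argument.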

3
  • I'd love to use UUIDs (or GUIDs) but unless I'm missing something, this may violate my 1st constraint. Neither SQLite nor iOS has a 128-bit data type. I have a huge existing code base where all keys are assumed to be a simple integral type. The amount of code I would need to change would be huge.
    – rmaddy
    Commented Dec 3, 2012 at 22:07
  • @rmaddy Well, maybe you could do some sort of variant on a UUID that will fit in 64 bits -- Version 4 uses random values (with a few bits that must be within particular ranges) so you might try shrinking that down into 64 bits instead of 128. Version 3 uses an MD5 hash -- maybe you could have the key be some hash of the time and the device ID that will fit in 64 bits?
    – paul
    Commented Dec 4, 2012 at 13:00
  • GUIDs are purpose-designed for your problem. And despite being large, they are just plain old integers. Their size is needed to adequately guarantee globally unique values. Any other option with small integers will require sacrificing something. If I were in your shoes I would go so far as to store the GUID in string format to gain its benefits.
    – mike30
    Commented Dec 4, 2012 at 15:15
1

It really depends on how the keys are used, and the consequences of accidental duplicates. Some obvious choices:

(1) the exact time (down to nanosecond if possible) is statistically certain to be unique among a few devices. Just using a sufficiently precise time as the key is likely to have an overall reliability greater than a more complicated scheme.

(2) Make every key a per-device key, and assign a real unique key when the device syncs to the cloud. Maintain a dictionary in the cloud. (2a) Generate temporary keys serially. When you first sync, acquire a block of permanent keys and renumber.

(3) Issue blocks of a billion or so keys to devices the first time they attach, and let the devices sub-issue keys as needed.

2
  • Thank you. These are good ideas to consider. I'm going to wait a bit for other possible answers before accepting anything.
    – rmaddy
    Commented Dec 3, 2012 at 20:56
  • 2
    How about UUIDs?
    Commented Dec 3, 2012 at 21:18
1

"One idea I have been toying around with in my head is logically splitting the integer key into two parts. The high 4 or 5 bits could be used as some sort of device id while the rest represents the actual key. The fuzzy part is figuring out how to come up with non-conflicting device ids that fit in a few bits."

IMO this is the best option. (I like having the device ID tied to the data.)

My second option would be to have the server issue a block of IDs to the client on sync. When the client syncs, the server tells it that IDs 10000-20000 belong to it. The next time the client syncs, if it has fewer than 5k IDs left, the server gives it an additional 5k IDs. If the client runs out of IDs, it MUST sync. (It's a good idea not to let the client build up an unlimited amount of data without syncing anyhow.)
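A minimal sketch of the client side of that block scheme, in C. Everything here is hypothetical naming (`key_block`, `take_key`, `needs_refill`, `LOW_WATER`); the 5000 threshold is just the 5k figure from above. To keep it short, the sketch tracks a single range, so a real client would also need somewhere to hold the extra range the server hands out before the current one is exhausted.

```c
#include <stdbool.h>
#include <stdint.h>

/* Half-open range [next, end) of keys the server reserved for this device. */
typedef struct {
    int64_t next;
    int64_t end;
} key_block;

/* Ask for more keys once fewer than this many remain (assumed value). */
#define LOW_WATER 5000

/* Number of keys still unused in the block. */
static int64_t keys_left(const key_block *b)
{
    return b->end - b->next;
}

/* Hand out the next key. Returns false when the block is used up,
   meaning the client has to sync before it can create more objects. */
static bool take_key(key_block *b, int64_t *out)
{
    if (b->next >= b->end)
        return false;
    *out = b->next++;
    return true;
}

/* True when the client should request another block at the next sync. */
static bool needs_refill(const key_block *b)
{
    return keys_left(b) < LOW_WATER;
}
```

The `false` return from `take_key` is where the "MUST sync" rule shows up: the app would have to block object creation at that point.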

    0

    Your other option would be to continue assigning keys on the client as you currently do but then when you sync, you re-generate the key on the server which makes it unique. You will have to remember to update the key on any child entities to make sure they link back up correctly.
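    To illustrate the child-entity step, here is a small in-memory sketch in C. The row types and the `rekey_parent` function are hypothetical stand-ins for whatever tables the app really has; in SQLite the equivalent would presumably be a pair of UPDATE statements run in one transaction.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for a parent table and a child table. */
typedef struct { int64_t id; } parent_row;
typedef struct { int64_t id; int64_t parent_id; } child_row;

/* Give the parent its server-assigned key and repoint every child that
   referenced the old key. Returns the number of child rows updated.
   Assumes new_id does not collide with a key already in use locally. */
static size_t rekey_parent(parent_row *p, child_row *children, size_t n,
                           int64_t new_id)
{
    int64_t old_id = p->id;
    size_t updated = 0;
    for (size_t i = 0; i < n; i++) {
        if (children[i].parent_id == old_id) {
            children[i].parent_id = new_id;
            updated++;
        }
    }
    p->id = new_id;
    return updated;
}
```

    Children that point at other parents are left alone, which is the property that has to hold for the links to "match back up" after the sync.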
