Race condition?

I've noticed what seems to be a race condition within the CLI.

Suppose I do the following:
(1) op get item MyItem and save the item's UUID
(2) op create item 'Secure Note' --title=MyItem [... more parameters ...] (to enter a new version of MyItem)
(3) op delete item UUID (to delete the original item after the new version has been created)

If immediately after step #3 I do an op get item MyItem, I often get a listing of multiple items: the original "MyItem" and the new "MyItem".
If I wait a few seconds and re-run op get item MyItem, I then get a proper listing of only the new "MyItem" that I created.

This seems to indicate that one or more of the three operations were queued up in the background, even though the op command returned with a status saying they were complete ... and this leads to a race condition.
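The three-step sequence above can be sketched as a small shell function. This uses the op 0.x syntax from this thread; parsing the UUID out of the item JSON with `jq` is an assumption about the output format, not something the thread confirms:

```shell
# Sketch of the replace-then-delete sequence from steps (1)-(3).
# Uses the op syntax quoted in this thread; extracting the UUID with
# `jq -r '.uuid'` is an assumption about op's JSON output.
replace_item() {
  title=$1

  # (1) Look up the existing item and save its UUID.
  old_uuid=$(op get item "$title" | jq -r '.uuid') || return 1

  # (2) Create the replacement item under the same title.
  op create item 'Secure Note' --title="$title" || return 1

  # (3) Delete the original item by UUID.
  op delete item "$old_uuid"
}

# Immediately after this returns, `op get item` may still list both the
# old and the new item until the server-side state catches up.
```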

It would be desirable if there were some way to tell op not to return until the background operations have completed. I understand that this probably shouldn't be the default behavior, but op could be enhanced to take an optional --wait command-line argument which, if present, tells it to wait for the background processing to finish.

Is this possible?

Thank you.


1Password Version: Not Provided
Extension Version: Not Provided
OS Version: Not Provided
Sync Type: Not Provided

Comments

  • jpgoldberg Agile Customer Care

    Team Member

    I guess this kind of thing was inevitable once people started to script interactions so that requests are made in quick succession. The way the op command is structured, each call to it is a separate process, so there is no real way for an op get to know that a previous op delete has or hasn't finished.

    The actual API is mostly RESTful, and so if a get item and a delete item hit the server at nearly the same time, there is no guarantee that the one that arrived first will finish before the one that arrived second.

    I suppose that we could introduce a local locking mechanism, but on the whole I'd expect that that would introduce more problems than it would solve. Anyway, I'm just brainstorming here.

  • HippoMan
    edited August 2018

    Thank you. Here's a suggestion. How about adding the following logic:

    Add an optional parameter to each op call, perhaps something like --requestid. If that option is specified, a unique ID is generated for the op call which (1) is returned to the caller and (2) is passed to the REST server and associated with the function that is being performed.

    Then, add a new op call, something like op query ID, where ID is the value of any previously returned request ID. That call could print a string containing the status of the specified request, such as COMPLETED, FAILED, PENDING, RUNNING, NOTFOUND, etc.

    Then, for any specified op call, the user could do the following:

    (1) Make the desired op call with the --requestid parameter
    (2) Capture the request ID that the op call returns
    (3) Repeatedly call op query ID until it returns either COMPLETED, FAILED, or NOTFOUND.
    Then, the user can be sure that the work performed by the op command is no longer in progress.
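As a rough illustration of the proposed loop: with the caveat that --requestid and op query are hypothetical features suggested in this thread, not anything op actually supports, the polling in step (3) could be factored into a small generic helper:

```shell
# Poll a status command until it prints a terminal state or we give up.
# COMPLETED / FAILED / NOTFOUND are the terminal states proposed above.
wait_for_request() {
  status_cmd=$1           # command that prints the request's current status
  attempts=${2:-10}       # how many times to poll before giving up
  interval=${3:-1}        # seconds to sleep between polls
  while [ "$attempts" -gt 0 ]; do
    status=$($status_cmd)
    case "$status" in
      COMPLETED|FAILED|NOTFOUND)
        echo "$status"
        return 0
        ;;
    esac
    attempts=$((attempts - 1))
    sleep "$interval"
  done
  echo TIMEOUT
  return 1
}

# Hypothetical usage, if op grew the proposed flags:
#   request_id=$(op delete item "$uuid" --requestid)
#   wait_for_request "op query $request_id"
```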

    Thoughts?

  • cohix

    Team Member

    @HippoMan I think this is less about async operations and more about the cache. There is a very short delay between an item operation being performed and our data cache being re-filled. All operations on the server (more or less) are "synchronous", i.e. when the API call returns, the operation is done. The issue is the cache: sometimes it isn't as fast as the CLI.

  • HippoMan
    edited August 2018

    So the CLI operation is complete, but it might take a short time before it appears complete, because the cache may not have been updated yet ... am I understanding correctly now?

    Even in this case, this cache update delay can cause subsequent CLI commands to fail, as I described in my original post in this thread, and this still is a race condition.

    Is there any way we can query the cache status or issue some sort of command that will wait until that status is updated?

    If not, in a future op release, can we have some sort of op wait-for-cache-update command? ... or perhaps op cache-status which can return COMPLETE or IN-PROGRESS, or something like that?

  • HippoMan
    edited August 2018

    Or perhaps could I query some piece of data in any of the files in the $TMPDIR/com.agilebits.op.UID directory in order to see if the cache is up to date? ... perhaps the accessed attribute?

  • cohix

    Team Member

    Since the cache is on the server side, I'm not sure that a wait for cache command is viable, but I'm investigating ways we can mitigate this issue.

  • HippoMan
    edited August 2018

    Thank you. I can work around this issue, but because it's a race condition, my workarounds are not foolproof, although they do lower the probability of it occurring.
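One shape such a workaround can take, again only probabilistic as noted, is to retry the read with increasing delays until the stale copy stops showing up. This is a sketch: how to count the matching items would depend on op's output format, so the counting command is left as a parameter:

```shell
# Retry a "how many items match?" command with doubling delays until it
# reports exactly one match, or give up. Not foolproof: it only makes
# hitting the stale cache less likely, as discussed in this thread.
retry_until_single() {
  count_cmd=$1            # command that prints the current match count
  attempts=${2:-5}
  delay=${3:-1}           # initial delay in seconds; doubles each retry
  while [ "$attempts" -gt 0 ]; do
    if [ "$($count_cmd)" -eq 1 ]; then
      return 0            # cache has settled on a single item
    fi
    sleep "$delay"
    delay=$((delay * 2))
    attempts=$((attempts - 1))
  done
  return 1                # still ambiguous after all attempts
}
```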

    It seems to me that my August 8th idea, above, might be a workable way to fix this race condition. It would be a simple procedure to generate some sort of UID that is associated with every request, and to optionally return it to the caller if he or she asks for it via a new command-line parameter, perhaps named --request-id.

    Then, the only remaining work would be to make that UID be part of the data that gets cached, and to then write one new op command to access the server in order to let the caller know the status of the request associated with any given UID.

    People who don't care about the race condition would be able to use the CLI exactly as they do today, and only if someone wants to prevent a race condition would he or she use --request-id and then issue the new op command to see the status associated with a previous command's UID.

    Of course, I'm making this suggestion without knowing the details of your application's implementation. My methodology might turn out to be totally non-feasible. :)

    I'm glad to help with this effort in any way I can.

  • cohix

    Team Member

    Thanks :) We do have request IDs already, but those IDs are not associated with cache operations currently. I'm going to discuss with the team to see if there's anything we can do here.

  • HippoMan
    edited January 19

    Hi after nearly six months!

    I am revisiting this issue, and I'm wondering if this race condition has been fixed in any CLI releases since I started this thread. Thank you.

  • cohix

    Team Member

    @HippoMan We have made some server tweaks there, so you can go ahead and test to see if the race condition is resolved.
