
Timeout for streaming API calls needed #376

Open
twaltersp opened this issue Aug 22, 2024 · 3 comments

Comments

@twaltersp

twaltersp commented Aug 22, 2024

We are trying to use gRPC for streaming between systems; however, we need to assume the "other side" may misbehave. As a specific example:

  • A server is streaming measurement data to a client. The stream is expected to run for a long time (minutes to hours). The data comes from a DAQ task, so we need to keep up with the task rate. In this example the DAQ data is both being sent to a client and used locally.
  • The client establishes a connection to the server and the server starts streaming the data.
  • The client starts to run behind in reading data from the streaming API. This causes the buffer on the server to fill.
  • Once the buffer is full, the server side blocks in the DLL waiting for room in the buffer, which blocks the LabVIEW loop. In the worst case, where the client has hung, the only way I have found to stop the loop is to have a parallel loop abort the entire gRPC server, which is obviously not ideal.

A similar problem exists on the client side if the server hangs. The exposed timeout for the client applies to the lifetime of the connection, not to an individual read call. Since this is a long-duration stream (minutes to hours), I need to set that timeout very long or to -1. On the client, since it is blocked in LabVIEW waiting on an occurrence, I can modify the client code to set a timeout for the occurrence wait.

The server code is blocked in the DLL, so I can't work around the issue in LabVIEW. I need a way to either know that my buffer is filling so I can stop sending data, or (better yet) a timeout on the streaming write call so I can handle the full buffer. This is similar to the TCP functions in LabVIEW.

AB#2837407

@j-medland
Contributor

I think this would involve implementing a custom C++ interceptor that monitors each outgoing message, checks whether the timeout for that message has passed, and cancels the message and/or closes the stream.

There is a test for interceptors (which adds logging for each message) in the grpc repo to give you an idea of what would be involved. This would be a good chunk of C++ development and testing, and would require ongoing maintenance as the C++ interceptor API stabilizes (it is currently experimental):

https://github.com/grpc/grpc/blob/v1.44.0/test/cpp/end2end/server_interceptors_end2end_test.cc

@twaltersp
Author

@j-medland From your comments I'm guessing you are not a fan of that option. :)

Is there a simpl(er) way to monitor how much room remains in the buffer so we know when a block is coming and we can pause ourselves?

Right now the best option I've come up with is to implement another stream back from the client reporting the last message number it received, and manually manage backpressure that way, but it adds a lot of complexity since the return stream is asynchronous.

@j-medland
Contributor

I am open to anything, but I am just a library user, not a maintainer.

Your suggestion of checking whether there is room in the buffer helped me stumble across a section in the user guide that alludes to some flow-control concepts. It links to notes on checking whether the underlying layer is ready to send the next message in the Java version of gRPC, but nothing for the C++ version.

There is some discussion of backpressure and C++ gRPC in this Stack Overflow post, and my naive understanding is that by choking the sender, the system is sort of working; it's just that the way the LabVIEW service is architected, it blocks the whole service.

It might be possible to add an asynchronous wrapper around the send call (either on the C++ or LabVIEW side) with a timeout. I assume that if the spawned thread is blocked by a send call, then another thread cancelling the stream will cause that blocked call to fail and thus unblock, so you can release resources gracefully. On second thoughts, I don't know whether you can asynchronously launch a .vim, so it would probably be best to handle it in the C++ code. Hmmmm

So, I am afraid I don't have an answer but it is certainly an interesting topic.
