My first assumption was that I have to implement an interface that is list-like (in order to receive a list of `np.ndarray`). In that method I log the length of the incoming list, and no matter how many concurrent requests come in, the length never changes.

After that I wanted to discover how batching works, and in `bentoml._internal.runner.runner_handle.local.py` I found the class `LocalRunnerRef` with this implementation:
```python
class LocalRunnerRef(RunnerHandle):
    def __init__(self, runner: Runner) -> None:  # pylint: disable=super-init-not-called
        self._runnable = runner.runnable_class(**runner.runnable_init_params)  # type: ignore
        self._limiter = None

    async def is_ready(self, timeout: int) -> bool:
        return True

    def run_method(
        self,
        __bentoml_method: RunnerMethod[t.Any, P, R],
        *args: P.args,
        **kwargs: P.kwargs,
    ) -> R:
        if __bentoml_method.config.batchable:
            inp_batch_dim = __bentoml_method.config.batch_dim[0]
            payload_params = Params[Payload](*args, **kwargs).map(
                lambda arg: AutoContainer.to_payload(arg, batch_dim=inp_batch_dim)
            )
            if not payload_params.map(lambda i: i.batch_size).all_equal():
                raise ValueError(
                    "All batchable arguments must have the same batch size."
                )
        return getattr(self._runnable, __bentoml_method.name)(*args, **kwargs)
```
I don't understand why `payload_params` is not passed to the method invocation, if I am right in my assumption that `getattr(self._runnable, __bentoml_method.name)(*args, **kwargs)` invokes the custom Runnable's `predict` method.
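As far as I can tell (this is my reading of the code, so treat it as an assumption), in this local in-process path the payload conversion is used only to validate that all batchable arguments agree on batch size; the original `*args`/`**kwargs` are then passed through unchanged, and no cross-request fusing happens here. A toy re-creation of just that validation step, using plain NumPy arrays in place of BentoML's `Params`/`AutoContainer` machinery:

```python
import numpy as np


def batch_sizes(args: list[np.ndarray], batch_dim: int = 0) -> list[int]:
    # Stand-in for mapping each argument to a payload and reading its
    # batch_size along the configured batch dimension.
    return [a.shape[batch_dim] for a in args]


def all_equal(xs: list[int]) -> bool:
    # Stand-in for payload_params.map(...).all_equal().
    return len(set(xs)) <= 1


a = np.zeros((4, 3))
b = np.zeros((4, 5))
c = np.zeros((2, 3))

assert all_equal(batch_sizes([a, b]))      # both batch size 4: check passes
assert not all_equal(batch_sizes([a, c]))  # 4 vs 2: run_method would raise ValueError
```

So the check guards against mismatched batchable inputs, but the actual adaptive batching (fusing requests from different callers) would have to happen elsewhere, before `run_method` is reached.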
I have been getting into BentoML these last few days and can't fully understand how adaptive batching works. I want to create a custom runner with batching enabled, but I don't see any benefit from `batchable=True`. For example, https://github.com/bentoml/gallery/tree/main/custom_runner/torch_hub_yolov5 doesn't behave as I expected; I was trying to measure an RPM/response-time improvement using the Locust framework.

I also tried to run https://github.com/bentoml/gallery/tree/main/pytorch_yolov5_torchhub with different configs in the bento_configuration.yaml file, just switching `enabled:` to true or false, and didn't see any difference at all.
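For reference, the per-runner batching section I was toggling looks roughly like this (a sketch based on my understanding of the BentoML 1.x configuration schema; the runner name `yolo_v5` and the exact field values are placeholders to verify against the docs for your version):

```yaml
runners:
  yolo_v5:
    batching:
      enabled: true
      max_batch_size: 64
      max_latency_ms: 10000
```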