[serve] `max_ongoing_requests` limited by `max_concurrency` in actor #47681

aRyBernAlTEglOTRO · 2024-09-16T09:58:26Z

What happened + What you expected to happen

The Bug: the max_ongoing_requests parameter in @serve.deployment has no effect when it is larger than 1000.
Expected Behavior: max_ongoing_requests takes effect even when it is larger than 1000.

Versions / Dependencies

Ray: 2.35.0
Python: 3.11.8
OS: Ubuntu 22.04.4 LTS

Reproduction script

Reproducible Script:

from ray import serve
from ray.serve.handle import DeploymentHandle
import asyncio

@serve.deployment(max_ongoing_requests=4096)
class Model:
    @serve.batch(max_batch_size=2048, batch_wait_timeout_s=2)
    async def __call__(self, ls: list[int]) -> list[int]:
        print(f"Length of input list: {len(ls)}")
        return ls

async def main() -> None:
    handle: DeploymentHandle = serve.run(Model.bind())
    await asyncio.gather(*[handle.remote(i) for i in range(2048)])

if __name__ == "__main__":
    asyncio.run(main())

Expected Output:

Length of input list: 2048

Actual Output:

Length of input list: 1000
Length of input list: 1000
Length of input list: 48

Issue Severity

Low: It annoys or frustrates me.


aRyBernAlTEglOTRO · 2024-09-16T10:11:27Z

I think the issue is caused by the max_concurrency limit on the actor, which defaults to 1000. A quick fix is to add "max_concurrency" to allowed_ray_actor_options in the following script:

ray/python/ray/serve/_private/config.py

Line 537 in 1c80db5

allowed_ray_actor_options = {

and modify RayActorOptionsSchema in the following script to add support for max_concurrency.

ray/python/ray/serve/schema.py

Line 190 in 1c80db5

class RayActorOptionsSchema(BaseModel):

But I think a better way is to align max_ongoing_requests in DeploymentConfig with max_concurrency in the Ray actor, because they seem to share the same intent, though that would need more code changes.
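
To illustrate why a concurrency cap of 1000 would produce the batch sizes in the report, here is a minimal sketch that uses only asyncio and no Ray code. Everything in it is an assumption made for illustration: simulate_batches, gate, and batcher are hypothetical names, the semaphore only stands in for the actor's max_concurrency, and the batcher is a simplified model of @serve.batch, not its actual implementation.

```python
import asyncio


async def simulate_batches(total: int, cap: int, max_batch_size: int,
                           wait_s: float = 0.05) -> list[int]:
    """Return the batch sizes seen when at most `cap` requests run at once."""
    gate = asyncio.Semaphore(cap)  # stand-in for the actor's max_concurrency
    queue: asyncio.Queue = asyncio.Queue()  # requests waiting to be batched
    sizes: list[int] = []

    async def request() -> None:
        # A request occupies one concurrency slot until its batch completes.
        async with gate:
            done = asyncio.get_running_loop().create_future()
            queue.put_nowait(done)
            await done

    def drain(batch: list) -> None:
        # Pull already-queued requests into the batch without waiting.
        while len(batch) < max_batch_size and not queue.empty():
            batch.append(queue.get_nowait())

    async def batcher() -> None:
        handled = 0
        while handled < total:
            batch = [await queue.get()]
            drain(batch)
            if len(batch) < max_batch_size:
                # Batch not full: wait out the timeout, then take late arrivals.
                await asyncio.sleep(wait_s)
                drain(batch)
            sizes.append(len(batch))
            for done in batch:
                done.set_result(None)  # finishing a request frees its slot
            handled += len(batch)

    await asyncio.gather(batcher(), *(request() for _ in range(total)))
    return sizes


if __name__ == "__main__":
    # Cap below the batch size: batches are limited by the cap.
    print(asyncio.run(simulate_batches(2048, cap=1000, max_batch_size=2048)))
    # Cap above the request count: one full batch.
    print(asyncio.run(simulate_batches(2048, cap=4096, max_batch_size=2048)))
```

With the numbers from the report, the sketch yields batches of 1000, 1000 and 48 when the cap is 1000, and a single batch of 2048 when the cap is 4096. This matches the actual and expected outputs above, which is consistent with the guess that the actor-level limit, not max_ongoing_requests, bounds the batch.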

aRyBernAlTEglOTRO added the bug and triage labels Sep 16, 2024

anyscalesam added the serve label Sep 16, 2024
