[serve] max_ongoing_requests limited by max_concurrency in actor #47681

Open
aRyBernAlTEglOTRO opened this issue Sep 16, 2024 · 1 comment
Labels
bug Something that is supposed to be working; but isn't serve Ray Serve Related Issue triage Needs triage (eg: priority, bug/not-bug, and owning component)

Comments

@aRyBernAlTEglOTRO

What happened + What you expected to happen

  1. The Bug: The max_ongoing_requests parameter in @serve.deployment is not honored when it is set above 1000; the replica appears to handle at most 1000 requests at a time.
  2. Expected Behavior: max_ongoing_requests takes effect even when it is larger than 1000.

Versions / Dependencies

  • Ray: 2.35.0
  • Python: 3.11.8
  • OS: Ubuntu 22.04.4 LTS

Reproduction script

Reproducible Script:

from ray import serve
from ray.serve.handle import DeploymentHandle
import asyncio

@serve.deployment(max_ongoing_requests=4096)
class Model:
    @serve.batch(max_batch_size=2048, batch_wait_timeout_s=2)
    async def __call__(self, ls: list[int]) -> list[int]:
        print(f"Length of input list: {len(ls)}")
        return ls

async def main() -> None:
    handle: DeploymentHandle = serve.run(Model.bind())
    await asyncio.gather(*[handle.remote(i) for i in range(2048)])

if __name__ == "__main__":
    asyncio.run(main())

Expected Output:

Length of input list: 2048

Actual Output:

Length of input list: 1000
Length of input list: 1000
Length of input list: 48
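
The split above is what a hard cap of 1000 in-flight calls would produce. The following is a toy model in plain Python, not Ray code: it assumes the replica admits at most `concurrency_cap` calls at once (1000 here, matching the actor max_concurrency default suggested as the cause in the comment below) and that each batch flushes with whatever was admitted. The function name and parameters are made up for illustration.

```python
def simulate_batch_sizes(
    total_requests: int, max_batch_size: int, concurrency_cap: int
) -> list[int]:
    """Toy model of batch sizes when in-flight calls are capped.

    Assumption (not taken from Ray's source): a batch can never hold more
    requests than the replica admits concurrently, so the effective batch
    size is min(max_batch_size, concurrency_cap).
    """
    if total_requests < 0 or max_batch_size <= 0 or concurrency_cap <= 0:
        raise ValueError("sizes must be positive and total_requests non-negative")

    effective_batch_size = min(max_batch_size, concurrency_cap)
    sizes: list[int] = []
    remaining = total_requests
    while remaining > 0:
        # Each round drains as many pending requests as one batch can hold.
        size = min(remaining, effective_batch_size)
        sizes.append(size)
        remaining -= size
    return sizes


# Cap of 1000, as hypothesized: reproduces the actual output.
print(simulate_batch_sizes(2048, 2048, 1000))  # [1000, 1000, 48]
# Cap raised to 4096: reproduces the expected output.
print(simulate_batch_sizes(2048, 2048, 4096))  # [2048]
```

Under this model, the observed 1000 / 1000 / 48 split is consistent with a 1000-call limit sitting below max_ongoing_requests=4096, though it does not prove where that limit comes from.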

Issue Severity

Low: It annoys or frustrates me.

@aRyBernAlTEglOTRO aRyBernAlTEglOTRO added bug Something that is supposed to be working; but isn't triage Needs triage (eg: priority, bug/not-bug, and owning component) labels Sep 16, 2024
@aRyBernAlTEglOTRO
Author

I think the issue is caused by the actor's max_concurrency limit, which defaults to 1000. A quick fix is to add "max_concurrency" to allowed_ray_actor_options in the following script:

allowed_ray_actor_options = {

and modify RayActorOptionsSchema in the following script to add support for max_concurrency:

class RayActorOptionsSchema(BaseModel):

But I think a better approach is to align max_ongoing_requests in DeploymentConfig with max_concurrency in the Ray actor, because they seem to share the same intent. That would require more code changes, though.
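
To make the alignment idea concrete, here is a minimal sketch in plain Python. It is not Ray Serve code: `align_actor_options` and `DEFAULT_ASYNC_ACTOR_MAX_CONCURRENCY` are hypothetical names, and the rule it applies (raise max_concurrency to at least max_ongoing_requests, never lower it) is only one possible way to keep the two settings consistent.

```python
from typing import Any, Optional

# Assumption: the default concurrency limit reported in this issue.
DEFAULT_ASYNC_ACTOR_MAX_CONCURRENCY = 1000


def align_actor_options(
    max_ongoing_requests: int,
    ray_actor_options: Optional[dict[str, Any]] = None,
) -> dict[str, Any]:
    """Return a copy of the actor options with max_concurrency raised so it
    is never below max_ongoing_requests (hypothetical helper)."""
    if max_ongoing_requests <= 0:
        raise ValueError("max_ongoing_requests must be positive")

    # Copy so the caller's dict is left untouched.
    options = dict(ray_actor_options or {})
    current = options.get("max_concurrency", DEFAULT_ASYNC_ACTOR_MAX_CONCURRENCY)
    # Only ever raise the limit; an explicitly larger value is preserved.
    options["max_concurrency"] = max(current, max_ongoing_requests)
    return options


print(align_actor_options(4096))
# {'max_concurrency': 4096}
print(align_actor_options(100, {"num_cpus": 1}))
# {'num_cpus': 1, 'max_concurrency': 1000}
```

With a rule like this, users would not need to set max_concurrency themselves, which avoids exposing a second knob that can silently contradict max_ongoing_requests.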

@anyscalesam anyscalesam added the serve Ray Serve Related Issue label Sep 16, 2024