This Pull Request (PR) incorporates various convenience improvements and code refactoring. The main changes are as follows:

Dependencies are installed automatically. By providing the `--install-pkgs` option when running the server, the packages of this project as well as those of all related repositories are installed. This process includes detecting the installed CUDA version and installing the matching PyTorch build, along with TensorFlow. Please refer to the README for more details. The pytest package is no longer required, since the tests now run with unittest instead of pytest.
The docker-compose file is configured to pull the prebuilt Docker image from Docker Hub instead of building it locally.
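A compose file that pulls a prebuilt image rather than building locally typically looks like the following sketch; the image name and port are hypothetical placeholders, not the repository's actual values.

```yaml
services:
  server:
    # Hypothetical image name -- substitute the actual Docker Hub repository.
    # Using `image:` without a `build:` key makes compose pull from the registry.
    image: example-org/model-server:latest
    ports:
      - "8000:8000"
```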
The poetry dependency is included in pyproject.toml. However, installing dependencies directly with poetry is not recommended. When the server is run, the toml file is converted to a requirements.txt file, and the necessary packages are installed via pip install.
Strictly speaking, semaphores are not needed, because concurrent use of the model is already limited by the scheduling of the process pool's workers. However, semaphores make it possible to build a queue that lets the worker scheduler efficiently reuse the model cached in an existing worker, so the feature has been retained.

This PR has already passed all test suites in a Python 3.11 / Windows 11 / CUDA 12.2 environment and will be merged automatically after an appropriate code review.
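The semaphore-plus-queue idea described above can be sketched as follows: a semaphore caps concurrent model use, while a deque of idle worker ids lets the next task prefer the most recently used worker, which is the one most likely to still hold the model in cache. The class and its name are an illustrative assumption, not the PR's actual scheduler.

```python
import threading
from collections import deque

class CachedWorkerScheduler:
    """Hypothetical sketch of the retained semaphore/queue mechanism."""

    def __init__(self, worker_ids):
        # The semaphore's count equals the pool size, so it also bounds
        # concurrent model use -- redundantly with the pool, as the PR notes.
        self._sem = threading.Semaphore(len(worker_ids))
        self._lock = threading.Lock()
        self._idle = deque(worker_ids)  # most recently used worker on the right

    def acquire(self) -> int:
        self._sem.acquire()  # block when every worker is busy
        with self._lock:
            return self._idle.pop()  # prefer the warmest (most recent) worker

    def release(self, worker_id: int) -> None:
        with self._lock:
            self._idle.append(worker_id)  # back to the warm end of the queue
        self._sem.release()

sched = CachedWorkerScheduler([0, 1, 2])
w = sched.acquire()
print(w)  # -> 2 (taken from the warm end of the deque)
sched.release(w)
```

Popping and appending on the same end of the deque is what gives the cache-friendly behavior: a worker released a moment ago is handed the next task instead of a cold one.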