Dev update (23.8.17.) #4

Merged
merged 15 commits into from
Aug 17, 2023
@c0sogi c0sogi commented Aug 17, 2023

🚀 This PR introduces a series of improvements aimed at enhancing user experience and refining the codebase. Here's a breakdown of the changes:


🌟 1. Exllama Module - LoRA Integration

  • By placing adapter_config.json and adapter_model.bin in the ./models/gptq/YOUR_MODEL directory, the system will now seamlessly initialize LoRA.
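As a rough illustration of the detection step described above, the check could look like the sketch below. The function name `find_lora_dir` is hypothetical and not taken from this repository; only the two file names come from the description.

```python
from pathlib import Path
from typing import Optional

# The two adapter file names mentioned in this PR description.
LORA_FILES = ("adapter_config.json", "adapter_model.bin")


def find_lora_dir(model_dir: Path) -> Optional[Path]:
    """Return model_dir if both LoRA adapter files are present, else None.

    Hypothetical helper: the actual Exllama module may detect LoRA differently.
    """
    if all((model_dir / name).is_file() for name in LORA_FILES):
        return model_dir
    return None
```

For example, `find_lora_dir(Path("./models/gptq/YOUR_MODEL"))` would return the directory only when both files exist, so a model without an adapter loads as before.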

🔗 2. OpenAI Logit Bias Support

  • For API queries to models listed in the openai_replacement_models dictionary, OpenAI token IDs in logit_bias are automatically converted to Llama token IDs with the help of the Tiktoken tokenizer.
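One plausible way to perform such a conversion is to decode each OpenAI token ID to text and re-encode that text with the Llama tokenizer. The sketch below assumes this approach; it is not confirmed to be what this PR does. The function name and both tokenizer callables are hypothetical and are injected so the example stays self-contained.

```python
from typing import Callable, Dict, List


def convert_logit_bias(
    openai_bias: Dict[str, float],
    decode_openai: Callable[[List[int]], str],
    encode_llama: Callable[[str], List[int]],
) -> Dict[int, float]:
    """Map OpenAI-style logit_bias (token ID string -> bias) to Llama token IDs.

    Assumption: each OpenAI token is decoded to text, and every Llama token
    produced by re-encoding that text receives the same bias value.
    """
    llama_bias: Dict[int, float] = {}
    for openai_id, bias in openai_bias.items():
        text = decode_openai([int(openai_id)])
        for llama_id in encode_llama(text):
            llama_bias[llama_id] = bias
    return llama_bias
```

With Tiktoken, `decode_openai` could be the `decode` method of the encoding returned by `tiktoken.encoding_for_model(...)`, and `encode_llama` would wrap whichever Llama tokenizer the server has loaded. Because one OpenAI token may split into several Llama tokens, the mapping is approximate.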

3. Optimized Worker Load Balancing

  • The load balancing algorithm for workers in the process pool has been reworked. Clients are now assigned to workers based on the computed worker_rank, and when ranks tie, a worker is chosen at random.
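The selection rule can be sketched as follows. This assumes that a lower worker_rank means a more preferable worker, which the description does not state; the function name `pick_worker` and the dictionary layout are hypothetical.

```python
import random
from typing import Dict, Hashable


def pick_worker(ranks: Dict[Hashable, float]) -> Hashable:
    """Pick the worker with the best rank, breaking ties at random.

    Assumption: a lower rank value is better. Flip min to max if the
    real worker_rank is ordered the other way.
    """
    if not ranks:
        raise ValueError("no workers available")
    best = min(ranks.values())
    candidates = [worker for worker, rank in ranks.items() if rank == best]
    # random.choice on a single candidate simply returns that candidate.
    return random.choice(candidates)
```

Breaking ties randomly keeps idle workers with equal rank from being starved by whichever one happens to be listed first.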

📜 4. Enhanced Logging Mechanism

  • Log messages are now clearer. In addition, user prompts and model responses from Chat Completion and Text Completion requests are recorded in logs/chat.log.
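A minimal sketch of this kind of chat logging with Python's standard logging module is shown below. The logger name, message format, and helper names are assumptions for illustration; only the logs/chat.log path comes from the description.

```python
import logging
from pathlib import Path


def get_chat_logger(log_path: Path = Path("logs/chat.log")) -> logging.Logger:
    """Return a logger that appends to log_path (hypothetical helper)."""
    log_path.parent.mkdir(parents=True, exist_ok=True)
    logger = logging.getLogger("chat")  # logger name is an assumption
    logger.setLevel(logging.INFO)
    if not logger.handlers:  # avoid attaching duplicate handlers
        handler = logging.FileHandler(log_path, encoding="utf-8")
        handler.setFormatter(logging.Formatter("%(asctime)s %(message)s"))
        logger.addHandler(handler)
    logger.propagate = False  # keep chat content out of the root logger
    return logger


def log_exchange(logger: logging.Logger, prompt: str, response: str) -> None:
    """Record one prompt/response pair on two lines."""
    logger.info("[user] %s", prompt)
    logger.info("[assistant] %s", response)
```

Keeping chat content on a separate, non-propagating logger means the regular server logs stay short while full conversations remain available in one file.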

🔥 5. Docker Image Upgrades

  • The previous Docker image relied on the CPU version of llama.cpp, which cannot use CUDA acceleration. Because the CUDA compiler cannot readily be used during the build phase, the image now relies on JIT compilation so that the build happens automatically.

@c0sogi c0sogi merged commit 023fb40 into master Aug 17, 2023
14 checks passed