-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Exercise Chat
: Implement native function calling agent
#154
base: main
Are you sure you want to change the base?
Conversation
…e-chat/submission-failure-event
WalkthroughThe changes encompass a comprehensive enhancement of the message conversion system, the introduction of new data transfer objects (DTOs), and modifications to existing classes and methods across multiple files. Key updates include the addition of new message types, improved handling of tool interactions, and the integration of new prompts for educational chat agents. The overall architecture has been refined to accommodate these new functionalities and improve data validation and processing. Changes
Recent review detailsConfiguration used: CodeRabbit UI Files ignored due to path filters (1)
Files selected for processing (30)
Additional comments not posted (50)
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 11
Outside diff range, codebase verification and nitpick comments (3)
app/domain/data/programming_exercise_dto.py (1)
34-38
:recent_changes
field lacks integration.The
recent_changes
field is defined inprogramming_exercise_dto.py
but is not currently integrated with version control or change tracking systems. Consider implementing functionality to utilize this field effectively for tracking Git diffs and changes.
- Location:
app/domain/data/programming_exercise_dto.py
Analysis chain
Addition of
recent_changes
enhances transparency.The new
recent_changes
field, which captures the Git diff of recent changes, is a valuable addition for tracking updates. Ensure that this field is properly integrated with version control and change tracking systems.Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Description: Verify integration of `recent_changes` with version control systems. # Test: Search for handling of `recent_changes`. Expect: Proper integration with version control systems. rg --type python -A 5 $'recent_changes'Length of output: 69
Script:
#!/bin/bash # Description: Verify integration of `recent_changes` with version control systems. # Test: Search for handling of `recent_changes`. Expect: Proper integration with version control systems. rg --type py -A 5 $'recent_changes'Length of output: 410
Script:
#!/bin/bash # Description: Search for references to `recent_changes` across the codebase to verify its integration. # Test: Find all occurrences of `recent_changes` to check for integration with version control systems. rg 'recent_changes'Length of output: 106
app/domain/data/exercise_with_submissions_dto.py (1)
37-37
: Field addition approved; consider reviewing the alias usage.The addition of the
url
field with a default value ofNone
is well-implemented and ensures backward compatibility. However, the use of an alias that directly matches the field name might be unnecessary unless there are specific serialization requirements that necessitate this approach.app/common/message_converters.py (1)
Line range hint
28-47
: Refactor suggestion for handlingPyrisAIMessage
andAIMessage
conversions.The function
convert_iris_message_to_langchain_message
has been updated to handlePyrisAIMessage
specifically, which includes processing tool calls. This is a significant enhancement for supporting interactive tool functionalities. However, the current implementation could be improved for clarity and maintainability:
- Extract Tool Call Conversion: The conversion logic for tool calls (lines 39-46) could be extracted into a separate function. This would make the
convert_iris_message_to_langchain_message
function cleaner and more focused on its primary responsibility.- Error Handling: Consider adding more specific error messages or custom exceptions for better debugging and user feedback.
|
||
|
||
class ExerciseChatPipelineExecutionDTO(ChatPipelineExecutionDTO): | ||
submission: Optional[ProgrammingSubmissionDTO] = None | ||
exercise: ProgrammingExerciseDTO | ||
course: CourseDTO | ||
event_payload: Optional[PyrisEventDTO[Any]] = Field(None, alias="eventPayload") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
New attribute event_payload
is well-integrated.
The addition of event_payload
with an alias for serialization purposes is well-implemented. It would be beneficial to add documentation to explain its usage and relevance in the context of the application.
Would you like me to draft the documentation for this new attribute?
class PyrisToolMessage(PyrisMessage): | ||
model_config = ConfigDict(populate_by_name=True) | ||
sender: IrisMessageRole = IrisMessageRole.TOOL | ||
contents: List[ToolMessageContentDTO] = Field(default=[]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Specialization in PyrisToolMessage
is appropriate.
The setting of sender
to TOOL
and the specific structure for contents
are well-suited for messages from tools. It would be beneficial to add documentation detailing the specific use cases and expected content structure for tool messages.
Would you like me to draft the documentation for this class?
@@ -30,3 +30,9 @@ class ProgrammingExerciseDTO(BaseModel): | |||
problem_statement: str = Field(alias="problemStatement", default=None) | |||
start_date: Optional[datetime] = Field(alias="startDate", default=None) | |||
end_date: Optional[datetime] = Field(alias="endDate", default=None) | |||
max_points: Optional[float] = Field(alias="maxPoints", default=None) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Addition of max_points
is beneficial.
The new max_points
field adds valuable metrics for grading. Consider adding validation to ensure that the value is non-negative.
# Suggested validation to ensure non-negative values
if self.max_points is not None and self.max_points < 0:
raise ValueError("max_points must be non-negative")
tools: Sequence[ | ||
Union[Dict[str, Any], Type[LegacyBaseModel], Callable, BaseTool] | ||
] = [] | ||
_client: OpenAI | ||
|
||
def chat( | ||
self, messages: list[PyrisMessage], arguments: CompletionArguments | ||
) -> PyrisMessage: | ||
print("Sending messages to OpenAI", messages) | ||
# noinspection PyTypeChecker | ||
retries = 10 | ||
backoff_factor = 2 | ||
initial_delay = 1 | ||
|
||
for attempt in range(retries): | ||
try: | ||
if arguments.response_format == "JSON": | ||
response = self._client.chat.completions.create( | ||
model=self.model, | ||
messages=convert_to_open_ai_messages(messages), | ||
temperature=arguments.temperature, | ||
max_tokens=arguments.max_tokens, | ||
response_format=ResponseFormatJSONObject(type="json_object"), | ||
) | ||
if self.tools: | ||
response = self._client.chat.completions.create( | ||
model=self.model, | ||
messages=convert_to_open_ai_messages(messages), | ||
temperature=arguments.temperature, | ||
max_tokens=arguments.max_tokens, | ||
response_format=ResponseFormat(type="json_object"), | ||
tools=self.tools, | ||
) | ||
else: | ||
response = self._client.chat.completions.create( | ||
model=self.model, | ||
messages=convert_to_open_ai_messages(messages), | ||
temperature=arguments.temperature, | ||
max_tokens=arguments.max_tokens, | ||
response_format=ResponseFormat(type="json_object"), | ||
) | ||
else: | ||
response = self._client.chat.completions.create( | ||
model=self.model, | ||
messages=convert_to_open_ai_messages(messages), | ||
temperature=arguments.temperature, | ||
max_tokens=arguments.max_tokens, | ||
) | ||
if self.tools: | ||
response = self._client.chat.completions.create( | ||
model=self.model, | ||
messages=convert_to_open_ai_messages(messages), | ||
temperature=arguments.temperature, | ||
max_tokens=arguments.max_tokens, | ||
tools=self.tools, | ||
) | ||
else: | ||
response = self._client.chat.completions.create( | ||
model=self.model, | ||
messages=convert_to_open_ai_messages(messages), | ||
temperature=arguments.temperature, | ||
max_tokens=arguments.max_tokens, | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Complex logic in chat
method of OpenAIChatModel
needs attention.
The chat
method in OpenAIChatModel
has been updated to handle tool integrations. The method is complex and handles multiple scenarios:
- Refactor Suggestion: Consider refactoring to reduce complexity. Splitting the method into smaller parts or using helper functions for handling retries and response construction could improve clarity.
- Error Handling: The method includes error handling and retry logic, which is good. However, ensure that all potential error paths are covered, especially with the new tool integration.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 8
Outside diff range, codebase verification and nitpick comments (9)
app/domain/chat/exercise_chat/exercise_chat_pipeline_execution_dto.py (1)
16-16
: Well-implemented attribute addition.The
event_payload
attribute is correctly implemented with an alias for serialization purposes. Consider adding a comment explaining the purpose of this attribute for future maintainability.app/domain/pyris_message.py (2)
31-34
: Well-designed subclass for AI messages.
PyrisAIMessage
is appropriately designed for AI-generated messages. Consider adding documentation for thetool_calls
attribute to clarify its usage.
37-40
: Clear and functional design for tool messages.
PyrisToolMessage
is well-designed for messages from tools. Adding documentation for thecontents
attribute could enhance clarity and maintainability.app/domain/data/programming_exercise_dto.py (1)
33-38
: Useful additions to the DTO.The new fields
max_points
andrecent_changes
are well-implemented and enhance the functionality of theProgrammingExerciseDTO
. Consider adding validation for themax_points
field to ensure it remains within a sensible range.app/llm/request_handler/capability_request_handler.py (1)
74-81
: Well-implemented method for tool binding.The
bind_tools
method is correctly implemented and integrates well with the existing class structure. The use of type annotations is commendable as it enhances code readability and maintainability.Consider adding error handling or logging within the
bind_tools
method to manage potential issues during the tool binding process, such as when the selected model does not support tool binding or when an invalid tool type is passed.app/llm/external/model.py (1)
53-63
: New abstract methodbind_tools
added.The addition of the
bind_tools
method is a positive enhancement, increasing the extensibility of theLanguageModel
class. It allows for more complex interactions with various tools.Consider providing more detailed documentation or examples on how to implement this method in subclasses, especially given the variety of tool types that can be bound.
app/web/routers/pipelines.py (1)
28-37
: Enhanced flexibility inrun_exercise_chat_pipeline_worker
.The addition of the
variant
parameter and the use ofExerciseChatAgentPipeline
enhance the flexibility and adaptability of the pipeline. This change allows for variant-specific processing, which could lead to improved performance in handling different scenarios.Consider adding more detailed logging or error handling specific to the
variant
parameter to ensure that any issues related to variant-specific configurations are captured and addressed promptly.app/llm/external/ollama.py (2)
Line range hint
83-135
: Approve existing methods and suggest documentation forbind_tools
.The existing methods in the
OllamaModel
class are correctly implemented. However, thebind_tools
method is marked as not implemented. It's important to document its intended future functionality to guide development and integration.Would you like help drafting the documentation for the
bind_tools
method?
Potential Issue with
convert_to_iris_message
FunctionThe
convert_to_iris_message
function inapp/llm/external/ollama.py
correctly maps therole
andcontent
fields from aMessage
to aPyrisMessage
. However, it usesdatetime.now()
for thesend_at
field, which may not preserve the original timestamp from theMessage
. Additionally, other fields in theMessage
are not handled, which could affect data integrity if those fields are important.
- Consider preserving the original timestamp if it is significant.
- Review if additional fields from the
Message
should be included in thePyrisMessage
.Analysis chain
Line range hint
43-66
: Verify correct mapping and data integrity inconvert_to_iris_message
.Ensure that the
convert_to_iris_message
function correctly maps all fields and maintains data integrity during the conversion.Run the following script to verify the correct mapping and data integrity:
Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Description: Verify the correct mapping and data integrity in `convert_to_iris_message`. # Test: Search for the function usage. Expect: Correct mapping and data integrity. rg --type python -A 5 $'convert_to_iris_message'Length of output: 78
Script:
#!/bin/bash # Description: Verify the correct mapping and data integrity in `convert_to_iris_message`. # Test: Search for the function definition and usage. Expect: Correct mapping and data integrity. rg --type py -A 5 $'def convert_to_iris_message'Length of output: 827
Script:
#!/bin/bash # Description: Extract the complete implementation of `convert_to_iris_message` from `app/llm/external/ollama.py`. # Extract the function implementation to verify correct mapping and data integrity. ast-grep --lang python --pattern $'def convert_to_iris_message($_) {\n $$$\n}' app/llm/external/ollama.pyLength of output: 569
Script:
#!/bin/bash # Description: Extract the complete implementation of `convert_to_iris_message` from `app/llm/external/ollama.py`. # Extract the function implementation to verify correct mapping and data integrity. rg --type py -A 20 $'def convert_to_iris_message' app/llm/external/ollama.pyLength of output: 545
class PyrisEventDTO(BaseModel, Generic[T]): | ||
event_type: str = Field(None, alias="eventType") | ||
event: T = Field(None, alias="event") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggest adding explicit default values and improving type annotations.
The PyrisEventDTO
class correctly uses Pydantic models and generics. However, it would benefit from the following improvements:
- Specify explicit default values for
event_type
andevent
to enhance clarity and ensure that these fields are optional. - Improve type annotations by specifying more precise types for
event_type
andevent
.
Consider applying these changes:
- event_type: str = Field(None, alias="eventType")
- event: T = Field(None, alias="event")
+ event_type: Optional[str] = Field(default=None, alias="eventType")
+ event: Optional[T] = Field(default=None, alias="event")
Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
class PyrisEventDTO(BaseModel, Generic[T]): | |
event_type: str = Field(None, alias="eventType") | |
event: T = Field(None, alias="event") | |
class PyrisEventDTO(BaseModel, Generic[T]): | |
event_type: Optional[str] = Field(default=None, alias="eventType") | |
event: Optional[T] = Field(default=None, alias="event") |
class ToolMessageContentDTO(BaseModel): | ||
|
||
model_config = ConfigDict(populate_by_name=True) | ||
name: Optional[str] = Field(alias="toolName", default="") | ||
tool_content: str = Field(alias="toolContent") | ||
tool_call_id: str = Field(alias="toolCallId") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider making model_config
an instance attribute and verify its intended usage.
The ToolMessageContentDTO
class uses Pydantic models effectively. However, the model_config
being a class attribute might not be appropriate unless it is intended to be shared across all instances. Typically, DTOs should not have shared state unless explicitly required.
Consider changing model_config
to an instance attribute or provide clarification on its intended use:
- model_config = ConfigDict(populate_by_name=True)
+ model_config: ConfigDict = Field(default_factory=lambda: ConfigDict(populate_by_name=True))
Committable suggestion was skipped due to low confidence.
class FunctionDTO(BaseModel): | ||
name: str = Field(..., alias="name") | ||
arguments: Json[Any] = Field(..., alias="arguments") | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggest enhancing type safety for arguments
.
While Json[Any]
provides flexibility, it may compromise type safety. Consider specifying more detailed types if possible, or validate the structure of arguments
at runtime to prevent errors.
class ToolCallDTO(BaseModel): | ||
|
||
model_config = ConfigDict(populate_by_name=True) | ||
id: str = Field(alias="id") | ||
type: Literal["function"] = "function" | ||
function: FunctionDTO = Field(alias="function") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider future extensibility for ToolCallDTO
.
The type
field is currently restricted to Literal["function"]
. Consider allowing more types if future extensions are planned, to avoid limiting the DTO's usability.
if message.sender == "TOOL": | ||
match content: | ||
case ToolMessageContentDTO(): | ||
openai_messages.append( | ||
{ | ||
"role": "tool", | ||
"content": content.tool_content, | ||
"tool_call_id": content.tool_call_id, | ||
} | ||
) | ||
case _: | ||
pass | ||
else: | ||
match content: | ||
case ImageMessageContentDTO(): | ||
openai_content.append( | ||
{ | ||
"type": "image_url", | ||
"image_url": { | ||
"url": f"data:image/jpeg;base64,{content.base64}", | ||
"detail": "high", | ||
}, | ||
} | ||
) | ||
case TextMessageContentDTO(): | ||
openai_content.append( | ||
{"type": "text", "text": content.text_content} | ||
) | ||
case JsonMessageContentDTO(): | ||
openai_content.append( | ||
{ | ||
"type": "json_object", | ||
"json_object": content.json_content, | ||
} | ||
) | ||
case _: | ||
pass | ||
|
||
if isinstance(message, PyrisAIMessage) and message.tool_calls: | ||
openai_message = { | ||
"role": map_role_to_str(message.sender), | ||
"content": openai_content, | ||
"tool_calls": [ | ||
{ | ||
"id": tool.id, | ||
"type": tool.type, | ||
"function": { | ||
"name": tool.function.name, | ||
"arguments": json.dumps(tool.function.arguments), | ||
}, | ||
} | ||
for tool in message.tool_calls | ||
], | ||
} | ||
else: | ||
openai_message = { | ||
"role": map_role_to_str(message.sender), | ||
"content": openai_content, | ||
} | ||
openai_messages.append(openai_message) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Refactor: Enhance message conversion logic in convert_to_open_ai_messages
.
The updated logic to handle messages from tools separately is a significant improvement. It allows for the inclusion of tool-specific content in the OpenAI message format, which is crucial for integrating tool interactions into the chat. Ensure that all new branches and conditions are covered by unit tests to verify correct behavior.
Consider refactoring the method to separate concerns more clearly, possibly by extracting the tool message handling into a separate function.
query: PyrisMessage | None = None | ||
if history: | ||
query = dto.chat_history[-1] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated logic in _run_exercise_chat_pipeline
for handling variants:
The method now includes logic to append different system prompts based on the variant
value. This is a robust design choice that tailors the chatbot's responses to specific scenarios like "progress_stalled" or "build_failed". It's crucial to ensure that these variants are well-documented and tested to prevent any unexpected behavior.
Consider adding unit tests to cover these new branches in the logic to ensure they behave as expected under different conditions.
Also applies to: 248-259
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 12
Outside diff range, codebase verification and nitpick comments (8)
app/domain/data/exercise_with_submissions_dto.py (1)
37-37
: Field definition forurl
is correct but consider simplifying the alias.The implementation of the
url
field inExerciseWithSubmissionsDTO
is correct. However, the alias "url" is redundant since it matches the field name. Consider removing the alias for simplicity unless there is a specific reason for its inclusion.app/llm/llm_manager.py (1)
42-42
: Good shift towards validation inload_llms
method.The change to use
model_validate
instead ofparse_obj
in theload_llms
method is a positive step towards ensuring data integrity. Consider adding error handling or logging to monitor the impact of this change on the system.app/llm/request_handler/capability_request_handler.py (1)
74-81
: Lack of Validation fortools
Parameter inbind_tools
MethodThe
bind_tools
method incapability_request_handler.py
does not perform any validation or security checks on thetools
parameter before binding them to the language model. This could lead to potential security vulnerabilities. It is recommended to implement validation checks to ensure that only valid and safe tools are bound to the model.
- File:
app/llm/request_handler/capability_request_handler.py
- Lines: 74-81
Analysis chain
Review the
bind_tools
method implementation.The implementation of
bind_tools
appears correct and aligns with the PR objectives. However, consider adding security checks or validations for thetools
parameter to ensure that only valid tools are bound to the model.Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Description: Verify security checks for the `tools` parameter. # Test: Search for security checks related to `tools`. Expect: Security checks or validations. rg --type python -A 5 $'tools: Sequence[Union[Dict[str, Any], Type[BaseModel], Callable, BaseTool]]'Length of output: 132
Script:
#!/bin/bash # Description: Search for occurrences of `tools` and any associated validation or security checks in the codebase. # Search for occurrences of `tools` in the codebase to identify any validation or security checks. rg 'tools' -A 5Length of output: 27543
app/llm/external/model.py (1)
47-47
: Potential issues with return type change inchat
method.The return type of the
chat
method inapp/llm/external/model.py
has been changed toChatCompletionMessage
, but other implementations and usages still expectPyrisMessage
. This discrepancy could lead to compatibility issues. Ensure that all usages of thechat
method are updated to handle the new return type correctly.
- Files to check:
app/pipeline/lecture_ingestion_pipeline.py
app/pipeline/competency_extraction_pipeline.py
app/llm/external/openai_chat.py
app/llm/external/ollama.py
- Various request handler files
Analysis chain
Verify the impact of the return type change in the
chat
method.The change in return type from
PyrisMessage
toChatCompletionMessage
could affect downstream processing. Verify that all existing usages of thechat
method have been updated to handle the new return type.Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Description: Verify the impact of the return type change on existing code. # Test: Search for usages of the `chat` method. Expect: Updated handling for `ChatCompletionMessage`. rg --type python -A 5 $'chat\('Length of output: 63
Script:
#!/bin/bash # Description: Verify the impact of the return type change on existing code. # Test: Search for usages of the `chat` method. Expect: Updated handling for `ChatCompletionMessage`. rg --type py -A 5 $'chat\('Length of output: 5840
app/llm/external/ollama.py (1)
Line range hint
83-135
: Review: New methodbind_tools
and class structure.The class
OllamaModel
is well-structured and inherits appropriately from related classes. The new methodbind_tools
is designed to raise aNotImplementedError
, indicating it's a placeholder for future functionality. This approach is acceptable as it clearly communicates the current limitations and future intentions.However, consider adding more detailed documentation to the
bind_tools
method to explain its future purpose and expected integration.Would you like help drafting the additional documentation for this method?
app/pipeline/chat/interaction_suggestion_pipeline.py (1)
Line range hint
115-145
: Approved: Streamlined prompt handling and improved error handling.The modifications to the
__call__
method in theInteractionSuggestionPipeline
class simplify the handling of chat history and prompts. The removal of specific prompts related to chat history reduces redundancy and potential confusion. The addition of aValueError
when no last message is provided enhances the robustness of the method by ensuring necessary data is present.Consider adding unit tests to cover these changes, especially the new error handling paths, to ensure they work as expected under various scenarios.
Would you like assistance in creating these unit tests?
app/common/message_converters.py (1)
Line range hint
22-47
: Approved: Expanded message conversion functionality.The function
convert_iris_message_to_langchain_message
has been effectively expanded to handle new message types such asToolMessage
andPyrisToolMessage
. The robust error handling for empty messages and type mismatches ensures reliable operation.Consider adding detailed documentation to describe the handling of these new message types, which will aid in maintenance and future enhancements.
Would you like help with drafting this documentation?
app/pipeline/chat/course_chat_pipeline.py (1)
Line range hint
585-591
: Suggest refining exception handling.The broad exception handling in the
__call__
method could be refined to catch specific exceptions. This would allow for more targeted error responses and recovery actions, improving the robustness and maintainability of the code.Consider handling specific exceptions like
ValueError
orTypeError
separately to provide more informative error messages and recovery options.
|
||
|
||
class PyrisMessage(BaseModel): | ||
model_config = ConfigDict(populate_by_name=True) | ||
|
||
sent_at: datetime | None = Field(alias="sentAt", default=None) | ||
sender: IrisMessageRole | ||
contents: List[MessageContentDTO] = [] | ||
|
||
contents: Union[str, List[MessageContentDTO]] = "" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggest additional validation for the contents
attribute.
Given the change to a union type, ensure that the contents
attribute has adequate validation to handle both str
and List[MessageContentDTO]
types effectively.
Consider implementing a method in the PyrisMessage
class to validate the contents
based on its type before processing.
def bind_tools( | ||
self, | ||
tools: Sequence[Union[Dict[str, Any], Type[BaseModel], Callable, BaseTool]], | ||
) -> LanguageModel: | ||
llm = self.llm_manager.get_llm_by_id(self.model_id) | ||
llm.bind_tools(tools) | ||
return llm |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approve the bind_tools
method with a suggestion.
The implementation of bind_tools
is correct and aligns with the described functionality. However, it would benefit from a detailed docstring explaining the parameters and the expected behavior.
Consider adding a detailed docstring to the bind_tools
method:
def bind_tools(
self,
tools: Sequence[Union[Dict[str, Any], Type[BaseModel], Callable, BaseTool]],
) -> LanguageModel:
"""
Binds a sequence of tools to the language model.
Args:
tools (Sequence[Union[Dict[str, Any], Type[BaseModel], Callable, BaseTool]]): A sequence of tools to be bound.
Returns:
LanguageModel: The language model with tools bound.
"""
llm = self.llm_manager.get_llm_by_id(self.model_id)
llm.bind_tools(tools)
return llm
@abstractmethod | ||
def bind_tools( | ||
self, | ||
tools: Sequence[Union[Dict[str, Any], Type[BaseModel], Callable, BaseTool]], | ||
) -> LanguageModel: | ||
"""Bind tools""" | ||
raise NotImplementedError |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approve the addition of bind_tools
method with a suggestion.
The addition of the bind_tools
method to the RequestHandler
interface is crucial for ensuring consistency across implementations. It would benefit from a detailed docstring explaining the parameters and the expected behavior.
Consider adding a detailed docstring to the bind_tools
method:
def bind_tools(
self,
tools: Sequence[Union[Dict[str, Any], Type[BaseModel], Callable, BaseTool]],
) -> LanguageModel:
"""
Abstract method to bind tools to the language model.
Args:
tools (Sequence[Union[Dict[str, Any], Type[BaseModel], Callable, BaseTool]]): A sequence of tools to be bound.
Returns:
LanguageModel: The language model with tools bound.
"""
raise NotImplementedError
def bind_tools( | ||
self, | ||
tools: Sequence[Union[Dict[str, Any], Type[BaseModel], Callable, BaseTool]], | ||
**kwargs: Any, | ||
) -> Runnable[LanguageModelInput, BaseMessage]: | ||
self.request_handler.bind_tools(tools) | ||
return self |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
New method bind_tools
enhances flexibility.
The addition of the bind_tools
method is a positive enhancement, allowing dynamic integration of various tools. Consider adding unit tests to verify the method's functionality with different types of tools.
Would you like me to help with creating these unit tests?
def convert_langchain_tool_calls_to_iris_tool_calls( | ||
tool_calls: List[ToolCall], | ||
) -> List[ToolCallDTO]: | ||
return [ | ||
ToolCallDTO( | ||
function=FunctionDTO( | ||
name=tc["name"], | ||
arguments=json.dumps(tc["args"]), | ||
), | ||
id=tc["id"], | ||
) | ||
for tc in tool_calls | ||
] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approved: Effective tool call conversion with a note on JSON serialization.
The function convert_langchain_tool_calls_to_iris_tool_calls
effectively converts tool calls for integration, using JSON serialization for arguments. Ensure that this serialization process is thoroughly tested to maintain data integrity and prevent issues in downstream processing.
Would you like assistance in setting up tests for the JSON serialization process?
@tool | ||
def lecture_content_retrieval() -> str: | ||
""" | ||
Retrieve content from indexed lecture slides. | ||
This will run a RAG retrieval based on the chat history on the indexed lecture slides and return the most | ||
relevant paragraphs. | ||
This will run a RAG retrieval based on the chat history on the indexed lecture slides and return the | ||
most relevant paragraphs. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggest performance review for lecture content retrieval.
The method now incorporates chat history in retrieving lecture content, which could introduce performance bottlenecks depending on the size of the history and the efficiency of the retrieval algorithm.
Consider reviewing the performance of this method, especially under conditions of extensive chat histories, and explore potential optimizations.
@@ -121,6 +125,7 @@ | |||
:param dto: The pipeline execution data transfer object | |||
:param kwargs: The keyword arguments | |||
""" | |||
print(dto.model_dump_json(indent=4)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Replace print statement with logging for production readiness.
Using print
for debugging is common in development, but it's recommended to use logger.debug
or logger.info
for better control over logging levels and outputs in production environments.
Consider replacing the print statement with a logging statement:
- print(dto.model_dump_json(indent=4))
+ logger.debug(dto.model_dump_json(indent=4))
Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
print(dto.model_dump_json(indent=4)) | |
logger.debug(dto.model_dump_json(indent=4)) |
class ExerciseChatAgentPipeline(Pipeline): | ||
"""Exercise chat agent pipeline that answers exercises related questions from students.""" | ||
|
||
llm: IrisLangchainChatModel | ||
pipeline: Runnable | ||
callback: ExerciseChatStatusCallback | ||
suggestion_pipeline: InteractionSuggestionPipeline | ||
code_feedback_pipeline: CodeFeedbackPipeline | ||
prompt: ChatPromptTemplate | ||
variant: str | ||
|
||
def __init__(self, callback: ExerciseChatStatusCallback, variant: str = "default"): | ||
super().__init__(implementation_id="exercise_chat_pipeline") | ||
# Set the langchain chat model | ||
completion_args = CompletionArguments(temperature=0.5, max_tokens=2000) | ||
self.llm = IrisLangchainChatModel( | ||
request_handler=CapabilityRequestHandler( | ||
requirements=RequirementList( | ||
gpt_version_equivalent=4.5, | ||
), | ||
), | ||
completion_args=completion_args, | ||
) | ||
self.variant = variant | ||
self.callback = callback | ||
|
||
# Create the pipelines | ||
self.db = VectorDatabase() | ||
self.suggestion_pipeline = InteractionSuggestionPipeline(variant="exercise") | ||
self.retriever = LectureRetrieval(self.db.client) | ||
self.reranker_pipeline = RerankerPipeline() | ||
self.code_feedback_pipeline = CodeFeedbackPipeline() | ||
self.pipeline = self.llm | JsonOutputParser() | ||
self.citation_pipeline = CitationPipeline() | ||
|
||
def __repr__(self): | ||
return f"{self.__class__.__name__}(llm={self.llm})" | ||
|
||
def __str__(self): | ||
return f"{self.__class__.__name__}(llm={self.llm})" | ||
|
||
@traceable(name="Exercise Chat Agent Pipeline") | ||
def __call__(self, dto: ExerciseChatPipelineExecutionDTO): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Class: ExerciseChatAgentPipeline
The class is well-structured and encapsulates the functionality of an exercise chat agent pipeline. The use of dependency injection for components like VectorDatabase
, LectureRetrieval
, and various pipelines is a good practice. However, the class is quite large and handles multiple responsibilities, which might violate the Single Responsibility Principle (SRP).
Consider breaking down this class into smaller components or services that handle specific aspects of the pipeline, such as message handling, tool execution, and response generation, to improve modularity and maintainability.
def convert_chat_history_to_str(chat_history: List[PyrisMessage]) -> str: | ||
""" | ||
Converts the chat history to a string | ||
:param chat_history: The chat history | ||
:return: The chat history as a string | ||
""" | ||
|
||
def map_message_role(role: IrisMessageRole) -> str: | ||
if role == IrisMessageRole.SYSTEM: | ||
return "System" | ||
elif role == IrisMessageRole.ASSISTANT: | ||
return "AI Tutor" | ||
elif role == IrisMessageRole.USER: | ||
return "Student" | ||
else: | ||
return "Unknown" | ||
|
||
return "\n\n".join( | ||
[ | ||
f"{map_message_role(message.sender)} {"" if not message.sent_at else f"at {message.sent_at.strftime( | ||
"%Y-%m-%d %H:%M:%S")}"}: {message.contents[0].text_content}" | ||
for message in chat_history | ||
] | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Function: convert_chat_history_to_str
This function converts a list of chat messages into a formatted string. The nested function map_message_role
is used to convert message roles into human-readable formats, which is a clean way to handle conditional logic. However, the function could be improved by handling potential exceptions that might occur during date formatting or when accessing message contents.
Consider adding error handling to manage cases where message.sent_at
might be None
or message.contents
might be empty, which could lead to exceptions.
- f"{map_message_role(message.sender)} {"" if not message.sent_at else f"at {message.sent_at.strftime(
+ f"{map_message_role(message.sender)} {"" if not message.sent_at else f"at {message.sent_at.strftime('%Y-%m-%d %H:%M:%S') if message.sent_at else 'Unknown date'}"}: {message.contents[0].text_content if message.contents else 'No content'}"
Committable suggestion was skipped due to low confidence.
HumanMessage( | ||
"Consider the student's newest and latest input:" | ||
), | ||
convert_iris_message_to_langchain_human_message(query), | ||
("placeholder", "{agent_scratchpad}"), | ||
] | ||
) | ||
else: | ||
self.prompt = ChatPromptTemplate.from_messages( | ||
[ | ||
SystemMessage( | ||
initial_prompt_with_date | ||
+ "\n" | ||
+ add_exercise_context_to_prompt( | ||
exercise_title, | ||
problem_statement, | ||
programming_language, | ||
) | ||
+ agent_prompt | ||
+ "\n" | ||
+ format_reminder_prompt, | ||
), | ||
("placeholder", "{agent_scratchpad}"), | ||
] | ||
) | ||
|
||
tools = generate_structured_tools_from_functions( | ||
[ | ||
get_submission_details, | ||
get_additional_exercise_details, | ||
get_build_logs_analysis_tool, | ||
get_feedbacks, | ||
repository_files, | ||
file_lookup, | ||
] | ||
) | ||
agent = create_tool_calling_agent( | ||
llm=self.llm, tools=tools, prompt=self.prompt | ||
) | ||
agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=False) | ||
out = None | ||
self.callback.in_progress() | ||
invocation_result = agent_executor.invoke(params) | ||
if invocation_result.get("output", None): | ||
out = invocation_result["output"] | ||
|
||
try: | ||
self.callback.in_progress("Refining response...") | ||
self.prompt = ChatPromptTemplate.from_messages( | ||
[ | ||
SystemMessagePromptTemplate.from_template(guide_system_prompt), | ||
] | ||
) | ||
|
||
guide_response = (self.prompt | self.llm | StrOutputParser()).invoke( | ||
{ | ||
"response": out, | ||
} | ||
) | ||
if "!ok!" in guide_response: | ||
print("Response is ok and not rewritten!!!") | ||
else: | ||
out = guide_response | ||
print("Response is rewritten.") | ||
|
||
self.callback.done("Response created", final_result=out) | ||
except Exception as e: | ||
logger.error( | ||
"An error occurred while running the course chat interaction suggestion pipeline", | ||
exc_info=e, | ||
) | ||
traceback.print_exc() | ||
self.callback.error("Error in refining response") | ||
try: | ||
if out: | ||
suggestion_dto = InteractionSuggestionPipelineExecutionDTO() | ||
suggestion_dto.chat_history = dto.chat_history | ||
suggestion_dto.last_message = out | ||
suggestions = self.suggestion_pipeline(suggestion_dto) | ||
self.callback.done(final_result=None, suggestions=suggestions) | ||
else: | ||
# This should never happen but whatever | ||
self.callback.skip( | ||
"Skipping suggestion generation as no output was generated." | ||
) | ||
except Exception as e: | ||
logger.error( | ||
"An error occurred while running the course chat interaction suggestion pipeline", | ||
exc_info=e, | ||
) | ||
traceback.print_exc() | ||
self.callback.error("Generating interaction suggestions failed.") | ||
except Exception as e: | ||
logger.error( | ||
"An error occurred while running the course chat pipeline", exc_info=e | ||
) | ||
traceback.print_exc() | ||
self.callback.error( | ||
"An error occurred while running the course chat pipeline." | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Method: __call__
This method orchestrates the execution of the chat agent pipeline. It is complex and handles multiple steps including logging, error handling, and invoking various tools. The method could benefit from further decomposition into smaller methods to improve readability and maintainability.
Refactor the __call__
method by extracting parts of its functionality into separate methods, such as handle_query
, setup_prompt
, and execute_tools
, to make the code cleaner and easier to manage.
- def __call__(self, dto: ExerciseChatPipelineExecutionDTO):
+ def handle_query(self, dto):
+ # Logic to handle query extraction and validation
+
+ def setup_prompt(self, dto, query):
+ # Logic to setup the chat prompt based on the query and dto
+
+ def execute_tools(self, dto, params):
+ # Logic to execute tools and handle outputs
+
+ def __call__(self, dto: ExerciseChatPipelineExecutionDTO):
+ query = self.handle_query(dto)
+ self.setup_prompt(dto, query)
+ params = {}
+ self.execute_tools(dto, params)
Committable suggestion was skipped due to low confidence.
This PR implements the exercise chat as a native tool calling agent. For now only OpenAI tool calling is supported.
More details will be added.
Summary by CodeRabbit
New Features
Bug Fixes
Documentation
Chores