Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(forge): Add mount method to FileStorage & execute code in mounted workspace #7115

Conversation

kcze
Copy link
Contributor

@kcze kcze commented Apr 30, 2024

User description

Background

Python file execution doesn't work when non-local FileStorage is used.

Changes 🏗️

In forge:

  • Add FileStorage.mount() method, which mounts (part of) the workspace to a local path
    • Add watchdog library to watch file changes in mount
  • Amend CodeExecutorComponent
    • Amend execute_python_file to execute Python files in a workspace mount
    • Amend execute_python_code to create temporary .py file in workspace instead of as a local file
    • Add support for Path argument to filename parameter on execute_python_file

In autogpt:

  • Fix test_execute_python_code (by making it async)

PR Quality Scorecard ✨

  • Have you used the PR description template?   +2 pts
  • Is your pull request atomic, focusing on a single change?   +5 pts
  • Have you linked the GitHub issue(s) that this PR addresses?   +5 pts
  • Have you documented your changes clearly and comprehensively?   +5 pts
  • Have you changed or added a feature?   -4 pts
    • Have you added/updated corresponding documentation?   +4 pts
    • Have you added/updated corresponding integration tests?   +5 pts
  • Have you changed the behavior of AutoGPT?   -5 pts
    • Have you also run agbenchmark to verify that these changes do not regress performance?   +10 pts

Type

enhancement


Description

  • Enhanced Python code execution by making execute_python_code asynchronous and improving file handling.
  • Implemented file synchronization with storage using FileSyncHandler and context manager for mounting directories.
  • Added local storage mounting capabilities to handle file operations transparently.
  • Updated project dependencies to include watchdog for monitoring file system events.

Changes walkthrough

Relevant files
Enhancement
execute_code.py
Enhance Python code execution and file handling                   

autogpts/autogpt/autogpt/commands/execute_code.py

  • Converted execute_python_code to an asynchronous function.
  • Implemented a new method for handling temporary Python files using a
    workspace.
  • Enhanced execute_python_file to support execution in a Docker
    container with mounted local paths.
  • Added a helper method _generate_random_string to create random
    filenames.
  • +102/-84
    base.py
    Implement file synchronization and temporary mounting       

    autogpts/autogpt/autogpt/file_storage/base.py

  • Added a FileSyncHandler class to synchronize file operations with the
    storage.
  • Implemented a context manager mount to handle temporary local
    directories.
  • +61/-1   
    local.py
    Add local storage mounting capability                                       

    autogpts/autogpt/autogpt/file_storage/local.py

  • Added a mount method as a context manager for local file storage.
  • +11/-1   
    Dependencies
    pyproject.toml
    Update project dependencies                                                           

    autogpts/autogpt/pyproject.toml

    • Added watchdog dependency to monitor file system events.
    +1/-0     

    PR-Agent usage:
    Comment /help on the PR to get a list of all available PR-Agent tools and their descriptions

    Allow to use synchronized local temp dir
    Copy link

    netlify bot commented Apr 30, 2024

    Deploy Preview for auto-gpt-docs canceled.

    Name Link
    🔨 Latest commit fa2806f
    🔍 Latest deploy log https://app.netlify.com/sites/auto-gpt-docs/deploys/6650e4a4d7f61f00084bf85d

    Copy link

    codecov bot commented Apr 30, 2024

    Codecov Report

    All modified and coverable lines are covered by tests ✅

    Project coverage is 36.05%. Comparing base (edcbbbc) to head (f07518e).

    Current head f07518e differs from pull request most recent head fa2806f

    Please upload reports for the commit fa2806f to get more accurate results.

    Additional details and impacted files
    @@           Coverage Diff           @@
    ##           master    #7115   +/-   ##
    =======================================
      Coverage   36.05%   36.05%           
    =======================================
      Files          19       19           
      Lines        1273     1273           
      Branches      182      182           
    =======================================
      Hits          459      459           
      Misses        786      786           
      Partials       28       28           
    Flag Coverage Δ
    Linux 36.05% <ø> (ø)
    Windows 35.91% <ø> (ø)
    autogpt-agent 36.05% <ø> (ø)
    macOS ?

    Flags with carried forward coverage won't be shown. Click here to find out more.

    ☔ View full report in Codecov by Sentry.
    📢 Have feedback on the report? Share it here.

    Make temp file manually
    @kcze kcze marked this pull request as ready for review April 30, 2024 18:03
    @kcze kcze requested a review from a team as a code owner April 30, 2024 18:03
    @codiumai-pr-agent-pro codiumai-pr-agent-pro bot added the enhancement New feature or request label Apr 30, 2024
    Copy link

    PR-Agent was enabled for this repository. To continue using it, please link your git user with your CodiumAI identity here.

    PR Description updated to latest commit (86c5dc7)

    Copy link

    PR-Agent was enabled for this repository. To continue using it, please link your git user with your CodiumAI identity here.

    PR Review

    ⏱️ Estimated effort to review [1-5]

    4, due to the complexity of the changes involving asynchronous operations, file handling, and Docker integration. The PR modifies critical functionality and integrates new libraries, requiring careful consideration of thread safety, error handling, and system interactions.

    🧪 Relevant tests

    No

    🔍 Possible issues

    Possible Bug: The method _generate_random_string uses random.choices which is not cryptographically secure. This could be an issue if predictability of the file names affects security or functionality.

    Error Handling Concern: In the execute_python_file method, the Docker container execution does not handle potential exceptions from container.exec_run other than checking the exit code. This might miss other types of errors that could occur during command execution.

    🔒 Security concerns

    No

    Code feedback:
    relevant fileautogpts/autogpt/autogpt/commands/execute_code.py
    suggestion      

    Consider using secrets.token_urlsafe() instead of random.choices for generating secure random strings. This method provides a URL-safe text string, which is suitable for cryptographic use and ensures that the file names are not predictable. [important]

    relevant linerandom_string = ''.join(random.choices(characters, k=length))

    relevant fileautogpts/autogpt/autogpt/commands/execute_code.py
    suggestion      

    Add error handling for Docker command execution to capture and log any exceptions that might occur during container.exec_run, not just the exit code. This can help in diagnosing issues that occur during the execution of Python files in Docker containers. [important]

    relevant lineif exec_result.exit_code != 0:

    relevant fileautogpts/autogpt/autogpt/commands/execute_code.py
    suggestion      

    Implement logging for the entry and exit points of the execute_python_file method to improve traceability and debugging. This can help in understanding the flow of file execution, especially when errors occur. [medium]

    relevant linelogger.info(f"Executing python file '{filename}'")

    Comment on lines 117 to 122
    temp_path = Path("")
    while True:
    temp_path = f"temp{self._generate_random_string()}.py"
    if not self.workspace.exists(temp_path):
    break
    await self.workspace.write_file(temp_path, code)

    Choose a reason for hiding this comment

    The reason will be displayed to describe this comment to others. Learn more.

    Suggestion: Replace the manual loop for checking if a file exists with a more robust temporary file creation using the tempfile module. This avoids potential race conditions and simplifies the code. [enhancement]

    Suggested change
    temp_path = Path("")
    while True:
    temp_path = f"temp{self._generate_random_string()}.py"
    if not self.workspace.exists(temp_path):
    break
    await self.workspace.write_file(temp_path, code)
    with NamedTemporaryFile(suffix=".py", dir=self.workspace.root, delete=False) as tmp_file:
    temp_path = tmp_file.name
    await self.workspace.write_file(temp_path, code)

    Comment on lines 167 to 171
    file_path = self.workspace.get_path(filename)
    if not self.workspace.exists(file_path):
    # Mimic the response that you get from the command line to make it
    # intuitively understandable for the LLM
    raise FileNotFoundError(

    Choose a reason for hiding this comment

    The reason will be displayed to describe this comment to others. Learn more.

    Suggestion: Use a context manager to handle file operations to ensure that files are always closed properly, even if an error occurs. [best practice]

    Suggested change
    file_path = self.workspace.get_path(filename)
    if not self.workspace.exists(file_path):
    # Mimic the response that you get from the command line to make it
    # intuitively understandable for the LLM
    raise FileNotFoundError(
    with self.workspace.get_path(filename) as file_path:
    if not self.workspace.exists(file_path):
    raise FileNotFoundError(
    "AutoGPT is running in a Docker container; "
    f"executing {file_path} directly..."
    )

    Comment on lines 161 to 162
    if isinstance(args, str):
    args = args.split() # Convert space-separated string to a list

    Choose a reason for hiding this comment

    The reason will be displayed to describe this comment to others. Learn more.

    Suggestion: Instead of manually checking if args is a string and then splitting it, use a more Pythonic approach by ensuring args is always a list at the start of the function. [best practice]

    Suggested change
    if isinstance(args, str):
    args = args.split() # Convert space-separated string to a list
    args = shlex.split(args) if isinstance(args, str) else args

    Copy link
    Member

    @Pwuts Pwuts May 1, 2024

    Choose a reason for hiding this comment

    The reason will be displayed to describe this comment to others. Learn more.

    Suggested change does not match suggestion description. 👎🏼

    BUT we do argument validation so this block can be removed entirely.

    Comment on lines 34 to 37
    with open(file_path, "rb") as f:
    content = f.read()
    assert isinstance(content, bytes)
    await self.storage.write_file(file_path, content)

    Choose a reason for hiding this comment

    The reason will be displayed to describe this comment to others. Learn more.

    Suggestion: Use a more efficient method for reading file content into memory by leveraging read() directly on the file object opened in binary mode, which is more memory efficient for large files. [performance]

    Suggested change
    with open(file_path, "rb") as f:
    content = f.read()
    assert isinstance(content, bytes)
    await self.storage.write_file(file_path, content)
    with open(file_path, "rb") as f:
    await self.storage.write_file(file_path, f.read())

    Copy link
    Member

    Choose a reason for hiding this comment

    The reason will be displayed to describe this comment to others. Learn more.

    I think the proposed change has zero effect on memory consumption, as the argument f.read() will still be evaluated and executed before write_file is called.

    autogpts/autogpt/autogpt/commands/execute_code.py Outdated Show resolved Hide resolved
    Copy link

    PR-Agent was enabled for this repository. To continue using it, please link your git user with your CodiumAI identity here.

    Changelog updates:

    2024-04-30

    Added

    • Implemented local temporary directory synchronization with storage for executing Python files in environments using non-local FileStorage.

    Changed

    • Enhanced execute_python_code to be asynchronous and improved file handling.
    • Updated project dependencies to include watchdog for monitoring file changes.

    Fixed

    • Resolved issues with Python file execution in Docker environments by ensuring proper file path handling and synchronization.

    to commit the new content to the CHANGELOG.md file, please type:
    '/update_changelog --pr_update_changelog.push_changelog_changes=true'

    Copy link

    PR-Agent was enabled for this repository. To continue using it, please link your git user with your CodiumAI identity here.

    PR Analysis

    • This screen contains a list of code components that were changed in this PR.
    • You can initiate specific actions for each component, by checking the relevant boxes.
    • After you check a box, the action will be performed automatically by PR-Agent.
    • Results will appear as a comment on the PR, typically after 30-60 seconds.
    fileChanged components
    execute_code.py
    • Test
    • Docs
    • Improve
    • Similar
     
    execute_python_code
    (function)
     
    +9/-8
     
    • Test
    • Docs
    • Improve
    • Similar
     
    execute_python_file
    (function)
     
    +82/-74
     
    • Test
    • Docs
    • Improve
    • Similar
     
    _generate_random_string
    (method of CodeExecutorComponent)
     
    +6/-0
     
    base.py
    • Test
    • Docs
    • Improve
    • Similar
     
    FileSyncHandler
    (class)
     
    +35/-0
     
    • Test
    • Docs
    • Improve
    • Similar
     
    mount
    (method of FileStorage)
     
    +15/-0
     
    local.py
    • Test
    • Docs
    • Improve
    • Similar
     
    mount
    (method of LocalFileStorage)
     
    +7/-0
     

    ✨ Usage guide:

    Using static code analysis capabilities, the analyze tool scans the PR code changes and find the code components (methods, functions, classes) that changed in the PR.
    The tool can be triggered automatically every time a new PR is opened, or can be invoked manually by commenting on any PR:

    /analyze
    

    Language that are currently supported: Python, Java, C++, JavaScript, TypeScript.
    See more information about the tool in the docs.

    Copy link
    Member

    @Pwuts Pwuts left a comment

    Choose a reason for hiding this comment

    The reason will be displayed to describe this comment to others. Learn more.

    A few comments, but overall I like the setup!

    autogpts/autogpt/autogpt/commands/execute_code.py Outdated Show resolved Hide resolved
    autogpts/autogpt/autogpt/commands/execute_code.py Outdated Show resolved Hide resolved
    autogpts/autogpt/autogpt/commands/execute_code.py Outdated Show resolved Hide resolved
    autogpts/autogpt/autogpt/file_storage/base.py Outdated Show resolved Hide resolved
    autogpts/autogpt/autogpt/file_storage/base.py Outdated Show resolved Hide resolved
    autogpts/autogpt/autogpt/file_storage/base.py Outdated Show resolved Hide resolved
    @Pwuts Pwuts changed the title feat(autogpt): Implement local temporary dir feat(agent): Implement local temporary dir May 1, 2024
    @kcze kcze requested a review from Pwuts May 1, 2024 13:18
    Copy link

    github-actions bot commented May 4, 2024

    This pull request has conflicts with the base branch, please resolve those so we can evaluate the pull request.

    @github-actions github-actions bot added the conflicts Automatically applied to PRs with merge conflicts label May 4, 2024
    @github-actions github-actions bot removed the conflicts Automatically applied to PRs with merge conflicts label May 6, 2024
    Copy link

    github-actions bot commented May 6, 2024

    Conflicts have been resolved! 🎉 A maintainer will review the pull request shortly.

    @Pwuts Pwuts self-assigned this May 9, 2024
    autogpts/autogpt/autogpt/file_storage/base.py Outdated Show resolved Hide resolved
    autogpts/autogpt/autogpt/file_storage/base.py Outdated Show resolved Hide resolved
    autogpts/autogpt/autogpt/file_storage/base.py Outdated Show resolved Hide resolved
    @kcze kcze requested a review from Pwuts May 12, 2024 09:21
    @github-actions github-actions bot added the conflicts Automatically applied to PRs with merge conflicts label May 14, 2024
    Copy link

    This pull request has conflicts with the base branch, please resolve those so we can evaluate the pull request.

    Copy link

    Conflicts have been resolved! 🎉 A maintainer will review the pull request shortly.

    @github-actions github-actions bot added conflicts Automatically applied to PRs with merge conflicts Forge and removed conflicts Automatically applied to PRs with merge conflicts labels May 24, 2024
    @github-actions github-actions bot removed the conflicts Automatically applied to PRs with merge conflicts label May 24, 2024
    Copy link

    Conflicts have been resolved! 🎉 A maintainer will review the pull request shortly.

    @Pwuts Pwuts changed the title feat(agent): Implement local temporary dir feat(forge): Add mount method to FileStorage & execute code in mounted workspace May 24, 2024
    @Pwuts Pwuts merged commit 4e02f7d into Significant-Gravitas:master May 24, 2024
    14 of 16 checks passed
    @kcze kcze deleted the kpczerwinski/open-753-execute_python_file-broken-on-non-local-file-storage branch May 24, 2024 20:20
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Projects
    Status: Done
    Development

    Successfully merging this pull request may close these issues.

    None yet

    2 participants