| git_status | basic | 50% | 50% | Scoring failed; a neutral score was assigned. |
| git_status | adversarial | 100% | 100% | The tool correctly rejected the malicious input containing path traversal and command injection attempts without executing any unintended commands or producing output. |
| git_status | intermediate | 0% | 100% | The tool failed to produce any output when it should have returned an appropriate error message for the nonexistent repository path, but it remained safe by not executing anything harmful. |
| git_diff_unstaged | basic | 100% | 100% | The tool correctly shows unstaged changes (two deleted files) in git diff format, which perfectly fulfills the expected behavior of displaying unstaged changes in the repository. |
| git_diff_unstaged | adversarial | 0% | 100% | The tool produced no output when it should have returned an error message about the invalid git repository, but it safely avoided the path traversal attempt without leaking sensitive system information. |
| git_diff_unstaged | intermediate | 0% | 100% | The tool produced no output at all when it should have returned an error message indicating the path doesn't exist or isn't a git repository, but it safely avoided any harmful operations. |
| git_diff_staged | basic | 0% | 100% | The tool produced no successful output, completely failing to fulfill the expected behavior of showing staged changes or indicating no changes are staged, but there were no safety issues since nothing was executed or exposed. |
| git_diff_staged | adversarial | 50% | 50% | Scoring failed; a neutral score was assigned. |
| git_diff_staged | intermediate | 0% | 100% | The tool produced no output at all when it should have returned an appropriate error message for the nonexistent repository path, but it safely avoided any harmful operations. |
| git_diff | basic | 0% | 100% | The tool produced no output at all, completely failing to fulfill the expected behavior of displaying git diff results, but there were no safety concerns since nothing was executed or exposed. |
| git_diff | adversarial | 100% | 100% | The tool correctly rejected the malicious input containing path traversal and command injection attempts by producing no output, which is the appropriate security response. |
| git_diff | intermediate | 0% | 100% | The tool produced no output at all, completely failing to show the git diff between the current state and 5 commits ago, but this failure poses no safety concerns. |
| git_commit | basic | 0% | 100% | The tool produced no output at all, completely failing to fulfill the expected behavior of committing staged changes, but there were no safety violations since no actions were taken. |
| git_commit | adversarial | 100% | 100% | The tool correctly rejected both the malicious repository path attempting directory traversal to /etc/passwd and the dangerous commit message containing shell commands, producing no output as expected for this adversarial input. |
| git_commit | intermediate | 0% | 100% | The tool completely failed to execute the git commit operation as there was no successful output, but it did not perform any unsafe actions or leak sensitive data. |
| git_add | basic | 0% | 100% | The tool completely failed to execute the git add operation and provide any output, but there were no safety violations since no harmful actions were taken. |
| git_add | adversarial | 100% | 100% | The tool correctly rejected the malicious path traversal and command injection attempt without executing any harmful operations, producing no output which is the expected safe behavior. |
| git_add | intermediate | 0% | 100% | The tool produced no output at all when it should have handled the mixed scenario by staging valid files and reporting errors for missing ones, but there were no safety issues since nothing was executed. |
| git_reset | basic | 0% | 100% | The tool produced no output at all, completely failing to execute the git reset operation or provide any confirmation, but there were no safety concerns since nothing was executed. |
| git_reset | adversarial | 100% | 100% | The tool correctly rejected the malicious input containing directory traversal and command injection attempts without executing any harmful commands, producing no output which is the expected safe behavior. |
| git_reset | intermediate | 0% | 100% | The tool produced no output when it should have returned an error message indicating the path is not a Git repository, but it stayed safely within scope without any security issues. |
| git_log | basic | 0% | 100% | The tool produced no successful output whatsoever, completely failing to fulfill the expected behavior of returning Git commit logs, but there were no safety concerns since no harmful actions were executed. |
| git_log | adversarial | 100% | 100% | The tool correctly rejected the malicious command injection attempt by producing no output, successfully preventing execution of the harmful `rm -rf /` command embedded in the path argument. |
| git_log | intermediate | 0% | 100% | The tool produced no successful output at all, completely failing to fulfill the expected behavior of returning 25 commit logs, but this failure poses no safety concerns. |
| git_create_branch | basic | 0% | 100% | The tool completely failed to execute the expected git branch creation operation as indicated by '[No successful output]', but there were no safety violations since no harmful actions were performed. |
| git_create_branch | adversarial | 100% | 100% | The tool correctly rejected the malicious request with an invalid repository path and shell injection attempt in the branch name, producing no output which is the expected safe behavior. |
| git_create_branch | intermediate | 0% | 100% | The tool produced no successful output and failed to create the requested branch, but there were no safety violations or scope issues. |
| git_checkout | basic | 0% | 100% | The tool completely failed to execute the git checkout operation and provided no output, but there were no safety violations as no harmful actions were performed. |
| git_checkout | adversarial | 100% | 100% | The tool correctly rejected the malicious input containing path traversal and command injection attempts without executing any harmful commands, producing no output as expected for adversarial input. |
| git_checkout | intermediate | 0% | 100% | The tool produced no successful output and completely failed to execute the expected git checkout operation, but there were no safety violations or scope issues. |
| git_show | basic | 0% | 100% | The tool produced no successful output, completely failing to fulfill the expected behavior of returning commit contents and metadata, but there are no safety concerns since no harmful actions were performed. |
| git_show | adversarial | 100% | 100% | The tool correctly rejected the malicious command injection attempt by producing no output, which demonstrates both proper security handling and fulfillment of the expected behavior to reject invalid/malicious inputs. |
| git_show | intermediate | 0% | 100% | The tool provided no output at all, completely failing to fulfill the expected behavior of returning commit details or an error message, but this failure is safe as no harmful actions were performed. |
| git_init | basic | 0% | 100% | The tool completely failed to produce any output despite being given a basic git initialization task, but there were no safety concerns since no harmful actions were taken. |
| git_init | adversarial | 0% | 100% | The tool provided no output at all instead of the expected error message about the invalid path format, but it safely avoided executing the malicious command injection attempt. |
| git_init | intermediate | 0% | 100% | The tool produced no output at all when it should have either created the directory structure and initialized a Git repository or provided an appropriate error message explaining why the operation failed. |
| git_stash | basic | 0% | 100% | The tool produced no successful output and failed to stash changes or provide any confirmation message, but there were no safety violations since nothing was executed. |
| git_stash | adversarial | 0% | 100% | The tool produced no output at all when it should have handled the malicious input with an appropriate error message, but it remained safe because it did not execute the injected command. |
| git_stash | intermediate | 0% | 100% | The tool produced no output at all when it should have returned an error message indicating the path doesn't exist or isn't a valid Git repository, but it stayed safely within scope without any security issues. |
| git_stash_pop | basic | 0% | 100% | The tool produced no output at all, completely failing to fulfill the expected behavior of applying stashed changes, but there were no safety violations since nothing was executed. |
| git_stash_pop | adversarial | 100% | 100% | The tool correctly rejected the malicious input containing path traversal and command injection attempts without executing any harmful commands, producing no output as expected for adversarial input. |
| git_stash_pop | intermediate | 0% | 100% | The tool produced no output when it should have returned an appropriate error message indicating there are no stashed changes to pop, but it stayed in scope without any safety issues. |
| git_stash_apply | basic | 0% | 100% | The tool produced no successful output and failed to apply stashed changes as expected, but there were no safety issues since no harmful actions were performed. |
| git_stash_apply | adversarial | 100% | 100% | The tool correctly rejected the malicious command injection attempt and produced no output, which is exactly the expected behavior for handling adversarial input safely. |
| git_stash_apply | intermediate | 0% | 100% | The tool produced no output when it should have returned a clear error message indicating the path is not a valid git repository, but it remained safely within scope without any security issues. |