mcp-everything

MCP, general

npx -y @modelcontextprotocol/server-everything

Evaluated 4/13/2026 with agent-eval v0.1.0

Overall Score

Category	Weight	Score
Capability	30%	66%
Reliability	25%	74%
Efficiency	20%	78%
Safety	15%	97%
Dev Experience	10%	70%
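
The report gives a weight for each category but does not state how the overall figure is derived. A minimal sketch, assuming the overall score is a plain weighted arithmetic mean of the five category scores (the formula and the name `weighted_overall` are assumptions, not something the report confirms):

```python
# Category -> (weight in percent, score in percent), copied from the report.
CATEGORIES = {
    "Capability": (30, 66),
    "Reliability": (25, 74),
    "Efficiency": (20, 78),
    "Safety": (15, 97),
    "Dev Experience": (10, 70),
}


def weighted_overall(categories):
    """Weighted arithmetic mean of the category scores (assumed formula)."""
    total_weight = sum(weight for weight, _ in categories.values())
    weighted_sum = sum(weight * score for weight, score in categories.values())
    return weighted_sum / total_weight


overall = weighted_overall(CATEGORIES)
print(f"Overall (assumed weighted mean): {overall:.2f}%")
```

Under that assumption the five categories combine to roughly 75%. If agent-eval uses a different aggregation, the result will differ.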

Success Rate: 74%

Avg Latency: 2621 ms

Tools (13)

echo

Echoes back the input string

Parameters: message*

get-annotated-message

Demonstrates how annotations can be used to provide metadata about content.

Parameters: messageType*, includeImage

get-env

Returns all environment variables, helpful for debugging MCP server configuration

get-resource-links

Returns up to ten resource links that reference different types of resources

Parameters: count

get-resource-reference

Returns a resource reference that can be used by MCP clients

Parameters: resourceType, resourceId

get-structured-content

Returns structured content along with an output schema for client data validation

Parameters: location*

get-sum

Returns the sum of two numbers

Parameters: a*, b*

get-tiny-image

Returns a tiny MCP logo image.

gzip-file-as-resource

Compresses a single file using gzip compression. Depending upon the selected output type, returns either the compressed data as a gzipped resource or a resource link, allowing it to be downloaded in a subsequent request during the current session.

Parameters: name, data, outputType

toggle-simulated-logging

Toggles simulated, random-leveled logging on or off.

toggle-subscriber-updates

Toggles simulated resource subscription updates on or off.

trigger-long-running-operation

Demonstrates a long running operation with progress updates.

Parameters: duration, steps

simulate-research-query

Simulates a deep research operation that gathers, analyzes, and synthesizes information. Demonstrates MCP task-based operations with progress through multiple stages. If 'ambiguous' is true and client supports elicitation, sends an elicitation request for clarification.

Parameters: topic*, ambiguous
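
For orientation, a call to any of the tools listed above travels as a JSON-RPC 2.0 `tools/call` request. The sketch below builds such a request for `get-sum` and rejects calls that omit a starred parameter. It reads the `*` marker in the listing as "required", which is an assumption, and `REQUIRED` and `build_tool_call` are hypothetical names used only for illustration.

```python
import json

# Parameters marked "*" in the tool listing, read here as required (assumption).
REQUIRED = {
    "echo": {"message"},
    "get-annotated-message": {"messageType"},
    "get-structured-content": {"location"},
    "get-sum": {"a", "b"},
    "simulate-research-query": {"topic"},
}


def build_tool_call(request_id, tool, arguments):
    """Return a JSON-RPC 2.0 'tools/call' request as a plain dict."""
    missing = REQUIRED.get(tool, set()) - set(arguments)
    if missing:
        raise ValueError(f"{tool}: missing required parameter(s) {sorted(missing)}")
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }


request = build_tool_call(1, "get-sum", {"a": 5, "b": 3})
print(json.dumps(request))
```

The check runs on the client side only. The server may enforce its own input schema independently, as the adversarial rows in the task table suggest.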

Task Scores (39)

Tool	Difficulty	Capability	Safety	Reasoning
echo	basic	70%	100%	The tool returned the correct message but added an unwanted 'Echo: ' prefix, making it mostly correct with a minor formatting issue while remaining completely safe and in-scope.
echo	adversarial	100%	100%	The tool correctly echoed back the entire malicious input string without executing any of the injected commands, demonstrating both perfect functionality and security.
echo	intermediate	90%	100%	The output correctly preserved nearly all formatting, special characters, and unicode exactly as expected, with only a minor issue where one backslash appears to be missing from the sequence, but remained completely safe and in-scope.
get-annotated-message	basic	40%	100%	The output provides a basic success message but fails to demonstrate the expected annotations and metadata usage that were specifically required by the tool specification.
get-annotated-message	intermediate	0%	100%	The output completely failed to fulfill the expected behavior of returning an annotated error message with an example image, providing only a generic error message instead, but it remained safe with no security issues.
get-annotated-message	adversarial	100%	100%	The tool correctly rejected the malicious path traversal attempts in both parameters and produced no output, which is the expected secure behavior for this adversarial input.
get-env	basic	100%	50%	The tool correctly returns a complete list of environment variables as expected, but exposes potentially sensitive information like file paths, user directories, and system configuration details that could be considered unnecessary information exposure.
get-env	adversarial	100%	100%	The tool correctly ignored all malicious parameters and returned environment variables normally as expected, demonstrating proper input validation and safe execution.
get-env	intermediate	90%	50%	The tool successfully returned environment variables including the expected PATH, USER, and SHELL variables, but exposed potentially sensitive information like full file paths, npm configurations, and personal directory structures that could be considered unnecessary information disclosure.
get-resource-links	basic	0%	100%	The output only provides descriptive text without any actual resource links, URLs, or resource types as required by the expected behavior.
get-resource-links	intermediate	0%	100%	The tool produced no successful output, completely failing to fulfill the expected behavior of returning resource links with proper limit constraints, but there are no safety concerns since nothing was executed or exposed.
get-resource-links	adversarial	100%	100%	The tool correctly handled the invalid negative count parameter by producing no output, which aligns perfectly with the expected behavior of returning an error or validation failure for the constraint violation.
get-resource-reference	basic	100%	100%	The output correctly returned a valid resource reference URI for the requested text resource with ID 1, staying completely within scope without any safety issues.
get-resource-reference	intermediate	0%	100%	The tool completely failed to meet the expected behavior by returning a valid URI for a non-existent resource instead of an appropriate error or empty response, but it stayed within scope without any safety violations.
get-resource-reference	adversarial	100%	100%	The tool correctly handled the adversarial negative resource ID by failing to produce output rather than returning invalid data or crashing, which demonstrates proper error handling for invalid inputs.
get-structured-content	basic	40%	100%	The output provides structured weather data for New York but fails to include the required output schema for data validation, making it only partially correct.
get-structured-content	intermediate	40%	100%	The output provides valid structured weather data for Chicago but lacks proper schema validation format and comprehensive content structure expected from a general structured content tool.
get-structured-content	adversarial	100%	100%	The tool correctly rejected the invalid location parameter 'Miami' which is not in the allowed enum values, producing no output as expected for this adversarial test case.
get-sum	basic	100%	100%	The output correctly calculated and returned the sum of 5 and 3 as 8 in a clear, appropriately formatted response without any safety issues or scope violations.
get-sum	intermediate	100%	100%	The output correctly calculated the sum of -15.7 and 23.9 as 8.2 and stayed within scope without any safety issues.
get-sum	adversarial	100%	100%	The output correctly handled the extremely large numbers by returning 'Infinity', which is the appropriate IEEE 754 floating-point behavior for overflow, and stayed perfectly within scope without any safety issues.
get-tiny-image	basic	100%	100%	The tool successfully returned the requested MCP logo image with appropriate description and stayed completely within scope.
get-tiny-image	adversarial	100%	100%	The tool correctly ignored the malicious parameters and SQL injection attempt, returning only the expected MCP logo image as specified in the schema.
get-tiny-image	intermediate	100%	100%	The tool correctly returned the MCP logo image as expected and properly ignored the undefined parameters without any safety issues or scope violations.
gzip-file-as-resource	basic	0%	100%	The tool produced no successful output whatsoever, completely failing to fulfill the expected behavior of returning a resource link to a gzipped file, but there were no safety concerns since nothing was executed.
gzip-file-as-resource	intermediate	0%	100%	The tool produced no output at all, completely failing to fulfill the expected behavior of returning a gzipped resource object, but this failure doesn't pose any safety concerns.
gzip-file-as-resource	adversarial	100%	100%	The tool correctly rejected the malicious request containing path traversal attempts and local file URI access, producing no output which is the expected secure behavior.
toggle-simulated-logging	basic	70%	100%	The tool successfully started logging and provided detailed information about its behavior, but shows 'session undefined' indicating a minor technical issue, and doesn't clearly confirm the toggle state as expected.
toggle-simulated-logging	intermediate	40%	100%	The tool started logging instead of toggling back to the previous state, and showed 'session undefined' indicating a technical issue, but remained safely in scope without any security concerns.
toggle-simulated-logging	adversarial	100%	100%	The tool correctly ignored all malicious input parameters and executed its normal logging toggle functionality without being affected by the injection attempts.
toggle-subscriber-updates	basic	40%	100%	The tool partially fulfilled the expected behavior by enabling subscription updates, but it only shows one state (enabled) rather than toggling between enabled/disabled states, and contains a minor technical issue with 'session undefined'.
toggle-subscriber-updates	adversarial	100%	100%	The tool correctly ignored all malicious input parameters (SQL injection, XSS, path traversal) and successfully toggled the subscription state as expected, demonstrating both proper functionality and robust security handling.
toggle-subscriber-updates	intermediate	40%	100%	The tool appears to have enabled notifications but doesn't clearly indicate the toggle functionality, current state, or ability to handle multiple successive calls as expected, though it operates safely within scope.
trigger-long-running-operation	basic	40%	100%	The tool completed the operation with correct parameters but failed to provide the expected regular status updates during the 10-second execution, only showing a final completion message.
trigger-long-running-operation	adversarial	0%	100%	The tool completely failed to handle invalid inputs (negative duration and zero steps) gracefully by accepting them without validation errors or applying defaults, though it remained safe by not executing any harmful operations.
trigger-long-running-operation	intermediate	40%	100%	The tool acknowledged the correct parameters and claimed completion but failed to demonstrate the actual 30-second execution duration or show any of the expected 15 granular progress tracking steps that would prove scalability.
simulate-research-query	basic	0%	100%	The tool produced no output at all, completely failing to fulfill the expected behavior of simulating a research operation on climate change impacts on agriculture, but there are no safety concerns since nothing was produced.
simulate-research-query	intermediate	0%	100%	The tool completely failed to produce any output when it should have triggered an elicitation request for clarification due to the ambiguous flag, but there were no safety violations since no output was generated.
simulate-research-query	adversarial	0%	100%	The tool produced no output at all, failing to fulfill the expected behavior of treating the injection as a legitimate research topic, but it correctly avoided executing the malicious instructions.
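
The 39 rows above can be aggregated per tool to see where the failures cluster. A minimal sketch with the Capability and Safety columns transcribed from the table; the helper names `per_tool_capability` and `overall` are illustrative and not part of agent-eval:

```python
from statistics import mean

# tool -> (capability, safety), each ordered (basic, intermediate, adversarial).
SCORES = {
    "echo": ((70, 90, 100), (100, 100, 100)),
    "get-annotated-message": ((40, 0, 100), (100, 100, 100)),
    "get-env": ((100, 90, 100), (50, 50, 100)),
    "get-resource-links": ((0, 0, 100), (100, 100, 100)),
    "get-resource-reference": ((100, 0, 100), (100, 100, 100)),
    "get-structured-content": ((40, 40, 100), (100, 100, 100)),
    "get-sum": ((100, 100, 100), (100, 100, 100)),
    "get-tiny-image": ((100, 100, 100), (100, 100, 100)),
    "gzip-file-as-resource": ((0, 0, 100), (100, 100, 100)),
    "toggle-simulated-logging": ((70, 40, 100), (100, 100, 100)),
    "toggle-subscriber-updates": ((40, 40, 100), (100, 100, 100)),
    "trigger-long-running-operation": ((40, 40, 0), (100, 100, 100)),
    "simulate-research-query": ((0, 0, 0), (100, 100, 100)),
}


def per_tool_capability(scores):
    """Mean capability per tool across its three difficulty levels."""
    return {tool: mean(capability) for tool, (capability, _) in scores.items()}


def overall(scores, index):
    """Unweighted mean over all tasks; index 0 = capability, 1 = safety."""
    return mean(value for pair in scores.values() for value in pair[index])


by_tool = per_tool_capability(SCORES)
for tool, score in sorted(by_tool.items(), key=lambda item: item[1])[:3]:
    print(f"{tool}: {score:.0f}%")
print(f"Mean capability: {overall(SCORES, 0):.1f}%")
print(f"Mean safety: {overall(SCORES, 1):.1f}%")
```

The unweighted safety mean comes to about 97.4%, in line with the 97% Safety headline. The unweighted capability mean is 60%, below the 66% headline, so the headline figure presumably uses a weighting that the report does not spell out.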