Donner 0.8.0-pre
Embeddable browser-grade SVG2 engine
Resvg Test Suite Instructions

//donner/svg/renderer/tests:resvg_test_suite uses https://github.com/RazrFalcon/resvg-test-suite to validate Donner's rendering end-to-end. The test suite provides .svg files that exercise the static subset of SVG (and some SVG2), along with resvg's golden images to compare against.

To validate against this suite continuously, https://github.com/jwmcglynn/pixelmatch-cpp17 perceptually diffs each rendered image against its golden, and the comparison is wrapped in a gtest. Execution is single-threaded, but it is fast enough to run in CI and as part of inner-loop development.

To run the suite:

bazel run //donner/svg/renderer/tests:resvg_test_suite

Or in debug mode:

bazel run -c dbg //donner/svg/renderer/tests:resvg_test_suite

Since this is a gtest, it will also be run as part of any bazel test targeting this directory:

bazel test //...

To integrate with gtest, the suite generates parameter-driven tests by scanning the suite's category directories and registering one test per .svg file found.

Some tests require a more lenient threshold, or must be skipped entirely due to incomplete Donner functionality. To do this, per-test params may be specified. A test registration appears as:

INSTANTIATE_TEST_SUITE_P(
    StrokeLinecap, ImageComparisonTestFixture,
    Combine(
        ValuesIn(getTestsInCategory(
            "painting/stroke-linecap",
            {
                {"zero-length-path-with-round-cap.svg", Params::Skip("Bug: zero-length subpath caps")},
            })),
        ValuesIn(ActiveComparisonModes())),
    TestNameFromFilename);

getTestsInCategory(category, overrides, defaultParams) scans one category directory under the resvg-test-suite tree (e.g. painting/fill, filters/feBlend, text/text) and registers every .svg it finds. overrides is keyed by the bare filename (with extension) and sets per-test Params (skip, threshold, golden override, …); files not listed use defaultParams. Combine(..., ValuesIn(ActiveComparisonModes())) runs each test under every active comparison mode — one mode on CPU builds, three on the geode build (TinyGolden / GeodeGolden / GeodeTinyParity, see 0017 §Phase 4b).
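The scan-and-override behavior described above can be sketched outside of C++ as follows. This is only an illustration, not Donner's implementation: the `Params` dataclass, its field names, and `get_tests_in_category` are hypothetical stand-ins for the C++ `Params` and `getTestsInCategory`, and the directory layout is assumed.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Params:
    """Hypothetical stand-in for the C++ Params struct."""

    skip_reason: Optional[str] = None  # set when the test should be skipped
    max_mismatched_pixels: int = 100   # default threshold quoted in this guide


def get_tests_in_category(
    suite_root: Path,
    category: str,
    overrides: Optional[Dict[str, Params]] = None,
    default_params: Params = Params(),
) -> List[Tuple[Path, Params]]:
    """Return one (svg_path, params) pair per .svg file in the category."""
    overrides = overrides or {}
    tests = []
    # Sort so the registration order is deterministic across platforms.
    for svg in sorted((suite_root / category).glob("*.svg")):
        # Overrides are keyed by the bare filename, extension included.
        tests.append((svg, overrides.get(svg.name, default_params)))
    return tests
```

Keying the overrides by bare filename keeps each entry short, at the cost of requiring one override map per category directory.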

The test name is <SuiteName>/ImageComparisonTestFixture.ResvgTest/<sanitized-filename>, where the filename stem has every non-alphanumeric character replaced by _. For the example above:

StrokeLinecap/ImageComparisonTestFixture.ResvgTest/zero_length_path_with_round_cap

On the geode build (multiple comparison modes) the mode is appended, e.g. ..._TinyGolden / ..._GeodeTinyParity.
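When scripting around the suite, it can help to derive the gtest name from a filename. The sketch below follows the naming rule stated above; the function names are hypothetical, and it assumes "alphanumeric" means ASCII letters and digits.

```python
import re
from pathlib import PurePosixPath
from typing import Optional


def sanitized_test_name(filename: str, mode: Optional[str] = None) -> str:
    """Replace every non-alphanumeric character in the filename stem with '_'."""
    stem = PurePosixPath(filename).stem
    name = re.sub(r"[^A-Za-z0-9]", "_", stem)
    # On multi-mode builds the comparison mode is appended to the name.
    return f"{name}_{mode}" if mode else name


def full_test_name(suite: str, filename: str, mode: Optional[str] = None) -> str:
    """Build <SuiteName>/ImageComparisonTestFixture.ResvgTest/<sanitized-filename>."""
    return f"{suite}/ImageComparisonTestFixture.ResvgTest/{sanitized_test_name(filename, mode)}"


print(full_test_name("StrokeLinecap", "zero-length-path-with-round-cap.svg"))
# StrokeLinecap/ImageComparisonTestFixture.ResvgTest/zero_length_path_with_round_cap
```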

To run a single test:

bazel run -c dbg //donner/svg/renderer/tests:resvg_test_suite -- \
--gtest_filter="*zero_length_path_with_round_cap"

Even when a test is skipped, it is often useful to run it manually without editing resvg_test_suite.cc. Params::Skip(...) prefixes the generated gtest name with DISABLED_. Run those tests with gtest's disabled-test flag and a narrow filter; text-full-only tests still require a text-full build.

To run a skipped test:

bazel run -c dbg //donner/svg/renderer/tests:resvg_test_suite -- \
--gtest_filter="*zero_length_path_with_round_cap" --gtest_also_run_disabled_tests

Because the filter matches on the name's suffix, the same test identifier works whether or not the name carries the DISABLED_ prefix.
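To see why one suffix filter covers both the enabled and the skipped form of a name, the sketch below approximates gtest's `*` wildcard with Python's `fnmatchcase`. The exact position of the `DISABLED_` prefix inside the full name is an assumption, and `zero_length_path_with_square_cap` is a hypothetical test name used only as a non-matching case.

```python
from fnmatch import fnmatchcase

PATTERN = "*zero_length_path_with_round_cap"
PREFIX = "StrokeLinecap/ImageComparisonTestFixture.ResvgTest/"

enabled_name = PREFIX + "zero_length_path_with_round_cap"
# Assumed placement of the DISABLED_ prefix for a Params::Skip test.
disabled_name = PREFIX + "DISABLED_zero_length_path_with_round_cap"
# Hypothetical sibling test, included only to show a non-match.
other_name = PREFIX + "zero_length_path_with_square_cap"


def matches(pattern: str, test_name: str) -> bool:
    """Approximate one positive gtest filter pattern that uses only '*' and '?'."""
    return fnmatchcase(test_name, pattern)


print(matches(PATTERN, enabled_name))   # True
print(matches(PATTERN, disabled_name))  # True
print(matches(PATTERN, other_name))     # False
```

A match on the disabled name is not enough to run it; gtest still needs --gtest_also_run_disabled_tests.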

Triaging Test Failures

When tests fail, follow this systematic approach to triage and document them:

1. Run the Failing Tests

First, identify which tests are failing:

bazel run //donner/svg/renderer/tests:resvg_test_suite -c dbg -- '--gtest_filter=TextFontWeight/*'

Look for tests marked as FAIL and note the pixel difference count. A test passes when the number of differing pixels is at or below the threshold (default 100 pixels).

2. Examine the Test Output

When a test fails, the framework provides detailed diagnostic information:

Test Failure Header

[ COMPARE ] .../tests/text/word-spacing/simple-case.svg [TinySkia]: FAIL (8234 pixels differ, with 100 max)
  • FAIL: Test failed (vs PASS for success)
  • [TinySkia]: which backend rendered this comparison (TinySkia, or a Geode* mode)
  • 8234 pixels differ: Number of pixels that don't match between actual and expected
  • with 100 max: Threshold for passing (tests pass if pixel diff ≤ 100)
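For scripted triage, the header line can be parsed with a regular expression. The pattern below is derived from the single sample line shown above, so it is a sketch under that assumption: it presumes PASS lines share the same shape and that the path contains no spaces. `CompareResult` and `parse_compare_line` are hypothetical names.

```python
import re
from typing import NamedTuple, Optional


class CompareResult(NamedTuple):
    svg_path: str
    backend: str
    status: str
    pixels_differ: int
    max_pixels: int

    @property
    def within_threshold(self) -> bool:
        # A test passes when the pixel diff is at or below the maximum.
        return self.pixels_differ <= self.max_pixels


# Pattern inferred from the one sample header line in this guide.
_COMPARE_RE = re.compile(
    r"\[\s*COMPARE\s*\]\s+(?P<path>\S+)\s+\[(?P<backend>[^\]]+)\]:\s+"
    r"(?P<status>PASS|FAIL)\s+\((?P<diff>\d+) pixels differ, with (?P<max>\d+) max\)"
)


def parse_compare_line(line: str) -> Optional[CompareResult]:
    """Return the parsed fields, or None if the line is not a COMPARE header."""
    m = _COMPARE_RE.search(line)
    if m is None:
        return None
    return CompareResult(m["path"], m["backend"], m["status"], int(m["diff"]), int(m["max"]))
```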

Verbose Rendering Output

When a test fails, it re-renders with verbose logging showing:

Document world from canvas transform: matrix(2.5 0 0 2.5 0 0)
Instantiating SVG id=svg1 #1
Instantiating Text id=text1 #5
Rendering Text id=text1 #5 transform=matrix(2.5 0 0 2.5 0 0)

This shows:

  • Canvas transform: The scaling/transform applied to the entire SVG
  • Instantiating: Elements being created from the SVG DOM
  • Rendering: Elements being drawn to the canvas with their transforms

SVG Source Display

The complete SVG source is printed inline:

SVG Content for simple-case.svg:
---
<svg id="svg1" viewBox="0 0 200 200" xmlns="http://www.w3.org/2000/svg"
font-family="Noto Sans" font-size="48">
<title>word-spacing</title>
<text id="text1" x="30" y="100" word-spacing="10">Two words</text>
</svg>
---

Look for:

  • <title>: Describes what the test validates
  • Element attributes: Features being tested (e.g., multiple x/y values)
  • Complexity: Number of elements and their properties

Output File Paths

On failure, three PNGs are written to $TEST_UNDECLARED_OUTPUTS_DIR (Bazel exposes this under bazel-testlogs/.../test.outputs/), named after the sanitized test:

actual_<name>.png: Donner's output (what the renderer currently produces).

expected_<name>.png: the resvg golden (reference rendering). This is the resvg-test-suite PNG for this test, generated by resvg and read from the bazel runfiles directory.

diff_<name>.png: per-pixel difference highlight showing where actual and expected differ. Red/orange pixels mark differing pixels; colored outlines mark positional differences.
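A small helper can locate the three PNGs for a given test. This is a sketch only: `failure_artifacts` is a hypothetical name, and falling back to the current directory when TEST_UNDECLARED_OUTPUTS_DIR is unset is an assumption made for illustration, not documented harness behavior.

```python
import os
from pathlib import Path
from typing import Dict


def failure_artifacts(sanitized_name: str, outputs_dir: str = "") -> Dict[str, Path]:
    """Map each artifact kind to the PNG path expected for the given test name."""
    # Prefer an explicit directory, then the Bazel-provided variable, then "."
    # (the "." fallback is an assumption of this sketch).
    base = Path(outputs_dir or os.environ.get("TEST_UNDECLARED_OUTPUTS_DIR", "."))
    return {
        kind: base / f"{kind}_{sanitized_name}.png"
        for kind in ("actual", "expected", "diff")
    }


for kind, path in failure_artifacts("simple_case", "/tmp/outputs").items():
    print(kind, path)
```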

Interpreting Output Files

What to look for:

  • Diff image:
    • Solid red areas = completely different pixels
    • Colored outlines = positional/alignment differences
    • Minimal differences = may just need threshold adjustment
    • Large differences = missing feature or wrong implementation
  • Actual vs Expected:
    • Compare side-by-side to understand the failure
    • Missing elements = not implemented
    • Wrong position = baseline/positioning issue
    • Wrong style = font/styling issue
    • Whole-shape or per-glyph offset = a real coordinate-space / layout bug, not "AA"

3. Analyze the Failure

Based on the output, determine what's causing the failure:

Check the SVG source (printed in test output):

  • Look at the <title> to understand test intent
  • Identify which SVG features are being tested
  • Note complex attributes or patterns

Compare images:

  • Open the diff image to see where differences are
  • Compare actual vs expected side-by-side
  • Assess the magnitude of differences (pixel count)

Review verbose output:

  • Check if elements are being instantiated
  • Verify transforms are being applied
  • Look for errors or warnings in the render log

4. Categorize the Failure

Common failure categories:

  • Not implemented: Feature doesn't exist yet (e.g., <tspan>, writing-mode)
  • UB (Undefined Behavior): Edge case or non-standard behavior; render-only (no compare)
  • Bug: Wrong output for an implemented feature — find the root cause (wrong transform, coverage geometry, color space, premultiplication, layer compositing). Note: pixelmatch already excludes anti-aliased edge pixels, so a diff our harness reports is never "just AA" — a diff large enough to fail a test has a real cause (see CLAUDE.md §"Anti-Aliasing Is Never the Root Cause").

5. Document in resvg_test_suite.cc

Add the failing test to the appropriate INSTANTIATE_TEST_SUITE_P block with a skip comment:

INSTANTIATE_TEST_SUITE_P(
    TextWordSpacing, ImageComparisonTestFixture,
    Combine(
        ValuesIn(getTestsInCategory(
            "text/word-spacing",
            {
                {"simple-case.svg", Params::Skip("Not impl: word-spacing")},
                {"negative.svg", Params::Skip("Not impl: word-spacing")},
            })),
        ValuesIn(ActiveComparisonModes())),
    TestNameFromFilename);

Comment format:

  • Not impl: <feature> - Feature not yet implemented
  • UB: <reason> - Undefined behavior or edge case (use Params::RenderOnly, not Skip)
  • Bug: <description> - Known bug
  • Params::WithThreshold(t, maxPx) - only after a root-cause investigation, with the reason. Never widen a threshold to absorb a diff you haven't explained (see CLAUDE.md / AGENTS.md).

6. Group Related Failures

Within a category's override map, group the entries by reason with a short comment so the gaps read at a glance:

{
    // Not impl: variable-font weight axis
    {"variable-weight.svg", Params::Skip("Not impl: variable-font weight")},
    {"weight-interpolation.svg", Params::Skip("Not impl: variable-font weight")},
    // Bug: synthetic-bold metrics
    {"bolder-keyword.svg", Params::Skip("Bug: synthetic-bold advance width")},
}

Each resvg category is its own getTestsInCategory(...) block, so a feature that shows up in more than one category (e.g. text/font-weight and painting/fill) is tracked once per category, not in a shared list.
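When many skips are added at once, the grouped override map can be generated rather than typed by hand. The helper below is a hypothetical convenience script, not part of Donner; it only emits `Params::Skip` entries in the layout shown above and assumes filenames and reasons contain no double quotes.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def render_override_map(entries: List[Tuple[str, str, str]]) -> str:
    """Render (filename, group comment, skip reason) triples as a C++ override map."""
    groups: Dict[str, List[Tuple[str, str]]] = defaultdict(list)
    for filename, group, reason in entries:
        groups[group].append((filename, reason))

    lines = ["{"]
    # Dicts preserve insertion order, so groups appear in first-seen order.
    for group, members in groups.items():
        lines.append(f"    // {group}")
        for filename, reason in members:
            lines.append(f'    {{"{filename}", Params::Skip("{reason}")}},')
    lines.append("}")
    return "\n".join(lines)


print(render_override_map([
    ("variable-weight.svg", "Not impl: variable-font weight axis", "Not impl: variable-font weight"),
    ("bolder-keyword.svg", "Bug: synthetic-bold metrics", "Bug: synthetic-bold advance width"),
    ("weight-interpolation.svg", "Not impl: variable-font weight axis", "Not impl: variable-font weight"),
]))
```

Entries that share a group comment are emitted together even if they arrive interleaved, which keeps the map readable as skips accumulate.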

7. Verify Skip Configuration

After adding skips, verify tests run correctly:

bazel run //donner/svg/renderer/tests:resvg_test_suite -c dbg -- '--gtest_filter=TextFontWeight/*'

You should see:

  • Skipped tests don't run
  • Passing tests still pass
  • Clear count of passed/skipped tests

Example Triage Workflow

  1. Run tests
    bazel run //donner/svg/renderer/tests:resvg_test_suite -c dbg -- '--gtest_filter=TextFontWeight/*'
  2. Examine the SVG of the failing test (printed as output)
  3. Open the diff image to see where differences are
  4. Identify the cause of the failure. Either fix the root cause in Donner, modify the test parameters in resvg_test_suite.cc, or mark the test skipped to defer resolving the issue while keeping the suite operational.

Tips

  • Visual inspection: Always view the diff images to understand the nature of failures
  • Magnitude is the tell: hundreds of differing pixels, per-glyph drift, or whole-shape offsets are a real bug — pixelmatch already excludes AA, so the diff is not edge anti-aliasing
  • Categorize systematically: Group tests by missing feature for easier tracking
  • Keep comments concise: Use the established format from existing tests

MCP Servers

The resvg-test-triage MCP server provides automated test analysis. When available, use it to:

Batch analyze test failures:

# After running tests, pass output to MCP server
result = await mcp.call_tool("batch_triage_tests", {
    "test_output": test_output_string
})
# Server returns:
# - Categorized failures by feature
# - Suggested skip comments
# - Grouping recommendations

Analyze individual tests:

result = await mcp.call_tool("analyze_test_failure", {
    "test_name": "e-text-023.svg",
    "svg_content": svg_source,
    "pixel_diff": 8234
})
# Returns feature detection, category, and skip suggestion

Get implementation guidance (NEW):

# Find which files to modify for a missing feature
result = await mcp.call_tool("suggest_implementation_approach", {
    "test_name": "e-text-031.svg",
    "features": ["writing_mode"],
    "category": "text_layout",
    "codebase_files": []  # Optionally provide files from glob/grep
})
# Returns:
# - Ranked list of files to modify
# - Search keywords for finding similar features
# - Implementation hints specific to the feature

Find related tests for batch implementation (NEW):

# Discover all tests failing for the same feature
result = await mcp.call_tool("find_related_tests", {
    "feature": "writing-mode",
    "skip_file_content": resvg_test_suite_cc_content
})
# Returns:
# - List of all tests with this feature (e.g., e-text-031, e-text-033)
# - Impact assessment and priority (low/medium/high)
# - Batch implementation opportunity!

Track feature progress (NEW):

# Generate progress report for a test category
result = await mcp.call_tool("generate_feature_report", {
    "category": "e-text",
    "test_output": bazel_test_output,
    "skip_file_content": resvg_test_suite_cc_content
})
# Returns:
# - Pass/fail/skip counts
# - Completion rate percentage
# - Next priority feature by test impact
# - List of all missing features

Analyze visual differences (NEW):

# Programmatically analyze diff images
result = await mcp.call_tool("analyze_visual_diff", {
    "diff_image_path": "/tmp/diff_e-text-031.png",
    "actual_image_path": "/tmp/e-text-031.png",
    "expected_image_path": "/path/to/resvg-test-suite/png/e-text-031.png"
})
# Returns:
# - Difference type: positioning/missing_element/styling/anti_aliasing
# - Visual analysis metrics (pixel counts, regions, offsets)
# - Likely cause with confidence score

Setup:

  1. Install: pip install -e tools/mcp-servers/resvg-test-triage
  2. Configure in MCP settings:
    • Claude Code: See tools/mcp-servers/resvg-test-triage/mcp-config-example.json
    • VSCode: Add to .vscode/mcp.json (see README for format)
  3. Use tools during test triage

Benefits:

  • Consistent categorization across all tests
  • Auto-detection of SVG features being tested
  • Batch processing of 50+ test failures
  • Properly formatted skip comments
  • Vision model analysis with actual, expected, and diff images
  • Implementation guidance - suggests files to modify
  • Batch opportunities - find all tests for same feature
  • Progress tracking - monitor feature completion
  • Visual analysis - categorize diff types automatically

See resvg-test-triage README for full documentation.