Arize-ai/phoenix

formatTemplate() HTML-escapes variables, corrupting LLM prompt content

Summary

  • Context: The formatTemplate() function in js/packages/phoenix-evals/src/template/applyTemplate.ts is used to populate LLM prompt templates with variable content for evaluation tasks such as hallucination detection and document relevancy scoring.

  • Bug: The function calls Mustache.render(), which HTML-escapes all variables by default, converting characters such as <, >, &, and " into HTML entities (&lt;, &gt;, &amp;, &quot;).

  • Actual vs. expected: LLM prompts containing code snippets or comparison operators are rendered with HTML entities in place of the original characters. Expected: variable content is inserted verbatim as plain text.

  • Impact: Code snippets, technical documentation, and mathematical expressions in prompts are silently altered before they reach the LLM. No error is raised, so the evaluation may run on corrupted input and produce incorrect results without any visible warning.

Code with bug

export function formatTemplate(args: {
  template: Template;
  variables: Record<string, unknown>;
}) {
  const { template, variables } = args;

  return Mustache.render(template, variables); // <-- BUG 🔴 HTML-escapes all variables by default
}

Example affected template (HALLUCINATION_TEMPLATE.ts):

[Query]: {{input}}

[Reference text]: {{reference}}
  // <-- BUG 🔴 If reference contains code, < and > are escaped

[Answer]: {{output}}
  // <-- BUG 🔴 If output contains code, < and > are escaped

Mustache.js default behavior

The Mustache.js library documentation explicitly states: “All variables are HTML-escaped by default.”

The library’s source code implements escaping for these characters:

var entityMap = {
  '&': '&amp;',
  '<': '&lt;',
  '>': '&gt;',
  '"': '&quot;',
  "'": '&#39;',
  '/': '&#x2F;',
  '`': '&#x60;',
  '=': '&#x3D;',
};
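
To make the effect of this map concrete, the following is a minimal TypeScript sketch that approximates the escaping step. The function name escapeHtmlLikeMustache is hypothetical, and the regular expression is an assumption derived from the map above rather than a copy of the library's code.

```typescript
// Entity map as listed above.
const entityMap: Record<string, string> = {
  "&": "&amp;",
  "<": "&lt;",
  ">": "&gt;",
  '"': "&quot;",
  "'": "&#39;",
  "/": "&#x2F;",
  "`": "&#x60;",
  "=": "&#x3D;",
};

// Hypothetical stand-in for the library's escaping step (an assumption,
// not the actual Mustache.js function): every character that has an entry
// in entityMap is replaced with its HTML entity.
function escapeHtmlLikeMustache(value: unknown): string {
  return String(value).replace(/[&<>"'`=\/]/g, (ch) => entityMap[ch] ?? ch);
}

const escaped = escapeHtmlLikeMustache("if (a > b && c < d)");
console.log(escaped); // prints: if (a &gt; b &amp;&amp; c &lt; d)
```

Note that the map also covers /, `, and =, so file paths, Markdown code spans, and assignments in prompt variables would be affected as well, not only angle brackets and ampersands.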

Example corruption:

// Input
formatTemplate({
  template: "Analyze: {{code}}",
  variables: {
    code: "if (a > b && c < d)",
  },
});

// Expected output
"Analyze: if (a > b && c < d)"

// Actual output (corrupted)
"Analyze: if (a &gt; b &amp;&amp; c &lt; d)"

Failing test

it("should NOT HTML-escape special characters for LLM prompts", () => {
  const result = formatTemplate({
    template: "Analyze: {{code}}",
    variables: {
      code: "<script>alert('test')</script>",
    },
  });

  expect(result).toBe("Analyze: <script>alert('test')</script>");
  expect(result).not.toContain("&lt;");
  expect(result).not.toContain("&gt;");
});

it("should handle comparison operators without escaping", () => {
  const result = formatTemplate({
    template: "Expression: {{expr}}",
    variables: {
      expr: "if a > b & c < d",
    },
  });

  expect(result).toBe("Expression: if a > b & c < d");
  expect(result).not.toContain("&amp;");
  expect(result).not.toContain("&lt;");
});

Test output: Both tests fail. The function returns HTML-escaped strings instead of preserving the original characters.

Recommended fix

Disable HTML escaping by passing a custom escape function that returns text unchanged, since LLM prompts are plain text, not HTML:

export function formatTemplate(args: {
  template: Template;
  variables: Record<string, unknown>;
}) {
  const { template, variables } = args;

  return Mustache.render(template, variables, {}, {
    escape: (text) => text, // <-- FIX 🟢 Disable HTML escaping for plain text prompts
  });
}

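This approach preserves all special characters and requires no changes to existing templates, because the {{variable}} syntax itself is unaffected; only the escaping step applied to each substituted value changes.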
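
The idea behind the fix, a renderer whose escaping step is pluggable, can be illustrated without the library. The sketch below is a hypothetical stand-in and is NOT Mustache.js: renderSketch, htmlEscape, and identity are names invented for illustration, and the renderer only understands simple {{name}} tags.

```typescript
type EscapeFn = (text: string) => string;

// Hypothetical stand-in renderer (not Mustache.js). It supports only
// {{name}} tags and passes every substituted value through `escape`.
function renderSketch(
  template: string,
  variables: Record<string, unknown>,
  escape: EscapeFn,
): string {
  return template.replace(/\{\{\s*(\w+)\s*\}\}/g, (_match, name: string) => {
    const value = variables[name];
    // Missing or null values render as an empty string in this sketch.
    return value == null ? "" : escape(String(value));
  });
}

// Simplified HTML escaping covering only &, <, and >.
const htmlEscape: EscapeFn = (text) =>
  text.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");

// Identity escaping: the analogue of `escape: (text) => text` in the fix.
const identity: EscapeFn = (text) => text;

const template = "Analyze: {{code}}";
const variables = { code: "if (a > b && c < d)" };

const withEscaping = renderSketch(template, variables, htmlEscape);
const withoutEscaping = renderSketch(template, variables, identity);

console.log(withEscaping); // prints: Analyze: if (a &gt; b &amp;&amp; c &lt; d)
console.log(withoutEscaping); // prints: Analyze: if (a > b && c < d)
```

Swapping in the identity function leaves the template and variables untouched and changes only what happens to each value at substitution time, which is why the recommended fix is confined to the single Mustache.render() call.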