Arize-ai/phoenix

Backslashes not escaped in sanitizePythonStr produce invalid/corrupted Python string literals

Summary

  • Context: The sanitizePythonStr function in app/src/utils/pythonUtils.ts converts JavaScript strings to Python string literals for generating metadata filter conditions.

  • Bug: The function fails to escape backslash characters when sanitizing strings.

  • Actual vs. expected: Backslashes are passed through unescaped (e.g., C:\Users\test → "C:\Users\test"), when they should be doubled (expected: "C:\\Users\\test").

  • Impact: The function generates invalid Python code that raises SyntaxError or corrupts data through unintended escape sequence interpretation.

Code with bug

function sanitizePythonStr(value: string) {
  return value
    // <-- BUG 🔴 backslashes are never escaped
    .replaceAll("\n", "\\n")
    .replaceAll('"', '\\"');
}

Evidence

The bug manifests in three distinct failure modes:

Test 1: Windows file path with \U sequence

Input (JavaScript string literal):
  'C:\\Users\\test'

Buggy output:
  "C:\Users\test"

Evaluated as:
  eval(r'"C:\Users\test"')

Error:
  SyntaxError: (unicode error) 'unicodeescape' codec can't decode bytes in
  position 2-3: truncated \UXXXXXXXX escape

Test 2: Literal backslash-n interpreted as newline

Input (JavaScript string literal):
  'path\\name'

Buggy output:
  "path\name"

Evaluated as:
  result = eval(r'"path\name"')

Result:
  'path\name'   (\n is read as a newline character)

Expected:
  'path\\name'  (backslash + n)

Data corruption:
  The backslash and the following n are read as one newline escape,
  so the value changes silently with no error raised

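Test 3: Backslash before quote causes unterminated string

Input (JavaScript string literal):
  'value\\"more'

Buggy output:
  "value\\"more"

Evaluated as:
  eval(r'"value\\"more"')

Error:
  SyntaxError: unterminated string literal

Cause:
  The function escapes the quote as `\"`, but the input's own backslash
  pairs with the added one to form `\\` (an escaped backslash). The quote
  is left unescaped and closes the string early, and the trailing `more"`
  opens a string that is never terminated.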
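
The three failure modes above can be reproduced with a short script. This is a minimal sketch, not the project's code: `buggy_sanitize` is a hypothetical Python port of the buggy behavior, `try_round_trip` is an illustrative helper, and `ast.literal_eval` stands in for `eval` so that only literals are parsed.

```python
import ast


def buggy_sanitize(value: str) -> str:
    # Hypothetical Python port of the buggy behavior: newlines and double
    # quotes are escaped, but backslashes are passed through untouched.
    return value.replace("\n", "\\n").replace('"', '\\"')


def try_round_trip(value: str) -> str:
    """Wrap the sanitized value in double quotes and parse it as a Python literal."""
    literal = '"' + buggy_sanitize(value) + '"'
    try:
        parsed = ast.literal_eval(literal)
    except SyntaxError:
        return "syntax error"
    return "ok" if parsed == value else "corrupted"


windows_path = "C:\\Users\\test"   # the text C:\Users\test
backslash_n = "path\\name"         # the text path\name
backslash_quote = 'value\\"more'   # the text value\"more

results = {
    "windows_path": try_round_trip(windows_path),
    "backslash_n": try_round_trip(backslash_n),
    "backslash_quote": try_round_trip(backslash_quote),
}
print(results)
```

Tests 1 and 3 fail loudly with a SyntaxError, while Test 2 parses successfully and returns a different string, which makes it the hardest of the three to notice.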

Correct implementation

def correct_sanitize(value):
    # Escape backslashes first so the backslashes added by the later
    # replacements are not doubled again.
    return (
        value
        .replace('\\', '\\\\')
        .replace('\n', '\\n')
        .replace('"', '\\"')
    )


# Input (Python string literal):
#   'C:\\Users\\test'

# Correct output:
#   "C:\\Users\\test"

result = eval(r'"C:\\Users\\test"')

# Result:
#   'C:\\Users\\test'  (preserved correctly)
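
A round-trip check is one way to confirm that the corrected escaping handles all three evidence inputs, not only the Windows path. The sketch below restates `correct_sanitize` so it runs on its own; the `round_trips` helper and the sample list are illustrative assumptions, and `ast.literal_eval` is used in place of `eval`.

```python
import ast


def correct_sanitize(value: str) -> str:
    # Same replacement order as the implementation above: backslashes first.
    return (
        value
        .replace("\\", "\\\\")
        .replace("\n", "\\n")
        .replace('"', '\\"')
    )


def round_trips(value: str) -> bool:
    """True if the quoted, sanitized value parses back to the original string."""
    literal = '"' + correct_sanitize(value) + '"'
    return ast.literal_eval(literal) == value


samples = [
    "C:\\Users\\test",     # backslash before U and t
    "path\\name",          # backslash followed by n
    'value\\"more',        # backslash before a double quote
    "line one\nline two",  # real newline character
    'say "hi"',            # plain double quotes
    "trailing\\",          # backslash at the very end
]

all_ok = all(round_trips(s) for s in samples)
print(all_ok)
```

The sample list only covers the characters this report discusses, so a passing run says nothing about characters the function does not handle.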

Recommended fix

Escape backslashes before other characters to prevent interference with subsequent escape sequences:

function sanitizePythonStr(value: string) {
  return value
    .replaceAll("\\", "\\\\") // <-- FIX 🟢 Escape backslashes first
    .replaceAll("\n", "\\n")
    .replaceAll('"', '\\"');
}

The order is critical: if backslashes are escaped after newlines, a newline character in the input would first become \n and then incorrectly become \\n, which Python reads as a literal backslash followed by n instead of a newline.
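
The effect of the ordering can be checked directly. The sketch below is written in Python to match the evidence above; both function names are hypothetical, and they differ only in where the backslash replacement sits.

```python
import ast


def escape_backslashes_first(value: str) -> str:
    return value.replace("\\", "\\\\").replace("\n", "\\n").replace('"', '\\"')


def escape_backslashes_last(value: str) -> str:
    # Wrong order: the backslashes added for newlines and quotes get doubled again.
    return value.replace("\n", "\\n").replace('"', '\\"').replace("\\", "\\\\")


text = "a\nb"  # a, a real newline character, b

good = ast.literal_eval('"' + escape_backslashes_first(text) + '"')
bad = ast.literal_eval('"' + escape_backslashes_last(text) + '"')

print(good == text)  # the newline survives the round trip
print(repr(bad))     # the newline has turned into backslash + n
```

With the wrong order the literal still parses, so the corruption is silent, just as in Test 2.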