🧮 fix: Properly Escape Currency and Prevent Code Block LaTeX Bugs (#9248)

* fix(latex): prevent LaTeX conversion when closing $ is preceded by backtick

When text contained a pattern like "$lookup namespace" followed by "`$lookup`",
the regex matched from the first $ to the $ inside the backticks and treated the
entire span as a LaTeX expression. As a result, programming constructs were
incorrectly wrapped in double-dollar delimiters.

- Added negative lookbehind (?<!`) to single dollar regex
- Prevents matching when closing $ immediately follows a backtick
- Fixes false matches on inline code spans that start with $, such as `$lookup`
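
A minimal sketch of the effect, using the two SINGLE_DOLLAR_REGEX versions from this commit's diff. The `findMathSpans` helper and the constant names are hypothetical and exist only for this illustration; this is not how `preprocessLaTeX` itself is structured.

```typescript
// Regexes copied from the diff in this commit (old and new SINGLE_DOLLAR_REGEX).
const OLD_SINGLE_DOLLAR = /(?<!\\)\$(?!\$)((?:[^$\n]|\\[$])+?)(?<!\\)\$(?!\$)/g;
const NEW_SINGLE_DOLLAR = /(?<!\\)\$(?!\$)((?:[^$\n]|\\[$])+?)(?<!\\)(?<!`)\$(?!\$)/g;

// Hypothetical helper: lists every span the regex would treat as inline math.
function findMathSpans(regex: RegExp, text: string): string[] {
  const found = text.match(regex); // with the g flag, returns all full matches or null
  return found ? found.slice() : [];
}

const codeSample = 'The error "invalid $lookup namespace" occurs when using `$lookup` operator';
const mixedSample = 'Use $x + y$ in math but `$lookup` in code';

// Old regex: spans from the first $ to the $ right after the backtick.
const oldSpans = findMathSpans(OLD_SINGLE_DOLLAR, codeSample);
// New regex: the closing $ follows a backtick, so nothing matches.
const newSpans = findMathSpans(NEW_SINGLE_DOLLAR, codeSample);
// Real inline math is still detected: ['$x + y$']
const mixedSpans = findMathSpans(NEW_SINGLE_DOLLAR, mixedSample);

console.log(oldSpans, newSpans, mixedSpans);
```

Note that the lookbehind only guards the closing $; an opening $ that follows a backtick simply finds no closing partner in these samples.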

* fix(latex): detect currency amounts with 4+ digits without commas

The currency regex pattern \d{1,3} only matched amounts with 1-3 initial digits,
causing amounts like $1157.90 to be interpreted as LaTeX instead of currency.
This resulted in text like "$1157.90 (text) + $500 (text) = $1657.90" being
incorrectly converted to a single LaTeX expression.

- Changed pattern from \d{1,3} to \d+ to match any number of initial digits
- Now properly escapes $1000, $10000, $123456, etc. without requiring commas
- Maintains support for comma-formatted amounts like $1,234.56
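
The difference can be sketched by running both CURRENCY_REGEX versions over the sentence from the description. The `escapeCurrency` helper and the constant names are assumptions made for this example; only the two regex literals come from the diff.

```typescript
// Regexes copied from the diff: the pre-commit pattern and the final pattern.
const OLD_CURRENCY = /(?<![\\$])\$(?!\$)(?=\d{1,3}(?:,\d{3})*(?:\.\d{1,2})?(?:\s|$|[^a-zA-Z\d]))/g;
const NEW_CURRENCY = /(?<![\\$])\$(?!\$)(?=\d+(?:,\d{3})*(?:\.\d+)?(?:\s|$|[^a-zA-Z\d]))/g;

// Hypothetical helper: prefixes every $ the regex classifies as currency with a backslash.
// A replacer function is used so the replacement is never parsed for $ patterns.
function escapeCurrency(regex: RegExp, text: string): string {
  return text.replace(regex, () => '\\$');
}

const invoice =
  'The total amount invested is $1157.90 (existing amount) + $500 (new investment) = $1657.90.';

// Old pattern: only $500 is escaped, because \d{1,3} cannot cover "1157" or "1657".
const escapedOld = escapeCurrency(OLD_CURRENCY, invoice);
// New pattern: all three amounts are escaped.
const escapedNew = escapeCurrency(NEW_CURRENCY, invoice);
// Comma-formatted amounts keep working with the new pattern.
const escapedComma = escapeCurrency(NEW_CURRENCY, 'Budget: $1,234.56');

console.log(escapedOld);
console.log(escapedNew);
console.log(escapedComma);
```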

* fix(latex): support currency with unlimited decimal places

The currency regex limited decimal places to 1-2 digits (\.\d{1,2}), which
failed to properly escape amounts with more precision like cryptocurrency
values ($0.00001234), gas prices ($3.999), or exchange rates ($1.23456).

- Changed decimal pattern from \.\d{1,2} to \.\d+
- Now supports any number of decimal places
- Handles edge cases like scientific calculations and high-precision values
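
As a quick check of the final pattern against high-precision values, here is a small sketch that mirrors the new test case. The `escapeCurrency` helper is hypothetical; the regex literal is the updated CURRENCY_REGEX from the diff.

```typescript
// Final CURRENCY_REGEX as it appears in the diff.
const CURRENCY_REGEX = /(?<![\\$])\$(?!\$)(?=\d+(?:,\d{3})*(?:\.\d+)?(?:\s|$|[^a-zA-Z\d]))/g;

// Hypothetical helper: escapes each $ that introduces a currency amount.
function escapeCurrency(text: string): string {
  return text.replace(CURRENCY_REGEX, () => '\\$');
}

// Amounts with many decimal places are escaped.
const escapedPrices = escapeCurrency('Bitcoin: $0.00001234, Gas: $3.999, Rate: $1.234567890');

// A digit followed directly by a letter is not treated as currency,
// so math such as $5x = 10$ is left for the LaTeX handling.
const untouchedMath = escapeCurrency('Solve $5x = 10$');

console.log(escapedPrices);
console.log(untouchedMath);
```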
Danny Avila 2025-08-25 02:44:13 -04:00 committed by GitHub
parent c2f4b383f2
commit 1915d7b195
2 changed files with 34 additions and 3 deletions

@@ -207,4 +207,36 @@ y$ which spans lines`;
     const expected = 'Set $$\\{x | x > \\$0\\}$$ for positive prices';
     expect(preprocessLaTeX(content)).toBe(expected);
   });
+
+  test('does not convert when closing dollar is preceded by backtick', () => {
+    const content = 'The error "invalid $lookup namespace" occurs when using `$lookup` operator';
+    const expected = 'The error "invalid $lookup namespace" occurs when using `$lookup` operator';
+    expect(preprocessLaTeX(content)).toBe(expected);
+  });
+
+  test('handles mixed backtick and non-backtick cases', () => {
+    const content = 'Use $x + y$ in math but `$lookup` in code';
+    const expected = 'Use $$x + y$$ in math but `$lookup` in code';
+    expect(preprocessLaTeX(content)).toBe(expected);
+  });
+
+  test('escapes currency amounts without commas', () => {
+    const content =
+      'The total amount invested is $1157.90 (existing amount) + $500 (new investment) = $1657.90.';
+    const expected =
+      'The total amount invested is \\$1157.90 (existing amount) + \\$500 (new investment) = \\$1657.90.';
+    expect(preprocessLaTeX(content)).toBe(expected);
+  });
+
+  test('handles large currency amounts', () => {
+    const content = 'You can win $1000000 or even $9999999.99!';
+    const expected = 'You can win \\$1000000 or even \\$9999999.99!';
+    expect(preprocessLaTeX(content)).toBe(expected);
+  });
+
+  test('escapes currency with many decimal places', () => {
+    const content = 'Bitcoin: $0.00001234, Gas: $3.999, Rate: $1.234567890';
+    const expected = 'Bitcoin: \\$0.00001234, Gas: \\$3.999, Rate: \\$1.234567890';
+    expect(preprocessLaTeX(content)).toBe(expected);
+  });
 });

@@ -3,9 +3,8 @@ const MHCHEM_CE_REGEX = /\$\\ce\{/g;
 const MHCHEM_PU_REGEX = /\$\\pu\{/g;
 const MHCHEM_CE_ESCAPED_REGEX = /\$\\\\ce\{[^}]*\}\$/g;
 const MHCHEM_PU_ESCAPED_REGEX = /\$\\\\pu\{[^}]*\}\$/g;
-const CURRENCY_REGEX =
-  /(?<![\\$])\$(?!\$)(?=\d{1,3}(?:,\d{3})*(?:\.\d{1,2})?(?:\s|$|[^a-zA-Z\d]))/g;
-const SINGLE_DOLLAR_REGEX = /(?<!\\)\$(?!\$)((?:[^$\n]|\\[$])+?)(?<!\\)\$(?!\$)/g;
+const CURRENCY_REGEX = /(?<![\\$])\$(?!\$)(?=\d+(?:,\d{3})*(?:\.\d+)?(?:\s|$|[^a-zA-Z\d]))/g;
+const SINGLE_DOLLAR_REGEX = /(?<!\\)\$(?!\$)((?:[^$\n]|\\[$])+?)(?<!\\)(?<!`)\$(?!\$)/g;
 
 /**
  * Escapes mhchem package notation in LaTeX by converting single dollar delimiters to double dollars