📑 refactor: File Search Citations Dual-Format Unicode Handling (#10888)

* 🔖 refactor: citation handling with support for both literal and Unicode formats

* refactor: file search messages for edge cases in documents

* 🔧 refactor: Enhance citation handling with detailed regex patterns for literal and Unicode formats

* 🔧 refactor: Simplify file search query handling by removing unnecessary parameters and improving result formatting

*  test: Add comprehensive integration tests for citation processing flow with support for literal and Unicode formats

* 🔧 refactor: Improve regex match handling and add performance tests for citation processing
This commit is contained in:
Danny Avila 2025-12-10 13:25:56 -05:00 committed by GitHub
parent af8394b05c
commit 03c9d5f79f
No known key found for this signature in database
GPG key ID: B5690EEEBB952194
6 changed files with 638 additions and 18 deletions

View file

@ -320,19 +320,19 @@ Current Date & Time: ${replaceSpecialVars({ text: '{{iso_datetime}}' })}
**Execute immediately without preface.** After search, provide a brief summary addressing the query directly, then structure your response with clear Markdown formatting (## headers, lists, tables). Cite sources properly, tailor tone to query type, and provide comprehensive details.
**CITATION FORMAT - INVISIBLE UNICODE ANCHORS ONLY:**
Use these Unicode characters: \\ue202 (before each anchor), \\ue200 (group start), \\ue201 (group end), \\ue203 (highlight start), \\ue204 (highlight end)
**CITATION FORMAT - UNICODE ESCAPE SEQUENCES ONLY:**
Use these EXACT escape sequences (copy verbatim): \\ue202 (before each anchor), \\ue200 (group start), \\ue201 (group end), \\ue203 (highlight start), \\ue204 (highlight end)
Anchor pattern: turn{N}{type}{index} where N=turn number, type=search|news|image|ref, index=0,1,2...
Anchor pattern: \\ue202turn{N}{type}{index} where N=turn number, type=search|news|image|ref, index=0,1,2...
**Examples:**
**Examples (copy these exactly):**
- Single: "Statement.\\ue202turn0search0"
- Multiple: "Statement.\\ue202turn0search0\\ue202turn0news1"
- Group: "Statement. \\ue200\\ue202turn0search0\\ue202turn0news1\\ue201"
- Highlight: "\\ue203Cited text.\\ue204\\ue202turn0search0"
- Image: "See photo\\ue202turn0image0."
**CRITICAL:** Place anchors AFTER punctuation. Cite every non-obvious fact/quote. NEVER use markdown links, [1], footnotes, or HTML tags.`.trim();
**CRITICAL:** Output escape sequences EXACTLY as shown. Do NOT substitute with or other symbols. Place anchors AFTER punctuation. Cite every non-obvious fact/quote. NEVER use markdown links, [1], footnotes, or HTML tags.`.trim();
return createSearchTool({
...result.authResult,
onSearchResults,