diff --git a/doc-build/md013.md b/doc-build/md013.md
index c129a003..848f4c74 100644
--- a/doc-build/md013.md
+++ b/doc-build/md013.md
@@ -7,11 +7,11 @@ up into multiple lines. To set a different maximum length for headings, use
 This rule has an exception when there is no whitespace beyond the configured
 line length. This allows you to include items such as long URLs without being
 forced to break them in the middle. To disable this exception, set the `strict`
-parameter to `true` and an issue will be reported when any line is too long. To
-warn for lines that are too long and could be fixed but allow long lines
+parameter to `true` and a violation will be reported when any line is too long.
+To warn for lines that are too long and could be fixed while allowing long lines
 without spaces, set the `stern` parameter to `true`.
 
-For example (assuming normal behavior):
+For example (assuming default behavior):
 
 ```markdown
 IF THIS LINE IS THE MAXIMUM LENGTH
@@ -26,14 +26,25 @@ mode, the middle two lines above are both violations, but the last is okay.
 You have the option to exclude this rule for code blocks, tables, or headings.
 To do so, set the `code_blocks`, `tables`, or `headings` parameter(s) to false.
 
-Code blocks are included in this rule by default since it is often a
-requirement for document readability, and tentatively compatible with code
-rules. Still, some languages do not lend themselves to short lines.
+Code blocks are included in this rule by default since it is often a requirement
+for document readability, and tentatively compatible with code rules. However,
+some languages do not lend themselves to short lines.
+
+By default, this rule treats the number of characters in the line as its length.
+Many editors render [emoji][emoji-013] and [CJK characters][cjk-characters-013]
+at *twice* the width of [Latin characters][latin-script-013], so the
+`wide_characters` parameter can be set to `true` to treat the "visual" width of
+the line as its length instead.
 
 Lines with link/image reference definitions and standalone lines (i.e., not part
-of a paragraph) with only a link/image (possibly using (strong) emphasis) are
-always exempted from this rule (even in `strict` mode) because there is often no
-way to split such lines without breaking the URL.
+of a paragraph) with only a link/image (possibly using emphasis) are always
+exempted from this rule (even in `strict` mode) because there is often no way to
+split such lines without breaking the URL.
 
 Rationale: Extremely long lines can be difficult to work with in some editors.
+
 More information: .
+
+[cjk-characters-013]: https://en.wikipedia.org/wiki/CJK_characters
+[emoji-013]: https://en.wikipedia.org/wiki/Emoji
+[latin-script-013]: https://en.wikipedia.org/wiki/Latin_script
diff --git a/doc-build/md060.md b/doc-build/md060.md
index cf6912c9..d7515804 100644
--- a/doc-build/md060.md
+++ b/doc-build/md060.md
@@ -47,11 +47,11 @@ not, violations are be reported for whichever style would produce the *fewest*
 issues (i.e., whichever style is the closest match).
 
 Note: Delimiter placement for the `aligned` style is based on visual appearance
-and not character count. Because editors typically render [emoji][emoji] and
-[CJK characters][cjk-characters] at *twice* the width of
-[Latin characters][latin-script], this rule takes that into account for tables
-using the `aligned` style. The following table is correctly formatted and will
-appear aligned in most editors and monospaced fonts:
+and not character count.
Because editors typically render [emoji][emoji-060] and
+[CJK characters][cjk-characters-060] at *twice* the width of
+[Latin characters][latin-script-060], this rule takes that into account for
+tables using the `aligned` style. The following table is correctly formatted and
+will appear aligned in most editors and monospaced fonts:
@@ -67,7 +67,7 @@ appear aligned in most editors and monospaced fonts:
 
 Rationale: Consistent formatting makes it easier to understand a document.
 
-[cjk-characters]: https://en.wikipedia.org/wiki/CJK_characters
-[emoji]: https://en.wikipedia.org/wiki/Emoji
+[cjk-characters-060]: https://en.wikipedia.org/wiki/CJK_characters
+[emoji-060]: https://en.wikipedia.org/wiki/Emoji
 [gfm-table-060]: https://docs.github.com/en/get-started/writing-on-github/working-with-advanced-formatting/organizing-information-with-tables
-[latin-script]: https://en.wikipedia.org/wiki/Latin_script
+[latin-script-060]: https://en.wikipedia.org/wiki/Latin_script
diff --git a/doc/Rules.md b/doc/Rules.md
index 042b7dbe..e0a7e5b4 100644
--- a/doc/Rules.md
+++ b/doc/Rules.md
@@ -496,6 +496,7 @@ Parameters:
 - `stern`: Stern length checking (`boolean`, default `false`)
 - `strict`: Strict length checking (`boolean`, default `false`)
 - `tables`: Include tables (`boolean`, default `true`)
+- `wide_characters`: Expand wide characters (`boolean`, default `false`)
 
 This rule is triggered when there are lines that are longer than the
 configured `line_length` (default: 80 characters). To fix this, split the line
@@ -506,11 +507,11 @@ up into multiple lines. To set a different maximum length for headings, use
 This rule has an exception when there is no whitespace beyond the configured
 line length. This allows you to include items such as long URLs without being
 forced to break them in the middle. To disable this exception, set the `strict`
-parameter to `true` and an issue will be reported when any line is too long.
To
-warn for lines that are too long and could be fixed but allow long lines
+parameter to `true` and a violation will be reported when any line is too long.
+To warn for lines that are too long and could be fixed while allowing long lines
 without spaces, set the `stern` parameter to `true`.
 
-For example (assuming normal behavior):
+For example (assuming default behavior):
 
 ```markdown
 IF THIS LINE IS THE MAXIMUM LENGTH
@@ -525,18 +526,29 @@ mode, the middle two lines above are both violations, but the last is okay.
 You have the option to exclude this rule for code blocks, tables, or headings.
 To do so, set the `code_blocks`, `tables`, or `headings` parameter(s) to false.
 
-Code blocks are included in this rule by default since it is often a
-requirement for document readability, and tentatively compatible with code
-rules. Still, some languages do not lend themselves to short lines.
+Code blocks are included in this rule by default since it is often a requirement
+for document readability, and tentatively compatible with code rules. However,
+some languages do not lend themselves to short lines.
+
+By default, this rule treats the number of characters in the line as its length.
+Many editors render [emoji][emoji-013] and [CJK characters][cjk-characters-013]
+at *twice* the width of [Latin characters][latin-script-013], so the
+`wide_characters` parameter can be set to `true` to treat the "visual" width of
+the line as its length instead.
 
 Lines with link/image reference definitions and standalone lines (i.e., not part
-of a paragraph) with only a link/image (possibly using (strong) emphasis) are
-always exempted from this rule (even in `strict` mode) because there is often no
-way to split such lines without breaking the URL.
+of a paragraph) with only a link/image (possibly using emphasis) are always
+exempted from this rule (even in `strict` mode) because there is often no way to
+split such lines without breaking the URL.
 
 Rationale: Extremely long lines can be difficult to work with in some editors.
+
 More information: .
 
+[cjk-characters-013]: https://en.wikipedia.org/wiki/CJK_characters
+[emoji-013]: https://en.wikipedia.org/wiki/Emoji
+[latin-script-013]: https://en.wikipedia.org/wiki/Latin_script
+
 
 ## `MD014` - Dollar signs used before commands without showing output
 
@@ -2746,11 +2758,11 @@ not, violations are be reported for whichever style would produce the *fewest*
 issues (i.e., whichever style is the closest match).
 
 Note: Delimiter placement for the `aligned` style is based on visual appearance
-and not character count. Because editors typically render [emoji][emoji] and
-[CJK characters][cjk-characters] at *twice* the width of
-[Latin characters][latin-script], this rule takes that into account for tables
-using the `aligned` style. The following table is correctly formatted and will
-appear aligned in most editors and monospaced fonts:
+and not character count. Because editors typically render [emoji][emoji-060] and
+[CJK characters][cjk-characters-060] at *twice* the width of
+[Latin characters][latin-script-060], this rule takes that into account for
+tables using the `aligned` style. The following table is correctly formatted and
+will appear aligned in most editors and monospaced fonts:
@@ -2766,10 +2778,10 @@ appear aligned in most editors and monospaced fonts:
 
 Rationale: Consistent formatting makes it easier to understand a document.
 
-[cjk-characters]: https://en.wikipedia.org/wiki/CJK_characters
-[emoji]: https://en.wikipedia.org/wiki/Emoji
+[cjk-characters-060]: https://en.wikipedia.org/wiki/CJK_characters
+[emoji-060]: https://en.wikipedia.org/wiki/Emoji
 [gfm-table-060]: https://docs.github.com/en/get-started/writing-on-github/working-with-advanced-formatting/organizing-information-with-tables
-[latin-script]: https://en.wikipedia.org/wiki/Latin_script
+[latin-script-060]: https://en.wikipedia.org/wiki/Latin_script
@@ -78,7 +78,7 @@ appear aligned in most editors and monospaced fonts:
 
 Rationale: Consistent formatting makes it easier to understand a document.
 
-[cjk-characters]: https://en.wikipedia.org/wiki/CJK_characters
-[emoji]: https://en.wikipedia.org/wiki/Emoji
+[cjk-characters-060]: https://en.wikipedia.org/wiki/CJK_characters
+[emoji-060]: https://en.wikipedia.org/wiki/Emoji
 [gfm-table-060]: https://docs.github.com/en/get-started/writing-on-github/working-with-advanced-formatting/organizing-information-with-tables
-[latin-script]: https://en.wikipedia.org/wiki/Latin_script
+[latin-script-060]: https://en.wikipedia.org/wiki/Latin_script
diff --git a/lib/configuration-strict.d.ts b/lib/configuration-strict.d.ts
index 563e22ed..5e1f6c97 100644
--- a/lib/configuration-strict.d.ts
+++ b/lib/configuration-strict.d.ts
@@ -465,6 +465,10 @@ export interface ConfigurationStrict {
          * Stern length checking
          */
         stern?: boolean;
+        /**
+         * Expand wide characters
+         */
+        wide_characters?: boolean;
       };
   /**
    * MD013/line-length : Line length : https://github.com/DavidAnson/markdownlint/blob/v0.39.0/doc/md013.md
@@ -513,6 +517,10 @@ export interface ConfigurationStrict {
          * Stern length checking
          */
         stern?: boolean;
+        /**
+         * Expand wide characters
+         */
+        wide_characters?: boolean;
       };
   /**
    * MD014/commands-show-output : Dollar signs used before commands without showing output : https://github.com/DavidAnson/markdownlint/blob/v0.39.0/doc/md014.md
diff --git a/lib/md013.mjs b/lib/md013.mjs
index
1bb21aed..bcb4611e 100644
--- a/lib/md013.mjs
+++ b/lib/md013.mjs
@@ -3,6 +3,7 @@
 import { addErrorDetailIf } from "../helpers/helpers.cjs";
 import { filterByTypesCached, getReferenceLinkImageData } from "./cache.mjs";
 import { addRangeToSet, getDescendantsByType } from "../helpers/micromark-helpers.cjs";
+import stringWidth from "string-width";
 
 // Regular expression for a line that is not wrappable
 const notWrappableRe = /^(?:[#>\s]*\s)?\S*$/;
@@ -28,6 +29,7 @@ export default {
     const includeTables = (tables === undefined) ? true : !!tables;
     const headings = params.config.headings;
     const includeHeadings = (headings === undefined) ? true : !!headings;
+    const wideCharacters = !!params.config.wide_characters;
     const headingLineNumbers = new Set();
     for (const heading of filterByTypesCached([ "atxHeading", "setextHeading" ])) {
       addRangeToSet(headingLineNumbers, heading.startLine, heading.endLine);
@@ -65,6 +67,7 @@ export default {
       const inTable = tableLineNumbers.has(lineNumber);
       const maxLength = inCode ? codeLineLength : (isHeading ? headingLineLength : lineLength);
       const text = (strict || stern) ? line : line.replace(/\S*$/u, "");
+      const textLength = wideCharacters ? stringWidth(text) : text.length;
       if ((maxLength > 0) &&
           (includeCodeBlocks || !inCode) &&
           (includeTables || !inTable) &&
@@ -73,15 +76,15 @@ export default {
           (strict ||
            (!(stern && notWrappableRe.test(line)) &&
             !linkOnlyLineNumbers.has(lineNumber))) &&
-          (text.length > maxLength)) {
+          (textLength > maxLength)) {
         addErrorDetailIf(
           onError,
           lineNumber,
           maxLength,
-          line.length,
+          wideCharacters ? stringWidth(line) : line.length,
           undefined,
           undefined,
-          [ maxLength + 1, line.length - maxLength ]
+          wideCharacters ?
undefined : [ maxLength + 1, line.length - maxLength ]
         );
       }
     }
diff --git a/schema/.markdownlint.jsonc b/schema/.markdownlint.jsonc
index 831440ad..3f6fb708 100644
--- a/schema/.markdownlint.jsonc
+++ b/schema/.markdownlint.jsonc
@@ -86,7 +86,9 @@
     // Strict length checking
     "strict": false,
     // Stern length checking
-    "stern": false
+    "stern": false,
+    // Expand wide characters
+    "wide_characters": false
   },
 
   // MD014/commands-show-output : Dollar signs used before commands without showing output : https://github.com/DavidAnson/markdownlint/blob/v0.39.0/doc/md014.md
diff --git a/schema/.markdownlint.yaml b/schema/.markdownlint.yaml
index e769041e..eda318bb 100644
--- a/schema/.markdownlint.yaml
+++ b/schema/.markdownlint.yaml
@@ -79,6 +79,8 @@ MD013:
   strict: false
   # Stern length checking
   stern: false
+  # Expand wide characters
+  wide_characters: false
 
 # MD014/commands-show-output : Dollar signs used before commands without showing output : https://github.com/DavidAnson/markdownlint/blob/v0.39.0/doc/md014.md
 MD014: true
diff --git a/schema/build-config-schema.mjs b/schema/build-config-schema.mjs
index 3b5a7c2b..a145c1be 100644
--- a/schema/build-config-schema.mjs
+++ b/schema/build-config-schema.mjs
@@ -264,6 +264,12 @@ for (const rule of rules) {
         "type": "boolean",
         "default": false
       };
+      // @ts-ignore
+      subscheme.properties.wide_characters = {
+        "description": "Expand wide characters",
+        "type": "boolean",
+        "default": false
+      };
       break;
     case "MD022":
       // @ts-ignore
diff --git a/schema/markdownlint-config-schema-strict.json b/schema/markdownlint-config-schema-strict.json
index f9c0a250..2078fa46 100644
--- a/schema/markdownlint-config-schema-strict.json
+++ b/schema/markdownlint-config-schema-strict.json
@@ -920,6 +920,11 @@
               "description": "Stern length checking",
               "type": "boolean",
               "default": false
+            },
+            "wide_characters": {
+              "description": "Expand wide characters",
+              "type": "boolean",
+              "default": false
             }
           }
         }
@@ -998,6 +1003,11 @@
               "description": "Stern length
checking",
               "type": "boolean",
               "default": false
+            },
+            "wide_characters": {
+              "description": "Expand wide characters",
+              "type": "boolean",
+              "default": false
             }
           }
         }
diff --git a/schema/markdownlint-config-schema.json b/schema/markdownlint-config-schema.json
index 91b2bc2f..cafb4cf9 100644
--- a/schema/markdownlint-config-schema.json
+++ b/schema/markdownlint-config-schema.json
@@ -920,6 +920,11 @@
               "description": "Stern length checking",
               "type": "boolean",
               "default": false
+            },
+            "wide_characters": {
+              "description": "Expand wide characters",
+              "type": "boolean",
+              "default": false
             }
           }
         }
@@ -998,6 +1003,11 @@
               "description": "Stern length checking",
               "type": "boolean",
               "default": false
+            },
+            "wide_characters": {
+              "description": "Expand wide characters",
+              "type": "boolean",
+              "default": false
             }
           }
         }
diff --git a/test/long-lines-thresholds-wide-characters.md b/test/long-lines-thresholds-wide-characters.md
new file mode 100644
index 00000000..ee0a90fc
--- /dev/null
+++ b/test/long-lines-thresholds-wide-characters.md
@@ -0,0 +1,50 @@
+# Long Lines Thresholds (Wide Characters)
+
+00000000011111111112222222222333333333344444444445
+12345678901234567890123456789012345678901234567890
+
+Texxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx t
+
+Texxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx t
+
+Texxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx✅ t
+
+{MD013:-2} {MD013:-4}
+
+## Texxxxxxxxxxxxxxxxxxxxxxxx t
+
+## Texxxxxxxxxxxxxxxxxxxxxxxxx t
+
+## Texxxxxxxxxxxxxxxxxxxxxxx✅ t
+
+{MD013:-2} {MD013:-4}
+
+```text
+Texxxxxxxxxxxxxxxxx t
+Texxxxxxxxxxxxxxxxxx t
+Texxxxxxxxxxxxxxxx✅ t
+```
+
+{MD013:-3} {MD013:-4}
+
+    Texxxxxxxxxxxxx t
+    Texxxxxxxxxxxxxx t
+    Texxxxxxxxxxxx✅ t
+
+{MD013:-2} {MD013:-3}
+
+/ 👋🌎 / 你好,世界 / こんにちは世界 / 안녕 세상 /
+
+むかしむかし,あるところに,おじいさんとおばあさんがくらしていました。
+
+{MD013:-2} {MD013:-4}
+
+
diff --git a/test/long-lines-thresholds.md b/test/long-lines-thresholds.md
index 256e017f..23a5a77f 100644
--- a/test/long-lines-thresholds.md
+++ b/test/long-lines-thresholds.md
@@ -7,25 +7,35 @@
Texxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx t
 
 Texxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx t
 
-{MD013:-2}
+Texxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx✅ t
+
+{MD013:-4}
 
 ## Texxxxxxxxxxxxxxxxxxxxxxxx t
 
 ## Texxxxxxxxxxxxxxxxxxxxxxxxx t
 
-{MD013:-2}
+## Texxxxxxxxxxxxxxxxxxxxxxx✅ t
+
+{MD013:-4}
 
 ```text
 Texxxxxxxxxxxxxxxxx t
 Texxxxxxxxxxxxxxxxxx t
+Texxxxxxxxxxxxxxxx✅ t
 ```
 
-{MD013:-3}
+{MD013:-4}
 
     Texxxxxxxxxxxxx t
     Texxxxxxxxxxxxxx t
+    Texxxxxxxxxxxx✅ t
 
-{MD013:-2}
+{MD013:-3}
+
+/ 👋🌎 / 你好,世界 / こんにちは世界 / 안녕 세상 /
+
+むかしむかし,あるところに,おじいさんとおばあさんがくらしていました。
␊
+      `,
+    }
+
 ## long-lines-thresholds.md
 
 > Snapshot 1
@@ -42689,7 +42873,7 @@ Generated by [AVA](https://avajs.dev).
             2,
           ],
           fixInfo: null,
-          lineNumber: 14,
+          lineNumber: 16,
           ruleDescription: 'Line length',
           ruleInformation: 'https://github.com/DavidAnson/markdownlint/blob/v0.0.0/doc/md013.md',
           ruleNames: [
@@ -42706,7 +42890,7 @@ Generated by [AVA](https://avajs.dev).
             2,
           ],
           fixInfo: null,
-          lineNumber: 20,
+          lineNumber: 24,
           ruleDescription: 'Line length',
           ruleInformation: 'https://github.com/DavidAnson/markdownlint/blob/v0.0.0/doc/md013.md',
           ruleNames: [
@@ -42723,7 +42907,7 @@ Generated by [AVA](https://avajs.dev).
             2,
           ],
           fixInfo: null,
-          lineNumber: 26,
+          lineNumber: 31,
           ruleDescription: 'Line length',
           ruleInformation: 'https://github.com/DavidAnson/markdownlint/blob/v0.0.0/doc/md013.md',
           ruleNames: [
@@ -42742,25 +42926,33 @@ Generated by [AVA](https://avajs.dev).
       ␊
       Texxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx t␊
       ␊
-      {MD013:-2}␊
+      Texxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx✅ t␊
+      ␊
+      {MD013:-4}␊
       ␊
       ## Texxxxxxxxxxxxxxxxxxxxxxxx t␊
       ␊
       ## Texxxxxxxxxxxxxxxxxxxxxxxxx t␊
       ␊
-      {MD013:-2}␊
+      ## Texxxxxxxxxxxxxxxxxxxxxxx✅ t␊
+      ␊
+      {MD013:-4}␊
       ␊
       \`\`\`text␊
       Texxxxxxxxxxxxxxxxx t␊
       Texxxxxxxxxxxxxxxxxx t␊
+      Texxxxxxxxxxxxxxxx✅ t␊
       \`\`\`␊
       ␊
-      {MD013:-3}␊
+      {MD013:-4}␊
       ␊
           Texxxxxxxxxxxxx t␊
           Texxxxxxxxxxxxxx t␊
+          Texxxxxxxxxxxx✅ t␊
       ␊
-      {MD013:-2}␊
+      {MD013:-3}␊
+      ␊
+      / 👋🌎 / 你好,世界 / こんにちは世界 / 안녕 세상 /␊
       ␊