Bug 158890 - FORMATTING: Custom number format trailing 0 replaced with figure space
Summary: FORMATTING: Custom number format trailing 0 replaced with figure space
Status: RESOLVED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Calc (show other bugs)
Version:
(earliest affected)
Inherited From OOo
Hardware: All All
: medium enhancement
Assignee: Laurent Balland
URL:
Whiteboard: target:24.8.0
Keywords:
Depends on:
Blocks: Number-Format
  Show dependency treegraph
 
Reported: 2023-12-27 18:25 UTC by misc.libre
Modified: 2024-02-17 08:01 UTC (History)
2 users (show)

See Also:
Crash report or crash signature:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description misc.libre 2023-12-27 18:25:24 UTC
Description:
Custom number format using the ? mark replaces trailing 0 with a normal space character. Could we instead replace it with a figure space (unicode 0x2007) which has a width approximating a digit rather than the much thinner normal/ASCII character space (unicode 0x0020). This would make numbers more appropriately aligned an much closer to what users are actually expecting to see from trailing 0 being replaced by a spacing character.

Steps to Reproduce:
1. Use a custom number format using ? symbol for trailing decimal digits
2. Apply to a number with trailing 0 shorter than the number of decimal digits the custom format includes (for example, the value 0.1 for a custom format of "#,##0.??????"


Actual Results:
The default ASCII character is used which is narrower than most digits making obvious alignment issues.

Expected Results:
A space character with a width closer to the width of a digit (such as figure space or en space).


Reproducible: Always


User Profile Reset: Yes

Additional Info:
An alternative may be to add another formatting symbol instead of ? to do this, but it seems to me that the behavior of ? is better altered.
Comment 1 ady 2023-12-27 23:40:28 UTC
I tend to use mono-spaced fonts in order to avoid these issues.

Having said that, I agree that the intention of the "?" symbol for number formats is to take the place (and space) of a number.

Unless there is a reason to intentionally take a different (measure) space of a number character (using the same font), I would have to agree that, somehow, it makes sense that the "?" place/space would be equivalent to a numeric glyph with the same font.

Whether the space should specifically be equivalent to Unicode 0x2007, or whether the method to achieve the goal should be different, I agree with the goal of this enhancement request.

CC'ing Laurent Balland.
Comment 2 Laurent Balland 2023-12-28 13:00:55 UTC
From my point of view, using '?' in a format code is going with using a mono-spaced font. I do not see any user case where using '?' with a non mono-spaced font make sense.

However, it should not hurt too much to replace the normal space with a larger space. Compatibility with older versions and other office suites should be preserved.
Comment 3 ady 2023-12-28 15:27:35 UTC
(In reply to Laurent Balland from comment #2)
> From my point of view, using '?' in a format code is going with using a
> mono-spaced font. I do not see any user case where using '?' with a non
> mono-spaced font make sense.

Agreed.

> 
> However, it should not hurt too much to replace the normal space with a
> larger space. Compatibility with older versions and other office suites
> should be preserved.

Compatibility is very important in this regard. If other spreadsheet tools do not respect the same space as a numeric glyph of the same font (as proposed here), then, by having a change in this regard, users will eventually complain that some imported or exported file in Calc does not "respect" the same presentation as <some_other_spreadsheet_tool>.

Moreover, if the font in use does not include the specific character (e.g. Unicode 0x2007), then a simple space character should be displayed, as it is the current behavior. The worst behavior would be to have some font showing some strange character in place of the "?" instead of some kind of "space" character; that would be unacceptable.

As I said in comment 1, using mono-spaced (or fixed-width) fonts for these cells solves the problem without effort and with no negative consequences.
Comment 4 Eike Rathke 2024-01-03 11:54:44 UTC
(In reply to Laurent Balland from comment #2)
> From my point of view, using '?' in a format code is going with using a
> mono-spaced font. I do not see any user case where using '?' with a non
> mono-spaced font make sense.
"Good" proportional fonts use mono-spaced digits anyway, so even there using U+2007 FIGURE SPACE would make sense.

> However, it should not hurt too much to replace the normal space with a
> larger space. Compatibility with older versions and other office suites
> should be preserved.
There's nothing to preserve. It's a display time rendering thing for a format code and either gets changed or not.
Comment 5 ady 2024-01-03 19:31:05 UTC
(In reply to Eike Rathke from comment #4)

> There's nothing to preserve. It's a display time rendering thing for a
> format code and either gets changed or not.

If it gets changed, what would happen when the specific font in use doesn't include the U+2007 FIGURE SPACE character? Fallback to the normal space character?

I fear the negative consequences, as expressed in comment 3, including the possible difference in rendering.
Comment 6 misc.libre 2024-01-03 20:53:52 UTC
(In reply to ady from comment #5)
> (In reply to Eike Rathke from comment #4)
> 
> > There's nothing to preserve. It's a display time rendering thing for a
> > format code and either gets changed or not.
> 
> If it gets changed, what would happen when the specific font in use doesn't
> include the U+2007 FIGURE SPACE character? Fallback to the normal space
> character?
> 
> I fear the negative consequences, as expressed in comment 3, including the
> possible difference in rendering.

As far as I can tell, Calc will fallback to a font in the OS that does include the character.  For example, if your cell is in FontX but that font does not include character Y, then Calc renders everything but character Y in FontX but grabs that one missing character from FontZ that does have it.  As much as I have been trying to break Calc's rendering off-and-on in the last few days (even trying to test it with [almost] all of the original fonts released with Win 3.1), it appears to be surprisingly robust. No ugly box characters (etc.) so far for any unavailable obscure characters that I have tested. The characters from other fonts Calc falls back on seem to be very similar in size to the requested font, too.

It looks like almost everyone will have a font to fall back on to grab 0x2007 (figure space) from. The FileFormat . Info website lists several fonts that support 0x2007. Among those fonts listed are Arial, Arial Unicode, Calibri, Cambria, Consolus, DejaVu, Doulos, Gentium, Liberation Sans, Linux Libertine, Lucida Sans, Palatino Linotype, Roboto, Segoe UI, Source Code, Source Sans, Tahoma, and Times New Roman. I think that pretty much covers Linux, Windows, and Mac. I think some of these fonts are also installed when LibreOffice is installed (Gentium, DejaVu, and few others).
Comment 7 ady 2024-01-04 03:39:35 UTC
I just performed a test. Whether it is of any utility or validity, IDK.

1. In Calc:
A1: =1&UNICHAR(HEX2DEC(2007))&1
A2: =1&UNICHAR(HEX2DEC(20))&1

Note that the result is text containing U+2007 and U+0020 respectively, not a formatted numeric value using "?" as part of its number format code.

2. Select both cells and copy them.
3. Open some simple text editor. On MS Windows this could be notepad.exe.
4. Paste the previously-copied cells into the text editor.
5. For the text editor, select the (previously-pasted) text and apply a basic font that doesn't support U+2007. On MS Windows, this could be, for instance, the old "Courier" (not to be confused with "Courier New").

The result in the text editor shows a special symbol in place of the original U+2007 character.

IDK whether this procedure shows some kind of potential problem when either exporting or importing data when the rendered characters would include U+2007 (instead of the common space character U+0020). I just wanted to share the results, just in case.

I still don't know what other spreadsheet tools do (or rather, how the result is rendered) with the "?" format code when using "proportional" fonts. This is important, so users won't complain about seeing different rendered results (or exported formats, such as in PDFs or HTMLs).
Comment 8 misc.libre 2024-01-04 19:32:09 UTC
(In reply to ady from comment #7)
> I just performed a test. Whether it is of any utility or validity, IDK.
> 
> 1. In Calc:
> A1: =1&UNICHAR(HEX2DEC(2007))&1
> A2: =1&UNICHAR(HEX2DEC(20))&1
> 
> Note that the result is text containing U+2007 and U+0020 respectively, not
> a formatted numeric value using "?" as part of its number format code.
> 
> 2. Select both cells and copy them.
> 3. Open some simple text editor. On MS Windows this could be notepad.exe.
> 4. Paste the previously-copied cells into the text editor.
> 5. For the text editor, select the (previously-pasted) text and apply a
> basic font that doesn't support U+2007. On MS Windows, this could be, for
> instance, the old "Courier" (not to be confused with "Courier New").
> 
> The result in the text editor shows a special symbol in place of the
> original U+2007 character.
> 
> IDK whether this procedure shows some kind of potential problem when either
> exporting or importing data when the rendered characters would include
> U+2007 (instead of the common space character U+0020). I just wanted to
> share the results, just in case.
> 
> I still don't know what other spreadsheet tools do (or rather, how the
> result is rendered) with the "?" format code when using "proportional"
> fonts. This is important, so users won't complain about seeing different
> rendered results (or exported formats, such as in PDFs or HTMLs).

This would not be a problem with custom formats (which is being discussed here) where this is applied as a spacing character for display purposes (including the current way it is) since it is Only visual formatting. When copying a cell, formatting is Not copied at all, only the actual inputted value that the cell contains. For example, if a cell is formatted to represent negative values with round brackets (commonly referred to as parenthesis marks), when you copy a cell containing "-23" as a value — displayed as "(23)" in the cell — and then paste it into a text editor, what was copied and then pasted is cells value of "-23", not the formatted/displayed "(23)".
Comment 9 ady 2024-01-04 20:38:40 UTC
(In reply to misc.libre from comment #8)
>  When
> copying a cell, formatting is Not copied at all, only the actual inputted
> value that the cell contains.

Hmm. I am not sure that's accurate. For instance, if we have a numeric value of "1" (without quotation marks) with General format, copying it to notepad results in the same 1. However, when the format is "0.000", the same numeric value "1" (without quotation marks) is copied as "1.000" (without quotation marks).

Similarly, the currency format and the parentheses on negative numeric values, are considered too in the above procedure with notepad.

Moreover, there are ways to import/export data taking in consideration (the cell) formats.

To be clear, I am just pointing out potential issues, to be considered and (re-)tested in case any change is implemented.

In the meantime, I would suggest using mono-spaced (fix-width) fonts for these kind of cell's values and formats, as others have suggested.
Comment 10 misc.libre 2024-01-07 04:45:01 UTC
(In reply to ady from comment #9)
> Hmm. I am not sure that's accurate. For instance, if we have a numeric value
> of "1" (without quotation marks) with General format, copying it to notepad
> results in the same 1. However, when the format is "0.000", the same numeric
> value "1" (without quotation marks) is copied as "1.000" (without quotation
> marks).

Just checked that out and you are correct. Weird as I assumed Calc would behave the same, but had not tested/encountered this since switching over. That could be problematic in its own way as now I wonder how one would go about copying out of Calc without getting values truncated or mangled. I often use a lot of numbers with lots of decimal points but in spreadsheets/Calc are displayed with 3 or 6 or 9 decimal places showing.

(In reply to ady from comment #9)
> In the meantime, I would suggest using mono-spaced (fix-width) fonts

Mono is pretty much no-go for me. :(
Comment 11 studog 2024-02-03 05:16:28 UTC
Came here to file a new bug. Instead I'll report my reproducible test case:
1. In A1 enter "18.672903"
2. In A2 enter "47"
3. In A3 enter "1.6052"
4. Apply the custom number format ".???" to all three cells
5. Observe that the "47." is badly misaligned with the other two cells

Expected behaviour: the decimal points in all three cells align in a vertical column. This is what Excel does.

I tried quite a number of fonts in my Excel instance, only a very small number of fonts ended up with misalignment: "Berlin Sans" and "Georgia" were two. The rest align the decimals as expected.

https://help.libreoffice.org/latest/en-US/text/shared/01/05020301.html specifically says "The ? works as the # but adds a space character to keep decimal alignment if there is a hidden non-significant zero."

I'm not sure how Excel manages to keep decimal alignment in most cases/fonts, but, LibreOffice needs to do the same.

Requiring a user to change from their chosen font to a mono-spaced font (which does work around the bug, I admit that) is not an acceptable solution long term.
Comment 12 studog 2024-02-03 05:37:18 UTC
This comment https://bugs.documentfoundation.org/show_bug.cgi?id=118337#c2 indicates that decimal alignment used to work, and then broke. Perhaps that should be bisected to determine the breaking change?
Comment 13 Laurent Balland 2024-02-03 07:26:36 UTC
A patch is under review 
https://gerrit.libreoffice.org/c/core/+/161737
Comment 14 Commit Notification 2024-02-17 07:59:53 UTC
Laurent Balland committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/5378ce372ae29fcb06c5134c18f935c1c7ea3c51

tdf#158890 Replace '?' with figure blank

It will be available in 24.8.0.

The patch should be included in the daily builds available at
https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
https://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.