Bug 148544 - Option to remove all whitespace in "Paste special" and other dialogs
Summary: Option to remove all whitespace in "Paste special" and other dialogs
Status: NEW
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Calc (show other bugs)
Version:
(earliest affected)
7.2.6.2 release
Hardware: All All
: medium enhancement
Assignee: Not Assigned
URL:
Whiteboard:
Keywords:
Depends on:
Blocks: Paste-Special
  Show dependency treegraph
 
Reported: 2022-04-12 23:13 UTC by Eric Williamson
Modified: 2023-11-29 06:59 UTC (History)
3 users (show)

See Also:
Crash report or crash signature:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Eric Williamson 2022-04-12 23:13:37 UTC
Description:
In Calc:
1. a tick box in Paste Special to remove "all whitespace" as a pasting feature
2. a menu option in Format -> Text. To do the same thing. --Since other formatting options like *case (UPPER/lower/etc) exist, it might also makes sense to have it here as well since whitespace characters affect the positioning and formatting.
3. Import data dialog box... same as #1 but in the import dialog box to allow whitespace removing while importing data.
4. (Optional) have it as part of the "Clear Direct Formatting" menu option.

-Also a comparable feature in the other office suite programs. 

Actual Results:
I cannot find the equivalent functionality, other than using cell formulas to accomplish the same task: =trim(clean(*cell*))

Expected Results:
self explanatory..


Reproducible: Always


User Profile Reset: No



Additional Info:
The proposal would be the equivalent of (in Calc): =trim(clean(*cell*)) & paste special -> text, then copy the new cell, over the old one.

This would help when importing/copy & paste from databases, websites, pdf's, other spreadsheet programs, etc.

For example, when a website has a built in grid/spreadsheet, white space characters often copy over with it. Same when copying data from a database. A lot of whitespace characters and unnecessary padding that meets the needs of the database column types, but not necessary for Calc users.

Often when I do import and I find the current formatting annoying, I will do a "Clear Direct Formatting". Having it built into this functionality would also be beneficial since it serves no purpose (I can't think of a use case) for retaining those whitespaces after a "clear direct formatting" use.

--I assume that other latin letter based character sets have the same whitespace problems. I am not sure if this is the case with non-latin (i.e. 16bit+ wide) character sets.
Comment 1 Buovjaga 2023-06-22 11:15:09 UTC
(In reply to Eric Williamson from comment #0)
> Description:
> In Calc:
> 1. a tick box in Paste Special to remove "all whitespace" as a pasting
> feature
> 2. a menu option in Format -> Text. To do the same thing. --Since other
> formatting options like *case (UPPER/lower/etc) exist, it might also makes
> sense to have it here as well since whitespace characters affect the
> positioning and formatting.
> 3. Import data dialog box... same as #1 but in the import dialog box to
> allow whitespace removing while importing data.
> 4. (Optional) have it as part of the "Clear Direct Formatting" menu option.

...

> Additional Info:
> The proposal would be the equivalent of (in Calc): =trim(clean(*cell*)) &
> paste special -> text, then copy the new cell, over the old one.

Clean function description: https://wiki.documentfoundation.org/Documentation/Calc_Functions/CLEAN

CLEAN removes a specific set of non-printable characters that are defined in the C0 Controls and Basic Latin Unicode block. The exact non-printable characters that will be removed by CLEAN are:

    Any character in the code point range U+0000 to U+001F (inclusive).
    The character with code point U+007F.

Space characters (code point U+0020) are not removed by CLEAN.

CLEAN does not currently remove any non-printable characters defined in other Unicode code blocks and thus is not totally compliant with ODF 1.2.
Comment 2 Heiko Tietze 2023-06-26 12:42:29 UTC
Clean() removes non-printable characters from the string but keeps multiple spaces, for example. Calc has TRIM() for this purpose. Does that help, Eric?

For the CSV/text import and paste special it could be a nice-to-have.
Comment 3 ady 2023-06-26 22:38:23 UTC
(In reply to Heiko Tietze from comment #2)
> Clean() removes non-printable characters from the string but keeps multiple
> spaces, for example. Calc has TRIM() for this purpose. Does that help, Eric?

TRIM() is already mentioned in comment 0.

There are several ways to accomplish the deletion of "extra" characters, including "Find and Replace".

Generally speaking, I am in favor of adding options to accomplish an objective that is currently not achievable, or when it currently requires several steps or too much time (or more-advance knowledge), or when it would add the possibility to achieve alternative results according to users' needs.

However, I don't see this request as such case.

> 
> For the CSV/text import and paste special it could be a nice-to-have.

There are conflicting options already present in some of the importing or paste special dialogues. For instance, the current "Trim spaces" check box is aimed at trimming the initial and trailing spaces (only); there other potentially conflicting options (in relation to this request), while there are several alternatives to achieve the goal of this request already.

> -Also a comparable feature in the other office suite programs. 

@Eric, for whoever would be interested in implementing this request, it would be helpful to have screenshots of those comparable features as attachments in this request ticket.
Comment 4 Eyal Rozenberg 2023-06-27 18:07:44 UTC
Why is it a common use case to want to remove all whitespace, as opposed to trimming the heading and trailing white space? Convince me please...
Comment 5 V Stuart Foote 2023-06-28 00:56:25 UTC
(In reply to Eyal Rozenberg from comment #4)
> Why is it a common use case to want to remove all whitespace, as opposed to
> trimming the heading and trailing white space? Convince me please...

For the most part, when white space is included it is not desired--it represents ill-considered formatting and bloats the strings being copy/pasted.  That includes enclosing as well as included text runs of spaces.

Ability to suppress the paste of the unhelpful space runs, as opposed to having to do repeated find/replace sequences, is appealing.

Toggle is sufficient, but I would suggest suppression of white space should be the default.
Comment 6 ady 2023-06-28 04:20:19 UTC
(In reply to V Stuart Foote from comment #5)
> (In reply to Eyal Rozenberg from comment #4)
> > Why is it a common use case to want to remove all whitespace, as opposed to
> > trimming the heading and trailing white space? Convince me please...

FWIW...

If the request is to remove absolutely _all_ whitespace, then there is no need to change anything; we can already do that by several methods.

If the request is to remove _repeated_ whitespace characters (e.g. reduce multiple consecutive spaces to one), we can also do that by several methods.

> 
> For the most part, when white space is included it is not desired--it

Sorry but… is there any source for such claimed statistic? Some use-cases might require that behavior, and others do not.

> represents ill-considered formatting and bloats the strings being
> copy/pasted.  That includes enclosing as well as included text runs of
> spaces.

Users could also mark the undesired character as separator in the import dialogue and also check Merge delimiters. After that, concatenate the resulting columns, if needed. That also eliminates _all_ whitespaces.


> 
> Ability to suppress the paste of the unhelpful space runs, as opposed to
> having to do repeated find/replace sequences, is appealing.

There is no need to do "repeating" find/replace sequences. Select the relevant area, and replace _all_, current selection only, searching for 2 consecutive spaces (or whatever you want to replace) and replacing with only one (or none, or whatever). Perhaps there is some case/need to repeat the action in some particular case, but then using regular expressions in the "Find and Replace" could help in this regard too.

If all the current alternatives are not enough, let's not forget that Calc is not a text editor.

Many relevant requests for additional options in several different areas in Calc have been rejected just to "avoid clutter" (and I disagree with that reasoning). This time we have a request that has enough alternative solutions already.

> 
> Toggle is sufficient, but I would suggest suppression of white space should
> be the default.

Having the requested option set to ON by default would be a terrible idea. Users then would ask why the import is now resulting in a different display content than the content of their original (CSV) file, especially if they had done the same import in the past. If you add an option for some particular (advance) case, then (advance) users should take responsibility for acting on such option. At most, you could make the option to be remembered within the session after the user sets it to ON, but changing the current default behavior with no good reason is bad practice.
Comment 7 Heiko Tietze 2023-07-06 13:01:47 UTC
We discussed the topic in the design meeting.

Some alternative workflows exist and the use case is not clear. On the other hand some believe such a feature would be convenient. In the end we don't come to a conclusion and suggest to keep the ticket open. If more users request this addition it might be reconsidered.
Comment 8 Mike Kaganski 2023-11-29 06:41:58 UTC
"Remove all whitespace" is definitely not a good pre-defined option. If someone finds them often in need of such, it indicates they face some ill-prepared sources. But generally, why not also introduce "remove all characters A to H"? and also "all numbers 3 to 7"?

Import filter has an option to trim leading/trailing spaces - see option 11 in [1]. It may be useful to have in the UI. But not anything more special. For the infinite number of special cases, there are ways to pre-process the data, or post-process it (functions, macros, extensions, external scripts running on exported data...). It is not reasonable to try to build a "universal paste plain text" tool that would be able to do just anything any person on Earth may ever want to do with their data. Same for other UI places (like menu item).

[1] https://help.libreoffice.org/7.6/en-US/text/shared/guide/csv_params.html?DbPAR=SHARED#bm_id181634740978601
Comment 9 Mike Kaganski 2023-11-29 06:59:20 UTC
(In reply to Mike Kaganski from comment #8)
> It may be useful to have in the UI.

... and it is there - I just missed that. So it is all good.