61795 – Weak Characters (like brackets) are mispositioned with mixed RTL and LTR

Summary: Weak Characters (like brackets) are mispositioned with mixed RTL and LTR

Status:	RESOLVED NOTABUG

Alias:	None

Product:	LibreOffice
Classification:	Unclassified
Component:	Writer
Version: (earliest affected)	Inherited From OOo
Hardware:	All All

Importance:	medium normal
Assignee:	Not Assigned

URL:
Whiteboard:
Keywords:

Duplicates (1):	68092
Depends on:
Blocks:	RTL-CTL

Reported:	2013-03-04 15:23 UTC by Shlomi Israely
Modified:	2017-10-17 14:14 UTC
CC List:	10 users

See Also:
Crash report or crash signature:

Attachments
test case (12.69 KB, application/vnd.oasis.opendocument.text) 2013-03-04 15:23 UTC, Shlomi Israely

Description Shlomi Israely 2013-03-04 15:23:12 UTC

Created attachment 75900
test case

When mixing text in an RTL language (Hebrew) with text in an LTR language (English), brackets are often misplaced.

This makes it impossible to mix English with other languages, making Writer unusable for day-to-day use.
Text to reproduce:
Hello (עול)ם
Expected text:
Hello ‏(עול)ם

As you can see, the parenthesis is misplaced in the first example.
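
For illustration only (this is not part of the original report): a minimal Python sketch that lists the Unicode bidi class of each character in the two strings above, using the standard-library `unicodedata` module. It assumes the invisible character in the expected text is U+200F (RIGHT-TO-LEFT MARK); `bidi_classes` is a hypothetical helper name.

```python
import unicodedata

def bidi_classes(text):
    """Return (code point, bidi class) pairs for every character in text."""
    return [(f"U+{ord(ch):04X}", unicodedata.bidirectional(ch)) for ch in text]

# "Hello (עול)ם" written with escapes so the source stays unambiguous.
reported = "Hello (\u05e2\u05d5\u05dc)\u05dd"
# Assumption: the expected text differs only by an RLM before the "(".
expected = "Hello \u200f(\u05e2\u05d5\u05dc)\u05dd"

for label, text in (("reported", reported), ("expected", expected)):
    print(label)
    for code_point, bidi_class in bidi_classes(text):
        print(f"  {code_point}  {bidi_class}")
```

In the output, the parentheses report class `ON` (Other Neutral), so their direction is resolved from the surrounding characters, while the Hebrew letters and the RLM report `R`. That is why an RLM before the opening parenthesis changes the context the algorithm sees.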

Today, LibreOffice's solution is to add LRM and RLM characters in the correct places; this is very unintuitive for 90% of users.

I think the BiDi algorithm should be enhanced to place LRM/RLM automatically according to the current window's keyboard layout. That is, if the paragraph is RTL but the layout is English, then put an LRM character before the weak character. This is what the user usually expects to happen.
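
The rule proposed above could be sketched roughly as follows. This is a hypothetical illustration in Python, not LibreOffice code: the function name `text_to_insert` and the `"ltr"`/`"rtl"` direction strings are assumptions, and the sketch only treats characters of bidi class `ON` (such as brackets) as candidates.

```python
import unicodedata

LRM = "\u200e"  # LEFT-TO-RIGHT MARK
RLM = "\u200f"  # RIGHT-TO-LEFT MARK

def text_to_insert(typed, layout_dir, paragraph_dir):
    """Return what would be inserted for one typed character.

    Hypothetical heuristic: when a neutral character is typed while the
    keyboard layout direction differs from the paragraph direction, prefix
    it with the mark that matches the keyboard layout.
    """
    is_neutral = unicodedata.bidirectional(typed) == "ON"
    if is_neutral and layout_dir != paragraph_dir:
        mark = LRM if layout_dir == "ltr" else RLM
        return mark + typed
    return typed

# RTL paragraph, English (LTR) layout: an LRM goes before the bracket.
print(repr(text_to_insert("(", "ltr", "rtl")))
# Matching directions: the character is inserted unchanged.
print(repr(text_to_insert("(", "rtl", "rtl")))
```

A real implementation would have to decide which character classes to cover and how the inserted marks behave under later edits such as deletion or copy and paste; the sketch ignores all of that.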

I've attached a test case .odt file.

A similar bug with brackets is:
https://bugs.freedesktop.org/show_bug.cgi?id=56408
It might have a similar cause, but it's a different bug.

Comment 1 Amir E. Aharoni 2013-03-07 11:36:10 UTC

The really good way to resolve this is not to change the Unicode bidi algorithm, but to add support for inline direction marking to the OpenDocument standard and then implement it in LibreOffice. Put simply, HTML has <div dir="rtl"> and <span dir="rtl">, and OpenDocument only has something like <div dir="rtl">, but not <span dir="rtl">.

Using directionality marks like RLM, RLE and PDF is not a robust way to resolve this, although if they are used internally and the user doesn't have to use them directly, it's probably OK.
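
To make the mention of RLE and PDF concrete, here is a minimal Python sketch (an illustration, not something LibreOffice or OpenDocument is known to do) of wrapping a run of text in an explicit directional embedding. The helper name `embed` and the `"ltr"`/`"rtl"` strings are assumptions made for the example.

```python
import unicodedata

LRE = "\u202a"  # LEFT-TO-RIGHT EMBEDDING
RLE = "\u202b"  # RIGHT-TO-LEFT EMBEDDING
PDF = "\u202c"  # POP DIRECTIONAL FORMATTING

def embed(run, direction):
    """Wrap run in an embedding of the given direction, closed by PDF."""
    opener = RLE if direction == "rtl" else LRE
    return f"{opener}{run}{PDF}"

# The Hebrew part of the reporter's example, brackets included, as one RTL run.
hebrew_run = "(\u05e2\u05d5\u05dc)\u05dd"
marked = "Hello " + embed(hebrew_run, "rtl")

for ch in marked:
    # Control characters have names but no visible glyph.
    print(f"U+{ord(ch):04X}", unicodedata.name(ch))
```

This is roughly the plain-text counterpart of an inline `<span dir="rtl">`: the direction is attached to a span of characters, but as invisible characters inside the text rather than as markup around it.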

I tried to bring this issue up several times on the OpenDocument mailing list, but didn't get any useful replies. See here:
https://lists.oasis-open.org/archives/office-comment/201110/msg00000.html

There may be some challenges with implementing this, but before discussing the implementation challenges, it must be agreed to make the change in the standard that LibreOffice is implementing.

Comment 2 Maxim Monastirsky 2013-08-14 14:28:27 UTC

*** Bug 68092 has been marked as a duplicate of this bug. ***

Comment 3 QA Administrators 2015-04-19 03:23:11 UTC Comment hidden (obsolete)

** Please read this message in its entirety before responding **

To make sure we're focusing on the bugs that affect our users today, LibreOffice QA is asking bug reporters and confirmers to retest open, confirmed bugs which have not been touched for over a year.

There have been thousands of bug fixes and commits since anyone checked on this bug report. During that time, it's possible that the bug has been fixed, or the details of the problem have changed. We'd really appreciate your help in getting confirmation that the bug is still present.

If you have time, please do the following:

*Test to see if the bug is still present on a currently supported version of LibreOffice (4.4.1 or later)
https://www.libreoffice.org/download/

*If the bug is present, please leave a comment that includes the version of LibreOffice and your operating system, and any changes you see in the bug behavior

*If the bug is NOT present, please set the bug's Status field to RESOLVED-WORKSFORME and leave a short comment that includes your version of LibreOffice and Operating System

Please DO NOT

*Update the version field
*Reply via email (please reply directly on the bug tracker)
*Set the bug's Status field to RESOLVED - FIXED (this status has a particular meaning that is not appropriate in this case)

If you want to do more to help you can test to see if your issue is a REGRESSION. To do so:

1. Download and install oldest version of LibreOffice (usually 3.3 unless your bug pertains to a feature added after 3.3)

http://downloadarchive.documentfoundation.org/libreoffice/old/

2. Test your bug
3. Leave a comment with your results.
4a. If the bug was present with 3.3 - set version to "inherited from OOo";
4b. If the bug was not present in 3.3 - add "regression" to keyword

Feel free to come ask questions or to say hello in our QA chat: http://webchat.freenode.net/?channels=libreoffice-qa

Thank you for your help!

-- The LibreOffice QA Team This NEW Message was generated on: 2015-04-18

Comment 4 Hanan Sela 2015-04-21 18:55:01 UTC

The bug is still a problem in LO 4.4.2.2 on Ubuntu 14.04. It no longer affects brackets in the middle of a line, but at the end of a line the brackets do not stay with the LTR word when the rest of the line is RTL. If you insert a "no width no break" formatting mark, the brackets stick to the first letter, but then the rest of the LTR word breaks at the end of the line.

Comment 5 QA Administrators 2016-09-20 10:29:34 UTC Comment hidden (obsolete)

** Please read this message in its entirety before responding **

To make sure we're focusing on the bugs that affect our users today, LibreOffice QA is asking bug reporters and confirmers to retest open, confirmed bugs which have not been touched for over a year.

If you have time, please do the following:

Test to see if the bug is still present on a currently supported version of LibreOffice
(5.1.5 or 5.2.1) https://www.libreoffice.org/download/

If the bug is present, please leave a comment that includes the version of LibreOffice and
your operating system, and any changes you see in the bug behavior

If the bug is NOT present, please set the bug's Status field to RESOLVED-WORKSFORME and leave
a short comment that includes your version of LibreOffice and Operating System

Please DO NOT

Update the version field
Reply via email (please reply directly on the bug tracker)
Set the bug's Status field to RESOLVED - FIXED (this status has a particular meaning that is not
appropriate in this case)

If you want to do more to help you can test to see if your issue is a REGRESSION. To do so:
1. Download and install oldest version of LibreOffice (usually 3.3 unless your bug pertains to a feature added after 3.3)

http://downloadarchive.documentfoundation.org/libreoffice/old/

2. Test your bug
3. Leave a comment with your results.
4a. If the bug was present with 3.3 - set version to "inherited from OOo";
4b. If the bug was not present in 3.3 - add "regression" to keyword

Feel free to come ask questions or to say hello in our QA chat: http://webchat.freenode.net/?channels=libreoffice-qa

Thank you for helping us make LibreOffice even better for everyone!

Warm Regards,
QA Team

MassPing-UntouchedBug-20160920

Comment 6 Ofir 2016-09-20 11:00:18 UTC

Still reproducible with:

Version: 5.2.1.2
Build ID: 1:5.2.1~rc2-0ubuntu1~xenial0
CPU Threads: 1; OS Version: Linux 4.4; UI Render: default; 
Locale: en-US (en_US.UTF-8); Calc: group

Comment 7 Hanan Sela 2016-09-21 03:51:05 UTC

The bug is still reproducible in LO Version: 5.2.1.2
Build ID: 31dd62db80d4e60af04904455ec9c9219178d620
CPU Threads: 4; OS Version: Linux 4.2; UI Render: default; 
Locale: en-US (en_US.UTF-8); Calc: group

OS Ubuntu 15.10

Comment 8 ⁨خالد حسني⁩ 2016-12-26 07:06:43 UTC

This is how the Unicode bidirectional text algorithm works, and the latest versions of LibreOffice support the latest version of the algorithm, which handles bracket pairing. Using formatting characters is a perfectly fine way to solve ambiguities that the algorithm can’t handle automatically.

Comment 9 amirimobile 2016-12-30 12:55:25 UTC

I seriously don't see how this is resolved. There is a serious problem here: Users get the _wrong_ results out of what they type. That means that this is in fact a bug.

Regarding "Using formatting characters is a perfectly fine way..." - if what you mean by this is that users enter these formatting characters manually, then it is definitely _not_ fine. This is a terrible nuisance to the few who even know what these formatting characters are and how to input them, but the vast majority of users are simply dumbfounded by the current behavior, and this is something that does actually work in the popular (sorry) office suite.

All of this has of course already been mentioned by the OP of this bug, and what has already been said (in subsequent comments) is that the Unicode BiDi algorithm does not have to change, but libreoffice _does_.

Comment 10 ⁨خالد حسني⁩ 2016-12-30 13:03:48 UTC

(In reply to amirimobile from comment #9)
> I seriously don't see how this is resolved. There is a serious problem here:
> Users get the _wrong_ results out of what they type. That means that this is
> in fact a bug.

No, it is not; that is how the Unicode Bidirectional Text Algorithm works. It is not ideal, but that is why control characters exist: to assist the algorithm when there is ambiguity (which is the case here).

> Regarding "Using formatting characters is a perfectly fine way..." - if what
> you mean by this is that users enter these formatting characters manually,
> then it is definitely _not_ fine. This is a terrible nuisance to the few who
> even know what these formatting characters are and how to input them, but
> the vast majority of users are simply dumbfounded by the current behavior,
> and this is something that does actually work in the popular (sorry) office
> suite.

Please cite examples of this working differently elsewhere.

> All of this has of course already been mentioned by the OP of this bug, and
> what has already been said (in subsequent comments) is that the Unicode BiDi
> algorithm does not have to change, but libreoffice _does_.

So what are the requested changes, other than changing the algorithm?

Comment 11 amirimobile 2016-12-30 19:21:29 UTC

(In reply to Khaled Hosny from comment #10)
> (In reply to amirimobile from comment #9)
> > I seriously don't see how this is resolved. There is a serious problem here:
> > Users get the _wrong_ results out of what they type. That means that this is
> > in fact a bug.
> 
> No, it is not; that is how the Unicode Bidirectional Text Algorithm works.
> It is not ideal, but that is why control characters exist: to assist the
> algorithm when there is ambiguity (which is the case here).

That's just it: The algorithm is not at all the point here. LibreOffice Writer _is_. The algorithm specifies how to display bidirectional text. Writer is the instrument to be used to write text. What we're saying here is that writing is broken, not how the already-written text is displayed.

> 
> > Regarding "Using formatting characters is a perfectly fine way..." - if what
> > you mean by this is that users enter these formatting characters manually,
> > then it is definitely _not_ fine. This is a terrible nuisance to the few who
> > even know what these formatting characters are and how to input them, but
> > the vast majority of users are simply dumbfounded by the current behavior,
> > and this is something that does actually work in the popular (sorry) office
> > suite.
> 
> Please cite examples of this working differently elsewhere.

Microsoft Word:
- Base direction: LTR
- Write text using strongly typed LTR characters
- switch to RTL language
- write text using strongly typed RTL characters, mixed with various bracket types
-> brackets keep their intended positions

> 
> > All of this has of course already been mentioned by the OP of this bug, and
> > what has already been said (in subsequent comments) is that the Unicode BiDi
> > algorithm does not have to change, but libreoffice _does_.
> 
> So what are the requested changes, other than changing the algorithm?


Two possible solutions that I would consider have already been suggested by Amir E. Aharoni 2013-03-07 11:36:10 UTC:

--- start quote ---
The really good way to resolve this is not to change the Unicode bidi algorithm, but to add support for inline direction marking to the OpenDocument standard and then implement it in LibreOffice. Put simply, HTML has <div dir="rtl"> and <span dir="rtl">, and OpenDocument only has something like <div dir="rtl">, but not <span dir="rtl">.

Using directionality marks like RLM, RLE and PDF is not a robust way to resolve this, although if they are used internally and the user doesn't have to use them directly, it's probably OK.
--- end quote ---

Comment 12 ⁨خالد حسني⁩ 2016-12-30 19:36:32 UTC

(In reply to amirimobile from comment #11) 
> Microsoft Word:
> - Base direction: LTR
> - Write text using strongly typed LTR characters
> - switch to RTL language
> - write text using strongly typed RTL characters, mixed with various bracket
> types
> -> brackets keep their intended positions

Please attach a sample document.

Comment 13 ⁨خالد حسني⁩ 2016-12-30 21:09:27 UTC

Screenshot of the MS Office rendering would be appreciated as well.

Comment 14 QA Administrators 2017-07-27 12:06:31 UTC Comment hidden (obsolete)

Dear Bug Submitter,

This bug has been in NEEDINFO status with no change for at least
6 months. Please provide the requested information as soon as
possible and mark the bug as UNCONFIRMED. Due to regular bug
tracker maintenance, if the bug is still in NEEDINFO status with
no change in 30 days the QA team will close the bug as INSUFFICIENTDATA
due to lack of needed information.

For more information about our NEEDINFO policy please read the
wiki located here:
https://wiki.documentfoundation.org/QA/Bugzilla/Fields/Status/NEEDINFO

If you have already provided the requested information, please
mark the bug as UNCONFIRMED so that the QA team knows that the
bug is ready to be confirmed.
 
Thank you for helping us make LibreOffice even better for everyone!

Warm Regards,
QA Team

MassPing-NeedInfo-Ping-20170727

Comment 15 QA Administrators 2017-08-30 19:31:50 UTC Comment hidden (obsolete)

Dear Bug Submitter,

Please read this message in its entirety before proceeding.

Your bug report is being closed as INSUFFICIENTDATA due to inactivity and
a lack of information which is needed in order to accurately
reproduce and confirm the problem. We encourage you to retest
your bug against the latest release. If the issue is still
present in the latest stable release, we need the following
information (please ignore any that you've already provided):

a) Provide details of your system including your operating
   system and the latest version of LibreOffice that you have
   confirmed the bug to be present

b) Provide easy to reproduce steps – the simpler the better

c) Provide any test case(s) which will help us confirm the problem

d) Provide screenshots of the problem if you think it might help

e) Read all comments and provide any requested information

Once all of this is done, please set the bug back to UNCONFIRMED
and we will attempt to reproduce the issue. Please do not:

a) respond via email 

b) update the version field in the bug or any of the other details
   on the top section of our bug tracker

Warm Regards,
QA Team

MassPing-NeedInfo-20170830

Comment 16 Lior Kaplan 2017-10-11 08:21:15 UTC

Still happens in 5.4.1.

Comment 17 ⁨خالد حسني⁩ 2017-10-11 10:45:58 UTC

This is not really a bug but the expected Unicode bidirectional text rendering. The claim that MS Office handles this differently has not been supported with evidence (it wouldn’t even matter; MS Office is known not to be in full compliance with the Unicode Bidirectional Text Algorithm).

Comment 18 Mike Kaganski 2017-10-16 06:40:14 UTC

(In reply to Khaled Hosny from comment #17)

Khaled:

I agree with the OP that there *is* a problem here. Again: it is not a rendering problem! It has nothing to do with the Unicode bidirectional text algorithm. The user can manually create good-looking text by performing some special actions; i.e., citing comment #0,

> Today, LibreOffice's solution is to add LRM and RLM characters in the correct places

Of course, text *rendering* should not place formatting characters anywhere they aren't present in the source. But the problem is not rendering, as already said multiple times; the problem is *input*, which should analyse the situation (current IME mode?) and insert those characters at the input stage, when the string is created from the keyboard. So I suggest that you revert your decision to dismiss this, as UX-wise this is actually a horrible bug.

Comment 19 ⁨خالد حسني⁩ 2017-10-17 14:14:58 UTC

(In reply to Mike Kaganski from comment #18)
> (In reply to Khaled Hosny from comment #17)
> 
> Khaled:
> 
> I agree with the OP that there *is* a problem here. Again: it is not a
> rendering problem! It has nothing to do with the Unicode bidirectional text
> algorithm. The user can manually create good-looking text by performing
> some special actions; i.e., citing comment #0,
> 
> > Today, LibreOffice's solution is to add LRM and RLM characters in the correct places
> 
> Of course, text *rendering* should not place formatting characters
> anywhere they aren't present in the source. But the problem is not rendering,
> as already said multiple times; the problem is *input*, which should analyse
> the situation (current IME mode?) and insert those characters at the input
> stage, when the string is created from the keyboard. So I suggest that you
> revert your decision to dismiss this, as UX-wise this is actually a horrible bug.

There is no concrete proposal for what should be done here. I don’t personally think we should start inserting characters the user didn’t type, not invisible ones at least. I don’t know of any program that does that (regarding weak bidi characters) apart from what was said here about MS Word (and I don’t think this is publicly specified anywhere, so we would effectively be reverse-engineering it). Also, I feel that whatever we do here is likely to have undesired side effects; if there were a robust way to handle this, it would have made it into the UBA by now.

But anyway, that is my 2 qirsh, feel free to re-open the issue if you think otherwise.