Bug 74357 - FILEOPEN: [DOCX filter] Content piece of the table’s large cell is lost
Status: RESOLVED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Writer
Version: 4.1.3.2 release (earliest affected)
Hardware: Other All
Importance: medium critical
Assignee: Miklos Vajna
URL:
Whiteboard: target:4.3.0 target:4.2.2
Keywords: regression
Depends on:
Blocks:
 
Reported: 2014-02-02 09:52 UTC by ape
Modified: 2018-07-02 19:34 UTC

See Also:
Crash report or crash signature:


Attachments
MHT to DOCX by WinWord-2007 (2.72 MB, application/vnd.openxmlformats-officedocument.wordprocessingml.document)
2014-02-02 09:52 UTC, ape
PDF (WinWord-2007 + FoxitReader-6) (2.57 MB, application/pdf)
2014-02-02 20:19 UTC, ape
composite image of page layouts of the regression test (13.66 KB, image/png)
2018-07-02 19:34 UTC, László Németh

Description ape 2014-02-02 09:52:06 UTC
Created attachment 93208
MHT to DOCX by WinWord-2007

I saved a web page as an MHT archive and converted the MHT file to DOCX using WinWord-2007 (see the attachment). Opening the DOCX file in LibreOffice 4.2.0 and 4.1.5 shows the bug:
  A large part of a table cell’s content is lost. The table cell, located on the tenth page of the document, contains an image as large as the page. All content of the cell that follows this image is lost.
  I checked the following programs: OOo-3.1.1; LibO-3.5.7, 3.6.7, 4.0.6; AOO-4.0.1. All of them open the DOCX file correctly, without this error.
  This is a loss of information and a regression relative to older versions, so I marked it as critical.
Comment 1 Roman Kuznetsov 2014-02-02 11:47:50 UTC
confirm bug
Comment 2 Joel Madero 2014-02-02 17:02:12 UTC
Please stop adding your own bugs to MAB - there are policies in place about what MAB is and you have been asked already once to not do this. 

You are missing a lot of information on the bug report, you don't tell us if it's a regression, no QA person is involved, etc ..., etc... I am removing the three bugs you put on the MAB. 

Also it is highly suspicious that one email address has confirmed all of your bugs - I'm not accusing, but if every one of your bugs is confirmed by the same email address it begins to look like you set up a second account just to say "confirmed" or that you are guiding a friend through confirming, and that doesn't verify that this bug has all the needed information. Here you have MOST but you are still missing some key things:

What OS are you using?

also

In this case what would be useful is a PDF created from MS Word so we can actually see what it's supposed to look like. 

Lastly - even if there is loss of data - that doesn't mean it's a MAB - we need to verify that on a very simple test case MANY USERS would see the problem - a single test case is not sufficient. Additionally, there are still other things that QA would do before putting it on MAB list.

If you want to learn about QA policies I encourage you to join the chat and actually become a contributor instead of pushing your own bugs to a list whose use you are not familiar with. Thanks for your understanding
Comment 3 Joel Madero 2014-02-02 17:02:49 UTC
and apologies I made a mistake - you did tell us it's a regression
Comment 4 Joel Madero 2014-02-02 17:45:31 UTC
Is there a version that opens it PERFECTLY? I just opened it with a couple of versions and each one showed a different # of pages, none of which matched Microsoft Office.
Comment 5 ape 2014-02-02 20:19:17 UTC
Created attachment 93251
PDF (WinWord-2007 + FoxitReader-6)

(In reply to comment #2)

> Also it is highly suspicious that one email address has confirmed all of
> your bugs - I'm not accusing but if every one of your bugs is confirmed by
> the same email address it begins to look like you set up a second account
> just to say "confirmed" or that you are guiding a friend through confirming
> and that doesn't verify that this bug has all the needed information.
> Here you have MOST but you are still missing some key things:
> 
> What OS are you using?
> 
> also
> 
> In this case what would be useful is a PDF created from MS Word so we can
> actually see what it's supposed to look like. 
--
1. I set the ‘new’ status only after the error reproduces on all four of my computers.
Each computer has two operating systems (not VMs): Windows XP (32 or 64 bit) and Lubuntu 13.10 (32 or 64 bit).
2. I file my reports after discussing them on the forum (http://www.forumooo.ru). That is why you received confirmation so quickly from one of the members of this forum.
3. I do not need a second mail address; I already have one. You can verify this by asking the administrator for the old mail address I registered with.
4. Older programs show a different number of pages? Do not pay attention to that, because they have other bugs that were fixed at different times.
The text import error begins after the tenth page. The tenth page contains a large image inserted in the cell. It is possible that the subsequent text is separated by tabs.
-
P.S. Sorry for my bad English. No time to correct the text by three on-line translators and dictionaries.
Comment 6 Joel Madero 2014-02-02 20:38:53 UTC
Thanks for the details :) In the future, just let QA confirm - if you think it's a really nasty bug, email the QA list and someone will check it out. Also, do not ever put your own bugs on the MAB - even QA doesn't do this as it just looks bad; we need an independent, unbiased person to determine whether the bug belongs there. In this case, if it's just a single complex document displaying the problem, then it probably does not, but a bibisect would be quite helpful here, so I am requesting one.
Comment 7 Michael Stahl (allotropia) 2014-02-12 23:18:02 UTC
30 pages of content missing, ouch...

was working in 4.1.2.3, broken in 4.1.3.2

regression from:

commit ea6000028d0aa2e9d1c24d4d84defc72d5cf81da
Author:     Miklos Vajna <vmiklos@suse.cz>
AuthorDate: Wed Aug 28 11:24:07 2013 +0200
Commit:     Caolán McNamara <caolanm@redhat.com>
CommitDate: Fri Sep 6 11:28:24 2013 +0000

    bnc#816593 DOCX import: fix auto para spacing without compat option
    
    Paragraph auto spacing (before and after) without the
    w:doNotUseHTMLParagraphAutoSpacing compat option was incorrect.
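
For readers unfamiliar with the bisecting referred to in this thread: it is essentially a binary search over an ordered list of builds or commits for the first one that shows the bug. A minimal sketch, assuming a single good-to-bad transition; `firstBadBuild` and the `isBad` callback are hypothetical names for illustration, not part of git or the bibisect tooling:

```cpp
#include <cassert>
#include <cstddef>
#include <functional>

// Returns the index of the first "bad" build in [0, count).
// Assumptions (hypothetical setup): count >= 2, build 0 is known good,
// build count-1 is known bad, and there is exactly one good-to-bad transition.
std::size_t firstBadBuild(std::size_t count,
                          const std::function<bool(std::size_t)>& isBad)
{
    std::size_t good = 0;         // newest build known to be good
    std::size_t bad = count - 1;  // oldest build known to be bad
    while (bad - good > 1)
    {
        const std::size_t mid = good + (bad - good) / 2;
        if (isBad(mid))
            bad = mid;   // the regression is at mid or earlier
        else
            good = mid;  // the regression is after mid
    }
    return bad;
}
```

With 10 builds of which builds 7 to 9 lose the cell content, `firstBadBuild(10, isBad)` returns 7 after three checks. If the bug does not appear at exactly one point in history, the search can land on a commit that is not the real cause.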
Comment 8 Miklos Vajna 2014-02-15 19:01:03 UTC
Hmm, so the above fix sounds quite unrelated: I also tried to bisect this on master, just to be sure I end up at the same commit -- but it doesn't look like that:

$ git bisect good
503b248127a92b9ad190e05f6a1d50574183cd47 is the first bad commit
commit 503b248127a92b9ad190e05f6a1d50574183cd47
Author: Tor Lillqvist <tml@iki.fi>
Date:   Wed May 22 19:50:22 2013 +0300

    Update bundled boost to 1.53.0
    
    Modify our patches as necessary to match the updated boost sources. Drop
    patches for which corresponding (or even identical) changes already are
    present. Add a new boostsystem static library and use it in two places.
    
    Change-Id: Ib59558feb56dab87a69c91b38caca8e7a9e9a22e

:100644 100644 973ebc518f66adbead4340bb103228753366a8f1 0be205470fea50bab321b4d928d87b67e7e14ffd M      RepositoryExternal.mk
:040000 040000 afc5a1e28ad42c82add7bf8e2519f53a5799bba3 bff62dccaa36479a5479124c3c563cca1a5075d8 M      boost
:100644 100644 38466abe0b786109a74eca55abe4af53a83ab43d 50ecc7f1354db563380bb6f3b0331cd874fcc63b M      download.lst
:040000 040000 b9855320305f7b5a9237595699714013ef3bba6f db336173fdf6f8407ecc4c93971066d12a94bcca M      liborcus
:040000 040000 ac42d27b96eb50784fcf145ec4fb6b8829eea68f 868c5c67851f1188f3a3b9375c5571f9f8a38b28 M      sc

I suppose writerfilter simply relies on some ordering boost doesn't guarantee, but it happened to work (by luck) before the 1.53 upgrade, and since then it's broken.
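
The kind of problem supposed here can be illustrated with a small sketch. It assumes, purely for illustration, that some importer state is kept in a hash container and consumed in iteration order; the thread does not say which container or code path is involved, and `PendingTables` and both functions are hypothetical names. It uses `std::unordered_map` rather than a boost container so that it is self-contained:

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-in for importer state: anchor position -> table name.
using PendingTables = std::unordered_map<int, std::string>;

// Fragile: returns the names in whatever order the hash container iterates.
// That order is unspecified and can change when the library is upgraded.
std::vector<std::string> tablesInIterationOrder(const PendingTables& pending)
{
    std::vector<std::string> names;
    for (const auto& entry : pending)
        names.push_back(entry.second);
    return names;
}

// Robust: copy the entries out and sort them by key, so the result no
// longer depends on the container's internal bucket layout.
std::vector<std::string> tablesInAnchorOrder(const PendingTables& pending)
{
    std::vector<std::pair<int, std::string>> entries(pending.begin(), pending.end());
    std::sort(entries.begin(), entries.end());
    std::vector<std::string> names;
    for (const auto& entry : entries)
        names.push_back(entry.second);
    return names;
}
```

Code that calls the first function and expects a particular order may keep working for years and then break on a library update, which would match the "worked by luck" pattern described above. The second function gives the same result with any conforming implementation.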
Comment 9 Miklos Vajna 2014-02-16 10:23:46 UTC
I'll take care of this.
Comment 10 Commit Notification 2014-02-17 08:57:21 UTC
Miklos Vajna committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=e5fd7c2dacf3c128cdc62622e736ce8abbc578a5

fdo#74357 DOCX import: fix nested tables anchored inside tables



The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
http://wiki.documentfoundation.org/Testing_Daily_Builds
Affected users are encouraged to test the fix and report feedback.
Comment 11 Commit Notification 2014-02-17 09:25:29 UTC
Miklos Vajna committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=8b6ff51bb89db0d7050bb4d00c0ec797b4754f25

fdo#74357 DOCX import: avoid layout problem with automatic spacing



Comment 12 Miklos Vajna 2014-02-17 09:49:08 UTC
Fixed on master, -4-2 reviews:
https://gerrit.libreoffice.org/8080
https://gerrit.libreoffice.org/8081
Comment 13 Commit Notification 2014-02-18 12:54:31 UTC
Miklos Vajna committed a patch related to this issue.
It has been pushed to "libreoffice-4-2":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=28fa9a59940471bf072175a7ee98e23546485e50&h=libreoffice-4-2

fdo#74357 DOCX import: fix nested tables anchored inside tables


It will be available in LibreOffice 4.2.2.

Comment 14 Commit Notification 2014-02-18 12:56:57 UTC
Miklos Vajna committed a patch related to this issue.
It has been pushed to "libreoffice-4-2":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=15c05b945b238ae05cda92564b9465f04db5c919&h=libreoffice-4-2

fdo#74357 DOCX import: avoid layout problem with automatic spacing


It will be available in LibreOffice 4.2.2.

Comment 15 Commit Notification 2014-02-24 09:22:24 UTC
Miklos Vajna committed a patch related to this issue.
It has been pushed to "libreoffice-4-1":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=364682a19713169b374d949d0fc34d105a2be5d4&h=libreoffice-4-1

fdo#74357 DOCX import: fix nested tables anchored inside tables


It will be available in LibreOffice 4.1.6.

Comment 16 ape 2014-03-01 05:33:13 UTC
(In reply to comment #15)

I do not see changes in LibreOfficeDev-4.1.6.0.0+
(ID: 97197e881d609404288eaa903fa7db2c3b0c70b, Win-x86_9-Voreppe, 2014-02-27_09.25.03)
Comment 17 ape 2014-03-04 15:28:08 UTC
I do not see changes in LibreOfficeDev-4.1.6.0.0 (ID: ff2704d22bab86d9e58df812bc01482cfb4bb26, Win-x86_9-Voreppe, Time: 2014-03-03_10.33.10), so I am reopening this bug for LibreOffice-4.1.6.
Comment 18 Michael Stahl (allotropia) 2014-05-19 21:09:09 UTC
hmmm ok it's apparently not fixed in 4.1.6.2; but there will
not be any more 4.1 releases now, so setting this to FIXED for 4.2.
Comment 19 retired 2014-05-19 22:16:55 UTC
ape, please download LO 4.2.4: http://www.libreoffice.org/download/libreoffice-fresh/

fixed there.
Comment 20 László Németh 2018-07-02 19:34:25 UTC
Created attachment 143272
composite image of page layouts of the regression test

regression test needs zero top margin in the first paragraph

red = MSO, black = LO, green = LO after fix in tdf#104354