[Bug 60217] New: Word document with a single table gets corrupted after load/save with no changes

Bugzilla from bugzilla@apache.org
https://bz.apache.org/bugzilla/show_bug.cgi?id=60217

            Bug ID: 60217
           Summary: Word document with a single table gets corrupted after
                    load/save with no changes
           Product: POI
           Version: 3.15-FINAL
          Hardware: PC
                OS: All
            Status: NEW
          Severity: major
          Priority: P2
         Component: HWPF
          Assignee: [hidden email]
          Reporter: [hidden email]

Created attachment 34333
  --> https://bz.apache.org/bugzilla/attachment.cgi?id=34333&action=edit
Maven project with document corruption example

Attaching a sample with a Word document that gets corrupted when we open it and
save it to another file with code like:

final POIDocument doc = new HWPFDocument(new FileInputStream(DOCUMENT_NAME));
final File copy = new File(CORRUPTED_PREFIX + "-" + DOCUMENT_NAME);
doc.write(copy);

The source document opens without problems.
After the load/save round trip, Microsoft Word reports that the document is
corrupted and cannot be recovered.

--
You are receiving this mail because:
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

[Bug 60217] Word document with a single table gets corrupted after load/save with no changes

Bugzilla from bugzilla@apache.org
https://bz.apache.org/bugzilla/show_bug.cgi?id=60217

--- Comment #1 from Mark Murphy <[hidden email]> ---
Can POI read the document after load/save?

[Bug 60217] Word document with a single table gets corrupted after load/save with no changes

Bugzilla from bugzilla@apache.org
In reply to this post by Bugzilla from bugzilla@apache.org
https://bz.apache.org/bugzilla/show_bug.cgi?id=60217

--- Comment #2 from Javen O'Neal <[hidden email]> ---
Created attachment 34340
  --> https://bz.apache.org/bugzilla/attachment.cgi?id=34340&action=edit
output .doc file after running unit test

Using the DocumentWithOneTable.doc from your attachment, the unit test below
creates the attached file. LibreOffice does not complain about this file. Can
you check if Word reports that the attached file is corrupted?

Added to TestHPSFBugs.java:
public void test60217() throws Exception {
    InputStream fis = new FileInputStream("/tmp/bug60217.doc");
    POIDocument doc = new HWPFDocument(fis);
    fis.close();
    doc.write(new File("/tmp/bug60217-out.doc"));
    doc.close();
}

[Bug 60217] Word document with a single table gets corrupted after load/save with no changes

Bugzilla from bugzilla@apache.org
In reply to this post by Bugzilla from bugzilla@apache.org
https://bz.apache.org/bugzilla/show_bug.cgi?id=60217

Javen O'Neal <[hidden email]> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |NEEDINFO

[Bug 60217] Word document with a single table gets corrupted after load/save with no changes

Bugzilla from bugzilla@apache.org
In reply to this post by Bugzilla from bugzilla@apache.org
https://bz.apache.org/bugzilla/show_bug.cgi?id=60217

Kostiantyn Miklevskyi <[hidden email]> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEEDINFO                    |NEW

--- Comment #3 from Kostiantyn Miklevskyi <[hidden email]> ---
>Mark Murphy 2016-10-07 19:23:58 UTC
>Can POI read the document after load/save?

No, it throws an exception.
I should have included this in the initial report, as I had already tried it.

Here is the code:

        final POIDocument doc = new HWPFDocument(
                SaveToAnotherDocumentBug.class.getClassLoader().getResourceAsStream(DOCUMENT_NAME));
        final File copy = new File(CORRUPTED_PREFIX + "-" + DOCUMENT_NAME);
        doc.write(copy);
        doc.close();

        new HWPFDocument(new FileInputStream(copy));

It throws with this stack trace:

Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: -1845343745
    at org.apache.poi.util.LittleEndian.getUByte(LittleEndian.java:274)
    at org.apache.poi.hwpf.model.FormattedDiskPage.<init>(FormattedDiskPage.java:61)
    at org.apache.poi.hwpf.model.PAPFormattedDiskPage.<init>(PAPFormattedDiskPage.java:85)
    at org.apache.poi.hwpf.model.PAPBinTable.<init>(PAPBinTable.java:75)
    at org.apache.poi.hwpf.HWPFDocument.<init>(HWPFDocument.java:226)
    at org.apache.poi.hwpf.HWPFDocument.<init>(HWPFDocument.java:157)
    at org.apache.poi.hwpf.HWPFDocument.<init>(HWPFDocument.java:145)
    at com.cosi.SaveToAnotherDocumentBug.main(SaveToAnotherDocumentBug.java:20)
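
A side note on the index in that exception, offered as a rough sketch rather
than a diagnosis. In the Word binary format a formatted disk page (FKP) is 512
bytes long and its last byte holds the run count. If FormattedDiskPage reads
that byte at "page offset + 511" (an assumption about POI internals, not
something confirmed in this thread), then the failing index should sit at
position 511 within a 512-byte page. The class below (Bug60217IndexCheck, a
name made up for this example) checks that with plain arithmetic:

```java
public class Bug60217IndexCheck {
    // FKP pages in the .doc main stream are 512 bytes long.
    static final int PAGE_SIZE = 512;

    /** Position of the index within its 512-byte page, always in 0..511. */
    static int offsetWithinPage(int index) {
        // floorMod keeps the result non-negative even for a negative index.
        return Math.floorMod(index, PAGE_SIZE);
    }

    /** Start offset of the 512-byte page that the index falls into. */
    static int pageStart(int index) {
        return index - offsetWithinPage(index);
    }

    public static void main(String[] args) {
        int index = -1845343745; // value taken from the stack trace above
        System.out.println("offset within page: " + offsetWithinPage(index));
        System.out.println("page start:         " + pageStart(index));
    }
}
```

The offset within the page comes out as 511, so the index is a multiple of 512
plus 511. That would be consistent with a garbage page number being read back
from the rewritten file and multiplied by the page size, but this has not been
verified against the POI sources.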

[Bug 60217] Word document with a single table gets corrupted after load/save with no changes

Bugzilla from bugzilla@apache.org
In reply to this post by Bugzilla from bugzilla@apache.org
https://bz.apache.org/bugzilla/show_bug.cgi?id=60217

--- Comment #4 from Kostiantyn Miklevskyi <[hidden email]> ---
>Javen O'Neal 2016-10-08 22:16:16 UTC
>Can you check if Word reports that the attached file is corrupted?

Yes. The same error message that Word reported previously.
Attaching a screenshot.

[Bug 60217] Word document with a single table gets corrupted after load/save with no changes

Bugzilla from bugzilla@apache.org
In reply to this post by Bugzilla from bugzilla@apache.org
https://bz.apache.org/bugzilla/show_bug.cgi?id=60217

--- Comment #5 from Kostiantyn Miklevskyi <[hidden email]> ---
Created attachment 34350
  --> https://bz.apache.org/bugzilla/attachment.cgi?id=34350&action=edit
Screenshot of Word error message when opening a corrupted file

[Bug 60217] Word document with a single table gets corrupted after load/save with no changes

Bugzilla from bugzilla@apache.org
In reply to this post by Bugzilla from bugzilla@apache.org
https://bz.apache.org/bugzilla/show_bug.cgi?id=60217

--- Comment #6 from Kostiantyn Miklevskyi <[hidden email]> ---
Created attachment 34351
  --> https://bz.apache.org/bugzilla/attachment.cgi?id=34351&action=edit
LibreOffice 5.2.2.2 original file and corrupted file side-to-side

I downloaded the latest stable LibreOffice, version 5.2.2.2, and it indeed
doesn't complain about corruption. However, I opened the original document and
the corrupted one side by side to show the difference.

[Bug 60217] Word document with a single table gets corrupted after load/save with no changes

Bugzilla from bugzilla@apache.org
In reply to this post by Bugzilla from bugzilla@apache.org
https://bz.apache.org/bugzilla/show_bug.cgi?id=60217

Dominik Stadler <[hidden email]> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |DUPLICATE
             Status|NEW                         |RESOLVED

--- Comment #7 from Dominik Stadler <[hidden email]> ---
This looks quite similar to bug #60097, so I am closing this one as a duplicate
to have one place to continue the discussion.

*** This bug has been marked as a duplicate of bug 60097 ***
