[Bug 58002] New: EscherAggregate.createAggregate doesn't always read NoteRecord

classic Classic list List threaded Threaded
11 messages Options
Reply | Threaded
Open this post in threaded view
|

[Bug 58002] New: EscherAggregate.createAggregate doesn't always read NoteRecord

Bugzilla from bugzilla@apache.org
https://bz.apache.org/bugzilla/show_bug.cgi?id=58002

            Bug ID: 58002
           Summary: EscherAggregate.createAggregate doesn't always read
                    NoteRecord
           Product: POI
           Version: 3.12-FINAL
          Hardware: PC
            Status: NEW
          Severity: normal
          Priority: P2
         Component: HSSF
          Assignee: [hidden email]
          Reporter: [hidden email]

Created attachment 32793
  --> https://bz.apache.org/bugzilla/attachment.cgi?id=32793&action=edit
Sample code to read comments.

I have this XLS file that when I attempt to retrieve all HSSFComment via
worksheet's drawing patriarch's children, I get a NullPointerException when I
attempt to get the comment's column (and row). The _note component didn't get
initialized. I have tracked this issue down to:


org.apache.poi.hssf.record.EscherAggregate.createAggregate, near line 442:

        // any NoteRecords that follow the drawing block must be aggregated and
and saved in the tailRec collection
        while (loc < records.size()) {
            if (sid(records, loc) == NoteRecord.sid) {
                NoteRecord r = (NoteRecord) records.get(loc);
                agg.tailRec.put(r.getShapeId(), r);
            } else {
                break;
            }
            loc++;
        }

If the current loc points to a record where the sid is NOT NoteRecord, the loop
exits without checking additional records. In my test file, the next record is
a DrawingSelectionRecord [MSODRAWINGSELECTION] with sid 237 (0xED). The fix
that works for my document is to remove the else break, as in:

        // any NoteRecords that follow the drawing block must be aggregated and
and saved in the tailRec collection
        while (loc < records.size()) {
            if (sid(records, loc) == NoteRecord.sid) {
                NoteRecord r = (NoteRecord) records.get(loc);
                agg.tailRec.put(r.getShapeId(), r);
            }
            loc++;
        }

This code change seems without side effects because loc is not referenced
anymore in the method.

Attached is my document that causes this error. I tried changing the pictures
or adding new comments, and it ends up rewriting the file in such a way that
there's no error. So I have it in the original form. (The pictures come from
Windows 7 sample pictures). The document is currently 1.6 MB and might be too
large to upload.

Also attached is a simple program (Java 8) to read the file comments:

Expected output (when I use the debugger to change loc to the next value before
testing if it's a NoteRecord):
Robert Kish:
Yo!: 0,2

Robert Kish:
2nd note!: 1,2

Robert Kish:
Woah!: 22,45

Robert Kish:
Woah top!: 21,0


Actual output:
Robert Kish:
Yo!: Exception in thread "main" java.lang.NullPointerException
    at
org.apache.poi.hssf.usermodel.HSSFComment.getColumn(HSSFComment.java:184)
    at NoteTest.main(NoteTest.java:26)

Thank you.

--
You are receiving this mail because:
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|

[Bug 58002] EscherAggregate.createAggregate doesn't always read NoteRecord

Bugzilla from bugzilla@apache.org
https://bz.apache.org/bugzilla/show_bug.cgi?id=58002

Robert Kish <[hidden email]> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 OS|                            |All

--- Comment #1 from Robert Kish <[hidden email]> ---
As mentioned before, when I try to change the pictures in the xls file, the
file gets rewritten in a better way and I can't reproduce the error. If you
want to see the file, please recommend some file sharing service I should use.
The error message that bugzilla displayed didn't provide recommendations. Or
even email works.

--
You are receiving this mail because:
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|

[Bug 58002] EscherAggregate.createAggregate doesn't always read NoteRecord

Bugzilla from bugzilla@apache.org
In reply to this post by Bugzilla from bugzilla@apache.org
https://bz.apache.org/bugzilla/show_bug.cgi?id=58002

Dominik Stadler <[hidden email]> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |NEEDINFO

--- Comment #2 from Dominik Stadler <[hidden email]> ---
Does it get smaller when you compress it using zip? Oterwise please try sending
the document via mail to the poi-dev or user-list or upload it to dropbox or
some other file sharing functionality.

--
You are receiving this mail because:
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|

[Bug 58002] EscherAggregate.createAggregate doesn't always read NoteRecord

Bugzilla from bugzilla@apache.org
In reply to this post by Bugzilla from bugzilla@apache.org
https://bz.apache.org/bugzilla/show_bug.cgi?id=58002

Robert Kish <[hidden email]> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEEDINFO                    |NEW

--- Comment #3 from Robert Kish <[hidden email]> ---
I tried dropbox and placed a zip file with an xls inside.

https://www.dropbox.com/s/im6khacdotpv28i/Book1.zip?dl=0

--
You are receiving this mail because:
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|

[Bug 58002] EscherAggregate.createAggregate doesn't always read NoteRecord

Bugzilla from bugzilla@apache.org
In reply to this post by Bugzilla from bugzilla@apache.org
https://bz.apache.org/bugzilla/show_bug.cgi?id=58002

--- Comment #4 from Dominik Stadler <[hidden email]> ---
FYI, the proposed change breaks unit tests, at least TestDrawingAggregate, not
sure if these are expected or if they indicate a problem with the changes...

--
You are receiving this mail because:
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|

[Bug 58002] EscherAggregate.createAggregate doesn't always read NoteRecord

Bugzilla from bugzilla@apache.org
In reply to this post by Bugzilla from bugzilla@apache.org
https://bz.apache.org/bugzilla/show_bug.cgi?id=58002

--- Comment #5 from Robert Kish <[hidden email]> ---
You are correct; my patch fails :(. I downloaded trunk from source and tried at
it some more. This will take some time for me to figure out, I'll give it a try
in the next week or so and try to submit a new patch. The problem is still the
same, NoteRecords out of sequence. What I learned is that the code assumes the
records are always in sequence, so it can remove records start to end. But what
we really have is start1 to end1, start2 to end2, etc. The referenced JUnit has
the same assumption and needs to be updated also.

--
You are receiving this mail because:
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|

[Bug 58002] EscherAggregate.createAggregate doesn't always read NoteRecord

Bugzilla from bugzilla@apache.org
In reply to this post by Bugzilla from bugzilla@apache.org
https://bz.apache.org/bugzilla/show_bug.cgi?id=58002

--- Comment #6 from Robert Kish <[hidden email]> ---
Created attachment 32868
  --> https://bz.apache.org/bugzilla/attachment.cgi?id=32868&action=edit
Debug version of TestDrawingAggregate

get method in this test now detects all DrawingRecord blocks in the file - not
just the first. There is debugging with println and stacktrace to aid in trying
to get all the unit tests to pass.

--
You are receiving this mail because:
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|

[Bug 58002] EscherAggregate.createAggregate doesn't always read NoteRecord

Bugzilla from bugzilla@apache.org
In reply to this post by Bugzilla from bugzilla@apache.org
https://bz.apache.org/bugzilla/show_bug.cgi?id=58002

--- Comment #7 from Robert Kish <[hidden email]> ---
Created attachment 32869
  --> https://bz.apache.org/bugzilla/attachment.cgi?id=32869&action=edit
Debug version of EsceherAggregate.java

This debug version has a modified version of createAggregate which processes
multiple DrawingRecords - not just the one mentioned at locFirstDrawingRecord.
There is use of println debugging.

The section "// Create one big buffer" will now find all sections to copy
bytes.

The section "// Associate the object records with the shapes" now has an
additional loop around it so that each DrawingRecord can be processed. The
records added to the aggregate are removed at the end of the loop.

The EscherAggregate is added after the loop.

With the updated JUnit, this still fails some files. Many files pass, including
those with multiple DrawingRecords. But some files fail.

--
You are receiving this mail because:
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|

[Bug 58002] EscherAggregate.createAggregate doesn't always read NoteRecord

Bugzilla from bugzilla@apache.org
In reply to this post by Bugzilla from bugzilla@apache.org
https://bz.apache.org/bugzilla/show_bug.cgi?id=58002

--- Comment #8 from Robert Kish <[hidden email]> ---
I have spent many hours in the past day trying to figure this out, but it is
beyond my current level of experience in the POI innards. I leave my work to
someone else who may continue at what I started, or learn from my mistakes and
come up with a better fix.

--
You are receiving this mail because:
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|

[Bug 58002] EscherAggregate.createAggregate doesn't always read NoteRecord

Bugzilla from bugzilla@apache.org
In reply to this post by Bugzilla from bugzilla@apache.org
https://bz.apache.org/bugzilla/show_bug.cgi?id=58002

--- Comment #9 from Dominik Stadler <[hidden email]> ---
Created attachment 32887
  --> https://bz.apache.org/bugzilla/attachment.cgi?id=32887&action=edit
Small addition to the integration tests to verify the comments which triggers
this bug

Attached a small addition to the integration test which we can apply when this
is fixed as it currently triggers this bug as well.

--
You are receiving this mail because:
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|

[Bug 58002] EscherAggregate.createAggregate doesn't always read NoteRecord

Bugzilla from bugzilla@apache.org
In reply to this post by Bugzilla from bugzilla@apache.org
https://bz.apache.org/bugzilla/show_bug.cgi?id=58002

ithan <[hidden email]> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |[hidden email]

--
You are receiving this mail because:
You are the assignee for the bug.
---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]