[Bug 61300] New: Very slow processing on corrupted file


Bugzilla from bugzilla@apache.org
https://bz.apache.org/bugzilla/show_bug.cgi?id=61300

            Bug ID: 61300
           Summary: Very slow processing on corrupted file
           Product: POI
           Version: 3.17-dev
          Hardware: PC
            Status: NEW
          Severity: minor
          Priority: P2
         Component: POIFS
          Assignee: [hidden email]
          Reporter: [hidden email]
  Target Milestone: ---

Created attachment 35141
  --> https://bz.apache.org/bugzilla/attachment.cgi?id=35141&action=edit
triggering file

I need to figure out whether this is a POIFS bug or a bug in Tika's
parseSummaries.  It is triggered by a corrupted file.

At this location:
          at org.apache.poi.util.IOUtils.copy(IOUtils.java:296)
          at org.apache.poi.util.IOUtils.peekFirstNBytes(IOUtils.java:64)
          at org.apache.poi.hpsf.PropertySet.isPropertySetStream(PropertySet.java:393)
          at org.apache.poi.hpsf.PropertySet.<init>(PropertySet.java:191)
          at org.apache.tika.parser.microsoft.SummaryExtractor.parseSummaryEntryIfExists(SummaryExtractor.java:83)
          at org.apache.tika.parser.microsoft.SummaryExtractor.parseSummaries(SummaryExtractor.java:73)

        while((count = inp.read(buff)) != -1) {
            if(count > 0) {
                out.write(buff, 0, count);
            }
        }

On the first iteration, pos in inp is 0, but pos then goes negative on every
subsequent iteration, and the loop runs for a very long time.
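
For illustration only, here is a minimal sketch of how a copy loop of this shape could be guarded against a stream that never reports end-of-file. It is not POI's code and not a proposed fix for this bug: the class name BoundedCopy, the maxBytes parameter, and the MAX_EMPTY_READS constant are all hypothetical. It assumes the caller knows a sensible upper bound on how much data the stream may yield.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical helper, not part of POI: a copy loop with an upper bound.
final class BoundedCopy {
    // Assumed cap on consecutive zero-length reads before giving up.
    private static final int MAX_EMPTY_READS = 1000;

    private BoundedCopy() {}

    /**
     * Copies inp to out and returns the number of bytes copied.
     * Throws IOException once more than maxBytes have been read, or
     * when the stream keeps returning 0 bytes without reaching EOF.
     */
    static long copy(InputStream inp, OutputStream out, long maxBytes) throws IOException {
        byte[] buff = new byte[4096];
        long total = 0;
        int emptyReads = 0;
        int count;
        while ((count = inp.read(buff)) != -1) {
            if (count > 0) {
                emptyReads = 0;
                total += count;
                if (total > maxBytes) {
                    throw new IOException("Stream exceeded limit of " + maxBytes + " bytes");
                }
                out.write(buff, 0, count);
            } else if (++emptyReads > MAX_EMPTY_READS) {
                throw new IOException("Stream stalled: too many empty reads");
            }
        }
        return total;
    }
}
```

With a guard like this, a corrupted stream fails with an IOException instead of looping for a very long time. Whether the real defect lies in the stream's position tracking or in the caller is still the open question in this report.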

The source file that I corrupted is: testEXCEL_embeddedPDF_windows.xls

--
You are receiving this mail because:
You are the assignee for the bug.
---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]


[Bug 61300] Very slow processing on corrupted file

Bugzilla from bugzilla@apache.org
https://bz.apache.org/bugzilla/show_bug.cgi?id=61300

Dominik Stadler <[hidden email]> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 OS|                            |All



[Bug 61300] Very slow processing on corrupted file

Bugzilla from bugzilla@apache.org
https://bz.apache.org/bugzilla/show_bug.cgi?id=61300

Dominik Stadler <[hidden email]> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |NEEDINFO

--- Comment #1 from Dominik Stadler <[hidden email]> ---
How can we reproduce this with POI alone? How is the document opened in Tika?

