[Bug 61478] New: POI OOXML-Schema lookup uses wrong classloader

Bugzilla from bugzilla@apache.org
https://bz.apache.org/bugzilla/show_bug.cgi?id=61478

            Bug ID: 61478
           Summary: POI OOXML-Schema lookup uses wrong classloader
           Product: POI
           Version: unspecified
          Hardware: PC
            Status: NEW
          Severity: major
          Priority: P2
         Component: POI Overall
          Assignee: [hidden email]
          Reporter: [hidden email]
  Target Milestone: ---

When poi-ooxml uses reflection to locate classes in poi-ooxml-schemas, it
apparently does not use the classloader of the current class or thread to do
this.  As a result, you cannot get poi-ooxml to work properly in a non-root
classloader.

Furthermore, even if you move poi-ooxml-schemas to the root level, it has
callbacks into poi-ooxml, so you are forced to move THAT jar to root level as
well.  And once you do that, poi-ooxml calls back to org.apache.poi.util, which
is in poi.jar, so that jar also needs to be loaded at root level.

(A side note: having poi and poi-ooxml and poi-ooxml-schemas be separate makes
little sense if they all depend on each other in this way.)

We discovered this trying to integrate the latest Tika (which uses POI version
3.9) with ManifoldCF.  ManifoldCF runs tika-parsers at the connector level,
which has its own classloader.  We were forced to move all of POI, and its
dependencies, to the root classloader level, which greatly increases the size
of our binary image.

--
You are receiving this mail because:
You are the assignee for the bug.
---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Nick Burch <[hidden email]> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 OS|                            |All
             Status|NEW                         |NEEDINFO

--- Comment #1 from Nick Burch <[hidden email]> ---
Any chance you could work up a small JUnit test (probably with a dummy
classloader) to show the problem, and/or a fix?

--- Comment #2 from Karl Wright <[hidden email]> ---
I verified that this occurs also with POI 3.15.

I should be able to come up with a classloader code snippet that demonstrates
the problem.  It will occur when trying to parse any Microsoft Office OOXML
file, e.g. xlsx or docx.

--- Comment #3 from Karl Wright <[hidden email]> ---
I looked at providing an example, but unfortunately this occurs under the
execution of Tika, which has many dozens of dependent jars. If you want a code
snippet, you'll either need to set up a directory with all Tika dependent jars
in it, or you will need to provide me a snippet of code which parses a
Microsoft Office file.  Alternatively, I can upload a many-megabyte zip file
containing the Tika parser with all dependencies that you can just unpack.
Please let me know what you prefer.

Another way forward is to discuss how you use reflection in POI.  If you use
the following method to locate your classes, all should be well:

https://docs.oracle.com/javase/8/docs/api/java/lang/Class.html#forName-java.lang.String-

But if you use this method, then you will have to be very certain you know what
you are doing to get the right class loader:

https://docs.oracle.com/javase/8/docs/api/java/lang/Class.html#forName-java.lang.String-boolean-java.lang.ClassLoader-

I suspect it is the latter, and perhaps you are using the thread class loader
rather than the current class's class loader?
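
The difference between the two Class.forName() forms can be sketched without
POI at all. This is only an illustration under stated assumptions:
LoaderLookupDemo, Marker and blindLoader() are made-up names, and the "blind"
loader (one that delegates only to the bootstrap loader) stands in for any
loader that cannot see the class being looked up. It does not show what POI
actually does internally.

```java
final class LoaderLookupDemo {
    /** Stand-in for a class that lives at the "connector" level. */
    static final class Marker {}

    /** A loader that delegates only to the bootstrap loader, so it cannot see Marker. */
    static ClassLoader blindLoader() {
        return new ClassLoader(null) {};
    }

    /** One-argument form: resolves through the defining loader of the calling class. */
    static boolean visibleToCaller(String name) {
        try {
            Class.forName(name);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    /** Three-argument form: resolves through whatever loader is handed in. */
    static boolean visibleTo(ClassLoader loader, String name) {
        try {
            Class.forName(name, true, loader);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String name = Marker.class.getName();
        Thread thread = Thread.currentThread();
        ClassLoader saved = thread.getContextClassLoader();
        // Simulate a thread whose context loader cannot see the class.
        thread.setContextClassLoader(blindLoader());
        try {
            System.out.println("caller's loader sees it: " + visibleToCaller(name));
            System.out.println("context loader sees it:  "
                    + visibleTo(thread.getContextClassLoader(), name));
        } finally {
            thread.setContextClassLoader(saved);
        }
    }
}
```

The one-argument lookup keeps working however the thread context loader is
set, which is why it is the safer default when the looked-up class is visible
to the loader of the class doing the lookup.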

Karl Wright <[hidden email]> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEEDINFO                    |NEW

Nick Burch <[hidden email]> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |NEEDINFO

--- Comment #4 from Nick Burch <[hidden email]> ---
Does using XWPFWordExtractor
<http://poi.apache.org/apidocs/org/apache/poi/xwpf/extractor/XWPFWordExtractor.html>
trigger the problem in your environment? That's the easiest way to do roughly
what Tika does, in just a few lines:

XWPFWordExtractor doc = new XWPFWordExtractor(OPCPackage.open("input.docx"));
doc.getText();
doc.close();

If not, just send over a unit test that triggers the problem with Tika. We'll
pop the test in the Tika codebase, fix it here, and check with Tika after our
next release.

(There's a good overlap in POI and Tika committers!)

--- Comment #5 from Karl Wright <[hidden email]> ---
Created attachment 35277
  --> https://bz.apache.org/bugzilla/attachment.cgi?id=35277&action=edit
Testbed for exercising POI classloader

This testbed needs xmlbeans-2.6.0.jar, poi-3.15.jar, poi-ooxml-3.15.jar, and
poi-ooxml-schemas-3.15.jar added to the lib directory after unzipping.

--- Comment #6 from Andreas Beeker <[hidden email]> ---
Please check whether the fix for bug #60226, which was applied after 3.15,
makes any difference.

--- Comment #7 from Karl Wright <[hidden email]> ---
I've uploaded the testbed.  The testbed does not seem to cause the failure we
were seeing, however.

If the codepath for a docx document goes through poi-ooxml-schemas, that means
that we're seeing something that's being triggered somehow by Tika.

In the MCF setup, we have tika-core at the root level, and tika-parsers (and
most of its dependencies, including poi) at the connector level.  This has
worked in the past, at least until POI started using reflection to look up
classes in poi-ooxml-schemas.

The classloader setup I'm using in the testbed is cribbed directly from
ManifoldCF classes that set the class loaders up, so that's clearly not the
issue.  Any ideas?
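
For readers unfamiliar with this kind of layout, the visibility rule behind a
root/connector split can be sketched as follows. This is a minimal
illustration, not the ManifoldCF or testbed code: LoaderHierarchyDemo and the
resource name are hypothetical, the system class loader plays the "root"
level, and a URLClassLoader over a temporary directory plays the "connector"
level. A resource is used instead of a class only to keep the example
self-contained.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.Files;
import java.nio.file.Path;

final class LoaderHierarchyDemo {
    // Hypothetical name, standing in for anything shipped only at connector level.
    static final String RESOURCE = "connector-only/marker.txt";

    /** Returns {visibleFromRoot, visibleFromConnector}. */
    static boolean[] probe() {
        try {
            Path dir = Files.createTempDirectory("connector-lib");
            Path file = dir.resolve(RESOURCE);
            Files.createDirectories(file.getParent());
            Files.write(file, new byte[] {1});

            ClassLoader root = ClassLoader.getSystemClassLoader();
            URL[] connectorPath = {dir.toUri().toURL()};
            // The connector loader delegates to root first, then searches its own path.
            try (URLClassLoader connector = new URLClassLoader(connectorPath, root)) {
                return new boolean[] {
                    root.getResource(RESOURCE) != null,
                    connector.getResource(RESOURCE) != null
                };
            } finally {
                Files.deleteIfExists(file);
                Files.deleteIfExists(file.getParent());
                Files.deleteIfExists(dir);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        boolean[] seen = probe();
        System.out.println("root sees resource:      " + seen[0]);
        System.out.println("connector sees resource: " + seen[1]);
    }
}
```

Delegation only runs from child to parent, so anything resolved through the
root loader cannot reach connector-level jars, while connector-level code can
still reach everything at root level.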

Karl Wright <[hidden email]> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEEDINFO                    |NEW

--- Comment #8 from PJ Fanning <[hidden email]> ---
Could you provide some stack traces?

--- Comment #9 from Karl Wright <[hidden email]> ---
Created attachment 35278
  --> https://bz.apache.org/bugzilla/attachment.cgi?id=35278&action=edit
User stack trace with all jars loaded by connector classloader

Attached is an image of the user stack trace when all Tika parser dependencies
(including POI) are loaded by the connector classloader.

--- Comment #10 from Karl Wright <[hidden email]> ---
Created attachment 35279
  --> https://bz.apache.org/bugzilla/attachment.cgi?id=35279&action=edit
Image of user stacktrace when poi-ooxml-schema.jar moved to root classloader

This is the user stack trace when poi-ooxml-schemas and poi-ooxml are both
moved to the root classloader.

--- Comment #11 from Karl Wright <[hidden email]> ---
Mr. Beeker, I cannot readily modify Tika to call the new method.

--- Comment #12 from PJ Fanning <[hidden email]> ---
Could you call the setClassLoader method before calling Tika code?
https://svn.apache.org/viewvc/poi/trunk/src/ooxml/java/org/apache/poi/POIXMLTypeLoader.java?view=markup&pathrev=1763922
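
The general shape of "set a loader before calling into the library" can be
sketched like this. It is a hypothetical stand-in, not POIXMLTypeLoader:
LoaderOverrideSketch, its per-thread storage, and withLoader() are assumptions
made for illustration, and the linked source should be consulted for how the
real setClassLoader behaves.

```java
import java.util.function.Supplier;

final class LoaderOverrideSketch {
    // Hypothetical per-thread override; null means "use this class's own loader".
    private static final ThreadLocal<ClassLoader> OVERRIDE = new ThreadLocal<>();

    static void setClassLoader(ClassLoader loader) {
        OVERRIDE.set(loader);
    }

    /** Looks the class up through the override if one is set, else the default. */
    static boolean canLoad(String name) {
        ClassLoader loader = OVERRIDE.get();
        if (loader == null) {
            loader = LoaderOverrideSketch.class.getClassLoader();
        }
        try {
            Class.forName(name, false, loader);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    /** Runs body with the override in place and restores the previous value afterwards. */
    static <T> T withLoader(ClassLoader loader, Supplier<T> body) {
        ClassLoader previous = OVERRIDE.get();
        setClassLoader(loader);
        try {
            return body.get();
        } finally {
            setClassLoader(previous);
        }
    }
}
```

A caller would wrap its parsing call in withLoader(connectorLoader, ...), so
the override cannot leak into unrelated work on the same thread.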

--- Comment #13 from Karl Wright <[hidden email]> ---
Is the patched POI binary available on Maven yet?

--- Comment #14 from PJ Fanning <[hidden email]> ---
Try POI 3.16 or 3.17-beta1. Both are in Maven Central.

--- Comment #15 from Karl Wright <[hidden email]> ---
OK -- I am certain that this workaround would solve the problem.  But it is
pretty ugly, and we already have a workaround implemented and a patch release
is up for a vote.

May I ask if the POI team intends to address this in a more official manner?
If not, perhaps the Tika team should?  If neither team wishes to address the
issue, I will put this fix into ManifoldCF.

--- Comment #16 from Andreas Beeker <[hidden email]> ---
> But it is pretty ugly, and we already have a workaround implemented and a patch release is up for a vote.

Could you recommend how to use/identify the correct classloader?
(As we need to access the .xsb files, it's neither the thread context
classloader nor the classloader of any generated OOXML XMLBeans class.)

As I didn't have much of an idea what the OSGi classloader does, I thought that
this was the KISS way to solve the problem, but I'm happy to learn ... and if
you ask nicely, we could postpone our final version, for which I wanted to
prepare the release candidate today.

> May I ask if the POI team intends to address this in a more official manner?

If "official manner" means a technical solution: if you know how, we will
adopt it.  Otherwise it's officially in our FAQ:

http://poi.apache.org/faq.html#faq-N1029C

> We discovered this trying to integrate the latest Tika (which uses POI version 3.9) ...

Seriously? I always rant about users not being able to use a recent version,
but I never would imagine that a PMC chair would write something like that.

--- Comment #17 from Karl Wright <[hidden email]> ---
>>>
Seriously? I always rant about users not being able to use a recent version,
but I never would imagine that a PMC chair would write something like that.
<<<

My apologies -- the actual POI version we were using was 3.15, not 3.9, as I
explained elsewhere.  I hope that is clear now.

>>>
Could you recommend how to use/identify the correct classloader?
<<<

Yes, as I explained above, what you really want to emulate is what happens when
you do Class.forName(String classname).  There is a Class.forName() variant
which accepts a passed-in class loader, which is what you use.  So you need to
do this:

>>>
Invoking this method is equivalent to:
Class.forName(className, true, currentLoader)
where currentLoader denotes the defining class loader of the current class.
<<<

The defining class loader of the current class (call it Xxx) is:

Xxx.class.getClassLoader();

That should be the default behavior, I believe.  What do you think?
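
The equivalence quoted from the Javadoc can be checked with a few lines. This
is a small sketch only: ForNameEquivalence is a made-up class name, and it
shows the two call forms side by side rather than anything specific to POI.

```java
final class ForNameEquivalence {
    /** True when the implicit and explicit lookups resolve to the same Class object. */
    static boolean sameClass(String name) {
        try {
            // Implicit form: uses the defining loader of the calling class.
            Class<?> implicit = Class.forName(name);
            // Explicit form: the same loader, passed in by hand.
            ClassLoader currentLoader = ForNameEquivalence.class.getClassLoader();
            Class<?> explicit = Class.forName(name, true, currentLoader);
            return implicit == explicit;
        } catch (ClassNotFoundException e) {
            throw new IllegalArgumentException("not visible to this class's loader: " + name, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(sameClass(ForNameEquivalence.class.getName()));
    }
}
```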
