https://bz.apache.org/bugzilla/show_bug.cgi?id=57699
Bug ID: 57699
Summary: Suport Strict OOXML files
Product: POI
Version: 3.12-dev
Hardware: PC
OS: Linux
Status: NEW
Severity: normal
Priority: P2
Component: XSSF
Assignee: [hidden email]
Reporter: [hidden email]

Office 2013 added the option to save as "Strict" OOXML files, which, as reported in http://stackoverflow.com/questions/29023542/how-to-parse-strict-xlsx-file-in-java, have a different core relationship type.

Some sample Strict xlsx files were added in r1666410. POI needs to support them (for reading at least).

--
You are receiving this mail because:
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]
--- Comment #1 from Nick Burch <[hidden email]> ---
It looks like some namespace munging is going to be required to properly support this. After making changes to ExtractorFactory and POIXMLDocumentPart to handle the differing core relationship type, it now fails at the xmlbeans level:

org.apache.xmlbeans.XmlException: error: The document is not a workbook@http://schemas.openxmlformats.org/spreadsheetml/2006/main: document element namespace mismatch expected "http://schemas.openxmlformats.org/spreadsheetml/2006/main" got "http://purl.oclc.org/ooxml/spreadsheetml/main"
    at org.apache.poi.xssf.usermodel.XSSFWorkbook.onDocumentRead(XSSFWorkbook.java:399)
Caused by: org.apache.xmlbeans.XmlException: error: The document is not a workbook@http://schemas.openxmlformats.org/spreadsheetml/2006/main: document element namespace mismatch expected "http://schemas.openxmlformats.org/spreadsheetml/2006/main" got "http://purl.oclc.org/ooxml/spreadsheetml/main"
    at org.apache.xmlbeans.impl.store.Locale.verifyDocumentType(Locale.java:459)
    at org.apache.xmlbeans.impl.store.Locale.autoTypeDocument(Locale.java:364)
    at org.apache.xmlbeans.impl.store.Locale.parseToXmlObject(Locale.java:1280)
    at org.apache.xmlbeans.impl.store.Locale.parseToXmlObject(Locale.java:1264)
    at org.apache.xmlbeans.impl.schema.SchemaTypeLoaderBase.parse(SchemaTypeLoaderBase.java:345)
    at org.openxmlformats.schemas.spreadsheetml.x2006.main.WorkbookDocument$Factory.parse(Unknown Source)

The purl namespace crops up in most of the xml files at least somewhere, so a general mapping solution is probably required if we want to take this further.
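A minimal sketch of what such a general mapping could look like. StrictNamespaceMap is a hypothetical helper, not an existing POI class. Only the spreadsheetml pair is taken from the exception above; the relationships pair is an assumption that follows the same URI pattern and should be verified before use.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper (not part of POI): looks up the Transitional
// equivalent of a Strict OOXML namespace URI.
final class StrictNamespaceMap {
    private static final Map<String, String> STRICT_TO_TRANSITIONAL = new HashMap<>();

    static {
        // Pair taken from the XmlException in comment #1.
        STRICT_TO_TRANSITIONAL.put(
            "http://purl.oclc.org/ooxml/spreadsheetml/main",
            "http://schemas.openxmlformats.org/spreadsheetml/2006/main");
        // Assumed pair, following the same pattern; verify against real files.
        STRICT_TO_TRANSITIONAL.put(
            "http://purl.oclc.org/ooxml/officeDocument/relationships",
            "http://schemas.openxmlformats.org/officeDocument/2006/relationships");
    }

    private StrictNamespaceMap() {}

    /** Returns the Transitional URI for a known Strict URI, or the input unchanged. */
    static String toTransitional(String uri) {
        String mapped = STRICT_TO_TRANSITIONAL.get(uri);
        return mapped != null ? mapped : uri;
    }
}
```

Unknown URIs pass through untouched, so the lookup can be applied to every namespace in a part without first checking whether the file is Strict.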
Dominik Stadler <[hidden email]> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Severity|normal                      |enhancement
Dominik Stadler <[hidden email]> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |[hidden email]

--- Comment #2 from Dominik Stadler <[hidden email]> ---
*** Bug 57914 has been marked as a duplicate of this bug. ***
--- Comment #3 from PJ Fanning <[hidden email]> ---
http://pyxb.sourceforge.net/PyXB-1.2.2/bundles.html has a list of namespace URLs that could be used in a mapping class.
--- Comment #4 from PJ Fanning <[hidden email]> ---
Without spending much time on this, I have been unable to track down the XSDs with the purl namespaces (OOXML Strict). By most accounts, they should be very similar to the OOXML Transitional schemas other than the namespaces.

Two approaches come to mind:

1. In poi-ooxml-schemas, create XmlBeans for the OOXML Strict namespaces by using modified versions of the OOXML Transitional schemas.
2. Support a transformation of the XML in input docs so that the OOXML Strict namespaces are replaced by their OOXML Transitional equivalents.
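To illustrate the second approach, here is a rough sketch that rewrites namespaces in a single XML part using the JDK's DOM and identity Transformer. StrictToTransitionalRewriter is a hypothetical class, not the converter discussed in this thread, and it only maps the one namespace pair quoted in comment #1. A real converter would need the full mapping and would probably stream rather than build a DOM.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Attr;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical sketch: replaces the Strict spreadsheetml namespace with the
// Transitional one throughout a single XML part.
final class StrictToTransitionalRewriter {
    static final String STRICT_MAIN = "http://purl.oclc.org/ooxml/spreadsheetml/main";
    static final String TRANSITIONAL_MAIN =
        "http://schemas.openxmlformats.org/spreadsheetml/2006/main";

    private StrictToTransitionalRewriter() {}

    // Single-pair mapping for illustration only.
    static String toTransitional(String uri) {
        return STRICT_MAIN.equals(uri) ? TRANSITIONAL_MAIN : uri;
    }

    /** Parses the XML, rewrites element, attribute and xmlns namespaces, and serializes it. */
    static String rewrite(String xml) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true);
            dbf.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
            Document doc = dbf.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
            rewriteElement(doc, doc.getDocumentElement());

            Transformer identity = TransformerFactory.newInstance().newTransformer();
            identity.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            identity.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not rewrite XML part", e);
        }
    }

    private static void rewriteElement(Document doc, Element el) {
        // Snapshot the attributes first, because renaming can reorder the live map.
        NamedNodeMap attrs = el.getAttributes();
        List<Attr> snapshot = new ArrayList<>();
        for (int i = 0; i < attrs.getLength(); i++) {
            snapshot.add((Attr) attrs.item(i));
        }
        for (Attr attr : snapshot) {
            String ns = attr.getNamespaceURI();
            if (XMLConstants.XMLNS_ATTRIBUTE_NS_URI.equals(ns)) {
                // xmlns / xmlns:prefix declaration: the URI is the attribute value.
                attr.setValue(toTransitional(attr.getValue()));
            } else if (ns != null && !ns.equals(toTransitional(ns))) {
                doc.renameNode(attr, toTransitional(ns), attr.getNodeName());
            }
        }

        Node current = el;
        String elNs = el.getNamespaceURI();
        if (elNs != null && !elNs.equals(toTransitional(elNs))) {
            current = doc.renameNode(el, toTransitional(elNs), el.getNodeName());
        }

        // Recurse into child elements; capture the sibling before the child is touched.
        Node child = current.getFirstChild();
        while (child != null) {
            Node next = child.getNextSibling();
            if (child instanceof Element) {
                rewriteElement(doc, (Element) child);
            }
            child = next;
        }
    }
}
```

Calling StrictToTransitionalRewriter.rewrite on a workbook part whose root is in the purl namespace should yield the same document in the schemas.openxmlformats.org namespace, which is what the XmlBeans parser expects. This covers namespaces only; any other differences between Strict and Transitional content are out of scope for the sketch.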
--- Comment #5 from PJ Fanning <[hidden email]> ---
I have added some basic prototype code for converting Strict OOXML files to https://github.com/pjfanning/ooxml-strict-converter. There is still a lot of work to do, but I'm posting it here in case anyone wants to review what I'm doing.
--- Comment #6 from Javen O'Neal <[hidden email]> ---
Looks good so far.

In the interest of committing this early so that we can update our unit tests to handle XSSF Strict:

* Are we planning on having XSSFWorkbook transparently handle strict workbooks, or will we have a different class for that? Will this be in the o.a.p.xssf.usermodel package, or are we going to package it in o.a.p.xssf.extractor or create o.a.p.xssf.strict?

In the long term, I would like POI to be able to read and write strict files without having to downconvert to non-strict. This probably affects how we go about packaging this, making it more than a distant example or a static utility converter class.
--- Comment #7 from Dominik Stadler <[hidden email]> ---
FYI, there is also a converter provided by Microsoft: https://www.microsoft.com/en-us/download/details.aspx?id=38828. It could come in handy when doing development work on this topic.
--- Comment #8 from PJ Fanning <[hidden email]> ---
Hi Javen, I understand that we will want to be able to save POI documents using Strict OOXML, but my focus for now is just on down-porting to Transitional OOXML to allow parsing.

For now, I'm looking at a standalone utility to down-port, but this could be plugged into XSSFWorkbook and the XSSF extractor under the hood. They could pre-process the input doc to determine whether it is Strict OOXML, down-port it to a temp file if so, and then read from the temp file.

My prototype code is working now for the SimpleStrict.xlsx in the POI test data folder. I'll see about testing with more input files.
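The pre-processing check described above might be sketched roughly as follows. StrictPackageSniffer is a hypothetical name, and the Strict core document relationship type used here is an assumption derived from the purl namespace pattern seen earlier in the thread; it is not quoted anywhere in this bug. The sketch scans the package's root relationships part (_rels/.rels) for that type.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

// Hypothetical helper: guesses whether an OOXML package is Strict by looking
// at the relationship types listed in its root relationships part.
final class StrictPackageSniffer {
    // Assumed Strict core document relationship type; verify against sample files.
    static final String STRICT_CORE_DOCUMENT =
        "http://purl.oclc.org/ooxml/officeDocument/relationships/officeDocument";

    private StrictPackageSniffer() {}

    /** Returns true if _rels/.rels mentions the assumed Strict core relationship type. */
    static boolean looksStrict(InputStream packageStream) {
        try (ZipInputStream zip = new ZipInputStream(packageStream)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if ("_rels/.rels".equals(entry.getName())) {
                    String rels = new String(readCurrentEntry(zip), StandardCharsets.UTF_8);
                    return rels.contains(STRICT_CORE_DOCUMENT);
                }
            }
            return false; // no root relationships part found
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Reads the remaining bytes of the current zip entry.
    private static byte[] readCurrentEntry(InputStream in) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        byte[] chunk = new byte[4096];
        int n;
        while ((n = in.read(chunk)) != -1) {
            buffer.write(chunk, 0, n);
        }
        return buffer.toByteArray();
    }
}
```

A plain substring test keeps the sketch short. A caller that gets true back could down-port the package to a temp file and open that instead, as the comment suggests. Note that the check consumes the stream, so the caller needs a second stream or a file to read from afterwards.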
Javen O'Neal <[hidden email]> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|Suport Strict OOXML files   |Support Strict OOXML files
Sergei Malafeev <[hidden email]> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |[hidden email]
Matafagafo <[hidden email]> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |[hidden email]
[hidden email] changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |[hidden email]

--- Comment #9 from [hidden email] ---
Created attachment 35988
  --> https://bz.apache.org/bugzilla/attachment.cgi?id=35988&action=edit
Engineering portfolio
Dominik Stadler <[hidden email]> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #35988|Engineering portfolio       |spam
        description|                            |
  Attachment #35988|T MASOCHA R062075B MEPE504  |spam
           filename|HVE Portfolio final         |
                   |(1).docx                    |
  Attachment #35988|0                           |1
        is obsolete|                            |
  Attachment #35988|application/vnd.openxmlform |application/binary
          mime type|ats-officedocument.wordproc |
                   |essingml.document           |
[hidden email] changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |[hidden email]
--- Comment #10 from Piotr Wilkin <[hidden email]> ---
Over two years have passed - has there been any work done on this / any milestone?
--- Comment #11 from Dominik Stadler <[hidden email]> ---
No, it seems none of the contributors needs it urgently enough to warrant spending time on it. As this is a purely community-supported project without commercial backing, your best bet for getting progress on this is to provide patches/time yourself if you can contribute in any way.
--- Comment #12 from Piotr Wilkin <[hidden email]> ---
Yeah, which is why I was asking :> There were some partial results from some people; I'll see if something can be done.
[hidden email] changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |[hidden email]

--- Comment #13 from [hidden email] ---
I am interested in working on this issue. I am willing to work with someone if somebody is already working on it; otherwise I can take it on independently.