[Bug 58087] New: Integration of Visio .vsdx parser into POI

classic Classic list List threaded Threaded
21 messages Options
12
Reply | Threaded
Open this post in threaded view
|

[Bug 58087] New: Integration of Visio .vsdx parser into POI

Bugzilla from bugzilla@apache.org
https://bz.apache.org/bugzilla/show_bug.cgi?id=58087

            Bug ID: 58087
           Summary: Integration of Visio .vsdx parser into POI
           Product: POI
           Version: unspecified
          Hardware: PC
                OS: All
            Status: NEW
          Severity: enhancement
          Priority: P2
         Component: SS Common
          Assignee: [hidden email]
          Reporter: [hidden email]
        Depends on: 57484

We've just released poi-visio, an Apache 2.0 licensed .vsdx parser built on
POI. It would be great to add this to the suite of parsers that POI currently
has.

One blocking constraint is that the project depends on this patch that's been
open for at least 6 months:
https://bz.apache.org/bugzilla/show_bug.cgi?id=57484

Github link: https://github.com/BBN-D/poi-visio

--
You are receiving this mail because:
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|

[Bug 58087] Integration of Visio .vsdx parser into POI

Bugzilla from bugzilla@apache.org
https://bz.apache.org/bugzilla/show_bug.cgi?id=58087

Nick Burch <[hidden email]> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |NEEDINFO

--- Comment #1 from Nick Burch <[hidden email]> ---
It's certainly something we could look at doing, but you would need to push
through some paperwork on your side. The process we'd need to follow is
described at http://incubator.apache.org/ip-clearance/index.html - given the
organisations mentioned in the readme of the project, do you think we have any
hope of getting the OK + fairly simple grant form done before the heat-death of
the universe? :)

In the mean time, could you do a patch for
https://svn.apache.org/repos/asf/poi/site/src/documentation/content/xdocs/related-projects.xml
to mention the project?

--
You are receiving this mail because:
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|

[Bug 58087] Integration of Visio .vsdx parser into POI

Bugzilla from bugzilla@apache.org
In reply to this post by Bugzilla from bugzilla@apache.org
https://bz.apache.org/bugzilla/show_bug.cgi?id=58087

--- Comment #2 from virtuald <[hidden email]> ---
Interestingly enough, we started the process to open source some projects last
September, and finally figured out most of the red tape issues -- so this
project only took about two months to get out into the open. The government
doesn't own any IP in the project, so we would only need to get our legal to
sign off on any CLA et al... and I think we might be able to avoid our parent
company. So, maybe a month or two if I keep bothering them.

I'll submit a patch for the external projects site. Also, I'd love for you to
push my other patch (57484) that way I don't have to maintain a fork of POI. :)

--
You are receiving this mail because:
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|

[Bug 58087] Integration of Visio .vsdx parser into POI

Bugzilla from bugzilla@apache.org
In reply to this post by Bugzilla from bugzilla@apache.org
https://bz.apache.org/bugzilla/show_bug.cgi?id=58087

--- Comment #3 from virtuald <[hidden email]> ---
Sorry for the delay, been off vacationing...

My understanding of the documents you linked to seems to imply that there are
two ways this could work out.

a) If my company signs the Corporate CLA, then the code can be imported under
that agreement
b) The other way is they could sign the software grant form
(http://www.apache.org/licenses/software-grant.txt), no Corporate CLA required

Does that sound about right?

--
You are receiving this mail because:
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|

[Bug 58087] Integration of Visio .vsdx parser into POI

Bugzilla from bugzilla@apache.org
In reply to this post by Bugzilla from bugzilla@apache.org
https://bz.apache.org/bugzilla/show_bug.cgi?id=58087
Bug 58087 depends on bug 57484, which changed state.

Bug 57484 Summary: [PATCH] Allow processing of non-OOXML core namespace packages
https://bz.apache.org/bugzilla/show_bug.cgi?id=57484

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED

--
You are receiving this mail because:
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|

[Bug 58087] Integration of Visio .vsdx parser into POI

Bugzilla from bugzilla@apache.org
In reply to this post by Bugzilla from bugzilla@apache.org
https://bz.apache.org/bugzilla/show_bug.cgi?id=58087

--- Comment #4 from virtuald <[hidden email]> ---
Ping?

--
You are receiving this mail because:
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|

[Bug 58087] Integration of Visio .vsdx parser into POI

Bugzilla from bugzilla@apache.org
In reply to this post by Bugzilla from bugzilla@apache.org
https://bz.apache.org/bugzilla/show_bug.cgi?id=58087

--- Comment #5 from Nick Burch <[hidden email]> ---
Sorry, yes. Your company can either do a standalone Software Grant, or they can
file a CCLA and include a Software Grant in it. Best practice is normally to
pop the source code tarball/zip somewhere, take a suitable hash of it, and
include the hash in the grant. That way, people can review the code before, and
it's quite clear what the grant covered.

You personally should try to get an ICLA on file, so we can get you on board to
help maintain the code post-contribution.

Whether or not a CCLA is needed post-grant depends on the company in question.
What Apache really needs after that is an ICLA, where you (as the ongoing
contributor) basically confirm you've understood the Apache Software License
v2, and are willing + able to contribute under it. Depending on your employment
contract, you might not be able to sign that on your own, if so, a CCLA can be
helpful to show that your employer is happy for you to be committing work that
your employment contract says is "theirs" to the ASF.

Hope that helps? Sorry for the delay, I think everyone thought that someone
else was going to reply...

--
You are receiving this mail because:
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|

[Bug 58087] Integration of Visio .vsdx parser into POI

Bugzilla from bugzilla@apache.org
In reply to this post by Bugzilla from bugzilla@apache.org
https://bz.apache.org/bugzilla/show_bug.cgi?id=58087

--- Comment #6 from virtuald <[hidden email]> ---
A CCLA + software grant form has been filled out and emailed to the Apache
Foundation secretary.

--
You are receiving this mail because:
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|

[Bug 58087] Integration of Visio .vsdx parser into POI

Bugzilla from bugzilla@apache.org
In reply to this post by Bugzilla from bugzilla@apache.org
https://bz.apache.org/bugzilla/show_bug.cgi?id=58087

--- Comment #7 from Nick Burch <[hidden email]> ---
Great stuff. Next step is to fill out the IP clearance template, and give the
IPMC a chance to review it.
http://incubator.apache.org/ip-clearance/ip-clearance-template.html . I can
hopefully do that on the weekend, if no-one beats me to it!

--
You are receiving this mail because:
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|

[Bug 58087] Integration of Visio .vsdx parser into POI

Bugzilla from bugzilla@apache.org
In reply to this post by Bugzilla from bugzilla@apache.org
https://bz.apache.org/bugzilla/show_bug.cgi?id=58087

--- Comment #8 from Nick Burch <[hidden email]> ---
I've started on the clearance template, more still to do. It's at
http://incubator.apache.org/ip-clearance/poi-visio.html and the source is
https://svn.apache.org/repos/asf/incubator/public/trunk/content/ip-clearance/poi-visio.xml

--
You are receiving this mail because:
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|

[Bug 58087] Integration of Visio .vsdx parser into POI

Bugzilla from bugzilla@apache.org
In reply to this post by Bugzilla from bugzilla@apache.org
https://bz.apache.org/bugzilla/show_bug.cgi?id=58087

--- Comment #9 from virtuald <[hidden email]> ---
Thanks Nick!

One thing I neglected to add to the software grant was
https://github.com/BBN-D/ooxml-visio-schemas, which is required to run the
existing code.

I think though, that if POI were to adopt this code, that it would make more
sense to just integrate the correct XSD with the existing ooxml packages,
instead of creating a new one just for visio.

The PDF download page for the visio specification
(https://msdn.microsoft.com/en-us/library/hh645006%28v=office.12%29.aspx) says
"Regardless of any other terms that are contained in the terms of use for the
Microsoft website that hosts this documentation, you may make copies of it in
order to develop implementations of the technologies described in the Open
Specifications and may distribute portions of it in your implementations using
these technologies or your documentation as necessary to properly document the
implementation.".

Which is good. The XSD is in appendix A of the specification, and has a BSD
style license and "No right to create modifications or derivatives of this
Specification is granted herein."... but I assume the download page notice
takes precedence.

--
You are receiving this mail because:
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|

[Bug 58087] Integration of Visio .vsdx parser into POI

Bugzilla from bugzilla@apache.org
In reply to this post by Bugzilla from bugzilla@apache.org
https://bz.apache.org/bugzilla/show_bug.cgi?id=58087

--- Comment #10 from Nick Burch <[hidden email]> ---
Those looks like they're covered by the same terms as the other OOXML schemas
we use, such as the Microsoft OSP, so I think that should be fine to use the
same way

--
You are receiving this mail because:
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|

[Bug 58087] Integration of Visio .vsdx parser into POI

Bugzilla from bugzilla@apache.org
In reply to this post by Bugzilla from bugzilla@apache.org
https://bz.apache.org/bugzilla/show_bug.cgi?id=58087

--- Comment #11 from Nick Burch <[hidden email]> ---
On a different topic, I wonder how easy/hard it might be to replace the curves
external library dependency with the same approach taken in the new common SL
code, eg org.apache.poi.sl.draw.geom.Path?

--
You are receiving this mail because:
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|

[Bug 58087] Integration of Visio .vsdx parser into POI

Bugzilla from bugzilla@apache.org
In reply to this post by Bugzilla from bugzilla@apache.org
https://bz.apache.org/bugzilla/show_bug.cgi?id=58087

--- Comment #12 from virtuald <[hidden email]> ---
I tried using java.awt.geom for all my drawing needs, but the one item that I
could not find (and did not want to write) was a NURBS spline implementation
(and splines are used in visio all over the place, it turns out). I believe
that the NURBS spline is the only thing that is used from curvesapi.

curvesapi is from an apparently unmaintained package on sourceforge.

--
You are receiving this mail because:
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|

[Bug 58087] Integration of Visio .vsdx parser into POI

Bugzilla from bugzilla@apache.org
In reply to this post by Bugzilla from bugzilla@apache.org
https://bz.apache.org/bugzilla/show_bug.cgi?id=58087

virtuald <[hidden email]> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
          Component|SS Common                   |XDGF

--- Comment #13 from virtuald <[hidden email]> ---
I've put together a first pass at integration on this branch on github:

https://github.com/virtuald/poi/tree/xdgf

Using the XSD integration branch as a starting point (bug 58408) I first copied
the original code over to src/ooxml (1st commit), and then ran the eclipse
formatter on the code using POI's custom profile (2nd commit), and then added
the curvesapi dependency. Compile + test seems to work.

Next, I'll do some cleanup, and I'll add a text extractor + unit test. I
suspect at some point some basic code review from someone would be useful, and
I can do those adjustments as necessary also.

--
You are receiving this mail because:
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|

[Bug 58087] Integration of Visio .vsdx parser into POI

Bugzilla from bugzilla@apache.org
In reply to this post by Bugzilla from bugzilla@apache.org
https://bz.apache.org/bugzilla/show_bug.cgi?id=58087

--- Comment #14 from virtuald <[hidden email]> ---
Ok, added some cleanup plus some simple tests. Waiting on a light code review
of some sort, then I'll "git svn dcommit" it to trunk.

--
You are receiving this mail because:
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|

[Bug 58087] Integration of Visio .vsdx parser into POI

Bugzilla from bugzilla@apache.org
In reply to this post by Bugzilla from bugzilla@apache.org
https://bz.apache.org/bugzilla/show_bug.cgi?id=58087

--- Comment #15 from virtuald <[hidden email]> ---
Now that POI 3.13 has been released, anything left to do here other than push
the code in? I'll be traveling starting tomorrow, so I won't be able to push my
changes in until Saturday at the earliest.

--
You are receiving this mail because:
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|

[Bug 58087] Integration of Visio .vsdx parser into POI

Bugzilla from bugzilla@apache.org
In reply to this post by Bugzilla from bugzilla@apache.org
https://bz.apache.org/bugzilla/show_bug.cgi?id=58087

--- Comment #16 from Dominik Stadler <[hidden email]> ---
Overall it looks good, although i did not look into the code in detail. Some
more unit testing would be nice to get at least part of the code covered a bit
more as otherwise overall coverage will drop a lot when this is pushed in.

--
You are receiving this mail because:
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|

[Bug 58087] Integration of Visio .vsdx parser into POI

Bugzilla from bugzilla@apache.org
In reply to this post by Bugzilla from bugzilla@apache.org
https://bz.apache.org/bugzilla/show_bug.cgi?id=58087

--- Comment #17 from virtuald <[hidden email]> ---
I guess at this point my main question is whether the IP stuff has been
completed? If it is, then I'd like to push my changes in that way others can
play with it and we can see what's broken (or not).

I think I've figured out how to run the coverage tests... on the scale of POI,
it's really not so bad. Looks like there's ~2100 lines of XDGF code, with 1400
missed lines. 30% is better than I would have guessed.

I mean, one low-effort solution that I could throw together to bump up the
coverage is just add a document with a lot of crazy stuff in it, and add a test
to parse that.

--
You are receiving this mail because:
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|

[Bug 58087] Integration of Visio .vsdx parser into POI

Bugzilla from bugzilla@apache.org
In reply to this post by Bugzilla from bugzilla@apache.org
https://bz.apache.org/bugzilla/show_bug.cgi?id=58087

--- Comment #18 from Nick Burch <[hidden email]> ---
The IP Clearance has passed (well, a week or so ago, but I've only just
remembered to update the site...)

You are now clear to go ahead and import the code from Github + update the
license headers

Test coverage wise, we tend to test
problematic/complex/confusing/key/heavily-reused parts of the code with "true"
code-level unit tests. Most other parts get tested by doing
create/save/load/read or load/modify/save/load/read or read/check type
unit/basic-integration tests.

Creating a handful of small test files touching key areas of functionality,
then adding read+check and read+text-extract tests is probably the quickest way
to increase coverage. Then look at using those for read+change+save+read+check
tests, and if possible later create+save+"ensure roughly the same as this other
one" tests!

--
You are receiving this mail because:
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

12