Writing unit tests for the powerpoint code

Nick Burch
Hi All

I've been following the discussions on unit testing the formula code with
interest. I'm hoping to write some unit tests shortly for the powerpoint
code, but I'm not sure how best to go about it.

I can think of a few useful unit tests straight off. Things like ensuring
that a file gets read in and is then written out the same, checking that
records find their children correctly, that sort of thing. I guess to do
these I'll need to knock up a few test ppt files to include with the
tests, but this shouldn't be too bad.

The main issue is that much of the hard work ATM is getting powerpoint to
like apparently valid files. How best should I handle a test that spits
out a file, but needs someone to copy it to windows, open up powerpoint,
and see if it likes the file?

Thanks
Nick


---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
Mailing List:    http://jakarta.apache.org/site/mail2.html#poi
The Apache Jakarta POI Project: http://jakarta.apache.org/poi/


RE: Writing unit tests for the powerpoint code

Michael Zalewski
Those are good tests.

But think smaller. Think about testing a single method. It's kind of hard to
read a Power Point file, write it out, and expect that the result will be
byte for byte the same. It probably won't. It certainly doesn't happen with
Excel, which strips out all rich text formatting, converts all MULBLNK
records to runs of BLANK records, and many other things. I guess reading a
file into HSLF, writing it out, and verifying that it's the same Power Point
Presentation is more of a functional test than a unit test.

Here is a good example of a unit test: test each record by converting from an
array of bytes to an Atom record (for example).

One unit test simply tries to construct various Atom records, by passing a
known byte array. You can figure out what to test by looking at the logic in
your code. For example, TextBytesAtom() seems to have logic to deal with
records that might be shorter than 8 bytes. So write a unit test that passes
in fewer than 8 bytes. (I bet you find a bug to correct).

Another unit test makes the atom object, and tests that each field is as
expected. Another unit test converts the java object back to the array of
bytes, and compares that each byte is as expected.
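
The construct / inspect / serialise cycle described above might look
something like the sketch below. SketchAtom is a hypothetical stand-in, not
an HSLF class. The layout it assumes (2 bytes of instance data, a 2-byte
type, a 4-byte length, then the content, all little-endian) and the sample
type value are assumptions made for illustration. Plain checks stand in for
JUnit assertions so that the example runs on its own.

```java
import java.util.Arrays;

public class AtomRoundTripSketch {

    /** Hypothetical atom, NOT an HSLF class: assumed 8-byte header, then content. */
    static final class SketchAtom {
        final int instanceData; // assumed: bytes 0-1, little-endian
        final int type;         // assumed: bytes 2-3, little-endian
        final byte[] content;   // assumed: everything after the 8-byte header

        SketchAtom(byte[] source) {
            // The "shorter than 8 bytes" case: reject it instead of reading past the end.
            if (source == null || source.length < 8) {
                throw new IllegalArgumentException("need at least an 8-byte header");
            }
            instanceData = readUShort(source, 0);
            type = readUShort(source, 2);
            long length = readUInt(source, 4);
            if (length != source.length - 8) {
                throw new IllegalArgumentException("length field does not match content");
            }
            content = Arrays.copyOfRange(source, 8, source.length);
        }

        /** Serialises the atom back to the byte layout it was read from. */
        byte[] toBytes() {
            byte[] out = new byte[8 + content.length];
            writeUShort(out, 0, instanceData);
            writeUShort(out, 2, type);
            writeUInt(out, 4, content.length);
            System.arraycopy(content, 0, out, 8, content.length);
            return out;
        }
    }

    static int readUShort(byte[] b, int off) {
        return (b[off] & 0xFF) | ((b[off + 1] & 0xFF) << 8);
    }

    static long readUInt(byte[] b, int off) {
        return (b[off] & 0xFFL) | ((b[off + 1] & 0xFFL) << 8)
             | ((b[off + 2] & 0xFFL) << 16) | ((b[off + 3] & 0xFFL) << 24);
    }

    static void writeUShort(byte[] b, int off, int v) {
        b[off] = (byte) v;
        b[off + 1] = (byte) (v >>> 8);
    }

    static void writeUInt(byte[] b, int off, long v) {
        for (int i = 0; i < 4; i++) {
            b[off + i] = (byte) (v >>> (8 * i));
        }
    }

    static void check(boolean condition, String message) {
        if (!condition) {
            throw new AssertionError(message);
        }
    }

    /** Known input: instance data 0, illustrative type 0x0FA8, length 3, content "abc". */
    static byte[] knownBytes() {
        return new byte[] {0x00, 0x00, (byte) 0xA8, 0x0F, 0x03, 0x00, 0x00, 0x00, 'a', 'b', 'c'};
    }

    public static void main(String[] args) {
        byte[] known = knownBytes();

        // Test 1: construction from a known byte array, then each field is as expected.
        SketchAtom atom = new SketchAtom(known);
        check(atom.instanceData == 0, "instance data");
        check(atom.type == 0x0FA8, "type");
        check(Arrays.equals(atom.content, new byte[] {'a', 'b', 'c'}), "content");

        // Test 2: converting back to bytes reproduces the input byte for byte.
        check(Arrays.equals(atom.toBytes(), known), "round trip");

        // Test 3: input shorter than the assumed 8-byte header is rejected cleanly.
        boolean rejected = false;
        try {
            new SketchAtom(new byte[] {0x00, 0x00, (byte) 0xA8});
        } catch (IllegalArgumentException expected) {
            rejected = true;
        }
        check(rejected, "short input should be rejected");

        System.out.println("all checks passed");
    }
}
```

In a real suite each numbered check would be its own test method, so that a
failure points straight at the behaviour that broke.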

Another thing I noticed about the power point format code:

I believe that each record begins with a two byte little-endian short
integer called 'instance data' (which you called 'info'). If you mask this
instance data with 0x000f, you obtain the record version. If the result is
0x000f, the record is a container. Otherwise, the record is an atom, but the
layout of the content depends on the version. The other 12 bits of the
instance data are a field of the record (the only field other than children if
the record is a container). The meaning of the value in these 12 bits
depends on the record type, but they are often 0.
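
The masking described above can be sketched in a few lines. This follows the
description in this message rather than any HSLF code: the class and method
names are hypothetical, and reading "the other 12 bits" as the upper 12 bits
of the short is an inference from the 0x000f mask.

```java
public class RecordHeaderSketch {

    /** Reads the first two bytes of a record as a little-endian unsigned short. */
    static int readInstanceData(byte[] record) {
        if (record == null || record.length < 2) {
            throw new IllegalArgumentException("need at least 2 bytes");
        }
        return (record[0] & 0xFF) | ((record[1] & 0xFF) << 8);
    }

    /** Low 4 bits: the record version. */
    static int version(int instanceData) {
        return instanceData & 0x000F;
    }

    /** A version of 0x000f marks a container; anything else is an atom. */
    static boolean isContainer(int instanceData) {
        return version(instanceData) == 0x000F;
    }

    /** Remaining 12 bits (assumed here to be the upper 12): the per-record field. */
    static int instanceField(int instanceData) {
        return (instanceData >>> 4) & 0x0FFF;
    }

    public static void main(String[] args) {
        int container = readInstanceData(new byte[] {0x0F, 0x00});
        int atom = readInstanceData(new byte[] {0x21, 0x00});

        System.out.println(isContainer(container)); // true
        System.out.println(version(atom));          // 1
        System.out.println(instanceField(atom));    // 2
        System.out.println(isContainer(atom));      // false
    }
}
```

Keeping the three accessors separate makes each one trivially testable with a
two-byte input, which fits the "think smaller" advice.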

That insight might explain why some of the DDF classes don't seem to parse
their corresponding atoms in Power Point -- I bet the versions are
different.


-----Original Message-----
From: Nick Burch [mailto:[hidden email]]
Sent: Thursday, May 12, 2005 4:55 AM
To: POI Developers List
Subject: Writing unit tests for the powerpoint code



RE: Writing unit tests for the powerpoint code

Avik Sengupta
On Thu, 2005-05-12 at 09:05 -0400, Michael Zalewski wrote:
> Those are good tests.
>
> But think smaller.

I was composing a mail with the same sentiments, but Michael says it
better!

It's difficult (impossible?) to fully automate a functional test.
However, testing smaller pieces is well worth the effort.

In HSSF, I do the 'opening in Excel' part manually, but that's a very
small part of the HSSF test suite, and such checks are usually important
only while you are initially developing the functionality. In the long
term, to prevent regressions, smaller unit tests are much more valuable.
If the file doesn't open properly, there is usually an underlying cause
that can be tested independently. When you have a regression, smaller
tests will obviously provide better diagnostics.



Re: Writing unit tests for the powerpoint code

andy-2
In reply to this post by Nick Burch
Unfortunately, you can't have that level of unit testing.  You can test
that things serialize the way you want them to, but the best you can do
is test that FooStructure when set to values X,Y,Z gives you bytes "08
0F 06 00 99".  We did some cool stuff in HSSF with generating records
for a while that also generated unit tests (recordgenerator is in
cvs)...not sure if it's useful for powerpoint.
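
A serialization check of that kind might be sketched as follows. FooStructure
and its layout (two little-endian shorts followed by a single byte) are
invented here purely so that the values produce the example bytes above. None
of it comes from HSSF or the record generator.

```java
public class FooStructureSketch {

    /** Hypothetical structure: two little-endian shorts followed by one byte. */
    static final class FooStructure {
        int x; // serialised as 2 bytes, little-endian
        int y; // serialised as 2 bytes, little-endian
        int z; // serialised as 1 byte

        byte[] serialize() {
            return new byte[] {
                (byte) x, (byte) (x >>> 8),
                (byte) y, (byte) (y >>> 8),
                (byte) z
            };
        }
    }

    /** Formats bytes as space-separated upper-case hex pairs for easy comparison. */
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        FooStructure foo = new FooStructure();
        foo.x = 0x0F08;
        foo.y = 0x0006;
        foo.z = 0x99;

        String actual = toHex(foo.serialize());
        if (!"08 0F 06 00 99".equals(actual)) {
            throw new AssertionError("unexpected bytes: " + actual);
        }
        System.out.println(actual);
    }
}
```

Comparing hex strings rather than raw arrays keeps the failure message
readable when a single byte is off.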

Sorry to be such a hard ass, it didn't start out that way overnight.  :-)

-Andy
