Modifying HSSFCell string values and SST record

Jason Height
All,

Basically I have finished implementing rich text support (yaa!) but have
come across an interesting quirk. Currently the SST record grows with every
new unique string and never collapses (unless I am missing something). For
example, if I continually update an HSSFCell string value by appending to
it, i.e. always overwriting the current value with a new unique string,
then the SST record will continue to grow, with many of the string entries
being orphaned.

The behaviour will be similar if I change the formatting of an
HSSFRichTextString.
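
To make the quirk concrete, here is a minimal sketch of an append-only string table. It is a toy model, not POI's actual SST code: the names ToySst, SstGrowthDemo, addString and appendRepeatedly are made up for illustration. It only assumes what is described above, namely that a new unique string is appended and nothing is ever removed.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy append-only shared string table (hypothetical, not POI's SSTRecord). */
class ToySst {
    private final List<String> strings = new ArrayList<>();
    private final Map<String, Integer> indexOf = new HashMap<>();

    /** Returns the index of s, appending it first if it is not in the table yet. */
    int addString(String s) {
        Integer existing = indexOf.get(s);
        if (existing != null) {
            return existing;
        }
        int index = strings.size();
        strings.add(s);
        indexOf.put(s, index);
        return index;
    }

    int size() {
        return strings.size();
    }
}

class SstGrowthDemo {
    /**
     * Overwrites a single cell 'updates' times, each time with a longer string.
     * Returns the table index the cell points at after the last update.
     */
    static int appendRepeatedly(ToySst sst, int updates) {
        String value = "";
        int cellIndex = -1;
        for (int i = 0; i < updates; i++) {
            value += "x";                     // always a new unique string
            cellIndex = sst.addString(value); // the old entry stays behind
        }
        return cellIndex;
    }

    public static void main(String[] args) {
        ToySst sst = new ToySst();
        int cellIndex = appendRepeatedly(sst, 5);
        // One cell, but every intermediate value is still in the table.
        System.out.println("cell index: " + cellIndex + ", table size: " + sst.size());
    }
}
```

After five updates the single cell references only the last entry, so the other four entries are orphans.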

Should there be something to delete orphaned strings in the SST record?
Presumably during serialization?

Thoughts?

Jason

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
Mailing List:    http://jakarta.apache.org/site/mail2.html#poi
The Apache Jakarta POI Project: http://jakarta.apache.org/poi/

Re: Modifying HSSFCell string values and SST record

andy-2
There should be a "coalesce" method.  However, it shouldn't run by
default as it is probably time consuming.  Many people would consider
extra filesize a happy tradeoff vs memory consumption or processor
utilization.  It just depends.

It might also be good to optionally (this costs more memory) keep
reference counts in the SST and support an "autocoalesce" mode so that
orphans can be cleaned up automatically; however, I really don't want
that to happen by default (too much memory used up for ref counts + too
much processor for looking them up on every change).
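
The reference-count idea could look roughly like the sketch below. This is only an illustration under my own assumptions: RefCountedSst, acquire, release and orphanCount are hypothetical names, not existing POI methods. The extra list of counts and the lookup on every change are exactly the memory and processor cost mentioned above.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical string table that tracks how many cells use each entry. */
class RefCountedSst {
    private final List<String> strings = new ArrayList<>();
    private final List<Integer> refCounts = new ArrayList<>();
    private final Map<String, Integer> indexOf = new HashMap<>();

    /** Called when a cell starts using s; returns the entry's index. */
    int acquire(String s) {
        Integer index = indexOf.get(s);
        if (index == null) {
            index = strings.size();
            strings.add(s);
            refCounts.add(0);
            indexOf.put(s, index);
        }
        refCounts.set(index, refCounts.get(index) + 1);
        return index;
    }

    /** Called when a cell stops using the entry at 'index'. */
    void release(int index) {
        int count = refCounts.get(index);
        if (count == 0) {
            throw new IllegalStateException("entry " + index + " is not referenced");
        }
        refCounts.set(index, count - 1);
    }

    /** Number of entries no cell refers to; a caller could coalesce when this gets large. */
    int orphanCount() {
        int orphans = 0;
        for (int count : refCounts) {
            if (count == 0) {
                orphans++;
            }
        }
        return orphans;
    }

    int size() {
        return strings.size();
    }
}
```

With counts like these, an autocoalesce option would only need to check orphanCount() against some threshold instead of scanning every cell.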

-andy




Re: Modifying HSSFCell string values and SST record

Michael Zalewski
The problem might be a little more difficult. When you serialize the strings,
you have a list of Strings and Rich Text Strings as input, and you output a set
of 1 or more SST records.

When you "coalesce" the string data, you will end up removing some of the
strings from the input list. This will change the index of all the subsequent
strings in the list. That's not important in the SST record. But the LABELSST
records (which contain those index values) must be altered.

Here is an example. Let's say you start with a spreadsheet containing two
cells. The first cell has "Hello", and the second has "World". This would be
represented by 3 BIFF records:

  SST: String 0 = "Hello"
       String 1 = "World"
  LABELSST: Index = 0
  LABELSST: Index = 1

If you change the second cell from "World" to "POI", you will get

  SST: String 0 = "Hello"
       String 1 = "World"
       String 2 = "POI"
  LABELSST: Index = 0
  LABELSST: Index = 2

You can see that if you try to get rid of the orphan String 1 = "World", you
will have to change the index value in all the subsequent LABELSST records.
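
The renumbering step can be sketched as below. This is a simplified illustration, not POI code: SstCoalescer and coalesce are hypothetical names, the string table is a plain list, and each LABELSST record is reduced to its index value in an int array.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class SstCoalescer {
    /**
     * Removes unreferenced strings from 'strings' in place and rewrites
     * 'cellIndexes' (one entry per LABELSST-style record) to the new positions.
     */
    static void coalesce(List<String> strings, int[] cellIndexes) {
        // Mark every string that at least one cell still refers to.
        boolean[] used = new boolean[strings.size()];
        for (int index : cellIndexes) {
            used[index] = true;
        }

        // Build the compacted list and an old-index -> new-index table.
        int[] remap = new int[strings.size()];
        List<String> kept = new ArrayList<>();
        for (int old = 0; old < strings.size(); old++) {
            if (used[old]) {
                remap[old] = kept.size();
                kept.add(strings.get(old));
            } else {
                remap[old] = -1; // orphan, dropped
            }
        }

        // Every cell after a removed string has to be renumbered.
        for (int i = 0; i < cellIndexes.length; i++) {
            cellIndexes[i] = remap[cellIndexes[i]];
        }
        strings.clear();
        strings.addAll(kept);
    }

    public static void main(String[] args) {
        List<String> sst = new ArrayList<>(Arrays.asList("Hello", "World", "POI"));
        int[] cells = {0, 2};
        coalesce(sst, cells);
        System.out.println(sst + " " + Arrays.toString(cells)); // [Hello, POI] [0, 1]
    }
}
```

Applied to the example above, the orphan "World" is dropped and the second cell's index changes from 2 to 1, which is why the LABELSST records cannot be left alone.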


