Performance Issue with XSSF as compared to HSSF in POI 3.7

Jiang Zhu
Hi,
I used the XSSF API in POI 3.7 to generate .xlsx files, but found that it took a long time. I then ran the following tests to compare XSSF with HSSF.

First, I ran the code below 3 times; generating the .xlsx file took about 17 s on average.

    //Workbook wb = new HSSFWorkbook();
    Workbook wb = new XSSFWorkbook();
    Sheet sheet = wb.createSheet("new sheet");

    for (int i = 0; i < 10000; i++) {
        Row row = sheet.createRow(i);
        Cell cell = row.createCell(0);
        cell.setCellValue(1);

        row.createCell(1).setCellValue(1.2);
        row.createCell(2).setCellValue("This is a string");
    }

    //FileOutputStream fileOut = new FileOutputStream("d:\\workbook_hssf.xls");
    FileOutputStream fileOut = new FileOutputStream("d:\\workbook_xssf.xlsx");
    wb.write(fileOut);
    fileOut.close();

Second, I ran the code below 3 times in the same environment; generating the .xls file took about 1.3 s on average.

    Workbook wb = new HSSFWorkbook();
    //Workbook wb = new XSSFWorkbook();
    Sheet sheet = wb.createSheet("new sheet");

    for (int i = 0; i < 10000; i++) {
        Row row = sheet.createRow(i);
        Cell cell = row.createCell(0);
        cell.setCellValue(1);

        row.createCell(1).setCellValue(1.2);
        row.createCell(2).setCellValue("This is a string");
    }

    FileOutputStream fileOut = new FileOutputStream("d:\\workbook_hssf.xls");
    //FileOutputStream fileOut = new FileOutputStream("d:\\workbook_xssf.xlsx");
    wb.write(fileOut);
    fileOut.close();

The only difference between the two tests is that the first uses the XSSF API and the second uses the HSSF API. The first test takes more than 10 times as long as the second.
Is XSSF usually 10 times slower than HSSF? Is this normal, or have I made a mistake that makes it take so long?

Any help appreciated.
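
The measurement described above (run the same workload three times and average the wall-clock time) can be sketched as a small harness. This is only an illustration using the Java standard library: the class name `TimingSketch`, the method `averageMillis`, and the placeholder workload are assumptions made for this sketch, not part of POI or of the original test. To reproduce the comparison, the workbook-building code from either test would replace the placeholder `Runnable`.

```java
class TimingSketch {

    /** Runs the task the given number of times and returns the mean wall-clock time in milliseconds. */
    static double averageMillis(int runs, Runnable task) {
        if (runs <= 0) {
            throw new IllegalArgumentException("runs must be positive");
        }
        long totalNanos = 0;
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            task.run();
            totalNanos += System.nanoTime() - start;
        }
        return totalNanos / 1_000_000.0 / runs;
    }

    public static void main(String[] args) {
        // Placeholder workload (an assumption for this sketch); substitute the
        // HSSF or XSSF workbook-building code from the tests above.
        Runnable workload = () -> {
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < 10000; i++) {
                sb.append("This is a string");
            }
        };

        // One untimed run first, so class loading and JIT compilation
        // do not dominate the measured runs.
        workload.run();

        System.out.printf("average over 3 runs: %.1f ms%n", averageMillis(3, workload));
    }
}
```

Doing an untimed warm-up run before the measured runs matters here because the first execution of either API also pays for class loading, which can distort a three-run average.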

Re: Performance Issue with XSSF as compared to HSSF in POI 3.7

Nick Burch-11
On Wed, 15 Dec 2010, Jiang Zhu wrote:
> Is the time needed by using XSSF api usually 10 times longer than the
> time needed by using HSSF api? Is this natural? Or have I made some
> mistakes which resulted in costing so much time?

XSSF usually takes more time than HSSF, and more memory. The XML
processing is heavier-weight than the HSSF records are.

Nick

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]


Re: Performance Issue with XSSF as compared to HSSF in POI 3.7

Jiang Zhu
Nick Burch-11 wrote:
> XSSF usually takes more time than HSSF, and more memory. The XML
> processing is heavier-weight than the HSSF records are.
>
> Nick

Thanks for your answer.
I understand that it is normal for XSSF to take 2-3 times as long as HSSF, but I am still not sure whether it is normal for it to take more than 10 times as long.
Have you ever seen a case similar to my tests? If so, was XSSF also more than 10 times slower than HSSF, or only somewhat slower, say 2-3 times?
Thanks.

Re: Performance Issue with XSSF as compared to HSSF in POI 3.7

Johan N
I ran similar tests before seeing this post and got similar results: XSSF is about 10 times slower than HSSF.
For example, saving the file with org.apache.poi.ss.usermodel.Workbook.write(OutputStream stream) is also 10 times slower when using XSSF.
Is this something we have to live with for now?
All our code is written against org.apache.poi.ss.usermodel so that it can support both .xls and .xlsx.
I have seen comments that XSSF is expected to be 2 to 3 times slower than HSSF, and I would be OK with that. Has anyone managed to confirm this with real tests?
Cheers
Johan

Re: Performance Issue with XSSF as compared to HSSF in POI 3.7

Nick Burch-11
On Thu, 27 Oct 2011, Johan N wrote:
> I have also performed tests with similar results, before seeing this
> post; XSSF is about 10 times slower than HSSF.

You might want to try with a recent 3.8 beta, as there have been some
performance improvements since 3.7

Nick


Re: Performance Issue with XSSF as compared to HSSF in POI 3.7

Johan N
I have tried the 3.8 beta and it is faster than 3.7.
However, the improvement is not big enough for us to make the switch at this point, especially since 3.8 is still in beta.
The test I used is very close to our real-life usage: 1 workbook with 3 sheets, 2 of which have about 7000 rows and 10 columns.
The timings are roughly as follows:
  .xls file with POI 3.7:        0.5-1 second
  .xlsx file with POI 3.7:       10-12 seconds
  .xlsx file with POI 3.8 beta:  8-10 seconds

Johan
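
For comparison with the "2 to 3 times slower" expectation discussed earlier in the thread, the timing ranges above can be turned into slowdown factors. The sketch below is plain arithmetic on the reported figures; the class name `SlowdownSketch` and the method `slowdownRange` are hypothetical names used only for this illustration. The smallest factor divides the fastest .xlsx time by the slowest .xls time, and the largest factor divides the slowest .xlsx time by the fastest .xls time.

```java
class SlowdownSketch {

    /**
     * Returns {minFactor, maxFactor}: how many times slower the "other" timing
     * range is than the baseline range, in the best and the worst case.
     */
    static double[] slowdownRange(double baseMin, double baseMax,
                                  double otherMin, double otherMax) {
        return new double[] { otherMin / baseMax, otherMax / baseMin };
    }

    public static void main(String[] args) {
        // Timings in seconds, taken from the figures reported above.
        double[] poi37 = slowdownRange(0.5, 1.0, 10.0, 12.0);
        double[] poi38beta = slowdownRange(0.5, 1.0, 8.0, 10.0);

        // prints "POI 3.7 .xlsx vs .xls: 10x to 24x slower"
        System.out.printf("POI 3.7 .xlsx vs .xls: %.0fx to %.0fx slower%n",
                poi37[0], poi37[1]);
        // prints "POI 3.8 beta .xlsx vs POI 3.7 .xls: 8x to 20x slower"
        System.out.printf("POI 3.8 beta .xlsx vs POI 3.7 .xls: %.0fx to %.0fx slower%n",
                poi38beta[0], poi38beta[1]);
    }
}
```

Note that the .xls baseline in both comparisons is the POI 3.7 figure, since no .xls timing for the 3.8 beta was reported.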