[Bug 60422] New: DataFormatter.formatCellValue retunrs incorrect value for german 'Buchhaltung' format

classic Classic list List threaded Threaded
31 messages Options
12
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

[Bug 60422] New: DataFormatter.formatCellValue retunrs incorrect value for german 'Buchhaltung' format

Bugzilla from bugzilla@apache.org
https://bz.apache.org/bugzilla/show_bug.cgi?id=60422

            Bug ID: 60422
           Summary: DataFormatter.formatCellValue retunrs incorrect value
                    for german 'Buchhaltung' format
           Product: POI
           Version: 3.15-FINAL
          Hardware: PC
            Status: NEW
          Severity: normal
          Priority: P2
         Component: HSSF
          Assignee: [hidden email]
          Reporter: [hidden email]
  Target Milestone: ---

Created attachment 34481
  --> https://bz.apache.org/bugzilla/attachment.cgi?id=34481&action=edit
the input file

I have some code that tries to reproduce the visible values of a spreadsheet.
So i made a small test sheet with cells using different formats.
Works fine so far.

But starting with version 3.14 i got strange results for one cell.

The cell is formatted as 'Buchhaltung' (see screenshot).
The (visible) value of the cell is '4,33 €' in excel and '0,004,,33.00 €' when
using poi.

My code is this / the locale used is Locale.GERMAN:

  /**
   * Reads the content of an excel cell and converts it into the string
   * visible in the excel sheet.
   *
   * @param aRow the row
   * @param aColumnsNo the column
   * @param aFormulaEvaluator the formula Evaluator
   * @param aLocale used for parsing and formating
   * @return the display string
   */
  public static String readCellContentAsString(final Row aRow, final int
aColumnsNo,
      final FormulaEvaluator aFormulaEvaluator, final Locale aLocale) {
    final Cell tmpCell = aRow.getCell(aColumnsNo);
    if (null == tmpCell) {
      return null;
    }

    final DataFormatter tmpDataFormatter = new DataFormatter(aLocale);
    try {
      final String tmpResult = tmpDataFormatter.formatCellValue(tmpCell,
aFormulaEvaluator);
      return tmpResult;
    } catch (final NotImplementedException e) {
      final StringBuilder tmpMsg = new StringBuilder(e.getMessage());
      if (null != e.getCause()) {
        tmpMsg.append(" (");
        tmpMsg.append(e.getCause().toString());
        tmpMsg.append(')');
      }
      LOG.error(tmpMsg.toString());
      final String tmpResult = tmpDataFormatter.formatCellValue(tmpCell, null);
      return tmpResult;
    }
  }

--
You are receiving this mail because:
You are the assignee for the bug.
---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

[Bug 60422] DataFormatter.formatCellValue retunrs incorrect value for german 'Buchhaltung' format

Bugzilla from bugzilla@apache.org
https://bz.apache.org/bugzilla/show_bug.cgi?id=60422

--- Comment #1 from RBRi <[hidden email]> ---
Created attachment 34482
  --> https://bz.apache.org/bugzilla/attachment.cgi?id=34482&action=edit
Screenshot of the format dialog

--
You are receiving this mail because:
You are the assignee for the bug.
---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

[Bug 60422] DataFormatter.formatCellValue returns incorrect value for german 'Buchhaltung' format

Bugzilla from bugzilla@apache.org
In reply to this post by Bugzilla from bugzilla@apache.org
https://bz.apache.org/bugzilla/show_bug.cgi?id=60422

RBRi <[hidden email]> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 OS|                            |All
            Summary|DataFormatter.formatCellVal |DataFormatter.formatCellVal
                   |ue retunrs incorrect value  |ue returns incorrect value
                   |for german 'Buchhaltung'    |for german 'Buchhaltung'
                   |format                      |format

--
You are receiving this mail because:
You are the assignee for the bug.
---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

[Bug 60422] DataFormatter.formatCellValue returns incorrect value for german 'Buchhaltung' format

Bugzilla from bugzilla@apache.org
In reply to this post by Bugzilla from bugzilla@apache.org
https://bz.apache.org/bugzilla/show_bug.cgi?id=60422

RBRi <[hidden email]> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Severity|normal                      |regression

--
You are receiving this mail because:
You are the assignee for the bug.
---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

[Bug 60422] DataFormatter.formatCellValue returns incorrect value for German 'Buchhaltung' format

Bugzilla from bugzilla@apache.org
In reply to this post by Bugzilla from bugzilla@apache.org
https://bz.apache.org/bugzilla/show_bug.cgi?id=60422

Tobias Wörenkämper <[hidden email]> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|DataFormatter.formatCellVal |DataFormatter.formatCellVal
                   |ue returns incorrect value  |ue returns incorrect value
                   |for german 'Buchhaltung'    |for German 'Buchhaltung'
                   |format                      |format

--
You are receiving this mail because:
You are the assignee for the bug.
---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

[Bug 60422] DataFormatter.formatCellValue returns incorrect value for German 'Buchhaltung' format

Bugzilla from bugzilla@apache.org
In reply to this post by Bugzilla from bugzilla@apache.org
https://bz.apache.org/bugzilla/show_bug.cgi?id=60422

--- Comment #2 from WI <[hidden email]> ---
C'mon guys, this library is useless if it only works for US/English. 8 months
passed and not a SINGLE response? Is this project dead and we should look for
alternatives?

--
You are receiving this mail because:
You are the assignee for the bug.
---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

[Bug 60422] DataFormatter.formatCellValue returns incorrect value for German 'Buchhaltung' format

Bugzilla from bugzilla@apache.org
In reply to this post by Bugzilla from bugzilla@apache.org
https://bz.apache.org/bugzilla/show_bug.cgi?id=60422

--- Comment #3 from Mark Murphy <[hidden email]> ---
(In reply to WI from comment #2)
> C'mon guys, this library is useless if it only works for US/English. 8
> months passed and not a SINGLE response? Is this project dead and we should
> look for alternatives?

I'm sorry, I don't know German, so it is difficult for me to help you.

This project is certainly not dead. The devs here all have jobs other than POI,
and work on this in their free time, or provide patches that pertain directly
to their jobs. So our priorities may not be the same as yours. However, we are
happy to review patches submitted from the community.

If you decide to contribute to the project, be sure to read the contribution
guidelines here https://poi.apache.org/guidelines.html, and include unit tests,
examples, and updates to the 'How To' document with your patch.

--
You are receiving this mail because:
You are the assignee for the bug.
---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

[Bug 60422] DataFormatter.formatCellValue returns incorrect value for German 'Buchhaltung' format

Bugzilla from bugzilla@apache.org
In reply to this post by Bugzilla from bugzilla@apache.org
https://bz.apache.org/bugzilla/show_bug.cgi?id=60422

--- Comment #4 from PJ Fanning <[hidden email]> ---
I added a test case and potential fix in https://github.com/apache/poi/pull/60

--
You are receiving this mail because:
You are the assignee for the bug.
---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

[Bug 60422] DataFormatter.formatCellValue returns incorrect value for German 'Buchhaltung' format

Bugzilla from bugzilla@apache.org
In reply to this post by Bugzilla from bugzilla@apache.org
https://bz.apache.org/bugzilla/show_bug.cgi?id=60422

--- Comment #5 from RBRi <[hidden email]> ---
> I'm sorry, I don't know German, so it is difficult for me to help you.
Maybe yes but that was the reason why i spend some time to make a reproducible
case for you. I have attached the spreadsheet, have added the complete source
code for reproducing the problem and switching the default locale of the JVM is
no rocket science.
I think for open source software we have to take special care of different user
locales - that is part of the story.
And maybe this is a hint for your ci test suite also; maybe it is a good idea
to do the regression test with different locales.
And yes i know that your (we - because i spend also a lot of time doing open
source software development) are short of time. But maybe a little hint or
response will be great or even better that you have tried to reproduce my case.

Thanks for working on the this great project - it is a real help.

--
You are receiving this mail because:
You are the assignee for the bug.
---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

[Bug 60422] DataFormatter.formatCellValue returns incorrect value for German 'Buchhaltung' format

Bugzilla from bugzilla@apache.org
In reply to this post by Bugzilla from bugzilla@apache.org
https://bz.apache.org/bugzilla/show_bug.cgi?id=60422

--- Comment #6 from Andreas Beeker <[hidden email]> ---
Although I understand German ... I don't understand the potential fix :)

How about not using Locale.US directly, but setting it via
LocaleUtil.setUserLocale() before calling the formatter.

 ... so it looks like, the format strings are always parsed in a
Locale.US-manner?

Btw. it's ok to remind us, if a bug entry is important to you ... but we are
simply too few people to catch up with the 488 bugs.

--
You are receiving this mail because:
You are the assignee for the bug.
---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

[Bug 60422] DataFormatter.formatCellValue returns incorrect value for German 'Buchhaltung' format

Bugzilla from bugzilla@apache.org
In reply to this post by Bugzilla from bugzilla@apache.org
https://bz.apache.org/bugzilla/show_bug.cgi?id=60422

--- Comment #7 from PJ Fanning <[hidden email]> ---
Andreas - it seems like the existing code works if the user sets the POI
UserLocale to US.

import org.apache.poi.util.LocaleUtil;
LocaleUtil.setUserLocale(Locale.US);

Going forward, I think we should not require the user to make this LocaleUtil
call.

Would it make sense to have LocaleUtil user locale default to Locale.US?

--
You are receiving this mail because:
You are the assignee for the bug.
---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

[Bug 60422] DataFormatter.formatCellValue returns incorrect value for German 'Buchhaltung' format

Bugzilla from bugzilla@apache.org
In reply to this post by Bugzilla from bugzilla@apache.org
https://bz.apache.org/bugzilla/show_bug.cgi?id=60422

--- Comment #8 from Andreas Beeker <[hidden email]> ---
First of all, I know that DataFormatter is not trivial and we also had some
discussions about it [1]

When using an unspecific default locale, we usually use Locale.ROOT - but I'm
not sure, if using .ROOT or .US as the default for LocaleUtil.getUserLocale()
(instead of Locale.getDefault()) would be less pain, especially since there a
quite a few LocaleUtil invocations.

So how to proceed from here? ...
- check if Locale.US is working for other localized excel formats
- or what's wrong with the "Buchhaltung" format "_-* #.##0,00 "€"_-;-* #.##0,00
"€"_-;_-* "-"?? "€"_-;_-@_-" ... it looks like custom formats need to be
processed with the Locale of the file - where is it defined? (HPSF has
sometimes a Locale entry)



[1]
http://apache-poi.1045710.n5.nabble.com/DataFormatter-vs-org-apache-poi-ss-format-classes-td5721100.html

--
You are receiving this mail because:
You are the assignee for the bug.
---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

[Bug 60422] DataFormatter.formatCellValue returns incorrect value for German 'Buchhaltung' format

Bugzilla from bugzilla@apache.org
In reply to this post by Bugzilla from bugzilla@apache.org
https://bz.apache.org/bugzilla/show_bug.cgi?id=60422

--- Comment #9 from Javen O'Neal <[hidden email]> ---
The changes in https://github.com/apache/poi/pull/60 look fine to me. Should we
continue the discussion til after the 3.17 beta 1 release?

--
You are receiving this mail because:
You are the assignee for the bug.
---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

[Bug 60422] DataFormatter.formatCellValue returns incorrect value for German 'Buchhaltung' format

Bugzilla from bugzilla@apache.org
In reply to this post by Bugzilla from bugzilla@apache.org
https://bz.apache.org/bugzilla/show_bug.cgi?id=60422

--- Comment #10 from Javen O'Neal <[hidden email]> ---
Calling LocaleUtil.setUserLocale inside the CellNumberFormatter.formatValue
means that the function has a side effect, which may be surprising to a user.
If we restore the UserLocale at the exit of formatValue, then there are still
side effects from parallel execution.

Rather than picking between ROOT and US as the default Locale, we could alias
the "Microsoft Excel default locale", using that locale throughout our code,
which would mean the only place we'd have to change it is where that alias is
assigned. We could also let the user change that alias if that's needed for any
reason.
setUserLocale and setDefaultLocale...

--
You are receiving this mail because:
You are the assignee for the bug.
---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

[Bug 60422] DataFormatter.formatCellValue returns incorrect value for German 'Buchhaltung' format

Bugzilla from bugzilla@apache.org
In reply to this post by Bugzilla from bugzilla@apache.org
https://bz.apache.org/bugzilla/show_bug.cgi?id=60422

--- Comment #11 from Javen O'Neal <[hidden email]> ---
Calling LocaleUtil.setUserLocale inside the CellNumberFormatter.formatValue
means that the function has a side effect, which may be surprising to a user.
If we restore the UserLocale at the exit of formatValue, then there are still
side effects from parallel execution.

Are there parts of the Microsoft Office file format spec that use the user's OS
language and locale, some that use language and locale that are set in
Microsoft Office by the user or by the installation medium, and some that are
hard-coded to a particular value?

If these are distinct concepts, perhaps we should have different getters and
setters in LocaleUtil, and we can use the appropriate kind of locale (user, os,
hard-coded) throughout POI according to how Microsoft Office interprets the
files.

Rather than picking between ROOT and US as the default Locale, we could alias
the "Microsoft Excel default locale", using that locale throughout our code,
which would mean the only place we'd have to change it is where that alias is
assigned. We could also let the user change that alias if that's needed for any
reason.
setUserLocale and setDefaultLocale...

--
You are receiving this mail because:
You are the assignee for the bug.
---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

[Bug 60422] DataFormatter.formatCellValue returns incorrect value for German 'Buchhaltung' format

Bugzilla from bugzilla@apache.org
In reply to this post by Bugzilla from bugzilla@apache.org
https://bz.apache.org/bugzilla/show_bug.cgi?id=60422

Javen O'Neal <[hidden email]> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
          Component|HSSF                        |SS Common

--
You are receiving this mail because:
You are the assignee for the bug.
---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

[Bug 60422] DataFormatter.formatCellValue returns incorrect value for German 'Buchhaltung' format

Bugzilla from bugzilla@apache.org
In reply to this post by Bugzilla from bugzilla@apache.org
https://bz.apache.org/bugzilla/show_bug.cgi?id=60422

--- Comment #12 from RBRi <[hidden email]> ---
>- check if Locale.US is working for other localized excel formats
I think that is the base line of this problem. We have to understand if the
format description is locale dependent or not. If you need someone to provide
more german excel files you can count on me.

And regarding the locale. My program processes the excel file on a machine
different from the one in was created on (i guess this is the usual use case
for POI). This implies from my point of view, that i have to know the locale
setting of the creating machine if i like to reproduce the same presentations
of the cell values. That is the reason, for the locale i hand over to the
method and in the end to the DataFormatter constructor. From the user point of
view this is all i can do and all i like to do.

--
You are receiving this mail because:
You are the assignee for the bug.
---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

[Bug 60422] DataFormatter.formatCellValue returns incorrect value for German 'Buchhaltung' format

Bugzilla from bugzilla@apache.org
In reply to this post by Bugzilla from bugzilla@apache.org
https://bz.apache.org/bugzilla/show_bug.cgi?id=60422

--- Comment #13 from PJ Fanning <[hidden email]> ---
I like Javen's idea of adding something like a LocaleUtil set/getDefaultLocale.

I'm wondering if it would be useful to add extra tests like this sample I just
produced:

    @Test
    public void testDateFormattingWithLocales() {
        // 2017-12-01 09:54:33 which is 42747.412892397523 as double
        DataFormatter dfDE = new DataFormatter(Locale.GERMANY);
        DataFormatter dfZH = new DataFormatter(Locale.PRC);
        DataFormatter dfIE = new DataFormatter(new Locale("GA", "IE"));
        double date = 42747.412892397523;
        String format = "dd MMMM yyyy HH:mm:ss";
        assertEquals("12 Januar 2017 09:54:33",
dfDE.formatRawCellContents(date, -1, format));
        assertEquals("12 \u4E00\u6708 2017 09:54:33",
dfZH.formatRawCellContents(date, -1, format));
        assertEquals("12 Ean\u00E1ir 2017 09:54:33",
dfIE.formatRawCellContents(date, -1, format));
    }

--
You are receiving this mail because:
You are the assignee for the bug.
---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

[Bug 60422] DataFormatter.formatCellValue returns incorrect value for German 'Buchhaltung' format

Bugzilla from bugzilla@apache.org
In reply to this post by Bugzilla from bugzilla@apache.org
https://bz.apache.org/bugzilla/show_bug.cgi?id=60422

--- Comment #14 from PJ Fanning <[hidden email]> ---
I've been doing a bit more experimentation.
It seems that simpler formats are handled correctly, locale wise.
I think some of the main issues are:
* DataFormatter should set the LocaleUtils user locale to match the
DataFormatter locale if set (and if LocaleUtils does not have an explicit
locale set already)
* org.apache.poi.ss.format.CellNumberFormatter has a lot of hardcoding about
commas for grouping separators and dots for decimal separators - should use
DecimalFormatSymbols class
* org.apache.poi.ss.format.CellNumberFormatter also applies the separator from
the format string but my understanding is that "#,##0.00" means to apply locale
specific separators, that the '.' is assumed to mean apply locale specific
decimal separator as oppposed to specifically apply '.' as the decimal
separator.
* need to check if org.apache.poi.ss.format.CellNumberFormatter assumes 3
digits in each group - in India, they have lakhs and write 1 million (10 lakh)
as 10,00,000

--
You are receiving this mail because:
You are the assignee for the bug.
---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

[Bug 60422] DataFormatter.formatCellValue returns incorrect value for German 'Buchhaltung' format

Bugzilla from bugzilla@apache.org
In reply to this post by Bugzilla from bugzilla@apache.org
https://bz.apache.org/bugzilla/show_bug.cgi?id=60422

--- Comment #15 from PJ Fanning <[hidden email]> ---
Actally, it seems that I'm wrong about the '.' in the format string - it does
seem that this should be a '.' in the result and that if you want ',' in the
output then your format must be something like '#.##0,00'.
I've changed my Mac's locale to Germany and the Accounting/Buchhaltung format
does appear to be '#.##0,00'. I'll need to double check the parsing code that I
used to extract the formatString from attached xls file because that seems to
give me a US locale formatString instead of the Germany equivalent.

--
You are receiving this mail because:
You are the assignee for the bug.
---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

12
Loading...