[Bug 62307] New: XSSFCell.getNumericCellValue() behaves differently when formula is set or not set

classic Classic list List threaded Threaded
5 messages Options
Reply | Threaded
Open this post in threaded view
|

[Bug 62307] New: XSSFCell.getNumericCellValue() behaves differently when formula is set or not set

Bugzilla from bugzilla@apache.org
https://bz.apache.org/bugzilla/show_bug.cgi?id=62307

            Bug ID: 62307
           Summary: XSSFCell.getNumericCellValue() behaves differently
                    when formula is set or not set
           Product: POI
           Version: 4.0-dev
          Hardware: PC
            Status: NEW
          Severity: normal
          Priority: P2
         Component: XSSF
          Assignee: [hidden email]
          Reporter: [hidden email]
  Target Milestone: ---

Javadoc for Cell.getNumericValue() says:
* For strings we throw an exception. For blank cells we return a 0.
* For formulas or error cells we return the precalculated value;
* </p>
* @return the value of the cell as a number
* @throws IllegalStateException if the cell type returned by {@link
#getCellType()} is {@link CellType#STRING}

E.g., javadoc doesn't specify a behavior for boolean cells.

@Override
public double getNumericCellValue() {
    CellType cellType = getCellType();
    switch(cellType) {
        case BLANK:
            return 0.0;
        case FORMULA:
            // fall-through
        case NUMERIC:
            if(_cell.isSetV()) {
               String v = _cell.getV();
               if (v.isEmpty()) {
                   return 0.0;
               }
               try {
                  return Double.parseDouble(v);
               } catch(NumberFormatException e) {
                  throw typeMismatch(CellType.NUMERIC, CellType.STRING, false);
               }
            } else {
               return 0.0;
            }
        default:
            throw typeMismatch(CellType.NUMERIC, cellType, false);
    }
}

Consequences:
1. BLANKs always return 0, regardless of the formula.
2. If formula is set, ANY value type is parsed as double, i.e. numerics,
booleans and errors. If the string value representation is not parsable as
double, NumberFormatException is thrown.
3. If formula is not set, then string, boolean and error types result in
IllegalStateException.

Although this logic corresponds to javadoc (except for booleans), it seems
really weird that it behaves in such different ways depending on the formula.
If this is really intended behavior, then soory to bother you. Otherwise, I
propose a fix:

@Override
    public double getNumericCellValue() {
        CellType valueType = getCellType() == CellType.FORMULA ?
getCachedFormulaResultType() : getCellType();

        switch(valueType) {
            case BLANK:
                return 0.0;
            case STRING:
                throw typeMismatch(CellType.NUMERIC, CellType.STRING, false);
            case ERROR:
                // Errors are stored as strings, so we cannot parse them
directly.
                // We could return an error code, but it's confusing.
                throw typeMismatch(CellType.NUMERIC, CellType.ERROR, false);
            case BOOLEAN:
                // fall-through
            case NUMERIC:
                if(_cell.isSetV()) {
                    String v = _cell.getV();
                    if (v.isEmpty()) {
                        return 0.0;
                    }
                    // 'Cannot get NUMERIC from NUMERIC' seems weird. I propose
to replace it with
                    // IllegalStateException("Invalid string representation of
a numeric: " + v)
                    // Or let the exception propagate. Anyway, if the internal
value representation is broken, means
                    // that something is really wrong in POI code.

                    return Double.parseDouble(v);

                } else {
                    return 0.0;
                }
            default: // accounts for _NONE and FORMULA
                throw new IllegalStateException("this should never happen");
        }
    }

I will be able to provide a set of test cases later, if necessary.

--
You are receiving this mail because:
You are the assignee for the bug.
---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|

[Bug 62307] XSSFCell.getNumericCellValue() behaves differently when formula is set or not set

Bugzilla from bugzilla@apache.org
https://bz.apache.org/bugzilla/show_bug.cgi?id=62307

--- Comment #1 from [hidden email] <[hidden email]> ---
Bump.

Abstract Cell javadoc:
    /**
     * Get the value of the cell as a number.
     * <p>
     * For strings we throw an exception. For blank cells we return a 0.
     * For formulas or error cells we return the precalculated value;
     * </p>
     * @return the value of the cell as a number
     * @throws IllegalStateException if the cell type returned by {@link
#getCellType()} is {@link CellType#STRING}
     * @exception NumberFormatException if the cell value isn't a parsable
<code>double</code>.
     * @see DataFormatter for turning this number into a string similar to that
which Excel would render this number as.
     */
    double getNumericCellValue();

The javadoc is missing the following cases:
1. BLANK: implemented in code, returns zero.
2. ERROR: implemented in code, throws ISE (IllegalStateException).
3. BOOLEAN missing from both javadoc and behaves differently depending on the
implementation and whether formula is set:


formula set | HSSF | XSSF | SXSSF |
===================================
   no       | ISE  | ISE  |  ISE  |
   yes      | ISE  | cast |  ISE  |

What's the criterion that a cellType is convertible to numeric? These methods
are not used by the evaluator (the latter has format-independent conversion
logic) but may be used as a part of public API... but for what?

I guess the least invasive and most consistent way is to alter XSSF so that
calling getNumericCellValue on a cell with BOOLEAN cached result type produced
ISE.

Maybe it makes sense to prohibit *any* implicit typecasts in Cell#get*Value in
the future.

Any comments?

--
You are receiving this mail because:
You are the assignee for the bug.
---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|

[Bug 62307] XSSFCell.getNumericCellValue() behaves differently when formula is set or not set

Bugzilla from bugzilla@apache.org
In reply to this post by Bugzilla from bugzilla@apache.org
https://bz.apache.org/bugzilla/show_bug.cgi?id=62307

--- Comment #2 from PJ Fanning <[hidden email]> ---
I think we need to be pragmatic - breaking compatibility with previous versions
has its consequences.

--
You are receiving this mail because:
You are the assignee for the bug.
---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|

[Bug 62307] XSSFCell.getNumericCellValue() behaves differently when formula is set or not set

Bugzilla from bugzilla@apache.org
In reply to this post by Bugzilla from bugzilla@apache.org
https://bz.apache.org/bugzilla/show_bug.cgi?id=62307

[hidden email] <[hidden email]> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED

--- Comment #3 from [hidden email] <[hidden email]> ---
Fixed via r1850207.

--
You are receiving this mail because:
You are the assignee for the bug.
---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|

[Bug 62307] XSSFCell.getNumericCellValue() behaves differently when formula is set or not set

Bugzilla from bugzilla@apache.org
In reply to this post by Bugzilla from bugzilla@apache.org
https://bz.apache.org/bugzilla/show_bug.cgi?id=62307

--- Comment #4 from Greg Woolsey <[hidden email]> ---
Thanks for this, I hadn't realized this was one reason I have some custom
evaluation code (formatting requirements are another, but unrelated).  I'll
test eventually, but with this I may be able to remove some code in my
downstream project.

--
You are receiving this mail because:
You are the assignee for the bug.
---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]