The CellType of a cell which is set to String

The CellType of a cell which is set to String

Young
Hi

In an Excel workbook:
First, I typed '1' in A1, then changed the format of this cell to String.
Second, I changed the format of A2 to Standard (the default format), then typed '1' in this cell.
Finally, I saved the workbook.

This is my Excel test file.
I'm using POI 3.14.
My source is very simple, as follows:
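================================================
import java.io.File;
import org.apache.poi.ss.usermodel.Sheet;
import org.apache.poi.ss.usermodel.Workbook;
import org.apache.poi.ss.usermodel.WorkbookFactory;

public class CellTypeTest {
    public static void main(String[] args) throws Exception {
        File file = new File("1.xlsx");
        Workbook wb = WorkbookFactory.create(file);
        Sheet sheet = wb.getSheetAt(0);

        // In POI 3.14 getCellType() returns an int constant:
        // 0 = CELL_TYPE_NUMERIC, 1 = CELL_TYPE_STRING
        System.out.println(sheet.getRow(0).getCell(0).getCellType()); // A1
        System.out.println(sheet.getRow(1).getCell(0).getCellType()); // A2

        wb.close();
    }
}
================================================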

output:
============
0
1
============

The result shows that A1's type is CELL_TYPE_NUMERIC, and A2's type is CELL_TYPE_STRING.

Though I set both cells to String, the results are different.

In addition, the two cells also look different in Excel.
There is a green triangle at the top-right corner of A2.

Could anyone explain this to me?

Regards,
Young


---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Re: The CellType of a cell which is set to String

Javen O'Neal
The cell data format is a distinct concept from the cell value. The format
only defines how the value should be printed, but doesn't change the
underlying value.

If you want to convert the cell value to a printed string, use
https://poi.apache.org/apidocs/org/apache/poi/ss/usermodel/DataFormatter.html
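
A minimal sketch of that distinction, not taken from the original post: it
builds two cells in memory instead of reading 1.xlsx, gives both the built-in
Text format ("@"), and stores a number in one and a string in the other. The
class and method names are hypothetical. The cell type follows the stored
value, not the format. On POI 3.14 the type prints as an int constant; on 4.x
and later it prints as an enum name.

```java
import org.apache.poi.ss.usermodel.Cell;
import org.apache.poi.ss.usermodel.CellStyle;
import org.apache.poi.ss.usermodel.DataFormatter;
import org.apache.poi.ss.usermodel.Sheet;
import org.apache.poi.ss.usermodel.Workbook;
import org.apache.poi.xssf.usermodel.XSSFWorkbook;

public class FormatVersusValue {

    /** Returns "cellType:formattedText" for a numeric cell and a string cell sharing the Text format. */
    static String[] describeCells() throws Exception {
        try (Workbook wb = new XSSFWorkbook()) {
            Sheet sheet = wb.createSheet();

            // "@" is Excel's built-in Text format; it is a style, not a value
            CellStyle textStyle = wb.createCellStyle();
            textStyle.setDataFormat(wb.createDataFormat().getFormat("@"));

            // A1: a number is stored, then the Text format is applied
            Cell a1 = sheet.createRow(0).createCell(0);
            a1.setCellValue(1.0);
            a1.setCellStyle(textStyle);

            // A2: a string is stored, with the same Text format
            Cell a2 = sheet.createRow(1).createCell(0);
            a2.setCellValue("1");
            a2.setCellStyle(textStyle);

            DataFormatter formatter = new DataFormatter();
            return new String[] {
                a1.getCellType() + ":" + formatter.formatCellValue(a1),
                a2.getCellType() + ":" + formatter.formatCellValue(a2)
            };
        }
    }

    public static void main(String[] args) throws Exception {
        for (String line : describeCells()) {
            System.out.println(line);
        }
    }
}
```

The two lines report different cell types even though both cells carry the
same format, which mirrors the A1/A2 difference described in the question.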

Re: The CellType of a cell which is set to String

Javen O'Neal
To see how Excel actually saved the values, unzip the xlsx file and open
sheet1.xml. You should see a couple of <c> elements, each with a value and a type.
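
The same inspection can be sketched in Java with only the standard library,
since an xlsx file is a zip archive. The class name and helper below are
hypothetical, and the entry path xl/worksheets/sheet1.xml is an assumption
based on where Excel conventionally stores the first sheet. In files written
by Excel, a cell stored as text typically carries t="s" and its <v> holds an
index into the shared strings table, while a numeric cell has no t attribute
(or t="n") and its <v> holds the number itself.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

public class SheetXmlDump {

    /** Returns the text of one entry in a zip archive, or null if the entry is absent. */
    static String readEntry(String zipPath, String entryName) throws IOException {
        try (ZipFile zip = new ZipFile(zipPath)) {
            ZipEntry entry = zip.getEntry(entryName);
            if (entry == null) {
                return null;
            }
            try (InputStream in = zip.getInputStream(entry)) {
                ByteArrayOutputStream buf = new ByteArrayOutputStream();
                byte[] chunk = new byte[8192];
                int n;
                while ((n = in.read(chunk)) != -1) {
                    buf.write(chunk, 0, n);
                }
                // Sheet XML written by Excel is UTF-8 encoded
                return new String(buf.toByteArray(), StandardCharsets.UTF_8);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Defaults to the file name used in the original question
        String path = args.length > 0 ? args[0] : "1.xlsx";
        System.out.println(readEntry(path, "xl/worksheets/sheet1.xml"));
    }
}
```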

Re: The CellType of a cell which is set to String

Young
Actually, I want to get the values in cells and convert them to String in Java (numeric -> String, string -> String, and so on).
Sometimes I'm not sure what the CellType is, so I wanted to check the CellType first and then convert the value to a String.

As you said, I can use DataFormatter directly to convert the cell value to a printed String. The printed String is what I want. Am I correct?

Re: The CellType of a cell which is set to String

Javen O'Neal
Yes. DataFormatter does what you want: it returns the string representation of the
cell contents (numeric, date, string, boolean, blank, error, formula).

Re: The CellType of a cell which is set to String

Young
If I use the default cell format (Standard) and type 'FALSE' in the cell, the formatCellValue method returns 'false'.
I would like to get 'FALSE', not lowercase 'false'.

Can I achieve this with DataFormatter?
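
One possible workaround, sketched under assumptions rather than confirmed
anywhere in this thread: wrap DataFormatter in a small helper that handles
boolean cells itself and delegates everything else. The helper formatCell and
the class name are hypothetical and not part of POI. The sketch is written
against the POI 4.x and later API, where getCellType() returns the CellType
enum; on 3.14 the equivalent check would be
cell.getCellType() == Cell.CELL_TYPE_BOOLEAN. Formula cells that evaluate to
a boolean are not covered.

```java
import org.apache.poi.ss.usermodel.Cell;
import org.apache.poi.ss.usermodel.CellType;
import org.apache.poi.ss.usermodel.DataFormatter;
import org.apache.poi.ss.usermodel.Workbook;
import org.apache.poi.xssf.usermodel.XSSFWorkbook;

public class UpperCaseBooleans {

    /** Hypothetical helper: upper-case text for boolean cells, DataFormatter output otherwise. */
    static String formatCell(Cell cell, DataFormatter formatter) {
        if (cell != null && cell.getCellType() == CellType.BOOLEAN) {
            // Spell booleans the way Excel displays them
            return cell.getBooleanCellValue() ? "TRUE" : "FALSE";
        }
        // Numbers, dates, strings, blanks and null cells go through DataFormatter unchanged
        return formatter.formatCellValue(cell);
    }

    public static void main(String[] args) throws Exception {
        try (Workbook wb = new XSSFWorkbook()) {
            Cell cell = wb.createSheet().createRow(0).createCell(0);
            cell.setCellValue(false);
            System.out.println(formatCell(cell, new DataFormatter())); // prints FALSE
        }
    }
}
```

A string cell that happens to contain the text 'false' is left as typed,
because the helper checks the cell type rather than the formatted text.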

Apache POI doc/docx parser

Teresa Kim-2
Hi


I have documents (either 'doc' or 'docx') that contain the special character
for 'greater than or equal to'. When I convert them with 'WordToHtmlConverter',
those characters come out as '('.

I tried the latest Apache POI release, 4.1.0.

My Java code is:


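import java.io.ByteArrayOutputStream;
import java.io.FileInputStream;
import java.nio.charset.StandardCharsets;

import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;

import org.apache.poi.hwpf.HWPFDocumentCore;
import org.apache.poi.hwpf.converter.WordToHtmlConverter;
import org.apache.poi.hwpf.converter.WordToHtmlUtils;
import org.w3c.dom.Document;

public class TestWordtoHtmlConverter {

     public static void main(String[] args) {
         try (FileInputStream in = new FileInputStream(args[0])) {
             HWPFDocumentCore wordDocument = WordToHtmlUtils.loadDoc(in);

             WordToHtmlConverter wordToHtmlConverter = new WordToHtmlConverter(
                     DocumentBuilderFactory.newInstance().newDocumentBuilder()
                             .newDocument());

             // Convert the Word document into an HTML DOM tree
             wordToHtmlConverter.processDocument(wordDocument);
             Document htmlDocument = wordToHtmlConverter.getDocument();
             ByteArrayOutputStream out = new ByteArrayOutputStream();
             DOMSource domSource = new DOMSource(htmlDocument);
             StreamResult streamResult = new StreamResult(out);

             // Serialize the DOM tree as indented UTF-8 HTML
             TransformerFactory tf = TransformerFactory.newInstance();
             Transformer serializer = tf.newTransformer();
             serializer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
             serializer.setOutputProperty(OutputKeys.INDENT, "yes");
             serializer.setOutputProperty(OutputKeys.METHOD, "html");
             serializer.transform(domSource, streamResult);
             out.close();

             // Decode with the charset the serializer wrote (UTF-8), not the platform default
             String result = new String(out.toByteArray(), StandardCharsets.UTF_8);
             System.out.println(result);
         } catch (Exception e) {
             // Report failures instead of silently swallowing them
             e.printStackTrace();
         }
     }
}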

Is there any way I can correctly identify these symbols?


In the sample document, I am interested in getting 'bad one'.


Thanks

T.






Re: Apache POI doc/docx parser

Dominik Stadler
Hi,

Can you share an example document which shows the behavior?

Thanks... Dominik.


Re: Apache POI doc/docx parser

Teresa Kim-2
Hi Dominik


Sure. I attached the symbol_test.doc document to the previous email.

It seems I cannot attach documents to emails sent to the list.

Is there any way I can share the document?


Thanks

T.

Re: Apache POI doc/docx parser

Dominik Stadler
Hi,

It seems the document does not make it through the list for some reason.
Can you report an issue at https://bz.apache.org/bugzilla/ and attach it
there? This way we also have a better trail of work on the problem.

Dominik.
