avik 2005/05/28 12:28:22
Modified:    src/documentation/content/xdocs book.xml
Added:       src/documentation/content/xdocs/hslf book.xml index.xml
                      quick-guide.xml
Log:
documentation for powerpoint support

Revision  Changes    Path
1.20      +2 -1      jakarta-poi/src/documentation/content/xdocs/book.xml

Index: book.xml
===================================================================
RCS file: /home/cvs/jakarta-poi/src/documentation/content/xdocs/book.xml,v
retrieving revision 1.19
retrieving revision 1.20
diff -u -r1.19 -r1.20
--- book.xml    18 Feb 2005 10:03:55 -0000    1.19
+++ book.xml    28 May 2005 19:28:21 -0000    1.20
@@ -21,7 +21,8 @@
         <menu-item label="HSSF" href="hssf/index.html"/>
         <menu-item label="HWPF" href="hwpf/index.html"/>
         <menu-item label="HPSF" href="hpsf/index.html"/>
-        <menu-item label="POI-Ruby" href="poi-ruby.html"/>
+        <menu-item label="HSLF" href="hslf/index.html"/>
+        <menu-item label="POI-Ruby" href="poi-ruby.html"/>
         <menu-item label="POI-Utils" href="utils/index.html"/>
         <menu-item label="Download" href="ext:download"/>
     </menu>

1.1                  jakarta-poi/src/documentation/content/xdocs/hslf/book.xml

Index: book.xml
===================================================================
<?xml version="1.0"?>
<!-- Copyright (C) 2005 The Apache Software Foundation. All rights reserved. -->
<!DOCTYPE book PUBLIC "-//APACHE//DTD Cocoon Documentation Book V1.0//EN" "../dtd/book-cocoon-v10.dtd">

<book software="POI Project" title="HSSF" copyright="@year@ POI Project">
    <menu label="Jakarta POI">
        <menu-item label="Top" href="../index.html"/>
    </menu>
    <menu label="HSLF">
        <menu-item label="Overview" href="index.html"/>
        <menu-item label="Quick Guide" href="quick-guide.html"/>
    </menu>
</book>

1.1                  jakarta-poi/src/documentation/content/xdocs/hslf/index.xml

Index: index.xml
===================================================================
<?xml version="1.0" encoding="UTF-8"?>
<!-- Copyright (C) 2004 The Apache Software Foundation. All rights reserved.
-->
<!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V1.1//EN" "../dtd/document-v11.dtd">

<document>
    <header>
        <title>POI-HSLF - Java API To Access Microsoft PowerPoint Format Files</title>
        <subtitle>Overview</subtitle>
        <authors>
            <person name="Avik Sengupta" email="avik at apache dot org"/>
        </authors>
    </header>

    <body>
        <section>
            <title>Overview</title>

            <p>HSLF is the POI Project's pure Java implementation of the PowerPoint file format.</p>
            <p>HSLF provides a way to read PowerPoint presentations and extract text from them.
            It also provides some (currently limited) editing capabilities.
            </p>
            <note>
                This code currently lives in the scratchpad area of the POI CVS repository.
                Ensure that you have the scratchpad jar or the scratchpad build area in your
                classpath before experimenting with this code.
            </note>
            <p>The <link href="./quick-guide.html">quick guide</link> documentation provides
            information on using this API. Comments and fixes are gratefully accepted on the
            POI dev mailing lists.</p>
        </section>
    </body>
</document>

1.1                  jakarta-poi/src/documentation/content/xdocs/hslf/quick-guide.xml

Index: quick-guide.xml
===================================================================
<?xml version="1.0" encoding="UTF-8"?>
<!-- Copyright (C) 2004 The Apache Software Foundation. All rights reserved. -->
<!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V1.1//EN" "../dtd/document-v11.dtd">

<document>
    <header>
        <title>POI-HSLF - A Quick Guide</title>
        <subtitle>Overview</subtitle>
        <authors>
            <person name="Nick Burch" email="nick at torchbox dot com"/>
        </authors>
    </header>

    <body>
        <section><title>Basic Text Extraction</title>
            <p>For basic text extraction, make use of
            <code>org.apache.poi.hslf.extractor.PowerPointExtractor</code>. It accepts a file
            or an input stream. The <code>getText()</code> method can be used to get the text
            from the slides, from the notes, or from both.
            </p>
        </section>

        <section><title>Specific Text Extraction</title>
            <p>To get specific bits of text, first create an
            <code>org.apache.poi.hslf.usermodel.SlideShow</code> (from an
            <code>org.apache.poi.hslf.HSLFSlideShow</code>, which accepts a file or an input
            stream). Use <code>getSlides()</code> and <code>getNotes()</code> to get the slides
            and notes. These can be queried to get their page ID (though they should be
            returned in the right order). You can also call <code>getTextRuns()</code> on
            these to get their blocks of text. From the <code>TextRun</code>, you can extract
            the text and check what type of text it is (e.g. Body, Title).
            </p>
        </section>

        <section><title>Changing Text</title>
            <p>It is possible to change the text via <code>TextRun.setText(String)</code>.
            However, if the new text differs in length from the old text, the file will break:
            PowerPoint files contain internal references expressed as byte offsets, and not
            all of these are updated yet when the size of the text changes.
            </p>
        </section>

        <section><title>Guide to key classes</title>
            <ul>
                <li><code>org.apache.poi.hslf.HSLFSlideShow</code>
                    Handles reading in and writing out files. Generates a tree of the records
                    in the file.
                </li>
                <li><code>org.apache.poi.hslf.usermodel.SlideShow</code>
                    Builds up model entries from the records, and presents a user-facing view
                    of the file.
                </li>
                <li><code>org.apache.poi.hslf.extractor.PowerPointExtractor</code>
                    Uses the model code to allow extraction of text from files.
                </li>
            </ul>
        </section>
    </body>
</document>
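
Note on the "Changing Text" caveat in quick-guide.xml: the reason a length change breaks the file can be illustrated without POI. The following is a minimal sketch, not HSLF code. It assumes a hypothetical layout in which length-prefixed text records are stored back to back and a separately kept table holds the byte offset of each record. Every name in it (OffsetDemo, buildFile, offsetsOf, readAt) is invented for illustration, and the layout is far simpler than the real PowerPoint format.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical model only: NOT the real PowerPoint record layout.
final class OffsetDemo {

    // Writes each text as one length byte followed by its ASCII bytes.
    static byte[] buildFile(String... texts) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (String t : texts) {
            byte[] b = t.getBytes(StandardCharsets.US_ASCII);
            out.write(b.length);
            out.write(b, 0, b.length);
        }
        return out.toByteArray();
    }

    // Computes the byte offset at which each record starts.
    static int[] offsetsOf(String... texts) {
        int[] offsets = new int[texts.length];
        int pos = 0;
        for (int i = 0; i < texts.length; i++) {
            offsets[i] = pos;
            pos += 1 + texts[i].length(); // length byte + ASCII payload
        }
        return offsets;
    }

    // Reads the record that is assumed to start at the given offset.
    // The length is clamped so a stale offset yields garbage, not an exception.
    static String readAt(byte[] file, int offset) {
        int len = Math.min(file[offset] & 0xFF, file.length - offset - 1);
        return new String(file, offset + 1, len, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        // Offset table computed once, for the original contents.
        int[] offsets = offsetsOf("Title", "Body");

        byte[] original = buildFile("Title", "Body");
        System.out.println("original:    " + readAt(original, offsets[1]));

        // Same-length replacement: the old offsets still point at record starts.
        byte[] sameLength = buildFile("Intro", "Body");
        System.out.println("same length: " + readAt(sameLength, offsets[1]));

        // Longer replacement with the old (stale) offset table: the second
        // offset now points into the middle of the first record.
        byte[] longer = buildFile("A longer title", "Body");
        boolean stillValid = readAt(longer, offsets[1]).equals("Body");
        System.out.println("longer text, stale offsets still valid: " + stillValid);
    }
}
```

In this toy model the fix is to recompute the offset table after every edit. The quick guide states that HSLF does not yet update all such references, which is why only same-length replacements are safe for now.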