Calligra/Words/Tables

From KDE Community Wiki

Tables in KWord

...and other KOffice applications.

In this topic we focus on what KWord, as a word processor, needs in order to organize its text flow into cells arranged in columns and rows. The feature set and its breadth are inspired by ODF.

Because tables are both very common and complex, showing use cases as we do elsewhere on this wiki is probably not the easiest way to convey our intention. Instead I'll dive straight into the common user expectations placed on any tables implementation. This list is compiled from user feedback on the KOffice1 implementation of tables.

KOffice-specific requirements:

  • Basic (HTML-like) table features should be available. Think of rowspan/columnspan, cell borders, cell spacing, pre-defined column height, adjusting the whole table to the width of the page (or shape), embedded tables etc.
  • KOffice should allow sizes of cells, insets and border thickness to be defined in real-world units, for example PostScript points or millimeters.
  • Users should be able to select text and cells using the same user interface elements without switching. Essentially selecting all text in a cell is the same as selecting a cell.
  • It should be possible to split tables over pages (shapes) at any position, not just at row boundaries.
  • Integration into text is important; users expect that pressing the right arrow key just before the table moves the cursor into the first cell.

There was a research project on tables for KOffice2 based on a KoShape that would then be embedded in the text flow. This project was not continued and was removed from svn just before the release of 2.0. It is still available for continued research in websvn. The tableshape design is based on having a separate component that the text-rendering engine places 'inline' as one big item.

What we learned from this project, as well as from the KOffice1 tables design, is that the inlined object (a shape in KOffice2, frames in KOffice1) has an upper limit that you can't really get past. Specifically, the inline text-object approach makes it practically impossible to integrate properly with text editing, and practically impossible to split a table over two or more pages. Both are requirements in KOffice.

In KoText we use QTextDocument. This Qt component already has a table design built in, but doesn't provide public API for doing the layout of the cells. Still, the fact that it is built in makes building KOffice tables on top of QTextDocument much easier. The rest of this document describes the design that QTextDocument has for tables.

Qt

Internally, the entire content of a QTextDocument is one continuous string of (unicode) characters. Special characters have special meaning to the document. The easiest one of those is the linefeed: a linefeed character separates two paragraphs. A paragraph is implemented in Qt-scribe as a QTextBlock.

A higher-level structure is called a frame in Qt-scribe. A QTextFrame groups a set of blocks together. Again you will find special characters in the text document, this time marking the frame boundaries.

  • TextBeginningOfFrame QChar(0xfdd0)
  • TextEndOfFrame QChar(0xfdd1)

So, the QTextFrame is a collection of QTextBlocks. From the Qt perspective, grouping a set of paragraphs is the beginning of a table, because a table is essentially a collection of cells and each cell has at least one QTextBlock of text. Treating a table as a collection of blocks (aka paragraphs) is something you find in the API of QTextTable, a class that inherits from QTextFrame.

The observant reader will then see that, in the underlying data structures in Qt, a table starts at a TextBeginningOfFrame and ends at a TextEndOfFrame. Additionally, each cell is separated from the previous one by another TextBeginningOfFrame. And naturally, paragraph separators inside a cell are marked by end-of-line characters.
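The flat representation described above can be modelled in a few lines of standard C++. This is only a sketch of the idea, not Qt code: the helper names flattenTable and splitCells are made up for illustration, nested tables are not handled, and '\n' stands in for the paragraph separator as in the description above.

```cpp
#include <string>
#include <vector>

// Marker characters as listed above (code points 0xfdd0 and 0xfdd1).
const char16_t kBeginningOfFrame = 0xfdd0;
const char16_t kEndOfFrame = 0xfdd1;

// Hypothetical helper: build the flat string for one table. Every cell is
// introduced by a beginning-of-frame marker, and the table as a whole is
// closed by a single end-of-frame marker.
std::u16string flattenTable(const std::vector<std::u16string> &cells)
{
    std::u16string flat;
    for (const std::u16string &cell : cells) {
        flat += kBeginningOfFrame;
        flat += cell;
    }
    flat += kEndOfFrame;
    return flat;
}

// Hypothetical helper: recover the cell texts by splitting on the markers.
// Text before and after the table is skipped. Nested tables are NOT handled.
std::vector<std::u16string> splitCells(const std::u16string &flat)
{
    std::vector<std::u16string> cells;
    bool inTable = false;
    for (char16_t ch : flat) {
        if (ch == kBeginningOfFrame) {
            cells.push_back(std::u16string()); // a new cell starts here
            inTable = true;
        } else if (ch == kEndOfFrame) {
            inTable = false;                   // the table is closed
        } else if (inTable) {
            cells.back() += ch;                // ordinary text or '\n'
        }
    }
    return cells;
}
```

Because the table lives in the same string as the surrounding paragraphs, an edit such as "insert a character at position N" needs no table-specific code path in this model, which is the point the section makes about keeping a single data structure.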

This design leads to a very easy way of editing text together with tables. There is no need to keep multiple data structures in sync, as tables are supported at the lowest level of QTextDocument, and KOffice can reuse this as it is.

QTextFormat

The QTextFormat class is essentially a set of key/value pairs. The keys come from one big enum, so each key is just an integer, which makes the format a simple data store. As described in the previous section, the basic internal data structure of a QTextDocument is one long string. Next to that string there is a list of formats, and each character in the text maps to a QTextFormat. A normal character has a QTextCharFormat with all the relevant properties. A block separator (aka linefeed) has a QTextBlockFormat. A QTextTable then has a QTextTableFormat. No surprises there. There is also a QTextTableCellFormat, among others.

KOffice uses these formats in the same way that Qt does, but adds many more properties to them. This is how we support ODF features that Qt doesn't know about. For tables I expect us to be able to work out which ODF property goes into which format type and then apply that to the document on loading.
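To make the key/value idea concrete, here is a minimal model in standard C++. It is a sketch rather than the Qt class: the Format class, the CellPadding and CellSpacing values, and the OdfBorderLineWidth key are all invented for illustration, and values are restricted to double to keep it short. The one real detail is that Qt reserves keys from QTextFormat::UserProperty (0x100000) upward for application-defined properties, which is the range an application would use for extra ODF properties.

```cpp
#include <map>

// Property keys are plain integers. The first two stand in for built-in
// keys; their numeric values are made up for this sketch.
enum PropertyId {
    CellPadding = 1,
    CellSpacing = 2,
    // Application-defined keys start at a reserved offset so that they
    // cannot collide with built-in ones (same value as Qt's UserProperty).
    UserProperty = 0x100000,
    // Hypothetical key for an ODF attribute that Qt has no slot for.
    OdfBorderLineWidth = UserProperty + 1
};

// A format is nothing more than a map from integer key to value.
class Format
{
public:
    void setProperty(int key, double value) { m_properties[key] = value; }

    bool hasProperty(int key) const { return m_properties.count(key) != 0; }

    // Returns the stored value, or 'fallback' when the key was never set.
    double property(int key, double fallback = 0.0) const
    {
        std::map<int, double>::const_iterator it = m_properties.find(key);
        return it == m_properties.end() ? fallback : it->second;
    }

private:
    std::map<int, double> m_properties;
};
```

On loading, an ODF reader built along these lines would call setProperty with a built-in key where one exists and with a key above UserProperty otherwise; the layout code later asks for the same keys.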

QTextTable Layout

If you have followed along this far, you will have noticed that the text in each cell of a table is actually a normal QTextBlock. The only real difference between a normal paragraph and a paragraph that is supposed to fit into a cell is that the position of the latter falls between the start and end positions of a table.
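That test is a simple range check. The sketch below uses a made-up TableRange struct and blockIsInTable function; in Qt the two bounds would come from QTextFrame::firstPosition() and QTextFrame::lastPosition() of the table, and the block position from QTextBlock::position().

```cpp
// Character range occupied by a table in the flat document, both ends
// inclusive. Hypothetical type, used only for this illustration.
struct TableRange {
    int firstPosition;
    int lastPosition;
};

// A block belongs to the table when its start position lies in the range.
bool blockIsInTable(int blockPosition, const TableRange &table)
{
    return blockPosition >= table.firstPosition
        && blockPosition <= table.lastPosition;
}
```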

KOffice at the 2.0 release will gladly lay out a table using the KOffice-specific text-layout engine. Unfortunately that engine doesn't know anything about tables or cells, so if you have a table in your document it will ignore this extra information. Prizes for guessing what this means for the actual layout with the KOffice 2.0 layout engine :) That's right, each cell will just be a full-width paragraph, one below the other.

The Qt counterpart of the layout class that KOffice uses is QTextDocumentLayout, which inherits from QAbstractTextDocumentLayout, just like the KOffice-specific KoTextDocumentLayout does. QTextDocumentLayout is a private class that is not exported, so we can't use it in KOffice.

The low-level text-layout concept is pretty basic. To understand it, I suggest first looking at the KoTextDocumentLayout::layout() method, but the qt4-scribe docs will certainly help too.

To do text layout in table cells we essentially just have to create text lines whose width is based on the cell size instead of on the total width of the shape we are laying out in. In addition, we should position the blocks at the cell positions instead of at a linear, always increasing y-position. The paragraph position is stored on the QTextLayout class.
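The geometry behind those two steps can be sketched as follows. This is not KoTextDocumentLayout code: CellRect, cellRect and lineWidth are hypothetical names, spans are ignored, and column widths and row heights are assumed to be known up front (in a real layout the row height usually depends on the text that ends up in the row).

```cpp
#include <cstddef>
#include <vector>

// Rectangle of one cell, relative to the top-left corner of the table.
struct CellRect {
    double x;
    double y;
    double width;
    double height;
};

// Hypothetical helper: find the rectangle of cell (row, column) by summing
// the widths of the columns to its left and the heights of the rows above.
// at() throws std::out_of_range for a row or column that does not exist.
CellRect cellRect(const std::vector<double> &columnWidths,
                  const std::vector<double> &rowHeights,
                  std::size_t row, std::size_t column)
{
    CellRect rect;
    rect.width = columnWidths.at(column);
    rect.height = rowHeights.at(row);
    rect.x = 0.0;
    for (std::size_t c = 0; c < column; ++c)
        rect.x += columnWidths[c];
    rect.y = 0.0;
    for (std::size_t r = 0; r < row; ++r)
        rect.y += rowHeights[r];
    return rect;
}

// Width available to text lines inside a cell: the cell width minus the
// padding on the left and right side, never negative.
double lineWidth(const CellRect &rect, double padding)
{
    const double width = rect.width - 2.0 * padding;
    return width > 0.0 ? width : 0.0;
}
```

With columns of 100, 50 and 150 points and rows of 20 and 40 points, the cell in the second row and third column sits at x = 150, y = 20, and its lines get 140 points of width with 5 points of padding. The blocks of that cell would be laid out with that line width and positioned at that offset, rather than stacked below the previous cell.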

Ideas

KOffice uses basic Qt html export/import for the clipboard. So we can just copy / paste a table into a KWord document using this approach. Makes for easy debugging ;)