Jump to content

KDb/Rich Data Containers

From KDE Community Wiki
< KDb
 started: june 2006, jstaniek

Introduction

Since KEXI 1.6 we have BLOB data type supported. It's called "Object" in KEXI for simplicity. There is defined "subType" property for this data type, but for now it contains only one item: Image. In the future KEXI version there will be more subtypes like document (probably ODF), multimedia file (movie, sound), and user defined subtypes.

From storage point of view, each additional subtype does not matter - the data is still a binary array. But each subtype requires it's own set of widgets, namely Table View cell editor, Form widget, and (in to future) Report widget. These widgets can be implemented as read-only or read/write.

Below we'll mostly focus on the simplest and already implemented case: Image Object (subtype).

Needs

What we want to achieve is to keep backward compatibility. There is currently something we may want to address before 1.6 release:

  • Support for storing multiple objects per "Object" field
  • Support for storing metadata for each "Object"

The extenstions can be implemented by storing rich data within a single field of type "Object". We can refer to such field as to rich data container.

About metadata: the minimal metadata includes:

  • filenames
  • mimetype

By inserting files like images user provides filenames. These are lost unless we store them somewhere near the BLOB data. In fact we already store the metadata for any static image (the one that is put onto the form surface) in the kexi__blob "system" table.

Use Case

User defined a table ''"persons (name Text, surname Text, photo Object) and entered some records. She prefers to save another image for a given person. She can redesign (normalize) the database and create additional child table containing images for persons. But having rich data containers available she doesn't need to. What she does:

  • set "allowMultipleObjects" property to true (it is false by default) in the Table Designer
  • in the Table View or Data View of a form, she clicks a "+" button within the Image widget and adds another "place" (the "place" can be displayed as somethign like a page of a book)
  • then she inserts another image into the new empty "place"
  • she can iterate over the items to display particular images

Note: the widget supporting multiple images is more complicated compared to the original Image widget

Ideas

It's interesting that this feature, being planned for KEXI, is already somewhat implemented in MSA 2k7 co(see New Attachment Data Type section).

There can be two methods to implement the functinality:

METHOD1: save an xml string at the beginning of the BLOB data and then append data objects. The xml strong could contain sizes of the data objects and some metadata for them. The layout:

 size_of_xml_stingxml stringOBJECT_DATA1...OBJECT_DATAn

Where size_of_xml_sting is and integer equal to the size of the xml string.

Example xml string: <container> <object><mimetype>image/png</mimetype> <filename>moon.png</filename> </object> </container>

Pros of this method:

  • we have still used only one physical database field, so it's easy to alter the table schema when user turns "allowMultipleObjects" property on.

Cons of this method:

  • we cannot have queries like "SELECT object.data, object.filename from table" for free using the database engine.
  • inserting, modifying or deleting the objects require the entire BLOB to be loaded, rebuilt, and saved again to the backed, as in general there is no way to get random access to the BLOB data.

METHOD2.1: A case when "allowMultipleObjects" property is false for the field. Upon defining the field as supporting multiple objects, create the following fields instead of a single BLOB:

  • filename
  • mimetype
  • data

Within KEXI GUI, the field will be still visible as single item.

METHOD2.2: A case when "allowMultipleObjects" property is true for the field. Upon defining the field as supporting multiple objects, instead of a single BLOB, create a separate table containing the following fields:

  • parent_id
  • filename
  • mimetype
  • data

Within KEXI GUI, the field will be still visible as single item.

In fact, METHOD2.2 is what MSA utilizes.

Pros of this method:

  • we have queries like "SELECT object.data, object.filename from table" for free using the database engine.
  • parent_id is a foreign key referring to the parent record. Thus, there's one-to-many relationship allowing multiple objects to be inserted, modified or deleted at a reasonable cost.

Cons of this method:

  • for each Objecty field wh have additional internal fields. Internally, the database schema becomes more complex. This should not be a problem however, since in most cases there is just one column of type Object defined per table.

Proposed verdict: Methods 2.1 and 2.2 are better.

RFC