Jump to content

Kexi/Migration/MDB: Difference between revisions

From KDE Community Wiki
Jstaniek (talk | contribs)
mNo edit summary
Jstaniek (talk | contribs)
 
(13 intermediate revisions by the same user not shown)
Line 1: Line 1:
This is MDB file migration driver for KEXI.
This is a MDB file migration driver for KEXI.


MDB files are the native database format of MS Access (and also some other MS applications).  
MDB files are the native database format of Microsoft Access (and also some other MS applications). Access was created in early 90's and is still offered today. We often use shortened term MSA for it.
 
MDB files are self-contained; they contain both data and database schema.


* Location of the sub-project is in KEXI's src/migration/mdb dir.
* Location of the sub-project is in KEXI's src/migration/mdb dir.
* Read the project [https://cgit.kde.org/kexi.git/tree/src/migration/mdb/README README].
* Read the project [https://invent.kde.org/office/kexi/-/blob/master/src/migration/mdb/README README].
* '''You are invited to join the project.'''
* '''You are invited to join the project.'''
* Contact for support [[Kexi/Contact|here]].
* Contact for support [[Kexi/Contact|here]].


__TOC__
== Introduction ==
The Kexi MDB driver is based upon mdbtools.
Currently the Migration Driver is developed, what means MDB driver '''can only import'''  MDB databases as a new KEXI database. It cannot be used to edit an MDB database.
== Supported features ==
Currently .mdb and .mde database files are supported because they are the formats supported by ''mdbtools''. They are created with MS Jet Engine version 3.0 (MS Access 95 and 97) and MS Jet Engine 4.0 (MS Access 2000, XP, 2003...). Access 2007's .accdb files are currently unsupported. This format has not been reverse engineered. It also has nothing in common with MS Office Open XML format so there is still no official documentation. Our advice is to avoid this format if possible and stay with older .mdb file format if you have to use Access.
Files created with MS Jet 2.0 (MS Access 2) are not and probably will not be supported. You can use free Microsoft conversion tools (Windows-only) to convert these files to a supported mdb format and import the converted files using the MDB migration driver.
== Sample Databases ==
* [https://github.com/statdata/Northwind-Databse/blob/master/Northwind.mdb Northwind Traders Sample Database]
** See also [https://medium.com/@graceidiare/northwind-traders-analysis-d74e4b3cf55f analysis of this design]
* [https://github.com Free MS Access templates from Microsoft] useful for test purposes (TODO jstaniek: I have them locally as msa_templates.tar.gz, will upload to github)
== Development ==
=== Packaging ===
''mdbtools'' source code is maintained within the KEXI migration scope to ensure that exact version that is required, not too new, not too old, is in use.
=== Future plans ===
As soon as mdbtools provide full write support for MDB databases, we wouldd like to extend the MDB support to a full KDb driver. However, it's possible that a read-only KDb Driver could also be developed before.
Maintaining ''mdbtools'' within the KEXI migration scope can be stopped if it's possible and practical.
=== Comparison to Libre/OpenOffice support ===
'''mdb-sdbc-driver:''' oo.org uses .mdb files in read-write mode on Win32 using a native libraries. On Linux, it uses the MDBTools library too. You can also take a look at the patches published on the page. Bqack in 200x years when we monitored the situation, there were just SQL parsers that we wouldn't like to use in KEXI because KDb already had its own, more advanced SQL parser.
=== TODOs ===
(in random order)
*Move to KDb?
*Test import of OLE and other complex data
*Import relationships
=== Read-write support for .mdb files?===
(started by JS, January 2005)
By in-place read-write support we mean opening .mdb files without migrating it to other (e.g. more KEXI-compatible) format.
For an average Joe user this topic looks quite simple: we're opening .mdb files, doing changes (adding records to existing tables, adding/removing tables, changing schema, designing and executing queries, and so on...). There are some issues though:
*mdbtools provides incomplete write support. Its author already did really huge reverse-engineering work but, at the time of this writing, adding a new table may by problematic. For example, people on win32 platforms can assume that it's all work well, eg. when trying LO/OO beta, but looks like win32's API is used there, not mdbtools. There are problems reported with MS Access files with the same application running on Linux instead of MS Windows.
*Reading tables by simple listing rows is only a small part of the task. To make a full use of database features user needs to be able to perform SQL queries on the database with JOINs and WHERE clauses, no matter if by entering a query statement or by using a GUI tools. Unfortunately, to be able to perform more sophisticated queries than a SELECT, one needs to add a Query Engine which can perform joins transparently, using foreign keys, and also other means like: rows filtering, computing expressions. Such engine doesn't exists within mdbtools and isn't even planned in the near future.
What about reusing such an engine? It's hard or impossible to effectively take SQL engine out of, say, SQLite project and reuse it just by replacing its data storage layer with .mdb-compatible one. Why? SQLite (and probably most or all engines) are designed and optimized with one particular storage method in mind. Similarity of any such a method to .mdb storage handling seems to be only superficial, e.g. regarding to splitting memory to small pages.
===Issues Related to mdbtools===
*No support for MONEY and DECIMAL values. There is a bug in mdbtools (<0.6) related to handling these data types. DECIMAL value is always fetched as "000000."
*Fields ordering. Sometimes order of the imported fields for the given table is not the same as the one visible in MS Access. This can be related to moving and/or removing columns. To fix on mdbtools level.
=== Exporting and Importing design from/to MS Access files ===
Note: this tool requires running MSA.
There is an '''undocumented''' feature in MSA allowing to import and export a full definition of an object to a file. Queries, forms, reports, modules, data acces pages and macros can be exported and imported (i.e. all except table design, but tables can be exchanged using mdbtools).
The advantage of this method is that we can discover the full object schema that is probably very similar in terms of layout to what is stored in memory by MSA. Example fragment of a query design listing inserted tables:
  Begin InputTables
    Name ="Authors"
    Name ="Publishers"
    Name ="Titles"
  End
The available methods belong to ''Access.Application'' object:
  Sub SaveAsText(ObjectType As AcObjectType, ObjectName As String,
                FileName As String)
  Sub LoadFromText(ObjectType As AcObjectType, ObjectName As String,
                FileName As String)
To export a report definition to a file: open the Immediate window (Ctrl+G) and type:
  Application.SaveAsText acReport, "ReportName", "C:\reportdef.txt"
You can import the report from file later by typing:
  Application.LoadFromText acReport, "ReportName", "C:\reportdef.txt"
''acReport'' constant stands for ''report'' type. Other available constants are ''acFrom'', ''acQuery'', ''acModule'', ''acMacro'', ''acDataAccessPage''.
*Note 1: These methods are useful to backup MSA database objects. Employed in a loop using VBA can export all the objects in one go (except tables).
*Note 2: Funny, but intellisense "knows" about the signature of the methods while you entering the args. In fact you can use Object browser to find these methods and many other: for example select ''Application'' class, press right mouse button and set "Show Hidden Members" option on. You will see grayed names of classes and members.
===MS Access Annoyances===
Compatibility:
* If you even '''open''' an MS Access 97 database using MS Access 2000, it can't be accessed from 97 anymore. That's why many users gave up and still are using Access 97...
* MS Access 2003 can't open 97 databases at all, and MS Access 2003 refuses to co-exist with 97 on the same machine the way 2000 can.
Multiuser Access:
* When multiple clients are accessing the same MS Access database, they may usually encounter data locks. Moreover, record file locks can be notoriously orphaned what leads to an unaccessible database (all users need to be disconnected).
Queries:
* MS Access does not support case-sensitive queries. VBA written by hand is requires or some other middle tier code as a workaround. See also [https://web.archive.org/web/20061228031415/http://support.microsoft.com/?kbid=209674] and [[KDb/Database_Collation_Issues|Collation Issues]].
What Others Say:
* [https://web.archive.org/web/20080915072927/http://www.vistadb.net/compare_jet.asp Comparison from VistaDB vendor]
===MS Access Stupidities===
* [https://web.archive.org/web/20080506075549/http://support.microsoft.com/kb/304206 http://support.microsoft.com/kb/304206]
* "Missing References" Hell: http://www.mvps.org/access/bugs/index.html
* Forms:
** We are in XXI century, but form styles (available in Form Wizard) are still generated with 16 colors.
* MSA allows users to use any (non latin-1) characters and then they are wondering why their database does not work when on MS Windows with other GUI language. After a cryptic error message users are forced to '''find and rename objects by hand''' to fix the bug. Moreover, MSA itself creates non latin-1 identifiers automatically, e.g. for "Details" identifier - "Szczegóły" in polish. This is example of badly designed core functionality that cannot be patched.
* Flawed sort order for text (MSA version 2000 or newer): dash characters (-) are ommited while sorting. For example the result can be as unexpected as the following:
  aza
  a-z-z
  -zoo
instead of:
  -zoo
  a-z-z
  aza
Comment: this is hidden (not well undocumented) bit of Access' sorting strategy. Moreover '.' or '_' characters are not affected, what looks inconsistent. As a result, there may be invalid query results or reports if user depends on "typical" sorting order.


Old page: http://www.kexi-project.org/wiki/wikiview/index.php@MDBDriver.html
===Links===
* [https://mdbtools.sourceforge.net/ MDBTools]
* [https://jackcess.sourceforge.io/ Jackcess]  - a pure Java library for reading from and writing to MS Access databases (originally based on mdbtools)
* [https://poi.apache.org/ Jakarta POI] - Java API To Access Microsoft Format Files
* [https://web.archive.org/web/20210427181953/https://www.databasejournal.com/features/msaccess/article.php/3528491/Use-System-Tables-to-Manage-Objects.htm Databasejournal.com: Use System Tables to Manage Objects]
* [http://www.delphikingdom.com/asp/viewitem.asp?catalogid=889 Accessing Query defs in .mdb files without using MS Access] (in Russian; [https://www-delphikingdom-com.translate.goog/asp/viewitem.asp?catalogid=889&_x_tr_sch=http&_x_tr_sl=fr&_x_tr_tl=pl&_x_tr_hl=en&_x_tr_pto=wapp english autotranslation]) - this info can be used for implementing MSA query importer.
* [https://web.archive.org/web/20130917025322/http://support.microsoft.com/kb/275561 Description of the new features that are included in Microsoft Jet 4.0]
* [http://talks.php.net/show/mdbtool/ Using mdbtools with PHP]
* [https://web.archive.org/web/20011211031752/http://www.microsoft.com/accessdev/articles/jetwp.htm#JET MS Jet Database Engine 2.0: A User's Overview] at MSA Developer Forum Archives
* [https://web.archive.org/web/20080723185404/http://bryanmills.net:8086/archives/2003/11/microsoft-access-database-using-linux-and-php/ Microsoft Access Database using Linux and PHP]
* [https://web.archive.org/web/20080118030253/http://www.microsoft.com/sql/solutions/migration/access/whenmigrate.mspx When to Migrate from MSA to MSSQL Server] - a source of information about MSA drawbacks compared to db servers
* [https://en.wikipedia.org/wiki/Compound_File_Binary_Format Compound File Binary Format (CFBF)]  - on-disk storage format of data, opened by MS for use by others and it is now used in a variety of programs; used by MSA and Business Objects.
* [https://en.wikipedia.org/wiki/SNP_file_format SNP File Format], based on the Compound File Binary Format (CFBF), used by MSA to store Report Snapshots in a single file which can be viewed and printed by the Microsoft Snapshot Viewer (available as a free download from MS). This allows report output to be exported and viewed on computers which do not have MSA installed.

Latest revision as of 22:26, 5 October 2024

This is a MDB file migration driver for KEXI.

MDB files are the native database format of Microsoft Access (and also some other MS applications). Access was created in early 90's and is still offered today. We often use shortened term MSA for it.

MDB files are self-contained; they contain both data and database schema.

  • Location of the sub-project is in KEXI's src/migration/mdb dir.
  • Read the project README.
  • You are invited to join the project.
  • Contact for support here.

Introduction

The Kexi MDB driver is based upon mdbtools.

Currently the Migration Driver is developed, what means MDB driver can only import MDB databases as a new KEXI database. It cannot be used to edit an MDB database.

Supported features

Currently .mdb and .mde database files are supported because they are the formats supported by mdbtools. They are created with MS Jet Engine version 3.0 (MS Access 95 and 97) and MS Jet Engine 4.0 (MS Access 2000, XP, 2003...). Access 2007's .accdb files are currently unsupported. This format has not been reverse engineered. It also has nothing in common with MS Office Open XML format so there is still no official documentation. Our advice is to avoid this format if possible and stay with older .mdb file format if you have to use Access.

Files created with MS Jet 2.0 (MS Access 2) are not and probably will not be supported. You can use free Microsoft conversion tools (Windows-only) to convert these files to a supported mdb format and import the converted files using the MDB migration driver.

Sample Databases

Development

Packaging

mdbtools source code is maintained within the KEXI migration scope to ensure that exact version that is required, not too new, not too old, is in use.

Future plans

As soon as mdbtools provide full write support for MDB databases, we wouldd like to extend the MDB support to a full KDb driver. However, it's possible that a read-only KDb Driver could also be developed before.

Maintaining mdbtools within the KEXI migration scope can be stopped if it's possible and practical.

Comparison to Libre/OpenOffice support

mdb-sdbc-driver: oo.org uses .mdb files in read-write mode on Win32 using a native libraries. On Linux, it uses the MDBTools library too. You can also take a look at the patches published on the page. Bqack in 200x years when we monitored the situation, there were just SQL parsers that we wouldn't like to use in KEXI because KDb already had its own, more advanced SQL parser.

TODOs

(in random order)

  • Move to KDb?
  • Test import of OLE and other complex data
  • Import relationships

Read-write support for .mdb files?

(started by JS, January 2005)

By in-place read-write support we mean opening .mdb files without migrating it to other (e.g. more KEXI-compatible) format.

For an average Joe user this topic looks quite simple: we're opening .mdb files, doing changes (adding records to existing tables, adding/removing tables, changing schema, designing and executing queries, and so on...). There are some issues though:

  • mdbtools provides incomplete write support. Its author already did really huge reverse-engineering work but, at the time of this writing, adding a new table may by problematic. For example, people on win32 platforms can assume that it's all work well, eg. when trying LO/OO beta, but looks like win32's API is used there, not mdbtools. There are problems reported with MS Access files with the same application running on Linux instead of MS Windows.
  • Reading tables by simple listing rows is only a small part of the task. To make a full use of database features user needs to be able to perform SQL queries on the database with JOINs and WHERE clauses, no matter if by entering a query statement or by using a GUI tools. Unfortunately, to be able to perform more sophisticated queries than a SELECT, one needs to add a Query Engine which can perform joins transparently, using foreign keys, and also other means like: rows filtering, computing expressions. Such engine doesn't exists within mdbtools and isn't even planned in the near future.

What about reusing such an engine? It's hard or impossible to effectively take SQL engine out of, say, SQLite project and reuse it just by replacing its data storage layer with .mdb-compatible one. Why? SQLite (and probably most or all engines) are designed and optimized with one particular storage method in mind. Similarity of any such a method to .mdb storage handling seems to be only superficial, e.g. regarding to splitting memory to small pages.

Issues Related to mdbtools

  • No support for MONEY and DECIMAL values. There is a bug in mdbtools (<0.6) related to handling these data types. DECIMAL value is always fetched as "000000."
  • Fields ordering. Sometimes order of the imported fields for the given table is not the same as the one visible in MS Access. This can be related to moving and/or removing columns. To fix on mdbtools level.

Exporting and Importing design from/to MS Access files

Note: this tool requires running MSA.

There is an undocumented feature in MSA allowing to import and export a full definition of an object to a file. Queries, forms, reports, modules, data acces pages and macros can be exported and imported (i.e. all except table design, but tables can be exchanged using mdbtools).

The advantage of this method is that we can discover the full object schema that is probably very similar in terms of layout to what is stored in memory by MSA. Example fragment of a query design listing inserted tables:

 Begin InputTables
    Name ="Authors"
    Name ="Publishers"
    Name ="Titles"
 End

The available methods belong to Access.Application object:

 Sub SaveAsText(ObjectType As AcObjectType, ObjectName As String,
                FileName As String)
 Sub LoadFromText(ObjectType As AcObjectType, ObjectName As String,
                FileName As String)

To export a report definition to a file: open the Immediate window (Ctrl+G) and type:

 Application.SaveAsText acReport, "ReportName", "C:\reportdef.txt"

You can import the report from file later by typing:

 Application.LoadFromText acReport, "ReportName", "C:\reportdef.txt"

acReport constant stands for report type. Other available constants are acFrom, acQuery, acModule, acMacro, acDataAccessPage.

  • Note 1: These methods are useful to backup MSA database objects. Employed in a loop using VBA can export all the objects in one go (except tables).
  • Note 2: Funny, but intellisense "knows" about the signature of the methods while you entering the args. In fact you can use Object browser to find these methods and many other: for example select Application class, press right mouse button and set "Show Hidden Members" option on. You will see grayed names of classes and members.

MS Access Annoyances

Compatibility:

  • If you even open an MS Access 97 database using MS Access 2000, it can't be accessed from 97 anymore. That's why many users gave up and still are using Access 97...
  • MS Access 2003 can't open 97 databases at all, and MS Access 2003 refuses to co-exist with 97 on the same machine the way 2000 can.

Multiuser Access:

  • When multiple clients are accessing the same MS Access database, they may usually encounter data locks. Moreover, record file locks can be notoriously orphaned what leads to an unaccessible database (all users need to be disconnected).

Queries:

  • MS Access does not support case-sensitive queries. VBA written by hand is requires or some other middle tier code as a workaround. See also [1] and Collation Issues.

What Others Say:

MS Access Stupidities

  • http://support.microsoft.com/kb/304206
  • "Missing References" Hell: http://www.mvps.org/access/bugs/index.html
  • Forms:
    • We are in XXI century, but form styles (available in Form Wizard) are still generated with 16 colors.
  • MSA allows users to use any (non latin-1) characters and then they are wondering why their database does not work when on MS Windows with other GUI language. After a cryptic error message users are forced to find and rename objects by hand to fix the bug. Moreover, MSA itself creates non latin-1 identifiers automatically, e.g. for "Details" identifier - "Szczegóły" in polish. This is example of badly designed core functionality that cannot be patched.
  • Flawed sort order for text (MSA version 2000 or newer): dash characters (-) are ommited while sorting. For example the result can be as unexpected as the following:
 aza
 a-z-z
 -zoo

instead of:

 -zoo
 a-z-z
 aza

Comment: this is hidden (not well undocumented) bit of Access' sorting strategy. Moreover '.' or '_' characters are not affected, what looks inconsistent. As a result, there may be invalid query results or reports if user depends on "typical" sorting order.

Links