Osnabrück icaldir proposal
icaldir
KDE PIM Team Mark Bucciarelli
Request for Comments February 8, 2005
CONTENTS
I. PROBLEM
II. SOLUTION
III. DIRECTORY LAYOUT
IV. FILE NAMES
V. TIME ZONES
VI. OPERATIONS
VII. CONTACT
I.) PROBLEM
This document proposes a maildir-like approach for storing
iCalendar data [1]. The motivation is to provide a scalable
way for client applications to coordinate concurrent updates
to iCalendar data.
Currently, client programs that access iCalendar data (for
example, KDE's KOrganizer [2], Apple's iCal [3], Mozilla's
Sunbird [4], the PHP iCalendar [5]) read and save iCalendar
data in one large file. This works fine if there is one
program updating the calendar data but is cumbersome if
multiple clients are updating the same calendar data.
Since all data is stored in a single file, each client must
lock the entire file to update one single event. If one
client program wants to add an event for January 2006 and
another client is editing an old event in March 2005, one
must wait for the other one to finish.
If an iCalendar file is changed, all client programs
displaying data from that file must rescan the entire file to
be sure they have the most current data.
When editing a file, a program must either load the entire
file into memory and keep it there or reread the entire file
to find the right spot to edit. For large calendars, either
of these approaches requires an inefficient use of system
memory.
II.) SOLUTION
The solution is inspired by and borrows heavily from the
maildir format [6]. As maildir broke the one large mbox file
into many small files (one for each email), we break the
iCalendar file into many small files, one for each iCalendar
event.
Client applications then create a lock that is specific to
the file they are editing. Any client that creates a lock
must touch the file periodically to update its attributes;
in this way, other clients can tell without question when a
lock file has become stale. (See the 1997 USENIX paper on
the Earthlink Mail System for a great writeup on this
approach [7].)
The icaldir is designed to be tolerant of network transience.
If a client application begins to edit a calendar entry but
then crashes or loses network connectivity before completing
the transaction,
- there is no data loss and
- the lock file unambiguously goes stale and clients are
free to update that object.
III.) DIRECTORY LAYOUT
.../icaldir
|
+- vcalendar_header
|
+- vcalendar_footer
|
+- /cur
|
+- /tmp
III.A.) The vcalendar_header file
This stores any iCalendar header information that comes
before the iCalendar objects start; for example,
BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//KDE::pim//icaldir//EN
III.B.) The vcalendar_footer file
Contains any iCalendar information that comes after the
iCalendar objects end; for example,
END:VCALENDAR
III.C.) The cur directory
This directory holds one file for every iCalendar object.
The file names are created according to Section IV.) FILE
NAMES.
III.D.) The tmp directory
A temporary working directory. No persistent data is stored here.
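To illustrate how this layout maps back onto a single iCalendar
stream, here is a minimal Python sketch that concatenates the
header, every object in cur, and the footer. The function name
export_icaldir is hypothetical and not part of this proposal; the
sketch assumes every file is UTF-8 text and ignores locking.

```python
from pathlib import Path


def export_icaldir(icaldir):
    """Return the whole calendar as one iCalendar string.

    Hypothetical helper: joins vcalendar_header, every file in cur
    (sorted by file name), and vcalendar_footer.
    """
    root = Path(icaldir)
    parts = [(root / "vcalendar_header").read_text(encoding="utf-8")]
    for entry in sorted((root / "cur").iterdir()):
        if entry.is_file():
            parts.append(entry.read_text(encoding="utf-8"))
    parts.append((root / "vcalendar_footer").read_text(encoding="utf-8"))
    # Make sure each part ends with a line break before joining.
    return "".join(p if p.endswith("\n") else p + "\n" for p in parts)
```

Sorting by file name is only a convenience here; the order of
objects inside a VCALENDAR block carries no meaning.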
IV.) FILE NAMES
(YYYYMMDD|R)-host-pid-timeslice.type
where
YYYYMMDD : date event starts
R : event is recurring
host : hostname of client creating object
pid : process id of client program creating object
timeslice : exact time on client machine of object creation
type : ical object type; for example, journal
TODO: review how maildir names files; ie host/pid/time
TODO: hmmm, get ical text strings in file/names.h?? ;)
TODO: think about sort order that default ls returns
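The naming pattern above could be generated roughly as follows.
This is only a sketch: the function name make_icaldir_name is
hypothetical, and using microseconds since the epoch for the
timeslice field is an assumption, since the proposal does not yet
fix its format.

```python
import os
import socket
import time
from datetime import date


def make_icaldir_name(start, obj_type, recurring=False,
                      host=None, pid=None, timeslice=None):
    """Build a name of the form (YYYYMMDD|R)-host-pid-timeslice.type."""
    # Recurring objects get the literal prefix "R" instead of a date.
    prefix = "R" if recurring else start.strftime("%Y%m%d")
    host = socket.gethostname() if host is None else host
    pid = os.getpid() if pid is None else pid
    if timeslice is None:
        # Assumption: microseconds since the epoch stand in for "timeslice".
        timeslice = int(time.time() * 1_000_000)
    return f"{prefix}-{host}-{pid}-{timeslice}.{obj_type}"


# Example call with the current host, process id and clock.
example_name = make_icaldir_name(date(2005, 3, 14), "event")
```

With fixed inputs, make_icaldir_name(date(2005, 3, 14), "event",
host="myhost", pid=4242, timeslice=1) returns
"20050314-myhost-4242-1.event". Host names may themselves contain
hyphens, so a parser cannot simply split the name on "-"; that is
one more point to settle alongside the TODO items above.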
V.) TIME ZONES
The iCalendar RFC [1] mentions a time zone registry but
does not formally define one. The icaldir format
specifies the Olson tz database [8] as its time zone
registry.
In icaldir, VTIMEZONE objects do not need to be
identified, but all TZID codes must be valid codes from
the Olson tz database.
In RFC 2445, a solidus (forward slash) prefix marks a TZID
as coming from a global time zone registry, so every TZID in
an icaldir must carry that prefix; for example:
TZID=/America/New_York
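A client could check this rule along the following lines. The
sketch is in Python and the function name is_valid_tzid is
hypothetical. It relies on the standard zoneinfo module (Python
3.9 or later), which reads whatever copy of the Olson database is
available to the interpreter; on systems without one, the set of
known zones can be passed in explicitly.

```python
from zoneinfo import available_timezones


def is_valid_tzid(tzid, known_zones=None):
    """Return True if tzid is '/' followed by a known Olson zone name."""
    if known_zones is None:
        # Uses the tz database visible to this Python installation.
        known_zones = available_timezones()
    return tzid.startswith("/") and tzid[1:] in known_zones
```

For example, is_valid_tzid("/America/New_York", {"America/New_York"})
is true, while the same name without the leading solidus is rejected.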
TODO: Read through Unicode "Common Locale Data Repository" docs.
http://www.unicode.org/cldr/
Someone did a big chunk of work analyzing Olson db for
the CLDR project: see
http://www.unicode.org/cldr/data/docs/design/formatting/time_zone_localization.html
TODO: What about Windows?
Apparently, there are registry keys
(http://wiki.osafoundation.org/bin/view/Journal/JeffreyHarris20041119):
SOFTWARE\\Microsoft\\Windows NT\\CurrentVersion\\Time Zones
SYSTEM\\CurrentControlSet\\Control\\TimeZoneInformation
The first one is a listing of all timezones that the operating
system knows about and the second one is detail information on
what the current timezone is set to.
TODO: What about OSX? Double check it comes with zoneinfo file.
TODO: read: http://www.chronos-st.org/Discovering the Local Time Zone--Why It's a Hard Problem.html
** This is a good doc! **
TODO: What about using "POSIX time zone rule literal"?
ref: http://www.chronos-st.org/Installation.html
VI.) OPERATIONS
VI.A.) Create a New Calendar Object
1. Generate a file name (follow same rules as maildir).
2. Write file to tmp directory using this file name.
Use the file name as object's iCalendar UID property.
3. Move file from tmp to cur directory.
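The three steps above can be sketched in Python as follows. The
function name create_object is hypothetical. The caller is assumed
to have generated the name already and to have written it into the
body as the UID property; the sketch also assumes tmp and cur live
on the same file system, so that the final rename is atomic.

```python
import os


def create_object(icaldir, name, body):
    """Write body to tmp/name, then move it into cur/name."""
    tmp_path = os.path.join(icaldir, "tmp", name)
    cur_path = os.path.join(icaldir, "cur", name)
    # O_EXCL makes the open fail if the name is already taken in tmp.
    fd = os.open(tmp_path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o644)
    # newline="" keeps the CRLF line endings of the iCalendar text intact.
    with os.fdopen(fd, "w", encoding="utf-8", newline="") as f:
        f.write(body)
        f.flush()
        os.fsync(f.fileno())
    # Readers of cur never see a partially written file: the object
    # appears there only once the rename has completed.
    os.rename(tmp_path, cur_path)
    return cur_path
```

Writing in tmp first is the same idea maildir uses for mail
delivery: other clients only ever scan cur, so they cannot pick up
a half-written object.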
VI.B.) Update a Calendar Object
1. Create new file in tmp with same filename as event plus a
.lock extension. The file has one line, the refresh
interval at which you will touch this file. This
operation should fail if a file with this name already
exists.
2. If lock file creation is successful, go to step 4.
3. The iCalendar object may be in use by another process.
Read the refresh interval in the existing lock file and if
the file has not been touched for this many seconds, the
lock is stale. Overwrite the lock file and go to step 4.
Otherwise, either stop or wait refresh interval seconds
and go to step 1.
4. Create a temporary file in tmp, named according to the
file name rules in Section IV.
5. Write the new version of the event to this tmp file. If
this takes longer than the refresh interval you
specified in step 1, touch the lock file before the interval
elapses. If you lose network connectivity for more than
the refresh interval, go back to step 1.
6. Move file from tmp to cur.
7. Delete the .lock file.
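The locking part of this procedure (steps 1 to 3, plus the
periodic touch in step 5) might look like the Python sketch below.
The names acquire_lock and refresh_lock are hypothetical. The
sketch assumes the refresh interval is stored as a plain number of
seconds and that client and server clocks agree closely enough for
the age comparison to be meaningful.

```python
import os
import time


def acquire_lock(icaldir, name, refresh_seconds, now=None):
    """Try to lock the object called name; return the lock path or None."""
    lock_path = os.path.join(icaldir, "tmp", name + ".lock")
    try:
        # Step 1: O_EXCL makes creation fail if the lock already exists.
        fd = os.open(lock_path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o644)
    except FileExistsError:
        # Step 3: compare the lock's age with the interval stored in it.
        with open(lock_path, encoding="utf-8") as f:
            their_interval = float(f.readline().strip())
        now = time.time() if now is None else now
        if now - os.stat(lock_path).st_mtime <= their_interval:
            return None  # lock is live; the caller may wait and retry
        # Stale lock: overwrite it with our own interval.
        fd = os.open(lock_path, os.O_WRONLY | os.O_TRUNC)
    with os.fdopen(fd, "w", encoding="utf-8") as f:
        f.write(f"{refresh_seconds}\n")
    return lock_path


def refresh_lock(lock_path):
    """Touch the lock file so other clients see it as live (step 5)."""
    os.utime(lock_path, None)
```

This sketch does not close every race. Two clients that both find
the same stale lock can both overwrite it, and a lock file that
disappears or is still empty when it is read raises an exception
here rather than triggering a retry. A real client would need to
handle those cases.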
VI.C.) Delete a Calendar Object
1. Delete the object's file from the cur directory.
Note that because an update transaction is effectively an
atomic delete + insert, it is legal for one client
process to reinsert an event after another process has
deleted it.
TODO: Get lock when deleting?
VI.D.) Change Calendar Object StartDate
Since the start date is in an object's file name, changing the
start date requires a rename.
This is a delete + insert.
VII.) CONTACT
Please send any comments on this document to Mark
Bucciarelli, c/o the KDE PIM mailing list
<[email protected]>.
-----------------------------------------------------------------
SECTION TODOS
- mention possibility of multiple iCalendar versions within
the same icaldir (Ingo). Give example of min/max syntax
of VERSION tag.
- specify that recurrence exceptions go in the same file.
From Reinhold's email:
If you change only one instance of a recurring
event, the RFC-compliant way is to generate an event
with the same UID, but a RECURRENCE-ID that
indicates which item of the recurring sequence is
replaced by that other event.
E.g. a recurring event (daily) has:
BEGIN:VEVENT
UID:KOrganizer-702267947.838
DTSTART:20051018T093000Z
DTEND:20051018T160000Z
SUMMARY:recurring sequence
RRULE:FREQ=DAILY;COUNT=11;INTERVAL=1
END:VEVENT
If you want to change only tomorrow's event (e.g. move it from 9:30 UTC to
12:00 UTC), that would be
BEGIN:VEVENT
UID:KOrganizer-702267947.838
RECURRENCE-ID:20051019T093000Z
DTSTART:20051019T120000Z
DTEND:20051019T183000Z
SUMMARY:recurring sequence (one moved event)
END:VEVENT
-----------------------------------------------------------------
[1] http://www.ietf.org/rfc/rfc2445.txt
[2] http://pim.kde.org/korganizer
[3] http://www.apple.com/macosx/features/ical/
[4] http://www.mozilla.org/projects/calendar/sunbird.html
[5] http://phpicalendar.net/
[6] http://cr.yp.to/proto/maildir.html
[7] http://www.usenix.org/publications/library/proceedings/usits97/christenson.html
A Highly Scalable Electronic Mail Service Using Open Systems
Nick Christenson, Tim Bosserman, and David Beckemeyer
EarthLink Network, Inc.
[8] http://www.twinsun.com/tz/tz-link.htm
[9] ACID is an acronym for
Atomic : either all changes associated with a
transaction take place, or none do.
Consistent: the database is transformed from one valid
state to another valid state.
Isolated : a transaction's results are not visible to
other transactions until the transaction is
complete.
Durable : once committed, the results of a transaction
are permanent and survive future system and
media failures.