KDE PIM/Meetings/Osnabrueck 4/Proposal icaldir
Osnabrück icaldir proposal
icaldir                                              KDE PIM Team
                                                 Mark Bucciarelli
Request for Comments                             February 8, 2005

CONTENTS

    I.    PROBLEM
    II.   SOLUTION
    III.  DIRECTORY LAYOUT
    IV.   FILE NAMES
    V.    TIME ZONES
    VI.   OPERATIONS
    VII.  CONTACT

I.) PROBLEM

This document proposes a maildir-like approach for storing iCalendar data [1]. The motivation is to provide a scalable way for client applications to coordinate concurrent updates to iCalendar data.

Currently, client programs that access iCalendar data (for example, KDE's KOrganizer [2], Apple's iCal [3], Mozilla's Sunbird [4], the PHP iCalendar [5]) read and save iCalendar data in one large file. This works fine if only one program updates the calendar data, but it is cumbersome if multiple clients update the same calendar data.

Since all data is stored in a single file, each client must lock the entire file to update a single event. If one client program wants to add an event for January 2006 and another client is editing an old event in March 2005, one must wait for the other to finish.

If an iCalendar file is changed, all client programs displaying data from that file must rescan the entire file to be sure they have the most current data.

When editing a file, a program must either load the entire file into memory and keep it there, or reread the entire file to find the right spot to edit. For large calendars, either approach uses system resources inefficiently.

II.) SOLUTION

The solution is inspired by and borrows heavily from the maildir format [6]. As maildir broke the one large mbox file into many small files (one for each email), we break the iCalendar file into many small files, one for each iCalendar object.

Client applications then create a lock that is specific to the file they are editing. Any client that creates a lock must touch the lock file periodically to update its attributes; in this way, other clients can tell without question when a lock file has become stale.
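As an illustration only, the per-object lock and its staleness test might be sketched as follows. The sketch follows the lock-file convention that Section VI.B of this proposal spells out (a one-line file holding the refresh interval in seconds). The function names are hypothetical, and it assumes a filesystem where exclusive creation (O_EXCL) behaves atomically, which the proposal itself does not guarantee.

```python
import os
import time


def lock_is_stale(lock_path, now=None):
    """Return True if the lock has not been touched within its own interval."""
    try:
        with open(lock_path) as f:
            interval = float(f.readline().strip())
    except ValueError:
        # Lock file exists but its interval is not written yet: treat as live.
        return False
    current = time.time() if now is None else now
    return current - os.path.getmtime(lock_path) > interval


def try_acquire_lock(lock_path, refresh_seconds):
    """Try to take the lock; return True on success, False if it is held."""
    try:
        # O_EXCL makes creation fail if the lock file already exists.
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o644)
    except FileExistsError:
        if not lock_is_stale(lock_path):
            return False
        # Stale lock: overwrite it. Two clients could both reach this
        # point at once; the sketch does not try to close that window.
        with open(lock_path, "w") as f:
            f.write(f"{refresh_seconds}\n")
        return True
    with os.fdopen(fd, "w") as f:
        f.write(f"{refresh_seconds}\n")
    return True


def refresh_lock(lock_path):
    """Touch the lock file so other clients see that it is still live."""
    os.utime(lock_path, None)
```

A client holding the lock would call refresh_lock more often than once per refresh_seconds, so that a missed refresh can only mean the client crashed or lost connectivity.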
(See the 1997 USENIX paper on the EarthLink Mail System for a great writeup on this approach [7].)

The icaldir is designed to be tolerant of network transience. If a client application begins to edit a calendar entry but then crashes or loses network connectivity before completing the transaction,

    - there is no data loss and
    - the lock file unambiguously goes stale and clients are free to update that object.

III.) DIRECTORY LAYOUT

    .../icaldir
         |
         +- vcalendar_header
         |
         +- vcalendar_footer
         |
         +- /cur
         |
         +- /tmp

III.A.) The vcalendar_header file

This stores any iCalendar header information that comes before the iCalendar objects start; for example,

    BEGIN:VCALENDAR
    VERSION:2.0
    PRODID:-//KDE::pim//icaldir//EN

III.B.) The vcalendar_footer file

Contains any iCalendar information that comes after the iCalendar objects end; for example,

    END:VCALENDAR

III.C.) The cur directory

This directory holds one file for every iCalendar object. The file names are created according to Section IV.) FILE NAMES.

III.D.) The tmp directory

A temporary working directory. No persistent data is stored here.

IV.) FILE NAMES

    (YYYYMMDD|R)-host-pid-timeslice.type

where

    YYYYMMDD  : date event starts
    R         : event is recurring
    host      : hostname of client creating object
    pid       : process id of client program creating object
    timeslice : exact time on client machine of object creation
    type      : ical object type; for example, journal

TODO: review how maildir names files; i.e. host/pid/time
TODO: hmmm, get ical text strings in file/names.h?? ;)
TODO: think about sort order that default ls returns

V.) TIME ZONES

The iCalendar RFC [1] mentions a time zone registry but does not formally define one. The icaldir format specifies the Olson tz database [8] as its time zone registry. In icaldir, VTIMEZONE objects do not need to be identified, but all TZID codes must be valid codes from the Olson tz database.
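A minimal sketch of such a validity check follows. It assumes Python's standard zoneinfo module as the source of Olson zone names; the function name is hypothetical, and the set of names can also be passed in explicitly when the host has no tz data installed.

```python
import zoneinfo


def is_valid_tzid(tzid, known_zones=None):
    """Return True if tzid names a zone in the Olson tz database.

    A leading solidus, as in /America/New_York, is accepted and
    stripped before the lookup.
    """
    if known_zones is None:
        # Assumption: the host's tz data stands in for the registry.
        known_zones = zoneinfo.available_timezones()
    name = tzid[1:] if tzid.startswith("/") else tzid
    return name in known_zones
```

For example, is_valid_tzid("/America/New_York", {"America/New_York"}) is true, while a made-up name such as "/Mars/Olympus" is rejected.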
Per RFC 2445 [1], a TZID that names an entry in a global time zone registry is prefixed with the solidus character (aka forward slash), so every TZID in an icaldir carries that prefix; for example:

    TZID=/America/New_York

TODO: Read through Unicode "Common Locale Data Repository" docs.
      http://www.unicode.org/cldr/
      Someone did a big chunk of work analyzing Olson db for the CLDR project: see
      http://www.unicode.org/cldr/data/docs/design/formatting/time_zone_localization.html

TODO: What about Windows? Apparently, there are registry keys
      (http://wiki.osafoundation.org/bin/view/Journal/JeffreyHarris20041119):

          SOFTWARE\\Microsoft\\Windows NT\\CurrentVersion\\Time Zones
          SYSTEM\\CurrentControlSet\\Control\\TimeZoneInformation

      The first one lists all time zones that the operating system knows about, and the second one holds the details of the currently configured time zone.

TODO: What about OSX? Double check it comes with zoneinfo file.

TODO: read: http://www.chronos-st.org/Discovering the Local Time Zone--Why It's a Hard Problem.html
      ** This is a good doc! **

TODO: What about using "POSIX time zone rule literal"?
      ref: http://www.chronos-st.org/Installation.html

VI.) OPERATIONS

VI.A.) Create a New Calendar Object

    1. Generate a file name (follow same rules as maildir).
    2. Write file to tmp directory using this file name. Use the file name as object's iCalendar UID property.
    3. Move file from tmp to cur directory.

VI.B.) Update a Calendar Object

    1. Create a new file in tmp with the same file name as the event plus a .lock extension. The file has one line: the refresh interval, in seconds, at which you will touch this file. This operation should fail if a file with this name already exists.
    2. If lock file creation is successful, go to step 4.
    3. The iCalendar object may be in use by another process. Read the refresh interval in the existing lock file; if the file has not been touched for this many seconds, the lock is stale. Overwrite the lock file and go to step 4. Otherwise, either stop or wait refresh interval seconds and go to step 1.
    4. Create a temporary file in tmp, named according to the file name rules in Section IV. Go to step 5.
    5. Write the new version of the event to this tmp file. If this process takes longer than the lock refresh interval you specified in step 1, touch the lock file before the interval elapses. If you lose network connectivity for more than the refresh interval, go back to step 1.
    6. Move file from tmp to cur.
    7. Delete the .lock file.

VI.C.) Delete a Calendar Object

    1. Delete the object's file from the cur directory.

Note that since an update transaction is effectively an atomic delete + insert, it is legal for one client process to reinsert an event after another process has deleted it.

TODO: Get lock when deleting?

VI.D.) Change Calendar Object StartDate

Since the start date is in an object's file name, changing the start date requires a rename. This is a delete + insert.

VII.) CONTACT

Please send any comments on this document to Mark Bucciarelli, c/o the KDE PIM mailing list <a href="mailto:[email protected]">[email protected]</a>.

-----------------------------------------------------------------

SECTION TODOS

- mention possibility of multiple ical versions within same icaldir (Ingo). Give example of min/max syntax of VERSION tag.

- specify that recurrence exceptions go in same file. From Reinhold's email:

  If you change only one instance of a recurring event, the RFC-compliant way is to generate an event with the same UID, but a RECURRENCE-ID that indicates which item of the recurring sequence is replaced by that other event. E.g. a recurring event (daily) has:

      BEGIN:VEVENT
      UID:KOrganizer-702267947.838
      DTSTART:20051018T093000Z
      DTEND:20051018T160000Z
      SUMMARY:recurring sequence
      RRULE:FREQ=DAILY;COUNT=11;INTERVAL=1
      END:VEVENT

  If you want to change only tomorrow's event (e.g. move it from 9:30 UTC to 12:00 UTC), that would be

      BEGIN:VEVENT
      UID:KOrganizer-702267947.838
      RECURRENCE-ID:20051019T093000Z
      DTSTART:20051019T120000Z
      DTEND:20051019T183000Z
      SUMMARY:recurring sequence (one moved event)
      END:VEVENT

-----------------------------------------------------------------

[1] http://www.ietf.org/rfc/rfc2445.txt
[2] http://pim.kde.org/korganizer
[3] http://www.apple.com/macosx/features/ical/
[4] http://www.mozilla.org/projects/calendar/sunbird.html
[5] http://phpicalendar.net/
[6] http://cr.yp.to/proto/maildir.html
[7] http://www.usenix.org/publications/library/proceedings/usits97/christenson.html
    A Highly Scalable Electronic Mail Service Using Open Systems
    Nick Christenson, Tim Bosserman, and David Beckemeyer
    EarthLink Network, Inc.
[8] http://www.twinsun.com/tz/tz-link.htm
[9] ACID is an acronym for

    Atomic     : either all changes associated with a transaction take place, or none do.
    Consistent : the database is transformed from one valid state to another valid state.
    Isolated   : a transaction's results are not visible to other transactions until the transaction is complete.
    Durable    : once committed, the results of a transaction are permanent and survive future system and media failures.
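As a closing illustration, the create operation of Section VI.A, combined with the file name pattern of Section IV, might be sketched as below. This is a sketch under assumptions, not part of the proposal: the function names are hypothetical, the timeslice is taken to be a nanosecond timestamp, and the rename is assumed to be atomic, which holds on POSIX systems when tmp and cur live on the same filesystem.

```python
import os
import socket
import time


def make_file_name(start_date, obj_type, recurring=False):
    """Build a name of the form (YYYYMMDD|R)-host-pid-timeslice.type."""
    prefix = "R" if recurring else start_date.strftime("%Y%m%d")
    # Note: a hostname containing "-" makes the name harder to parse
    # back into fields; the proposal leaves this detail open.
    host = socket.gethostname()
    timeslice = time.time_ns()  # assumption: nanoseconds since the epoch
    return f"{prefix}-{host}-{os.getpid()}-{timeslice}.{obj_type}"


def create_object(icaldir, name, ical_text):
    """Write a new object into tmp, then move it into cur."""
    tmp_path = os.path.join(icaldir, "tmp", name)
    cur_path = os.path.join(icaldir, "cur", name)
    # Mode "x" refuses to overwrite an existing temporary file.
    with open(tmp_path, "x", encoding="utf-8", newline="") as f:
        f.write(ical_text)
        f.flush()
        os.fsync(f.fileno())
    # Readers of cur never see a half-written file.
    os.rename(tmp_path, cur_path)
    return cur_path
```

A caller would generate the name first, use it as the UID property inside the object text (for example "BEGIN:VJOURNAL\r\nUID:" + name + "\r\n...END:VJOURNAL\r\n"), and then pass both to create_object.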