KDE PIM/Meetings/Osnabrueck 4/Proposal icaldir

Osnabrück icaldir proposal


                             icaldir 



KDE PIM Team                                     Mark Bucciarelli
Request for Comments                             February 8, 2005
                                                

CONTENTS

    I.    PROBLEM 
    II.   SOLUTION
    III.  DIRECTORY LAYOUT 
    IV.   FILE NAMES
    V.    TIME ZONES
    VI.   OPERATIONS
    VII.  CONTACT


I.) PROBLEM

    This document proposes a maildir-like approach for storing 
    iCalendar data [1].  The motivation is to provide a scalable 
    way for client applications to coordinate concurrent updates 
    to iCalendar data.

    Currently, client programs that access iCalendar data (for 
    example, KDE's KOrganizer [2], Apple's iCal [3],  Mozilla's 
    Sunbird [4], the PHP iCalendar [5]) read and save iCalendar 
    data in one large file.  This works fine if there is one 
    program updating the calendar data but is cumbersome if 
    multiple clients are updating the same calendar data.

    Since all data is stored in a single file, each client must 
    lock the entire file to update one single event.  If one 
    client program wants to add an event for January 2006 and 
    another client is editing an old event in March 2005, one 
    must wait for the other one to finish.
    
    If an iCalendar file is changed, all client programs  
    displaying data from that file must rescan the entire file to 
    be sure they have the most current data.

    When editing a file, a program must either load the entire 
    file into memory and keep it there or reread the entire file 
    to find the right spot to edit.  For large calendars, the 
    first approach wastes memory and the second wastes disk I/O.


II.) SOLUTION

    The solution is inspired by and borrows heavily from the 
    maildir format [6].  Just as maildir broke the one large mbox 
    file into many small files (one for each email), we break the 
    iCalendar file into many small files, one for each iCalendar 
    event.

    Client applications then create a lock that is specific to 
    the file they are editing.  Any client that creates a lock 
    must touch the file periodically to update its attributes; 
    in this way, other clients can tell without question when a 
    lock file has become stale.  (See the 1997 USENIX paper on 
    the Earthlink Mail System for a great writeup on this 
    approach [7].)

    The icaldir is designed to be tolerant of network transience.  
    If a client application begins to edit a calendar entry but 
    then crashes or loses network connectivity before completing 
    the transaction,
    
        - there is no data loss and
          
        - the lock file unambiguously goes stale and clients are 
          free to update that object.


III.) DIRECTORY LAYOUT 


    .../icaldir
        |
        +- vcalendar_header
        |
        +- vcalendar_footer
        |
        +- /cur 
        |
        +- /tmp
    

III.A.) The vcalendar_header file

    This stores any iCalendar header information that comes 
    before the iCalendar objects start; for example,

        BEGIN:VCALENDAR
        VERSION:2.0
        PRODID:-//KDE::pim//icaldir//EN

III.B.) The vcalendar_footer file

    Contains any iCalendar information that comes after the 
    iCalendar objects end; for example,

        END:VCALENDAR 

III.C.) The cur directory

    This directory holds one file for every iCalendar object.  
    The file names are created according to Section IV.) FILE 
    NAMES.


III.D.) The tmp directory

    A temporary working directory.  No persistent data is stored here.
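
    The layout above can be sketched in Python as follows.  This 
    is a minimal illustration, not part of the proposal: the 
    function names init_icaldir and export_calendar are 
    hypothetical, and the sketch assumes that a complete iCalendar 
    stream is rebuilt by concatenating the header, every file in 
    cur, and the footer.

```python
from pathlib import Path

# Example header and footer contents taken from Sections III.A and III.B.
HEADER = b"BEGIN:VCALENDAR\r\nVERSION:2.0\r\nPRODID:-//KDE::pim//icaldir//EN\r\n"
FOOTER = b"END:VCALENDAR\r\n"


def init_icaldir(root: Path) -> None:
    """Create cur, tmp, vcalendar_header and vcalendar_footer under root."""
    (root / "cur").mkdir(parents=True, exist_ok=True)
    (root / "tmp").mkdir(parents=True, exist_ok=True)
    for name, content in (("vcalendar_header", HEADER),
                          ("vcalendar_footer", FOOTER)):
        path = root / name
        if not path.exists():  # never clobber an existing header or footer
            path.write_bytes(content)


def export_calendar(root: Path) -> bytes:
    """Assumed export: header + every object in cur (by name) + footer."""
    parts = [(root / "vcalendar_header").read_bytes()]
    for entry in sorted((root / "cur").iterdir()):
        if entry.is_file():
            parts.append(entry.read_bytes())
    parts.append((root / "vcalendar_footer").read_bytes())
    return b"".join(parts)
```

    Files in tmp are deliberately ignored by the export, since 
    that directory holds no persistent data.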



IV.) FILE NAMES


    (YYYYMMDD|R)-host-pid-timeslice.type

    where
        YYYYMMDD    : date event starts
        R           : event is recurring
        host        : hostname of client creating object
        pid         : process id of client program creating object
        timeslice   : exact time on client machine of object creation
        type        : ical object type; for example, journal
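
    A name following this pattern could be generated as in the 
    sketch below.  Several details are assumptions, since the 
    proposal leaves them open: the function name make_object_name 
    is hypothetical, the timeslice is encoded as microseconds 
    since the epoch, and hyphens in the host name are replaced so 
    that the hyphen-separated fields stay unambiguous.

```python
import os
import socket
import time
from datetime import date
from typing import Optional


def make_object_name(start: Optional[date], obj_type: str,
                     host: Optional[str] = None,
                     pid: Optional[int] = None,
                     timeslice: Optional[int] = None) -> str:
    """Build a name of the form (YYYYMMDD|R)-host-pid-timeslice.type.

    Passing start=None marks the object as recurring and yields "R".
    """
    prefix = "R" if start is None else start.strftime("%Y%m%d")
    if host is None:
        host = socket.gethostname()
    # Assumption: "-" and "/" are not allowed inside the host field.
    host = host.replace("-", "_").replace("/", "_")
    if pid is None:
        pid = os.getpid()
    if timeslice is None:
        # Assumption: microseconds since the epoch on the client machine.
        timeslice = time.time_ns() // 1000
    return f"{prefix}-{host}-{pid}-{timeslice}.{obj_type}"
```

    With explicit arguments, make_object_name(date(2006, 1, 15), 
    "event", host="my-host", pid=42, timeslice=7) returns 
    20060115-my_host-42-7.event.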


    TODO: review how maildir names files; ie host/pid/time
    TODO: hmmm, get ical text strings in file/names.h?? ;)
    TODO: think about sort order that default ls returns



V.) TIME ZONES

    
    The iCalendar RFC [1] mentions a time zone registry but
    does not formally define one.  The icaldir format 
    specifies the Olson tz database [8] as its time zone 
    registry.

    In icaldir, VTIMEZONE objects do not need to be 
    identified, but all TZID codes must be valid codes from 
    the Olson tz database.

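    Per RFC 2445, a TZID that names an entry in a global time 
    zone registry is prefixed with the solidus character (aka 
    forward slash).  Since icaldir TZIDs refer to the Olson 
    database, every TZID must carry this prefix; for example: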

        TZID=/America/New_York
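
    A client could check this rule roughly as follows.  The 
    sketch assumes Python 3.9 or later, whose zoneinfo module 
    reads the Olson database installed on the system; the 
    function name is_valid_tzid and its optional "known" 
    parameter are hypothetical.

```python
from typing import AbstractSet, Optional
from zoneinfo import available_timezones


def is_valid_tzid(tzid: str, known: Optional[AbstractSet[str]] = None) -> bool:
    """Return True if tzid is a solidus followed by a known Olson zone name.

    known lets the caller supply the set of zone names; by default the
    names come from the tz database visible to zoneinfo on this machine.
    """
    if not tzid.startswith("/"):
        return False  # icaldir requires the solidus prefix
    zones = available_timezones() if known is None else known
    return tzid[1:] in zones
```

    Passing an explicit set of names keeps the check usable on 
    platforms that ship no zoneinfo files.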

    TODO: Read through Unicode "Common Locale Data Repository" docs.
          http://www.unicode.org/cldr/

          Someone did a big chunk of work analyzing Olson db for 
          the CLDR project: see 
          http://www.unicode.org/cldr/data/docs/design/formatting/time_zone_localization.html

    TODO: What about Windows?

          Apparently, there are registry keys 
          (http://wiki.osafoundation.org/bin/view/Journal/JeffreyHarris20041119):

                SOFTWARE\Microsoft\Windows NT\CurrentVersion\Time Zones
                SYSTEM\CurrentControlSet\Control\TimeZoneInformation

          The first one is a listing of all timezones that the operating 
          system knows about and the second one is detail information on 
          what the current timezone is set to. 

    TODO: What about OSX?  Double check it comes with zoneinfo file.

    TODO: read: http://www.chronos-st.org/Discovering the Local Time Zone--Why It's a Hard Problem.html

          ** This is a good doc! **

    TODO: What about using "POSIX time zone rule literal"?
          ref: http://www.chronos-st.org/Installation.html
            

VI.) OPERATIONS


VI.A.) Create a New Calendar Object

    1. Generate a file name (follow same rules as maildir).

    2. Write file to tmp directory using this file name.
       Use the file name as object's iCalendar UID property.

    3. Move file from tmp to cur directory.
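
    The three steps above can be sketched as below.  The function 
    name create_object is hypothetical, and the sketch assumes 
    that tmp and cur live on the same file system, so that the 
    final rename is a single atomic step on POSIX systems.  The 
    caller is expected to have generated the name already and to 
    have used it as the UID inside the body.

```python
import os
from pathlib import Path


def create_object(root: Path, name: str, body: bytes) -> Path:
    """Write body to tmp/<name>, then move it to cur/<name>."""
    tmp_path = root / "tmp" / name
    cur_path = root / "cur" / name
    # O_EXCL makes the open fail if this name is already present in tmp.
    fd = os.open(tmp_path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o644)
    with os.fdopen(fd, "wb") as f:
        f.write(body)
        f.flush()
        os.fsync(f.fileno())  # push the data to disk before publishing it
    # Readers of cur see either no file or the complete file.
    os.rename(tmp_path, cur_path)
    return cur_path
```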


VI.B.) Update a Calendar Object

    1. Create new file in tmp with same filename as event plus a 
       .lock extension.  The file has one line, the refresh 
       interval (in seconds) at which you will touch this file.  
       This operation should fail if a file with this name 
       already exists.

    2. If lock file creation is successful, go to step 4.

    3. The iCalendar object may be in use by another process.  

       Read the refresh interval in the existing lock file and if 
       the file has not been touched for this many seconds, the 
       lock is stale.  Overwrite the lock file and go to step 4.  

       Otherwise, either stop or wait refresh interval seconds  
       and go to step 1.

    4. Create a temporary file in tmp with a file name that 
       follows the rules in Section IV.  Go to step 5.

    5. Write the new version of the event to this tmp file.  If 
       this process takes longer than the lock refresh interval 
       you specified in step 1, touch the lock file before the 
       interval elapses.  If you lose network connectivity for 
       more than the refresh interval, go back to step 1.

    6. Move file from tmp to cur.

    7. Delete the .lock file.
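
    A rough sketch of this update protocol follows.  The names 
    acquire_lock and update_object are hypothetical, as is the 
    temporary file name.  The sketch assumes that step 6 renames 
    the temporary file over the object's existing name in cur, 
    and it touches the lock only once; a slow writer would touch 
    it repeatedly.  Overwriting a stale lock is not atomic here, 
    so two clients could both claim the same stale lock.

```python
import os
import time
from pathlib import Path


def acquire_lock(root: Path, name: str, refresh: int) -> bool:
    """Steps 1-3: try to take tmp/<name>.lock; return True on success."""
    lock = root / "tmp" / (name + ".lock")
    try:
        # Step 1: O_EXCL makes creation fail if the lock already exists.
        fd = os.open(lock, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o644)
    except FileExistsError:
        try:
            # Step 3: stale if untouched for longer than its own interval.
            owner_refresh = int(lock.read_text().strip())
            age = time.time() - lock.stat().st_mtime
            if age <= owner_refresh:
                return False  # lock is live; caller may wait and retry
            fd = os.open(lock, os.O_WRONLY | os.O_TRUNC)
        except (FileNotFoundError, ValueError):
            return False  # lock vanished or is unreadable; caller retries
    with os.fdopen(fd, "w") as f:
        f.write(f"{refresh}\n")
    return True


def update_object(root: Path, name: str, body: bytes, refresh: int = 30) -> bool:
    """Steps 4-7: replace cur/<name> with body while holding the lock."""
    if not acquire_lock(root, name, refresh):
        return False
    lock = root / "tmp" / (name + ".lock")
    # Hypothetical temporary name; the proposal defers to Section IV.
    tmp_path = root / "tmp" / f"{name}.{os.getpid()}.new"
    try:
        with open(tmp_path, "wb") as f:
            f.write(body)
        os.utime(lock)  # touch the lock to show the writer is alive
        os.replace(tmp_path, root / "cur" / name)  # step 6
    finally:
        lock.unlink(missing_ok=True)  # step 7
    return True
```

    A caller that receives False can either give up or sleep for 
    the refresh interval and call update_object again.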


VI.C.) Delete a Calendar Object

    1. Delete the object's file from the cur directory.

    Note that since an update transaction is effectively an 
    atomic delete + insert, it is legal for one client process 
    to reinsert an event after another process has deleted it.

    TODO: Get lock when deleting?

VI.D.) Change Calendar Object StartDate

    Since the start date is in an object's file name, changing 
    the start date requires a rename.

    This is a delete + insert.
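
    One way to carry out this delete + insert is sketched below.  
    The function name change_start_date is hypothetical.  The 
    sketch inserts the new file before deleting the old one, so 
    that a crash in between leaves a duplicate rather than 
    losing the object; the proposal does not fix this order, nor 
    does it say whether the UID should follow the new file name.

```python
import os
from datetime import date
from pathlib import Path


def change_start_date(root: Path, old_name: str, new_start: date,
                      body: bytes) -> str:
    """Store body under a name carrying new_start, then drop the old file."""
    # Keep everything after the first hyphen: host-pid-timeslice.type
    _, rest = old_name.split("-", 1)
    new_name = new_start.strftime("%Y%m%d") + "-" + rest
    tmp_path = root / "tmp" / new_name
    with open(tmp_path, "wb") as f:
        f.write(body)
    os.replace(tmp_path, root / "cur" / new_name)  # insert
    if new_name != old_name:
        os.remove(root / "cur" / old_name)         # delete
    return new_name
```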

VII.) CONTACT

    Please send any comments on this document to Mark 
    Bucciarelli, c/o the KDE PIM mailing list 
    ([email protected]).

-----------------------------------------------------------------
SECTION TODO'S

    - mention possibility of multiple ical version within 
      same icaldir (Ingo).  Give example of min/max syntax 
      of VERSION tag.

    - specify that recurrence exceptions go in same file. 
      From Reinhold's email:

        If you change only one instance of a recurring
        event, the RFC-compliant way is to generate an event
        with the same UID, but a RECURRENCE-ID that
        indicates which item of the recurring sequence is
        replaced by that other event.

        E.g. a recurring event (daily) has:
          BEGIN:VEVENT
          UID:KOrganizer-702267947.838
          DTSTART:20051018T093000Z
          DTEND:20051018T160000Z
          SUMMARY:recurring sequence
          RRULE:FREQ=DAILY;COUNT=11;INTERVAL=1
          END:VEVENT

        If you want to change only tomorrow's event (e.g. move it from 9:30 UTC to
        12:00 UTC), that would be
          BEGIN:VEVENT
          UID:KOrganizer-702267947.838
          RECURRENCE-ID:20051019T093000Z
          DTSTART:20051019T120000Z
          DTEND:20051019T183000Z
          SUMMARY:recurring sequence (one moved event)
          END:VEVENT
-----------------------------------------------------------------


[1] http://www.ietf.org/rfc/rfc2445.txt

[2] http://pim.kde.org/korganizer

[3] http://www.apple.com/macosx/features/ical/

[4] http://www.mozilla.org/projects/calendar/sunbird.html

[5] http://phpicalendar.net/

[6] http://cr.yp.to/proto/maildir.html

[7] http://www.usenix.org/publications/library/proceedings/usits97/christenson.html
    A Highly Scalable Electronic Mail Service Using Open Systems
    Nick Christenson, Tim Bosserman, and David Beckemeyer
    EarthLink Network, Inc.

[8] http://www.twinsun.com/tz/tz-link.htm

[9] ACID is an acronym for

        Atomic    : either all changes associated with a 
                    transaction take place, or none do.

        Consistent: the database is transformed from one valid 
                    state to another valid state.

        Isolated  : a transaction's results are not visible to 
                    other transactions until the transaction is 
                    complete.

        Durable   : once committed, the results of a transaction 
                    are permanent and survive future system and 
                    media failures.