PIM/Akonadi/VirtualCollections: Difference between revisions

From KDE Community Wiki
Vkrause (talk | contribs)
m Add one more mailinglist thread.
Ochurlaud (talk | contribs)
m 24 revisions imported
 
Latest revision as of 13:01, 11 March 2016

All about virtual resources, virtual collections and searching. So far this is mostly a brain dump of ideas and requirements so we can come up with a nice design for all this. It is based on previous discussions about this topic, which are linked in the last section for reference.

Terms

  • Storage collection: An Akonadi collection that represents a storage location for items.
  • Virtual collection: An Akonadi collection that represents an arbitrary set of items stored in any storage collection. Virtual collections are implemented by "linking" to existing items, comparable to symlinks in a file system.
  • Virtual resource: An Akonadi resource that creates virtual collections instead of storage collections.

Assumptions

The assumptions that everything below is based on.

  • An Akonadi collection can either be a storage collection or a virtual collection, i.e. it can either contain "real" items or links to items stored elsewhere, never both.
  • An Akonadi resource can either contain storage collections or virtual collections, never both.
  • We do not support linked collections for now, due to lack of use-cases.
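
The assumptions above can be sketched as a small data model. This is only an illustration: Store, CollectionKind and all method names are hypothetical and not part of the Akonadi API. It shows that "real" items live only in storage collections, while virtual collections hold symlink-like links to them.

```cpp
#include <cstdint>
#include <map>
#include <set>

// Hypothetical, simplified model; none of these names are Akonadi API.
using ItemId = std::int64_t;
using CollectionId = std::int64_t;

enum class CollectionKind { Storage, Virtual };

class Store {
public:
    CollectionId addCollection(CollectionKind kind) {
        const CollectionId id = nextId_++;
        kinds_[id] = kind;
        return id;
    }

    // Creates a "real" item; only storage collections can own items.
    bool createItem(ItemId item, CollectionId col) {
        if (!isKind(col, CollectionKind::Storage) || owner_.count(item) > 0) {
            return false;
        }
        owner_[item] = col;
        return true;
    }

    // Links an existing item into a virtual collection (symlink-like).
    bool linkItem(ItemId item, CollectionId col) {
        if (!isKind(col, CollectionKind::Virtual) || owner_.count(item) == 0) {
            return false;
        }
        links_[col].insert(item);
        return true;
    }

    // Unlinking removes only the link; the stored item is left untouched.
    bool unlinkItem(ItemId item, CollectionId col) {
        const auto it = links_.find(col);
        return it != links_.end() && it->second.erase(item) > 0;
    }

    bool itemExists(ItemId item) const { return owner_.count(item) > 0; }

private:
    bool isKind(CollectionId col, CollectionKind kind) const {
        const auto it = kinds_.find(col);
        return it != kinds_.end() && it->second == kind;
    }

    CollectionId nextId_ = 1;
    std::map<CollectionId, CollectionKind> kinds_;        // collection -> kind
    std::map<ItemId, CollectionId> owner_;                // item -> storage collection
    std::map<CollectionId, std::set<ItemId>> links_;      // virtual collection -> linked items
};
```

Because a collection has exactly one kind, the "never both" rule holds by construction: createItem and linkItem each accept only one of the two kinds.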


Use-Cases for Virtual Collections

  • Representing the results of a persistent search (see also Akonadi Search Infrastructure):
    • in a per-session scope, e.g. the currently displayed week in KOrganizer
    • in a per-application scope, e.g. user-defined persistent search folders in KMail
    • in a global scope, e.g. user-defined persistent search folders shared between KMail, Mailody and LionMail.
  • The Nepomuk tag resource
    • Representing the entire tag-tree
      • It might make sense for the tag resource to insert multiple top-level collections into Collection::root(), with each top-level collection representing one mimetype. Child collections of that collection would represent tags, and the items in them would be items tagged with that tag. This has several complications, such as where to show items tagged with more than one tag. The simpler solution is to have a flat list of tag collections and make such items appear in each of their tag collections, but some day it might be good to have a navigable hierarchy of tags in a tree.


Virtual Resources

A resource that creates virtual collections instead of storage collections.

A virtual resource can exist in two forms:

  • Physical virtual resources: There is a conventional agent instance for the resource that operates like any other resource, but creates and maintains virtual collections instead of storage collections (e.g. the Nepomuk tag resource).
    • These have the same explicit life-time management by akonadi_control as any other resource.
    • A Virtual capability could indicate that they are virtual collections.
    • RIDs can only be edited by the resource itself, as usual
  • Virtual virtual resources: These have no agent instance responsible for their collections, but are used for scoping search folders. The Akonadi server itself is responsible for maintaining those, while the applications decide about their lifetime.
    • Lifetime can be session or permanent
    • Implicitly created and deleted by the Akonadi server whenever persistent searches are created or deleted.
    • Resource identifier is provided by the client, and is used to indicate the visibility of the resulting search folder (session internal, app global, desktop wide, etc).
      • I thought that allowing the client to set the RID was considered bad? I thought the RID would be available through a method on the job that created the search when its result() is emitted? -- Steve.
    • RIDs can be edited by anyone to simplify identification of search folders for clients
      • Where would this feature be used? I can see why a client would want to decide whether to show a particular newly available virtual collection, but why would it want to edit the RID?
    • Not visible for the AgentManager, shouldn't be a problem as the current internal search resource isn't either.

We definitely need better terms for these...
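
The lifetime rules for the second form (no agent instance, maintained by the server) might look roughly like the following sketch. It assumes the lifetime is fixed when the resource is implicitly created; SearchScopes, Lifetime and the method names are hypothetical and only illustrate the bookkeeping, not an actual server implementation.

```cpp
#include <map>
#include <set>
#include <string>

// Hypothetical server-side bookkeeping; not actual Akonadi server code.
enum class Lifetime { Session, Permanent };

class SearchScopes {
public:
    // Creating a persistent search implicitly creates the virtual resource
    // named by the client-provided identifier if it does not exist yet.
    void createSearch(const std::string &resourceId, const std::string &name,
                      Lifetime lifetime) {
        auto it = scopes_.find(resourceId);
        if (it == scopes_.end()) {
            it = scopes_.emplace(resourceId, Scope{lifetime, {}}).first;
        }
        it->second.searches.insert(name);
    }

    // Deleting the last search implicitly deletes the virtual resource.
    void deleteSearch(const std::string &resourceId, const std::string &name) {
        const auto it = scopes_.find(resourceId);
        if (it == scopes_.end()) {
            return;
        }
        it->second.searches.erase(name);
        if (it->second.searches.empty()) {
            scopes_.erase(it);
        }
    }

    // Session-scoped resources disappear when the session ends.
    void endSession() {
        for (auto it = scopes_.begin(); it != scopes_.end();) {
            if (it->second.lifetime == Lifetime::Session) {
                it = scopes_.erase(it);
            } else {
                ++it;
            }
        }
    }

    bool hasResource(const std::string &resourceId) const {
        return scopes_.count(resourceId) > 0;
    }

private:
    struct Scope {
        Lifetime lifetime;
        std::set<std::string> searches;
    };
    std::map<std::string, Scope> scopes_;
};
```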

Virtual Collections

The following lists properties of "normal" Akonadi::Collection objects and maps them to the corresponding semantics of virtual collections:

  • UID: stays the same
  • RID: for application use (currently it holds the query, but that's a dirty hack) (search folders only, still used for its original purpose by the tag resource)
  • Name: stays the same, for display purposes only
  • Resource Id: see virtual resources
  • Content mime-types: empty
    • used for determining whether DnD or copy/paste etc. should be allowed for a given destination collection, however that has to be done differently for virtual destinations anyway.
    • will make them implicitly hidden in ETMs with content mime type filters, which is desired
  • Parent: Can be either Collection::root() or another virtual collection of the same resource, that is we support nested virtual collections.
  • ACL: extend by link/unlink operations
    • Item creation: not allowed, would lack a storage location
    • Item modification: in theory possible if the storage collection allows it
    • Item deletion: in theory possible if the storage location allows it
    • Collection creation: depends on resources, useful for the Nepomuk tag resource, not for persistent search folders
      • Why is this useful for the Tag resource? Presumably, a tag might be created by either tagging an item, or by creating a tag without tagging an item. Perhaps by a TagCreateJob. Either way, the server would create the tag collection if it doesn't already exist, right? If the user was able to select 'Create tag...' from a context menu somewhere, that wouldn't be a CollectionCreateJob, but a TagCreateJob.
    • Collection modification
      • Name: no problem
      • RID: not allowed once there is an actual resource behind it, not a problem when used by the application
      • Query: requires special support in the server, but no problem in principle
      • Parent (that is moving): tricky for the tag resource, not really a problem for search folders, moving to non-virtual resources needs to be prevented.
        • How tricky this is for the tag resource depends on whether tag collections can be nested, and that's a whole other can of worms.
      • Any other attribute: no problem
    • Collection deletion: no effects on stored items/collections, so no problem
  • Query: for search folders only, should be moved into a dedicated attribute.

Note that virtual collections so far are not explicitly marked as such, to keep them as transparent as possible for the clients. However, since the interaction with them differs from that with storage collections, we need ways to distinguish them at least implicitly. Everything below, for example, is covered by the additional ACL properties for linking/unlinking.
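
The ACL mapping above could be expressed with a set of rights flags, as in this minimal sketch. CanLinkItem and CanUnlinkItem are the flag names used in the task list of this page; the numeric values, the VirtualKind enum and the helper functions are assumptions made for illustration only.

```cpp
#include <cstdint>

// Illustrative rights flags; values and helpers are assumptions.
enum Right : std::uint32_t {
    CanCreateItem = 1u << 0,
    CanChangeItem = 1u << 1,
    CanDeleteItem = 1u << 2,
    CanLinkItem   = 1u << 3,
    CanUnlinkItem = 1u << 4,
};
using Rights = std::uint32_t;

enum class VirtualKind { SearchFolder, TagCollection };

// Rights a virtual collection itself would advertise. Item creation is
// never included, since a virtual collection has no storage location.
Rights virtualCollectionRights(VirtualKind kind) {
    if (kind == VirtualKind::TagCollection) {
        return CanLinkItem | CanUnlinkItem;
    }
    return 0;  // search folders are read-only for clients
}

// Changing or deleting a linked item is decided by the rights of the
// storage collection that actually holds the item.
bool mayChangeLinkedItem(Rights storageRights) {
    return (storageRights & CanChangeItem) != 0;
}

bool mayDeleteLinkedItem(Rights storageRights) {
    return (storageRights & CanDeleteItem) != 0;
}
```

With this split, a client can tell virtual collections apart implicitly: they lack the item creation right and, in the tag case, carry the link/unlink rights instead.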

Interaction with Virtual Collections

This is about possible user interaction patterns with virtual collections.

Linking/Unlinking of Items

  • Copy/DnD a "real" item into a virtual collection: That would mean "linking" the item there.
    • Useful for the tag resource
    • Makes no sense for search folders
      • The reason for this is that any item which should match the search will already be in the search folder. Linking any random item to a search folder will not make it contain the search term.
  • Move/Copy/DnD a "link" out of a virtual collection into a storage one: That would mean moving/copying the item to a different storage location, not changing anything regarding the actual linking.
  • Moving/Copying/DnDing links between two virtual collections
    • Makes no sense for search folders, they are read-only for the clients
    • useful for the tag resource though
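
The rules above can be condensed into a lookup on the drop target, sketched here with hypothetical names (Kind, DropAction, resolveDrop). The sketch assumes that the kind of the destination alone decides what a drop means, which matches the cases listed above but is not a settled design.

```cpp
// Hypothetical names; a sketch of the rules above, not Akonadi API.
enum class Kind { Storage, SearchFolder, TagCollection };
enum class DropAction { CopyOrMoveItem, LinkItem, Reject };

// Under the rules above the source kind does not change the outcome:
// - storage target: the real item is copied/moved, links stay as they are
// - tag target: the item is linked there
// - search folder target: rejected, search folders are read-only for clients
DropAction resolveDrop(Kind target) {
    switch (target) {
    case Kind::Storage:
        return DropAction::CopyOrMoveItem;
    case Kind::TagCollection:
        return DropAction::LinkItem;
    case Kind::SearchFolder:
        return DropAction::Reject;
    }
    return DropAction::Reject;  // unreachable for valid enum values
}
```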

Modification/Deletion of Items

That requires accessing the ACL of the actual storage collection.

  • It should be possible to create a search for items matching the term 'giraffe' and delete the matched items. This could be a 'delete items' action, whereas clearing the search would be a 'remove search folder' action.

Operations on the Virtual Collection

  • Modifications
    • Renaming
      • Only for display purposes anyway for search folders
      • Rename the tag in case of the tag resource?
        • Should check with the Nepomuk people whether tags should be mutable. It's possible that they shouldn't be, as renaming might have unintended consequences. Tags may be URIs to Nepomuk. In that case, the operation would be 1) Get items tagged with X 2) Tag all resulting items with Y 3) Remove X tag from resulting items.
  • Moving
    • Only for display purposes for search folders
      • What does this mean?
    • Change the tag hierarchy?
  • Deletion
    • Delete persistent search
    • Delete tag
    • Both of these need to have a clear distinction between actions that will delete the items, and actions that will not delete the items.
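
If tags turn out to be immutable, the three-step rename described above could look like this sketch. TagIndex is a hypothetical in-memory stand-in for whatever stores the tag-to-item relation; it is not Nepomuk or Akonadi API.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <set>
#include <string>

// Hypothetical stand-in for the tag store: tag name -> tagged item ids.
using ItemId = std::int64_t;
using TagIndex = std::map<std::string, std::set<ItemId>>;

// "Renames" tag `from` to `to` without mutating the tag itself.
// Returns the number of items that were retagged.
std::size_t renameTag(TagIndex &index, const std::string &from,
                      const std::string &to) {
    const auto it = index.find(from);
    if (it == index.end() || from == to) {
        return 0;
    }
    const std::set<ItemId> items = it->second;      // 1) get items tagged with X
    index[to].insert(items.begin(), items.end());   // 2) tag those items with Y
    index.erase(from);                              // 3) remove tag X from them
    return items.size();
}
```

Note that none of the steps deletes an item; only the tag relation changes, which is the distinction the deletion bullet above asks for.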

Usage in ETM

  • hidden by default due to empty content mime-types field
  • can be explicitly subscribed by using the virtual resource identifier, so applications can show only the session-local/application-local/type-global virtual collections relevant to them
    • that does map nicely to search folders, not to the tag resource though...
      • If there is only one global tag resource things are more complicated than if there can be many. One global tag resource would have to contain all linked items for all mimetypes. If there can be multiple tag resources, one per mimetype, it may be easier for ETM to decide whether to present it to the user.

One proposal was to give particular virtual resources special RIDs with double underscores, so we could have "__tag_rfc822__", and the client would instruct the ETM to include the resource with that identifier.
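
How such filtering could work is sketched below. CollectionInfo, TreeFilter and isVisible are hypothetical names and do not reflect the actual ETM implementation; the sketch only shows how an empty content mime-type list hides a collection under a mime-type filter, and how an explicit subscription to a resource identifier overrides that.

```cpp
#include <algorithm>
#include <set>
#include <string>
#include <vector>

// Hypothetical types; not the real ETM or Monitor API.
struct CollectionInfo {
    std::string resourceId;
    std::vector<std::string> contentMimeTypes;  // empty for virtual collections
};

struct TreeFilter {
    std::set<std::string> mimeTypes;                   // empty means "no filter"
    std::set<std::string> subscribedVirtualResources;  // explicit subscriptions
};

bool isVisible(const CollectionInfo &col, const TreeFilter &filter) {
    // Explicitly subscribed resources are shown regardless of mime types.
    if (filter.subscribedVirtualResources.count(col.resourceId) > 0) {
        return true;
    }
    if (filter.mimeTypes.empty()) {
        return true;
    }
    // With a filter set, an empty mime-type list never matches, so
    // virtual collections are hidden by default.
    return std::any_of(col.contentMimeTypes.begin(), col.contentMimeTypes.end(),
                       [&filter](const std::string &mimeType) {
                           return filter.mimeTypes.count(mimeType) > 0;
                       });
}
```

A mail client would then add "__tag_rfc822__" to subscribedVirtualResources to see the mail tag collections, while other applications with a mime-type filter would not see them.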

Open Questions

  • RID usage in virtual collections
    • Using search terms for RIDs is problematic because search terms can change, for example during autocompletion with a lineedit -- Steve
  • What about the semantics of Item::parentCollection() in the context of virtual collections, e.g. when used in ItemFetchJob and Monitor?
    • Tricky. The fetching of items in ETM is currently implemented in a way that allows putting items into the Collection that the job was started with, i.e. it doesn't currently use parentCollection. So it should be possible to fetch linked items in virtual collections and insert them into the model where they are supposed to go. For modify jobs, the model would need to emit item changed for each model index representing a link to an item. This is already done IIRC. For moves and insertions the ETM uses parentCollection to determine the destination/source collection, but moves are not relevant to linked items. To them, a move is an unlink/link and an insert is a link. Removal is an unlink.
  • ETM needs to explicitly show virtual resources monitored in its Monitor, regardless of whether the mimetype matches the filter.
  • ETM may need to maintain a separate internal data structure for links.
  • The virtual resource section sounds scary...
  • How should we handle different item types in the tag resource? Applications probably only want one...

Tasks

Concrete tasks extracted from all the chaos above, for implementing virtual collection support. Topologically sorted by dependencies on each other.

Status      | Item                             | Description                                                                      | Contact
DONE        | Extend collection ACLs           | Add CanLinkItem/CanUnlinkItem flags                                              | Volker
DONE        | Add virtual capability           | Resource capability, propagate to Akonadi server and store it there              | Volker
DONE        | Fix item retrieval               | Make use of the knowledge of what is a virtual and what is a storage collection  | Volker
TO DO       | Move query from RID to attribute | Migrate legacy search folders, adapt persistent search creation                  |
TO DO       | Virtual virtual collections      | Lifetime management, client-side provided identifiers                            |
DONE        | Observer                         | Support link/unlink operations                                                   | Volker
DONE        | Adapt tag resource               | Set the correct ACLs, support link/unlink operations                             | Volker
IN PROGRESS | Interactions                     | Adapt StandardActionManager/PasteHelper/ETM etc. to respect the extended ACLs    | Volker
TO DO       | Content mime-types               | Report as empty, adapt ETM filtering if needed                                   |
TO DO       | ETM                              | Allow subscription to several arbitrary virtual resources                        | Steve

References

Past discussions about this topic: