Projects/Nepomuk/SystemLinkHandling: Difference between revisions

From KDE Community Wiki
Vhanda (talk | contribs)
Created the page
 
Vhanda (talk | contribs)
Added the alternative solution
 
(One intermediate revision by the same user not shown)
Latest revision as of 17:50, 3 December 2012

In Nepomuk, we always store the full URL of the file. While this makes our lives easier in certain cases, it also makes handling symbolic links (called "system links" on this page) really hard. This page documents how we deal with them.

We always try to store the fully qualified (canonical) URL. That means that, given the following directory structure:

DirA/FolderA
DirA/LinkA ---> DirB
DirB/FolderB
DirB/FolderB/FileC

We would save the URL of DirA/LinkA/FolderB/FileC as DirB/FolderB/FileC.
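The rule above can be sketched with the C++17 `std::filesystem` library, used here only as a stand-in for Qt's `QFileInfo::canonicalFilePath()`. The helper names `buildExampleTree` and `storedPathFor` are hypothetical and are not part of Nepomuk; the sketch assumes a platform where symbolic links can be created.

```cpp
#include <filesystem>
#include <fstream>

namespace fs = std::filesystem;

// Hypothetical helper: recreates the example tree from this page under `root`.
void buildExampleTree(const fs::path& root) {
    fs::create_directories(root / "DirA" / "FolderA");
    fs::create_directories(root / "DirB" / "FolderB");
    std::ofstream(root / "DirB" / "FolderB" / "FileC") << "data";
    // DirA/LinkA ---> DirB
    fs::create_directory_symlink(root / "DirB", root / "DirA" / "LinkA");
}

// Hypothetical helper: the path that would be stored for `path`.
// fs::canonical() resolves every symbolic link and returns an absolute path;
// it throws if the path does not exist.
fs::path storedPathFor(const fs::path& path) {
    return fs::canonical(path);
}
```

Calling `storedPathFor(root / "DirA/LinkA/FolderB/FileC")` on such a tree should give the same result as canonicalizing `root / "DirB/FolderB/FileC"` directly.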

File Watcher

We should not install a watch twice on the same directory. How do we go about ensuring this? There are a lot of edge cases.

Some of the tricky cases off the top of my head:

1. /media/disk/systemLink -> /home/vishesh/Music/SomeFolder/. When one unmounts /media/disk, one shouldn't remove the watches for /home/vishesh/Music/SomeFolder.
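One possible way to approach this, sketched under assumptions: key each real watch by its canonical path and remember every requested path (alias) that led to it. The class `WatchRegistry` and its methods are hypothetical, not the existing file watcher code, and the caller is assumed to supply the canonical path itself.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical bookkeeping: one real watch per canonical path,
// plus the set of requested paths that resolve to it.
class WatchRegistry {
public:
    // Returns true when the caller must install a real watch,
    // false when the canonical path is already being watched.
    bool add(const std::string& requested, const std::string& canonical) {
        std::set<std::string>& aliases = m_aliases[canonical];
        const bool first = aliases.empty();
        aliases.insert(requested);
        return first;
    }

    // Forgets every alias located under `prefix` (for example an unmounted
    // disk) and returns the canonical paths whose watch can now be removed.
    std::vector<std::string> removeUnder(const std::string& prefix) {
        std::vector<std::string> released;
        for (auto it = m_aliases.begin(); it != m_aliases.end();) {
            std::set<std::string>& aliases = it->second;
            for (auto a = aliases.begin(); a != aliases.end();) {
                if (isUnder(*a, prefix)) a = aliases.erase(a);
                else ++a;
            }
            if (aliases.empty()) {
                released.push_back(it->first);
                it = m_aliases.erase(it);
            } else {
                ++it;
            }
        }
        return released;
    }

    bool isWatched(const std::string& canonical) const {
        return m_aliases.count(canonical) > 0;
    }

private:
    // True if `path` equals `prefix` or lies inside it.
    static bool isUnder(const std::string& path, const std::string& prefix) {
        if (path == prefix) return true;
        std::string p = prefix;
        if (p.empty() || p.back() != '/') p += '/';
        return path.compare(0, p.size(), p) == 0;
    }

    std::map<std::string, std::set<std::string>> m_aliases;
};
```

In the case above, /home/vishesh/Music/SomeFolder would be registered once directly and once through /media/disk/systemLink. Calling `removeUnder("/media/disk")` then drops only the alias, so the watch on the folder in /home survives the unmount.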

File Indexer

When saving the URL of the file, we save the canonicalFilePath. However, when testing whether the file should be saved, one should probably test its relative URL.

How does one avoid cycles in BasicIndexingQueue?

Index Cleaner

If we only save the canonical file path, then the index cleaner will clear out the metadata of those resources when it is run. How do we counter that? We will need some way to map the canonical file URL to its many possible relative URLs and check whether those files should have been indexed before removing the metadata. This cannot be done in a query. It will have to be done manually. That will result in a large amount of CPU consumption when the number of files in the system linked directory is very large. For example, the user has a system link to /media/LargeMusicDatabase in their /home/ directory.
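The manual check described above could look roughly like the following sketch. It assumes the cleaner has a list of known links (link path and canonical target, both without a trailing slash) and a list of folders configured for indexing. `SymLink`, `aliasesFor`, and `shouldKeepMetadata` are hypothetical names, the path /home/user/Music is made up for illustration, and nested links are not handled.

```cpp
#include <string>
#include <vector>

// Hypothetical record of one known link: linkPath ---> target.
struct SymLink {
    std::string linkPath;
    std::string target;
};

// True if `path` equals `folder` or lies inside it.
bool isUnder(const std::string& path, const std::string& folder) {
    if (path == folder) return true;
    std::string prefix = folder;
    if (prefix.empty() || prefix.back() != '/') prefix += '/';
    return path.compare(0, prefix.size(), prefix) == 0;
}

// Every relative URL through which `canonical` can be reached via one link.
std::vector<std::string> aliasesFor(const std::string& canonical,
                                    const std::vector<SymLink>& links) {
    std::vector<std::string> result;
    for (const SymLink& link : links) {
        if (isUnder(canonical, link.target))
            result.push_back(link.linkPath + canonical.substr(link.target.size()));
    }
    return result;
}

// Metadata is kept if the canonical path or any alias is in an indexed folder.
bool shouldKeepMetadata(const std::string& canonical,
                        const std::vector<SymLink>& links,
                        const std::vector<std::string>& indexedFolders) {
    std::vector<std::string> candidates = aliasesFor(canonical, links);
    candidates.push_back(canonical);
    for (const std::string& candidate : candidates) {
        for (const std::string& folder : indexedFolders) {
            if (isUnder(candidate, folder)) return true;
        }
    }
    return false;
}
```

With a link /home/user/Music ---> /media/LargeMusicDatabase and /home/user as the only indexed folder, a file stored under /media/LargeMusicDatabase would be kept. The check runs once per stored file and per link, which is the CPU cost this section worries about.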

Resource

We need to be able to initialize a resource with all possible combinations of file URLs. How does one go about doing this?

When given Resource("DirA/LinkA/FolderB/FileC") we would need to translate it to DirB/FolderB/FileC. This can easily be done using QFileInfo::canonicalFilePath(). How would we do the inverse? There might be many URLs whose canonical file path is DirB/FolderB/FileC. We don't really ever need to do the inverse, do we?

ResourceManager cache

The ResourceManager has a cache which maps the nie:url to the resource URI. Should we be storing the system link URLs in this cache? It doesn't seem like a good idea, because then we would have to keep the cache up to date, which is quite hard.
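If link URLs are kept out of the cache, one option is to canonicalize on both insert and lookup so that only canonical URLs are ever used as keys. The sketch below is an assumption about how that might look, not the actual ResourceManager code. `UriCache` is a hypothetical class, and the canonicalizer is injected so that the example does not depend on a real file system.

```cpp
#include <cstddef>
#include <functional>
#include <map>
#include <optional>
#include <string>
#include <utility>

// Hypothetical cache from canonical nie:url to resource URI.
class UriCache {
public:
    using Canonicalizer = std::function<std::string(const std::string&)>;

    explicit UriCache(Canonicalizer canonicalize)
        : m_canonicalize(std::move(canonicalize)) {}

    // The key is always the canonical form, so link URLs are never stored.
    void insert(const std::string& url, const std::string& resourceUri) {
        m_entries[m_canonicalize(url)] = resourceUri;
    }

    // Lookups through a link URL resolve to the same entry.
    std::optional<std::string> find(const std::string& url) const {
        const auto it = m_entries.find(m_canonicalize(url));
        if (it == m_entries.end()) return std::nullopt;
        return it->second;
    }

    std::size_t size() const { return m_entries.size(); }

private:
    Canonicalizer m_canonicalize;
    std::map<std::string, std::string> m_entries;
};
```

This keeps one entry per file regardless of how many links point at it, at the price of a canonicalization on every lookup. Entries can still go stale if a link is retargeted after the lookup.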

Alternative Solution

One really neat solution would be to not store the nie:url at all, and only rely on nfo:fileName and nie:isPartOf. That way system link handling might be easier. Might.

That will however only happen with KDE 4.11. It's too big of a change to introduce in KDE 4.10.
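To illustrate what relying only on nfo:fileName and nie:isPartOf would mean, here is a small in-memory sketch. `FileResource` and `pathOf` are hypothetical; the `parent` index merely stands in for the nie:isPartOf relation, and the sketch shows only path reconstruction, not link handling itself.

```cpp
#include <string>
#include <vector>

// Hypothetical in-memory stand-in for a file resource.
struct FileResource {
    std::string fileName;  // plays the role of nfo:fileName
    int parent;            // index of the nie:isPartOf folder, -1 at the top
};

// Rebuilds the path of resource `id` by walking up the isPartOf chain.
std::string pathOf(const std::vector<FileResource>& store, int id) {
    std::string path;
    for (int cur = id; cur != -1; cur = store.at(cur).parent) {
        const std::string& name = store.at(cur).fileName;
        path = path.empty() ? name : name + "/" + path;
    }
    return path;
}
```

Because no full URL is stored, renaming or moving a folder touches a single resource, and the paths of everything below it follow automatically.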