Projects/Nepomuk/FileIndexing
Latest revision as of 01:23, 6 November 2012
This page catalogues the file formats Nepomuk can index and the formats that still need support.
Mime Types
MimeType | Status | Plugin | Comments |
---|---|---|---|
image/jpeg | Testing | Exiv2Extractor | No Comments |
image/png | Testing | Exiv2Extractor | - |
image/gif | ? | ? | |
image/exif | | | |
image/tiff | | | |
image/bmp | | | |
image/svg | | | |
audio/mpeg | Requires Polish | Taglib Extractor | |
audio/mp4 | | | |
audio/wav | | | |
audio/x-aiff | | | |
application/pdf | Implemented - Requires Testing | PopplerExtractor | --- |
Other Office Formats | ? | | |
Ebook Formats | ? | | |
Archives | ? | | |
video/mpeg | Testing | FFmpeg | |
video/x-msvideo | Testing | FFmpeg | |
Other video formats | ? | | |
text/plain | Implemented | Plain Text Extractor | This should be extended to support other text files |
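One way to picture how the table could drive indexing is a simple lookup from mime type to plugin. The sketch below is only an illustration in Python, not Nepomuk's actual plugin loading code: the plugin names are copied from the table, while `EXTRACTORS` and `extractor_for` are hypothetical names invented for this example.

```python
# Hypothetical registry built from the table above. The plugin names come
# from the table; the dictionary and function names are assumptions.
EXTRACTORS = {
    "image/jpeg": "Exiv2Extractor",
    "image/png": "Exiv2Extractor",
    "audio/mpeg": "Taglib Extractor",
    "application/pdf": "PopplerExtractor",
    "video/mpeg": "FFmpeg",
    "video/x-msvideo": "FFmpeg",
    "text/plain": "Plain Text Extractor",
}


def extractor_for(mimetype):
    """Return the plugin listed for a mime type, or None if none is listed."""
    # Drop parameters such as "; charset=utf-8" and normalise the case.
    key = mimetype.split(";", 1)[0].strip().lower()
    return EXTRACTORS.get(key)
```

Mime types that have no plugin in the table, such as image/gif, simply return `None`, which makes the remaining gaps easy to list.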
Notes
Documents
Microsoft Formats
- DOC - OLE 2 Compound Document and Office Open XML. Strigi used a custom parser. What can we use?
- XLS - http://qt-project.org/wiki/Handling_Microsoft_Excel_file_format
- Other spreadsheet formats
Maybe we can use some LibreOffice or Calligra libraries?
Open document formats
ODF - Strigi had its own built-in parser. What are our options?
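As one possible option, an ODF package is a zip archive that stores its document metadata in a `meta.xml` entry, so basic fields can be read without a full ODF parser. The following is a minimal Python sketch under that assumption, not a proposal for the actual extractor; `read_odf_metadata` is a hypothetical name, and only the Dublin Core title and creator fields are read.

```python
import zipfile
import xml.etree.ElementTree as ET

# Dublin Core namespace used for fields such as dc:title in meta.xml.
DC_NS = "http://purl.org/dc/elements/1.1/"


def read_odf_metadata(source):
    """Return the title and creator found in an ODF package.

    `source` is a file path or a binary file-like object. A KeyError is
    raised if the package has no meta.xml entry.
    """
    with zipfile.ZipFile(source) as package:
        root = ET.fromstring(package.read("meta.xml"))

    result = {}
    for field in ("title", "creator"):
        element = root.find(".//{%s}%s" % (DC_NS, field))
        # Skip fields that are absent or empty.
        if element is not None and element.text:
            result[field] = element.text
    return result
```

Calling `read_odf_metadata("report.odt")` on a hypothetical file would return a dictionary containing whichever of the two fields the document defines.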
Ebook formats
- epub - Strigi reuses its ODF parser for epub. We could use libepub.
- mobi
- rtf
- lrf
Check out what Okular uses for these formats and reuse it.
Other
- lyx
- tex
- cbz - Comic books
Archives
We just need to add the nfo:Archive type based on the mimetype. Is there anything else that we can add?
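The mimetype-based typing described above can be sketched as a set lookup. This is an illustrative Python sketch, not the indexer's code: the list of archive mime types is an incomplete assumption, and `ARCHIVE_MIMETYPES` and `types_for` are hypothetical names.

```python
# Namespace of the NEPOMUK File Ontology (NFO).
NFO = "http://www.semanticdesktop.org/ontologies/2007/03/22/nfo#"

# Assumed, incomplete list of mime types to treat as archives.
ARCHIVE_MIMETYPES = {
    "application/zip",
    "application/x-tar",
    "application/gzip",
    "application/x-bzip2",
    "application/x-7z-compressed",
}


def types_for(mimetype):
    """Return the extra RDF types to attach to a file with this mime type."""
    types = []
    if mimetype in ARCHIVE_MIMETYPES:
        types.append(NFO + "Archive")
    return types
```

Keeping the decision in a single set means new archive formats only need one more entry, with no format-specific parsing.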
Emails
- mbox format - How? Something from pim?
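To give a feel for what an mbox extractor would need to pull out, here is a small sketch using the `mailbox` module from Python's standard library. It is only an illustration of the data involved, not an answer to which PIM library Nepomuk should use; `mbox_headers` and the choice of headers are assumptions made for this example.

```python
import mailbox


def mbox_headers(path):
    """Return one dict of basic headers per message in an mbox file.

    Headers missing from a message are returned as None.
    """
    # create=False makes a missing file an error instead of creating it.
    box = mailbox.mbox(path, create=False)
    try:
        return [
            {
                "from": message["From"],
                "to": message["To"],
                "subject": message["Subject"],
                "date": message["Date"],
            }
            for message in box
        ]
    finally:
        box.close()
```

Each returned dictionary corresponds to one message, so an indexer could turn every entry into its own email resource.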