10.1. Preservation in the Semantic Desktop

The Personal Preservation Pilot is connected to the PoF Framework and preserves files either when the PoF Framework decides to do so or when the user triggers preservation manually. The pilot embeds this preservation in the user's file system, supported by the Semantic Desktop, which makes it easy to connect a user's files to the PoF Framework's preservation workflow.

The first solution preserves files either when the PoF Framework decides to do so or when the user triggers preservation manually. It embeds this preservation in the user's file system, supported by the PIMO, which makes it easy to connect a user's files to the PoF Framework's preservation process.

The PoF Framework accesses the CMIS (OASIS Content Management Interoperability Services) interface of the Semantic Desktop server to retrieve resources once it decides to preserve them (for details, see deliverable D8.3). The actual preservation process is not directly visible in the end-user interface of the pilot; as part of the pilot, however, it is visible in the PoF Middleware dashboard.

As preservation ideally runs in the background, users will not be aware of it. We therefore start with manual preservation to show the functionality in the Semantic Desktop.

10.1.1 Manual Preservation of Files

As the use case shown in the video illustrates, the pilot also lets a user preserve resources manually. To do this, the user browses to a file and selects "Preserve" in the file's context menu in the file explorer. The PoF Middleware then preserves the file without further user interaction, storing it in the preservation system on the ForgetIT sandbox environment. Once preserved, the file is modified locally and then restored from the archive. The video shows this interaction embedded in the SemanticFileExplorer and PIMOCloud environment.

Pilot II

The implementation of the Preservation Value assessment in Pilot II makes it possible to embed a Preservation Strategy into the Semantic Desktop. This Preservation Strategy is based on indicators from the Semantic Desktop and the Personal Information Model (PIMO) and is defined by users using policies and rules listed along the dimensions for assessment identified by ForgetIT. It is embedded in a user interface for defining the Preservation Service Contract with a service provider, including Preservation Levels for various Preservation Values.

Having Preservation Value assessment and individual Preservation Strategy now in place, Pilot II realises the full Preserve-or-Forget (PoF) Framework’s Preservation Preparation Workflow allowing the PoF Middleware to decide on preservation on its own terms, thus enabling Synergetic Preservation for the Active System Semantic Desktop.

These major extensions are presented in the following sections.

10.1.2 Motivation of Preservation Strategy in the Semantic Desktop

Whereas the focus of the Personal Preservation Pilot I was on manual preservation of resources, research and implementation done in ForgetIT now allows for a PV assessment of resources in order to enable Synergetic Preservation by the ForgetIT PoF Middleware connected to the Semantic Desktop.

In the ForgetIT PoF Framework this is done in the functional entity called Content Value Assessment defined in D8.5 (see also the Preservation Preparation Workflow).

This entity has the task of assessing resources with respect to Memory Buoyancy (MB) and Preservation Value (PV). In Pilot II we focus especially on the Preservation Value, which reflects the long-term importance or relevance of a resource. The PoF Middleware then uses this PV as a basis for making preservation decisions, e.g., whether a resource should be preserved or how much should be invested in its preservation, i.e., which Preservation Level (see Section 2.5.1) should be used.

To assess resources, it must first be clarified which aspects could contribute to long-term importance; the individual preferences of the user are then captured in a so-called Preservation Strategy (as addressed in D8.2 Section Preservation Strategies). The following sections explain how Pilot II supports a Personal Preservation Strategy.

To improve usability, such a Preservation Strategy can be defined along specific dimensions for assessment, which are detailed in Section 2.4.1. Pilot II allows Preservation Strategies to be defined in two ways: pre-sets based on dedicated user profiles (Section 2.4.2) or a combination of pre-sets and a more detailed customization using policies and rules (Section 2.4.3).

10.1.3 Preservation Strategy

The Preservation Strategy will be based on a set of policies and rules which support an assessment of the PV. In WP3, 6 dimensions for this assessment were identified (see deliverable D3.3) which are – from the point of view of the Semantic Desktop – relevant for the Personal Preservation Scenario. These are:

These are the dimensions identified for collecting evidence for the assessment and, finally, the calculation of the Preservation Value. A Preservation Strategy defines which evidence is used and with which weighting it contributes, through policies and rules, to the calculation of the PV. The following sections detail the two approaches realised in Pilot II. For a detailed explanation of the calculation, please refer to deliverable D3.4.

10.1.4 Personal Preservation Strategy based on Personas

To help users who are new to a personal preservation service, it is useful to have a predefined set of policies and rules for the 6 dimensions mentioned above, so that inexperienced users do not need to define a rather complex Preservation Strategy themselves.

In the following, we explain this Preservation Strategy. Please note that, once a Preservation Strategy is set, the user is usually not required to manually check all resources to be preserved.

The ForgetIT survey on personal preservation of photos conducted in WP2, with preliminary results published in [Wolters et al., 2015] and final results reported in deliverable D2.4, identified four personas representing attitudes towards personal Preservation Strategies. The personas are defined along two key preservation dimensions: “Loss” (the user is worried about losing important photos) and “Generations” (the importance to the user of preserving important photos for future generations).

To reflect these persona perspectives in the Preservation Value assessment, corresponding profiles were created for the Preservation Value calculation. The profiles mirror the basic distinction between Curators and Filers by mainly basing the preservation suggestions on the investment spent by the former, and an emphasis on popularity and material quality by the latter. For the safety-conscious persona profiles, the algorithm tries to ensure a certain coverage of the different subsets (e.g., photo collections) of the material to be preserved.
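As a rough illustration of how such profiles could steer the calculation, the following Python sketch scores a resource as a weighted sum over evidence for three of the dimensions mentioned above. The profile keys `curator` and `filer`, all weight values, and the function `preservation_value` are assumptions made for this example only; the coverage handling of the safety-conscious profiles is left out, and the actual calculation is the one described in deliverable D3.4.

```python
from typing import Dict

# Hypothetical weights per profile; the numbers are illustrative only.
# A curator-like profile emphasises investment, a filer-like profile
# emphasises popularity and material quality.
PROFILES: Dict[str, Dict[str, float]] = {
    "curator": {"investment": 0.7, "popularity": 0.2, "quality": 0.1},
    "filer": {"investment": 0.1, "popularity": 0.5, "quality": 0.4},
}


def preservation_value(evidence: Dict[str, float], profile: str) -> float:
    """Weighted sum of evidence scores in [0, 1]; missing evidence counts as 0."""
    weights = PROFILES[profile]
    return sum(w * evidence.get(dim, 0.0) for dim, w in weights.items())


# A photo the user invested a lot in, but which is rarely accessed.
photo = {"investment": 0.9, "popularity": 0.2, "quality": 0.5}
pv_curator = preservation_value(photo, "curator")
pv_filer = preservation_value(photo, "filer")
```

With these assumed weights, the same photo receives a higher value under the curator-like profile than under the filer-like one, which is the kind of difference the persona profiles are meant to express.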

This calculation is used in the WP9 evaluation scenario for personal preservation, and results will be reported in the upcoming deliverable D9.5. For this evaluation we will use hired participants who are unknown to us. They will spend only a limited time with the PIMO Photo Organization app. Therefore, their settings will be chosen based on an interview, as explained below:

In the first interview, the user will answer several questions used in the WP2 survey, which will result in an identification of the respective persona. The specific persona is then set as Preservation Strategy in the PIMO5 options as shown in Figure 9. (For a commercial personal preservation service, this could be accomplished by an online questionnaire embedded in the service contract settings; see Section 10.1.6.)

The list of Preservation Strategies available in the Personal Preservation Scenario. The Preservation Strategy was set to “Safe Curator”; now the time capsule can be invoked.
A user can change the Preservation Strategy profile in the PIMO5 options dialog. Pressing “Show time capsule” will present the time capsule to the user. This will be the setting for the WP9 final evaluation.

Once the Preservation Strategy is set, the so-called time capsule can be invoked by pressing the “time capsule” button. The selected Preservation Strategy is used for a new calculation of the Preservation Value and the time capsule view is opened as shown in the next Figure.

PIMO5 Time Capsule for evaluation purposes
The time capsule lists all photo collections with an indication of the ones that will be preserved (left side, gold marking) and those that will not be preserved (right side). The user is able to manually remove or add photos from/to the preserved list (the respective direction is shown by the yellow arrow).

The time capsule gives an overview of the photo collections and the respective set of photos to be preserved (on the left-hand side) and those which will not be preserved (on the right-hand side). Users are able to inspect these automated decisions and change them manually by clicking on a photo to either add it to or remove it from the list of photos to be preserved. Changes made by the users are logged and used for evaluating the persona-based PV assessment.

The next section will describe an extended view of inspecting preservation decisions, which is not restricted to photo collections only.

10.1.5 Customization for Preservation Strategies

In the previous section, the Preservation Strategy was based on personas, where the user cannot influence details and can only change the persona itself. To allow a more fine-grained Preservation Strategy setting for the Semantic Desktop along the preservation policies and rules proposed in D3.3 [ForgetIT, 2015b], the following more detailed strategy setting has been realised in Pilot II as an extension of the persona-only approach:

For the application scenario of Personal Information Management, several kinds of evidence, or indicators, were identified in the Semantic Desktop along the aforementioned 6 dimensions which could help in assessing the expected long-term benefit of a resource. The Semantic Desktop now allows users to adapt these to their preferences without having to be experts in policy or rule management:

The Figure below shows the PIMO5 user interface, which allows the user to set and now also to change the applied Preservation Strategy.

Personalizing a Preservation Strategy in PIMO5
Personalizing a Preservation Strategy in PIMO5: allowing to easily select/deselect those policies and rules which matter the most to the user.

Here, a more detailed choice of indicators (i.e., evidence from Semantic Desktop usage and the PIMO, used as policies as well as rules) for calculating the Preservation Value of the user’s resources is shown. Each indicator is described in human-readable terms that explain its effect to the user. Internally, each indicator is bound to a calculation or rule in the algorithm, which is then switched on or off.
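The binding between a user-facing indicator and an internal rule can be pictured with the following minimal Python sketch. The `Indicator` class, the two example indicators, and the averaging in `assess` are hypothetical and chosen for illustration; they are not the indicators or the algorithm used in Pilot II.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Indicator:
    key: str
    description: str                # human-readable text shown in the options dialog
    score: Callable[[Dict], float]  # rule bound to this indicator, returns a value in [0, 1]
    enabled: bool = True            # state of the check mark in the user interface


def assess(resource: Dict, indicators: List[Indicator]) -> float:
    """Average the scores of all enabled indicators; 0.0 if none is enabled."""
    active = [i for i in indicators if i.enabled]
    if not active:
        return 0.0
    return sum(i.score(resource) for i in active) / len(active)


# Hypothetical indicators, for illustration only.
indicators = [
    Indicator("rated", "Prefer resources I rated highly",
              lambda r: r.get("rating", 0) / 5),
    Indicator("annotated", "Prefer resources I annotated",
              lambda r: 1.0 if r.get("annotations") else 0.0),
]

resource = {"rating": 4, "annotations": []}
pv_all = assess(resource, indicators)

# The user deselects the second indicator in the dialog.
indicators[1].enabled = False
pv_rating_only = assess(resource, indicators)
```

Keeping the description next to the rule is one simple way to ensure that the text shown to the user and the calculation that is switched on or off cannot drift apart.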

For better readability, the set of policies and rules currently used in Pilot II is listed below along the 6 dimensions. A short explanation of the effect of choosing an indicator is given in parentheses.

These policies and rules could be even more detailed, e.g., by letting the user specify which types in the type-based heuristic are important. For this Personal Preservation Scenario, however, we think that simplicity is important for acceptance and take-up by the user.

As in the previous section, the user can choose the persona that describes him or her best (as done in Figure 12); alternatively, the persona could be inferred from preference questions, as done in the evaluation, and then set automatically. Selecting a persona loads a default profile and sets the check marks for those indicators which best fit the respective persona (including a pre-defined weighting).

Loading the Safe Curator persona Preservation Strategy in PIMO5
The Safe Curator persona Preservation Strategy has been loaded as a preset.

Once changed and saved, the settings are used in calculating the Preservation Values of the user’s resources in the periodic PV calculations in the Semantic Desktop.
This customized Preservation Strategy is part of a Preservation Service Contract which is explained in the following Section.

10.1.6 Preservation Service Contract

In addition to the Preservation Strategy, another detail to be considered in the Personal Preservation Pilot is which service infrastructure is used for preservation. In other words, who provides the actual service for preservation to the user and what are the conditions? As pointed out in deliverable D11.4, preservation could be offered as a service by a provider using a PoF Framework instance for preservation services and a subset of the Semantic Desktop infrastructure – most prominently the Photo Organization – as applications for customers.

Assuming such a service is established, a contract between the user and the company would state the conditions of the service and its costs. Moreover, such a contract would also detail how the service would be used, e.g., the amount of space to be consumed, preservation levels, how formats are migrated, etc. The ingredients of such a contract between a user and a service provider will be discussed in detail in deliverable D5.4, which also provides dedicated user interfaces.

To show a proof-of-concept for such a contract in the Personal Preservation Pilot II, the PIMO5 options page has been extended with basic options for selecting and changing conditions of a user’s service contract with a preservation service provider as shown in the next Figure under the tab “Contract”. The tab “Strategy” then leads to the Preservation Strategy settings as discussed above.

Preservation Service Contract in PIMO5
In the PIMO5 options, the user is able to set the Preservation Service Contract.

The following sections describe this basic service contract preference selection in detail.

10.1.7 Preservation Level Package

With regard to preservation, a large number of options and parameters have to be decided upon (see also D5.4). However, complex decisions on preservation parameters might discourage ordinary customers from choosing a preservation service.

Looking at the business of a telecom provider, pre-configured contracts are usually offered containing several bundles (flat rates for text messages, landline, broadband, mobile data, and options for roaming) making it (more or less) easier for customers to decide on suitable contracts.

As the goal of ForgetIT is to reduce the effort for Personal Preservation, offering such bundles of preservation options, as in telecom contracts, makes sense for the average customer, while advanced users should still get more detailed options to adapt a contract to personal preferences. Finally, the bundling of preservation services with normal telecom contracts holds the potential of bringing preservation to a larger audience.

Therefore, we assume that a preservation service provider would offer bundles of preservation functionalities and options to customers as packages, similar to mobile contracts. This allows customers to choose, for each Preservation Value (PV) Category (gold, silver, bronze), between several preservation levels offered as Preservation Level Packages, as well as “do not preserve” for a category, as shown in the Figure below.

Selecting Preservation Level Packages for each Preservation Value Category
In PIMO5 the user is able to select Preservation Level Packages for each Preservation Value Category.

Such a package contains a reasonable default set of preservation functionalities and options for a certain price. These packages could range from a basic package, which could be free with a mobile contract, up to packages with extended functionalities or security at additional cost. Furthermore, some options might be offered to the customer, such as an option in the package “Premium Preservation WorldWide” to limit storage to European countries only. Such options are addressed in deliverable D5.4.

Furthermore, we also assume that for a contract bundle, the PV Categories come with a pre-set for the Preservation Levels as well as for the Preservation Strategy to accomplish an uncomplicated contract activation.
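A contract bundle with such pre-sets could be represented roughly as in the following Python sketch. Only the category names and the package name “Premium Preservation WorldWide” are taken from the text; the package name `Basic`, the chosen defaults, and the function `configure` are assumptions made for illustration.

```python
from typing import Dict, Optional

PV_CATEGORIES = ("gold", "silver", "bronze")

# Hypothetical bundle pre-set: one Preservation Level Package per PV Category.
# None stands for "do not preserve".
DEFAULT_BUNDLE: Dict[str, Optional[str]] = {
    "gold": "Premium Preservation WorldWide",
    "silver": "Basic",
    "bronze": None,
}


def configure(overrides: Optional[Dict[str, Optional[str]]] = None) -> Dict[str, Optional[str]]:
    """Start from the bundle pre-set and apply the customer's own choices."""
    contract = dict(DEFAULT_BUNDLE)
    for category, package in (overrides or {}).items():
        if category not in PV_CATEGORIES:
            raise ValueError(f"unknown PV category: {category}")
        contract[category] = package
    return contract


# A customer who accepts the pre-set but also wants bronze items preserved.
contract = configure({"bronze": "Basic"})
```

Starting from a complete default and applying only the differences keeps contract activation uncomplicated for the average customer while still leaving room for individual changes.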

10.1.8 Alternative Contact Person

In deliverable D9.2 we identified the use case “death of a user” which addresses the requirement to hand over preserved material after the death of a user.

In the preservation contract tab, a contact from the PIMO can be entered who shall be contacted in case the user passes away. The Figure with the Preservation Service Contract section shows where a contact can be selected via a dropdown box listing all persons in the PIMO. This contact is actually a pimo:Person, which can either be imported from an address book together with address details or simply created in the Semantic Desktop.

Pilot II realises this as a proof-of-concept. It could be extended with more persons, with priorities, with situations (death, unable to act, etc.) in which to contact them, and with a commitment of the selected person, e.g., by exchanging e-mails. Furthermore, a functionality in the Semantic Desktop for small groups (such as a family or a department) makes it easy to integrate the selected pimo:Person as a user. This applies in case a pimo:Person is not yet a Semantic Desktop user and an e-mail address is part of the contact details: if this person creates an account in the Semantic Desktop using the same e-mail address, the user account is mapped to that instance of pimo:Person. This allows the new user to take over this pimo:Person. Should the situation occur, the PoF Framework can use this functionality to grant access rights to the preserved material to this new user account as well, provided that the preserved material is accessed via the Semantic Desktop.
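The account mapping described above can be sketched in a few lines of Python. The `Person` class merely stands in for a pimo:Person instance, and `register_account` as well as the case-insensitive comparison of e-mail addresses are assumptions of this sketch, not a description of the Semantic Desktop implementation.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Person:
    """Stand-in for a pimo:Person instance (hypothetical, simplified)."""
    name: str
    email: Optional[str] = None
    account_id: Optional[str] = None  # set once a user account has taken over


def register_account(persons: List[Person], account_id: str, email: str) -> Optional[Person]:
    """Map a newly created account to an unclaimed Person with the same e-mail address."""
    for person in persons:
        if (person.email is not None
                and person.account_id is None
                and person.email.lower() == email.lower()):
            person.account_id = account_id
            return person
    return None


persons = [Person("Alex Example", "alex@example.org"), Person("No Mail")]
claimed = register_account(persons, "acct-1", "Alex@Example.org")
unmatched = register_account(persons, "acct-2", "nobody@example.org")
```

Once the account is linked, access rights that were granted to the contact person can be resolved to a concrete user account.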

10.1.9 Preservation Broker Contract

The terms of this Preservation Service Contract are transferred in a so-called Preservation Broker Contract (see deliverable D5.4) to the PoF Middleware via the SD/PoF Adapter. In the PoF Framework, this adapter is part of the Active System and communicates with the PoF as depicted in the ForgetIT architecture diagram. The contract terms are then used in the Managed Forgetting & Appraisal step (see Preservation Preparation Workflow) to decide on the preservation actions.

10.1.10 Investigating the results of the chosen Preservation Strategy

Once a user has set the Preservation Strategy, saving it by pressing the button “Save setting & recalculate” starts a new PV calculation, as shown in the Figure below. Pressing the button “Show preservation overview” opens a view which provides the user with an overview of the current material from the PIMO which would be preserved (if the PoF Framework decides to start preserving) using the selected Preservation Strategy. Compared to the photo collection-only time capsule view shown above, which was used in the WP9 evaluation, this view is extended to all material from the PIMO, including documents as well as concepts such as projects and persons.

Preservation overview in PIMO5
For Pilot II, a dedicated preservation view has been integrated in PIMO5 to check the Preservation Strategy setting. This view allows the user to browse through all resources and concepts to be preserved. The screenshot shows an individual (and partly anonymized) view on the DFKI PIMO with a Safe Curator profile.

For easier inspection, the view lists those images which would be preserved under their respective pimo:LifeSituation. Currently, the list shows the resources with the highest Preservation Values first and allows the user to request more items, as shown in Figure 17.

PIMO resources to be preserved
Choosing the File & Forget Preservation Strategy generates this list which contains mainly photos as media items as well as some important documents.
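The "highest Preservation Values first, more on request" behaviour of this list can be approximated by the following Python sketch. The resource dictionaries, the `pv` key, and the generator `pages` are hypothetical and only illustrate the ordering and paging, not the PIMO5 implementation.

```python
from itertools import islice
from typing import Dict, Iterator, List


def pages(resources: List[Dict], page_size: int) -> Iterator[List[Dict]]:
    """Yield resources in pages, ordered by descending Preservation Value."""
    ranked = iter(sorted(resources, key=lambda r: r["pv"], reverse=True))
    while True:
        page = list(islice(ranked, page_size))
        if not page:
            return
        yield page


# Hypothetical resources with already calculated Preservation Values.
resources = [
    {"name": "holiday.jpg", "pv": 0.2},
    {"name": "thesis.pdf", "pv": 0.9},
    {"name": "wedding.jpg", "pv": 0.5},
]

overview = pages(resources, page_size=2)
first_page = next(overview)   # shown when the view opens
second_page = next(overview)  # shown when the user requests more items
```

Using a generator means that further pages are only produced when the user actually asks for more items.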

Clicking on a thing’s badge indicating the PV Category shows an explanation for the PV decision as shown in the Figure above and below.

Explaining the Preservation Value for a resource
A short explanation of the decision for PV using the Preservation Strategy settings. (Though, the explanation is not yet adapted for end users.)