Bug 30 - Files & Places
Summary: Files & Places
Status: ASSIGNED
Alias: None
Product: Varia
Classification: Unclassified
Component: Flow
Version: unspecified
Hardware: PC Linux
Importance: P1 normal
Assignee: Werner Van Belle
URL:
Depends on: 49 46 47 48 50
Blocks: 31
Reported: 2007-09-23 16:14 CEST by Werner Van Belle
Modified: 2007-09-30 11:17 CEST

Description Werner Van Belle 2007-09-23 16:14:34 CEST
Handling of files and places is necessary before a transition can start working on the data.
Comment 1 Werner Van Belle 2007-09-30 10:32:43 CEST
Options for getting files:
- simply copy the named file; in that case we do not know the content of a place
- execute a script at the place to retrieve/create the dataview on the fly
Putting data:
- add, delete, or overwrite a file at the destination place.

We need the ability to
- retrieve a named file
- list the files available
- put a new file at the destination
- add a new file
- delete a file

We can use the filesystem for this, but we need a set of commands to provide all these functions. How can we know that a specific filename does not exist yet?

We cannot write into a file in place, since that does not work in a distributed environment. Opening a file is only possible once we have retrieved it. So we have:
Get
Put
Del

We need a method to make sure that incoming files do not collide. This can be done by adding a prefix or postfix to each file. This postfix must be provided as

Get <zone> <target> <files>
Put <zone> <target> <files>
Del <zone> <file>

To obtain the list of files or specific indices we can use
List <zone> 
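
The commands above can be sketched against a plain directory tree. This is only an illustration under stated assumptions, not the implementation in Flow: the `Place` class, its method names, and the mapping of a zone to a subdirectory are all hypothetical, and the sketch assumes that the collision-avoiding postfix is what `Put` receives as its second argument, which the text leaves open.

```python
import shutil
import tempfile
from pathlib import Path


class Place:
    """Hypothetical place backed by a directory; each zone is a subdirectory."""

    def __init__(self, root):
        self.root = Path(root)

    def _zone(self, zone):
        path = self.root / zone
        path.mkdir(parents=True, exist_ok=True)
        return path

    def list_files(self, zone):
        """List <zone>: names of the files currently in the zone."""
        return sorted(p.name for p in self._zone(zone).iterdir() if p.is_file())

    def put(self, zone, postfix, files):
        """Put: copy local files into the zone, appending the postfix (assumed)."""
        stored = []
        for f in files:
            src = Path(f)
            dest = self._zone(zone) / f"{src.name}.{postfix}"
            if dest.exists():
                # Refuse to overwrite: this is the collision we want to avoid.
                raise FileExistsError(dest.name)
            shutil.copy2(src, dest)
            stored.append(dest.name)
        return stored

    def get(self, zone, target, files):
        """Get: copy the named files from the zone into a local target directory."""
        target = Path(target)
        target.mkdir(parents=True, exist_ok=True)
        for name in files:
            shutil.copy2(self._zone(zone) / name, target / name)

    def delete(self, zone, name):
        """Del: remove one file from the zone."""
        (self._zone(zone) / name).unlink()


def demo():
    work = Path(tempfile.mkdtemp())
    source = work / "report.txt"
    source.write_text("hello")
    place = Place(work / "place")
    stored = place.put("incoming", "run1", [source])
    place.get("incoming", work / "out", stored)
    before = place.list_files("incoming")
    place.delete("incoming", stored[0])
    return stored, before, place.list_files("incoming")
```

Files are only ever copied in or out as a whole, never opened in place, which matches the restriction that writing into a file does not work in a distributed environment.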

We already have a command that stores files in a non-duplicate manner. This is a good thing, because we might be able to store the content of places there automatically. So instead of working with files as such, we work with numbered files, and these file numbers can be retrieved from a central place. That system is probably more viable.

A source place basically manages the incoming files from time to time and lists them in specified directories. These can then be used to retrieve a file or store a file.
Comment 2 Werner Van Belle 2007-09-30 10:59:06 CEST
When we put a file into a place we have the problem that we can no longer uniquely identify the file, and thus we are not able to detect changes. The best method to solve this problem is to assign a unique number to all files available in the system. When we then put a new file into a place, it has its unique number anyway, and thus we can detect the change.
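
A minimal sketch of this idea follows. The names `FileRegistry` and `put` are hypothetical, and the sketch assumes that the unique number is tied to the file content (found via a SHA-256 digest), in the spirit of the non-duplicate store mentioned earlier. The comment itself does not say how numbers are assigned.

```python
import hashlib
import itertools


class FileRegistry:
    """Hypothetical central registry: one number per distinct file content."""

    def __init__(self):
        self._next = itertools.count(1)
        self._by_digest = {}  # content digest -> file number

    def import_bytes(self, data):
        """Return the number for this content, assigning a new one if unseen."""
        digest = hashlib.sha256(data).hexdigest()
        if digest not in self._by_digest:
            self._by_digest[digest] = next(self._next)
        return self._by_digest[digest]


def put(place, registry, name, data):
    """Record name -> file number in the place; return True if it changed.

    `place` is modeled as a plain dict for illustration.
    """
    number = registry.import_bytes(data)
    changed = place.get(name) != number
    place[name] = number
    return changed
```

With this in place, putting the same content under the same name twice reports no change, while putting different content does, because the name now maps to a different number.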

The problem with unique file numbers is that they must be global and garbage collectible. Files we want to keep should have a specific name. We still have the old source, of course, which can take a file, place it in a store somewhere, and then either link, copy, or scp the file. If we place it at the target we also immediately have its overall identifier.

A file can
- be in the system or not (have a unique number associated)
- be imported into the system (will receive a unique number)
- be retrieved from the system (will create a link on disk to the specified file)

This does not work in a distributed system
- multiple file systems make this difficult
- multiple computers make it difficult to associate each file uniquely with its content

The bottom line is that we need the ability to detect changes in a file. This means that files need to have a version number, which in turn implies that each place is a store that keeps track of its content in its own manner. The store itself can be located anywhere in the hierarchy.
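
One way to picture a place as a store that tracks versions is sketched below. `PlaceStore` and `changed_since` are hypothetical names, and the choice of a single store-wide revision counter is an assumption made so that a file that is deleted and put again never reuses an old version number.

```python
class PlaceStore:
    """Hypothetical place that records a version number for each file name."""

    def __init__(self):
        self._revision = 0   # store-wide counter, never reused
        self._versions = {}  # file name -> revision of its last put

    def put(self, name):
        self._revision += 1
        self._versions[name] = self._revision
        return self._revision

    def delete(self, name):
        self._versions.pop(name, None)

    def ls(self):
        """Snapshot of the current name -> version mapping."""
        return dict(self._versions)


def changed_since(store, seen):
    """Compare the store with an earlier snapshot from ls().

    Returns (names that are new or have a new version, names that are gone).
    """
    current = store.ls()
    changed = {name for name, v in current.items() if seen.get(name) != v}
    removed = set(seen) - set(current)
    return changed, removed
```

A consumer keeps the snapshot it last processed and calls `changed_since` to find out which files need to be imported again.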

A- create an import command for each listed file
B- create a place-ls command that will return all the listed files and their unique number
C- create a put command that will add files to places
D- When we update a place we should issue an update command so that new files can be imported.
E- This requires a priority for each command. Those with higher priority must be executed first. It will of course help to trigger an update on <placezone>/contentchanged instead of on the place itself.
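
The priority rule in E can be sketched with a heap. `CommandQueue` is a hypothetical name, and the numeric priorities in the usage note are purely illustrative; the bug does not say which commands should rank higher.

```python
import heapq
import itertools


class CommandQueue:
    """Hypothetical queue that runs higher-priority commands first."""

    def __init__(self):
        self._heap = []
        self._order = itertools.count()  # tie-breaker: submission order

    def submit(self, priority, name, action):
        # heapq is a min-heap, so the priority is negated to pop the highest
        # first. The counter keeps equal priorities in submission order and
        # prevents the tuple comparison from ever reaching the callable.
        heapq.heappush(self._heap, (-priority, next(self._order), name, action))

    def run_all(self):
        """Execute all queued commands and return their names in run order."""
        executed = []
        while self._heap:
            _, _, name, action = heapq.heappop(self._heap)
            action()
            executed.append(name)
        return executed
```

For example, submitting `update` with priority 1 and then `import` and `place-ls` with priority 5 runs `import`, `place-ls`, and finally `update`.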