Difference between revisions of "Files Dashboard"

From ecology
Revision as of 09:03, 8 May 2012

The files dashboard is a web page that gives an overview of the trackers' raw data files. For a chosen project and date interval, it displays a table with one row for every day in the interval and one column for every tracker that belongs to the project.

Each cell of the table contains information about that tracker's files for that day: in principle, one box per file, and usually no more than one file per day. The purpose is to give information at a glance, so if a tracker has no files for a given day, an empty box is shown.

Three pieces of information are to be shown for each file:

  • Whether a file exists for a given day and tracker
  • How big the file is
  • Whether errors happened during parsing of its contents
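
The table layout described above could be sketched as a nested mapping from day to tracker to a list of file records. This is only an illustration: the function name `build_grid` and the record keys (`day`, `tracker`, `size`, `parse_errors`) are assumptions, not names taken from the actual system.

```python
from collections import defaultdict
from datetime import date, timedelta


def build_grid(start, end, trackers, files):
    """Return {day: {tracker: [file_record, ...]}} for every day in [start, end].

    `files` is an iterable of dicts with the hypothetical keys
    "day", "tracker", "size" and "parse_errors".
    """
    # Group the file records by the cell (day, tracker) they belong to.
    by_cell = defaultdict(list)
    for record in files:
        by_cell[(record["day"], record["tracker"])].append(record)

    grid = {}
    day = start
    while day <= end:
        # An empty list stands for the empty box shown when there is no file.
        grid[day] = {t: by_cell.get((day, t), []) for t in trackers}
        day += timedelta(days=1)
    return grid
```

Every day in the interval and every tracker gets a cell, even when no file exists, which matches the "empty box" requirement.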


Files processing

Raw tracking data arrives in the system as plain text files sent through a Dropbox account. Dropbox makes the files available in the server's file system.

File properties

Once the files are accessible through the local file system, they can be queried, read and parsed to extract the tracking information they contain.

The file name is expected to have the form Log_0533_13042012_xx.txt. This provides:

The tracker number
(in the example, 533)
The reported date
(in the example, April 13th 2012)
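
A minimal sketch of extracting both values from a file name follows. It assumes the date is encoded as DDMMYYYY (consistent with the example) and that the trailing `xx` part can be any non-empty text; the meaning of that part is not specified here. The names `FILE_NAME_RE` and `parse_file_name` are hypothetical.

```python
import re
from datetime import date

# Assumed layout: Log_<tracker>_<DDMMYYYY>_<anything>.txt
FILE_NAME_RE = re.compile(r"^Log_(\d+)_(\d{2})(\d{2})(\d{4})_(.+)\.txt$")


def parse_file_name(name):
    """Return (tracker_number, reported_date) parsed from a bare file name.

    Raises ValueError if the name does not match the expected form
    or if the digits do not form a valid calendar date.
    """
    match = FILE_NAME_RE.match(name)
    if match is None:
        raise ValueError(f"unexpected file name: {name!r}")
    tracker, day, month, year, _suffix = match.groups()
    # int("0533") drops the leading zero, giving tracker number 533.
    return int(tracker), date(int(year), int(month), int(day))
```

For the example name, `parse_file_name("Log_0533_13042012_xx.txt")` yields tracker `533` and the date 13 April 2012.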

Apart from the name, the file has other attributes that the file system provides, namely:

Last modification date
when the contents of the file were last modified
Size
how big the file contents are
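
Both attributes can be read from the file system without opening the file. The sketch below uses Python's standard `os.stat`; the function name `file_attributes` and the keys of the returned dictionary are assumptions made for illustration.

```python
import os
from datetime import datetime


def file_attributes(path):
    """Return the bare name, size in bytes and last modification time of a file."""
    info = os.stat(path)  # metadata only; the contents are not read
    return {
        "name": os.path.basename(path),  # the name alone, without the path
        "size": info.st_size,
        "modified": datetime.fromtimestamp(info.st_mtime),
    }
```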

File processing lifecycle

The file name (alone, without the path) uniquely identifies a file in the system. When a file is discovered in the file system, its properties are first analyzed, as described above. If the file is new, a new entry is added to the system and the properties are stored next to the file name.

If an entry already exists for the file name, the last modification date must be checked. If the date of the file differs from the one in the system entry, the information from the newly found file takes precedence: the new last modification date and the new file size override the values in the system entry, which are discarded.

During this whole process, nothing inside the file's contents gets looked at.
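
The lifecycle above could be sketched as follows, with an in-memory dictionary standing in for the system's store of entries. The function name `register_file`, the entry keys and the returned status strings are all hypothetical; how the real system stores its entries is not described here.

```python
def register_file(entries, name, size, modified):
    """Record a discovered file in `entries`, a dict keyed by bare file name.

    Returns "added", "updated" or "unchanged" to describe what happened.
    Only file-system metadata is used; the file contents are never read.
    """
    entry = entries.get(name)
    if entry is None:
        # New file: create an entry holding its properties.
        entries[name] = {"size": size, "modified": modified}
        return "added"
    if entry["modified"] != modified:
        # Known file that changed: the new date and size replace the old ones.
        entry["modified"] = modified
        entry["size"] = size
        return "updated"
    return "unchanged"
```

Comparing only the last modification date, as the text specifies, means a size change is picked up only when the date changes as well.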