Difference between revisions of "Useful Queries"

From ecology
Jump to: navigation, search
(How to separate birds or tag deployments when querying data when the same GPS logger has been used more than once or the same individual has been tracked with more than on logger)
(Re-sample tracking data to select data with a predefined interval between fixes: Do resample only on requested trackers)
 
(17 intermediate revisions by one other user not shown)
Line 1: Line 1:
== Calculating summary statistics ==
+
= UvAGPS functions =
=== Total distance ===
+
 
 +
== get_uvagps_track_speed ==
 +
 
 +
Compute distance (meters), time-interval (time stamp hh:mm:ss), speed (meters/second) and heading (degrees) for a series of subsequent tracking entries for a specific device_info_serial.
 +
For the first entry in the sequence these values will always be missing because they are computed  as the difference between subsequent rows.
 +
 
 +
Usage examples :
 +
 
 +
<pre>
 +
select * from gps.get_uvagps_track_speed(<device_info_serial>);
 +
 
 +
select * from gps.get_uvagps_track_speed(119, '2008-01-01 00:00:00', '2010-01-01 00:00:00');
 +
 
 +
select t.*, a.distance, a.interval, a.speed, a.direction
 +
from gps.get_uvagps_track_speed(119, '2008-01-01 00:00:00', '2010-01-01 00:00:00') a
 +
join gps.ee_tracking_speed_limited t using(device_info_serial, date_time);
 +
</pre>
 +
 
 +
Distances and heading are calculated with spherical correction with the Haversine formula as described [http://http://www.movable-type.co.uk/scripts/latlong.html here].
 +
The heading is the initial heading. PostGIS does not calculate the heading, so therefore we don't use PostGIS.
 +
 
 +
In these functions the records that have been flagged as invalid by setting the '''userflag''' to 1 are not taken into account and discarded.
 +
If speed calculations are required for all records an additional boolean argument '''get raw data''' can be given. i.e.
 +
 
 +
<pre>
 +
select * from gps.get_uvagps_track_speed(119, '2008-01-01 00:00:00', '2010-01-01 00:00:00', true);
 +
 
 +
select * from gps.get_uvagps_track_speed(119, true);
 +
</pre>
 +
 
 +
== get_uvagps_track_distance ==
 +
 
 +
Compute distance (meters) for UvAGPS tracks to arbitrary point in lat/long or geometry location
 +
 
 +
Example usage :
 +
 +
<pre>
 +
select * from gps.get_uvagps_track_distance(119, 52.1897017, 6.1996606);
 +
 
 +
select * from gps.get_uvagps_track_distance(119, (select location from gps.ee_tracking_speed_limited where device_info_serial = 119 and date_time = '2009-05-07 19:55:08'));
 +
 
 +
</pre>
 +
 
 +
Here the distance is calculated with the PostGIS distance_sphere function which uses spherical correction.
 +
 
 +
== get_uvagps_track_distance_direction ==
 +
 
 +
 
 +
Compute distance (meters) and direction (degrees) for UvAGPS tracks from arbitrary point in lat/long or geometry location
 +
 
 +
Example usage:
 +
 
 +
<pre>
 +
select * from gps.get_uvagps_track_distance_direction(119, 52.1897017, 6.1996606);
 +
</pre>
 +
 
 +
Distances and direction are calculated with spherical correction with the Haversine formula as described [http://http://www.movable-type.co.uk/scripts/latlong.html here]. The direction is the direction from the given point.
 +
 
 +
In this function the records where the '''userflag''' has been set are discarded. To see  these records a additional boolean argument for '''show raw data''' should  be added like :
 +
 
 +
<pre>
 +
select * from gps.get_uvagps_track_distance_direction(119, 52.1897017, 6.1996606, true);
 +
</pre>
 +
= Filtering tracking data =
 +
 
 +
== Select tracking data from the beginning of a track session ==
 +
 
 +
Select only those points that were collected during a valid track session (i.e. data that begins from the start_date of a track session) including 2D GPS speed. Filter our points where no GPS fix is returned (latitude is not null) and where a user has flagged the data as inappropriate (user_flag <> 1):
 +
 
 +
SELECT t.device_info_serial, t.date_time, t.latitude, t.longitude, t.altitude, t.temperature, t.speed, t.userflag
 +
FROM gps.ee_tracking_speed_limited t, gps.ee_track_session_limited s
 +
WHERE t.device_info_serial = s.device_info_serial AND t.date_time >= s.start_date AND t.device_info_serial = '''<ID>'''
 +
  AND t.latitude IS  NOT NULL AND t.userflag <> 1
 +
 
 +
== Select tracking data and distinguish between distinct GPS deployments ==
 +
In this query you will select the tracking data related to each deployment of a GPS logger or each time an individual bird was tracked. This query is slightly more complex than the query above. If you have used the same logger on more than one individual bird, or have tracked the same individual with more than one logger, this query will enable you to make a distinction between these tracking events by showing both GPS logger id (device_info_serial) and a unique identifier for your individual (eg ring_number or colour_ring). Furthermore, as above, this will only select tracking data from the start and end of a particular track session.
 +
 
 +
  SELECT t.device_info_serial, t.date_time, t.latitude, t.longitude, t.altitude, t.speed_2d, i.colour_ring, s.track_session_id, s.key_name
 +
  FROM gps.ee_track_session_limited s
 +
  JOIN gps.ee_individual_limited i
 +
    ON s.individual_id = i.individual_id
 +
  JOIN gps.ee_tracking_speed_limited t
 +
    ON t.device_info_serial = s.device_info_serial
 +
  AND t.date_time BETWEEN s.start_date AND s.end_date
 +
  WHERE s.key_name = ''''<Project name>'''' AND t.userflag = 0
 +
  ORDER BY t.device_info_serial, t.date_time
 +
 
 +
HINT: WHERE s.key_name = 'HG_TEXEL'
 +
 
 +
== Re-sample tracking data to select data with a predefined interval between fixes ==
 +
 
 +
The following query provides an example of how to select data points with a predefined interval between each point and not lower. In the following example, a 5 minute interval is selected.  In this example all measurements that are taken between each 5 minute interval are left out of the resampled data.
 +
 
 +
Commented query with examples.  Statements followed by -- are comments and are not parsed by SQL.
 +
 
 +
  SELECT *
 +
  FROM gps.ee_tracking_speed_limited
 +
  JOIN
 +
  (
 +
              SELECT
 +
              device_info_serial,
 +
              -- return oldest date_time for each interval
 +
              MIN(date_time) date_time,
 +
              -- each 5 minutes has a unique identifier
 +
              FLOOR(EXTRACT(EPOCH FROM date_time)/(5*60)) intervalid
 +
              FROM gps.ee_tracking_speed_limited
 +
              WHERE device_info_serial = 355 AND date_time BETWEEN '2010-06-27' AND '2010-06-29'
 +
              GROUP BY device_info_serial, intervalid
 +
  ) b
 +
  USING (device_info_serial, date_time)
 +
 
 +
  ORDER BY date_time
 +
 
 +
Uncommented query, date_time and device_info_serial need to be filled in. Interval is 5 minutes and can be changed.
 +
 
 +
  SELECT *
 +
  FROM gps.ee_tracking_speed_limited
 +
  JOIN
 +
  (
 +
              SELECT
 +
              device_info_serial,
 +
              MIN(date_time) date_time,
 +
              FLOOR(EXTRACT(EPOCH FROM date_time)/(5*60)) intervalid
 +
              FROM gps.ee_tracking_speed_limited
 +
              WHERE device_info_serial = '''<ID>''' AND date_time BETWEEN ''''yyyy-mm-dd'''' AND ''''yyyy-mm-dd''''
 +
              GROUP BY device_info_serial, intervalid
 +
  ) b
 +
  USING (device_info_serial, date_time)
 +
  ORDER BY date_time
 +
 
 +
= Calculating summary statistics =
 +
== Total distance ==
 
A query to determine the total distance a bird has flown is :  
 
A query to determine the total distance a bird has flown is :  
  
Line 11: Line 142:
 
  from gps.get_uvagps_track_speed('''<ID>''', '2010-01-01 00:00:00', '2011-01-01 00:00:00');
 
  from gps.get_uvagps_track_speed('''<ID>''', '2010-01-01 00:00:00', '2011-01-01 00:00:00');
  
=== Total number of measurements per tag ===
+
== Total number of measurements per tag ==
  
 
Count the total number of measurements for each device using the GROUP BY clause:
 
Count the total number of measurements for each device using the GROUP BY clause:
Line 29: Line 160:
 
  ORDER BY device_info_serial ASC;
 
  ORDER BY device_info_serial ASC;
  
== Select tracking data from the beginning of a track session ==
+
Count the number of GPS fixes within a track session and for all track sessions within a specific project.
 
+
In the example below the project name needs to be filled in, for example s.key_name = 'HG_IVES'
Select only those points that were collected during a valid track session (i.e. data that begins from the start_date of a track session) including 2D GPS speed. Filter our points where no GPS fix is returned (latitude is not null) and where a user has flagged the data as inappropriate (user_flag <> 1):
 
 
 
SELECT t.device_info_serial, t.date_time, t.latitude, t.longitude, t.altitude, t.temperature, t.speed, t.userflag
 
FROM gps.ee_tracking_speed_limited t, gps.ee_track_session_limited s
 
WHERE t.device_info_serial = s.device_info_serial AND t.date_time >= s.start_date AND t.device_info_serial = '''<ID>'''
 
  AND t.latitude IS  NOT NULL AND t.userflag <> 1
 
 
 
== How to separate birds or tag deployments when querying data where the same GPS logger has been used more than once or the same individual has been tracked with more than on logger ==
 
NOTE: The current query does not work yet and can only be implemented in a testing environment - we expect to have this ready by December 2014
 
 
 
In this query you will select the tracking data related to each deployment of a GPS logger. This query is slightly more complex than the query above. If you have used the same logger on more than one individual bird, or have tracked the same individual with more than one logger, this query will enable you to make that distinction, by showing both GPS logger id (device_info_serial) and a unique identifier for your individual (eg ring_number or colour_ring). Furthermore, as above, this will only select tracking data from the start and end of a particular track session.
 
  
  SELECT t.device_info_serial, t.date_time, t.latitude, t.longitude, t.altitude, t.speed, i.colour_ring, s.track_session_id, s.key_name
+
WITH
  FROM webapp.ee_track_session_limited s
+
V AS (
  JOIN webapp.ee_individual_limited i
+
    SELECT s.device_info_serial, count(t.*) AS point_count, s.track_session_id, s.individual_id
     ON s.individual_id = i.individual_id
+
    FROM gps.ee_track_session_limited s
  JOIN webapp.ee_tracking_speed_limited t
+
     JOIN gps.ee_tracking_speed_limited t
    ON t.device_info_serial = s.device_info_serial
+
      ON t.latitude IS NOT NULL
  AND t.date_time BETWEEN s.start_date AND s.end_date
+
    AND s.device_info_serial = t.device_info_serial  
  WHERE s.key_name = '<Project name>' AND t.userflag = 0
+
    AND t.date_time BETWEEN s.start_date AND s.end_date
  ORDER BY t.device_info_serial, t.date_time
+
    WHERE s.key_name = ''''< project name >''''
 +
    GROUP BY s.track_session_id, s.device_info_serial, s.individual_id
 +
)
 +
SELECT i.colour_ring, V.device_info_serial, V.point_count, V.track_session_id
 +
FROM V
 +
JOIN gps.ee_individual_limited i
 +
  ON V.individual_id = i.individual_id
 +
= Extra Metadata =
 +
== Extract all track sessions for a project ==
 +
This query provides all the track sessions for a specific project as well as the selected ID of the individual birds that were tracked. In the example below the project name needs to be filled in, for example s.key_name = 'HG_IVES'
  
HINT: WHERE s.key_name = 'HG_TEXEL'
+
  SELECT s.device_info_serial, i.colour_ring, s.track_session_id, s.key_name
 +
  FROM gps.ee_individual_limited i
 +
  JOIN gps.ee_track_session_limited s
 +
  ON s.individual_id = i.individual_id
 +
  WHERE s.key_name = ''''< project name >''''
 +
  ORDER BY i.colour_ring

Latest revision as of 09:29, 13 October 2015

UvAGPS functions

get_uvagps_track_speed

Compute distance (meters), time-interval (time stamp hh:mm:ss), speed (meters/second) and heading (degrees) for a series of subsequent tracking entries for a specific device_info_serial. For the first entry in the sequence these values will always be missing because they are computed as the difference between subsequent rows.

Usage examples :

select * from gps.get_uvagps_track_speed(<device_info_serial>);

select * from gps.get_uvagps_track_speed(119, '2008-01-01 00:00:00', '2010-01-01 00:00:00');

select t.*, a.distance, a.interval, a.speed, a.direction 
from gps.get_uvagps_track_speed(119, '2008-01-01 00:00:00', '2010-01-01 00:00:00') a 
join gps.ee_tracking_speed_limited t using(device_info_serial, date_time);

Distances and heading are calculated with spherical correction with the Haversine formula as described here. The heading is the initial heading. PostGIS does not calculate the heading, so therefore we don't use PostGIS.

In these functions the records that have been flagged as invalid by setting the userflag to 1 are not taken into account and discarded. If speed calculations are required for all records an additional boolean argument get raw data can be given. i.e.

select * from gps.get_uvagps_track_speed(119, '2008-01-01 00:00:00', '2010-01-01 00:00:00', true);

select * from gps.get_uvagps_track_speed(119, true);

get_uvagps_track_distance

Compute distance (meters) for UvAGPS tracks to arbitrary point in lat/long or geometry location

Example usage :

select * from gps.get_uvagps_track_distance(119, 52.1897017, 6.1996606); 

select * from gps.get_uvagps_track_distance(119, (select location from gps.ee_tracking_speed_limited where device_info_serial = 119 and date_time = '2009-05-07 19:55:08')); 

Here the distance is calculated with the PostGIS distance_sphere function which uses spherical correction.

get_uvagps_track_distance_direction

Compute distance (meters) and direction (degrees) for UvAGPS tracks from arbitrary point in lat/long or geometry location

Example usage:

select * from gps.get_uvagps_track_distance_direction(119, 52.1897017, 6.1996606); 

Distances and direction are calculated with spherical correction with the Haversine formula as described here. The direction is the direction from the given point.

In this function the records where the userflag has been set are discarded. To see these records a additional boolean argument for show raw data should be added like :

select * from gps.get_uvagps_track_distance_direction(119, 52.1897017, 6.1996606, true); 

Filtering tracking data

Select tracking data from the beginning of a track session

Select only those points that were collected during a valid track session (i.e. data that begins from the start_date of a track session) including 2D GPS speed. Filter our points where no GPS fix is returned (latitude is not null) and where a user has flagged the data as inappropriate (user_flag <> 1):

SELECT t.device_info_serial, t.date_time, t.latitude, t.longitude, t.altitude, t.temperature, t.speed, t.userflag
FROM gps.ee_tracking_speed_limited t, gps.ee_track_session_limited s
WHERE t.device_info_serial = s.device_info_serial AND t.date_time >= s.start_date AND t.device_info_serial = <ID>
  AND t.latitude IS  NOT NULL AND t.userflag <> 1

Select tracking data and distinguish between distinct GPS deployments

In this query you will select the tracking data related to each deployment of a GPS logger or each time an individual bird was tracked. This query is slightly more complex than the query above. If you have used the same logger on more than one individual bird, or have tracked the same individual with more than one logger, this query will enable you to make a distinction between these tracking events by showing both GPS logger id (device_info_serial) and a unique identifier for your individual (eg ring_number or colour_ring). Furthermore, as above, this will only select tracking data from the start and end of a particular track session.

 SELECT t.device_info_serial, t.date_time, t.latitude, t.longitude, t.altitude, t.speed_2d, i.colour_ring, s.track_session_id, s.key_name
 FROM gps.ee_track_session_limited s
 JOIN gps.ee_individual_limited i
   ON s.individual_id = i.individual_id
 JOIN gps.ee_tracking_speed_limited t
   ON t.device_info_serial = s.device_info_serial
 AND t.date_time BETWEEN s.start_date AND s.end_date
 WHERE s.key_name = '<Project name>' AND t.userflag = 0
 ORDER BY t.device_info_serial, t.date_time

HINT: WHERE s.key_name = 'HG_TEXEL'

Re-sample tracking data to select data with a predefined interval between fixes

The following query provides an example of how to select data points with a predefined interval between each point and not lower. In the following example, a 5 minute interval is selected. In this example all measurements that are taken between each 5 minute interval are left out of the resampled data.

Commented query with examples. Statements followed by -- are comments and are not parsed by SQL.

 SELECT *
 FROM gps.ee_tracking_speed_limited
 JOIN
 (
              SELECT 
              device_info_serial, 
              -- return oldest date_time for each interval
              MIN(date_time) date_time,
              -- each 5 minutes has a unique identifier
              FLOOR(EXTRACT(EPOCH FROM date_time)/(5*60)) intervalid
              FROM gps.ee_tracking_speed_limited
              WHERE device_info_serial = 355 AND date_time BETWEEN '2010-06-27' AND '2010-06-29'
              GROUP BY device_info_serial, intervalid
 ) b
 USING (device_info_serial, date_time)
 ORDER BY date_time

Uncommented query, date_time and device_info_serial need to be filled in. Interval is 5 minutes and can be changed.

 SELECT *
 FROM gps.ee_tracking_speed_limited
 JOIN
 (
              SELECT 
              device_info_serial, 
              MIN(date_time) date_time,
              FLOOR(EXTRACT(EPOCH FROM date_time)/(5*60)) intervalid
              FROM gps.ee_tracking_speed_limited
              WHERE device_info_serial = <ID> AND date_time BETWEEN 'yyyy-mm-dd' AND 'yyyy-mm-dd'
              GROUP BY device_info_serial, intervalid
 ) b
 USING (device_info_serial, date_time)
 ORDER BY date_time

Calculating summary statistics

Total distance

A query to determine the total distance a bird has flown is :

select sum(distance)/1000 as total_distance
from gps.get_uvagps_track_speed(<ID>);

or for a fixed time period :

select sum(distance)/1000 as total_distance
from gps.get_uvagps_track_speed(<ID>, '2010-01-01 00:00:00', '2011-01-01 00:00:00');

Total number of measurements per tag

Count the total number of measurements for each device using the GROUP BY clause:

select device_info_serial, count(*) 
from gps.ee_tracking_speed_limited
group by device_info_serial
ORDER BY device_info_serial ASC; 


If you want to count the number of measurements with valid GPS fixes you can add a where clause :

select device_info_serial, count(*) 
from gps.ee_tracking_speed_limited
where latitude is not null 
group by device_info_serial
ORDER BY device_info_serial ASC;

Count the number of GPS fixes within a track session and for all track sessions within a specific project. In the example below the project name needs to be filled in, for example s.key_name = 'HG_IVES'

WITH 
V AS (
   SELECT s.device_info_serial, count(t.*) AS point_count, s.track_session_id, s.individual_id
   FROM gps.ee_track_session_limited s
   JOIN gps.ee_tracking_speed_limited t
     ON t.latitude IS NOT NULL
    AND s.device_info_serial = t.device_info_serial 
    AND t.date_time BETWEEN s.start_date AND s.end_date
   WHERE s.key_name = '< project name >'
   GROUP BY s.track_session_id, s.device_info_serial, s.individual_id
)
SELECT i.colour_ring, V.device_info_serial, V.point_count, V.track_session_id
FROM V
JOIN gps.ee_individual_limited i
 ON V.individual_id = i.individual_id

Extra Metadata

Extract all track sessions for a project

This query provides all the track sessions for a specific project as well as the selected ID of the individual birds that were tracked. In the example below the project name needs to be filled in, for example s.key_name = 'HG_IVES'

 SELECT s.device_info_serial, i.colour_ring, s.track_session_id, s.key_name
 FROM gps.ee_individual_limited i
 JOIN gps.ee_track_session_limited s
  ON s.individual_id = i.individual_id
 WHERE s.key_name = '< project name >' 
 ORDER BY i.colour_ring