Difference between revisions of "Useful Queries"

From ecology
Jump to: navigation, search
(Useful Queries)
(Re-sample tracking data to select data with a predefined interval between fixes: Do resample only on requested trackers)
 
(37 intermediate revisions by 3 users not shown)
Line 1: Line 1:
 +
= UvAGPS functions =
  
A query to determine the total distance a bird has flown for a time period is :
+
== get_uvagps_track_speed ==
  
  select sum(distance)/1000 as total_distance
+
Compute distance (meters), time-interval (time stamp hh:mm:ss), speed (meters/second) and heading (degrees) for a series of subsequent tracking entries for a specific device_info_serial.
  from gps.get_uvagps_track_speed('''<ID>''');
+
For the first entry in the sequence these values will always be missing because they are computed as the difference between subsequent rows.
 +
 
 +
Usage examples :
 +
 
 +
<pre>
 +
select * from gps.get_uvagps_track_speed(<device_info_serial>);
 +
 
 +
select * from gps.get_uvagps_track_speed(119, '2008-01-01 00:00:00', '2010-01-01 00:00:00');
 +
 
 +
select t.*, a.distance, a.interval, a.speed, a.direction
 +
from gps.get_uvagps_track_speed(119, '2008-01-01 00:00:00', '2010-01-01 00:00:00') a
 +
join gps.ee_tracking_speed_limited t using(device_info_serial, date_time);
 +
</pre>
 +
 
 +
Distances and heading are calculated with spherical correction with the Haversine formula as described [http://http://www.movable-type.co.uk/scripts/latlong.html here].
 +
The heading is the initial heading. PostGIS does not calculate the heading, so therefore we don't use PostGIS.
 +
 
 +
In these functions the records that have been flagged as invalid by setting the '''userflag''' to 1 are not taken into account and discarded.
 +
If speed calculations are required for all records an additional boolean argument '''get raw data''' can be given. i.e.
 +
 
 +
<pre>
 +
select * from gps.get_uvagps_track_speed(119, '2008-01-01 00:00:00', '2010-01-01 00:00:00', true);
 +
 
 +
select * from gps.get_uvagps_track_speed(119, true);
 +
</pre>
 +
 
 +
== get_uvagps_track_distance ==
 +
 
 +
Compute distance (meters) for UvAGPS tracks to arbitrary point in lat/long or geometry location
 +
 
 +
Example usage :
 +
 +
<pre>
 +
select * from gps.get_uvagps_track_distance(119, 52.1897017, 6.1996606);
 +
 
 +
select * from gps.get_uvagps_track_distance(119, (select location from gps.ee_tracking_speed_limited where device_info_serial = 119 and date_time = '2009-05-07 19:55:08'));
 +
 
 +
</pre>
 +
 
 +
Here the distance is calculated with the PostGIS distance_sphere function which uses spherical correction.
 +
 
 +
== get_uvagps_track_distance_direction ==
 +
 
 +
 
 +
Compute distance (meters) and direction (degrees) for UvAGPS tracks from arbitrary point in lat/long or geometry location
 +
 
 +
Example usage:
 +
 
 +
<pre>
 +
select * from gps.get_uvagps_track_distance_direction(119, 52.1897017, 6.1996606);
 +
</pre>
 +
 
 +
Distances and direction are calculated with spherical correction with the Haversine formula as described [http://http://www.movable-type.co.uk/scripts/latlong.html here]. The direction is the direction from the given point.
 +
 
 +
In this function the records where the '''userflag''' has been set are discarded. To see  these records a additional boolean argument for '''show raw data''' should be added like :
 +
 
 +
<pre>
 +
select * from gps.get_uvagps_track_distance_direction(119, 52.1897017, 6.1996606, true);
 +
</pre>
 +
= Filtering tracking data =
 +
 
 +
== Select tracking data from the beginning of a track session ==
 +
 
 +
Select only those points that were collected during a valid track session (i.e. data that begins from the start_date of a track session) including 2D GPS speed. Filter our points where no GPS fix is returned (latitude is not null) and where a user has flagged the data as inappropriate (user_flag <> 1):
 +
 
 +
SELECT t.device_info_serial, t.date_time, t.latitude, t.longitude, t.altitude, t.temperature, t.speed, t.userflag
 +
FROM gps.ee_tracking_speed_limited t, gps.ee_track_session_limited s
 +
WHERE t.device_info_serial = s.device_info_serial AND t.date_time >= s.start_date AND t.device_info_serial = '''<ID>'''
 +
  AND t.latitude IS  NOT NULL AND t.userflag <> 1
 +
 
 +
== Select tracking data and distinguish between distinct GPS deployments ==
 +
In this query you will select the tracking data related to each deployment of a GPS logger or each time an individual bird was tracked. This query is slightly more complex than the query above. If you have used the same logger on more than one individual bird, or have tracked the same individual with more than one logger, this query will enable you to make a distinction between these tracking events by showing both GPS logger id (device_info_serial) and a unique identifier for your individual (eg ring_number or colour_ring). Furthermore, as above, this will only select tracking data from the start and end of a particular track session.
 +
 
 +
  SELECT t.device_info_serial, t.date_time, t.latitude, t.longitude, t.altitude, t.speed_2d, i.colour_ring, s.track_session_id, s.key_name
 +
  FROM gps.ee_track_session_limited s
 +
  JOIN gps.ee_individual_limited i
 +
    ON s.individual_id = i.individual_id
 +
  JOIN gps.ee_tracking_speed_limited t
 +
    ON t.device_info_serial = s.device_info_serial
 +
  AND t.date_time BETWEEN s.start_date AND s.end_date
 +
  WHERE s.key_name = ''''<Project name>'''' AND t.userflag = 0
 +
  ORDER BY t.device_info_serial, t.date_time
 +
 
 +
HINT: WHERE s.key_name = 'HG_TEXEL'
 +
 
 +
== Re-sample tracking data to select data with a predefined interval between fixes ==
  
or
+
The following query provides an example of how to select data points with a predefined interval between each point and not lower. In the following example, a 5 minute interval is selected.  In this example all measurements that are taken between each 5 minute interval are left out of the resampled data.
  
  select sum(distance)/1000 as total_distance
+
Commented query with examples. Statements followed by -- are comments and are not parsed by SQL.
from gps.get_uvagps_track_speed('''<ID>''', '2010-01-01 00:00:00', '2011-01-01 00:00:00');
 
  
 +
  SELECT *
 +
  FROM gps.ee_tracking_speed_limited
 +
  JOIN
 +
  (
 +
              SELECT
 +
              device_info_serial,
 +
              -- return oldest date_time for each interval
 +
              MIN(date_time) date_time,
 +
              -- each 5 minutes has a unique identifier
 +
              FLOOR(EXTRACT(EPOCH FROM date_time)/(5*60)) intervalid
 +
              FROM gps.ee_tracking_speed_limited
 +
              WHERE device_info_serial = 355 AND date_time BETWEEN '2010-06-27' AND '2010-06-29'
 +
              GROUP BY device_info_serial, intervalid
 +
  ) b
 +
  USING (device_info_serial, date_time)
  
If you want to do this for all ID's at the same time you can do something like :
+
  ORDER BY date_time
  
select device_info_serial,  (select sum(distance)/1000
+
Uncommented query, date_time and device_info_serial need to be filled in. Interval is 5 minutes and can be changed.
  from gps.get_uvagps_track_speed(device_info_serial, '2010-01-01 00:00:00', '2011-01-01 00:00:00')) as total_distance
 
from gps.uva_device
 
  
For limited users :
+
  SELECT *
 +
  FROM gps.ee_tracking_speed_limited
 +
  JOIN
 +
  (
 +
              SELECT
 +
              device_info_serial,
 +
              MIN(date_time) date_time,
 +
              FLOOR(EXTRACT(EPOCH FROM date_time)/(5*60)) intervalid
 +
              FROM gps.ee_tracking_speed_limited
 +
              WHERE device_info_serial = '''<ID>''' AND date_time BETWEEN ''''yyyy-mm-dd'''' AND ''''yyyy-mm-dd''''
 +
              GROUP BY device_info_serial, intervalid
 +
  ) b
 +
  USING (device_info_serial, date_time)
 +
  ORDER BY date_time
  
select device_info_serial,  (select sum(distance)/1000
+
= Calculating summary statistics =
  from gps.get_uvagps_track_speed(device_info_serial, '2010-01-01 00:00:00', '2011-01-01 00:00:00')) as total_distance
+
== Total distance ==
from gps.uva_device_limited
+
A query to determine the total distance a bird has flown is :  
  
 +
select sum(distance)/1000 as total_distance
 +
from gps.get_uvagps_track_speed('''<ID>''');
  
 +
or for a fixed time period :
  
The GROUP BY clause in SQL can be used to count the total number of measurements for each device :
+
select sum(distance)/1000 as total_distance
 +
from gps.get_uvagps_track_speed('''<ID>''', '2010-01-01 00:00:00', '2011-01-01 00:00:00');
  
select device_info_serial, count(*)
+
== Total number of measurements per tag ==
from gps.uva_tracking
 
group by device_info_serial;
 
  
and for limited users :
+
Count the total number of measurements for each device using the GROUP BY clause:
  
 
  select device_info_serial, count(*)  
 
  select device_info_serial, count(*)  
  from gps.uva_tracking_limited
+
  from gps.ee_tracking_speed_limited
  group by device_info_serial;  
+
  group by device_info_serial
 +
ORDER BY device_info_serial ASC;  
  
  
Line 41: Line 155:
 
    
 
    
 
  select device_info_serial, count(*)  
 
  select device_info_serial, count(*)  
  from gps.uva_tracking
+
  from gps.ee_tracking_speed_limited
 
  where latitude is not null  
 
  where latitude is not null  
  group by device_info_serial;
+
  group by device_info_serial
 
+
ORDER BY device_info_serial ASC;
If you to see want both the count of valid and invalid GPS fixes per device you can do something like :
 
  
  select device_info_serial, latitude is not null as valid, count(*)
+
Count the number of GPS fixes within a track session and for all track sessions within a specific project.
  from gps.uva_tracking
+
In the example below the project name needs to be filled in, for example s.key_name = 'HG_IVES'
  group by device_info_serial, valid
 
  order by device_info_serial, valid;
 
  
and for limited users:
+
WITH
 +
V AS (
 +
    SELECT s.device_info_serial, count(t.*) AS point_count, s.track_session_id, s.individual_id
 +
    FROM gps.ee_track_session_limited s
 +
    JOIN gps.ee_tracking_speed_limited t
 +
      ON t.latitude IS NOT NULL
 +
    AND s.device_info_serial = t.device_info_serial
 +
    AND t.date_time BETWEEN s.start_date AND s.end_date
 +
    WHERE s.key_name = ''''< project name >''''
 +
    GROUP BY s.track_session_id, s.device_info_serial, s.individual_id
 +
)
 +
SELECT i.colour_ring, V.device_info_serial, V.point_count, V.track_session_id
 +
FROM V
 +
JOIN gps.ee_individual_limited i
 +
  ON V.individual_id = i.individual_id
 +
= Extra Metadata =
 +
== Extract all track sessions for a project ==
 +
This query provides all the track sessions for a specific project as well as the selected ID of the individual birds that were tracked. In the example below the project name needs to be filled in, for example s.key_name = 'HG_IVES'
  
   select device_info_serial, latitude is not null as valid, count(*)
+
   SELECT s.device_info_serial, i.colour_ring, s.track_session_id, s.key_name
   from gps.uva_tracking_limited
+
  FROM gps.ee_individual_limited i
   group by device_info_serial, valid
+
   JOIN gps.ee_track_session_limited s
   order by device_info_serial, valid;
+
  ON s.individual_id = i.individual_id
 +
   WHERE s.key_name = ''''< project name >''''
 +
   ORDER BY i.colour_ring

Latest revision as of 09:29, 13 October 2015

UvAGPS functions

get_uvagps_track_speed

Compute distance (meters), time-interval (time stamp hh:mm:ss), speed (meters/second) and heading (degrees) for a series of subsequent tracking entries for a specific device_info_serial. For the first entry in the sequence these values will always be missing because they are computed as the difference between subsequent rows.

Usage examples :

select * from gps.get_uvagps_track_speed(<device_info_serial>);

select * from gps.get_uvagps_track_speed(119, '2008-01-01 00:00:00', '2010-01-01 00:00:00');

select t.*, a.distance, a.interval, a.speed, a.direction 
from gps.get_uvagps_track_speed(119, '2008-01-01 00:00:00', '2010-01-01 00:00:00') a 
join gps.ee_tracking_speed_limited t using(device_info_serial, date_time);

Distances and heading are calculated with spherical correction with the Haversine formula as described here. The heading is the initial heading. PostGIS does not calculate the heading, so therefore we don't use PostGIS.

In these functions the records that have been flagged as invalid by setting the userflag to 1 are not taken into account and discarded. If speed calculations are required for all records an additional boolean argument get raw data can be given. i.e.

select * from gps.get_uvagps_track_speed(119, '2008-01-01 00:00:00', '2010-01-01 00:00:00', true);

select * from gps.get_uvagps_track_speed(119, true);

get_uvagps_track_distance

Compute distance (meters) for UvAGPS tracks to arbitrary point in lat/long or geometry location

Example usage :

select * from gps.get_uvagps_track_distance(119, 52.1897017, 6.1996606); 

select * from gps.get_uvagps_track_distance(119, (select location from gps.ee_tracking_speed_limited where device_info_serial = 119 and date_time = '2009-05-07 19:55:08')); 

Here the distance is calculated with the PostGIS distance_sphere function which uses spherical correction.

get_uvagps_track_distance_direction

Compute distance (meters) and direction (degrees) for UvAGPS tracks from arbitrary point in lat/long or geometry location

Example usage:

select * from gps.get_uvagps_track_distance_direction(119, 52.1897017, 6.1996606); 

Distances and direction are calculated with spherical correction with the Haversine formula as described here. The direction is the direction from the given point.

In this function the records where the userflag has been set are discarded. To see these records a additional boolean argument for show raw data should be added like :

select * from gps.get_uvagps_track_distance_direction(119, 52.1897017, 6.1996606, true); 

Filtering tracking data

Select tracking data from the beginning of a track session

Select only those points that were collected during a valid track session (i.e. data that begins from the start_date of a track session) including 2D GPS speed. Filter our points where no GPS fix is returned (latitude is not null) and where a user has flagged the data as inappropriate (user_flag <> 1):

SELECT t.device_info_serial, t.date_time, t.latitude, t.longitude, t.altitude, t.temperature, t.speed, t.userflag
FROM gps.ee_tracking_speed_limited t, gps.ee_track_session_limited s
WHERE t.device_info_serial = s.device_info_serial AND t.date_time >= s.start_date AND t.device_info_serial = <ID>
  AND t.latitude IS  NOT NULL AND t.userflag <> 1

Select tracking data and distinguish between distinct GPS deployments

In this query you will select the tracking data related to each deployment of a GPS logger or each time an individual bird was tracked. This query is slightly more complex than the query above. If you have used the same logger on more than one individual bird, or have tracked the same individual with more than one logger, this query will enable you to make a distinction between these tracking events by showing both GPS logger id (device_info_serial) and a unique identifier for your individual (eg ring_number or colour_ring). Furthermore, as above, this will only select tracking data from the start and end of a particular track session.

 SELECT t.device_info_serial, t.date_time, t.latitude, t.longitude, t.altitude, t.speed_2d, i.colour_ring, s.track_session_id, s.key_name
 FROM gps.ee_track_session_limited s
 JOIN gps.ee_individual_limited i
   ON s.individual_id = i.individual_id
 JOIN gps.ee_tracking_speed_limited t
   ON t.device_info_serial = s.device_info_serial
 AND t.date_time BETWEEN s.start_date AND s.end_date
 WHERE s.key_name = '<Project name>' AND t.userflag = 0
 ORDER BY t.device_info_serial, t.date_time

HINT: WHERE s.key_name = 'HG_TEXEL'

Re-sample tracking data to select data with a predefined interval between fixes

The following query provides an example of how to select data points with a predefined interval between each point and not lower. In the following example, a 5 minute interval is selected. In this example all measurements that are taken between each 5 minute interval are left out of the resampled data.

Commented query with examples. Statements followed by -- are comments and are not parsed by SQL.

 SELECT *
 FROM gps.ee_tracking_speed_limited
 JOIN
 (
              SELECT 
              device_info_serial, 
              -- return oldest date_time for each interval
              MIN(date_time) date_time,
              -- each 5 minutes has a unique identifier
              FLOOR(EXTRACT(EPOCH FROM date_time)/(5*60)) intervalid
              FROM gps.ee_tracking_speed_limited
              WHERE device_info_serial = 355 AND date_time BETWEEN '2010-06-27' AND '2010-06-29'
              GROUP BY device_info_serial, intervalid
 ) b
 USING (device_info_serial, date_time)
 ORDER BY date_time

Uncommented query, date_time and device_info_serial need to be filled in. Interval is 5 minutes and can be changed.

 SELECT *
 FROM gps.ee_tracking_speed_limited
 JOIN
 (
              SELECT 
              device_info_serial, 
              MIN(date_time) date_time,
              FLOOR(EXTRACT(EPOCH FROM date_time)/(5*60)) intervalid
              FROM gps.ee_tracking_speed_limited
              WHERE device_info_serial = <ID> AND date_time BETWEEN 'yyyy-mm-dd' AND 'yyyy-mm-dd'
              GROUP BY device_info_serial, intervalid
 ) b
 USING (device_info_serial, date_time)
 ORDER BY date_time

Calculating summary statistics

Total distance

A query to determine the total distance a bird has flown is :

select sum(distance)/1000 as total_distance
from gps.get_uvagps_track_speed(<ID>);

or for a fixed time period :

select sum(distance)/1000 as total_distance
from gps.get_uvagps_track_speed(<ID>, '2010-01-01 00:00:00', '2011-01-01 00:00:00');

Total number of measurements per tag

Count the total number of measurements for each device using the GROUP BY clause:

select device_info_serial, count(*) 
from gps.ee_tracking_speed_limited
group by device_info_serial
ORDER BY device_info_serial ASC; 


If you want to count the number of measurements with valid GPS fixes you can add a where clause :

select device_info_serial, count(*) 
from gps.ee_tracking_speed_limited
where latitude is not null 
group by device_info_serial
ORDER BY device_info_serial ASC;

Count the number of GPS fixes within a track session and for all track sessions within a specific project. In the example below the project name needs to be filled in, for example s.key_name = 'HG_IVES'

WITH 
V AS (
   SELECT s.device_info_serial, count(t.*) AS point_count, s.track_session_id, s.individual_id
   FROM gps.ee_track_session_limited s
   JOIN gps.ee_tracking_speed_limited t
     ON t.latitude IS NOT NULL
    AND s.device_info_serial = t.device_info_serial 
    AND t.date_time BETWEEN s.start_date AND s.end_date
   WHERE s.key_name = '< project name >'
   GROUP BY s.track_session_id, s.device_info_serial, s.individual_id
)
SELECT i.colour_ring, V.device_info_serial, V.point_count, V.track_session_id
FROM V
JOIN gps.ee_individual_limited i
 ON V.individual_id = i.individual_id

Extra Metadata

Extract all track sessions for a project

This query provides all the track sessions for a specific project as well as the selected ID of the individual birds that were tracked. In the example below the project name needs to be filled in, for example s.key_name = 'HG_IVES'

 SELECT s.device_info_serial, i.colour_ring, s.track_session_id, s.key_name
 FROM gps.ee_individual_limited i
 JOIN gps.ee_track_session_limited s
  ON s.individual_id = i.individual_id
 WHERE s.key_name = '< project name >' 
 ORDER BY i.colour_ring