Difference between revisions of "ECMWF"

From ecology
Jump to: navigation, search
(Helpful Functions)
(Historical data)
 
(16 intermediate revisions by 4 users not shown)
Line 3: Line 3:
 
Datamodel for the tables in the ECMWF schema.  
 
Datamodel for the tables in the ECMWF schema.  
  
Data is created by the ECMWF and retrieved from the KNMI ftp site as GRIB files. The original GRIB files can also be  retrieved from the flysafe server from :
+
Data is created by the ECMWF and retrieved from the KNMI ftp site as GRIB files. The original GRIB files can also be  retrieved from the Flysafe server from :
  
 
[https://flysafe-pps.grid.sara.nl/archive/knmi/ecmwf/ ECMWF Data Archive]
 
[https://flysafe-pps.grid.sara.nl/archive/knmi/ecmwf/ ECMWF Data Archive]
Line 28: Line 28:
 
The GIS LOCATION is only added to this table. This is because always the same values for latitude and  longitude are used and then if the GIS location is required for the ANALYSIS or FORECAST tables it can be retrieved by joining to this table on the LATITUDE, LONGUTUDE fields.
 
The GIS LOCATION is only added to this table. This is because always the same values for latitude and  longitude are used and then if the GIS location is required for the ANALYSIS or FORECAST tables it can be retrieved by joining to this table on the LATITUDE, LONGUTUDE fields.
  
== FORECAST ==  
+
== FORECAST and UVA_FORECAST ==  
  
 
The FORECAST table contains predicted values in steps of 3 hours till 72 hours ahead. The MODELTS is the timestamp for which this prediction applies. Two times a day new Forecast GRIB files will arrive. The newer FORECAST data that arrives for the same timestamp will overwrite the older data.
 
The FORECAST table contains predicted values in steps of 3 hours till 72 hours ahead. The MODELTS is the timestamp for which this prediction applies. Two times a day new Forecast GRIB files will arrive. The newer FORECAST data that arrives for the same timestamp will overwrite the older data.
  
The FORECAST table itself is a master table that does not contain any data. The inherited tables  
+
The FORECAST (and UVA_FORECAST) table itself is a master table that does not contain any data. The inherited tables  
FORECAST<YEAR><MONTH> contain the data, but you the master FORECAST table can be used to SELECT the data.  
+
FORECAST<YEAR><MONTH> contain the data, but the master FORECAST or UVA_FORECAST table can be used to SELECT the data.
 +
 
 +
Data till 2009-09 is owned by the UvA and can be accessed by anybody with ECMWF_READ rights by using the FORECAST_UVA table. The FORECAST table itself also references data after 2009-09 that is owned by RNLAF for which special access rights ECMWF_RNLAF_READ are required.  
  
 
{|border=1
 
{|border=1
Line 54: Line 56:
 
|TCC||164 TCC total cloud coverage
 
|TCC||164 TCC total cloud coverage
 
|-
 
|-
|U10||165 10U 10 meters wind U-component
+
|U10||165 10U 10 meters wind U-component m/s
 
|-
 
|-
|V10||166 10V 10 meters wind V-component
+
|V10||166 10V 10 meters wind V-component m/s
 
|-
 
|-
 
|T2||167 2T 2 metres air temperature
 
|T2||167 2T 2 metres air temperature
Line 113: Line 115:
 
|}
 
|}
  
== ANALYSIS ==
+
== ANALYSIS and UVA_ANALYSIS ==
 +
 
 +
The ANALYSIS and UVA_ANALYSIS table contains values from the Analysis GRIB files at 18, 00, 06 and 12 UTC
 +
 
 +
Data till 2009-09 is owned by the UvA and can be accessed by anybody with ECMWF_READ rights by using the ANALYSIS_UVA table. The ANALYSIS table itself also references data after 2009-09 that is owned by RNLAF for which special access rights ECMWF_RNLAF_READ are required.
  
The ANALYSIS table contains values from the Analysis GRIB files at 18, 00, 06 and 12 UTC
 
 
   
 
   
 
{|border=1
 
{|border=1
Line 196: Line 201:
 
ANALYSISHISTORY which inherit from FORECAST and ANALYSIS respectively.  
 
ANALYSISHISTORY which inherit from FORECAST and ANALYSIS respectively.  
  
The FORECASTHISTORY does not always contain TP (228 TP total precipitation accumulated).  Instead of this it contains the following values :
+
The FORECASTHISTORY does not always contain TP (228 TP total precipitation accumulated).  Instead, it contains the following values:
  
 
{|border=1
 
{|border=1
Line 216: Line 221:
 
|}
 
|}
  
== Helpful Functions ==  
+
== Helpful Functions & Queries ==  
  
  
Line 225: Line 230:
  
  
They can be used in the following way:
+
They can be used in the following way, which will return all the fields in the relevant ECMWF data:
  
   select * from ecmwf.forecast_nearest( 52.35983, 5.615, timestamp '2008-05-19 13:34:00');
+
   SELECT * FROM ecmwf.forecast_nearest( 52.35983, 5.615, timestamp '2008-05-19 13:34:00');
 
    
 
    
   select * from ecmwf.analysis_nearest( 52.35983, 5.615, timestamp '2008-05-19 13:34:00');
+
   SELECT * FROM ecmwf.analysis_nearest( 52.35983, 5.615, timestamp '2008-05-19 13:34:00');
 +
 
 +
Note that the model latitude and model longitude are returned as integers and multiples of 100.  To return latititude and longitude as a decimal (for example to compare to the latitude and longitude of your GPS data) you need to first convert these fields to decimals:
 +
 
 +
  SELECT f.latitude/100::decimal, f.longitude/100::decimal
 +
  FROM ecmwf.forecast_nearest( 52.35983, 5.615, timestamp '2008-05-19 13:34:00') f;
  
 
These function can for example be used in a join as follows :
 
These function can for example be used in a join as follows :
  
   SELECT g.ptt_id
+
   SELECT g.device_info_serial
 
       , g.date_time
 
       , g.date_time
 
       , g.latitude
 
       , g.latitude
 
       , g.longitude
 
       , g.longitude
       , g.speed
+
       , g.speed_2d
      , g.course
 
 
       , g.altitude
 
       , g.altitude
 
       , (ecmwf.forecast_nearest( g.latitude, g.longitude,g.date_time)).*
 
       , (ecmwf.forecast_nearest( g.latitude, g.longitude,g.date_time)).*
   FROM gps.gps_tracking_data g
+
   FROM gps.ee_tracking_speed_limited g
    , gps.individual i
+
   WHERE g.device_info_serial = 1  
    , gps.track_session t
 
   WHERE g.ptt_id = t.ptt_id
 
    AND g.date_time >= t.start_date
 
    AND i.ring_number = t.ring_number
 
    AND g.parser_qc = 1  
 
    AND g.ptt_id = 41745
 
 
     AND g.latitude <> 0
 
     AND g.latitude <> 0
 +
    AND g.userflag = 0
 
   ORDER BY g.date_time;
 
   ORDER BY g.date_time;
  
  
Although these functions were converted to PLPgSQL to speed them up, a faste ways to retrieve  
+
Although these functions were converted to PLPgSQL to speed them up, a faster ways to retrieve the same data is with (thanks to Floris Sluiters):  
the same data is with (thanks to Floris Sluiters):  
 
  
   SELECT ptt_id, date_time, latitude, longitude, speed, course, altitude, (x::ecmwf.forecast).*
+
   SELECT device_info_serial, date_time, latitude, longitude, speed_2d, altitude, (x::ecmwf.forecast).*
 
   FROM (SELECT  
 
   FROM (SELECT  
 
       g.*
 
       g.*
Line 265: Line 268:
 
             AND f.modelts = TIMESTAMP 'epoch' + round(EXTRACT(EPOCH FROM g.date_time)  
 
             AND f.modelts = TIMESTAMP 'epoch' + round(EXTRACT(EPOCH FROM g.date_time)  
 
                 / (60*60*3))::integer * INTERVAL '3 hours' limit 1)::ecmwf.forecast) as x  
 
                 / (60*60*3))::integer * INTERVAL '3 hours' limit 1)::ecmwf.forecast) as x  
         FROM  gps.gps_tracking_data g
+
         FROM  gps.ee_tracking_speed_limited g
            , gps.individual i
+
         WHERE g.device_info_serial = 1
            , gps.track_session t
+
            AND g.latitude <> 0
         WHERE g.ptt_id = t.ptt_id
+
            AND g.userflag = 0
          AND g.date_time >= t.start_date
 
          AND i.ring_number = t.ring_number
 
          AND g.parser_qc = 1
 
          AND g.ptt_id = 41475
 
          AND g.latitude <> 0
 
 
           ORDER BY g.date_time) t
 
           ORDER BY g.date_time) t
 +
 +
Below is example query that retrieves data for a model grid point closest to user provided coordinates and for one model time stamp (eg. 12:00) for a series of consecutive days
 +
 +
  SELECT f.modelts, f.latitude, f.longitude, f.blh
 +
  FROM ecmwf.forecast as f
 +
  WHERE f.latitude = round(42.718104*100/25)::integer*25 AND f.longitude = round(11.517727*100/25)::integer*25 and
 +
    f.modelts >= '2007-08-01 12:00' and f.modelts <= '2008-08-01 12:00' and date_part('hour', f.modelts) = 12
 +
  ORDER BY f.modelts 
 +
 +
 +
=== Interpolation ===
 +
 +
The following functions are added so that simple linear interpolation for a specific point in space and time can be done in Postgres.
 +
 +
These are '''ecmwf.forecast_interpolate''' and '''ecmwf.analysis_interpolate''' and they take a latitude, longitude and timestamp as parameter.
 +
 +
They can be used in the same way:
 +
 +
  select * from ecmwf.forecast_interpolate( 52.35983, 5.615, timestamp '2008-05-19 13:34:00');
 +
 
 +
  select * from ecmwf.analysis_interpolate( 52.35983, 5.615, timestamp '2008-05-19 13:34:00');
 +
 +
And they can also be used in joins as above. These functions select the 8 surrounding points in latitude, longitude and date_time and interpolate linear in latitude, longitude and date_time successively. The '''interpolate''' functions are much slower then the '''nearest''' functions.

Latest revision as of 10:26, 22 February 2018

ECMWF

Datamodel for the tables in the ECMWF schema.

Data is created by the ECMWF and retrieved from the KNMI ftp site as GRIB files. The original GRIB files can also be retrieved from the Flysafe server from :

ECMWF Data Archive

More information about the parameters in these GRIB files can be retrieved from the ECMWF website.

INVARIABLE

The INVARIABLE table contains data that is invariant over time.

LATITUDE degrees * 100 to make this a integer
LONGITUDE degrees * 100 to make this a integer
Z 129 Z Geopotential surface/model orography
LSM 172 LSM Land sea mask
LOCATION Latitude/longitude as GIS location


The GIS LOCATION is only added to this table. This is because always the same values for latitude and longitude are used and then if the GIS location is required for the ANALYSIS or FORECAST tables it can be retrieved by joining to this table on the LATITUDE, LONGUTUDE fields.

FORECAST and UVA_FORECAST

The FORECAST table contains predicted values in steps of 3 hours till 72 hours ahead. The MODELTS is the timestamp for which this prediction applies. Two times a day new Forecast GRIB files will arrive. The newer FORECAST data that arrives for the same timestamp will overwrite the older data.

The FORECAST (and UVA_FORECAST) table itself is a master table that does not contain any data. The inherited tables FORECAST<YEAR><MONTH> contain the data, but the master FORECAST or UVA_FORECAST table can be used to SELECT the data.

Data till 2009-09 is owned by the UvA and can be accessed by anybody with ECMWF_READ rights by using the FORECAST_UVA table. The FORECAST table itself also references data after 2009-09 that is owned by RNLAF for which special access rights ECMWF_RNLAF_READ are required.

MODELTS Time for which this forecast applies = model run + step hours
LATITUDE degrees * 100 to make this a integer
LONGITUDE degrees * 100 to make this a integer
STEP model step (smaller is more recent forecast)
SSRD 169 SSRD surface solar radiation downwards accumulated
TP 228 TP total precipitation accumulated
SP 134 SP surface pressure
BLH 159 BLH boundary layer height
TCC 164 TCC total cloud coverage
U10 165 10U 10 meters wind U-component m/s
V10 166 10V 10 meters wind V-component m/s
T2 167 2T 2 metres air temperature
D2 168 2D 2 metres dew point temperature
LCC 186 LCC low cloud coverage
Pressure level 1000
T_1000 130 T temperature
U_1000 131 U U-wind component
V_1000 132 V V-wind component
GH_1000 156 GH geopotential height
R_1000 157 R relative humidity
Pressure level 925
T_925 real 130 T temperature
U_925 real 131 U U-wind component
V_925 real 132 V V-wind component
GH_925 real 156 GH geopotential height
R_925 real 157 R relative humidity
Pressure level 850
T_850 130 T temperature
U_850 real 131 U U-wind component
V_850 real 132 V V-wind component
GH_850 real 156 GH geopotential height
R_850 real 157 R relative humidity
Pressure level 700
T_700 real 130 T temperature
U_700 real 131 U U-wind component
V_700 real 132 V V-wind component
GH_700 real 156 GH geopotential height
R_700 real 157 R relative humidity

ANALYSIS and UVA_ANALYSIS

The ANALYSIS and UVA_ANALYSIS table contains values from the Analysis GRIB files at 18, 00, 06 and 12 UTC

Data till 2009-09 is owned by the UvA and can be accessed by anybody with ECMWF_READ rights by using the ANALYSIS_UVA table. The ANALYSIS table itself also references data after 2009-09 that is owned by RNLAF for which special access rights ECMWF_RNLAF_READ are required.


MODELTS Time for which this forecast applies = model run
LATITUDE degrees * 100 to make this a integer
LONGITUDE degrees * 100 to make this a integer
SP 134 SP surface pressure
TCC 164 TCC total cloud coverage
U10 165 10U 10 meters wind U-component
V10 166 10V 10 meters wind V-component
T2 167 2T 2 metres air temperature
D2 168 2D 2 metres dew point temperature
LCC 186 LCC low cloud coverage
Pressure level 1000
T_1000 130 T temperature
U_1000 131 U U-wind component
V_1000 132 V V-wind component
GH_1000 156 GH geopotential height
R_1000 157 R relative humidity
Pressure level 925
T_925 130 T temperature
U_925 131 U U-wind component
V_925 132 V V-wind component
GH_925 156 GH geopotential height
R_925 157 R relative humidity
Pressure level 850
T_850 130 T temperature
U_850 131 U U-wind component
V_850 132 V V-wind component
GH_850 156 GH geopotential height
R_850 157 R relative humidity
Pressure level 700
T_700 130 T temperature
U_700 131 U U-wind component
V_700 132 V V-wind component
GH_700 156 GH geopotential height
R_700 157 R relative humidity

COMBINED

If you want to use Analysis data if available for a timestimp but otherwise Forecast, the COMBINED VIEW can be used. If a ANALYSIS value is available for a timestamp and latitude/longitude the ANALYSIS value will be used, otherwise the FORECAST value is given.

Historical data

In addition ongoing process of adding ECMWF data since November 11 2007, a historical data set has been loaded. That historical dataset is stored in additional inherited tables FORECASTHISTORY and ANALYSISHISTORY which inherit from FORECAST and ANALYSIS respectively.

The FORECASTHISTORY does not always contain TP (228 TP total precipitation accumulated). Instead, it contains the following values:

LSP 142 LSP Large scale precipitation
CP 143 CP Convective precipitation

The historical dat sets span the following periods and locations

Start End Area Grid
2007-01-01 2007-11-15 62degN 10degW 335degN 20degE 0.25x0.25deg
2003-02-15 2003-06-15 62degN 10degW 335degN 20degE 0.5x0.5deg
2005-01-01 2006-12-31 62degN 10degW 335degN 20degE 0.5x0.5deg

Helpful Functions & Queries

The following functions have been defined in Postgres to select the nearest gridpoint in spacetime for the ECMWF analysis and forecast tables.


These are ecmwf.forecast_nearest and ecmwf.analysis_nearest and they take a latitude, longitude and timestamp as parameter.


They can be used in the following way, which will return all the fields in the relevant ECMWF data:

 SELECT * FROM ecmwf.forecast_nearest( 52.35983, 5.615, timestamp '2008-05-19 13:34:00');
  
 SELECT * FROM ecmwf.analysis_nearest( 52.35983, 5.615, timestamp '2008-05-19 13:34:00');

Note that the model latitude and model longitude are returned as integers and multiples of 100. To return latititude and longitude as a decimal (for example to compare to the latitude and longitude of your GPS data) you need to first convert these fields to decimals:

 SELECT f.latitude/100::decimal, f.longitude/100::decimal
 FROM ecmwf.forecast_nearest( 52.35983, 5.615, timestamp '2008-05-19 13:34:00') f;

These function can for example be used in a join as follows :

 SELECT g.device_info_serial
      , g.date_time
      , g.latitude
      , g.longitude
      , g.speed_2d
      , g.altitude
      , (ecmwf.forecast_nearest( g.latitude, g.longitude,g.date_time)).*
 FROM gps.ee_tracking_speed_limited g
 WHERE g.device_info_serial = 1 
   AND g.latitude <> 0
   AND g.userflag = 0
 ORDER BY g.date_time;


Although these functions were converted to PLPgSQL to speed them up, a faster ways to retrieve the same data is with (thanks to Floris Sluiters):

 SELECT device_info_serial, date_time, latitude, longitude, speed_2d, altitude, (x::ecmwf.forecast).*
 FROM (SELECT 
      g.*
      ,((SELECT f 
           FROM ecmwf.forecast f
          WHERE f.latitude = round(g.latitude * 100 / 25)::integer * 25
            AND f.longitude =  round(g.longitude * 100 / 25)::integer * 25
            AND f.modelts = TIMESTAMP 'epoch' + round(EXTRACT(EPOCH FROM g.date_time) 
                / (60*60*3))::integer * INTERVAL '3 hours' limit 1)::ecmwf.forecast) as x 
       FROM   gps.ee_tracking_speed_limited g
       WHERE g.device_info_serial = 1 
            AND g.latitude <> 0
            AND g.userflag = 0
         ORDER BY g.date_time) t

Below is example query that retrieves data for a model grid point closest to user provided coordinates and for one model time stamp (eg. 12:00) for a series of consecutive days

 SELECT f.modelts, f.latitude, f.longitude, f.blh
 FROM ecmwf.forecast as f
 WHERE f.latitude = round(42.718104*100/25)::integer*25 AND f.longitude = round(11.517727*100/25)::integer*25 and 
   f.modelts >= '2007-08-01 12:00' and f.modelts <= '2008-08-01 12:00' and date_part('hour', f.modelts) = 12
 ORDER BY f.modelts   


Interpolation

The following functions are added so that simple linear interpolation for a specific point in space and time can be done in Postgres.

These are ecmwf.forecast_interpolate and ecmwf.analysis_interpolate and they take a latitude, longitude and timestamp as parameter.

They can be used in the same way:

 select * from ecmwf.forecast_interpolate( 52.35983, 5.615, timestamp '2008-05-19 13:34:00');
  
 select * from ecmwf.analysis_interpolate( 52.35983, 5.615, timestamp '2008-05-19 13:34:00');

And they can also be used in joins as above. These functions select the 8 surrounding points in latitude, longitude and date_time and interpolate linear in latitude, longitude and date_time successively. The interpolate functions are much slower then the nearest functions.