其他分享
首页 > 其他分享> > [PG Upgrade Series] Extract Epoch Trap

[PG Upgrade Series] Extract Epoch Trap

作者:互联网

Background

In many cases applications need to get current Unix timestamp (seconds since 1970-01-01 00:00:00 UTC, also called Unix epoch or Unix time or POSIX time) as the following Linux command shows.

# date -d '2020-01-01' +%s
1577808000

The Extract Epoch Issue

After upgrading PostgreSQL 9.1 to PostgreSQL 11, the same SQL produces different results.

SELECT extract(epoch from '2020-01-01 00:00:00'::timestamp without time zone);

In PostgreSQL 9.1,

postgres=# SELECT version();
                                                    version
------------------------------------------------------------------------------------------------
 PostgreSQL 9.1.x on xxx
(1 row)
postgres=# SELECT extract(epoch from '2020-01-01 00:00:00'::timestamp without time zone);
 date_part  
------------
 1577808000
(1 row)
postgres=# SELECT to_timestamp(extract(epoch from '2020-01-01 00:00:00'::timestamp without time zone));
      to_timestamp      
------------------------
 2020-01-01 00:00:00+08
(1 row)

However in PostgreSQL 11, the outcome is NOT same !

postgres=# SELECT version();
                                                    version
------------------------------------------------------------------------------------------------
 PostgreSQL 11.x on xxx
(1 row)
postgres=# SELECT extract(epoch from '2020-01-01 00:00:00'::timestamp without time zone);
 date_part  
------------
 1577836800
(1 row)
postgres=# SELECT to_timestamp(extract(epoch from '2020-01-01 00:00:00'::timestamp without time zone));
      to_timestamp      
------------------------
 2020-01-01 08:00:00+08
(1 row)

Bug or Feature

After exploring PostgreSQL 9.1 and PostgreSQL 9.2 documentations, we can get some clues.

PostgreSQL 9.1 EXTRACT

epoch

For date and timestamp values, the number of seconds since 1970-01-01 00:00:00 UTC (can be negative); for interval values, the total number of seconds in the interval

PostgreSQL 9.2 EXTRACT

epoch

For timestamp with time zone values, the number of seconds since 1970-01-01 00:00:00 UTC (can be negative); for date and timestamp values, the number of seconds since 1970-01-01 00:00:00 local time; for interval values, the total number of seconds in the interval

As you can see, starting from PostgreSQL 9.2, it distinguishes between timestamp without time zone and timestamp with time zone, while PostgreSQL 9.1 not.

It makes sense as epoch is the seconds since 1970-01-01 00:00:00 UTC which is a timestamp with time zone.

When extracting epoch from a timestamp without time zone, strictly speaking, it should not use the word 'epoch' which always relates to UTC time zone.

Say, when extracting seconds from a timestamp without time zone, it will return the number of seconds since 1970-01-01 00:00:00 local time

Let's try to explain the weird result in PostgreSQL 11 when extracting seconds from a timestamp without time zone based on the documentation.

As shown below, the offset is 28800 seconds, i.e. 8 hours, which is exactly the same utc_offset of the local time zone PRC.

postgres=# SELECT version();
                                                    version
------------------------------------------------------------------------------------------------
 PostgreSQL 11.x on xxx
(1 row)
postgres=# SELECT 1577836800 - 1577808000;
 ?column? 
----------
    28800
(1 row)
postgres=# SELECT 28800/3600;
 ?column? 
----------
        8
(1 row)
postgres=# SELECT * FROM pg_timezone_names WHERE name = current_setting('TIMEZONE');
 name | abbrev | utc_offset | is_dst 
------+--------+------------+--------
 PRC  | CST    | 08:00:00   | f
(1 row)

When converting 1577836800 back to timestamp with to_timestamp(1577836800) , function to_timestamp treats the input parameter as UTC epoch time (the number of seconds since 1970-01-01 00:00:00 UTC) which is NOT the case.

postgres=# SELECT version();
                                                    version
------------------------------------------------------------------------------------------------
 PostgreSQL 11.x on xxx
(1 row)
postgres=# SELECT extract(epoch from '2020-01-01 00:00:00'::timestamp without time zone);
 date_part  
------------
 1577836800
(1 row)
postgres=# SELECT to_timestamp(1577836800);
      to_timestamp      
------------------------
 2020-01-01 08:00:00+08
(1 row)

In fact, starting from PostgreSQL 9.5 EXTRACT documentation, it already gave a notice.

epoch

Beware that applying to_timestamp to an epoch extracted from a date or timestamp value could produce a misleading result: the result will effectively assume that the original value had been given in UTC, which might not be the case.

Further more, PostgreSQL 9.2 Release Note also did explain the change.

  • Make EXTRACT(EPOCH FROM timestamp without time zone) measure the epoch from local midnight, not UTC midnight (Tom Lane)

    This change reverts an ill-considered change made in release 7.3. Measuring from UTC midnight was inconsistent because it made the result dependent on the timezone setting, which computations for timestamp without time zone should not be. The previous behavior remains available by casting the input value to timestamp with time zone.

So to fix the issue when upgrading PostgreSQL 9.1 to PostgreSQL 9.2 or later versions, simply convert the input timestamp to timestamp with time zone as below.

postgres=# SELECT version();
                                                    version
------------------------------------------------------------------------------------------------
 PostgreSQL 11.x on xxx
(1 row)
postgres=# SELECT extract(epoch from '2020-01-01 00:00:00'::timestamp with time zone);
 date_part  
------------
 1577808000
(1 row)

In a word, this is not a bug, but a trap. When you try to extract epoch from a timestamp without time zone in PostgreSQL 9.2 or later versions, you will fall into the trap.

Further testing

According to PostgreSQL 9.2 Release Note, when extracting epoch from timestamp without time zone, the result might be depending on the timezone setting.

Make EXTRACT(EPOCH FROM timestamp without time zone) measure the epoch from local midnight, not UTC midnight (Tom Lane)

Measuring from UTC midnight was inconsistent because it made the result dependent on the timezone setting, which computations for timestamp without time zone should not be.

Here are the test cases for three different time zones.

SELECT version();
SELECT * FROM pg_timezone_names WHERE name = current_setting('TIMEZONE');
SELECT extract(epoch from '2020-01-01 00:00:00'::timestamp without time zone);
SELECT to_timestamp(extract(epoch from '2020-01-01 00:00:00'::timestamp without time zone));
SELECT extract(epoch from '2020-01-01 00:00:00'::timestamp with time zone);
SELECT to_timestamp(extract(epoch from '2020-01-01 00:00:00'::timestamp with time zone));

SET TIME ZONE 'America/Los_Angeles';
SELECT * FROM pg_timezone_names WHERE name = current_setting('TIMEZONE');
SELECT extract(epoch from '2020-01-01 00:00:00'::timestamp without time zone);
SELECT to_timestamp(extract(epoch from '2020-01-01 00:00:00'::timestamp without time zone));
SELECT extract(epoch from '2020-01-01 00:00:00'::timestamp with time zone);
SELECT to_timestamp(extract(epoch from '2020-01-01 00:00:00'::timestamp with time zone));

SET TIME ZONE 'UTC';
SELECT * FROM pg_timezone_names WHERE name = current_setting('TIMEZONE');
SELECT extract(epoch from '2020-01-01 00:00:00'::timestamp without time zone);
SELECT to_timestamp(extract(epoch from '2020-01-01 00:00:00'::timestamp without time zone));
SELECT extract(epoch from '2020-01-01 00:00:00'::timestamp with time zone);
SELECT to_timestamp(extract(epoch from '2020-01-01 00:00:00'::timestamp with time zone));

Here summarizes above test cases.

Local (PRC) America/Los_Angeles UTC
9.1 without time zone 1577808000
(2020-01-01 00:00:00+08)
1577865600
(2020-01-01 00:00:00-08)
1577836800
(2020-01-01 00:00:00+00)
9.1 with time zone 1577808000
(2020-01-01 00:00:00+08)
1577865600
(2020-01-01 00:00:00-08)
1577836800
(2020-01-01 00:00:00+00)
11 without time zone 1577836800
(2020-01-01 08:00:00+08)
1577836800
(2019-12-31 16:00:00-08)
1577836800
(2020-01-01 00:00:00+00)
11 with time zone 1577808000
(2020-01-01 00:00:00+08)
1577865600
(2020-01-01 00:00:00-08)
1577836800
(2020-01-01 00:00:00+00)

We can see that:

  1. In PostgreSQL 9.1, extracting epoch from timestamp without time zone and timestamp with time zone produce the same result.

  2. In PostgreSQL 9.1, the result varies as time zone changes when extracting epoch from timestamp without time zone.

    While In PostgreSQL 11, extracting epoch from timestamp without time zone always return the same result regardless of time zone, since the relative seconds between each 1970-01-01 00:00:00 local time and its local now() is always same .

  3. In both PostgreSQL 9.1 and PostgreSQL 11, extracting epoch from timestamp with time zone always produces same correct result.

Following are the testing details.

Tests in PostgreSQL 9.1

postgres=# SELECT version();
                                                    version
------------------------------------------------------------------------------------------------
 PostgreSQL 9.1.x on xxx
(1 row)

postgres=# SELECT * FROM pg_timezone_names WHERE name = current_setting('TIMEZONE');
 name | abbrev | utc_offset | is_dst 
------+--------+------------+--------
 PRC  | CST    | 08:00:00   | f
(1 row)

postgres=# SELECT extract(epoch from '2020-01-01 00:00:00'::timestamp without time zone);
 date_part  
------------
 1577808000
(1 row)

postgres=# SELECT to_timestamp(extract(epoch from '2020-01-01 00:00:00'::timestamp without time zone));
      to_timestamp      
------------------------
 2020-01-01 00:00:00+08
(1 row)

postgres=# SELECT extract(epoch from '2020-01-01 00:00:00'::timestamp with time zone);
 date_part  
------------
 1577808000
(1 row)

postgres=# SELECT to_timestamp(extract(epoch from '2020-01-01 00:00:00'::timestamp with time zone));
      to_timestamp      
------------------------
 2020-01-01 00:00:00+08
(1 row)

postgres=# SET TIME ZONE 'America/Los_Angeles';
SET
postgres=# SELECT * FROM pg_timezone_names WHERE name = current_setting('TIMEZONE');
        name         | abbrev | utc_offset | is_dst 
---------------------+--------+------------+--------
 America/Los_Angeles | PST    | -08:00:00  | f
(1 row)

postgres=# SELECT extract(epoch from '2020-01-01 00:00:00'::timestamp without time zone);
 date_part  
------------
 1577865600
(1 row)

postgres=# SELECT to_timestamp(extract(epoch from '2020-01-01 00:00:00'::timestamp without time zone));
      to_timestamp      
------------------------
 2020-01-01 00:00:00-08
(1 row)

postgres=# SELECT extract(epoch from '2020-01-01 00:00:00'::timestamp with time zone);
 date_part  
------------
 1577865600
(1 row)

postgres=# SELECT to_timestamp(extract(epoch from '2020-01-01 00:00:00'::timestamp with time zone));
      to_timestamp      
------------------------
 2020-01-01 00:00:00-08
(1 row)

postgres=# SET TIME ZONE 'UTC';
SET

postgres=# SELECT * FROM pg_timezone_names WHERE name = current_setting('TIMEZONE');
 name | abbrev | utc_offset | is_dst 
------+--------+------------+--------
 UTC  | UTC    | 00:00:00   | f
(1 row)

postgres=# SELECT extract(epoch from '2020-01-01 00:00:00'::timestamp without time zone);
 date_part  
------------
 1577836800
(1 row)

postgres=# SELECT to_timestamp(extract(epoch from '2020-01-01 00:00:00'::timestamp without time zone));
      to_timestamp      
------------------------
 2020-01-01 00:00:00+00
(1 row)

postgres=# SELECT extract(epoch from '2020-01-01 00:00:00'::timestamp with time zone);
 date_part  
------------
 1577836800
(1 row)

postgres=# SELECT to_timestamp(extract(epoch from '2020-01-01 00:00:00'::timestamp with time zone));
      to_timestamp      
------------------------
 2020-01-01 00:00:00+00
(1 row)

Tests in PostgreSQL 11

postgres=# SELECT version();
                                                    version
------------------------------------------------------------------------------------------------
 PostgreSQL 11.x on xxx

postgres=# SELECT * FROM pg_timezone_names WHERE name = current_setting('TIMEZONE');
 name | abbrev | utc_offset | is_dst 
------+--------+------------+--------
 PRC  | CST    | 08:00:00   | f
(1 row)

postgres=# SELECT extract(epoch from '2020-01-01 00:00:00'::timestamp without time zone);
 date_part  
------------
 1577836800
(1 row)

postgres=# SELECT to_timestamp(extract(epoch from '2020-01-01 00:00:00'::timestamp without time zone));
      to_timestamp      
------------------------
 2020-01-01 08:00:00+08
(1 row)

postgres=# SELECT extract(epoch from '2020-01-01 00:00:00'::timestamp with time zone);
 date_part  
------------
 1577808000
(1 row)

postgres=# SELECT to_timestamp(extract(epoch from '2020-01-01 00:00:00'::timestamp with time zone));
      to_timestamp      
------------------------
 2020-01-01 00:00:00+08
(1 row)

postgres=# SET TIME ZONE 'America/Los_Angeles';
SET
postgres=# SELECT * FROM pg_timezone_names WHERE name = current_setting('TIMEZONE');
        name         | abbrev | utc_offset | is_dst 
---------------------+--------+------------+--------
 America/Los_Angeles | PST    | -08:00:00  | f
(1 row)

postgres=# SELECT extract(epoch from '2020-01-01 00:00:00'::timestamp without time zone);
 date_part  
------------
 1577836800
(1 row)

postgres=# SELECT to_timestamp(extract(epoch from '2020-01-01 00:00:00'::timestamp without time zone));
      to_timestamp      
------------------------
 2019-12-31 16:00:00-08
(1 row)

postgres=# SELECT extract(epoch from '2020-01-01 00:00:00'::timestamp with time zone);
 date_part  
------------
 1577865600
(1 row)

postgres=# SELECT to_timestamp(extract(epoch from '2020-01-01 00:00:00'::timestamp with time zone));
      to_timestamp      
------------------------
 2020-01-01 00:00:00-08
(1 row)

postgres=# SET TIME ZONE 'UTC';
SET

postgres=# SELECT * FROM pg_timezone_names WHERE name = current_setting('TIMEZONE');
 name | abbrev | utc_offset | is_dst 
------+--------+------------+--------
 UTC  | UTC    | 00:00:00   | f
(1 row)

postgres=# SELECT extract(epoch from '2020-01-01 00:00:00'::timestamp without time zone);
 date_part  
------------
 1577836800
(1 row)

postgres=# SELECT to_timestamp(extract(epoch from '2020-01-01 00:00:00'::timestamp without time zone));
      to_timestamp      
------------------------
 2020-01-01 00:00:00+00
(1 row)

postgres=# SELECT extract(epoch from '2020-01-01 00:00:00'::timestamp with time zone);
 date_part  
------------
 1577836800
(1 row)

postgres=# SELECT to_timestamp(extract(epoch from '2020-01-01 00:00:00'::timestamp with time zone));
      to_timestamp      
------------------------
 2020-01-01 00:00:00+00
(1 row)

Conclusion

To avoid the extract epoch trap regardless of PostgreSQL versions,

  1. Always extract epoch from timestamp with time zone or cast the input timestamp to timestamp with time zone.

    Following SQLs apply correctly on both PostgreSQL 9.1 and PostgreSQL 11.

    Get epoch time of some timestamp:

    postgres=# SELECT extract(epoch from '2020-01-01 00:00:00'::timestamp with time zone);
     date_part  
    ------------
     1577808000
    (1 row)
    postgres=# SELECT to_timestamp(1577808000);
          to_timestamp      
    ------------------------
     2020-01-01 00:00:00+08
    (1 row)
    

    Get epoch time of some date:

    postgres=# SELECT extract(epoch from '2020-01-01'::timestamp with time zone);
     date_part  
    ------------
     1577808000
    (1 row)
    postgres=# SELECT to_timestamp(1577808000);
          to_timestamp      
    ------------------------
     2020-01-01 00:00:00+08
    (1 row)
    

    Get the current epoch time:

    postgres=# SELECT extract(epoch from now());
        date_part     
    ------------------
     1641150931.25813
    (1 row)
    postgres=# SELECT to_timestamp(1641150931.25813);
             to_timestamp         
    ------------------------------
     2022-01-03 03:15:31.25813+08
    (1 row)
    

    Get epoch time of current date:

    postgres=# SELECT extract(epoch from current_date::timestamp with time zone);
     date_part  
    ------------
     1641139200
    (1 row)
    postgres=# SELECT to_timestamp(1641139200);
          to_timestamp      
    ------------------------
     2022-01-03 00:00:00+08
    (1 row)
    
  2. Always use timestamp with time zone everywhere including tables' columns

    e.g. always use TIMESTAMP WITH TIME ZONE data type for timestamp columns

    CREATE TABLE public.projects (
        id BIGSERIAL PRIMARY KEY,
        project_name VARCHAR(64),
        project_description TEXT,
        last_release_date DATE,
        total_cost NUMERIC(16,2),
        star_count INTEGER,
        pending_delete BOOLEAN DEFAULT false,
        create_time TIMESTAMP WITH TIME ZONE DEFAULT now() NOT NULL,
        update_time TIMESTAMP WITH TIME ZONE DEFAULT now() NOT NULL
    );
    

原文链接:
https://www.cnblogs.com/dbadaily/p/pg-extract.html
您浏览的网址与此链接不一致的话,则为未授权的转载,为了更好的阅读体验,建议阅读原文。

公众号

关注 DBA Daily 公众号,第一时间收到文章的更新。
通过一线 DBA 的日常工作,学习实用数据库技术干货!

公众号优质文章推荐

写了一个简单易用的 shell 框架

PostgreSQL VACUUM 之深入浅出

华山论剑之 PostgreSQL sequence

GitLab supports only PostgreSQL now

MySQL or PostgreSQL?

PostgreSQL hstore Insight

标签:00,Upgrade,zone,Series,time,Epoch,01,2020,timestamp
来源: https://www.cnblogs.com/dbadaily/p/pg-extract.html