[Bug]: compress_chunk_time_interval does not work with BIGINT #6896

Open

DmitriiAn opened this issue May 8, 2024 · 1 comment

DmitriiAn commented May 8, 2024

What type of bug is this?

Configuration, Incorrect result

What subsystems and features are affected?

Compression

What happened?

If the time column of a hypertable is a BIGINT, there is no way to use compress_chunk_time_interval (https://docs.timescale.com/api/latest/compression/alter_table_compression/).

I would expect to set timescaledb.compress_chunk_time_interval to a BIGINT value, but it does not work.

TimescaleDB version affected

2.14.2

PostgreSQL version used

15.6

What operating system did you use?

What installation method did you use?

Not applicable

What platform did you run on?

Timescale Cloud

Relevant log output and stack trace

No response

How can we reproduce the bug?

create table if not exists public.device_metrics(
	timestamp bigint,
	value double precision
);

SELECT create_hypertable('public.device_metrics', 'timestamp', chunk_time_interval => 86400000000000);  -- 1 day
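
As a quick sanity check (assuming the timescaledb_information.dimensions view on this version), the interval should show up in nanosecond units:

SELECT column_name, column_type, integer_interval
FROM timescaledb_information.dimensions
WHERE hypertable_name = 'device_metrics';
-- expected: integer_interval = 86400000000000 (1 day in nanoseconds)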

INSERT INTO public.device_metrics(timestamp, value)
SELECT
  (EXTRACT(EPOCH FROM gs) * 1e9)::BIGINT, -- epoch in nanoseconds
  random() * 100
FROM generate_series(
  '2022-01-01 00:00:00'::timestamp,
  '2022-06-01 00:00:00'::timestamp,
  '1 minute'::interval
) AS gs;

The insert produced 152 chunks:

SELECT count(*) FROM timescaledb_information.chunks 
WHERE hypertable_schema = 'public' and hypertable_name = 'device_metrics';
 count 
-------
   152
(1 row)

Prepare the table to be compressed: 

CREATE OR REPLACE FUNCTION unix_nano_now() RETURNS BIGINT LANGUAGE SQL IMMUTABLE AS $$ SELECT extract(epoch FROM now())::BIGINT * 1000000000 $$;

SELECT set_integer_now_func('public.device_metrics', 'unix_nano_now', replace_if_exists => true);
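
A quick way to check the integer-now function (assuming unix_nano_now is defined as above) is to make sure it returns the current time in the same nanosecond units as the data:

SELECT unix_nano_now();  -- current epoch in nanoseconds
SELECT max("timestamp") < unix_nano_now() FROM public.device_metrics;  -- should be true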

Enable compression: 
1. Try to use BIGINT first:
ALTER TABLE public.device_metrics SET (timescaledb.compress,
   timescaledb.compress_chunk_time_interval = 604800000000000
);  -- 7 days 
ERROR:  invalid value for timescaledb.compress_chunk_time_interval '604800000000000'
HINT:  timescaledb.compress_chunk_time_interval must be a valid interval

2. Try to use TEXT:
ALTER TABLE public.device_metrics SET (timescaledb.compress,
   timescaledb.compress_chunk_time_interval = '7 days'
);
WARNING:  compress chunk interval is not a multiple of chunk interval, you should use a factor of chunk interval to merge as much as possible
ALTER TABLE

SELECT
  total_chunks,
  number_compressed_chunks,
  pg_size_pretty(before_compression_total_bytes) AS before_compression,
  pg_size_pretty(after_compression_total_bytes) AS after_compression
FROM hypertable_compression_stats('public.device_metrics');
 total_chunks | number_compressed_chunks | before_compression | after_compression 
--------------+--------------------------+--------------------+-------------------
          152 |                        0 |                    | 
(1 row)


Looks like it worked. Compression enabled. Now we'll compress the chunks and see what happens. 

Compress all chunks manually: 
DO $$
DECLARE
    chunk_record regclass;
BEGIN
    -- show_chunks() returns one regclass per chunk of the hypertable
    FOR chunk_record IN SELECT * FROM show_chunks('public.device_metrics')
    LOOP
        PERFORM public.compress_chunk(chunk_record, true);
    END LOOP;
END $$;
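
Roughly equivalently, the loop can be written as a single statement (a sketch; compress_chunk takes the regclass values that show_chunks returns):

SELECT compress_chunk(c, if_not_compressed => true)
FROM show_chunks('public.device_metrics') AS c;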

We still have 152 compressed chunks; there should be about 7 times fewer.

SELECT
  total_chunks,
  number_compressed_chunks,
  pg_size_pretty(before_compression_total_bytes) AS before_compression,
  pg_size_pretty(after_compression_total_bytes) AS after_compression
FROM hypertable_compression_stats('public.device_metrics');
 total_chunks | number_compressed_chunks | before_compression | after_compression 
--------------+--------------------------+--------------------+-------------------
          152 |                      152 | 25 MB              | 12 MB
(1 row)
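
To confirm that no chunks were actually merged, the chunk ranges can be inspected (assuming the range_start_integer / range_end_integer columns that timescaledb_information.chunks exposes for integer time columns):

SELECT chunk_name, range_start_integer, range_end_integer
FROM timescaledb_information.chunks
WHERE hypertable_name = 'device_metrics'
ORDER BY range_start_integer
LIMIT 3;
-- each range still spans 86400000000000 (1 day in nanoseconds), not 7 days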


Since the docs (https://docs.timescale.com/api/latest/compression/alter_table_compression/) have no examples with BIGINT, I assume it does not work with this type.
akuzm (Member) commented May 8, 2024

Yeah, I guess it's not possible to configure this properly right now: compress_chunk_time_interval is always an Interval, and for integer columns it is converted to microseconds, while this data uses nanosecond units, which is where the mismatch comes from. I guess that to fix it, we could treat it the same way as chunk_time_interval, which is a bigint.

For the time being, I have a dumb workaround:

ALTER TABLE public.device_metrics SET (timescaledb.compress,
   timescaledb.compress_chunk_time_interval = '7000 days'    
);

Just increase the interval 1000x, so that when it's converted to microseconds, you get the desired value in nanoseconds.
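
The arithmetic behind the workaround (a sketch of the unit mismatch described above):

-- 7 days in nanoseconds:          7 * 86400 * 1e9 = 604800000000000
-- '7000 days' in microseconds: 7000 * 86400 * 1e6 = 604800000000000
SELECT (extract(epoch FROM interval '7000 days') * 1e6)::bigint AS interval_as_microseconds,
       (7::bigint * 86400 * 1000000000)                         AS seven_days_in_nanoseconds;
-- both return 604800000000000, so the 1000x-inflated interval lands on the intended nanosecond value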
