
FRQ: maintenance jobs for hassio data #34

Open
AlexMKX opened this issue Mar 31, 2023 · 1 comment
Labels
enhancement New feature or request

Comments


AlexMKX commented Mar 31, 2023

Hello.
Thank you for such a great extension.

My setup polls some sensors every second, so around 600 MB of data is collected every day, and much of it is duplicates. The database runs on ZFS with dedupe and compression, so storage is not a big problem, but HASS performance degrades anyway.

I wrote a script that removes duplicate states and attributes from the PostgreSQL database and then vacuums it. It would be great to have it in the default installation.

-- Create a materialized view listing dupes (unchanged states) and record each
-- run's first state_id so old_state references can be repointed later.
create materialized view cleanup_dupes as (
    with d as (select state_id,
                      first_value(state_id) over w as fst,
                      last_value(state_id) over w as lst
               from public.states
               where to_timestamp(last_changed_ts) < (now() - interval '12 hours')
               -- the explicit frame matters: with the default frame,
               -- last_value() would not see the end of the partition
               window w as (partition by entity_id, last_changed_ts
                            order by last_updated_ts
                            rows between unbounded preceding and unbounded following))
    select *,
           state_id != fst and state_id != lst as to_delete
    from d
) with no data;

create index dupes_state_id on cleanup_dupes (state_id);
create index dupes_index_fst on cleanup_dupes (fst);
-- maintenance 
-- refresh matview with dupes
refresh materialized view cleanup_dupes;
-- fixup old_state to point at the surviving first state of each run
update public.states
set old_state_id = cleanup_dupes.fst
from cleanup_dupes
where cleanup_dupes.state_id = states.state_id
  and cleanup_dupes.fst != states.state_id
  and states.old_state_id is not null;

-- delete duped states
delete from public.states
where exists(select state_id
             from cleanup_dupes
             where states.state_id = cleanup_dupes.state_id
               and cleanup_dupes.to_delete = true);
-- delete attributes no longer referenced by any state
delete from public.state_attributes
where not exists (select 1
                  from public.states
                  where states.attributes_id = state_attributes.attributes_id);
-- free space
vacuum full analyze state_attributes, states;
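For anyone who wants to sanity-check the window-function logic before pointing it at a live recorder database, here is a small Python sketch of the same keep-first-and-last rule. The field names mirror the `states` columns; the helper name is made up for illustration:

```python
from itertools import groupby

def plan_dupe_cleanup(rows):
    """Mimic the cleanup_dupes view: within each (entity_id, last_changed_ts)
    run of unchanged states, keep the first and last row, mark the middle
    rows for deletion, and repoint every non-first row's old_state_id to
    the run's first state_id."""
    key = lambda r: (r["entity_id"], r["last_changed_ts"])
    # Sort the same way the window's PARTITION BY / ORDER BY would.
    rows = sorted(rows, key=lambda r: key(r) + (r["last_updated_ts"],))
    to_delete, old_state_fixup = [], {}
    for _, run in groupby(rows, key=key):
        run = list(run)
        fst, lst = run[0]["state_id"], run[-1]["state_id"]
        for r in run:
            sid = r["state_id"]
            if sid not in (fst, lst):
                to_delete.append(sid)       # middle of the run: a pure dupe
            if sid != fst:
                old_state_fixup[sid] = fst  # relink past the soon-deleted rows
    return to_delete, old_state_fixup
```

With four identical readings (state_id 10 through 13 for one entity), rows 11 and 12 are marked for deletion and rows 12 and 13 are repointed to 10, which is what the UPDATE and DELETE statements above do in SQL.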
Expaso (Owner) commented Jul 11, 2023

Hi @AlexMKX ,

Nice script! Very useful!

I see only one problem: to incorporate this into the addon, I would have to be sure that the addon is actually being used to store the Home Assistant recorder database itself. Most people do this, but not all.

I don't think I should sit in the user's chair and proactively delete data from their tables.
What is your view on this?

Expaso added the enhancement label Jul 20, 2023