One thing I find really handy for the all.pages table is setting rank = 1000 as a quick way to get results and save costs while still seeing real data (often the more interesting data too, to be honest!).
We can't do that with the all.requests table. We also can't quickly look up the requests for a single site, so we can't work around this via the all.pages table either. It would be handy to be able to do either of these by clustering the all.requests table by page or rank.
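For reference, this is roughly the kind of query I mean on all.pages. It's only a sketch: the full table path, the date value and the `date` partition filter are assumptions for illustration and aren't spelled out above.

```sql
-- Sketch only: table path and date are assumed for illustration.
SELECT
  page,
  rank
FROM
  `httparchive.all.pages`
WHERE
  date = '2024-06-01'    -- assumed partition date
  AND client = 'mobile'
  AND is_root_page
  AND rank = 1000        -- small slice of real data, cheap to scan
```

There is currently no equivalent cheap filter on all.requests, since neither rank nor page is a clustering column there.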
However, there is a maximum of 4 clustering columns, and we're already using all 4 for all.requests:
client
is_root_page
is_main_document
type
These are all useful, so we'd need to drop one if we wanted to add a new column.
I think is_main_document is useful, but it can mostly be replicated by filtering on type='html' AND is_main_document (not entirely, but in 99.8% of cases, and the most useful ones!), so I'd prefer to replace it with either page or rank. I'm leaning towards page, as we can use that to get rank, but I'm open to ideas.
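To illustrate the replacement filter, a main-document lookup after the change might look something like the sketch below. The table path, date value and the `url` column in the SELECT are assumptions for illustration; the idea is that type would still be a clustering column and do most of the narrowing, while is_main_document becomes an ordinary filter.

```sql
-- Sketch only: table path, date and selected column are assumed for illustration.
SELECT
  url
FROM
  `httparchive.all.requests`
WHERE
  date = '2024-06-01'    -- assumed partition date
  AND client = 'mobile'
  AND is_root_page
  AND type = 'html'      -- still clustered, so this narrows the scan
  AND is_main_document   -- no longer clustered under this proposal
```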