Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Memory control for external sorting #63731

Open
liuneng1994 opened this issue May 14, 2024 · 3 comments
Open

Memory control for external sorting #63731

liuneng1994 opened this issue May 14, 2024 · 3 comments
Labels

Comments

@liuneng1994
Copy link
Contributor

liuneng1994 commented May 14, 2024

image

In the current implementation, when sorting data, some data will be retained in memory. In some extreme scenarios, such as when the row size is very large, considerable additional memory overhead will be introduced during data merging. In this way, the memory for cached data (close to max_bytes_before_external_sort) and the memory required for merging data will be much higher than max_bytes_before_external_sort, resulting in insufficient memory.
Is it possible to serialize these tail data to the disk? Although this will lose some performance, the memory control will be more friendly. Since external sorting is enabled, the memory control benefits should outweigh the overhead of serialization.
Any ideas?

relate pr :#62904

@liuneng1994
Copy link
Contributor Author

@nickitat @KochetovNicolai

@nickitat
Copy link
Member

do you have a specific example when a query actually fails this way?

@liuneng1994
Copy link
Contributor Author

liuneng1994 commented May 17, 2024

do you have a specific example when a query actually fails this way?

drop table if exists test_order;
create table test_order
(
    val00  Int64,
    val01  Int64,
    val02  Int64,
    val03  Int64,
    val04  Int64,
    val05  Int64,
    val06  Int64,
    val07  Int64,
    val08  Int64,
    val09  Int64,
    val10  Int64,
    val11  Int64,
    val12  Int64,
    val13  Int64,
    val14  Int64,
    val15  Int64,
    val16  Int64,
    val17  Int64,
    val18  Int64,
    val19  Int64,
    val20  Int64,
    val21  Int64,
    val22  Int64,
    val23  Int64,
    val24  Int64,
    val25  Int64,
    val26  Int64,
    val27  Int64,
    val28  Int64,
    val29  Int64,
    val30  Int64,
    val31  Int64,
    val32  Int64,
    val33  Int64,
    val34  Int64,
    val35  Int64,
    val36  Int64,
    val37  Int64,
    val38  Int64,
    val39  Int64,
    val40  Int64,
    val41  Int64,
    val42  Int64,
    val43  Int64,
    val44  Int64,
    val45  Int64,
    val46  Int64,
    val47  Int64,
    val48  Int64,
    val49  Int64,
    val50  Int64,
    val51  Int64,
    val52  Int64,
    val53  Int64,
    val54  Int64,
    val55  Int64,
    val56  Int64,
    val57  Int64,
    val58  Int64,
    val59  Int64,
    val60  Int64,
    val61  Int64,
    val62  Int64,
    val63  Int64,
    val64  Int64,
    val65  Int64,
    val66  Int64,
    val67  Int64,
    val68  Int64,
    val69  Int64,
    val70  Int64,
    val71  Int64,
    val72  Int64,
    val73  Int64,
    val74  Int64,
    val75  Int64,
    val76  Int64,
    val77  Int64,
    val78  Int64,
    val79  Int64,
    val80  Int64,
    val81  Int64,
    val82  Int64,
    val83  Int64,
    val84  Int64,
    val85  Int64,
    val86  Int64,
    val87  Int64,
    val88  Int64,
    val89  Int64,
    val90  Int64,
    val91  Int64,
    val92  Int64,
    val93  Int64,
    val94  Int64,
    val95  Int64,
    val96  Int64,
    val97  Int64,
    val98  Int64,
    val99  Int64,
    val100 Int64,
    val101 Int64,
    val102 Int64,
    val103 Int64,
    val104 Int64,
    val105 Int64,
    val106 Int64,
    val107 Int64,
    val108 Int64,
    val109 Int64,
    val110 Int64,
    val111 Int64,
    val112 Int64,
    val113 Int64,
    val114 Int64,
    val115 Int64,
    val116 Int64,
    val117 Int64,
    val118 Int64,
    val119 Int64,
    val120 Int64,
    val121 Int64,
    val122 Int64,
    val123 Int64,
    val124 Int64,
    val125 Int64,
    val126 Int64,
    val127 Int64,
    val128 Int64,
    val129 Int64,
    val130 Int64,
    val131 Int64,
    val132 Int64,
    val133 Int64,
    val134 Int64,
    val135 Int64,
    val136 Int64,
    val137 Int64,
    val138 Int64,
    val139 Int64,
    val140 Int64,
    val141 Int64,
    val142 Int64,
    val143 Int64,
    val144 Int64,
    val145 Int64,
    val146 Int64,
    val147 Int64,
    val148 Int64,
    val149 Int64,
    val150 Int64,
    val151 Int64,
    val152 Int64,
    val153 Int64,
    val154 Int64,
    val155 Int64,
    val156 Int64,
    val157 Int64,
    val158 Int64,
    val159 Int64,
    val160 Int64,
    val161 Int64,
    val162 Int64,
    val163 Int64,
    val164 Int64,
    val165 Int64,
    val166 Int64,
    val167 Int64,
    val168 Int64,
    val169 Int64,
    val170 Int64,
    val171 Int64,
    val172 Int64,
    val173 Int64,
    val174 Int64,
    val175 Int64,
    val176 Int64,
    val177 Int64,
    val178 Int64,
    val179 Int64,
    val180 Int64,
    val181 Int64,
    val182 Int64,
    val183 Int64,
    val184 Int64,
    val185 Int64,
    val186 Int64,
    val187 Int64,
    val188 Int64,
    val189 Int64,
    val190 Int64,
    val191 Int64,
    val192 Int64,
    val193 Int64,
    val194 Int64,
    val195 Int64,
    val196 Int64,
    val197 Int64,
    val198 Int64,
    val199 Int64,
    val200 Int64,
    val201 Int64,
    val202 Int64,
    val203 Int64,
    val204 Int64,
    val205 Int64,
    val206 Int64,
    val207 Int64,
    val208 Int64,
    val209 Int64,
    val210 Int64,
    val211 Int64,
    val212 Int64,
    val213 Int64,
    val214 Int64,
    val215 Int64,
    val216 Int64,
    val217 Int64,
    val218 Int64,
    val219 Int64,
    val220 Int64,
    val221 Int64,
    val222 Int64,
    val223 Int64,
    val224 Int64,
    val225 Int64,
    val226 Int64,
    val227 Int64,
    val228 Int64,
    val229 Int64,
    val230 Int64,
    val231 Int64,
    val232 Int64,
    val233 Int64,
    val234 Int64,
    val235 Int64,
    val236 Int64,
    val237 Int64,
    val238 Int64,
    val239 Int64,
    val240 Int64,
    val241 Int64,
    val242 Int64,
    val243 Int64,
    val244 Int64,
    val245 Int64,
    val246 Int64,
    val247 Int64,
    val248 Int64,
    val249 Int64,
    val250 Int64,
    val251 Int64,
    val252 Int64,
    val253 Int64,
    val254 Int64,
    val255 Int64,
    val256 Int64,
    val257 Int64,
    val258 Int64,
    val259 Int64,
    val260 Int64,
    val261 Int64,
    val262 Int64,
    val263 Int64,
    val264 Int64,
    val265 Int64,
    val266 Int64,
    val267 Int64,
    val268 Int64,
    val269 Int64,
    val270 Int64,
    val271 Int64,
    val272 Int64,
    val273 Int64,
    val274 Int64,
    val275 Int64,
    val276 Int64,
    val277 Int64,
    val278 Int64,
    val279 Int64,
    val280 Int64,
    val281 Int64,
    val282 Int64,
    val283 Int64,
    val284 Int64,
    val285 Int64,
    val286 Int64,
    val287 Int64,
    val288 Int64,
    val289 Int64,
    val290 Int64,
    val291 Int64,
    val292 Int64,
    val293 Int64,
    val294 Int64,
    val295 Int64,
    val296 Int64,
    val297 Int64,
    val298 Int64,
    val299 Int64
) engine = MergeTree order by tuple();

insert into test_order
select number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number,
       number
from numbers(8000000);

-- max_memory_usage 5.4GB max_bytes_before_external_sort 5.3GB total data size 20.61GB

set max_memory_usage = 5798205849;
set memory_overcommit_ratio_denominator = 0;
select *
from test_order
order by val00 settings max_bytes_before_external_sort = 5690831667, prefer_external_sort_block_bytes = 150000000 format
         Null;

The above is an extreme example, but it does have a probability of happening in the insert select scenario.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants