Release v0.8.0

@xianjingfeng xianjingfeng released this 13 Dec 02:58
· 265 commits to master since this release
aa25cfa

Apache Uniffle (Incubating) Release v0.8.0

Highlights

  • Support Tez
  • Introduce Netty for shuffle data transmission
  • Use off-heap memory to store shuffle data
  • Introduce a REST API for cluster management
  • Introduce a command-line interface for cluster management

ChangeLog

  • Change license owner to ASF by @kaijchen in #5
  • Trivial code improvements by @wForget in #7
  • [Minor] Store shuffleId int to be consistent with other data structure by @zuston in #10
  • Introduce the asList method in ConfigOptions by @zuston in #9
  • Rename package by @jerqi in #6
  • Minimize apache-rat excluded files by @kaijchen in #11
  • Update module names by @kaijchen in #12
  • Convert PartitionAssignmentInfo to static inner class by @pan3793 in #15
  • [Followup] Migrate to Junit5 by @zuston in #14
  • [Bug] Fix NPE problem when process the event if application was cleared already by @colinmjj in #16
  • [CI] Enable codecov report by @kaijchen in #17
  • Correct the config description and fix typo by @zuston in #19
  • Add CI and Codecov badges in README by @kaijchen in #20
  • [Followup] Use asList method in some existing configOptions by @zuston in #18
  • Move rss-integration-spark-common-test module package by @wForget in #23
  • [INFRA] Improve asf.yaml to reduce the notifications by @jerryshao in #25
  • [TEST] Improve code coverage in rss-common by @kaijchen in #26
  • Remove redundant package by @wForget in #27
  • [CI] Switch to temurin JDK by @kaijchen in #24
  • [INFRA] Improve asf.yaml to reduce the notifications (another-try) by @jerryshao in #33
  • Bump commons-lang3 from 3.5 to 3.10 by @wForget in #28
  • Fix the log of incorrectly bound class by @wForget in #35
  • [TYPO] Fix misspelled word "integration" by @kaijchen in #34
  • Fix some hyperlink in README.md by @daugraph in #32
  • Upgrade gRPC to support Apple Silicon by @pan3793 in #13
  • Allow to specify custom tags to decide the assignment of servers by @zuston in #30
  • Optimize the bash script by @zuston in #29
  • [Improvement] reduce compiler warnings by @advancedxy in #46
  • [Chore]: document update and build time optimize by @advancedxy in #45
  • Supplement doc about assignment tags by @zuston in #47
  • [Bug] Fix skip() api maybe skip unexpected bytes which makes inconsistent data by @colinmjj in #40
  • [improvement] Remove experimental feature with ShuffleUploader by @colinmjj in #51
  • [Improvement] Provides utility classes for creating thread factories by @smallzhongfeng in #49
  • Enable spotbugs and fix high priority bugs by @kaijchen in #38
  • [CI] Change default checkstyle severity to error by @kaijchen in #57
  • [Style] Check indentation by @kaijchen in #56
  • [Experimental Feature] MR Supports Remote Spill by @frankliee in #55
  • [Improvement] Log indicate the shuffle server host:port when doing re… by @zuston in #58
  • Send commit concurrently in client side by @zuston in #59
  • Explicitly set the constructor with AccessManager when extending AccessChecker by @zuston in #43
  • [DOC] Replace Firestorm with Uniffle by @jerqi in #60
  • Introduce the extraProperties to support user-defined pluggable accessCheckers by @zuston in #42
  • Log enhancement: Merge multiple logs into oneline and add more description by @zuston in #62
  • [TEST] Add more unit tests in rss-common by @kaijchen in #63
  • [MINOR] Comments of PartitionBalanceAssignmentStrategy miss byte units by @smallzhongfeng in #68
  • [Minor] Make config keys and default values finalized by @kaijchen in #70
  • [Log Improvement] Add more detailed debug info for MR client by @frankliee in #84
  • [Improvement] Shutdown the grpc executors pool when closing by @zuston in #83
  • Log enhancement: return error message when getting assignment servers and log exception when initializing by @zuston in #64
  • [ISSUE-48] [Feature] Init Kubernetes operator directory by @jerqi in #75
  • [Improvement] No need to use synchronized lock of the method scope when getting client by @zuston in #82
  • [DOC] Remove Wechat group in README by @jerqi in #88
  • [Performance Optimization] Improve the speed of writing index file in shuffle server by @zuston in #91
  • [DOC] Update title and description in README by @kaijchen in #94
  • [Improvement] ShuffleBlock should be release when finished reading by @xianjingfeng in #74
  • [IMPROVEMENT][COMMON] Fix common module code style by @jerqi in #99
  • [Improvement]LocalStorage init use multi thread #71 by @xianjingfeng in #72
  • [Improvement] Use OR operation instead of serialization for cloning BitMaps by @kaijchen in #103
  • [Improvement] Ignore partial failure on initializing local storage in shuffle server side by @zuston in #102
  • [CI] Test compile in Java 11 and Java 17 by @kaijchen in #105
  • Sleep less time but try more times when stopping by @xianjingfeng in #112
  • [Improvement] Use ConfigBuilder to rewrite the class RssSparkConfig by @smallzhongfeng in #104
  • [Improvement] Introduce config to customize assignment server numbers in client side by @zuston in #100
  • Assign partition again if registerShuffleServers failed by @xianjingfeng in #115
  • [ISSUE-106][IMPROVEMENT] Set rpc timeout for all rpc interface by @xianjingfeng in #113
  • [MINOR][IMPROVEMENT] Avoid CoordinatorServer#initialization multiple new Configuration() by @zwangsheng in #118
  • [Improve] Remove useless server id from StorageManagerFactory#createStorageManager by @zwangsheng in #119
  • [MINOR][IMPROVEMENT][COORD] Fix coordinator module code style by @jerqi in #122
  • [Improvement] Set heartBeatExecutorService as daemon thread by @smallzhongfeng in #121
  • [JUnit] Introduce the property of trimStackTrace to show error stacktrace in mvn-test by @zuston in #126
  • Make the conf of rss.storage.basePath as list by @zuston in #130
  • [MINOR][IMPROVEMENT][STORAGE] Fix storage module code style by @jerqi in #131
  • [Improvement] Add timeout reconnection when DelegationRssShuffleManager send the request of AccessCluster by @smallzhongfeng in #139
  • [MINOR] Fix flaky test testGetHostIp by @izchen in #141
  • [Improvement] Add the number of unhealthy nodes in CoordinatorMetrics by @smallzhongfeng in #147
  • [ISSUE-48][FEATURE] Add Uniffle Dockerfile by @wangao1236 in #132
  • [BUGFIX] Fix memory leak which cause oom by @summaryzb in #145
  • [Log Improvement] Output the registering/lost/exclude nodes in log by @zuston in #148
  • [MINOR] Tagged spark hadoop version in release package by @izchen in #149
  • [DOC] Migrate the coordinator doc from README to docs page by @zuston in #153
  • [MINOR][DOC] Remove spaces when reading file of excluded nodes by @smallzhongfeng in #155
  • [Improvement] Filter null value when selecting remote storage in ApplicationManager by @smallzhongfeng in #156
  • Introduce more grpc server metrics by @zuston in #150
  • [Improvement] Introduce a new class ShuffleTaskInfo by @smallzhongfeng in #158
  • [ISSUE-76] Disallow sendShuffleData if requireBufferId expired by @xianjingfeng in #159
  • Support storing shuffle data to secured dfs cluster by @zuston in #53
  • [FOLLOWUP] Delete hdfs shuffle data files using proxy user by @zuston in #170
  • [ISSUE-48][FEATURE] Init Operator Directory by @wangao1236 in #161
  • PID file name should contains program name by @zuston in #165
  • [BUGFIX] Fix resource leak when shuffle read by @izchen in #174
  • [Improvement] ShuffleBufferManager supports triggering flush according to the size of single ShuffleBuffer by @leixm in #176
  • [Improvement] Should match from pathToStorages when appId does not exist in appIdToStorages by @smallzhongfeng in #168
  • [ISSUE-173][FOLLOWUP] The size of single buffer flush should reach rss.server.flush.cold.storage.threshold.size by @leixm in #178
  • Revert "[ISSUE-173][FOLLOWUP] The size of single buffer flush should reach rss.server.flush.cold.storage.threshold.size " by @jerqi in #179
  • [ISSUE-173][FOLLOWUP] The size of single buffer flush should reach rss.server.flush.cold.storage.threshold.size by @leixm in #180
  • [FOLLOWUP] Store app user in ShuffleTaskInfo by @smallzhongfeng in #181
  • [ISSUE-123] Fix all test code style by @macroguo-ghy in #185
  • [ISSUE-48][FEATURE][FOLLOW UP] Add RemoteShuffleService CRD by @wangao1236 in #175
  • [FOLLOWUP] Add the conf of rss.security.hadoop.krb5-conf.file by @zuston in #184
  • Fix flaky test about kerberos by @zuston in #191
  • [Improvement] Add optional environment variables by @izchen in #187
  • [MINOR] Fix some bad practices reported by spotbugs by @kaijchen in #177
  • [ISSUE-48][FEATURE][FOLLOW UP] Add webhook component by @wangao1236 in #188
  • [Log-Improvement] Log the newly registered app id by @zuston in #193
  • [MINOR] Replace HashSet with ImmutableSet in configs by @kaijchen in #195
  • [IMPROVEMENT] Introduce the enumType in ConfigOptions by @zuston in #199
  • [ISSUE-48][FEATURE][FOLLOW UP] Generate informer and lister for crd by @wangao1236 in #202
  • [ISSUE-144] Fix flaky test RssShuffleUtilsTest#testDestroyDirectByteBuffer by @LuciferYang in #203
  • [Issue-194][Feature] Support spark 3.2.0 by @leixm in #201
  • [ISSUE-186][Feature] Use I/O cost time to select storage paths by @smallzhongfeng in #192
  • [Improvement][AQE] Avoid calling getShuffleResult multiple times by @leixm in #190
  • Fix flaky test of heartbeatTimeoutTest by @zuston in #206
  • [IMPROVEMENT] Add more metrics about local storage info by @zuston in #205
  • [MINOR][IMPROVEMENT] Return index-file size of n*SEGMENT_SIZE in HDFS reading by @zuston in #204
  • Add DISCLAIMER by @jerqi in #212
  • [TEST] Improve SimpleClusterManagerTest by @kaijchen in #216
  • [Minor] Modify the format of DISCLAIMER by @jerqi in #217
  • Add Notice and DISCLAIMER file by @frankliee in #215
  • Add more badges in README by @kaijchen in #219
  • Fix incorrect log format strings by @kaijchen in #220
  • Change total lines badge url to sloc.xyz in README by @kaijchen in #222
  • [MINOR] Fix warnings reported by lgtm by @kaijchen in #223
  • [MINOR] Simplify creating buffer logic by @zuston in #227
  • Support cancelling previous ci actions by @zuston in #225
  • Use the conf of shuffleNodesNumber from jobs to be as checking factor by @zuston in #208
  • Output the stderr and stdout to output file in startup script by @zuston in #226
  • [ISSUE-48][FEATURE][FOLLOW UP] Add controller component by @wangao1236 in #214
  • Add more metrics about requiring read memory by @zuston in #231
  • Adjust the memory required times to match grpc max deadline conf by @zuston in #218
  • [MINOR] Fix flaky test by @jerqi in #238
  • [ISSUE-48][FEATURE][FOLLOW UP] Add yaml of components and crd examples by @wangao1236 in #236
  • Fix Flaky test GetShuffleReportForMultiPartTest by @leixm in #241
  • Set the default disk capacity to the total space by @zuston in #237
  • Add issue template by @jerqi in #8
  • [MINOR] Fix inefficient map iteration by @kaijchen in #245
  • Support deploy multiple shuffle servers in a single node by @xianjingfeng in #166
  • Fast fail when reading failed in ComposedClientReadHandler by @zuston in #213
  • Fix startup shell problem by @jerqi in #251
  • New version 0.7.0-snapshot by @jerqi in #252
  • [ISSUE-196] Fix flaky test about kerberos by @zuston in #250
  • [ISSUE-48][FEATURE][FOLLOW UP] add unit test for validating rss objects by @wangao1236 in #248
  • [Improvement] Add hdfs path health check to AppBalanceSelectStorageStrategy by @smallzhongfeng in #210
  • [TYPO] Replace Chinese colon by ASCII colon by @kaijchen in #255
  • Introduce startup-silent-period mechanism to avoid partial assignments by @zuston in #247
  • Replace DISCLAIMER with DISCLAIMER-WIP by @jerqi in #258
  • [ISSUE-244] Fix flaky test of CoordinatorGrpcTest.rpcMetricsTest by @zuston in #256
  • Fix flaky test of ClientConfManagerTest by @smallzhongfeng in #260
  • [Refactor] Optimize creating shuffle handlers by @zuston in #259
  • Introduce data cleanup mechanism on stage level by @zuston in #249
  • [ISSUE-48][FEATURE][FOLLOW UP] add docs for operator by @wangao1236 in #261
  • Fix potential missing reads of exclude nodes by @zuston in #269
  • [ISSUE-257] RssMRUtils#getBlockId change the partitionId of int type to long by @fpkgithub in #266
  • [ISSUE-273][BUG] Get shuffle result failed caused by concurrent calls to registerShuffle by @leixm in #274
  • Add enum type test about case insensitive by @zuston in #280
  • Support ZSTD by @zuston in #254
  • [ISSUE-239][BUG] RssUtils#transIndexDataToSegments should consider the length of the data file by @leixm in #275
  • Remove code quality badge and add release badge by @kaijchen in #284
  • [ISSUE-163][FEATURE] Write to hdfs when local disk can't be write by @xianjingfeng in #235
  • Upgrade Github actions for Node.js 16 by @kaijchen in #292
  • Fix NPE in WriteBufferManager.addRecord by @wForget in #296
  • Fix AbstractStorage#containsWriteHandler by @xianjingfeng in #281
  • Add more test cases on LocalStorageManager.selectStorage by @zuston in #298
  • [ISSUE-137][Improvement][AQE] Sort MapId before the data are flushed by @zuston in #293
  • [ISSUE-283][FEATURE] Support snappy compression/decompression by @amaliujia in #304
  • [ISSUE-290] Make RpcNodePort and HttpNodePort optional by @amaliujia in #305
  • [ISSUE-301][Subtask][Improvement][AQE] Merge continuous ShuffleDataSegment into single one by @zuston in #303
  • Cleanup RuntimeException and fetchRemoteStorage logic in ClientUtils by @kaijchen in #295
  • [ISSUE-135][FOLLOWUP][Improvement][AQE] Assign adjacent partitions to the same ShuffleServer by @leixm in #307
  • Correct the contributing guide link in pull-request template by @zuston in #314
  • Fix bug of "Comparison method violates its general contract" by @zuston in #315
  • [AQE][LocalOrder] Fix potential bug when merging continuous segments by @zuston in #318
  • [AQE][LocalOrder] Fix wrong param of expectedTaskIds in LocalOrderSegmentSplit by @zuston in #319
  • [Feature] Support the estimated number of ShuffleServers required. by @leixm in #322
  • [Bug] Fix potential bug when the index reading offset is greater than data length by @zuston in #320
  • [ISSUE-154][Improvement] Support Empty assignment to Shuffle Server by @rhh777 in #325
  • [Bug] Fix invalid owner of host path volumes by @wangao1236 in #330
  • [ISSUE-309][FEATURE] Support ShuffleServer latency metrics. by @leixm in #327
  • [ISSUE-329]Catch NPE in ShuffleTaskManager#addFinishedBlockIds by @xianjingfeng in #331
  • [BUG] Fix wrong method name by @leixm in #335
  • [ISSUE-328] Cleanup unused shuffle servers after stage completed by @xianjingfeng in #334
  • [MINOR] Migrate RankValue to the package of the common class by @smallzhongfeng in #265
  • [BUG] Fix incorrect spark metrics by @zuston in #324
  • [Improvement][LocalOrder] Add tests about keeping consistent with FixedSize when no skew optimization by @zuston in #336
  • [INFRA] Add k8s pipeline by @jerqi in #340
  • Remove unused class of RssShuffleUtils by @zuston in #345
  • [ISSUE-342][Improvement] Check Spark Serializer type by @chong0929 in #344
  • [Feature] Support user's app quota level limit by @smallzhongfeng in #311
  • [BUG][AQE][LocalOrder] Fix the bug of missed data due to block sorting by @zuston in #347
  • [ISSUE-364] Fix indexWriter don't close if exception thrown when close dataWriter by @xianjingfeng in #349
  • [BUG] Fix flaky test of AQESkewedJoinWithLocalOrderTest by @zuston in #350
  • Add collaborators by @jerqi in #351
  • [BUG][FOLLOWUP] Fix flaky test of AQESkewedJoinWithLocalOrderTest by @zuston in #355
  • [BUG][AQE][LocalOrder] Remove check of discontinuous map task ids by @zuston in #354
  • [Improvement] Task fast fail once blocks fail to send by @zuston in #332
  • [ISSUE-228][FEATURE] Add a period local storage cleaner thread by @sfwang218 in #357
  • [Improvement] Optimize the use of QuotaManager by @smallzhongfeng in #359
  • [ISSUE-339] Optimize retry logic in send shuffle data by @xianjingfeng in #361
  • [ISSUE-300] Make config type of RSS_CLIENT_TYPE as enum by @selectbook in #310
  • [ISSUE-228] Fix the problem of protobuf-java incorrect dependency at compile time by @tsface in #362
  • [ISSUE-124] Add fallback mechanism for blocks read inconsistent by @xianjingfeng in #276
  • [BUG] Fix potential memory leak when encountering disk unhealthy by @zuston in #370
  • [Improvement][AQE] Support getting memory data skip by upstream task ids by @zuston in #358
  • [ISSUE-369] Don't throw exception if blocks are corrupted but have multi replicas by @xianjingfeng in #374
  • [ISSUE-285][Improvement] Only use HDFS and LOCALFILE storageType in the test by @tiantingting5435 in #360
  • Fix typo of PreferDiffHostAssignmentStrategy by @zuston in #379
  • [ISSUE-376] Fix concurrency problems may occur when the ApplicationManager register app by @smallzhongfeng in #382
  • [ISSUE-380] Refactor the flush process to fix fallback fail by @zuston in #383
  • [Refactor] Make coordinator class more organized by @smallzhongfeng in #386
  • [ISSUE-392] Fix the bug in the shuffle data cleanup checker that causes false reports of disk corruption by @zuston in #393
  • [ISSUE-390] Print more infos after read finished by @xianjingfeng in #395
  • [Improvement] Skip blocks when read from memory by @xianjingfeng in #294
  • [Improvement] Small refactor for code quality by @advancedxy in #394
  • [BUG] Fix incorrect block info statistics after read finished by @xianjingfeng in #401
  • Revert "[Improvement] Skip blocks when read from memory (#294)" by @xianjingfeng in #403
  • [ISSUE-388][ISSUE-244][Bug] Fix incorrect usage of GRPCMetrics#setGauge by @xianjingfeng in #404
  • [MINOR] If there is no data flush to hdfs, return directly instead of throw exception by @xianjingfeng in #406
  • Support writing multi files of single partition to improve speed in HDFS storage by @zuston in #396
  • Fix incorrect metrics of event_queue_size and total_write_handler by @zuston in #411
  • [Improvement] Support skip memory data when use multiple replicas by @xianjingfeng in #400
  • [ISSUE-402] Flaky Test: QuorumTest#case1 by @Rembrant777 in #422
  • Fix flaky test QuotaManagerTest#testDetectUserResource by @xianjingfeng in #421
  • Improve README by @kaijchen in #427
  • Remove unused data structure and method by @zuston in #429
  • [FOLLOWUP] Remove unused methods in Storage interface by @zuston in #431
  • Flaky Test: AppBalanceSelectStorageStrategyTest#selectStorageTest by @smallzhongfeng in #438
  • [Minor] Move GrpcServerTest to common.rpc package by @kaijchen in #439
  • [Minor] refactor test code by @advancedxy in #432
  • [Minor][Improvement] Introduce FileWriter interface for Localfile/HDFS file writer by @zuston in #444
  • [ISSUE-169] Support metric reporter and support Prometheus push gateway by @xianjingfeng in #415
  • [Improvement] Avoid selecting storage which has reached the high watermark by @zuston in #424
  • [Bug] Fix potential negative preAllocatedSize variable by @advancedxy in #428
  • [Feature] add a configuration to control shuffle data flush by @advancedxy in #445
  • [Improvement] Refactor getPartitionRange to calculate range directly by @a-li in #447
  • Fix Flaky Test: AppBalanceSelectStorageStrategyTest#selectStorageTest by @smallzhongfeng in #450
  • Fix Flaky Test: QuotaManagerTest#testDetectUserResource by @smallzhongfeng in #453
  • [ISSUE-451][Improvement] Read HDFS data files with random sequence to distribute pressure by @zuston in #452
  • [ISSUE-455] Lazily create uncompressedData by @xianjingfeng in #457
  • [ISSUE-378][HugePartition][Part-1] Record every partition data size for one app by @zuston in #458
  • [ISSUE-456] Avoid removeResources for multiple times by @xianjingfeng in #459
  • [ISSUE-461] Support Spark 3.3 by @kaijchen in #463
  • [Deps] Bump slf4j to 1.7.36 to fix vulnerability in slf4j-log4j12 by @kaijchen in #464
  • [ISSUE-448][Feature] shuffle server report storage info by @advancedxy in #449
  • [ISSUE-472] Fix Flaky Test: LocalFileServerReadHandlerTest#testDataInconsistent by @zuston in #473
  • [Improvement] Remove some unused empty server metrics by @zuston in #474
  • [Improvement] Add more logs about data flush by @zuston in #482
  • Fix potential race condition when registering remote storage info by @zuston in #481
  • [Minor] Improve readability by replacing lambda with method reference by @iwangjie in #488
  • [ISSUE-489][Minor] Cleanup some code by @iwangjie in #490
  • [Test] Assume unknown blockID in LocalFileHandlerTestBase by @kaijchen in #478
  • [ISSUE-475][Improvement] It's unnecessary to use ConcurrentHashMap for "partitionToBlockIds" in RssShuffleWriter by @jiafuzha in #480
  • [ISSUE-378][HugePartition][Part-2] Introduce memory usage limit and data flush by @zuston in #471
  • [ISSUE-484] Fix accidentally remove the storage of appId when unregistering partial shuffle in HdfsStorageManager by @zuston in #485
  • [Test] Cleanup tests with Files#createTempDir() by @kaijchen in #492
  • [Test] Add ConfigUtilsTest by @kaijchen in #500
  • [Deps] Bump protobuf to 3.19.6 to address vulnerability by @kaijchen in #499
  • [Minor] Make Constants final by @kaijchen in #501
  • Cleanup UnitConverter and improve UnitConverterTest by @kaijchen in #504
  • Fixes errors in doc header and operator install command by @zuston in #506
  • [ISSUE-378][HugePartition][Part-3] Introduce more metrics about huge partition by @zuston in #494
  • [ISSUE-378][HugePartition][Part-4] Supplement doc about huge partitions by @zuston in #505
  • [Deps] Switch to slf4j-reload4j by @kaijchen in #508
  • [ISSUE-512][Operator] Bump golang to 1.17 by @kaijchen in #515
  • [ISSUE-514] Fix flaky test: ShuffleServerGrpcTest#clearResourceTest by @xianjingfeng in #516
  • [ISSUE-507] Fix Flaky Test: ShuffleBufferManagerTest#cacheShuffleDataTest by @xianjingfeng in #511
  • [ISSUE-509] Fix Flaky Test: ShuffleBufferManagerTest#shuffleFlushThreshold by @xianjingfeng in #510
  • [ISSUE-479] [operator] refine operator's build system by @advancedxy in #491
  • [ISSUE_479][operator][followup] use exec form instead of shell form in Dockerfile by @advancedxy in #518
  • [SpotBugs] Set threshold to middle with exceptions by @kaijchen in #517
  • [ISSUE-522] [operator] pass RSS_IP to coordinator container env by @advancedxy in #523
  • [ISSUE-496][operator] infer resource request/limit from spec for init container by @advancedxy in #521
  • [SpotBugs] Remove unread protected field in Checker by @kaijchen in #520
  • [ISSUE-468] Put unavailable servers to the end of the list when sending shuffle data by @xianjingfeng in #470
  • chore: add new collaborator by @advancedxy in #535
  • [SpotBugs] Fix REC_CATCH_EXCEPTION by @kaijchen in #527
  • [ISSUE-524][operator] Upgrading rss could also be deleted by @advancedxy in #531
  • [ISSUE-553] Avoid removing buffer multiple times when clearing resources by @xianjingfeng in #534
  • [ISSUE-469][operator] feat: supports adding labels to rss pods. by @wangao1236 in #528
  • [SpotBugs] Fix UWF_UNWRITTEN_PUBLIC_OR_PROTECTED_FIELD by @kaijchen in #536
  • Result of mkdirs() is ignored in LocalFileWriteHandler#createBasePath() by @kaijchen in #537
  • [Minor] Optimize ShuffleServerInfo#hashCode by @xianjingfeng in #538
  • [SpotBugs] Enable SC_START_IN_CTOR check by @kaijchen in #541
  • [operator] fix error kind of ownerreference by @wangao1236 in #540
  • [ISSUE-542] Ensure the elements of StatusCode and RssProtos.StatusCode are the same by @xianjingfeng in #543
  • Flaky Test: LowestIOSampleCostSelectStorageStrategyTest#selectStorageTest by @smallzhongfeng in #544
  • [Improvement] Only report to the shuffle servers that owns the blocks by @zuston in #539
  • [Minor] Cleanup "throws RuntimeException" by @kaijchen in #549
  • [ISSUE-546] Replace ResponseStatusCode with StatusCode by @xianjingfeng in #547
  • [ISSUE-476][FEATURE] Respect spark.shuffle.compress configuration in Uniffle by @jiafuzha in #495
  • [ISSUE-545][operator] feat: support setting runtime class name and env for rss by @wangao1236 in #548
  • [ISSUE #525][operator] refine svc creations by @advancedxy in #530
  • [#552] docs: add more doc about spark.serializer requirement by @advancedxy in #556
  • [#559] test: use withEnvironmentVariable to replace RssUtilsTest#setEnv by @advancedxy in #560
  • [Followup #249] refactor: cleanup code and unify interfaces by @kaijchen in #558
  • [#554] feat: infer rss base storage conf from env by @advancedxy in #555
  • [MINOR] Remove committers from collaborators by @jerqi in #563
  • [#525][FOLLOWUP] fix: add omitempty tag by @advancedxy in #565
  • [#545][FOLLOWUP] update rbac rule for webhook by @advancedxy in #566
  • [#571] feat: skip init for empty writable base dir by @advancedxy in #573
  • [#567] feat: add a shuffle-server metric about read_used_buffer_size by @zuston in #568
  • [MINOR] test: fix assertion in tests by @kaijchen in #574
  • [#575] refactor: replace switch-case with EnumMap in ComposedClientReadHandler by @kaijchen in #570
  • [#580] chore: improve CI workflows by @kaijchen in #579
  • [#410] feat: support the hot reload of coordinator's configuration by @jerqi in #572
  • [MINOR] chore(deps): bump go-restful to 2.16.0 in operator by @dependabot in #577
  • [#580] chore: move deploy/kubernetes to a standalone workflow by @kaijchen in #578
  • [MINOR] refactor: simplify ShuffleWriteClientImpl#genServerToBlocks() by @kaijchen in #594
  • [MINOR] chore: remove duplicated dependency in rss-client-mr by @kaijchen in #599
  • [#580] chore: speed up CI workflows by @kaijchen in #602
  • [MINOR] chore(operator): bump prometheus/client_golang to 1.11.1 by @dependabot in #601
  • [MINOR] docs: correct the format of server_guide doc by @zuston in #608
  • [#611] chore: update project version to 0.8.0-SNAPSHOT by @zuston in #609
  • [#613] test: SimpleClusterManagerTest#updateExcludeNodesTest by @xianjingfeng in #614
  • [MINOR] fix: allow mountPoint not containing '/' by @xianjingfeng in #607
  • [#551] docs: update templates for flaky test and pull request by @kaijchen in #588
  • [#571][FOLLOWUP] fix: optimize base dir init process by @advancedxy in #616
  • [#618] chore(docker): support downloads latest hadoop archives and mirror url by @advancedxy in #619
  • [#580] chore(ci): disable parallel build in maven by @kaijchen in #631
  • [#626] chore(ci): skip build operator if no code changes by @advancedxy in #628
  • [#627] fix(operator): support specifying custom ports by @wangao1236 in #629
  • [#626][FOLLOWUP] chore(ci): fix typo in build.yml by @advancedxy in #633
  • [#600] chore(operator): change JDK base from openjdk to eclipse-temurin by @advancedxy in #617
  • [#632]fix: respect volumes in rss spec by @advancedxy in #634
  • [#630] feat(client): Disable the localShuffleReader by default in Spark3 client by @zuston in #636
  • [#80][Part-1] feat: Add decommission logic to shuffle server by @xianjingfeng in #606
  • [MINOR] refactor: address unchecked conversions by @kaijchen in #623
  • [#545][FOLLOWUP]feat(operator): support specifying custom affinity & tolerations by @crain-cn in #641
  • [#647] fix: Multiple coordinator produce conflicts when they delect the same file by @jerqi in #648
  • [#635] feat(client): enable LOCAL_ORDER by default for Spark AQE by @zuston in #644
  • [#576] test: increase timeout and remove initialization time in storage strategy tests by @smallzhongfeng in #646
  • [#649] test: remove mini-cluster in ClientConfManagerTest by @smallzhongfeng in #650
  • [#647][FOLLOWUP] set coordinator id before ApplicationMaster by @advancedxy in #654
  • [MINOR] docs: update the project introduction in README by @kaijchen in #653
  • [#655]feat(coordinator): heap size configurable and add gc log by @advancedxy in #656
  • [#659] fix(server): fix NPE of ShuffleDataFlushEvent's underStorage in some cases by @zuston in #660
  • [#612] test: cleanup shuffleServer instance for each test by @zuston in #658
  • [#408] test: fix memory check failure in ShuffleBufferManagerTest#bufferSizeTest by @zuston in #657
  • [MINOR] docs: fix flaky-test-report.yml by @xianjingfeng in #664
  • [#665] fix(client): keep consistent with vanilla spark when key or value is null by @zuston in #666
  • [#642]feat(server): better default options for shuffle server by @advancedxy in #662
  • [MINOR] test: address unchecked conversions in tests by @kaijchen in #624
  • [#378] feat: introduce storage manager selector by @zuston in #621
  • [#80][Part-2] feat: Add RPC logic and heartbeat logic for decommission by @xianjingfeng in #663
  • [#675] fix: filter no space exception in checkStorageReadAndWrite by @xumanbu in #677
  • [#671] feat(coordinator): Metrics of the number of apps submitted by users by @smallzhongfeng in #672
  • [#645][Improvement] feat(operator): support manager parameter configuration by @crain-cn in #670
  • [#678] improvement: Write hdfs files asynchronously when detectStorage by @smallzhongfeng in #680
  • [MINOR] Use multithreading to detect multiple disks by @smallzhongfeng in #687
  • [#397] docs: add the usage of AccessQuotaChecker by @smallzhongfeng in #692
  • [#585] feat(netty): Add MaxDirectMemorySize option for shuffle Server by @jerqi in #690
  • [#697] fix: use the naive equals method to avoid introducing additional dependencies by @zuston in #698
  • [#691] test: fix flaky test CoordinatorMetricsTest#testCoordinatorMetrics by @smallzhongfeng in #694
  • [#585] fix: avoid unbound variable errors in start-shuffle-server.sh by @zuston in #696
  • build(deps): bump golang.org/x/net to 0.7.0 in kubernetes operator by @dependabot in #676
  • chore: remove unused log info by @zuston in #700
  • [#674] feat(docker): use JDK11 as the default java version in Dockerfile by @zuston in #683
  • [MINOR] improvement: Reduce the size of Spark patch by @jerqi in #699
  • [MINOR] fix: Add method close for ApplicationManager by @jerqi in #704
  • [#80][Part-3] feat: add REST API for decommission by @xianjingfeng in #684
  • [MINOR] test: fix static field initialized before TempDir field injected by @advancedxy in #707
  • [#708] test: do not assume hostname of hdfs mini-cluster by @kaijchen in #709
  • [#483] test: fix flaky test ShuffleServerFaultToleranceTest by @jerqi in #705
  • [MINOR] Removed unused methods and variable by @smallzhongfeng in #702
  • [#615] improvement: Reduce task binary by removing 'partitionToServers' from RssShuffleHandle by @jiafuzha in #637
  • [#564] test(operator): add end-to-end test by @wangao1236 in #581
  • [#711] feat(netty): Add Netty port information for Shuffle Server by @jerqi in #712
  • [MINOR] test: fix tempdir leak in KerberizedHdfs tests by @kaijchen in #721
  • [MINOR] test: test class name should end with Test by @kaijchen in #724
  • [#133] feat(netty): Add StreamServer. by @leixm in #718
  • [#625] improvement: Package sun.security.krb5 is not visible in Java 11 and 17. by @slfan1989 in #726
  • [MINOR] chore(ci): test compile in java 11 and 17 with source/target version set by @kaijchen in #728
  • [MINOR] improvement(client): update log level from INFO to DEBUG to avoid noisy by @zuston in #725
  • Revert "[MINOR] test: fix tempdir leak in KerberizedHdfs tests" by @jerqi in #732
  • [#733] test: fix LocalStorageManagerTest#testGetLocalStorageInfo on Linux SSD platform by @kaijchen in #734
  • [#729] improvement: use foreach when iterate Roaring64NavigableMap for better performance by @zuston in #730
  • [MINOR] test: fix CoordinatorGrpcTest#shuffleServerHeartbeatTest on Linux SSD platform by @kaijchen in #738
  • [#133] feat(netty): Add Netty Utils by @leixm in #727
  • [#669] improvement: refresh application when reading memory data by @xianjingfeng in #741
  • [#719] feat(netty): Optimize allocation strategy by @smallzhongfeng in #739
  • [#736] feat(storage): best effort to write same hdfs file when no race condition by @zuston in #744
  • [#747] feat(tez): Add Tez Framework by @jerqi in #748
  • [#750] feat: add RssFetchFailedException by @advancedxy in #751
  • [MINOR] test: untracked files created in ShuffleFlushManagerTest by @kaijchen in #745
  • [#752] refactor: replace RuntimeException with RssException by @kaijchen in #753
  • [MINOR] chore: Remove committers from collaborators by @jerqi in #754
  • [MINOR] test: do not wrap test in try-catch block by @kaijchen in #746
  • [#762] If the storage path is not exist, get file store for its parent. by @xianjingfeng in #763
  • [#133] feat(netty): Add Encoder and Decoder. by @leixm in #742
  • [MINOR] chore(operator): remove useless configuration items in configuration example by @xianjingfeng in #764
  • [#519] Speed up ConcurrentHashMap#computeIfAbsent by @turboFei in #766
  • [#716] improvement(operator): support specifying imagePullSecrets by @xianjingfeng in #765
  • [#755] refactor: Add the method of creating thread pool by @smallzhongfeng in #767
  • [#706] Implement spill method to avoid memory deadlock by @zuston in #714
  • [#772] fix(kerberos): cache proxy user ugi to avoid memory leak by @zuston in #773
  • [MINOR] chore(spark): remove noisy log in client side by @zuston in #776
  • [#477][part-0] feat: add ShuffleManagerServer impl by @advancedxy in #777
  • [MINOR] Avoid returning null in defaultUserApps when quota file doesn't config user by @smallzhongfeng in #786
  • [#584] feat(netty): Add transport client pool for netty by @xumanbu in #771
  • [#720] feat(netty): support random port for netty by @xumanbu in #723
  • [#782] refactor: restrict rss.rpc.server.type to an enum by @advancedxy in #783
  • [#720][FOLLOW-UP] Correct the shuffle server id by @jerqi in #792
  • [#794] feat(operator): support delete ShuffleServer pod with Evicted status by @xianjingfeng in #795
  • [MINOR] fix: the description of Spark patches in the README.md by @jerqi in #801
  • [#799] feat: use storage host label for remote storage write metrics by @advancedxy in #800
  • [#796] bug: fix the issues of MetricReporter by @xianjingfeng in #797
  • [BUG][MINOR] Fix scripts compatible with jdk8 & jdk11 by @summaryzb in #798
  • [MINOR] test: Fix the flaky test GrpcServerTest by @java-codehunger in #789
  • [#477][part-1] feat: support stage resubmit in spark clients by @advancedxy in #787
  • [#804] improvement: Optimize CRC calculation of ByteBuffer by @jerqi in #805
  • [MINOR] chore: update import order rule to make scala a separated group by @advancedxy in #818
  • [#779] feat: Grpc server support random port by @xumanbu in #820
  • [#596] feat(netty): Use off heap memory to read HDFS data by @jerqi in #806
  • [#416] feat(hdfs): lazy initialization of hdfsShuffleWriteHandler when per-partition concurrent write is enabled by @zuston in #816
  • [MINOR] Add WORKDIR by @smallzhongfeng in #823
  • [#778] feat: Separate ShuffleServer metrics through tags by @smallzhongfeng in #812
  • [MINOR] Modify MD format by @smallzhongfeng in #829
  • [#827] feat(operator): support generating hpa objects by @wangao1236 in #828
  • [MINOR] Remove unused config SHUFFLE_EXPIRED_TIMEOUT_MS by @smallzhongfeng in #835
  • [#133] feat(netty): Rewrite protocol. by @leixm in #826
  • [#593][part-1] feat: Codec compress support ByteBuffer by @xumanbu in #830
  • [#389] Improvement: Make codec to be a singleton. by @tobehardest in #840
  • [MINOR] build(operator): update clusterrole of controller and webhook by @wangao1236 in #842
  • fix(client): disable spark memory spill by @zuston in #844
  • [#841] feat(config): Support deprecated and fallback keys for ConfigOptions by @zuston in #843
  • [#414] feat(client): support specifying per-partition's max concurrency to write in client side by @zuston in #815
  • [#770] feat(cli): Introduce apache.commons.cli basic framework by @slfan1989 in #833
  • [#133] feat(netty): Introduce ShuffleServerGrpcNettyClient. by @leixm in #839
  • [#596][FOLLOWUP] Index data support offheap read by @jerqi in #852
  • [#859][Improvement] Set MALLOC_ARENA_MAX in start-shuffle-server.sh by @rhh777 in #860
  • [#593][FOLLOWUP] Fix zstd compress dest ByteBuffer position by @xumanbu in #857
  • [MINOR] chore(ci): Add Tez pipeline by @jerqi in #862
  • [#774] docs: Fix the metadata.annotations: Too long in install.md by @cchung100m in #853
  • [#863] feat(coordinator): support comments in exclude node files by @yl09099 in #874
  • [MINOR] refactor: Reduce the usage of memory in the ShuffleWriter by @jerqi in #877
  • [#493] improvement: replace putIfAbsent with computeIfAbsent to avoid performance loss in some critical paths by @cchung100m in #876
  • [MINOR] chore: delete checkstyle-suppressions.xml by @kaijchen in #878
  • [MINOR] Remove unused code of shuffle upload by @jerqi in #883
  • [#884] improvement: Make start and stop scripts executable under the bin folder by @jiafuzha in #885
  • [#133] feat(netty): Implement ShuffleServer interface. by @leixm in #879
  • [#715] fix(mr): The container does not exit because shuffleclient is not closed by @zhaobing001 in #882
  • [#872] feat(tez): Add the common and utils class by @lifeSo in #890
  • [#886] fix(mr): MR Client may lose data or throw exception when rss.storage.type without MEMORY. by @zhengchenyu in #887
  • [#133] feat(netty): Fix IllegalReferenceCountException. by @leixm in #899
  • [MINOR] fix: Fix LocalStorageManager divide by zero exception by @leixm in #900
  • [#895] improvement: Rename Hdfs*.java to Hadoop*.java to support other Hadoop FS-compatible distributed filesystem by @jiafuzha in #898
  • [MINOR] refactor: Add overwrite annotation by @lifeSo in #905
  • [MINOR] docs: Add benchmark results by @jerqi in #904
  • [#417] refactor: Eliminate raw use of parameterized class by @cchung100m in #891
  • [#881] fix: Ensure LocalStorageMeta disk size is correctly updated when events are processed by @awdavidson in #902
  • [#896] Improvement: Support Hadoop-3.2 by @zhengchenyu in #897
  • [MINOR] docs: update document and build script for Hadoop-3.2 by @zhengchenyu in #912
  • [MINOR] refactor: Fix tez workflow and checkstyle by @zhengchenyu in #911
  • [#908] feat(tez): Write byte array shuffle data to MapOutput by @lifeSo in #909
  • [#872] feat(tez): Get parameter from Inputcontext then provide util function by @lifeSo in #915
  • [#872][FOLLOWUP] feat(tez): Add the common and utils class by @lifeSo in #894
  • [#872][FOLLOWUP] feat(tez): Modify utils and add test case by @lifeSo in #916
  • [#872][FOLLOWUP] feat(tez): Add UmbilicalUtils to get Worker info from AM by @lifeSo in #919
  • [MINOR] fix: Error logs upload fail by @zhengchenyu in #917
  • [MINOR] improvement(mr): Add @Test to activate test case by @lifeSo in #923
  • [#927] Improvement: improve the control of server heartbeat by @summaryzb in #928
  • [#819] feat(tez): Tez ApplicationMaster supporting RemoteShuffle by @baitian77 in #918
  • [#854][FOLLOWUP] feat(tez): add RssTezFetcher to fetch data from worker. by @lifeSo in #920
  • [MINOR] fix: Fix kerberos ut error caused by Config.singleton is not refresh. by @zhengchenyu in #932
  • [#933] fix: incorrect metric grpc_server_connection_number by @xianjingfeng in #934
  • [#855] feat(tez): Support Tez Output OrderedPartitionedKVOutput by @bin41215 in #930
  • [#343] improvement(build): Shade Netty Native lib by @cchung100m in #924
  • [MINOR] Add new collaborators by @jerqi in #942
  • [#854][FOLLOWUP] feat(tez): Add RssTezFetcherTask to fetch data from worker for OrderedInput by @lifeSo in #935
  • [#855] feat(tez): Support Tez Output UnorderedPartitionedKVOutput by @bin41215 in #943
  • [#855] feat(tez): Support Tez Output RssUnorderedKVOutput by @bin41215 in #944
  • [#854][FOLLOWUP] feat(tez): Add Simple Fetched Allocator to allocation memory or disk for shuffle data by @lifeSo in #922
  • [#590][part-1] ManagedBuffer instead ByteBuf to hold ShuffleData by @xumanbu in #906
  • [#854][FOLLOWUP] feat(tez): Add RssShuffleScheduler to run and manager shuffle work by @lifeSo in #948
  • [#854][FOLLOWUP] feat(tez): Add RssShuffle to handle event and generate fetch task by @lifeSo in #929
  • [#940] feat: Support columnar shuffle with gluten by @summaryzb in #950
  • [#937] feat: Add rest api for servernode list of losing connection and unhealthy by @yl09099 in #938
  • [#768] feat(cli): Cli method for blacklist update by @Kwafoor in #931
  • [#381] improvement(server): Check JAVA_HOME and HADOOP_HOME in start-shuffle-server.sh by @cchung100m in #954
  • [#854][FOLLOWUP] feat(tez): Add RssShuffleManager to run and manager shuffle work by @lifeSo in #947
  • [#956] refactor: Changes the Boolean flag that determines whether a Node is healthy to a state by @yl09099 in #959
  • [#854][FOLLOWUP] feat(tez): Add Rss Input Class to begin Tez input task by @lifeSo in #949
  • [MINOR] fix(tez): fix thread factory name by @zhengchenyu in #975
  • [#886] fix(tez): Tez Client may lose data or throw exception when rss… by @zhengchenyu in #976
  • [#940] improvement: Optimize columnar shuffle integration by @summaryzb in #958
  • [#961] refactor(coordinator): Improve Coordinator Log Format. by @slfan1989 in #962
  • [#965] feat(tez): support remote shuffle for tez framework by @zhengchenyu in #966
  • [#978][Improvement] Provides a tool class to format CLI output content by @yl09099 in #979
  • [#768] [Follow Up] feat(cli): Cli method for blacklist update. by @slfan1989 in #968
  • [MINOR] docs: update document for tez client plugin. by @zhengchenyu in #987
  • [#983] improvement(tez): Optimize tez client delivery configuration by @zhengchenyu in #985
  • [MINOR] Add new collaborators by @jerqi in #995
  • [MINOR] docs: update the document and scripts of build tez. by @zhengchenyu in #997
  • [#986] improvement(tez): Optimize the method of obtain the vertex id. by @zhengchenyu in #990
  • [#864] feat(server): Introduce Jersey to strengthen REST API by @xianjingfeng in #939
  • [#989] fix(tez): partition class is not set for RssUnorderedKVOutput. by @zhengchenyu in #994
  • [#757] feat(server): separate flush thread pools for different storage types by @leixm in #775
  • [#1001] improvement: support get all metrics by one request by @xianjingfeng in #1002
  • [#957] feat(tez): Tez examples integration test by @zhengchenyu in #982
  • [#991] Improvement(tez): TezRemoteShuffleManager support secure cluster. by @zhengchenyu in #1005
  • [#1010] chore: enable and enforce spotless by @advancedxy in #1009
  • [#992] fix(tez): convertTaskAttemptIdToLong should not consider appattemptId by @zhengchenyu in #1007
  • [#999] improvement: apply REST principles in the urls of the http interfaces by @xianjingfeng in #1000
  • [MINOR] test(tez): add storage type configs in TezRemoteShuffleManagerTest by @kaijchen in #1015
  • [#972] fix(tez): Add output mapOutputByteCounter metrics by @bin41215 in #1016
  • [MINOR] refactor: use general method to check the remote storage existence by @zuston in #980
  • [#993] feat(tez): Optimize the method of obtain the application attem… by @zhengchenyu in #1021
  • [#388][FOLLOWUP] fix: Fix the flaky test GrpcServerTest by @Alisha-0321 in #1023
  • [#846] feat(cli): Service's start & stop & restart through Cli. by @slfan1989 in #925
  • [#973] improvement: Make shuffle manager client RPC timeout configurable by @cchung100m in #1017
  • [#477] feat(spark): support getShuffleResult throws FetchFailedException. by @leixm in #1004
  • [#1019] test(tez) RssOrderedPartitionedKVOutputTest add close func unit test by @bin41215 in #1025
  • [#808] feat(spark): ensure thread safe and data consistency when spilling by @zuston in #848
  • [MINOR] Add new collaborator by @jerqi in #1031
  • [#1018] test(tez) RssUnorderedPartitionedKVOutputTest add close func ut by @bin41215 in #1034
  • [#133] feat(netty): integration-test supports netty. by @leixm in #1008
  • [#889] improvement: Modify the default value of the rss.coordinator.select.partition.strategy parameter to CONTINUOUS. by @leixm in #1036
  • [#889] improvement: Modify default value of single buffer flush. by @leixm in #1039
  • [#889] improvement: Modify default value of rss.server.max.concurrency.of.per-partition.write to 30. by @leixm in #1037
  • [MINOR] doc: fix metrics document by @zhengchenyu in #1040
  • [#133] feat(netty): local shuffle read support zero-copy. by @leixm in #1047
  • [#640] feat(netty): Metric system for netty server by @xumanbu in #1041
  • [#1048] fix(coord): ExcludeNodes does not take effect when the coordinator restarts. by @leixm in #1049
  • [MINOR] chore(asf): stop forwarding GitHub issues to dev mail list by @kaijchen in #1032
  • [#1011] feat(tez): Avoid recompute succeeded task. by @zhengchenyu in #1033
  • [#477] feat(spark): Fix rss.resubmit.stage does not support dynamic client conf. by @leixm in #1050
  • [MINOR] doc: Add JDK benchmark tests by @jerqi in #1059
  • [#1064] improvement(tez): Make shuffle data send thread pool configurable in WriteBufferManager. by @zhuyaogai in #1065
  • [MINOR] fix(mr): fix default value by @zhengchenyu in #1035
  • [#1068] improvement(tez): Fail fast in WriteBufferManager when failed to send shuffle data to shuffle server. by @zhuyaogai in #1069
  • [#1070] fix(tez): shuffle server may leak if not register remote stor… by @zhengchenyu in #1076
  • [#1066] fix: The Jetty server fails to start when compiled with JDK 8 but runtime version is JDK 11. by @qijiale76 in #1067
  • [#1095] doc: Add slack link to the readme.md by @ducksblock in #1099
  • [#1100] improvement(mr): Fail fast in SortWriteBufferManager when failed to send shuffle data to shuffle server. by @zhuyaogai in #1103
  • [#299] improvement: Make config type of RSS_STORAGE_TYPE as enum by @cchung100m in #1052
  • Revert "[#299] improvement: Make config type of RSS_STORAGE_TYPE as enum" by @jerqi in #1106
  • [#1074] feat: Introduce the metric of local_storage_uniffle_used_space by @zuston in #1075
  • [MINOR] fix(tez): Add output mapOutputRecordCounter metrics by @bin41215 in #1093
  • [#1101] improvement(tez): Release server resources as soon as possible in RssDAGAppMaster. by @zhuyaogai in #1102
  • [#1043] chore: Add NOTICE-binary by @jerqi in #1097
  • [#1006] feat(spark): Support Spark 3.4 by @summaryzb in #1082
  • [#434] refactor: New utility method to cover dynamic class loading in RSSUtils by @pegasas in #1104
  • [#1060] doc: Add the document for Netty. by @leixm in #1116
  • [#1045] fix(server): shuffle server may hang when restart worker due to multi require-memory and no require-memory release by @lifeSo in #1058
  • [#299] improvement: Make config type of RSS_STORAGE_TYPE as enum by @cchung100m in #1123
  • [#1081] fix(tez): shuffle can not read the data which is flushed to hdfs by @zhengchenyu in #1118
  • [#1124] docs: Add the document for tez-client by @bin41215 in #1125
  • [#1109] fix(tez): Fix the user of remote storage. by @zhengchenyu in #1128
  • [#1114] feat: introduce hdfs host as the total_hadoop_write_data metric label by @zuston in #1107
  • [#1124] docs: modify the document for tez-client by @bin41215 in #1126
  • [#1111] fix: Shuffle server can not delete remote storage path in security cluster by @zhengchenyu in #1122
  • [#1115] improvement: Unregister shuffle explicitly when application i… by @zhengchenyu in #1131
  • [#951] build(operator): Add the HADOOP_VERSION for Docker image. by @qijiale76 in #1027
  • [#837] feat: Display information of all application through Cli. by @slfan1989 in #964
  • [#1124] docs(tez): Add the document for tez-client by @bin41215 in #1140
  • [#1124] docs(tez): Add the document of config option tez.rss.client.send.thread.num by @bin41215 in #1142
  • [#1143] docs: Correct sequence number text by reducing paragraph indentation by 1 space by @bowenliang123 in #1144
  • [#1044] chore: Add LICENSE-binary by @jerqi in #1054
  • [#1132] improvement(spark): Unregister shuffle explicitly when Spark application is stopped. by @zhuyaogai in #1139
  • [#1127] fix(netty): incorrect bytebuf release for ShuffleBlockInfo data in client side by @zuston in #1150
  • [#1051] fix(mr): Ensure configurations in both mapred-site.xml and dynamic_client.conf take effect. by @qijiale76 in #1112
  • [#1155] fix(netty): io.netty.util.internal.OutOfDirectMemoryError. by @leixm in #1151
  • [#1152] fix: Direct memory may leak in shuffle server. by @zhengchenyu in #1154
  • [MINOR] Remove leixm from collaborators by @jerqi in #1158
  • [#1108] feat(server): Add labels with disk path for local storage total_localfile_write_data metrics. by @zhuyaogai in #1160
  • [#722] test: cleanup residue files in tmp directory after tests by @cchung100m in #1134
  • [#1161] improvement: Reduce the data copy by @jerqi in #1162
  • [#1062] fix(tez): Fix to use TezIdHelper to getTaskAttemptId instead of default IdHelper. by @lifeSo in #1078
  • [#1013] fix(tez): Wait when return MapOutput of type wait by @lifeSo in #1063
  • [#1165] improvement(tez): Unregister shuffle data after completing the execution of a DAG. by @zhuyaogai in #1166
  • [#1165][FOLLOWUP] Fix some spelling mistakes. by @zhuyaogai in #1172
  • [#1164] refactor: Exposing the getDataLen method for ShuffleDataResult by @pegasas in #1170
  • [#1173] fix: incorrect shuffle server status by @xianjingfeng in #1174
  • [#1086] [Doc] Simplify the Gluten code and add the doc by @summaryzb in #1322