-
Notifications
You must be signed in to change notification settings - Fork 7
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add an option to cache the whole cube image #1276
Conversation
…mark/opt_to_cache_image_cubes
@markccchiang Just to mention, I am aware the macOS12 unit tests are failing here. They consistently failed 4 times in a row when I made Github Actions rerun them. When I let Github Actions re-run the previous commit, it passed the first time. The error is: I have tried building and running it manually with the Address Sanitizer flags as that is what Github Actions has always been doing. The runtime is quite variable: 152.68s, 65.16s, 144.22s. I figured a better test would be to compare it with the 'dev' branch. I get: 90.50s, 91.35s, 91.51s. Therefore, something in this branch has slowed down the Units Tests runtime and they can no longer consistently finish within 2 minutes. It seems to only be an issue on macOS12, which is an original M1 Mac. I have not gone so far as to identify which commit in your branch has caused the Unit Tests to slow down because first I guess we need to know if this is considered an actual issue? If it is not considered an issue, you can get around the Github Actions failure by increasing the timeout in |
@ajm-asiaa Thank you for the reminding and investigating! You are right. This is due to the sample image files I generated are too large. Because I was doing performance tests for the functionalities of cached cube images in unit tests. I will reduce the size of sample image files. So this problem will be solved without changing the timeout setting. |
…mark/opt_to_cache_image_cubes
…le cache and mip data
@markccchiang here are a few findings I observed so far:
This seems not right. The actually RAM usage from the macOS activity monitor shows the right amount. Will have more tests soon. |
@markccchiang Overall, this does improve UX significantly in many I/O intensive features. @veggiesaurus please have a test as well if possible. We also need a strategy on RAM management for the controller deployment. Currently it is the backend flag |
We might also need the ability to control the total amount of memory that cached cubes can take up. For example, if I load 10 images each with 10 GB, would that allow it, and use up 100 GB of memory? |
@markccchiang could you check if the current setup is for the total amount of caching memory, not for individual image? I also found that we need to improve the [info] logs when loading images that can or cannot be cached. The use case could be I reserved 4 GB as image cache. Then I load the following images in order
|
…move it from constructors of FullImageCache and CubeImageCache
…cheSize from ImageCache
…geChannels as UpdateValidity from ImageCache
…nd IsCurrentStokes from Frame
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a partial review, which includes some general remarks about the current structure of the code, and ways in which it could be refactored to divide responsibility between these classes more consistently.
src/Cache/StokesCache.h
Outdated
void UpdateValidity(int stokes) override; | ||
|
||
private: | ||
bool FillCubeImageCache(std::unique_ptr<float[]>& stokes_data, int stokes); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be renamed to FillStokesCache
.
test/TestCubeImageCache.cc
Outdated
@@ -0,0 +1,772 @@ | |||
/* This file is part of the CARTA Image Viewer: https://github.com/CARTAvis/carta-backend |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test file should be renamed (probably to TestImageCaches
), and the class and test names should be updated to match the new names in the code.
src/Frame/Frame.h
Outdated
casacore::IPosition OriginalImageShape() const; // Image shape from the original file | ||
size_t Width() const; // length of x axis | ||
size_t Height() const; // length of y axis | ||
size_t Depth() const; // length of z axis | ||
size_t NumStokes() const; // if no stokes axis, number of stokes = 1 | ||
int XAxis() const; | ||
int YAxis() const; | ||
int ZAxis() const; | ||
int StokesAxis() const; | ||
int CurrentZ() const; | ||
int CurrentStokes() const; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's fine to add all of these getters for consistency, even if though some of these values aren't currently accessed from outside the frame class, but it isn't necessary to use them inside the frame class. It's fine to use the bare properties (e.g. _width
rather than Width()
). This will also greatly simplify the diff of the frame class.
src/Frame/Frame.cc
Outdated
casacore::IPosition start(_image_shape.size()); | ||
casacore::IPosition start(OriginalImageShape().size()); | ||
start = 0; | ||
casacore::IPosition end(_image_shape); | ||
casacore::IPosition end(OriginalImageShape()); | ||
end -= 1; // last position, not length | ||
|
||
// Slice x axis | ||
if (_x_axis >= 0) { | ||
if (XAxis() >= 0) { | ||
int start_x(x_range.from), end_x(x_range.to); | ||
|
||
// Normalize x constants | ||
if (start_x == ALL_X) { | ||
start_x = 0; | ||
} | ||
if (end_x == ALL_X) { | ||
end_x = _width - 1; | ||
end_x = Width() - 1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is an example of where the properties should still be used (rather than the getters). All of these little changes should be reverted.
There are a few places in the original code where getters are used instead of properties, and for consistency I think that we should update them, but I would prefer to do that in a separate PR (to keep unrelated changes out of this PR).
src/Frame/Frame.cc
Outdated
bool Frame::UpdateChannelCache() { | ||
return _image_cache->UpdateChannelCache(CurrentZ(), CurrentStokes()); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since this is now a one-liner, and only used in a few places in Frame, I would prefer to remove this wrapper function and call the function on the cache directly.
src/Cache/ChannelCache.cc
Outdated
float ChannelCache::GetValue(int x, int y, int z, int stokes) { | ||
bool write_lock(false); | ||
queuing_rw_mutex_scoped cache_lock(&_cache_mutex, write_lock); | ||
return _channel_data[(_width * y) + x]; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function is used in two different ways: from Frame, to access a single value (where this lock is required), and inside a loop in other cache functions (where this lock should not be used; the calling function should acquire the lock once).
But this is part of a broader issue with the way that the cache functions are currently used from frame, which I will describe in more detail below.
src/Cache/StokesCache.cc
Outdated
} | ||
|
||
bool StokesCache::ChannelDataAvailable(int z, int stokes) const { | ||
return (z == ALL_Z) && (stokes == _frame->CurrentStokes()) && _stokes_image_cache_valid; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Surely the (z == ALL_Z)
check here is redundant? If the stokes matches, data is available for any z value.
src/Frame/Frame.cc
Outdated
if (ImageCacheAvailable()) { | ||
cursor_value_with_current_stokes = _image_cache->GetValue(x, y, CurrentZ(), CurrentStokes()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are several places in Frame where the cache is used like this. It doesn't make sense to check first if the cache is valid and then acquire the cache lock and get the data. The check should be performed after the lock is acquired, in the same function as the data retrieval.
The correct thing to do is probably to refactor all instances like this so that the public function on the cache returns a boolean and sets an input/output parameter instead, checks internally whether the data is available, and returns false if it isn't. Some cache functions already have this kind of interface, but it's not consistent.
In this specific case, there should be two separate cache functions for getting a value: a public one, which acquires the lock and checks if the data is available, and a protected internal function which just fetches the data.
It would also be helpful to unify the behaviour of functions of data which fetch data only if it exists in the cache, and functions which update the cache if necessary and then fetch data. Currently the frame explicitly updates the channel cache in several places. This should be the internal responsibility of the cache -- the frame should only invalidate the cache when the channel or stokes changes. Functions for fetching data from the cache should have an update
flag which determines whether the cache should be updated on demand when that function is called. The frame would then set that flag appropriately.
src/Frame/Frame.cc
Outdated
} else { | ||
// Required stokes is not the current stokes or the stokes needs to be computed | ||
if (ImageCacheAvailable(CurrentZ(), stokes)) { | ||
get_spatial_profile_from_cache(stokes); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ImageCacheAvailable
is called a second time here inside get_spatial_profile_from_cache
. But this is an example of logic which should be refactored.
…mark/opt_to_cache_image_cubes
…mark/opt_to_cache_image_cubes
|
Description
Cache all cube image data #1286
This is the trial to cache the whole cube image if its memory space is under a specific quota. Users can set the memory quota via the command line
--full_image_cache_size_available
when starting the backend. For example,./carta_backend --full_image_cache_size_available 1000
means at most 1000 MB of memory space for the backend can be used. If the memory of a cube image we open is under 1000 MB, it will cache all the cube image data. The upper limit of full image cache is defined as 90% of users' computer ram capacity. If users set the full image cache that exceeds the upper limit, the backend will reset it to the upper limit. By default, the full image cache is 0 MB. This function will not apply to HDF5 images (only available for CASA or FITS files). The cube image cache will be used in the following calculations:🚧 image moments (This item will do in a separate PR after this PR is merged)
How does this PR solve the issue? Give a brief summary.
Create an
ImageCache
struct under theFrame
class and use it to deal with the channel or cube image data cache, which is obtained from the casacore.Are there any companion PRs (frontend, protobuf)?
No.
Is there anything else that testers should know (e.g. exactly how to reproduce the issue)?
The performances of cached cube images are expected to be better than without the cache (i.e., the current dev branch). Especially when accessing spectral profiles or data in the z-direction.
Checklist
/ no changelog update neededadded corresponding fixprotobuf updated to the latest dev commit /no protobuf update needed