Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MacOS: Avoid fatal ENAMETOOLONG on SMP startup #1783

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

kinkie
Copy link
Contributor

@kinkie kinkie commented Apr 17, 2024

FATAL: ... failed to shm_open(/squid-cache_mem_map_filenos.shm):
(63) File name too long

MacOS shared memory segment names must be shorter than 32 characters.

@squid-anubis squid-anubis added M-failed-description https://github.com/measurement-factory/anubis#pull-request-labels and removed M-failed-description https://github.com/measurement-factory/anubis#pull-request-labels labels Apr 17, 2024
src/MemStore.cc Show resolved Hide resolved
@rousskov rousskov changed the title MacOS portability: shorten shm names MacOS: Avoid fatal ENAMETOOLONG on SMP startup Apr 17, 2024
@@ -28,7 +28,7 @@
static const auto &
MapLabel()
{
static const auto label = new SBuf("transients_map");
static const auto label = new SBuf("transients");
return *label;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I love the simplicity of the current solution, but it has several problems that are not visible in the PR diff. Some of them are significant. In aggregate, they call for a different solution IMO.

I speculate that the current solution does not fully address the problem because Rock cache_dirs with reasonable names will exceed the same limit. Each cache_dir map name is derived from a cache_dir database filename. Those filenames will often be much longer than "cache_mem_map"! See Rock::SwapDir::inodeMapPath().

I speculate that the current solution does not fully address the problem for Squids started with -n command line option because that option value is added to the segment name. The default service name is APP_SHORTNAME (usually "squid"), but names set on the command line can be as long as 32 characters. See service_name and Ipc::Mem::Segment::GenerateName().

The current solution introduces a new inconsistency: Some StoreMap objects still use a _map name suffix/component and some do not.

Removal of the _map suffix makes debugging more misleading. For example, the message below is related to cache_mem, but Squid is not really reading cache_mem segment or pages here. It is reading a map of those pages. There are many such debugging messages, in various important contexts.

openForReadingAt: cannot open empty entry 14914 for reading cache_mem

Finally, due to code duplication, the current changes introduce an inconsistency with debugging in CollapsedForwarding::HandleNewData() that still uses transients_map. This is very minor, of course.


I suggest a different approach. Here is a very rough sketch:

1: Undo current PR changes. We will keep the existing StoreMap labels intact.

2: Adjust Ipc::Mem::Segment::GenerateName() direct and indirect callers to supply two string parameters1 (e.g., humanId and machineId). The first string is the human-friendly segment name we currently use. No changes there. The second string will contain one or more dash-separated components: scope - subscope - subsubscope... The first component uniquely identifies StoreMap caller/context (e.g., MemStore will use scope 1; Transients will use scope 2, RockSwapDir will have scope 3 -- all coming from some new enum, of course). The subsequent ids uniquely identify scopes within that scope (e.g., RockSwapDir will have subscopes for each cache_dir directive; StoreMap will add (sub)subscope 1 for slices segment, 2 for anchors segment, and 3 for filenos segment). We need this manual ID-creation process because we cannot easily and reliably create short names from user-controlled long strings; a hash or map, for example, will not work well. It is more difficult to describe than it will look in the final code.

3: Adjust Ipc::Mem::Segment::GenerateName() to take two string parameters (e.g., humandId and machineId), and to produce (on MacOS) segment names in the following format / service_name - machineId .shm. This algorithm does not accommodate very long service names, but perhaps that is OK. I suspect we must keep the service name as a part of the name to avoid clashes among multiple Squid instances running on the same host...

4: Log (via level-2 or level-3 debugs()) the mapping between Ipc::Mem::Segment::GenerateName() humanId, machineId, and return value.

5: Use the human-friendly id parameter as Segment::theName, except when making system calls. Those system calls and corresponding error messages should use the string generated by GenerateName(), of course. We can store that generated string in a new Segment::systemName_, Segment::syscallName_ , or similar data member. This will make segment debugging cleaner and consistent across platforms, which is an improvement on top of this bug fix.

Footnotes

  1. Depending on how the code ends up looking, we may introduce a simple class to encapsulate this information.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can do it, but I have a couple of questions:

  1. why waste precious characters on machineId? let's do away with the dashes. We could allocate exactly two characters for machined. Or what if we used a md5 hash? The map is printed anyway. It could also be part of the "mgr:info" report

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. why waste precious characters on machineId? let's do away with the dashes. We could allocate exactly two characters for machined.

Good question.

We cannot use two fixed position characters because some existing use cases use three levels.

Using three or more (as needed) positions may cover existing cases, but will limit the number of items per position to a subset of valid filename characters. Limiting the number of supported segments per level this way may be OK, but it would be more difficult for a human to interpret machine ID.

Or what if we used a md5 hash?

Using (a portion of) an md5 hash introduces collisions and destroys readability of "ls" output. It doesn't really buy us much either AFAICT.

@rousskov rousskov added the S-waiting-for-author author action is expected (and usually required) label Apr 17, 2024
@squid-anubis squid-anubis added M-failed-other https://github.com/measurement-factory/anubis#pull-request-labels and removed M-failed-other https://github.com/measurement-factory/anubis#pull-request-labels labels May 12, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-for-author author action is expected (and usually required)
Projects
None yet
3 participants