Cpu feature detection issues M1 mac #41895

Closed
gbaraldi opened this issue Aug 16, 2021 · 2 comments
Labels
system:apple silicon Affects Apple Silicon only (Darwin/ARM64) - e.g. M1 and other M-series chips

Comments

@gbaraldi
Member

Julia currently doesn't detect any CPU features when calling Base.BinaryPlatforms.CPUID.cpu_isa on an M1 Mac. Julia isn't the only project with runtime feature detection issues in the ARM-Apple space.
Go had this issue and has an implementation golang/go#42747.
I haven't checked, but from reading the code, we try to detect the features via getauxval(), which isn't available in the Darwin libc, and the fallback is a lookup by hardcoded CPU name, which isn't available either.

julia/src/processor_arm.cpp

Lines 1184 to 1308 in a08a3ff

static NOINLINE std::pair<uint32_t,FeatureList<feature_sz>> _get_host_cpu()
{
    FeatureList<feature_sz> features = {};
    // Here we assume that only the lower 32bit are used on aarch64
    // Change the cast here when that's not the case anymore (and when there's features in the
    // high bits that we want to detect).
    features[0] = (uint32_t)jl_getauxval(AT_HWCAP);
    features[1] = (uint32_t)jl_getauxval(AT_HWCAP2);
#ifdef _CPU_AARCH64_
    if (test_nbit(features, 31)) // HWCAP_PACG
        set_bit(features, Feature::pauth, true);
#endif
    auto cpuinfo = get_cpuinfo();
    auto arch = get_elf_arch();
#ifdef _CPU_ARM_
    if (arch.version >= 7) {
        if (arch.klass == 'M') {
            set_bit(features, Feature::mclass, true);
        }
        else if (arch.klass == 'R') {
            set_bit(features, Feature::rclass, true);
        }
        else if (arch.klass == 'A') {
            set_bit(features, Feature::aclass, true);
        }
    }
    switch (arch.version) {
    case 8:
        set_bit(features, Feature::v8, true);
        JL_FALLTHROUGH;
    case 7:
        set_bit(features, Feature::v7, true);
        break;
    default:
        break;
    }
#endif
    std::set<uint32_t> cpus;
    std::vector<std::pair<uint32_t,CPUID>> list;
    // Ideally the feature detection above should be enough.
    // However depending on the kernel version not all features are available
    // and it's also impossible to detect the ISA version which contains
    // some features not yet exposed by the kernel.
    // We therefore try to get a more complete feature list from the CPU name.
    // Since it is possible to pair cores that have different feature set
    // (Observed for exynos 9810 with exynos-m3 + cortex-a55) we'll compute
    // an intersection of the known features from each core.
    // If there's a core that we don't recognize, treat it as generic.
    bool extra_initialized = false;
    FeatureList<feature_sz> extra_features = {};
    for (auto info: cpuinfo) {
        auto name = (uint32_t)get_cpu_name(info);
        if (name == 0) {
            // no need to clear the feature set if it wasn't initialized
            if (extra_initialized)
                extra_features = FeatureList<feature_sz>{};
            extra_initialized = true;
            continue;
        }
        if (!check_cpu_arch_ver(name, arch))
            continue;
        if (cpus.insert(name).second) {
            if (extra_initialized) {
                extra_features = extra_features & find_cpu(name)->features;
            }
            else {
                extra_initialized = true;
                extra_features = find_cpu(name)->features;
            }
            list.emplace_back(name, info);
        }
    }
    features = features | extra_features;
    // Not all elements/pairs are valid
    static constexpr CPU v8order[] = {
        CPU::arm_cortex_a35,
        CPU::arm_cortex_a53,
        CPU::arm_cortex_a55,
        CPU::arm_cortex_a57,
        CPU::arm_cortex_a72,
        CPU::arm_cortex_a73,
        CPU::arm_cortex_a75,
        CPU::arm_cortex_a76,
        CPU::arm_neoverse_n1,
        CPU::arm_neoverse_n2,
        CPU::arm_neoverse_v1,
        CPU::nvidia_denver2,
        CPU::nvidia_carmel,
        CPU::samsung_exynos_m1,
        CPU::samsung_exynos_m2,
        CPU::samsung_exynos_m3,
        CPU::samsung_exynos_m4,
        CPU::samsung_exynos_m5,
    };
    shrink_big_little(list, v8order, sizeof(v8order) / sizeof(CPU));
#ifdef _CPU_ARM_
    // Not all elements/pairs are valid
    static constexpr CPU v7order[] = {
        CPU::arm_cortex_a5,
        CPU::arm_cortex_a7,
        CPU::arm_cortex_a8,
        CPU::arm_cortex_a9,
        CPU::arm_cortex_a12,
        CPU::arm_cortex_a15,
        CPU::arm_cortex_a17
    };
    shrink_big_little(list, v7order, sizeof(v7order) / sizeof(CPU));
#endif
    uint32_t cpu = 0;
    if (list.empty()) {
        cpu = (uint32_t)generic_for_arch(arch);
    }
    else {
        // This also covers `list.size() > 1` case which means there's a unknown combination
        // consists of CPU's we know. Unclear what else we could try so just randomly return
        // one...
        cpu = list[0].first;
    }
    // Ignore feature bits that we are not interested in.
    mask_features(feature_masks, &features[0]);
    return std::make_pair(cpu, features);
}
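
To make the failure mode easier to see, here is a much simplified model of the logic above. It is only a sketch, not Julia's actual code: the feature bits, the function name `model_host_features`, and the string-keyed table are all hypothetical. It shows that when the kernel reports no HWCAP bits and no core names are available (which appears to be the situation on Darwin), the resulting feature set is empty.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical feature bits, for illustration only.
constexpr uint32_t kCrc = 1u << 0;
constexpr uint32_t kAes = 1u << 1;
constexpr uint32_t kLse = 1u << 2;

// Simplified model: the kernel-reported bits are OR'ed with the
// intersection of the table entries of every core. A core that is not
// in the table empties the intersection (it is treated as generic).
uint32_t model_host_features(uint32_t hwcap,
                             const std::vector<std::string> &core_names,
                             const std::map<std::string, uint32_t> &known)
{
    bool initialized = false;
    uint32_t extra = 0;
    for (const auto &name : core_names) {
        auto it = known.find(name);
        if (it == known.end()) {
            // Unknown core: nothing can be assumed beyond the kernel bits.
            extra = 0;
            initialized = true;
            continue;
        }
        extra = initialized ? (extra & it->second) : it->second;
        initialized = true;
    }
    return hwcap | extra;
}
```

With `hwcap == 0` and an empty `core_names` list, the function returns 0, which matches the observation that no features are detected.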

LLVM does use the features, since it has a separate flag (-mcpu=apple-a12), so this isn't very urgent.

Using something like https://developer.apple.com/documentation/kernel/1387446-sysctlbyname seems like an option in order not to hardcode it for future Apple silicon CPUs.
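
A rough sketch of what a sysctlbyname-based query could look like is below. This is not what Julia does; the helper names (`sysctl_flag_enabled`, `collect_features`), the bit assignments, and the choice of keys are assumptions for illustration. The `hw.optional.*` keys listed are ones I believe macOS exposes on M1, but they should be checked against `sysctl hw.optional` on a real machine. On non-Apple platforms the query is stubbed out to return false.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#if defined(__APPLE__)
#include <sys/types.h>
#include <sys/sysctl.h>
#endif

// Returns true if the named sysctl exists and reports a nonzero value.
// On non-Apple platforms this always returns false.
bool sysctl_flag_enabled(const char *name)
{
#if defined(__APPLE__)
    int value = 0;
    size_t size = sizeof(value);
    if (sysctlbyname(name, &value, &size, nullptr, 0) != 0)
        return false; // key missing or query failed
    return value != 0;
#else
    (void)name;
    return false;
#endif
}

// Hypothetical mapping from sysctl keys to feature bits.
struct SysctlFeature {
    const char *key;
    uint32_t bit;
};

constexpr SysctlFeature kSysctlFeatures[] = {
    {"hw.optional.neon", 1u << 0},
    {"hw.optional.armv8_crc32", 1u << 1},
    {"hw.optional.armv8_1_atomics", 1u << 2},
};

// Builds a feature bitmask by asking `query` about every key in the table.
// The query is injectable so the logic can be exercised without macOS.
uint32_t collect_features(
    const std::function<bool(const char *)> &query = sysctl_flag_enabled)
{
    uint32_t mask = 0;
    for (const auto &f : kSysctlFeatures) {
        if (query(f.key))
            mask |= f.bit;
    }
    return mask;
}
```

Calling `collect_features()` with no arguments queries the running system. Because a missing key simply reads as "feature absent", new keys for future CPUs could be added to the table without breaking older macOS versions.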

@IanButterworth IanButterworth added the system:apple silicon Affects Apple Silicon only (Darwin/ARM64) - e.g. M1 and other M-series chips label Aug 16, 2021
@yuyichao
Contributor

Dup of #40876

@gbaraldi
Member Author

I'm not that familiar with this code. Is this the same codepath for feature detection that LLVM uses for codegen?
