Determining Gentoo CPU_FLAGS_X86

Gentoo have recently taken the positive step of removing x86-specific instruction-set USE flags (such as ‘mmx‘ and ‘sse‘) from the general pool of USE flags, and introducing the new ‘CPU_FLAGS_X86‘ variable to control these platform-specific hardware options.

(Presumably this opens the option for extensions such as ‘CPU_FLAGS_ARM="neon"‘, etc., in the future also…)

There is a new package, app-portage/cpuinfo2cpuflags, which will determine this information – but really, this feels like something that we can figure out for ourselves without needing to pull in additional packages 😉

Originally, the appropriate flags were described in /usr/portage/profiles/use*.desc and this made things easy – but these have now been removed and so we have to perform a somewhat more involved search:

For conventional Linux-on-x86, the following shell invocation should give us all the information we need:

echo $(
      echo -n 'CPU_FLAGS_X86="'
      for FLAG in $(
          find "${PORTDIR:-/usr/portage}" \
            -mindepth 3      \
            -maxdepth 3      \
            -type f          \
            -name \*.ebuild  \
            -exec \
              grep -Hi --colour cpu_flags_x86_ {} + \
        | grep IUSE      \
        | cut -d'"' -f 2 \
        | sort           \
        | uniq
      ); do
          echo $FLAG
      done                            \
    | grep cpu_flags_x86_             \
    | sed 's/^[+-]\?cpu_flags_x86_//' \
    | sort                            \
    | uniq                            \
    | grep -E "$(
         grep -m 1 '^flags' /proc/cpuinfo \
       | sed 's/pni/sse3/'       \
       | cut -d':' -f 2-         \
       | sed 's/^ */^/ ; s/ /$|^/g ; s/$/$/'
      )"                              \
    | xargs echo -n
      echo '"'
)

… or, in an easier to copy-and-paste form:

echo $( echo -n 'CPU_FLAGS_X86="' ; for FLAG in $( find "${PORTDIR:-/usr/portage}" -mindepth 3 -maxdepth 3 -type f -name \*.ebuild -exec grep -Hi --colour cpu_flags_x86_ {} + | grep IUSE | cut -d'"' -f 2 | sort | uniq ); do echo $FLAG; done | grep cpu_flags_x86_ | sed 's/^[+-]\?cpu_flags_x86_//' | sort | uniq | grep -E "$( grep -m 1 '^flags' /proc/cpuinfo | sed 's/pni/sse3/' | cut -d':' -f 2- | sed 's/^ */^/ ; s/ /$|^/g ; s/$/$/' )" | xargs echo -n ; echo '"' )
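To make the inner ‘grep -E‘ pattern a little less opaque, here is a minimal sketch of just that step, run against made-up sample data rather than the real /proc/cpuinfo and ebuild tree. The function name ‘flags_to_regex‘ and both sample variables are purely illustrative and aren't part of the commands above:

```shell
# Sketch: turn a /proc/cpuinfo-style "flags" line into an anchored
# alternation regex, then use it to filter a list of candidate flag names.

# Hypothetical sample data, hard-coded so the example runs anywhere.
sample_flags_line="$(printf 'flags\t\t: fpu mmx sse sse2 pni ssse3')"
sample_candidates='3dnow
mmx
mmxext
sse
sse2
sse3
ssse3'

# Build '^flag1$|^flag2$|...' from a flags line, renaming 'pni' to 'sse3'.
flags_to_regex() {
    printf '%s\n' "$1" \
      | sed 's/pni/sse3/' \
      | cut -d':' -f 2- \
      | sed 's/^ */^/ ; s/ /$|^/g ; s/$/$/'
}

regex="$(flags_to_regex "${sample_flags_line}")"
echo "${regex}"

# Keep only the candidates which the (sample) CPU actually reports.
printf '%s\n' "${sample_candidates}" | grep -E "${regex}" | xargs echo
```

With the sample data this prints the pattern ‘^fpu$|^mmx$|^sse$|^sse2$|^sse3$|^ssse3$‘ followed by ‘mmx sse sse2 sse3 ssse3‘: ‘mmxext‘ is rejected because every alternative is anchored at both ends, and ‘pni‘ has been renamed to ‘sse3‘ before matching.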

… whereas Prefix on x86 platforms such as Mac OS X needs some customisation:

echo $( echo -n 'CPU_FLAGS_X86="' ; for FLAG in $( find "${PORTDIR:-/usr/portage}" -mindepth 3 -maxdepth 3 -type f -name \*.ebuild -exec grep -Hi --colour cpu_flags_x86_ {} + | grep IUSE | cut -d'"' -f 2 | sort | uniq ); do echo $FLAG; done | grep cpu_flags_x86_ | sed 's/^[+-]\?cpu_flags_x86_//' | sort | uniq | grep -E "$( /usr/sbin/sysctl hw | grep 'optional\..*: 1$' | cut -d'.' -f 3 | cut -d':' -f 1 | sed 's/upplemental// ; s/avx1_0/avx/ ; s/avx2_0/avx2/' | xargs echo | sed 's/^ */^/ ; s/ /$|^/g ; s/$/$/' )" | xargs echo -n ; echo '"' )

… noting that Linux running (under emulation) on Mac OS will return more comprehensive results than the above since, for example, the ‘sysctl hw‘ output doesn’t include flags such as ‘popcnt‘.
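The Mac-specific part is really only a renaming exercise, which the following sketch tries to isolate. The ‘sysctl hw‘ output here is a hand-written sample (so the values are assumptions rather than output from any particular machine), and ‘sysctl_to_flags‘ is a hypothetical helper name:

```shell
# Sketch: map 'sysctl hw'-style output to Gentoo-style flag names.
# The sample below is hypothetical; on a real Mac it would be replaced
# by the output of /usr/sbin/sysctl hw.
sample_sysctl='hw.ncpu: 4
hw.optional.mmx: 1
hw.optional.sse4_1: 1
hw.optional.supplementalsse3: 1
hw.optional.avx1_0: 1
hw.optional.avx512f: 0'

# Read sysctl output on stdin; print the enabled optional features,
# renamed where the Mac OS name differs from the Gentoo one.
sysctl_to_flags() {
    grep 'optional\..*: 1$' \
      | cut -d'.' -f 3 \
      | cut -d':' -f 1 \
      | sed 's/upplemental// ; s/avx1_0/avx/' \
      | xargs echo
}

printf '%s\n' "${sample_sysctl}" | sysctl_to_flags
```

For the sample input this prints ‘mmx sse4_1 ssse3 avx‘: only entries ending in ‘: 1‘ survive, ‘supplementalsse3‘ collapses to ‘ssse3‘, and ‘avx1_0‘ becomes ‘avx‘. Any names which Gentoo doesn't know about are harmless, since they are only ever used as a filter against the list gathered from the ebuilds.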

Note that with the move to named repos in place of /usr/portage, relying on ${PORTDIR} or hard-coding ‘/usr/portage‘ will likely be broken fairly shortly. The suggested way to fix this appears to be to use ‘portageq repos_config "${EROOT:-/}"‘ – but this in turn is slow, and would seem to require the following logic:

type -pf portageq >/dev/null 2>&1 || { echo >&2 "FATAL: Cannot locate portageq script" ; exit 1 ; }
# Declare first, so that '||' tests the command substitution rather than 'local' (which always succeeds)
local default
default="$(
    section="DEFAULT"
    portageq repos_config "${EROOT:-/}" | awk "
        BEGIN                    { output = 0 }
        /^\s*\[.*\]\s*$/         { output = 0 }
        ( 1 == output )          { print  \$0 }
        /^\s*\[${section}\]\s*$/ { output = 1 }
    " | grep "main-repo = " | sed 's/^.* = //'
)" || { echo >&2 "FATAL: Cannot determine default repo location: ${?}" ; exit 1 ; }

[[ -n "${default:-}" ]] || { echo >&2 "FATAL: Discovered invalid default repo '${default:-}'" ; exit 1 ; }

local portdir
portdir="$(
    section="${default}"
    portageq repos_config "${EROOT:-/}" | awk "
        BEGIN                    { output = 0 }
        /^\s*\[.*\]\s*$/         { output = 0 }
        ( 1 == output )          { print  \$0 }
        /^\s*\[${section}\]\s*$/ { output = 1 }
    " | grep "location = " | sed 's/^.* = //'
)" || { echo >&2 "FATAL: Cannot determine '${default}' repo location: ${?}" ; exit 1 ; }

[[ -n "${portdir:-}" ]] || { echo >&2 "FATAL: Could not determine location of default repo '${default}'" ; exit 1 ; }
portdir="$( readlink -e "${portdir}" 2>/dev/null )" || { echo >&2 "FATAL: Discovered invalid '${default}' repo location '${portdir}'" ; exit 1 ; }
[[ -d "${portdir}" ]] || { echo >&2 "FATAL: Discovered invalid '${default}' repo location '${portdir}'" ; exit 1 ; }

… simply to work out the appropriate value for ‘${PORTDIR}‘!
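Stripped of the error handling, the awk program above is a small INI-section extractor applied twice: once to find ‘main-repo‘ in the ‘[DEFAULT]‘ section, and once to find ‘location‘ in the section which that names. The sketch below shows the same idea against a hand-written sample configuration, so that it can be tried without portageq being present. The helper name ‘get_section‘ and the sample contents are assumptions for illustration, and the section name is compared as a plain string rather than being spliced into a regular expression:

```shell
# Sketch: print the body of one [section] from INI-style text on stdin.
# 'get_section' is a hypothetical helper, not part of Portage.
get_section() {
    awk -v section="$1" '
        /^[ \t]*\[.*\][ \t]*$/ {
            # A section header: strip the brackets and remember whether
            # this is the section which was asked for.
            name = $0
            gsub(/^[ \t]*\[|\][ \t]*$/, "", name)
            output = ( name == section )
            next
        }
        output { print }
    '
}

# Hypothetical sample, standing in for 'portageq repos_config /' output.
sample_config='[DEFAULT]
main-repo = gentoo

[gentoo]
location = /usr/portage
sync-type = rsync

[overlay]
location = /usr/local/portage'

main_repo="$(printf '%s\n' "${sample_config}" | get_section 'DEFAULT' | sed -n 's/^main-repo = //p')"
location="$(printf '%s\n' "${sample_config}" | get_section "${main_repo}" | sed -n 's/^location = //p')"
echo "${main_repo}: ${location}"
```

For the sample above this prints ‘gentoo: /usr/portage‘. Using ‘[ \t]‘ in place of ‘\s‘ is a deliberate choice here, since ‘\s‘ is a GNU awk extension and other awk implementations may not accept it.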

(This could be somewhat simplified by including stdlib.sh from github.com and calling the ‘std::getfilesection()‘ function, as follows:

type -pf portageq >/dev/null 2>&1 || die "Cannot locate portageq script"
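# Declare first, so that '||' tests the command substitution rather than 'local' (which always succeeds)
local default
default="$(
    std::getfilesection "DEFAULT" <(
        portageq repos_config "${EROOT:-/}"
    ) | grep "main-repo = " | sed 's/^.* = //'
)" || die "Cannot determine default repo location: ${?}"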

[[ -n "${default:-}" ]] || die "Discovered invalid default repo '${default:-}'"

local portdir
portdir="$(
    std::getfilesection "${default}" <(
        portageq repos_config "${EROOT:-/}"
    ) | grep "location = " | sed 's/^.* = //'
)" || die "Cannot determine '${default}' repo location: ${?}"

[[ -n "${portdir:-}" ]] || die "Could not determine location of default repo '${default}'"
portdir="$( readlink -e "${portdir}" 2>/dev/null )" || die "Discovered invalid '${default}' repo location '${portdir}'"
[[ -d "${portdir}" ]] || die "Discovered invalid '${default}' repo location '${portdir}'"

… which is, at least, somewhat shorter…)