Newlib built with `-Os` makes routines like `memcpy` very slow on the
RP2350 because the size-optimized versions use byte-wise operations. On the
RP2040 this doesn't matter because a ROM routine is used instead of the
library one, but on the Pico 2 the library `memcpy` is almost 10x slower
than the optimal method.
* Update GCC to 12.4
* Update Newlib to 4.4.0
* Move to `-O2` library compilation
The new toolchain appears to add ~10K to RP2350 flash usage (less on the
RP2040).
* Add native Apple silicon (M1/M2/M3) support
* Identify ARM Macs in the `get.py` download script
Thanks to the ESP32 `get.py` sources!
* Rebuild M1 without using `strip`
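
As a rough illustration of the ARM Mac identification, here is a minimal
sketch using Python's standard `platform` module. The function name
`identify_host` and the returned keys (`macos-arm64`, `macos-x86_64`) are
hypothetical and are not the identifiers used by the real `get.py`.

```python
import platform


def identify_host(system=None, machine=None):
    """Return a hypothetical host key used to pick a toolchain archive.

    The key strings are illustrative assumptions, not the names that
    the real get.py uses.
    """
    # Fall back to the running interpreter's view of the host.
    system = (system or platform.system()).lower()
    machine = (machine or platform.machine()).lower()

    if system == "darwin":
        # Apple silicon Macs report "arm64"; Intel Macs report "x86_64".
        if machine == "arm64":
            return "macos-arm64"
        return "macos-x86_64"

    # Any other host: combine the two values as-is.
    return system + "-" + machine


if __name__ == "__main__":
    print(identify_host())
```

Passing `system` and `machine` explicitly makes the selection easy to check
on any host without needing the actual hardware.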
When `get.py` is run from a script, the percentage updates printed while
downloading the toolchain end up as hundreds to thousands of lines in log
files. When stdout is not a terminal, skip printing these percentages to
shrink log files significantly. Errors and other messages are still
reported as normal.
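
The terminal check described above could look roughly like the following
sketch. The helper name `report_progress` and its parameters are
hypothetical; only `isatty()` is a real standard-library call, and the
actual `get.py` code may be structured differently.

```python
import sys


def report_progress(done, total, stream=None):
    """Write a percentage update only when the stream is a terminal.

    Hypothetical helper: returns True if something was written,
    False if the update was suppressed.
    """
    if stream is None:
        stream = sys.stdout

    # Redirected output (log files, pipes, CI) is not a TTY, so skip
    # the update to avoid one log line per percentage step.
    if not stream.isatty():
        return False

    percent = int(done * 100 / total) if total > 0 else 0
    # Carriage return rewrites the same terminal line in place.
    stream.write("\r%d%%" % percent)
    stream.flush()
    return True
```

Errors would still go through the normal reporting path, since only the
progress helper consults `isatty()`.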