Following discussions in PR #16666, this commit updates the float
formatting code to improve the `repr` reversibility, i.e. the percentage of
valid floating point numbers that do parse back to the same number when
formatted by `repr` (in CPython it's 100%).
This new code offers a choice of 3 float conversion methods, depending on
the desired tradeoff between code size and conversion precision:
- BASIC method is the smallest code footprint
- APPROX method uses an iterative method to approximate the exact
representation, which is a bit slower but but does not have a big impact
on code size. It provides `repr` reversibility on >99.8% of the cases in
double precision, and on >98.5% in single precision (except with REPR_C,
where reversibility is 100% as the last two bits are not taken into
account).
- EXACT method uses higher-precision floats during conversion, which
provides perfect results but has a higher impact on code size. It is
faster than APPROX method, and faster than the CPython equivalent
implementation. It is however not available on all compilers when using
FLOAT_IMPL_DOUBLE.
Here is the table comparing the impact of the three conversion methods on
code footprint on PYBV10 (using single-precision floats) and reversibility
rate for both single-precision and double-precision floats. The table
includes current situation as a baseline for the comparison:
PYBV10 REPR_C FLOAT DOUBLE
current = 364688 12.9% 27.6% 37.9%
basic = 364812 85.6% 60.5% 85.7%
approx = 365080 100.0% 98.5% 99.8%
exact = 366408 100.0% 100.0% 100.0%
Signed-off-by: Yoctopuce dev <dev@yoctopuce.com>
During the coverage test, all the values encountered are within the range
of `%d`.
These locations were found using an experimental gcc plugin for `mp_printf`
error checking.
Signed-off-by: Jeff Epler <jepler@gmail.com>
This fixes the following diagnostic produced by the plugin:
error: argument 3: Format ‘%x’ requires a ‘int’ or
‘unsigned int’ (32 bits), not ‘long unsigned int’ [size 64]
[-Werror=format=]
Signed-off-by: Jeff Epler <jepler@gmail.com>
Test 'l' and 'll' sized objects. When the platform's `mp_int_t` is not 64
bits, dummy values are printed instead so the test result can match across
all platforms.
Ensure hex test values have a letter so 'x' vs 'X' is tested.
And test 'p' and 'P' pointer printing.
Signed-off-by: Jeff Epler <jepler@gmail.com>
This commit provides helpers to retrieve integer values from
mp_obj_t when the content does not fit in a 32 bits integer,
without risking an implicit wrap due to an int overflow.
Signed-off-by: Yoctopuce dev <dev@yoctopuce.com>
An attempt to build the coverage module into the nanbox binary failed, but
pointed out that these sites needed explicit conversion from pointer to
object.
Signed-off-by: Jeff Epler <jepler@gmail.com>
When `MICROPY_PY_SYS_SETTRACE` was enabled, a crash was seen in the
qemu_mips build. It seems likely that this was due to these added fields
not being initialized.
Signed-off-by: Jeff Epler <jepler@gmail.com>
In the case where an mpz number is zero, its `len` is 0 and its `dig` is
NULL. In that case, decrementing NULL via `d--` is undefined behavior
according to the C specification.
Restructuring the loops in this way avoids undefined behavior.
Also, ensure that these cases are tested in the coverage test. This
doesn't make much difference now, but would otherwise cause errors later
when the undefined behavior sanitizer is employed in CI.
Signed-off-by: Jeff Epler <jepler@gmail.com>
Test modified to reschedule itself based on a flag setting. Without the
change in the parent commit, this test executes the callback indefinitely
and hangs but with the change it runs only once each time
mp_handle_pending() is called.
Modified-by: Angus Gratton <angus@redyak.com.au>
Signed-off-by: Andrew Leech <andrew.leech@planetinnovation.com.au>
This commit adds a new `RingIO` type which exposes the internal ring-buffer
code for general use in Python programs. It has the stream interface
making it similar to `StringIO` and `BytesIO`, except `RingIO` has a fixed
buffer size and is automatically safe when reads and writes are in
different threads or an IRQ.
This new type is enabled at the "extra features" ROM level.
Signed-off-by: Andrew Leech <andrew.leech@planetinnovation.com.au>
Necessary to pass CI when testing the V2 preview APIs.
Also adds an extra coverage test for the legacy stackctrl API, to maintain
coverage and check for any regression.
This work was funded through GitHub Sponsors.
Signed-off-by: Angus Gratton <angus@redyak.com.au>
Use new function mp_obj_new_str_from_cstr() where appropriate. It
simplifies the code, and makes it smaller too.
Signed-off-by: Jon Foster <jon@jon-foster.co.uk>
Before this change, long/mpz ints propagated into all future calculations,
even if their value could fit in a small-int object. With this change, the
result of a big-int binary op will now be converted to a small-int object
if the value fits in a small-int.
For example, a relatively common operation like `x = a * b // c` where
a,b,c all small ints would always result in a long/mpz int, even if it
didn't need to, and then this would impact all future calculations with
x.
This adds +24 bytes on PYBV11 but avoids heap allocations and potential
surprises (e.g. `big-big` is now a small `0`, and can safely be accessed
with MP_OBJ_SMALL_INT_VALUE).
Performance tests are unchanged on PYBV10, except for `bm_pidigits.py`
which makes heavy use of big-ints and gains about 8% in speed.
Unix coverage tests have been updated to cover mpz code that is now
unreachable by normal Python code (removing the unreachable code would lead
to some surprising gaps in the internal C functions and the functionality
may be needed in the future, so it is kept because it has minimal
overhead).
This work was funded through GitHub Sponsors.
Signed-off-by: Jim Mussared <jim.mussared@gmail.com>
The STATIC macro was introduced a very long time ago in commit
d5df6cd44a. The original reason for this was
to have the option to define it to nothing so that all static functions
become global functions and therefore visible to certain debug tools, so
one could do function size comparison and other things.
This STATIC feature is rarely (if ever) used. And with the use of LTO and
heavy inline optimisation, analysing the size of individual functions when
they are not static is not a good representation of the size of code when
fully optimised.
So the macro does not have much use and it's simpler to just remove it.
Then you know exactly what it's doing. For example, newcomers don't have
to learn what the STATIC macro is and why it exists. Reading the code is
also less "loud" with a lowercase static.
One other minor point in favour of removing it, is that it stops bugs with
`STATIC inline`, which should always be `static inline`.
Methodology for this commit was:
1) git ls-files | egrep '\.[ch]$' | \
xargs sed -Ei "s/(^| )STATIC($| )/\1static\2/"
2) Do some manual cleanup in the diff by searching for the word STATIC in
comments and changing those back.
3) "git-grep STATIC docs/", manually fixed those cases.
4) "rg -t python STATIC", manually fixed codegen lines that used STATIC.
This work was funded through GitHub Sponsors.
Signed-off-by: Angus Gratton <angus@redyak.com.au>
Necessary to get coverage of the new event functions.
Deletes the case that called usleep(delay) for mp_hal_delay_ms(), it seems
like this wouldn't have ever happened anyhow (MICROPY_EVENT_POOL_HOOK is
always defined for the unix port).
This work was funded through GitHub Sponsors.
Signed-off-by: Angus Gratton <angus@redyak.com.au>
This fixes the case where e.g.
struct foo_t {
mp_obj_t x;
uint16_t y;
char buf[];
};
will have `sizeof(struct foo_t)==8`, but `offsetof(struct foo_t, buf)==6`.
When computing the size to allocate for `m_new_obj_var` we need to use
offsetof to avoid over-allocating. This is important especially when it
might cause it to spill over into another GC block.
This work was funded through GitHub Sponsors.
Signed-off-by: Jim Mussared <jim.mussared@gmail.com>
This is a MicroPython-specific module that existed to support the old
version of uasyncio. It's undocumented and not enabled on all ports and
takes up code size unnecessarily.
Signed-off-by: Jim Mussared <jim.mussared@gmail.com>
The assertion that is added here (to gc.c) fails when running this new test
if ALLOC_TABLE_GAP_BYTE is set to 0.
Signed-off-by: Jeff Epler <jepler@gmail.com>
Signed-off-by: Damien George <damien@micropython.org>
Instead of being an explicit field, it's now a slot like all the other
methods.
This is a marginal code size improvement because most types have a make_new
(100/138 on PYBV11), however it improves consistency in how types are
declared, removing the special case for make_new.
Signed-off-by: Jim Mussared <jim.mussared@gmail.com>
This replaces occurences of
foo_t *foo = m_new_obj(foo_t);
foo->base.type = &foo_type;
with
foo_t *foo = mp_obj_malloc(foo_t, &foo_type);
Excludes any places where base is a sub-field or when new0/memset is used.
Signed-off-by: Jim Mussared <jim.mussared@gmail.com>
Background: .mpy files are precompiled .py files, built using mpy-cross,
that contain compiled bytecode functions (and can also contain machine
code). The benefit of using an .mpy file over a .py file is that they are
faster to import and take less memory when importing. They are also
smaller on disk.
But the real benefit of .mpy files comes when they are frozen into the
firmware. This is done by loading the .mpy file during compilation of the
firmware and turning it into a set of big C data structures (the job of
mpy-tool.py), which are then compiled and downloaded into the ROM of a
device. These C data structures can be executed in-place, ie directly from
ROM. This makes importing even faster because there is very little to do,
and also means such frozen modules take up much less RAM (because their
bytecode stays in ROM).
The downside of frozen code is that it requires recompiling and reflashing
the entire firmware. This can be a big barrier to entry, slows down
development time, and makes it harder to do OTA updates of frozen code
(because the whole firmware must be updated).
This commit attempts to solve this problem by providing a solution that
sits between loading .mpy files into RAM and freezing them into the
firmware. The .mpy file format has been reworked so that it consists of
data and bytecode which is mostly static and ready to run in-place. If
these new .mpy files are located in flash/ROM which is memory addressable,
the .mpy file can be executed (mostly) in-place.
With this approach there is still a small amount of unpacking and linking
of the .mpy file that needs to be done when it's imported, but it's still
much better than loading an .mpy from disk into RAM (although not as good
as freezing .mpy files into the firmware).
The main trick to make static .mpy files is to adjust the bytecode so any
qstrs that it references now go through a lookup table to convert from
local qstr number in the module to global qstr number in the firmware.
That means the bytecode does not need linking/rewriting of qstrs when it's
loaded. Instead only a small qstr table needs to be built (and put in RAM)
at import time. This means the bytecode itself is static/constant and can
be used directly if it's in addressable memory. Also the qstr string data
in the .mpy file, and some constant object data, can be used directly.
Note that the qstr table is global to the module (ie not per function).
In more detail, in the VM what used to be (schematically):
qst = DECODE_QSTR_VALUE;
is now (schematically):
idx = DECODE_QSTR_INDEX;
qst = qstr_table[idx];
That allows the bytecode to be fixed at compile time and not need
relinking/rewriting of the qstr values. Only qstr_table needs to be linked
when the .mpy is loaded.
Incidentally, this helps to reduce the size of bytecode because what used
to be 2-byte qstr values in the bytecode are now (mostly) 1-byte indices.
If the module uses the same qstr more than two times then the bytecode is
smaller than before.
The following changes are measured for this commit compared to the
previous (the baseline):
- average 7%-9% reduction in size of .mpy files
- frozen code size is reduced by about 5%-7%
- importing .py files uses about 5% less RAM in total
- importing .mpy files uses about 4% less RAM in total
- importing .py and .mpy files takes about the same time as before
The qstr indirection in the bytecode has only a small impact on VM
performance. For stm32 on PYBv1.0 the performance change of this commit
is:
diff of scores (higher is better)
N=100 M=100 baseline -> this-commit diff diff% (error%)
bm_chaos.py 371.07 -> 357.39 : -13.68 = -3.687% (+/-0.02%)
bm_fannkuch.py 78.72 -> 77.49 : -1.23 = -1.563% (+/-0.01%)
bm_fft.py 2591.73 -> 2539.28 : -52.45 = -2.024% (+/-0.00%)
bm_float.py 6034.93 -> 5908.30 : -126.63 = -2.098% (+/-0.01%)
bm_hexiom.py 48.96 -> 47.93 : -1.03 = -2.104% (+/-0.00%)
bm_nqueens.py 4510.63 -> 4459.94 : -50.69 = -1.124% (+/-0.00%)
bm_pidigits.py 650.28 -> 644.96 : -5.32 = -0.818% (+/-0.23%)
core_import_mpy_multi.py 564.77 -> 581.49 : +16.72 = +2.960% (+/-0.01%)
core_import_mpy_single.py 68.67 -> 67.16 : -1.51 = -2.199% (+/-0.01%)
core_qstr.py 64.16 -> 64.12 : -0.04 = -0.062% (+/-0.00%)
core_yield_from.py 362.58 -> 354.50 : -8.08 = -2.228% (+/-0.00%)
misc_aes.py 429.69 -> 405.59 : -24.10 = -5.609% (+/-0.01%)
misc_mandel.py 3485.13 -> 3416.51 : -68.62 = -1.969% (+/-0.00%)
misc_pystone.py 2496.53 -> 2405.56 : -90.97 = -3.644% (+/-0.01%)
misc_raytrace.py 381.47 -> 374.01 : -7.46 = -1.956% (+/-0.01%)
viper_call0.py 576.73 -> 572.49 : -4.24 = -0.735% (+/-0.04%)
viper_call1a.py 550.37 -> 546.21 : -4.16 = -0.756% (+/-0.09%)
viper_call1b.py 438.23 -> 435.68 : -2.55 = -0.582% (+/-0.06%)
viper_call1c.py 442.84 -> 440.04 : -2.80 = -0.632% (+/-0.08%)
viper_call2a.py 536.31 -> 532.35 : -3.96 = -0.738% (+/-0.06%)
viper_call2b.py 382.34 -> 377.07 : -5.27 = -1.378% (+/-0.03%)
And for unix on x64:
diff of scores (higher is better)
N=2000 M=2000 baseline -> this-commit diff diff% (error%)
bm_chaos.py 13594.20 -> 13073.84 : -520.36 = -3.828% (+/-5.44%)
bm_fannkuch.py 60.63 -> 59.58 : -1.05 = -1.732% (+/-3.01%)
bm_fft.py 112009.15 -> 111603.32 : -405.83 = -0.362% (+/-4.03%)
bm_float.py 246202.55 -> 247923.81 : +1721.26 = +0.699% (+/-2.79%)
bm_hexiom.py 615.65 -> 617.21 : +1.56 = +0.253% (+/-1.64%)
bm_nqueens.py 215807.95 -> 215600.96 : -206.99 = -0.096% (+/-3.52%)
bm_pidigits.py 8246.74 -> 8422.82 : +176.08 = +2.135% (+/-3.64%)
misc_aes.py 16133.00 -> 16452.74 : +319.74 = +1.982% (+/-1.50%)
misc_mandel.py 128146.69 -> 130796.43 : +2649.74 = +2.068% (+/-3.18%)
misc_pystone.py 83811.49 -> 83124.85 : -686.64 = -0.819% (+/-1.03%)
misc_raytrace.py 21688.02 -> 21385.10 : -302.92 = -1.397% (+/-3.20%)
The code size change is (firmware with a lot of frozen code benefits the
most):
bare-arm: +396 +0.697%
minimal x86: +1595 +0.979% [incl +32(data)]
unix x64: +2408 +0.470% [incl +800(data)]
unix nanbox: +1396 +0.309% [incl -96(data)]
stm32: -1256 -0.318% PYBV10
cc3200: +288 +0.157%
esp8266: -260 -0.037% GENERIC
esp32: -216 -0.014% GENERIC[incl -1072(data)]
nrf: +116 +0.067% pca10040
rp2: -664 -0.135% PICO
samd: +844 +0.607% ADAFRUIT_ITSYBITSY_M4_EXPRESS
As part of this change the .mpy file format version is bumped to version 6.
And mpy-tool.py has been improved to provide a good visualisation of the
contents of .mpy files.
In summary: this commit changes the bytecode to use qstr indirection, and
reworks the .mpy file format to be simpler and allow .mpy files to be
executed in-place. Performance is not impacted too much. Eventually it
will be possible to store such .mpy files in a linear, read-only, memory-
mappable filesystem so they can be executed from flash/ROM. This will
essentially be able to replace frozen code for most applications.
Signed-off-by: Damien George <damien@micropython.org>
Doing "import <tab>" will now complete/list built-in modules.
Originally at adafruit#4548 and adafruit#4608
Signed-off-by: Artyom Skrobov <tyomitch@gmail.com>
Two issues are tackled:
1. The calculation of the correct length to print is fixed to treat the
precision as a maximum length instead as the exact length.
This is done for both qstr (%q) and for regular str (%s).
2. Fix the incorrect use of mp_printf("%.*s") to mp_print_strn().
Because of the fix of above issue, some testcases that would print
an embedded null-byte (^@ in test-output) would now fail.
The bug here is that "%s" was used to print null-bytes. Instead,
mp_print_strn is used to make sure all bytes are outputted and the
exact length is respected.
Test-cases are added for both %s and %q with a combination of precision
and padding specifiers.
Previous behaviour is when this argument is set to "true", in which case
the function will raise any pending exception. Setting it to "false" will
cancel any pending exception.
This behaviour of a NULL write C method on a stream that uses the write
adaptor objects is no longer supported. It was only ever used by the
coverage build for testing the fail path of mp_get_stream_raise().