updated docs

Zoltán Vörös 2020-11-03 20:58:36 +01:00
parent 0f634f9ee3
commit ebd91b8ee2
8 changed files with 268 additions and 71 deletions

@ -27,7 +27,7 @@ copyright = '2019-2020, Zoltán Vörös and contributors'
author = 'Zoltán Vörös'
# The full version, including alpha/beta/rc tags
release = '1.1.3'
release = '1.2.0'
# -- General configuration ---------------------------------------------------
@ -58,7 +58,7 @@ latex_maketitle = r'''
\Huge\textbf{The $\mu$lab book}
\vskip 0.5em
\LARGE
\textbf{Release 1.1.3}
\textbf{Release 1.2.0}
\vskip 5em
\huge\textbf{Zoltán Vörös}
\end{flushright}

@ -4,26 +4,26 @@ Introduction
Enter ulab
----------
``ulab`` is a ``numpy``-like module for ``micropython``, meant to
simplify and speed up common mathematical operations on arrays. Our goal
was to implement a small subset of ``numpy`` that might be useful in the
context of a microcontroller. This means low-level data processing of
array data of up to four dimensions.
``ulab`` is a ``numpy``-like module for ``micropython`` and its
derivatives, meant to simplify and speed up common mathematical
operations on arrays. ``ulab`` implements a small subset of ``numpy``.
The functions were chosen such that they might be useful in the context
of a microcontroller. However, the project is a living one, and
suggestions for new functions are always welcome.
This document discusses how you can use the library, starting from
building your own firmware, through questions like what affects the
firmware size, what are the trade-offs, and what are the most important
differences to ``numpy``. The document is organised as follows:
The second chapter (the first after this one) helps you with firmware
customisation.
The chapter after this one helps you with firmware customisation.
The third chapter gives a very concise summary of the ``ulab`` functions
and array methods. This chapter can be used as a quick reference.
The fourth chapter is an in-depth review of most functions. Here you can
find usage examples, benchmarks, as well as a thorough discussion of
such concepts as broadcasting, and views versus copies.
The chapters after that are an in-depth review of most functions. Here
you can find usage examples, benchmarks, as well as a thorough
discussion of such concepts as broadcasting, and views versus copies.
The final chapter of this book can be regarded as the programming
manual. The inner working of ``ulab`` is dissected here, and you will
@ -43,21 +43,22 @@ catastrophic system failure, if these data are not processed in time,
because the microcontroller is supposed to interact with the outside
world in a timely fashion. In fact, this latter objective was the
initiator of this project: I needed the Fourier transform of a signal
coming from the ADC of the pyboard, and all available options were
coming from the ADC of the ``pyboard``, and all available options were
simply too slow.
In addition to speed, another issue that one has to keep in mind when
working with embedded systems is the amount of available RAM: I believe,
everything here could be implemented in pure python with relatively
little effort (in fact, there are a couple of python-only
everything here could be implemented in pure ``python`` with relatively
little effort (in fact, there are a couple of ``python``-only
implementations of ``numpy`` functions out there), but the price we
would have to pay for that is not only speed, but RAM, too. python code,
if is not frozen, and compiled into the firmware, has to be compiled at
runtime, which is not exactly a cheap process. On top of that, if
numbers are stored in a list or tuple, which would be the high-level
container, then they occupy 8 bytes, no matter, whether they are all
smaller than 100, or larger than one hundred million. This is obviously
a waste of resources in an environment, where resources are scarce.
would have to pay for that is not only speed, but RAM, too. ``python``
code, if it is not frozen and compiled into the firmware, has to be
compiled at runtime, which is not exactly a cheap process. On top of
that, if numbers are stored in a list or tuple, which would be the
high-level container, then they occupy 8 bytes each, no matter whether
they are all smaller than 100, or larger than one hundred million. This
is obviously a waste of resources in an environment where resources are
scarce.
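As a rough illustration of the storage argument, the following sketch
compares a list with a packed buffer of 16-bit integers. It is meant to
be run on desktop CPython with the standard ``array`` module, not on
``micropython`` with ``ulab``, so the exact numbers are only indicative.
All variable names are made up for this example.

```python
import sys
from array import array

# 1000 integers that would all fit into 16 bits
values = list(range(100, 1100))

# packed container: every entry occupies exactly itemsize bytes
packed = array('H', values)

# a list holds one reference per entry; the integer objects
# themselves are not even counted here
list_bytes_per_item = (sys.getsizeof(values) - sys.getsizeof([])) / len(values)

print('list, bytes per entry (references only): ', list_bytes_per_item)
print('packed array, bytes per entry: ', packed.itemsize)
```

The list pays for a full reference per entry regardless of the magnitude
of the number, while the packed container only needs the width of the
chosen type.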
Finally, there is a reason for using ``micropython`` in the first place.
Namely, that a microcontroller can be programmed in a very elegant, and
@ -71,7 +72,7 @@ are implemented in a way that
1. conforms to ``numpy`` as much as possible
2. is as frugal with RAM as possible,
3. and yet, fast. Much faster than pure python. Think of a speed-up of
3. and yet, fast. Much faster than pure python. Think of speed-ups of
30-50!
The main points of ``ulab`` are
@ -79,8 +80,8 @@ The main points of ``ulab`` are
- compact, iterable and sliceable containers of numerical data in one to
four dimensions. These containers support all the relevant unary and
binary operators (e.g., ``len``, ==, +, \*, etc.)
- vectorised computations on micropython iterables and numerical arrays
(in ``numpy``-speak, universal functions)
- vectorised computations on ``micropython`` iterables and numerical
arrays (in ``numpy``-speak, universal functions)
- computing statistical properties (mean, standard deviation etc.) on
arrays
- basic linear algebra routines (matrix inversion, multiplication,
@ -88,14 +89,17 @@ The main points of ``ulab`` are
decomposition and so on)
- polynomial fits to numerical data, and evaluation of polynomials
- fast Fourier transforms
- filtering of data (convolution and second-order filters)
- function minimisation, fitting, and numerical approximation routines
``ulab`` implements close to a hundred functions and array methods. At
the time of writing this manual (for version 1.0.0), the library adds
approximately 100 kB of extra compiled code to the micropython
approximately 100 kB of extra compiled code to the ``micropython``
(pyboard.v.11) firmware. However, if you are tight with flash space, you
can easily shave tens of kB off the firmware. See the section on
`customising ulab <#Custom_builds>`__.
can easily shave tens of kB off the firmware. In fact, if only a small
subset of functions is needed, you can get away with less than 10 kB
of flash space. See the section on `customising
ulab <#Custom_builds>`__.
Resources and legal matters
---------------------------
@ -406,8 +410,8 @@ number of data types. As an example, the innocent-looking expression
requires 25 loops in C, because the ``dtypes`` of both ``a``, and ``b``
can assume 5 different values, and the addition has to be resolved for
all possible cases. A hint: each binary operator costs between 3 and 4
kB in two dimensions.
all possible cases. Hint: each binary operator costs between 3 and 4 kB
in two dimensions.
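The count of 25 can be checked by enumerating the operand combinations.
The sketch below is plain ``python``. It assumes that the five values
are the ones that show up in the ``dir`` listing of the module
(``uint8``, ``int8``, ``uint16``, ``int16``, ``float``), and it only
counts pairs; it says nothing about how the C loops are actually laid
out.

```python
from itertools import product

# assumed set of dtypes, taken from the names exposed by the module
dtypes = ('uint8', 'int8', 'uint16', 'int16', 'float')

# every (left operand, right operand) combination of a binary operator
pairs = list(product(dtypes, repeat=2))

print('number of dtype combinations: ', len(pairs))
for left, right in pairs[:5]:
    print(left, '+', right)
```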
The ulab version string
-----------------------
@ -469,17 +473,22 @@ firmware is calling ``dir`` with ``ulab`` as its argument.
import ulab as np
print(dir(np))
print('class-level functions: \n', dir(np))
# since fft and linalg are sub-modules, print them separately
print(dir(np.fft))
print(dir(np.linalg))
print('\nfunctions included in the fft module: \n', dir(np.fft))
print('\nfunctions included in the linalg module: \n', dir(np.linalg))
.. parsed-literal::
['__class__', '__name__', 'bool', 'sort', 'sum', '__version__', 'acos', 'acosh', 'arange', 'arctan2', 'argmax', 'argmin', 'argsort', 'around', 'array', 'asin', 'asinh', 'atan', 'atanh', 'bisect', 'ceil', 'clip', 'concatenate', 'convolve', 'cos', 'cross', 'degrees', 'diff', 'e', 'equal', 'erf', 'erfc', 'exp', 'expm1', 'eye', 'fft', 'flip', 'float', 'floor', 'fmin', 'full', 'gamma', 'get_printoptions', 'int16', 'int8', 'interp', 'lgamma', 'linalg', 'linspace', 'log', 'log10', 'log2', 'logspace', 'max', 'maximum', 'mean', 'min', 'minimum', 'ndinfo', 'newton', 'not_equal', 'ones', 'pi', 'polyfit', 'polyval', 'radians', 'roll', 'set_printoptions', 'sin', 'sinh', 'sosfilt', 'sqrt', 'std', 'tan', 'tanh', 'trapz', 'uint16', 'uint8', 'vectorize', 'zeros']
['__class__', '__name__', 'fft', 'ifft', 'spectrogram']
['__class__', '__name__', 'cholesky', 'det', 'dot', 'eig', 'inv', 'norm', 'size', 'trace']
class-level functions:
['__class__', '__name__', 'bool', 'sort', 'sum', '__version__', 'acos', 'acosh', 'arange', 'arctan2', 'argmax', 'argmin', 'argsort', 'around', 'array', 'asin', 'asinh', 'atan', 'atanh', 'bisect', 'ceil', 'clip', 'concatenate', 'convolve', 'cos', 'cosh', 'cross', 'degrees', 'diff', 'e', 'equal', 'erf', 'erfc', 'exp', 'expm1', 'eye', 'fft', 'flip', 'float', 'floor', 'fmin', 'full', 'gamma', 'get_printoptions', 'int16', 'int8', 'interp', 'lgamma', 'linalg', 'linspace', 'log', 'log10', 'log2', 'logspace', 'max', 'maximum', 'mean', 'min', 'minimum', 'ndinfo', 'newton', 'not_equal', 'ones', 'pi', 'polyfit', 'polyval', 'radians', 'roll', 'set_printoptions', 'sin', 'sinh', 'sosfilt', 'sqrt', 'std', 'tan', 'tanh', 'trapz', 'uint16', 'uint8', 'user', 'vectorize', 'zeros']
functions included in the fft module:
['__class__', '__name__', 'fft', 'ifft', 'spectrogram']
functions included in the linalg module:
['__class__', '__name__', 'cholesky', 'det', 'dot', 'eig', 'inv', 'norm', 'size', 'trace']

@ -740,6 +740,96 @@ entries of the source array are *copied* into the target array.
.dtype
~~~~~~
``numpy``:
https://docs.scipy.org/doc/numpy/reference/generated/numpy.ndarray.dtype.html
The ``.dtype`` property returns the ``dtype`` of an array. This can then
be used for initialising another array with the matching type. ``ulab``
implements two versions of ``dtype``: one that is ``numpy``-like, i.e.,
one that returns a ``dtype`` object, and one that is significantly
cheaper in terms of flash space, but does not define the ``dtype``
object, and returns a single character (number) instead.
**WARNING:** in ``circuitpython``, ``dtype`` is accessed as a property:
.. code::
# code to be run in micropython
import ulab as np
a = np.array([1, 2, 3, 4], dtype=np.int8)
b = np.array([5, 6, 7], dtype=a.dtype)
print('a: ', a)
print('dtype of a: ', a.dtype)
print('\nb: ', b)
.. parsed-literal::
a: array([1, 2, 3, 4], dtype=int8)
dtype of a: dtype('int8')
b: array([5, 6, 7], dtype=int8)
**WARNING:** in ``micropython``, ``dtype`` has to be called as a method:
.. code::
# code to be run in micropython
import ulab as np
a = np.array([1, 2, 3, 4], dtype=np.int8)
b = np.array([5, 6, 7], dtype=a.dtype())
print('a: ', a)
print('dtype of a: ', a.dtype())
print('\nb: ', b)
.. parsed-literal::
a: array([1, 2, 3, 4], dtype=int8)
dtype of a: dtype('int8')
b: array([5, 6, 7], dtype=int8)
If the ``ulab.h`` header file sets the pre-processor constant
``ULAB_HAS_DTYPE_OBJECT`` to 0, then the output of the previous snippet
will be
.. code::
# code to be run in micropython
import ulab as np
a = np.array([1, 2, 3, 4], dtype=np.int8)
b = np.array([5, 6, 7], dtype=a.dtype())
print('a: ', a)
print('dtype of a: ', a.dtype())
print('\nb: ', b)
.. parsed-literal::
a: array([1, 2, 3, 4], dtype=int8)
dtype of a: 98
b: array([5, 6, 7], dtype=int8)
Here 98 is nothing but the ASCII value of the character ``b``, which is
the type code for signed 8-bit integers.
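The relation between the number and the type code can be verified on
any ``python`` interpreter. The sketch below uses the ``array`` module
of the standard library, which happens to use the same ``b`` code for
signed 8-bit integers. It does not depend on ``ulab``, and the variable
names are illustrative only.

```python
from array import array

type_code = 98           # the number printed in the snippet above
letter = chr(type_code)  # convert the ASCII value back to a character
print('type code as a character: ', letter)

# the standard library uses the same letter for signed chars
signed_bytes = array(letter, [1, 2, 3, 4])
print('bytes per entry: ', signed_bytes.itemsize)
```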
.flatten
~~~~~~~~

@ -475,6 +475,45 @@ and ``append`` keywords that can be found in ``numpy``.
median
------
``numpy``:
https://docs.scipy.org/doc/numpy/reference/generated/numpy.median.html
The function computes the median along the specified axis, and returns
the median of the array elements. If the ``axis`` keyword argument is
``None``, the array is flattened first. The ``dtype`` of the result is
always float.
.. code::
# code to be run in micropython
import ulab as np
a = np.array(range(12), dtype=np.int8).reshape((3, 4))
print('a:\n', a)
print('\nmedian of the flattened array: ', np.median(a))
print('\nmedian along the vertical axis: ', np.median(a, axis=0))
print('\nmedian along the horizontal axis: ', np.median(a, axis=1))
.. parsed-literal::
a:
array([[0, 1, 2, 3],
[4, 5, 6, 7],
[8, 9, 10, 11]], dtype=int8)
median of the flattened array: 5.5
median along the vertical axis: array([4.0, 5.0, 6.0, 7.0], dtype=float)
median along the horizontal axis: array([1.5, 5.5, 9.5], dtype=float)
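For cross-checking the numbers above without ``ulab``, the same three
medians can be computed on nested lists with the ``statistics`` module
of the standard library. This is only a reference sketch for desktop
``python``; the helper names are made up, and it is not how ``ulab``
computes the median internally.

```python
from statistics import median

# the same 3x4 matrix as in the ulab example, as nested lists
a = [[0, 1, 2, 3],
     [4, 5, 6, 7],
     [8, 9, 10, 11]]

def median_flat(matrix):
    # axis=None: treat all entries as a single sequence
    return float(median(v for row in matrix for v in row))

def median_axis0(matrix):
    # axis=0: one median per column
    return [float(median(col)) for col in zip(*matrix)]

def median_axis1(matrix):
    # axis=1: one median per row
    return [float(median(row)) for row in matrix]

print(median_flat(a))   # 5.5
print(median_axis0(a))  # [4.0, 5.0, 6.0, 7.0]
print(median_axis1(a))  # [1.5, 5.5, 9.5]
```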
sort
----

@ -3,12 +3,11 @@ Programming ulab
Earlier we have seen how ``ulab``'s functions and methods can be
accessed in ``micropython``. This last section of the book explains how
these functions are implemented. This should serve at least two
purposes. First, it should become clear, what the trade-offs are, and
that would allow the user to optimise the code in ``python``.
Second, by the end of the section, one should be able to extend
``ulab``, and write their own functions.
these functions are implemented. By the end of this chapter, not only
will you be able to extend ``ulab`` and write your own
``numpy``-compatible functions, but, through a deeper understanding of
the inner workings of the functions, you will also be able to see what
the trade-offs are at the ``python`` level.
Code organisation
-----------------
@ -33,17 +32,17 @@ General comments
``ndarrays`` are efficient containers of numerical data of the same type
(i.e., signed/unsigned chars, signed/unsigned integers or
``mp_float_t``\ s, which, depending on the platform, are either C
``float``\ s, of C ``double``\ s). Beyond storing the actual data, the
type definition has eight additional members (on top of the ``base``
type). Namely, ``dense``, which tells us, whether the array is dense or
sparse (more on this later), the ``dtype``, which tells us, how the
bytes are to be interpreted. Moreover, the ``itemsize``, which stores
the size of a single entry in the array, ``boolean``, an unsigned
integer, which determines, whether the arrays is to be treated as a set
of Booleans, or as numerical data, ``ndim``, the number of dimensions
(``uint8_t``), ``len``, the length of the array, the shape
(``*size_t``), the strides (``*size_t``). The length is the product of
the numbers in ``shape``.
``float``\ s, or C ``double``\ s). Beyond storing the actual data in the
void pointer ``*array``, the type definition has eight additional
members (on top of the ``base`` type): ``dense``, which tells us whether
the array is dense or sparse (more on this later); ``dtype``, which
tells us how the bytes are to be interpreted; ``itemsize``, which stores
the size of a single entry in the array; ``boolean``, an unsigned
integer, which determines whether the array is to be treated as a set of
Booleans, or as numerical data; ``ndim``, the number of dimensions
(``uint8_t``); ``len``, the length of the array; ``shape``
(``*size_t``); and ``strides`` (``*int32_t``). The length is simply the
product of the numbers in ``shape``.
The type definition is as follows:
@ -108,7 +107,10 @@ reasons that will become clear later, the numbers in ``shape`` and in
i.e., if the number of possible dimensions is ``ULAB_MAX_DIMS``, then
``shape[ULAB_MAX_DIMS-1]`` is the length of the last axis,
``shape[ULAB_MAX_DIMS-2]`` is the length of the last but one axis, and
so on.
so on. If the number of actual dimensions is smaller than the maximum,
i.e., if ``ndim < ULAB_MAX_DIMS``, the first ``ULAB_MAX_DIMS - ndim``
entries in ``shape`` and ``strides`` will be equal to zero, but they
could, in fact, be assigned any value, because these will never be
accessed in an operation.
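The right-aligned layout can be sketched in a few lines of ``python``.
The snippet below is a model of the bookkeeping only, not the C
implementation: it assumes ``ULAB_MAX_DIMS = 4``, a dense row-major
array, and strides measured in bytes. The helper names
``right_aligned`` and ``byte_offset`` are hypothetical.

```python
ULAB_MAX_DIMS = 4  # assumed maximum number of dimensions

def right_aligned(shape, itemsize):
    """Pad shape with leading zeros and derive dense byte strides."""
    pad = ULAB_MAX_DIMS - len(shape)
    full_shape = [0] * pad + list(shape)
    strides = [0] * ULAB_MAX_DIMS
    step = itemsize
    # walk from the last axis backwards; unused leading axes stay zero
    for axis in range(ULAB_MAX_DIMS - 1, pad - 1, -1):
        strides[axis] = step
        step *= full_shape[axis]
    return full_shape, strides

def byte_offset(index, strides):
    """Offset of an element; only the last len(index) strides are used."""
    pad = ULAB_MAX_DIMS - len(index)
    return sum(i * s for i, s in zip(index, strides[pad:]))

shape, strides = right_aligned((3, 4), itemsize=1)
print('shape:   ', shape)    # [0, 0, 3, 4]
print('strides: ', strides)  # [0, 0, 4, 1]
print('offset of element (2, 1): ', byte_offset((2, 1), strides))  # 9
```

In this model, the leading zeros never enter the offset calculation,
which is why their actual value is irrelevant.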
With this definition of the strides, the linear combination in
:math:`P(n_1, n_2, ..., n_{k-1}, n_k)` is a one-to-one mapping from the

@ -54,6 +54,11 @@
Return the mean element of the 1D array, as a number if axis is None, otherwise as an array.
.. function:: median(array: ulab.array, *, axis: int = -1) -> ulab.array
Find the median value in an array along the given axis, or along all axes if axis is None.
.. function:: min(array: _ArrayLike, *, axis: Optional[int] = None) -> float
Return the minimum element of the 1D array

@ -14,11 +14,11 @@
},
{
"cell_type": "code",
"execution_count": 10,
"execution_count": 12,
"metadata": {
"ExecuteTime": {
"end_time": "2020-11-02T21:59:25.428757Z",
"start_time": "2020-11-02T21:59:25.412601Z"
"end_time": "2020-11-03T19:56:04.701289Z",
"start_time": "2020-11-03T19:56:04.691215Z"
}
},
"outputs": [
@ -61,7 +61,7 @@
"author = 'Zoltán Vörös'\n",
"\n",
"# The full version, including alpha/beta/rc tags\n",
"release = '1.1.3'\n",
"release = '1.2.0'\n",
"\n",
"\n",
"# -- General configuration ---------------------------------------------------\n",
@ -92,7 +92,7 @@
"\\Huge\\textbf{The $\\mu$lab book}\n",
"\\vskip 0.5em\n",
"\\LARGE\n",
"\\textbf{Release 1.1.3}\n",
"\\textbf{Release 1.2.0}\n",
"\\vskip 5em\n",
"\\huge\\textbf{Zoltán Vörös}\n",
"\\end{flushright}\n",
@ -255,11 +255,11 @@
},
{
"cell_type": "code",
"execution_count": 8,
"execution_count": 13,
"metadata": {
"ExecuteTime": {
"end_time": "2020-11-02T20:38:00.507715Z",
"start_time": "2020-11-02T20:38:00.427292Z"
"end_time": "2020-11-03T19:56:16.427813Z",
"start_time": "2020-11-03T19:56:16.398143Z"
}
},
"outputs": [],
@ -293,11 +293,11 @@
},
{
"cell_type": "code",
"execution_count": 9,
"execution_count": 14,
"metadata": {
"ExecuteTime": {
"end_time": "2020-11-02T20:38:44.508163Z",
"start_time": "2020-11-02T20:38:36.180603Z"
"end_time": "2020-11-03T19:56:24.888593Z",
"start_time": "2020-11-03T19:56:17.294538Z"
}
},
"outputs": [],

@ -5,8 +5,8 @@
"execution_count": 1,
"metadata": {
"ExecuteTime": {
"end_time": "2020-10-17T18:29:35.382599Z",
"start_time": "2020-10-17T18:29:35.180512Z"
"end_time": "2020-11-03T19:50:50.150235Z",
"start_time": "2020-11-03T19:50:48.888079Z"
}
},
"outputs": [
@ -34,8 +34,8 @@
"execution_count": 2,
"metadata": {
"ExecuteTime": {
"end_time": "2020-10-17T18:29:38.729010Z",
"start_time": "2020-10-17T18:29:38.714016Z"
"end_time": "2020-11-03T19:50:51.340719Z",
"start_time": "2020-11-03T19:50:51.330015Z"
}
},
"outputs": [],
@ -52,8 +52,8 @@
"execution_count": 3,
"metadata": {
"ExecuteTime": {
"end_time": "2020-10-17T18:29:40.051454Z",
"start_time": "2020-10-17T18:29:40.012878Z"
"end_time": "2020-11-03T19:50:52.899529Z",
"start_time": "2020-11-03T19:50:52.837604Z"
}
},
"outputs": [],
@ -815,6 +815,58 @@
"print('\\nfirst derivative, second axis:\\n', numerical.diff(c, axis=1))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## median\n",
"\n",
"`numpy`: https://docs.scipy.org/doc/numpy/reference/generated/numpy.median.html\n",
"\n",
"The function computes the median along the specified axis, and returns the median of the array elements. If the `axis` keyword argument is `None`, the array is flattened first. The `dtype` of the result is always float."
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {
"ExecuteTime": {
"end_time": "2020-11-03T19:54:38.047790Z",
"start_time": "2020-11-03T19:54:38.029264Z"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"a:\n",
" array([[0, 1, 2, 3],\n",
" [4, 5, 6, 7],\n",
" [8, 9, 10, 11]], dtype=int8)\n",
"\n",
"median of the flattened array: 5.5\n",
"\n",
"median along the vertical axis: array([4.0, 5.0, 6.0, 7.0], dtype=float)\n",
"\n",
"median along the horizontal axis: array([1.5, 5.5, 9.5], dtype=float)\n",
"\n",
"\n"
]
}
],
"source": [
"%%micropython -unix 1\n",
"\n",
"import ulab as np\n",
"\n",
"a = np.array(range(12), dtype=np.int8).reshape((3, 4))\n",
"print('a:\\n', a)\n",
"print('\\nmedian of the flattened array: ', np.median(a))\n",
"print('\\nmedian along the vertical axis: ', np.median(a, axis=0))\n",
"print('\\nmedian along the horizontal axis: ', np.median(a, axis=1))"
]
},
{
"cell_type": "markdown",
"metadata": {},