.. the previous arrangement led to compiler diagnostics when
building with 'scan-build make':
include/tvm/tvm_tokens.h:17:20: error: ‘tvm_register_map’ defined but not used [-Werror=unused-variable]
static const char *tvm_register_map[] = {
^~~~~~~~~~~~~~~~
include/tvm/tvm_tokens.h:7:20: error: ‘tvm_opcode_map’ defined but not used [-Werror=unused-variable]
static const char *tvm_opcode_map[] = {
^~~~~~~~~~~~~~
cc1: all warnings being treated as errors
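One common way to silence this class of warning is to keep only declarations in the header and move the definitions into a single translation unit. A minimal sketch (the array contents and the tvm_tokens.c file name are illustrative assumptions, not the project's actual layout):

```c
/* include/tvm/tvm_tokens.h would keep only declarations, so each
 * including file no longer gets its own unused static copy: */
extern const char *tvm_opcode_map[];
extern const char *tvm_register_map[];

/* A single source file (e.g. tvm_tokens.c) holds the definitions.
 * Entries here are illustrative placeholders: */
const char *tvm_opcode_map[] = { "nop", "int", "mov", 0 };
const char *tvm_register_map[] = { "eax", "ebx", "ecx", 0 };
```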
During preprocessing, we find directives, and if they specify including
other source files, we remove the directive, allocate a new block of
memory, copy the first block, append the new source file, and append the
second block.
Previously, the value of the source pointer was passed into the
preprocessing function. When the memory would be (re)allocated, the
pointer value could change, but the original pointer would not be
updated.
Now, the pointer is passed by address, and updated when the memory it
points to changes.
This commit refactors the project using a single, consistent coding
style, derived from the Linux Kernel Coding Style, available here:
https://www.kernel.org/doc/Documentation/CodingStyle
This includes, but is not limited to:
* Removal of typedefs, especially for structs
* Limiting lines to a reasonable length (mostly 80 characters)
* K&R style braces
* Removal of CamelCase
The htab functions find_str and add_str have been renamed to use _ref where
they previously used _str. They are intended to manage an htab containing
references to any data, string or not.
The htab_find function has been divided up. htab_find_core now handles actually
finding the correct node. htab_find and htab_find_ref are just outward facing
functions for retrieving a specific kind of data from the node, depending on
what the htab is used for.
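The split can be sketched as follows (field names and return conventions are assumptions, not tinyvm's exact definitions):

```c
#include <string.h>
#include <stddef.h>

struct htab_node {
	char *key;
	int value;	/* e.g. a token index */
	void *ref;	/* e.g. a pointer to associated data */
	struct htab_node *next;
};

/* htab_find_core handles actually finding the correct node. */
static struct htab_node *htab_find_core(struct htab_node *bucket,
					const char *key)
{
	for (; bucket; bucket = bucket->next)
		if (strcmp(bucket->key, key) == 0)
			return bucket;
	return NULL;
}

/* Outward-facing wrappers retrieve a specific kind of data from the
 * node, depending on what the htab is used for. */
static int htab_find(struct htab_node *bucket, const char *key)
{
	struct htab_node *n = htab_find_core(bucket, key);
	return n ? n->value : -1;
}

static void *htab_find_ref(struct htab_node *bucket, const char *key)
{
	struct htab_node *n = htab_find_core(bucket, key);
	return n ? n->ref : NULL;
}
```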
The htab structure has been modified in two ways, which should not break
compatibility with any of its uses.
It includes a void pointer, which is in this commit used to point to a
string for defines, and will in the future be used to point to the address
space for words, bytes, and double words.
It now includes a function htab_add_str specifically for storing strings.
It calls htab_add so as not to be redundant, but makes the node's value
its index for the lexer to fetch using htab_find, and assigns its
void pointer.
The lexer will now use htab_find on all tokens to see if they are a define
string, and if so, substitute them with the appropriate token.
The defines htab is destroyed after lexing, because that memory is no
longer needed.
Before allocating space for a token, the lexer will first check to see
if that token is a defined name. If it is, it will allocate space for
the defined string instead.
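The check-before-allocate step can be sketched like this; lookup_define is a hypothetical stand-in for consulting the defines htab:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical lookup standing in for htab_find_ref on the defines
 * htab; the fixed table here is purely illustrative. */
static const char *lookup_define(const char *tok)
{
	if (strcmp(tok, "LIMIT") == 0)
		return "100";
	return NULL;
}

/* Before allocating space for a token, check whether it is a defined
 * name; if so, allocate space for the defined string instead. */
static char *lex_token(const char *tok)
{
	const char *def = lookup_define(tok);
	const char *src = def ? def : tok;
	char *out = malloc(strlen(src) + 1);

	strcpy(out, src);
	return out;
}
```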
A test file is included in programs/tinyvm/preprocessor to demonstrate
the behavior. When functioning, the program will print the Fibonacci
sequence.
This commit adds behavior to the preprocessor which fills a tree with
defines and their replacements. In future commits, the parser will
substitute instances of the defines with their replacements in the
source code.
The tvm_tree structure should optionally be able to keep track of values
associated with the strings by which its nodes are sorted. In the case of
defines, this is the replacement string. In the case of variables, this
will be a pointer to the variable's location in memory.
Searching should return the value, or NULL.
To opt out of storing a value, pass NULL and 0 as the val and len arguments.
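The described interface can be sketched as below; the field and function names are assumptions for illustration, not tinyvm's exact definitions:

```c
#include <stdlib.h>
#include <string.h>

struct tvm_tree_node {
	char *str;
	void *val;	/* NULL when no value was stored */
	struct tvm_tree_node *left, *right;
};

/* Pass NULL and 0 as val and len to opt out of storing a value. */
static void tvm_tree_set_val(struct tvm_tree_node *n,
			     const void *val, size_t len)
{
	if (!val || !len) {
		n->val = NULL;
		return;
	}
	n->val = malloc(len);
	memcpy(n->val, val, len);
}

/* Searching returns the stored value, or NULL. */
static void *tvm_tree_search(struct tvm_tree_node *root, const char *str)
{
	while (root) {
		int cmp = strcmp(str, root->str);

		if (cmp == 0)
			return root->val;
		root = cmp < 0 ? root->left : root->right;
	}
	return NULL;
}
```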
The tvm_tree structure is a binary search tree. It will be used to hold
preprocessor defines, and variable names for when defining bytes, words,
and double words is implemented.
Each node structure and its own string are stored contiguously (in that
order) so the frees are easier to keep track of, and memory doesn't need to
be a concern when adding a string to the tree.
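A minimal sketch of that allocation pattern, assuming a hypothetical node shape: one malloc holds the node followed by its string, so a single free releases both.

```c
#include <stdlib.h>
#include <string.h>

struct tnode {
	struct tnode *left, *right;
	char *str;	/* points just past the node itself */
};

/* One allocation for node + string; free(n) releases both. */
static struct tnode *tnode_new(const char *str)
{
	struct tnode *n = malloc(sizeof(*n) + strlen(str) + 1);

	n->left = n->right = NULL;
	n->str = (char *)(n + 1);	/* string lives after the node */
	strcpy(n->str, str);
	return n;
}
```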
Based on suggestion from bitbckt:
I saw this in my feed, and feel it merits comment. I hope you
don't mind the input.
You'll want to monitor the load factor of the hash table and re-
hash the table on insert when it is exceeded. Otherwise, key
lookup will degrade toward linear time for sets of keys with a
high number of collisions.
The easiest way to implement the load factor is to maintain a
count of allocated nodes in tvm_htab_t and divide that by the
bucket count to obtain the load factor. Of course, you'd need the
bucket count (HTAB_SIZE) to be dynamic, too.
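The suggested bookkeeping can be sketched as follows; the struct fields and the 0.75 threshold are illustrative assumptions:

```c
#include <stddef.h>

/* Track allocated nodes and a dynamic bucket count (replacing the
 * fixed HTAB_SIZE) so the load factor can be computed on insert. */
struct htab {
	int count;		/* allocated nodes */
	int num_buckets;	/* dynamic bucket count */
};

/* Nonzero when an insert should trigger a rehash into a larger bucket
 * array; compares count/num_buckets against 0.75 without floats. */
static int htab_needs_rehash(const struct htab *t)
{
	return t->count * 4 >= t->num_buckets * 3;
}
```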