This is a guide to porting reference ops from Lite to Micro. It explains, step-by-step, the recommended code changes and the process for submitting them for review and acceptance. The process results in multiple pull requests, or PRs. Multiple, small PRs are easier for the project to review and merge.
The Micro Contributing Guidelines are prerequisite reading. They cover general code health, maintainability, style, and submission, as well as how to set up a development environment. This guide contains step-by-step instructions for the specific task of porting reference ops from Lite to Micro.
- Porting Reference Ops from Lite to Micro
  - General Guidelines
  - Notes
  - Frequently Asked Questions
    - Can I use malloc/free or new/delete in my operator code?
    - Can I use static variable allocation in my operator code?
    - How do I allocate persistent memory?
    - When am I allowed to allocate persistent memory?
    - How do I allocate/use temporary memory?
    - When can I allocate/use temporary memory?
    - Can I resize my input/output tensors?
    - Can I change the shape of tensors in my operator code?
    - When can I change the shape of tensors in my operator code?
    - Can I modify a TfLiteTensor or TfLiteEvalTensor?
Begin by searching the tflite-micro GitHub repository for issues containing the name of the op under consideration to ensure someone isn't already working on a port.
Open a GitHub issue to announce your intent to port the op, and to begin a record of your work. Document the entire process of porting the op in this issue. Link constituent PRs to this issue. See the article Providing Context for background on documenting your work via bug reports.
Now we begin changing, testing, and submitting code. This step will result in the first pull request, PR1.
- Extract the code for parsing op parameters out of the switch statement in
  `ParseOpDataTfLite()` in `lite/core/api/flatbuffer_conversions.cc` into a
  standalone function, and call that function from the switch statement. This
  standalone function is now available to be called by the Micro op resolver,
  which also needs to parse the op parameters, in a future change. A simple
  example is PR #45307, and a more complicated example is PR #46021.

- Use `clang-format` to make sure the code is properly formatted.

  ```shell
  clang-format --style=google -i $(git ls-files -m | grep -E '\.cc|\.h')
  ```

- Make sure your code is lint-free.

  ```shell
  cpplint.py $(git ls-files -m)
  ```

- Create a single commit containing the change. Observe the guidelines for good
  commit log messages found in the article Providing Context. A good example is
  commit 0664214.

- Since this change modifies the op's implementation in Lite, test the change
  with the relevant Lite unit tests.

  ```shell
  bazel test tensorflow/lite/kernels:all
  ```

- Create and submit the PR. Write a good PR description, and be sure to link to
  the GitHub issue created to document the port. A good example is PR #45307.
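The extraction in the first step above can be sketched as follows. This is a simplified, hypothetical illustration: the stand-in types (`BuiltinOperator`, `TfLiteLeakyReluParams`, `TfLiteStatus`) and the function signatures are reduced far below what `flatbuffer_conversions.cc` actually uses, which also involves an `Operator*`, an `ErrorReporter*`, and a `BuiltinDataAllocator*`.

```cpp
// Minimal stand-ins for the real TFLite types; illustration only.
enum BuiltinOperator { kBuiltinOperator_LEAKY_RELU, kBuiltinOperator_OTHER };
enum TfLiteStatus { kTfLiteOk, kTfLiteError };
struct TfLiteLeakyReluParams {
  float alpha;
};

// Before the refactor, this parsing logic lived directly inside the switch
// statement of ParseOpDataTfLite(). Extracting it into a named, standalone
// function makes it callable from the Micro op resolver as well.
TfLiteStatus ParseLeakyRelu(float alpha_from_flatbuffer,
                            TfLiteLeakyReluParams* out) {
  out->alpha = alpha_from_flatbuffer;
  return kTfLiteOk;
}

// The switch statement now simply delegates to the standalone function.
TfLiteStatus ParseOpData(BuiltinOperator op, float alpha,
                         TfLiteLeakyReluParams* out) {
  switch (op) {
    case kBuiltinOperator_LEAKY_RELU:
      return ParseLeakyRelu(alpha, out);
    default:
      return kTfLiteError;
  }
}
```

The observable behavior of `ParseOpDataTfLite()` is unchanged; only the parsing body moves into a reusable function.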
Move the reference implementation of the op in reference_ops.h to a standalone header so that Micro can include it without including unrelated dependencies via reference_ops.h.
A good example is PR #45311.
- Copy an existing header from `tensorflow/lite/kernels/internal/reference/` to
  `tensorflow/lite/kernels/internal/reference/NEW_OP.H` to create the
  boilerplate. Replace `NEW_OP.H` with the name of the new operator.

- Move the implementation from
  `tensorflow/lite/kernels/internal/reference/reference_ops.h` to
  `tensorflow/lite/kernels/internal/reference/NEW_OP.H`.

- Add the new header to the build by adding it to the library definitions
  `reference_base` and `legacy_reference_base` in the file
  `tensorflow/lite/kernels/internal/BUILD`. See, for example, this change for
  operator FILL.

- Use `clang-format` to make sure the code is properly formatted.

  ```shell
  clang-format --style=google -i $(git ls-files -m | grep -E '\.cc|\.h')
  ```

  Do not clang-format existing code in `BUILD` or `reference_ops.h`.

- Make sure your code is lint-free.

  ```shell
  cpplint.py $(git ls-files -m)
  ```

  Do not modify code in `BUILD` or `reference_ops.h` to satisfy `cpplint.py`.

- Create a single commit containing the change. Observe the guidelines for good
  commit log messages found in the article Providing Context. A good example is
  commit 92f459e.

- Since this change modifies the op's implementation in Lite, test the change
  with the relevant Lite unit tests.

  ```shell
  bazel test tensorflow/lite/kernels:all
  ```

- Create and submit the PR. Write a good PR description, and be sure to link to
  the GitHub issue created to document the port. A good example is PR #45311.
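For reference, the BUILD change described above amounts to adding the new header to the `hdrs` list of both targets. The sketch below abbreviates the real targets in `tensorflow/lite/kernels/internal/BUILD`, and `new_op.h` is a placeholder name:

```
cc_library(
    name = "reference_base",
    hdrs = [
        # ... existing reference headers ...
        "reference/new_op.h",  # the newly created standalone header
    ],
    # ... other attributes unchanged ...
)

cc_library(
    name = "legacy_reference_base",
    hdrs = [
        # ... existing reference headers ...
        "reference/new_op.h",
    ],
    # ... other attributes unchanged ...
)
```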
- Copy the kernel and test from Lite to Micro.

  In the first commit of this PR, copy the kernel and test from Lite to Micro
  without making any modifications and without adding them to the build. A
  good example is commit a2ca1fd.

  This copy action is in its own commit in order to create readable, reviewable
  diffs when modifications are made in later commits. If the files were copied
  and modified in one step, the modifications would not appear as a diff of the
  Lite version. Instead, the files would simply appear at the destination path
  in their final form.
- Remove Lite-specific code from the copies.

  In the second commit of this PR, remove the bulk of Lite-specific code from
  the files copied to Micro in the previous step. A good example is commit
  a5a87b4.

  This bulk delete is in its own commit for reasons similar to those given in
  the step above: to produce a more readable, reviewable diff in this step and
  in the next. Because the files are not yet added to the build, they need not
  (and obviously won't) compile or function. What to delete now, as opposed to
  in the next commit, is somewhat subjective, but delete in order to:

  - Flatten the namespace down to `tflite`.
  - Stop resizing output tensors.
  - Remove input and output types other than `int8` and `float32`.
  - Stop using gmock and gtest.
  - Etc.
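The namespace flattening called for above can be illustrated with a minimal sketch; the helper function names here are invented placeholders for real kernel code:

```cpp
// As copied from Lite, kernel code is nested under several namespaces:
namespace tflite {
namespace ops {
namespace builtin {
inline int LitePlaceholder() { return 42; }  // placeholder for kernel code
}  // namespace builtin
}  // namespace ops
}  // namespace tflite

// In the Micro port, the same code is flattened to the single `tflite`
// namespace:
namespace tflite {
inline int MicroPlaceholder() { return 42; }  // placeholder for kernel code
}  // namespace tflite
```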
- Port the op and the test.

  Make the necessary changes to the Micro kernel, header, and test to make the
  op implementation suitable for Micro. Include these in the build.

  This step requires the most creativity, and may receive the most feedback
  during review. Maintain good atomicity in your commits. Considering its
  scope, this step will consist of more than one commit. A good example is the
  changes made in PR #45647.
- Use `clang-format` to make sure the code is properly formatted.

  ```shell
  clang-format --style=google -i $(git ls-files -m | grep -E '\.cc|\.h')
  ```

  Do not clang-format existing code in `BUILD` or `reference_ops.h`.

- Make sure the code is lint-free.

  ```shell
  cpplint.py $(git ls-files -m)
  ```

  Do not modify code in `BUILD` or `reference_ops.h` to satisfy `cpplint.py`.

- Make sure the port passes all applicable tests.

  ```shell
  bazel test tensorflow/lite/micro/kernels:${op}_test
  bazel test tensorflow/lite/micro/kernels:all
  make -f tensorflow/lite/micro/tools/make/Makefile test_kernel_${op}_test
  make -f tensorflow/lite/micro/tools/make/Makefile test
  ```

  See the general Micro Contributing Guidelines for other testing ideas,
  including the use of address sanitizers.
- Create and submit the PR. Write a good PR description, and be sure to link to
  the GitHub issue created to document the port. A good example is PR #45647.
Check each commit against the pre-submit checklist in the Micro Contributing
Guidelines. Specifically, make sure your code:

- Is formatted with clang-format.
- Passes a lint check.
- Passes all unit tests.

  ```shell
  make -s -j8 -f tensorflow/lite/micro/tools/make/Makefile test
  ```

CI runs these checks on all PRs, and will hold up your PR if any of these
checks fail.
## Notes

- To the extent possible, maintain a 1:1 correspondence between Micro and Lite
  versions of unit tests. Avoid cleanup of merely stylistic issues, e.g.,
  replacing the hardcoded literal `3.40282e+038` with
  `std::numeric_limits<float>::max()`. Any changes between the Micro and Lite
  versions of a test put a burden on future maintainers to figure out whether
  the differences are actually significant or just stylistic.

- There was discussion of commits vs. PRs in #45387.
## Frequently Asked Questions

### Can I use malloc/free or new/delete in my operator code?

No. All memory allocation in TensorFlow Lite Micro (TFLM) is done using C++
stack-based automatic allocation, or through specialized TFLM persistent and
temporary allocation methods.
### Can I use static variable allocation in my operator code?

No. This is due to the call ordering of C++ static constructors being
platform/compiler dependent.
### How do I allocate persistent memory?

Use `TfLiteContext::AllocatePersistentBuffer` to allocate persistent memory.
Memory allocated by this method will remain valid throughout the lifetime of
the `tflite::MicroInterpreter` instance.

An example code snippet looks like this (from leaky_relu.cc):

```c++
void* LeakyReluInit(TfLiteContext* context, const char* buffer, size_t length) {
  TFLITE_DCHECK(context->AllocatePersistentBuffer != nullptr);
  return context->AllocatePersistentBuffer(context, sizeof(LeakyReluOpData));
}
```
### When am I allowed to allocate persistent memory?

The `TfLiteContext::AllocatePersistentBuffer` method may only be called within
the scope of your operator's `Init` and `Prepare` methods.
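Putting `Init` and `Prepare` together, the typical persistent-allocation pattern looks like the following sketch. The types here are minimal stand-ins (the real `TfLiteContext` and `TfLiteNode` are far richer), the contents of `OpData` are hypothetical, and `FakeAllocatePersistentBuffer` merely stands in for the interpreter's arena allocator:

```cpp
#include <cstddef>

// Minimal stand-ins for the real TFLM types; illustration only.
struct TfLiteContext {
  void* (*AllocatePersistentBuffer)(TfLiteContext* context, size_t bytes);
};
struct TfLiteNode {
  void* user_data;
};

// Hypothetical per-op state computed once in Prepare and reused in Invoke.
struct OpData {
  float scale;
};

// Init: allocate OpData from the persistent arena. The interpreter stores the
// returned pointer and hands it back as node->user_data in Prepare and Invoke.
void* OpInit(TfLiteContext* context, const char* /*buffer*/, size_t /*length*/) {
  return context->AllocatePersistentBuffer(context, sizeof(OpData));
}

// Prepare: fill in the persistent state; it remains valid for the lifetime of
// the interpreter, so Invoke can read it on every inference.
void OpPrepare(TfLiteNode* node) {
  static_cast<OpData*>(node->user_data)->scale = 0.25f;
}

// Invoke: read the state prepared earlier.
float OpInvoke(TfLiteNode* node, float input) {
  return input * static_cast<OpData*>(node->user_data)->scale;
}

// A tiny fake allocator standing in for the interpreter's arena.
inline void* FakeAllocatePersistentBuffer(TfLiteContext* /*context*/,
                                          size_t /*bytes*/) {
  static OpData storage;
  return &storage;
}
```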
### How do I allocate/use temporary memory?

Use the `TfLiteContext::RequestScratchBufferInArena` and
`TfLiteContext::GetScratchBuffer` methods. The temporary memory is shared
between all operators, and is only valid for your operator within the scope of
your operator's `Invoke` method. Do not attempt to use temporary memory to
share data between operator invocations. Temporary memory is to be used only
as pre-allocated storage during the execution scope of your operator's
`Invoke` method.
An example code snippet looks like this (from add_n.cc):

```c++
if (output->type == kTfLiteFloat32) {
  // Allocate scratch buffer space for pointer to each tensor's data
  // and store the scratch buffer index in the node's user_data
  int scratch_index;
  size_t scratch_size = sizeof(float*) * num_inputs;
  TF_LITE_ENSURE_OK(context, context->RequestScratchBufferInArena(
                                 context, scratch_size, &scratch_index));
  node->user_data =
      reinterpret_cast<decltype(node->user_data)>(scratch_index);
}
```
And to use the buffer:

```c++
int scratch_index =
    static_cast<int>(reinterpret_cast<intptr_t>(node->user_data));
void* scratch_buffer = context->GetScratchBuffer(context, scratch_index);
```
### When can I allocate/use temporary memory?

The `TfLiteContext::RequestScratchBufferInArena` method is available only
within the scope of your operator's `Prepare` method. The
`TfLiteContext::GetScratchBuffer` method is available only within the scope of
your operator's `Invoke` method.
### Can I resize my input/output tensors?

No. The storage space for each input/output tensor is a fixed, calculated
value determined at the time the TensorFlow Lite (TfLite) model converter is
executed. During the `Init` phase of the `tflite::MicroInterpreter`, all
tensor storage is allocated by the `tflite::MicroInterpreter` instance, using
the values calculated by the model converter.

For more information see: Memory Allocation Overview
### Can I change the shape of tensors in my operator code?

Yes. The new shape must not exceed the storage space indicated by the old
shape. Because tensor shape values may live in memory that is not directly
writable (e.g. Flash, EEPROM, ROM), a special method must be called before
modification is attempted. The
`tflite::micro::CreateWritableTensorDimsWithCopy` method will move the tensor
shape values to guaranteed persistent writable memory.

An example code snippet looks like this (from l2_pool_2d.cc):

```c++
// the output variable is a TfLiteTensor*
TfLiteEvalTensor* output_eval =
    tflite::micro::GetEvalOutput(context, node, kOutputTensor);
TF_LITE_ENSURE_OK(context, tflite::micro::CreateWritableTensorDimsWithCopy(
                               context, output, output_eval));
output->dims->data[kBatchRank] = batches;
output->dims->data[kHeightRank] = out_height;
output->dims->data[kWidthRank] = out_width;
output->dims->data[kChannelRank] = channels_out;
```
### When can I change the shape of tensors in my operator code?

Tensor shape values can be modified any time after the
`tflite::micro::CreateWritableTensorDimsWithCopy` method has been called.
This means that tensor shape values can be modified within the scope of your
operator's `Prepare` or `Invoke` methods. The
`tflite::micro::CreateWritableTensorDimsWithCopy` method itself may only be
called within the scope of your operator's `Prepare` method.
### Can I modify a TfLiteTensor or TfLiteEvalTensor?

No. The `tflite::MicroInterpreter` is the owner and manipulator of these data
structures. Your code should not modify them. The only directly allowed
modifications of tensors are to change their data values or their shape
values.