
ATtiny Machine Learning: is it really a thing?

A few days ago, I tweeted about Truly TinyML™: Machine Learning that can run even on an ATtiny85.

Not surprisingly, user @whaleygeek replied:

Attiny85 only has 512 bytes of RAM, will the models run in that?

Let's prove my assertion was true with an actual example.

RAM constraints of the ATtiny

David raises a question worth investigating: the RAM constraints of the ATtiny chip.

RAM is indeed the most limiting factor on many boards, since it dictates how much data you can hold in memory at any given time.

Until recently, most microcontrollers had RAM ranging from 2 KB (Arduino UNO) to 8 KB (Arduino MEGA). Many other microcontrollers were in the same ballpark.

Nowadays we have seen a huge leap forward, and boards come equipped with up to 512 KB of RAM. Nevertheless, for many neural network models this is still a constraint.
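If you want to see for yourself how scarce RAM is on a given board, a classic AVR snippet measures free RAM at runtime as the distance between the top of the heap and the current stack position. A minimal sketch, relying on avr-libc's heap symbols (__heap_start and __brkval):

int freeRam() {
    // free RAM = gap between the end of the heap and the stack;
    // &v points at the current top of the stack
    extern int __heap_start, *__brkval;
    int v;
    return (int) &v - (__brkval == 0 ? (int) &__heap_start : (int) __brkval);
}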

Luckily for us, not all Machine Learning models require RAM to work properly!


FLASH-based Machine Learning

If you think of Machine Learning in terms of neural networks, then it's correct to think that the ATtiny85 will never be able to run any model.

If you think of Machine Learning in terms of logistic regression, then it's probably correct to think so.

If you think of Machine Learning in terms of tree-based models, then you're wrong.

One of the most rewarding Machine Learning models I like to work with is the Random Forest. It is an ensemble of many decision trees, trained so as to avoid overfitting and achieve greater overall accuracy. It works remarkably well in many scenarios, with many kinds of data, and it does not require major feature pre-processing. If you feel bold, you can even switch to the more performant Extreme Gradient Boosting, which is still a tree-based model, yet (usually) achieves even better accuracy.

Being tree-based means we can implement the model using plain if-else constructs, like the following.

// x[2] = petal length (cm), x[3] = petal width (cm)
// classes: 0 = setosa, 1 = versicolor, 2 = virginica
if (x[3] <= 0.800000011920929) {
    return 0;
} else {
    if (x[3] <= 1.699999988079071) {
        if (x[2] <= 4.900000095367432) {
            return 1;
        } else {
            return 2;
        }
    } else {
        return 2;
    }
}

This is an actual tree able to classify the Iris dataset. A Random Forest classifier is an ensemble of many such trees.
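To make the "ensemble" part concrete, here is a distilled sketch of how a forest of three such trees could vote: each tree returns a class, and the class with the most votes wins. The real generated code, shown later, weights each vote by the tree's leaf score, but the pattern is identical. The tree functions and thresholds below are made up for illustration:

// Hypothetical stand-ins for three trained trees: each returns
// a class index (0 = setosa, 1 = versicolor, 2 = virginica)
uint8_t tree0(float *x) { return x[3] <= 0.8f ? 0 : (x[3] <= 1.7f ? 1 : 2); }
uint8_t tree1(float *x) { return x[2] <= 2.5f ? 0 : (x[2] <= 4.9f ? 1 : 2); }
uint8_t tree2(float *x) { return x[3] <= 0.7f ? 0 : (x[2] <= 5.0f ? 1 : 2); }

// Majority vote across the trees: just a tiny counter array
// on the stack, no weights or buffers to keep around
uint8_t predictForest(float *x) {
    uint8_t votes[3] = { 0 };

    votes[tree0(x)]++;
    votes[tree1(x)]++;
    votes[tree2(x)]++;

    // argmax of votes
    uint8_t best = 0;
    for (uint8_t i = 1; i < 3; i++) {
        if (votes[i] > votes[best]) {
            best = i;
        }
    }

    return best;
}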

As you can see, we don't need to store anything in memory (apart from the x input vector and a few bytes of stack for the votes), so RAM won't be a limiting factor.

FLASH will.

But FLASH is usually cheaper than RAM, and boards typically have 4 to 10 times more FLASH than RAM; the ATtiny85 itself pairs 8 KB of FLASH with 512 bytes of RAM.

ATtiny Machine Learning project: MNIST digits classification

With all that said, let's try to create a Random Forest model that runs on the ATtiny85. Our task is to classify the UCI handwritten digits dataset (the MNIST-like, 8x8 dataset that ships with scikit-learn).

[Figure: sample 8x8 digits from the dataset]

The Python code used to train the model could come straight from a scikit-learn demo, considering how few lines we need.

from everywhereml.sklearn.ensemble import RandomForestClassifier
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split

X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3)

clf = RandomForestClassifier(n_estimators=7, max_leaf_nodes=20)
clf.fit(X_train, y_train)

print('Score: %.2f' % clf.score(X_test, y_test))
# >>> Score: 0.89

We can't get too fancy with the complexity of the model, so we limit the number of estimators and leaf nodes to the lowest acceptable level. With n_estimators=7 and max_leaf_nodes=20, each tree has at most 19 internal splits, so the whole forest compiles down to no more than 7 x 19 = 133 if-else branches in FLASH.

You didn't expect to reach 100% accuracy, did you?

Now that we have a reasonably accurate model, it's time to port it to the ATtiny. It's a one-liner.

print(clf.to_cpp_file('RandomForest.h'))
#ifndef UUID6033221040
#define UUID6033221040

/**
* RandomForestClassifier(base_estimator=DecisionTreeClassifier(), bootstrap=True, ccp_alpha=0.0, class_name=RandomForestClassifier, class_weight=None, criterion=gini, estimator_params=('criterion', 'max_depth', 'min_samples_split', 'min_samples_leaf', 'min_weight_fraction_leaf', 'max_features', 'max_leaf_nodes', 'min_impurity_decrease', 'random_state', 'ccp_alpha'), max_depth=None, max_features=auto, max_leaf_nodes=20, max_samples=None, min_impurity_decrease=0.0, min_samples_leaf=1, min_samples_split=2, min_weight_fraction_leaf=0.0, n_estimators=7, n_jobs=None, num_outputs=10, oob_score=False, package_name=everywhereml.sklearn.ensemble, random_state=None, template_folder=everywhereml/sklearn/ensemble, verbose=0, warm_start=False)
*/
class RandomForestClassifier {
public:

    /**
     * Predict class from features
     */
    int predict(float *x) {
        int predictedValue = 0;
        size_t startedAt = micros();

        uint16_t votes[10] = { 0 };
        uint8_t classIdx = 0;
        float classScore = 0;

        tree0(x, &classIdx, &classScore);
        votes[classIdx] += classScore;

        tree1(x, &classIdx, &classScore);
        votes[classIdx] += classScore;

        tree2(x, &classIdx, &classScore);
        votes[classIdx] += classScore;

        tree3(x, &classIdx, &classScore);
        votes[classIdx] += classScore;

        tree4(x, &classIdx, &classScore);
        votes[classIdx] += classScore;

        tree5(x, &classIdx, &classScore);
        votes[classIdx] += classScore;

        tree6(x, &classIdx, &classScore);
        votes[classIdx] += classScore;

        // return argmax of votes
        uint8_t maxClassIdx = 0;
        float maxVote = votes[0];

        for (uint8_t i = 1; i < 10; i++) {
            if (votes[i] > maxVote) {
                maxClassIdx = i;
                maxVote = votes[i];
            }
        }

        predictedValue = maxClassIdx;
        latency = micros() - startedAt;

        return (lastPrediction = predictedValue);
    }

    /**
     * Get latency in micros
     */
    uint32_t latencyInMicros() {
        return latency;
    }

    /**
     * Get latency in millis
     */
    uint16_t latencyInMillis() {
        return latency / 1000;
    }

protected:
    float latency = 0;
    int lastPrediction = 0;

    /**
     * Random forest's tree #0
     */
    void tree0(float *x, uint8_t *classIdx, float *classScore) {
        if (x[60] <= 2.5) {
            if (x[49] <= 0.5) {
                *classIdx = 7;
                *classScore = 141.0;
                return;
            }
            else {
                *classIdx = 5;
                *classScore = 116.0;
                return;
            }
        }
        else {
            if (x[28] <= 0.5) {
                if (x[36] <= 0.5) {
                    *classIdx = 0;
                    *classScore = 135.0;
                    return;
                }
                else {
                    if (x[61] <= 9.5) {
                        *classIdx = 4;
                        *classScore = 138.0;
                        return;
                    }
                    else {
                        *classIdx = 6;
                        *classScore = 122.0;
                        return;
                    }
                }

... redacted for brevity ...

Copy the generated code and add it to your Arduino sketch.

Since the ATtiny85 does not have a hardware Serial port, we will blink an LED when the classifier predicts the correct digit.

Due to RAM constraints, we cannot store a sample for each class, so we'll limit ourselves to two. If you are able to feed data to the ATtiny from an external source (for example via I2C, as sketched below), you can get rid of these samples entirely.
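As a hypothetical taste of that I2C route, the receiving side could look something like this. It assumes the TinyWireS library (rambo/TinyWire) for I2C slave mode on the ATtiny85, so treat the function names as assumptions to verify against that library's own examples:

// Hypothetical I2C receiver, assuming the TinyWireS library
// (rambo/TinyWire); check the names against its examples
#include <TinyWireS.h>
#include "RandomForest.h"

#define I2C_ADDR 0x4A  // arbitrary slave address

RandomForestClassifier clf;
float digit[64];
volatile uint8_t pixelIdx = 0;

// accumulate incoming pixels one byte at a time
// (the master should send the 64 pixels in small chunks,
// since the slave's receive buffer is tiny)
void onReceive(uint8_t howMany) {
    while (TinyWireS.available() && pixelIdx < 64)
        digit[pixelIdx++] = TinyWireS.receive();
}

void setup() {
    pinMode(1, OUTPUT);
    TinyWireS.begin(I2C_ADDR);
    TinyWireS.onReceive(onReceive);
}

void loop() {
    TinyWireS_stop_check();  // housekeeping the library's examples call in loop()

    if (pixelIdx == 64) {
        pixelIdx = 0;
        // e.g. light the LED when the ATtiny sees a "0"
        digitalWrite(1, clf.predict(digit) == 0);
    }
}

With that aside, here is the full project sketch.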

#include "Classifier.h"

// samples from the MNIST dataset
// store as uint8_t to save RAM
uint8_t zero[] = {0, 0, 5, 13, 9, 1, 0, 0, 0, 0, 13, 15, 10, 15, 5, 0, 0, 3, 15, 2, 0, 11, 8, 0, 0, 4, 12, 0, 0, 8, 8, 0, 0, 5, 8, 0, 0, 9, 8, 0, 0, 4, 11, 0, 1, 12, 7, 0, 0, 2, 14, 5, 10, 12, 0, 0, 0, 0, 6, 13, 10, 0, 0, 0};
uint8_t one[] = {0, 0, 0, 12, 13, 5, 0, 0, 0, 0, 0, 11, 16, 9, 0, 0, 0, 0, 3, 15, 16, 6, 0, 0, 0, 7, 15, 16, 16, 2, 0, 0, 0, 0, 1, 16, 16, 3, 0, 0, 0, 0, 1, 16, 16, 6, 0, 0, 0, 0, 1, 16, 16, 6, 0, 0, 0, 0, 0, 11, 16, 10, 0, 0};
// uint8_t two[] = {0, 0, 0, 4, 15, 12, 0, 0, 0, 0, 3, 16, 15, 14, 0, 0, 0, 0, 8, 13, 8, 16, 0, 0, 0, 0, 1, 6, 15, 11, 0, 0, 0, 1, 8, 13, 15, 1, 0, 0, 0, 9, 16, 16, 5, 0, 0, 0, 0, 3, 13, 16, 16, 11, 5, 0, 0, 0, 0, 3, 11, 16, 9, 0};
// uint8_t three[] = {0, 0, 7, 15, 13, 1, 0, 0, 0, 8, 13, 6, 15, 4, 0, 0, 0, 2, 1, 13, 13, 0, 0, 0, 0, 0, 2, 15, 11, 1, 0, 0, 0, 0, 0, 1, 12, 12, 1, 0, 0, 0, 0, 0, 1, 10, 8, 0, 0, 0, 8, 4, 5, 14, 9, 0, 0, 0, 7, 13, 13, 9, 0, 0};
// uint8_t four[] = {0, 0, 0, 1, 11, 0, 0, 0, 0, 0, 0, 7, 8, 0, 0, 0, 0, 0, 1, 13, 6, 2, 2, 0, 0, 0, 7, 15, 0, 9, 8, 0, 0, 5, 16, 10, 0, 16, 6, 0, 0, 4, 15, 16, 13, 16, 1, 0, 0, 0, 0, 3, 15, 10, 0, 0, 0, 0, 0, 2, 16, 4, 0, 0};
// uint8_t five[] = {0, 0, 12, 10, 0, 0, 0, 0, 0, 0, 14, 16, 16, 14, 0, 0, 0, 0, 13, 16, 15, 10, 1, 0, 0, 0, 11, 16, 16, 7, 0, 0, 0, 0, 0, 4, 7, 16, 7, 0, 0, 0, 0, 0, 4, 16, 9, 0, 0, 0, 5, 4, 12, 16, 4, 0, 0, 0, 9, 16, 16, 10, 0, 0};
// uint8_t six[] = {0, 0, 0, 12, 13, 0, 0, 0, 0, 0, 5, 16, 8, 0, 0, 0, 0, 0, 13, 16, 3, 0, 0, 0, 0, 0, 14, 13, 0, 0, 0, 0, 0, 0, 15, 12, 7, 2, 0, 0, 0, 0, 13, 16, 13, 16, 3, 0, 0, 0, 7, 16, 11, 15, 8, 0, 0, 0, 1, 9, 15, 11, 3, 0};
// uint8_t seven[] = {0, 0, 7, 8, 13, 16, 15, 1, 0, 0, 7, 7, 4, 11, 12, 0, 0, 0, 0, 0, 8, 13, 1, 0, 0, 4, 8, 8, 15, 15, 6, 0, 0, 2, 11, 15, 15, 4, 0, 0, 0, 0, 0, 16, 5, 0, 0, 0, 0, 0, 9, 15, 1, 0, 0, 0, 0, 0, 13, 5, 0, 0, 0, 0};
// uint8_t eight[] = {0, 0, 9, 14, 8, 1, 0, 0, 0, 0, 12, 14, 14, 12, 0, 0, 0, 0, 9, 10, 0, 15, 4, 0, 0, 0, 3, 16, 12, 14, 2, 0, 0, 0, 4, 16, 16, 2, 0, 0, 0, 3, 16, 8, 10, 13, 2, 0, 0, 1, 15, 1, 3, 16, 8, 0, 0, 0, 11, 16, 15, 11, 1, 0};
// uint8_t nine[] = {0, 0, 11, 12, 0, 0, 0, 0, 0, 2, 16, 16, 16, 13, 0, 0, 0, 3, 16, 12, 10, 14, 0, 0, 0, 1, 16, 1, 12, 15, 0, 0, 0, 0, 13, 16, 9, 15, 2, 0, 0, 0, 0, 3, 0, 9, 11, 0, 0, 0, 0, 0, 9, 15, 4, 0, 0, 0, 9, 12, 13, 3, 0, 0};

// use this buffer to run the classification
// we could use the uint8_t arrays directly, but let's pretend our dataset
// has floating point data (as in most use cases)
float digit[64];

uint8_t LED = 1;
RandomForestClassifier clf;


void setup() {
    pinMode(LED, OUTPUT);
}


void loop() {
    // choose a sample randomly
    // then blink led if predict value matches expected one
    uint8_t *currentDigit;
    uint8_t currentClass;

    if (random(0, 10) > 5) {
        currentDigit = zero;
        currentClass = 0;
    }
    else {
        currentDigit = one;
        currentClass = 1;
    }

    // copy uint8_t array into float array
    for (int i = 0; i < 64; i++)
        digit[i] = currentDigit[i];

    digitalWrite(LED, clf.predict(digit) == currentClass);
}

Does it compile?

Yes.

How much resources does it take?

7206 bytes of FLASH (94%) and 403 bytes of RAM (78%).

If you consider that 380 bytes of RAM are required to store the zero, one and digit variables, the math is easy: 403 - 380 = 23 bytes of RAM for the Random Forest classifier itself.
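If you want some of that RAM back, one option on AVR cores is moving the two reference samples into FLASH with PROGMEM, which would free roughly 128 of those bytes. A minimal sketch, assuming the standard avr/pgmspace.h API:

#include <avr/pgmspace.h>

// the "zero" sample now lives in FLASH instead of RAM
const uint8_t zero[] PROGMEM = {0, 0, 5, 13, 9, 1, 0, 0, 0, 0, 13, 15, 10, 15, 5, 0, 0, 3, 15, 2, 0, 11, 8, 0, 0, 4, 12, 0, 0, 8, 8, 0, 0, 5, 8, 0, 0, 9, 8, 0, 0, 4, 11, 0, 1, 12, 7, 0, 0, 2, 14, 5, 10, 12, 0, 0, 0, 0, 6, 13, 10, 0, 0, 0};

float digit[64];

// copy a FLASH-resident sample into the float buffer;
// pgm_read_byte fetches each byte from FLASH at runtime
void loadSample(const uint8_t *sample) {
    for (uint8_t i = 0; i < 64; i++)
        digit[i] = pgm_read_byte(&sample[i]);
}

The 256-byte digit buffer still has to live in RAM, since the classifier expects a float array it can index, but the raw samples no longer count against the budget.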

So, to answer fellow user @whaleygeek:

Attiny85 only has 512 bytes of RAM, will the models run in that?

The answer is

You bet!


