The world's first data terrarium now exists
Remember that post about Data Terrariums?
The world’s first data terrarium went live in a public park in Healdsburg, California, in June 2024.
Background: Principles of a Data Terrarium #
- The data terrarium is a balanced, self-contained data ecosystem. Unlike most edge computing devices, it is not tied to the cloud or even the internet.
- The only inputs and outputs are solar energy and data flow over WiFi.
This version is air-cooled, so it also produces a small amount of direct heat. The environment buffers this easily - we ran the system just fine outdoors at about 85 °F (about 29 °C). Future versions will use circular economy principles to reduce waste heat.
Why? #
Why imagine and build something like this? We wanted to prove that, at small scale, useful and powerful compute can run on nothing but the sun. We wanted to demonstrate that systems like this can be built in everyday environments, from off-the-shelf parts, for a reasonable amount of money. And we wanted to prove that data processing capabilities can be affordable, accessible, and Earth-friendly by building our idea in physical form.
Think this system isn’t useful? It can:
- Analyze a whole human genome at 30x coverage in about an hour.
- Run Llama 3 8B locally, producing haikus, answering questions, and even providing copilot-style coding assistance. We made some awesome EFC-themed poems with Llama 3 8B.
- Run retrieval-augmented generation (RAG) on texts stored locally (limited only by SSD capacity; hundreds of small PDFs fit easily).
- Generate images with Stable Diffusion.
- Run RStudio, VS Code, Python, C++, etc. This is a full-fledged AI workstation in a shopping cart, powered by the sun.
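For a rough sense of scale behind the genome claim in the list above, the arithmetic can be sketched as follows. The genome size of about 3.1 billion bases is an assumption added here (the post does not state one), and the function names are purely illustrative. This is an order-of-magnitude estimate of data volume, not a benchmark of the actual pipeline.

```python
# Back-of-the-envelope data volume for a 30x human genome.
# ASSUMPTION (not from the post): a haploid human genome is ~3.1 billion bases.
GENOME_SIZE_BASES = 3.1e9
COVERAGE = 30            # "30x" from the post
ANALYSIS_SECONDS = 3600  # "about an hour" from the post


def total_bases(genome_size: float, coverage: float) -> float:
    """Total sequenced bases needed to cover the genome `coverage` times."""
    return genome_size * coverage


def bases_per_second(genome_size: float, coverage: float, seconds: float) -> float:
    """Average bases processed per second to finish within `seconds`."""
    return total_bases(genome_size, coverage) / seconds


total = total_bases(GENOME_SIZE_BASES, COVERAGE)
rate = bases_per_second(GENOME_SIZE_BASES, COVERAGE, ANALYSIS_SECONDS)
print(f"Total bases: {total / 1e9:.1f} billion")
print(f"Average rate: {rate / 1e6:.1f} million bases per second")
```

Under that assumption, the workstation is chewing through roughly 93 billion bases in the hour, or about 26 million bases per second on average.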
Imagine with us the possible applications:
- Remote hospitals now have RAG capabilities for their local health records, even when the power goes out.
- Remote genome sequencing and onsite analysis are possible anywhere the shopping cart can go. The system can be used to sequence pathogens causing outbreaks, sequence plants for biodiversity cataloguing, and provide genomic medicine to rural communities.
- Indigenous Sovereign Nations with such a system have direct ownership of their data, including its means of production.
- Educators have a powerful system for educating and mentoring students from elementary to doctoral levels about Big Data and Artificial Intelligence, anywhere.
What we accomplished #
We built a powerful Data Processing Center entirely on battery and solar #
We charged the battery from a wall outlet the night before. In retrospect, we should have charged it from solar the day before, and could easily have done so had we planned ahead.
We built the entire compute system on a table in the hotel lobby. We then connected all the pieces - compute system, solar panels, battery, router, sequencer - and arranged them in the shopping cart.
We installed the operating system completely offline in a public park #
We turned the PC on (running on battery power) and walked to the park. We’d preloaded Ubuntu 22.04 on a flash drive. We set up the solar panels, checked our power usage (200 watts in, 100 watts out) and started the installation process.
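The power readings above, combined with the 1056 Wh battery capacity from the specifications, allow a simple energy budget. The sketch below is an idealized estimate: it assumes constant power in and out and ignores conversion and charging losses, and the function names are illustrative rather than part of any tool we used.

```python
# Idealized energy budget for the cart during the install.
# Figures from the post: 200 W solar in, 100 W load out, 1056 Wh battery.
# ASSUMPTIONS: constant power levels, no conversion or charging losses.
BATTERY_WH = 1056.0
SOLAR_IN_W = 200.0
LOAD_OUT_W = 100.0


def net_power_w(solar_in_w: float, load_out_w: float) -> float:
    """Power left over to charge the battery (negative means draining)."""
    return solar_in_w - load_out_w


def hours_to_full(capacity_wh: float, charge_fraction: float, net_w: float) -> float:
    """Hours to fill the battery from `charge_fraction` at a net surplus of `net_w`."""
    if net_w <= 0:
        return float("inf")  # no surplus, so the battery never fills
    return capacity_wh * (1.0 - charge_fraction) / net_w


def hours_of_runtime(capacity_wh: float, charge_fraction: float, load_w: float) -> float:
    """Hours the load can run on battery alone, with no solar input."""
    return capacity_wh * charge_fraction / load_w


surplus = net_power_w(SOLAR_IN_W, LOAD_OUT_W)
print(f"Net surplus: {surplus:.0f} W")
print(f"Half-full to full: {hours_to_full(BATTERY_WH, 0.5, surplus):.2f} h")
print(f"Full battery, no sun: {hours_of_runtime(BATTERY_WH, 1.0, LOAD_OUT_W):.2f} h")
```

With these numbers the panels cover the load with about 100 W to spare, and a full battery alone would carry a 100 W load for roughly ten and a half hours. Real figures will be lower once losses and heavier GPU workloads are included.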
We installed Ubuntu and even retrieved an update using public WiFi.
We answered questions from people in the park about what we were doing. Kids touched our solar panels. Parents asked us to point out the components of our data processing center.
We presented the system at the California CARES grant competition #
We then loaded the system up and drove it from Healdsburg to San Diego, where we unpacked it and pushed it around the UCSD campus until the students presented their grant pitch. We let the judges check it out and see it running on solar outside the UCSD Design Building. Next time, we’d love to take the train (EFC Europe 2025?).
We donated the system (minus GPU) to the Indigenous Futures Institute #
We left the system (including cart, workstation, battery, and all the solar panels) to the Indigenous Futures Institute, where it will be heavily utilized to push the frontier of Sovereign AI guided by Indigenous Data Principles.
Unfortunately, as we’re self-funded and have a lot of projects, the GPU had to come back with us, but we have plans to get more (and we are happy to accept help).
Specifications #
The final specs were:
- AMD Ryzen 9 7900 12-core CPU (65W TDP)
- 1x NVIDIA RTX 5000 Ada GPU
- 96GB Crucial DDR5 RAM
- 1x 2TB Crucial P5 SSD (for boot + binaries)
- 1x 4TB Crucial P3 Plus SSD (for larger data)
- Noctua NH-L12S Ghost S1 Edition Cooler
- MSI MPG B650i Edge WiFi mini-ITX motherboard
- Fractal Terra mini-ITX case
- 1056 Wh battery (Anker Solix C1000)
- 300W folding solar panel
- 1 Shopping Cart (Granger brand)
- Oxford Nanopore MinION Sequencer
- Mini travel WiFi router
What’s next #
We’ve already done a lot with this system - this GPU has stories to tell. We’ll post more here on the blog over time.