【Community Conversation May 2024】Part 2 AI, Containerization, ZVM - ZimaOS Technical Framework

Last week, our CEO Lauren and CTO Tiger had an engaging fireside chat with community developer Axel. Throughout the conversation, we uncovered the thinking that drives the ZimaOS team, from their approach to hardware and software design to their careful selection of foundational technologies, and discussed the progress of AI development within our products. We explored a wide range of topics, including immutable architecture, AI integration, containerization, virtualization, and the app store ecosystem, and gained valuable insights from Axel's firsthand experience with Zima hardware.

We divided this conversation into four distinct themes. In this episode, we'll delve into AI, containerization, and ZVM.

  • Choices of AI Tools
  • UI Building
  • Something about LLM and Stable Diffusion
  • Containerization Architecture
  • Latest Feature: Virtualization

The video version is at the end of this post.

Axel
Of course, what I've seen in the videos and the early beta version is, for instance, the feature around AI. And my question would be: what I've seen is basically two directions within your AI bet. One is to actually analyze or locate your own files on the ZimaCube, and the other one is the implementation of Stable Diffusion you showed. Let me ask here from a UI perspective.

So first of all, I would like to know what kind of tool you are using to go through your private files. I know, for instance, PrivateGPT, which could do something like that, among others. And then the second question would be: what is the UI that you built on top, and what are you using there? Because at least I think I have not seen it so far, but I'm not sure if I'm right.

Tiger
We started paying attention to this whole generative AI space from the first day GPT came out, and we started investing our team's resources in learning how to build a local AI, around the time Llama, the first open-source model from Meta (previously Facebook), was released.

And the Chat. I think we gave it a new name; it's called Zima Assist now, but previously it was just Chat. We built that, I wouldn't say from scratch, but from something very close to the root. We're not using some pre-existing solution and just packaging or deploying it. The model we use is simply Llama 2 (Llama 3 was released recently). By nature, local compute resources are constrained versus whatever is running in the Nvidia data centers behind Azure, behind ChatGPT. The original model simply doesn't fit into the GPU in the ZimaCube. So we specifically looked at quantized models. In other words, think of watching a movie that was originally in 4K, and you have a great time enjoying it. But say you're on a trip and watching a 1080p version on your phone: you can still get the story, right? You lose some fidelity, some details, but you can still understand what the story is about. A quantized model is the same idea: the same model, simply at lower precision. That works pretty well on the ZimaCube. And we use another framework called LangChain. I'm not sure if you've heard of that.
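Tiger's 4K-versus-1080p analogy can be made concrete. The toy sketch below (an illustration of the idea only, not the actual quantization scheme or format ZimaOS uses) rounds float weights to 8-bit integers with a shared scale, then recovers them with a small, bounded error:

```python
# Toy illustration of weight quantization: map float weights to int8
# codes plus one scale factor, then dequantize and measure the error.
# This mimics the idea behind quantized LLMs (same model, lower
# precision), not any specific real-world format.

def quantize_int8(weights):
    """Quantize a list of floats to int8 codes plus a scale factor."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs else 1.0
    codes = [round(w / scale) for w in weights]
    return codes, scale

def dequantize(codes, scale):
    """Recover approximate float weights from the int8 codes."""
    return [c * scale for c in codes]

weights = [0.12, -0.98, 0.53, 0.004, -0.27]
q, scale = quantize_int8(weights)
approx = dequantize(q, scale)

# Each weight now fits in one byte instead of four, at the cost of a
# rounding error bounded by scale / 2.
max_err = max(abs(a - w) for a, w in zip(approx, weights))
print(q)
print(max_err)
```

Storing one byte per weight instead of four is exactly why a quantized Llama fits on the ZimaCube's GPU where the full-precision model does not.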
Axel
Super popular.

Tiger
I don't think there's anything even comparable to LangChain. If you know of something, let me know. We started learning LangChain at an early stage, and our Chat app is actually built using the model plus the LangChain framework. It simply handles the orchestration: your prompt engineering, talking to the model, templating, all of that. The UI is built from scratch. We were considering a UI framework called Gradio, I'm not sure if you've heard of that; it's a framework basically for building UIs. But due to the needs of the product, the needs from the product manager, we switched over to building a UI from scratch. So that's the entire stack of our Chat. We're not using some pre-established solution; we built it from the bottom up.
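The orchestration Tiger describes, templating a prompt before handing it to the model, can be sketched in a few lines. This is a minimal plain-Python stand-in for what LangChain-style prompt templating does; the template text and variable names are illustrative, not ZimaOS's actual prompts:

```python
# Minimal stand-in for LangChain-style prompt templating: a template
# with named slots, filled in before the text goes to the local model.
# Template wording and slot names are illustrative only.

PROMPT_TEMPLATE = (
    "You are a helpful assistant running locally on a NAS.\n"
    "Use the following file snippets to answer.\n\n"
    "Context:\n{context}\n\n"
    "Question: {question}\n"
    "Answer:"
)

def format_prompt(context: str, question: str) -> str:
    """Fill the template's slots, as a prompt framework would."""
    return PROMPT_TEMPLATE.format(context=context, question=question)

prompt = format_prompt(
    context="invoice_2024.pdf: total due 120 EUR",
    question="How much is the invoice?",
)
print(prompt)
```

In a real LangChain pipeline, the filled prompt would then be passed to a chain wrapping the local Llama model; here we only show the templating step.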

Axel
And as you also talked about the learning journey there, I think it's quite plausible, because I went through the same process, in particular when LangChain got a lot of traction, figuring out how to implement LLMs. What I saw lately, and am using myself because it's also integrated into something like Bluefin, is Ollama. And Ollama, of course, gives us the chance to run not only Llama 2, Llama 3 and so on, but also dozens of other LLMs locally as well. So did you have a look at that one too?

Tiger
We looked at Ollama. The fact is, I really like Ollama. It's actually on our to-do list; Lauren knows about it. I was talking to the team about switching over, rewriting our entire Chat based on Ollama.

It's essentially that at the time we started to build Chat, Ollama didn't exist. If I had known it would exist, we'd probably have gone for Ollama. I think Ollama handles model downloading in a super convenient way, and then it provides an API. I'd really love to switch our current Chat infrastructure over to Ollama, but unfortunately we started before Ollama existed. So of course we had to go with our first version. There's always a next version. Yes, this is on our radar.
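For context on the API Tiger mentions: Ollama runs a local HTTP daemon (port 11434 by default) that both downloads models and serves generation requests. Below is a hedged sketch that only builds a request for its `/api/generate` endpoint without sending it, since actually running it would require an Ollama daemon:

```python
# Sketch of a request to Ollama's local /api/generate endpoint.
# Assumes the default daemon address; the request is built but not
# sent, so this snippet runs without Ollama installed.
import json
from urllib import request

OLLAMA_URL = "http://localhost:11434/api/generate"

def build_generate_request(model: str, prompt: str) -> request.Request:
    """Build (but do not send) a generation request for Ollama."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_generate_request("llama2", "Summarize my tax folder.")
print(req.full_url)
# With a daemon running, you would send it with:
#   resp = request.urlopen(req)
#   json.load(resp)["response"]
```

Having one stable local endpoint for many models is the convenience that would let a Chat rewrite swap models without changing the surrounding code.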

Axel
Yeah. Don't get me wrong. I mean, in the end, it's software development, right? And as I said, I went through the same journey, suddenly embracing LangChain, and everything that wasn't possible before was suddenly more or less seamless, or easier to solve. Thank you for sharing that. And since you mentioned Gradio, it might be the same with Stable Diffusion. I used to use a UI on top of Stable Diffusion called AUTOMATIC1111, and later on InvokeAI. And AUTOMATIC1111, especially, uses Gradio as well. But I suppose you built a UI yourself there too, and with Gradio, as far as I understood?

Tiger
If you're talking about Stable Diffusion, we actually use the AUTOMATIC1111 UI. I don't know if our engineers have altered it a little bit. To be honest, I never used the original web UI of Stable Diffusion; I've only seen it in our own web interface. I focus more on the functionality; I didn't pay attention to the details. But I think it is based on the AUTOMATIC1111 UI. And you mentioned InvokeAI. I think it's newer. I've looked at it only once; it's super promising. That's another topic, actually. I think very likely we will go into the app store and containerization concepts. We're actually looking at running all these AI-based apps, including InvokeAI, within containers so that we can easily put these AI apps on the shelf in the CasaOS App Store. We're actually looking at that. But I think we can get into that later.
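CasaOS App Store entries are defined as Docker Compose files, so a containerized Stable Diffusion app would slot into that same mechanism. A hypothetical sketch of what such an entry could look like; the image name, port, and volume paths are placeholders, not an actual store manifest:

```yaml
# Hypothetical Compose sketch for a containerized Stable Diffusion
# app. Image name, port, and paths are illustrative placeholders.
services:
  stable-diffusion:
    image: example/stable-diffusion-webui:latest  # placeholder image
    ports:
      - "7860:7860"          # Gradio web UI's default port
    volumes:
      - /DATA/AppData/sd/models:/models  # persist downloaded models
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia  # expose the GPU to the container
              count: 1
              capabilities: [gpu]
    restart: unless-stopped
```

Packaging the app this way is what would let it appear "on the shelf" next to every other one-click CasaOS app.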

Axel
That sounds fantastic. I mean, as you said, InvokeAI is newer, but I think it's also way simpler than AUTOMATIC1111 from a user-interface perspective. So that sounds really fantastic. Which brings me basically to containerization, and I had a bunch of questions in both regards. Of course, I knew from CasaOS about the magic you have in place with your app store, with all the contributors from the open source community when it comes to Dockerfiles or Docker Compose, in that case. But my question is rather: of course, there's not only Docker; there are other ways to do containerization. It might be through Podman, it might even be leaving plain containerization and going more towards LXC and so on, or Distrobox. So let me get your thoughts on that and on what ZimaOS will support.

Tiger
There's a bit of history to containerization. Everything started from cgroups, the control groups in Linux, which isolate the processes of whatever Linux program, and then someone simply took advantage of control groups and built the idea of containerization. But I think that happened within the Linux community, and then someone else took that opportunity and built Docker, basically an orchestrated version of the original Linux containerization. And I think they turned that into a company. And because it's commercialized, in the spirit of open source, people simply don't like anything that's commercialized.

And then there are alternatives, and Podman is one of them. I liked it very much when it came out; at an early stage, I started playing with Podman. And there's also another alternative to Docker called nerdctl. I'm not sure you've heard of it; you can look it up later. It's similar to Podman. I think Podman is from Red Hat, and for nerdctl I forget which organization it's affiliated with, but they're simply two alternatives to Docker. When I first joined this project, I seriously looked into Podman and nerdctl, and I was seriously thinking about switching over from Docker Engine to one of those.

But the thing is, the entire CasaOS was built using Go, the language by Google. And we found the whole developer experience of Docker Engine simply awesome. There are a lot of resources, and there's a community behind it. In contrast, with Podman and nerdctl, like I said, this team is new. There's no guru on this team; there are people who started learning about programming within the company. To be honest, I started learning Golang after I joined the company; before that, I was a Java guy. So with all that, we stayed with Docker Engine, because everything is well documented. After all, we deliver value, rather than trying to make everything ideal. We focus on delivering value, and Docker Engine simply meets that goal. But who knows, in the future we might switch over to nerdctl.

Axel
And there are, of course, crazy approaches going on. I mentioned VanillaOS in the beginning, and what they basically do with apx as, I would say, a meta package manager that uses an equivalent of Distrobox underneath. And then you can basically use apx to install packages across the different base operating systems, and so on. Of course, there's a lot of movement and innovation going on all the time. Maybe that's the last thing: what about virtualization? I've seen it, I think you call it ZVM, right?

Tiger
It's just a name: Zima Virtualization Management, ZVM. Not a new technology.

Axel
So what are you using for that, then? Because I couldn't find detailed information about what kind of virtualization you use. Maybe you can tell me more about that.

Tiger
Definitely. That's a great question. Again, everything we use to build ZimaOS is from open source. Underneath, it's a combination of QEMU and KVM. QEMU is an emulator of hardware, and KVM is kernel-based virtualization that simply exposes the underlying hardware resources directly to whatever apps run within the VM. Those are the two technologies underneath. But everything is exposed via a framework called libvirt, which plays a role a bit like Docker's.

If you compare it to what we just talked about: libvirt itself is not virtualization, but it orchestrates. It takes the management requests from the app and sends the API calls over to either QEMU or KVM. People always misunderstand QEMU versus KVM; they think you can only use one or the other. In fact, modern virtualization on Linux uses a mix of both. Hardware like a PCI host or a sound card can be emulated, while things like CPU, memory, even the GPU can be passed through via the KVM mechanism. So it's a hybrid of both. It's actually extremely difficult to do virtualization with KVM alone, so it has to be a combination of the two. On the other hand, you can't just use QEMU to emulate everything, because that would simply be slow. So we use libvirt. There are other technologies, like VirtualBox; I don't know much detail about it, but simply because it's not open source, although we considered it at the very beginning, we decided to stay with QEMU/KVM via libvirt. I hope that clarifies your question.
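To make the QEMU-plus-KVM split concrete: a libvirt guest is described by a domain XML document, and the hybrid Tiger describes is visible right in it, with `type='kvm'` for hardware-accelerated CPU and memory while individual devices are emulated or paravirtualized. A minimal hypothetical sketch; names, paths, and sizes are illustrative, not ZVM's actual definitions:

```xml
<!-- Hypothetical minimal libvirt domain: KVM acceleration for
     CPU/memory, QEMU-provided devices for the rest.
     All names and values are illustrative. -->
<domain type='kvm'>                    <!-- hardware virt via KVM -->
  <name>zvm-guest</name>
  <memory unit='MiB'>2048</memory>
  <vcpu>2</vcpu>
  <os>
    <type arch='x86_64'>hvm</type>
  </os>
  <devices>
    <disk type='file' device='disk'>
      <source file='/DATA/zvm/guest.qcow2'/>
      <target dev='vda' bus='virtio'/>  <!-- paravirtualized disk -->
    </disk>
    <interface type='network'>
      <source network='default'/>
      <model type='virtio'/>            <!-- paravirtualized NIC -->
    </interface>
    <sound model='ich9'/>               <!-- QEMU-emulated sound card -->
  </devices>
</domain>
```

libvirt takes a definition like this and drives QEMU and KVM accordingly, which is the orchestration role described above.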