I've been considering files as a process, where file objects live for a limited duration (from minutes to days) and need to be accessible to multiple other hosts within a network. I know Redis can do something similar, but being written in C, it has newly reported vulnerabilities discovered every other day... As a use case, this appeals to me. We get the in-memory nature of processes that store binary data until they expire and isolation between those processes, with the memory safety guarantees of Rust (plus WASM sandboxing).
Lunatic, similar to Erlang and Elixir, encourages application designs around processes as the primitive building block. Processes in lunatic are extremely lightweight, and spawning a process can be loosely compared to creating a new object in object-oriented languages. Even Alan Kay, who coined the term OOP, envisioned a more process-like design when discussing object-oriented patterns:
In the rest of this post I'm going to explore some of the new features in lunatic and the relationship between message passing and function/method invocations.
1. Permissions through processes
Lunatic has a much stronger sandboxing model around individual processes than Erlang. It allows you to spawn processes and limit their CPU & memory usage and access to certain host functions.
This characteristic is really important in a world where we depend on so many 3rd party dependencies that it becomes impossible to audit them all. If we can sandbox them inside processes without any permissions, we can limit their impact. A pattern where you spawn a process with specific permissions just to call a library function inside of it could be a viable option in an environment where processes are cheap. An even more common use case would be answering each web request in a separate sandboxed process.
There are efforts in other runtimes to solve similar problems, like Deno's permissions. However, in Deno's case the permissions affect all code that is running and are intended more for end users running scripts than for developers giving specific permissions to dependencies. I feel like inverting this and giving developers the power to set permissions on parts of their code base is a much more powerful concept. Somehow end users always end up "clicking OK" until the script they are trying to run works.
Forbidding a process to access the network or the filesystem can be useful, but sometimes it's not enough. I would like to show here how we can use processes to augment host functions.
Recently two new exciting features landed in lunatic:
- Registry allows you to register processes under a name and version.
- Tags allow for selective receiving of messages.
With them we can create new interesting designs where many host functions are replaced or augmented by processes and messages.
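To make selective receiving a bit more concrete, here is a minimal sketch in plain Rust. It is not lunatic's API: `Mailbox`, `Message` and `receive_tagged` are hypothetical names, and the mailbox is just an in-memory queue, whereas a real runtime would block until a matching message arrives.

```rust
use std::collections::VecDeque;

/// Hypothetical message type: a numeric tag plus a payload.
#[derive(Debug, Clone, PartialEq)]
pub struct Message {
    pub tag: u64,
    pub body: String,
}

/// Hypothetical mailbox (an assumption for this sketch, not a lunatic type).
#[derive(Default)]
pub struct Mailbox {
    queue: VecDeque<Message>,
}

impl Mailbox {
    /// Appends a message to the end of the queue.
    pub fn send(&mut self, tag: u64, body: &str) {
        self.queue.push_back(Message { tag, body: body.to_string() });
    }

    /// Selective receive: removes and returns the oldest message carrying `tag`,
    /// leaving every other message in the queue in its original order.
    pub fn receive_tagged(&mut self, tag: u64) -> Option<Message> {
        let index = self.queue.iter().position(|m| m.tag == tag)?;
        self.queue.remove(index)
    }
}
```

A caller that tags its request can then wait for the reply with the same tag, even if unrelated messages arrived in the mailbox first.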
One interesting use case would be changing the behaviour of host functions by wrapping them in processes. Let's look at an example of creating a wrapper process that changes the behaviour of the DNS `resolve` host call by limiting the lookup to only "*.example.com" subdomains:
This process runs forever in a loop and answers requests, imitating a `lunatic::resolve` function, but only if the request is for "*.example.com" subdomains. Internally it actually uses `lunatic::resolve` and adds additional behaviour to it.
To use this `resolve` process we would create a new `Environment` and register the process under the name "resolve" and version "1.0":
Now the `child` process spawned inside of `env` can resolve domain names by first looking up the `resolve` process and sending requests to it:
Notice that only two host function namespaces are allowed inside the environment, `lunatic::process` & `lunatic::message`, but not the `lunatic::net::*` one containing the original `lunatic::resolve` host function. That way the child process can't circumvent our limitation and directly use the `resolve` host function. That's a great way to implement some higher level logic around permissions inside of child processes. You basically block the original host function and re-expose it through a process that has a well known name and version.
The `resolve` process could be considered an object and can have some internal state. It could for example keep track of lookups and rate limit them depending on the caller. Being able to change the behaviour of host functions programmatically by wrapping them in processes becomes a superpower with unlimited possibilities.
The code is quite verbose and uses some of the lower level APIs for educational purposes, but these features can always be wrapped into higher level types, abstracting away details from developers. For simplicity I didn't include any error handling in the examples, but you will probably want to send a reply back in case of an error, instead of just not sending anything and having the client wait forever on the response. Also, I didn't try to compile the code.
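The pattern described in this section can be sketched in plain Rust without lunatic at all, which may help to see its shape. Everything here is an assumption made for illustration: threads and `std::sync::mpsc` channels stand in for lunatic processes and messages, `host_resolve` is a stub for the real host call that returns a fixed address, and `Registry` is a hypothetical map keyed by name and version. Unlike the examples described above, this version always sends a reply, including on errors.

```rust
use std::collections::HashMap;
use std::net::{IpAddr, Ipv4Addr};
use std::sync::mpsc::{channel, Sender};
use std::thread;

/// A request to the wrapper: the name to look up and where to send the answer.
pub struct Request {
    pub name: String,
    pub reply_to: Sender<Result<Vec<IpAddr>, String>>,
}

/// Stub for the real host call; returns a fixed address so no network is needed.
fn host_resolve(_name: &str) -> Vec<IpAddr> {
    vec![IpAddr::V4(Ipv4Addr::new(192, 0, 2, 1))]
}

/// Spawns the wrapper "process": a thread that answers requests in a loop.
pub fn spawn_resolver() -> Sender<Request> {
    let (tx, rx) = channel::<Request>();
    thread::spawn(move || {
        // The loop ends once every sender has been dropped.
        for req in rx {
            let answer = if req.name.ends_with(".example.com") {
                Ok(host_resolve(&req.name))
            } else {
                Err(format!("lookup of {} is not permitted", req.name))
            };
            // Always reply so the caller never waits forever; ignore a vanished caller.
            let _ = req.reply_to.send(answer);
        }
    });
    tx
}

/// Hypothetical registry keyed by (name, version).
pub type Registry = HashMap<(String, String), Sender<Request>>;

/// What a sandboxed child would do: look up the process, send, wait for the reply.
pub fn resolve_via(registry: &Registry, name: &str) -> Result<Vec<IpAddr>, String> {
    let key = ("resolve".to_string(), "1.0".to_string());
    let resolver = registry.get(&key).ok_or("no resolve@1.0 registered")?;
    let (reply_to, reply) = channel();
    resolver
        .send(Request { name: name.to_string(), reply_to })
        .map_err(|e| e.to_string())?;
    reply.recv().map_err(|e| e.to_string())?
}
```

The child only ever sees the registry entry, so a lookup for anything outside "*.example.com" comes back as an error instead of reaching the host function. What the sketch cannot show is the sandboxing itself; blocking the original host function is the runtime's job.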
2. Host functions as processes
The previous example demonstrates well how messaging is not that different from function calls: a request is made (with arguments) and a response is received (return value). We could take this a step further and replace most of our host functions with native processes. Using the versioning ability of the registry we could even provide backwards compatibility in the runtime by shipping, for example, a `resolve@2.0` that has a different signature/behaviour, but not removing `resolve@1.0` from the VM. All these "processes" could be implemented as native processes using the `Process` trait in the runtime, but instead of using a message queue they would be function calls without much additional overhead.
One downside of this approach would be that we lose the ability to get instantiation-time errors on host function type mismatches when loading the module. Only at runtime can we tell if the request was a correct message or not. This burden could always be moved onto the library developers who wrap the message sending into correctly typed functions. In this case we have truly come full circle and have "host functions" again, but this time on the library and not the VM level.
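As a rough sketch of this idea, again in plain Rust and not using the runtime's actual `Process` trait, two versions of `resolve` can live side by side in a registry while a library function hides the untyped messages. The `Value` enum, the `Registry` type, the fixed addresses and the signature chosen for version 2.0 are all assumptions made up for the example.

```rust
use std::collections::HashMap;

/// Untyped message, as a process would see it.
#[derive(Debug, Clone, PartialEq)]
pub enum Value {
    Text(String),
    List(Vec<String>),
}

type Handler = Box<dyn Fn(Value) -> Result<Value, String>>;

/// Hypothetical registry mapping (name, version) to a native handler.
#[derive(Default)]
pub struct Registry {
    handlers: HashMap<(String, String), Handler>,
}

impl Registry {
    pub fn register(&mut self, name: &str, version: &str, handler: Handler) {
        self.handlers.insert((name.to_string(), version.to_string()), handler);
    }

    /// A direct function call instead of a message queue round trip.
    pub fn call(&self, name: &str, version: &str, msg: Value) -> Result<Value, String> {
        let handler = self
            .handlers
            .get(&(name.to_string(), version.to_string()))
            .ok_or_else(|| format!("{}@{} is not registered", name, version))?;
        handler(msg)
    }
}

/// Registers both versions; 2.0 changes the signature while 1.0 stays available.
pub fn default_registry() -> Registry {
    let mut registry = Registry::default();
    // resolve@1.0: one name in, one address out (fixed answer, no network).
    registry.register("resolve", "1.0", Box::new(|msg: Value| match msg {
        Value::Text(_) => Ok(Value::Text("192.0.2.1".to_string())),
        other => Err(format!("resolve@1.0 expected Text, got {:?}", other)),
    }));
    // resolve@2.0: several names in, one address per name out.
    registry.register("resolve", "2.0", Box::new(|msg: Value| match msg {
        Value::List(names) => Ok(Value::List(
            names.iter().map(|_| "192.0.2.1".to_string()).collect(),
        )),
        other => Err(format!("resolve@2.0 expected List, got {:?}", other)),
    }));
    registry
}

/// Library-level typed wrapper: callers get a normal function signature, and a
/// message mismatch surfaces here at run time instead of at module load time.
pub fn resolve_v1(registry: &Registry, name: &str) -> Result<String, String> {
    match registry.call("resolve", "1.0", Value::Text(name.to_string()))? {
        Value::Text(addr) => Ok(addr),
        other => Err(format!("unexpected reply: {:?}", other)),
    }
}
```

Sending a `Text` message to `resolve@2.0` is only rejected when the call is made, which is exactly the downside mentioned above. The typed `resolve_v1` wrapper is where a library would absorb that risk.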
3. Resources as processes
We can even take these ideas further and treat resources as processes.
A good example would be UDP connections. A process subscribes itself to the `UDP` process and gets all datagrams from the UDP connection or can send data to it. Another would be a stdio process: sending messages to it writes to the standard output, and if there is new input the process forwards it to us.
One more complicated example would be a `GPU` process representing a GPU resource. You could send messages to it containing `data` that is copied from the local memory into the GPU memory or `commands` that instruct the GPU what operations to perform on that data. This way of using GPUs maps well onto how modern GPUs actually work. Sending a big data message is usually considered bad practice in Erlang (and lunatic), because the process you are sending it to could live on a different machine and sending a lot of data over the network could slow everything down. Modern GPUs have separate memory, and sending a lot of data to them requires copying it into that memory and should be avoided when possible. Usually you want to transfer the data only once and later just perform operations on it. This is a good demonstration of how system limitations are also communicated well by using a process abstraction.
TCP streams
Even though TCP streams may seem to be in a similar basket as UDP when it comes to modelling them with processes, they are quite different. Usually you will end up having a message based protocol built on top of the TCP stream, but the stream is not a message based protocol itself. To construct these higher level messages you want the freedom to pull data out of the stream and put it into a message. You could take the same approach as UDP, but it just feels more natural to me to pull exact chunks of data out, instead of someone pushing arbitrary sized chunks into your message queue when they arrive.
With the UDP process there was a concept of subscribing to it and receiving new UDP datagrams, but with TCP I would use a different approach, more similar to a function call. The main problem with TCP streams is that the decision of what represents a message/request/reply is defined by some higher level agreement and can't be made without reading parts of the message first. In many cases this would mean reading a few bytes that contain the size of the message and then the whole message once we have the size. This could be modelled by sending a request to the TCP stream process with the data size we want to read and waiting on a message containing the data.
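The "read the size first, then the message" step looks roughly like this when done directly on a stream. This is a minimal sketch; the 4-byte big-endian length prefix is an assumption chosen for the example, and `read_message` is a hypothetical helper that works on anything implementing `std::io::Read`, such as a `TcpStream`.

```rust
use std::io::{self, Read};

/// Reads one length-prefixed message: a 4-byte big-endian size, then that many bytes.
/// The prefix format is an assumption made for this sketch.
pub fn read_message<R: Read>(stream: &mut R) -> io::Result<Vec<u8>> {
    // Pull exactly the bytes that hold the size.
    let mut size_bytes = [0u8; 4];
    stream.read_exact(&mut size_bytes)?;
    let size = u32::from_be_bytes(size_bytes) as usize;

    // Now that the size is known, pull exactly one message out of the stream.
    let mut payload = vec![0u8; size];
    stream.read_exact(&mut payload)?;
    Ok(payload)
}
```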
Requesting data from a TCP stream process this way is almost identical to doing a `read()` function call and doesn't have any other benefits. In this case I would actually prefer just to use the host function directly and model the (message based) higher level protocols around processes and messages.
Conclusion
I was never a big fan of Java's "everything is an object" or Unix's "everything is a file" approach. Some concepts just don't map well onto files or objects and the API around them feels somewhat awkward or it negatively impacts performance. And probably modelling your application architecture with an "everything is a process" approach would have a similar result.
However, I hope I managed to get you a bit excited about using processes to model different parts of your application, and to show how versatile processes actually are. Like an object, a process can represent many different things too.
I would love to hear from other developers some ideas that could push processes/actors into new and interesting use-cases. Please leave a comment if you have any thoughts!