Data scientists have mature platforms for realtime, large-scale data processing, but these platforms impose a structure on work that does not suit every problem. Many modern data-driven tasks now take the form of microservices: small, decomposed units of compute exposed as APIs or available to be plugged into other parts of the business.
For many data teams, implementing a data-driven service requires architecting and deploying an entire platform, and current microservice or PaaS platforms have poor primitives for handling data-driven tasks and event streams. Additionally, data-driven services often require inherently complex engineering (e.g. distribution and integration into middleware).
Thus, data scientists spend huge amounts of time wiring up complex infrastructure, boilerplate, and middleware, all of which lie outside their role and expertise. This leads to:
- Slow time-to-market: building a service has a large fixed infrastructural cost
- Bureaucratic overhead: data-driven APIs or services require collaboration between data-science and other engineering teams
- Lack of robustness: robustness suffers as ill-suited infrastructure is used in production
Nstack is a platform for building composable, stream-based microservices.
An nstack microservice can be created by combining a regular function or class with a small configuration file. Aside from defining the service's API, the developer does not have to deal with any boilerplate or infrastructure, and can focus exclusively on business logic. Services can currently be written in Java, Python, or Haskell, and can contain any operating system, language, or binary packages.
    import nstack
    import tensorflow as tf
    import numpy as np

    class ClickAnalytics(nstack.BaseService):
        # Entry point, declared as `analyse` in the service's configuration file
        def analyse(self, messages):
            ...
    name: ClickAnalytics
    stack: python
    api: |
      analyse: (Text) -> (Text)
    packages: [libsvm]
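To illustrate how the `api` entry in the configuration file relates to the class, here is a minimal sketch that parses a signature line and checks that a class exposes the declared method. It does not use or reproduce nstack's own implementation: the grammar is inferred only from the single line `analyse: (Text) -> (Text)` shown above, and `Signature`, `parse_signature`, and `check_service` are hypothetical names introduced for this example.

```python
import re
from typing import NamedTuple


class Signature(NamedTuple):
    method: str
    input_type: str
    output_type: str


# Assumed grammar (inferred from the one example line, not from nstack itself):
#   <method>: (<InputType>) -> (<OutputType>)
_SIGNATURE = re.compile(
    r"^\s*(\w+)\s*:\s*\(\s*(\w+)\s*\)\s*->\s*\(\s*(\w+)\s*\)\s*$"
)


def parse_signature(line: str) -> Signature:
    """Split a signature line into method name, input type and output type."""
    match = _SIGNATURE.match(line)
    if match is None:
        raise ValueError(f"not a valid signature: {line!r}")
    return Signature(*match.groups())


def check_service(service_cls: type, signature: Signature) -> None:
    """Raise if the class does not expose the method the config declares."""
    method = getattr(service_cls, signature.method, None)
    if not callable(method):
        raise TypeError(
            f"{service_cls.__name__} has no method {signature.method!r}"
        )


class ClickAnalytics:
    # Stand-in class; a real service would extend nstack.BaseService.
    def analyse(self, messages):
        return messages


sig = parse_signature("analyse: (Text) -> (Text)")
check_service(ClickAnalytics, sig)  # passes silently: the method exists
```

The point of the sketch is the division of labour: the class holds only business logic, while the configuration file carries the typed interface that a platform can validate before deployment.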
Once built and deployed, an nstack service can be attached to event sources and event sinks. Nstack comes with various integrations, such as Kafka, RabbitMQ, Redis, and the nstack HTTP Gateway, and organisations can also add their own custom sources and sinks.
This example workflow uses our service to process a stream of events from Kafka and write results to RabbitMQ. It can be started via the nstack CLI or DSL.
    $ nstack start ClickAnalytics --source=kafka/click-stream --sink=rabbitMQ/analytics

    nstack> myWorkflow = kafka/click-stream -> ClickAnalytics -> rabbitMQ/analytics
    nstack> start myWorkflow
Additionally, multiple microservices can be composed together to form workflows. Nstack handles the entire orchestration and passes messages through the workflow.
    nstack> myWorkflow = kafka/click-stream -> ClickCleaning -> ClickAnalytics -> rabbitMQ/analytics
    nstack> start myWorkflow
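Conceptually, a workflow of this shape is function composition over a stream: each message from the source passes through every service in order, and the final result goes to the sink. The following is a minimal in-process sketch of that idea, not a description of how nstack orchestrates services. `run_workflow`, `click_cleaning`, and `click_analytics` are hypothetical names, and plain Python lists stand in for Kafka and RabbitMQ.

```python
from functools import reduce
from typing import Callable, Iterable, List, Sequence

Stage = Callable[[str], str]


def run_workflow(
    source: Iterable[str],
    stages: Sequence[Stage],
    sink: Callable[[str], None],
) -> None:
    """Pass each source message through every stage in order, then to the sink."""
    for message in source:
        result = reduce(lambda value, stage: stage(value), stages, message)
        sink(result)


def click_cleaning(message: str) -> str:
    # Hypothetical cleaning step: trim whitespace and normalise case.
    return message.strip().lower()


def click_analytics(message: str) -> str:
    # Hypothetical analysis step: annotate the message with its length.
    return f"{message}:{len(message)}"


events = ["  Home ", "CART"]   # stands in for kafka/click-stream
results: List[str] = []        # stands in for rabbitMQ/analytics

run_workflow(events, [click_cleaning, click_analytics], results.append)
print(results)  # ['home:4', 'cart:4']
```

Because every stage shares the same message-in, message-out shape, adding `ClickCleaning` in front of `ClickAnalytics` changes only the list of stages, not the services themselves.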
In addition to enabling data teams to implement APIs and microservices, nstack enables other developers to reuse existing nstack microservices in their own workflows. For instance, a developer in another team could easily deploy a new workflow using ClickAnalytics, which reads from an HTTP endpoint and writes to Elasticsearch:
    nstack> newWorkflow = http/endpoint-internal -> ClickAnalytics -> Elasticsearch/analytics
    nstack> start newWorkflow
Because nstack services are type-safe, with each service declaring its input and output types in its API, developers can be confident that a service can be reused elsewhere without issues.
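One way to picture this guarantee is a check that runs before a workflow starts: the output type of each step must match the input type of the next. The sketch below assumes types are compared by name only. `ServiceType` and `check_workflow` are hypothetical, the `Json` type is invented for the failing case, and nstack's actual type system may well be richer than this.

```python
from typing import NamedTuple, Sequence


class ServiceType(NamedTuple):
    name: str
    input_type: str
    output_type: str


def check_workflow(
    source_type: str,
    services: Sequence[ServiceType],
    sink_type: str,
) -> None:
    """Raise TypeError if any adjacent pair of steps disagrees on the message type."""
    current = source_type
    for service in services:
        if service.input_type != current:
            raise TypeError(
                f"{service.name} expects {service.input_type}, "
                f"but receives {current}"
            )
        current = service.output_type
    if current != sink_type:
        raise TypeError(f"sink expects {sink_type}, but receives {current}")


cleaning = ServiceType("ClickCleaning", "Text", "Text")
analytics = ServiceType("ClickAnalytics", "Text", "Text")

# Well-typed: Text flows through every step.
check_workflow("Text", [cleaning, analytics], "Text")

# Ill-typed: a hypothetical Json source feeding a Text service is rejected
# before any message is processed.
try:
    check_workflow("Json", [analytics], "Text")
except TypeError as error:
    print(error)  # ClickAnalytics expects Text, but receives Json
```

A check like this needs only the declared signatures, so a developer in another team can validate a new workflow without reading the service's source.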
Nstack is platform-agnostic, and can be deployed as a virtual appliance, a Red Hat RPM, or an AWS AMI.