I’m currently working on three separate white papers to showcase key architectural aspects of Broadleaf Commerce Microservices: Observability, Advanced Composable Commerce, and Scalability. I wanted to provide an early glimpse of some content bubbling up from the observability topic.

There are several goals and features we want to highlight when talking about Broadleaf and observability:

  1. Demonstrate an architecture that exposes observability as a core tenet
  2. Deep dive into distributed tracing. Make sure the framework is traceable in an open way that can be leveraged by multiple observability platforms, including cloud-native products
  3. Make sure traces are ‘data route aware’ so that the layer of the FlexPackage in which the trace is harvested can be effectively identified
  4. Make sure the 3 pillars of observability are covered: Tracing, Logs, and Metrics. Make sure these concepts are easily correlated.
  5. Introduce a FlexPackage design that leverages read and process replicas. This allows separation of mostly fetch functionality from mostly long lifecycle write functionality while maintaining a relatively small overall footprint. This is the "balanced" FlexPackage design.
  6. Take steps to make our services good k8s citizens. This involves introducing functionality to guarantee service readiness and load ordering.
  7. Introduce an environment log to greatly assist with troubleshooting custom FlexPackage creation. FlexPackage creation, while entirely a config effort, can be tricky when it comes to understanding properties and messaging config. This helps with that.
  8. Introduce a feature that turns off all messaging channels and then selectively enables or disables a subset. This helps keep the YAML configuration files manageable.

For those of you not familiar with Broadleaf Commerce Microservices, we introduce a new concept called “FlexPackage” that is targeted directly at advanced composable commerce and will be covered in more detail in an upcoming white paper. For now, know that a FlexPackage is a configuration and packaging concept for Broadleaf that allows multiple microservices to exist in a single JVM runtime under a single Spring instance. This means that beans are, for the most part, shared, which reduces resource outlay. The FlexPackage performs smart routing of data and persistence to the correct, separate backing datastore for each contained microservice.
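To make the routing idea more concrete, here is a minimal sketch of one common way to select a backing datastore per microservice inside a single JVM: a thread-bound route key plus a lookup table. It is an illustration under assumptions, not Broadleaf's implementation. The class `DataRouteSketch`, its methods, the route names, and the JDBC URLs are all hypothetical, and plain strings stand in for real `DataSource` objects.

```java
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical illustration only; none of these names are Broadleaf APIs.
final class DataRouteSketch {

    // Holds the active route (for example "catalog" or "cart") for the current thread.
    private static final ThreadLocal<String> CURRENT_ROUTE = new ThreadLocal<>();

    // Stand-in for real DataSource objects: route name -> datastore identifier.
    private final Map<String, String> datastoresByRoute;

    DataRouteSketch(Map<String, String> datastoresByRoute) {
        this.datastoresByRoute = Map.copyOf(datastoresByRoute);
    }

    // Runs the work with the given route active, then restores the previous route.
    <T> T withRoute(String route, Supplier<T> work) {
        String previous = CURRENT_ROUTE.get();
        CURRENT_ROUTE.set(route);
        try {
            return work.get();
        } finally {
            if (previous == null) {
                CURRENT_ROUTE.remove();
            } else {
                CURRENT_ROUTE.set(previous);
            }
        }
    }

    // Resolves the datastore for whichever route is active on this thread.
    String currentDatastore() {
        String route = CURRENT_ROUTE.get();
        if (route == null) {
            throw new IllegalStateException("No data route is active");
        }
        String datastore = datastoresByRoute.get(route);
        if (datastore == null) {
            throw new IllegalStateException("No datastore registered for route: " + route);
        }
        return datastore;
    }

    public static void main(String[] args) {
        DataRouteSketch routing = new DataRouteSketch(Map.of(
                "catalog", "jdbc:postgresql://db/catalog",   // assumed URL
                "cart", "jdbc:postgresql://db/cart"));        // assumed URL
        String catalogStore = routing.withRoute("catalog", routing::currentDatastore);
        String cartStore = routing.withRoute("cart", routing::currentDatastore);
        System.out.println(catalogStore);
        System.out.println(cartStore);
    }
}
```

Spring's `AbstractRoutingDataSource` follows a similar key-lookup pattern. Whether Broadleaf builds on it is not something this post states.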

First, you'll notice this new project structure in the reference demo application:

This replaces our previous "application" directory that only contained the all-in-one FlexPackage (this concept is now the "min" FlexPackage).

  • min emits a single FlexPackage as part of the Maven lifecycle.
  • granular doesn't emit a Java artifact itself. Rather, it is a holding area for the granular Docker and k8s builds that refer to the individual services defined in the project services directory.
  • balanced consists of 4 new FlexPackages (browse, cart, processing, supporting).

Balanced is unique in that it is designed to leverage read and process replicas. The processing FlexPackage is a "process" replica and contains some duplicates of services from other FlexPackages (e.g. catalog, customer). The purpose of the duplication is to facilitate a separate FlexPackage for long lifecycle write processes (e.g. import and indexing).

The processing FlexPackage is not included in the normal, storefront-facing load balancing of traffic. Rather, that traffic is directed to the browse and cart FlexPackages. This is a powerful way to manage load, connection pool sizing, and performance - and to keep footprints minimal while still scaling. The docker directory contains the docker-compose files we're accustomed to. The k8s directory contains the new k8s manifests that demonstrate Kubernetes deployments with observability added.

If we dive into the config for one of the balanced FlexPackages, we can see a bit more. For browse, we should have a FlexPackage where most of the messaging channels are off (the message handling is picked up by the processing FlexPackage). We should also expect to need to discern and properly configure properties for the more granular services contained in this FlexPackage. Two capabilities stand out as important for effective FlexPackage creation:

  1. See all the message channels auto-configured by whatever is on the classpath for the FlexPackage. Moreover, determine what channels are enabled and disabled - and why.
  2. See all the properties auto-configured by whatever is on the classpath for the FlexPackage. This gives you the scope of what you need to concern yourself with and provides a sanity check of what is exposed at runtime.

This link shows what the report looks like for the browse FlexPackage. Notable is the reason column for the message channel listing. From this, you can determine specifically why a channel is enabled/disabled and understand what you need to do to change it (if applicable). In this case, all the sandbox transition, import, and bulk channels are disabled. This log is emitted near the beginning of container startup and can be disabled with a property.
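The "all off, then selectively on" behavior and the reason column can be pictured with a small resolution routine. The following is a minimal sketch under assumptions, not the actual Broadleaf feature: the class `ChannelReportSketch`, the channel names, and the reason strings are hypothetical. It only shows the precedence rule that an explicit per-channel setting wins over a global default, with the reason recorded.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical illustration; class, channel, and reason names are not Broadleaf APIs.
final class ChannelReportSketch {

    // One report row: the channel, its final state, and why it ended up that way.
    static final class Row {
        final String channel;
        final boolean enabled;
        final String reason;

        Row(String channel, boolean enabled, String reason) {
            this.channel = channel;
            this.enabled = enabled;
            this.reason = reason;
        }

        @Override
        public String toString() {
            return String.format("%-28s %-9s %s",
                    channel, enabled ? "ENABLED" : "DISABLED", reason);
        }
    }

    // A per-channel override wins; otherwise the global default applies.
    static List<Row> resolve(List<String> discovered,
                             boolean globalDefault,
                             Map<String, Boolean> overrides) {
        List<Row> rows = new ArrayList<>();
        for (String channel : discovered) {
            Boolean override = overrides.get(channel);
            if (override != null) {
                rows.add(new Row(channel, override, "explicit per-channel setting"));
            } else {
                rows.add(new Row(channel, globalDefault, "global default"));
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        // Assumed channel names, standing in for whatever the classpath contributes.
        List<String> discovered =
                List.of("importChannel", "bulkChannel", "cacheInvalidationChannel");
        // Everything off by default, then one channel switched back on.
        resolve(discovered, false, Map.of("cacheInvalidationChannel", true))
                .forEach(System.out::println);
    }
}
```

Recording the reason next to each decision is what makes such a report useful for troubleshooting: it shows which setting to change, not just the outcome.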

Another facet of the startup is the ordered launch of pods. This is accomplished with a new component that looks at the Prometheus "up" status for one or more dependencies before proceeding with application load. This check happens at the beginning of the Spring Boot lifecycle. The configuration looks like this:

And, you can see it working in the logs like so:
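Separately from the configuration and log output referenced above, the dependency check itself can be sketched in a few lines. This is an illustration under assumptions, not Broadleaf's component: the class `PrometheusUpGateSketch`, the `job` label used to identify a dependency, and the retry parameters are hypothetical. The `/api/v1/query` endpoint and the `up` metric are standard Prometheus, and a regular expression stands in for a proper JSON parser to keep the sketch dependency-free.

```java
import java.io.IOException;
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.time.Duration;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration; not the Broadleaf startup component.
final class PrometheusUpGateSketch {

    // Matches a sample in an instant-query result, e.g. "value":[1700000000.5,"1"].
    private static final Pattern SAMPLE_VALUE = Pattern.compile(
            "\"value\"\\s*:\\s*\\[\\s*[0-9.eE+-]+\\s*,\\s*\"([^\"]*)\"\\s*\\]");

    // True when the response holds at least one sample and every sample equals "1".
    static boolean allUp(String responseJson) {
        Matcher matcher = SAMPLE_VALUE.matcher(responseJson);
        boolean sawSample = false;
        while (matcher.find()) {
            sawSample = true;
            if (!"1".equals(matcher.group(1))) {
                return false;
            }
        }
        return sawSample;
    }

    // Builds the instant-query URL for up{job="<job>"}; the job label is an assumption.
    static URI upQuery(String prometheusBaseUrl, String job) {
        String promQl = "up{job=\"" + job + "\"}";
        return URI.create(prometheusBaseUrl + "/api/v1/query?query="
                + URLEncoder.encode(promQl, StandardCharsets.UTF_8));
    }

    // Polls until the dependency reports up or the attempts run out.
    static boolean awaitUp(HttpClient client, String baseUrl, String job,
                           int attempts, Duration pause) throws InterruptedException {
        for (int i = 0; i < attempts; i++) {
            try {
                HttpRequest request = HttpRequest.newBuilder(upQuery(baseUrl, job))
                        .timeout(Duration.ofSeconds(5))
                        .GET()
                        .build();
                HttpResponse<String> response =
                        client.send(request, HttpResponse.BodyHandlers.ofString());
                if (response.statusCode() == 200 && allUp(response.body())) {
                    return true;
                }
            } catch (IOException e) {
                // Prometheus may not be reachable yet; treat as "not up" and retry.
            }
            if (i < attempts - 1) {
                Thread.sleep(pause.toMillis());
            }
        }
        return false;
    }
}
```

A caller would run something like `awaitUp(HttpClient.newHttpClient(), "http://prometheus:9090", "cart", 30, Duration.ofSeconds(2))` before continuing startup, where the URL and job name are again placeholders.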

Finally, observability is built into these changes. As part of the k8s deployment, we add:

  • Prometheus (for time series metric monitoring)
  • Grafana (for dashboard visualization of metrics)
  • Jaeger agents (for capturing traces emitted from the application by OpenTracing)
  • ELK stack (for trace visualization, centralized logging, trace-to-log correlation, and trace-to-metric correlation).

These are examples, and the traces will still work well with other observability approaches, such as cloud-native solutions offered by cloud providers. OpenTracing can also be easily disabled for alternate instrumentation, based on appetite. We’re keeping an eye on OpenTelemetry as it matures, but right now we’re getting better results out of OpenTracing (or vendor-specific instrumentation).

We employ OpenTracing support for Spring with the most common application intersections automatically configured. We build on this by adding a ContextInfoCustomizer that intercepts OpenTracing spans and adds the current tenant data (if applicable).
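The idea of enriching spans with tenant data can be sketched as follows. This is a minimal illustration, not the actual ContextInfoCustomizer. To stay dependency-free, the local `TaggableSpan` interface stands in for `io.opentracing.Span`, mirroring only its `setTag(String, String)` method. The tag name `tenant.id` and the way the tenant is passed in are assumptions.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical illustration; not the Broadleaf ContextInfoCustomizer.
final class TenantTagSketch {

    // Stand-in for the one method of io.opentracing.Span that this sketch needs.
    interface TaggableSpan {
        TaggableSpan setTag(String key, String value);
    }

    // Assumed tag name; the real tag keys are not given in this post.
    static final String TENANT_TAG = "tenant.id";

    // Tags the span with the tenant only when a tenant is present ("if applicable").
    static void customize(TaggableSpan span, Optional<String> currentTenantId) {
        currentTenantId.ifPresent(id -> span.setTag(TENANT_TAG, id));
    }

    // In-memory span that records tags so the behavior can be inspected.
    static final class RecordingSpan implements TaggableSpan {
        final Map<String, String> tags = new LinkedHashMap<>();

        @Override
        public TaggableSpan setTag(String key, String value) {
            tags.put(key, value);
            return this;
        }
    }

    public static void main(String[] args) {
        RecordingSpan withTenant = new RecordingSpan();
        customize(withTenant, Optional.of("tenant-a"));
        System.out.println(withTenant.tags);

        RecordingSpan withoutTenant = new RecordingSpan();
        customize(withoutTenant, Optional.empty());
        System.out.println(withoutTenant.tags);
    }
}
```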

You can see the distributed services are listed at the top.

Next, if you click on one of the traces, you can see context information about the trace. Of interest here are two different links. The link on the left leads to the correlated logs that share the traceId, possibly across multiple services or even pods. You can also easily expand the logs around that time region for additional context. The link on the right is a custom label our tracing adds. This label describes the timeframe of the span and links you over to the Grafana dashboard with the pod pre-selected and the timeframe set. This goes further than the usual pod metric correlation: it correlates the performance of the entire cluster with the timeframe of your span, so you can detect issues multiple layers deep, such as in the database or Kafka, not just in your immediate JVM.
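A link of this kind can be assembled from the span's start time and duration. The sketch below is an assumption-laden illustration, not the label our tracing actually emits: the class `GrafanaLinkSketch`, the dashboard path, the `pod` template variable name, and the padding are hypothetical. It relies only on Grafana's URL conventions of `from` and `to` in epoch milliseconds and `var-<name>` for template variables.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.time.Duration;
import java.time.Instant;

// Hypothetical illustration; not the label generator used by Broadleaf tracing.
final class GrafanaLinkSketch {

    // Builds a dashboard URL whose time range covers the span plus padding on both sides.
    static String dashboardLink(String grafanaBaseUrl,
                                String dashboardPath,
                                String pod,
                                Instant spanStart,
                                Duration spanDuration,
                                Duration padding) {
        long from = spanStart.minus(padding).toEpochMilli();
        long to = spanStart.plus(spanDuration).plus(padding).toEpochMilli();
        return grafanaBaseUrl + dashboardPath
                + "?from=" + from
                + "&to=" + to
                // Assumes the dashboard defines a template variable named "pod".
                + "&var-pod=" + URLEncoder.encode(pod, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String link = dashboardLink(
                "http://grafana:3000",            // assumed base URL
                "/d/abc123/cluster-overview",     // assumed dashboard path
                "browse-0",                       // assumed pod name
                Instant.now(),
                Duration.ofMillis(250),
                Duration.ofMinutes(1));
        System.out.println(link);
    }
}
```

Padding the window on both sides is a design choice: a span is often only milliseconds long, while metrics are scraped at intervals of seconds, so an unpadded range could show an empty dashboard.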

I look forward to sharing more information about this topic with you all in the upcoming white paper. We’re excited to share the new FlexPackage samples and supporting features as part of the upcoming 1.3 release. We are likely to publish our scalability white paper first, but be on the lookout for the official composability white paper soon.