Guest post from LitmusChaos maintainers
As promised, we are back with another edition of monthly updates from the LitmusChaos community. As the Chaos Engineering and LitmusChaos communities grow, we appreciate the participation and engagement, and we want the community to prosper and contribute back to the project's development.
This article shares the September 2022 updates, covering the latest happenings around the LitmusChaos project.
LitmusChaos is a dynamic open source chaos engineering platform that enables teams to identify weaknesses and potential outages in infrastructures by inducing chaos engineering tests/experiments in a controlled manner. LitmusChaos is driven by the principles of Cloud-Native innovation and gave rise to the principles of Cloud-Native Chaos Engineering. Chaos engineering verifies the resilience of business services and helps DevOps pipelines proactively build code that is more resilient against software and infrastructure faults.
The LitmusChaos project was started in late 2017 to provide simple chaos jobs in Kubernetes. It became a CNCF sandbox project in 2020 and was promoted as a CNCF incubating project in January 2022. Today, it has maintainers from 5 different organizations across cloud-native vendors, solution providers, and end-users.
The project is used in production by more than 30 organizations, including large end-users like Adidas, FIS, iFood, Cyren, Intuit, Lenskart, Orange, and more as well as technology organizations like Red Hat and VMware.
LitmusChaos Releases 2.13.0
LitmusChaos version 2.13.0 was released on the 15th of September with some great new updates to the core components, ChaosCenter, and litmusctl.
Check out the release notes for deeper details on the release:
Release Notes (2.13.0)
Core Component Updates –
- Enhance network experiments to derive IPs of the pods for Kubernetes service if the target pod has a service mesh sidecar. This will enable us to run all network chaos experiments with service mesh enabled effectively. litmuschaos/litmus-go#558
- Adds Chaos SDK Templates for non-Kubernetes experiments (that is, AWS, GCP, VMware, Azure). This speeds up experiment development by providing a proper template for non-k8s services. To know more, refer to the developer docs. litmuschaos/litmus-go#560
- Fixes the stress-chaos experiments to run chaos (helper pod) with minimum capabilities. This allows running the stress experiments in a restricted environment (like OpenShift) with fewer capabilities, as mentioned in the SCC docs. litmuschaos/litmus-go#557
- Enhance the HTTP status code experiment to have the ability to modify the response body. Also, it adds support to provide the content type and encoding for the body in modify body and status code experiment. litmuschaos/litmus-go#556
- Adds the ability to provide a custom Service Account value for the helper pod using the CHAOS_SERVICE_ACCOUNT env. It is optional; if not provided, the helper pod runs with the same service account as the experiment pod. litmuschaos/chaos-runner#178
- Enhance chaos-operator to enable leader election. This ensures that, with multiple replicas, a leader is elected and is the only one actively reconciling the set. litmuschaos/chaos-operator#417
- Refactor chaos operator code to convert the History field in ChaosResult spec as a Go pointer. litmuschaos/chaos-operator#416
- Enhance the chaos (helper pod) status check when waiting for completion with proper error handling. litmuschaos/litmus-go#552
- Adds document content for a better understanding of new HTTP chaos experiments and tunables. #3755
ChaosCenter Updates –
- Enhances chaos-scenarios by replacing the instance-id label with workflow-run-id, which is generated at runtime, so that any scenario CRUD operation results in a unique scenario/run. #3758
- Upgraded chaos operator go-pkg to 2.12.0 in gql-server & subscriber, introducing support for source imagePullSecrets in chaosEngine along with updates in core components based on 2.12.0. #3759
- Updated CRDs for ChaosEngines with source attributes updates in 2.12.0 CRD manifest #3742
- Adds support for providing service-type & makes ClusterIP the default service-type for all services in helm-chart litmuschaos/litmus-helm#257
LitmusCTL Updates –
- Adds changes for Error handling in litmusctl apply manifest logic for better debugging & Usability. litmuschaos/litmusctl#97
- Adds .exe extension to binaries on Windows litmuschaos/litmusctl#96
- Upgrades gopkgs for argo-workflows to v3.3.1 and chaos-operator to 2.12.0 versions reducing vulnerabilities. litmuschaos/litmusctl#98
Thanks to our existing and new contributors for this release- @chandra-dixit-hcl @alebcay @Jasstkn @amityt @Saranya-jena @SarthakJain26 @Adarshkumar14 @Jonsy13 @ispeakc0de @avaakash @uditgaurav
Litmus-2.13.0 (Stable) cluster scope manifest
kubectl apply -f https://raw.githubusercontent.com/litmuschaos/litmus/2.13.0/mkdocs/docs/2.13.0/litmus-2.13.0.yaml
Litmus-2.13.0 (Stable) namespace scope manifest.
#Create a namespace eg: litmus
kubectl create ns litmus
#Install CRDs, if SELF_AGENT env is set to TRUE
kubectl apply -f https://raw.githubusercontent.com/litmuschaos/litmus/master/mkdocs/docs/2.13.0/litmus-portal-crds-2.13.0.yml
kubectl apply -f https://raw.githubusercontent.com/litmuschaos/litmus/master/mkdocs/docs/2.13.0/litmus-namespaced-2.13.0.yaml -n litmus
Upgrading from 2.12.0 to 2.13.0
kubectl apply -f https://raw.githubusercontent.com/litmuschaos/litmus/2.13.0/mkdocs/docs/2.13.0/upgrade-agent.yaml
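If you script these installs, the manifest URLs above can be generated from the release tag. The following is only a minimal sketch: the function names litmus_manifest_url and litmus_upgrade_url are hypothetical helpers, not part of Litmus or litmusctl, and the URL pattern is assumed from the 2.13.0 cluster scope and upgrade links in this post, so it may not hold for other releases.

```shell
#!/usr/bin/env bash

# Hypothetical helper (not part of Litmus): prints the cluster scope manifest
# URL for a release tag, following the URL pattern used in this post.
litmus_manifest_url() {
  printf 'https://raw.githubusercontent.com/litmuschaos/litmus/%s/mkdocs/docs/%s/litmus-%s.yaml\n' \
    "$1" "$1" "$1"
}

# Hypothetical helper: prints the agent upgrade manifest URL for a release tag.
litmus_upgrade_url() {
  printf 'https://raw.githubusercontent.com/litmuschaos/litmus/%s/mkdocs/docs/%s/upgrade-agent.yaml\n' \
    "$1" "$1"
}

# Example usage (needs cluster access, so it is left commented out):
#   kubectl apply -f "$(litmus_manifest_url 2.13.0)"
#   kubectl apply -f "$(litmus_upgrade_url 2.13.0)"
#   kubectl get pods -n litmus   # assumes the install uses the litmus namespace
```

Keeping the version in one place makes it harder to mix release tags across the install and upgrade commands.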
- @chandra-dixit-hcl made their first contribution in #3752
Full Changelog: 2.12.0…2.13.0
Latest from the LitmusChaos Community
Here is a sneak peek into the adopter story presented by InfraCloud on how they are using LitmusChaos:
At InfraCloud, we are using Litmus to develop Resiliency Frameworks.
Why do we use Litmus?
To simulate various Chaos scenarios using fault injection templates provided by Litmus. Litmus also helps to incorporate custom fault templates developed using AWS SSM documents.
How do we use Litmus?
Currently, we have tested different kinds of scenarios, including faults like pod deletion, network latency, resource stressing, network partitioning in databases, and many more.
Benefits of using Litmus:
- Easy deployment.
- Easy Fault injection.
- Custom Grading for experiments
- SSM integration helps to inject fault in both EKS and external AWS components.
Company website: https://www.infracloud.io/
Company GitHub: https://github.com/infracloudio
The LitmusChaos community is also proud to announce that Cloud Native Labs by HCL Technologies is using LitmusChaos to bring the practice of Chaos Engineering to their larger teams, and they will soon share their adoption story!
As the community continues to grow, so does the content. Over the month the community members have created some amazing and exciting content to uplift the presence of LitmusChaos on the Cloud Native map. Check out all the latest content curated by the community for the community:
For our monthly updates from August 2022, check out the LitmusChaos August update blog: https://www.cncf.io/blog/2022/09/20/litmuschaos-august-2022-update/
LitmusChaos community member Akash Shrivastava (Software Engineer at Harness) authored a blog on the HTTP Chaos Experiments he worked on over the past few months. Check out “Introduction to HTTP Chaos in LitmusChaos”: https://www.cncf.io/blog/2022/09/29/introduction-to-http-chaos-in-litmuschaos/
Chaos Engineering Meetup:
To spread knowledge of Chaos Engineering, we kick-started monthly Chaos Engineering Meetups to talk about all things Chaos Engineering and to move beyond the scope of LitmusChaos, giving the community an opportunity to learn the principles, concepts, and larger ideas around chaos.
In the last edition of the Chaos Engineering Meetup, we had community member Caleb Xu (Cloud Architect at HCL Software) present their user story and experience with LitmusChaos at HCL Software.
Here is the detailed agenda:
Topic – “Improving Application and Platform Resilience with Litmus at HCL Software”
Talk abstract: Cloud native software and environments present unique challenges in software engineering, testing, and delivery. In this talk, we will discuss how we are currently using Litmus at HCL Software to improve the resilience of applications and infrastructure and how we are extending the framework to simulate real-world chaos events that we see in our environments.
Check out the recording from the Chaos Engineering Meetup September Edition:
The LitmusChaos Community meetings continue as a monthly cadence call to discuss the latest updates, happenings, and questions from the community. They are hosted every 3rd Wednesday of the month. Check out the latest from our last community meeting, held on September 21st. The call started off with a quick roundup of the latest community updates by community leader Prithvi Raj, followed by a detailed discussion of the release notes by core contributor Vedant Shrotria (Senior Software Engineer, Harness).
Check out the recording of the meeting:
LitmusChaos-Community-Sync-up- September 2022| Open Source Chaos Engineering |
Shout out to Dirk Michel for featuring LitmusChaos in his blog post on Amazon EKS & FIS, where he covers the practicalities of implementing Chaos Engineering for Kubernetes workloads using LitmusChaos as one of the tools.
Check out: https://medium.com/@micheldirk/on-amazon-eks-and-fis-fd131b6284f1
Community member Akash Shrivastava has also authored a tutorial blog on setting up LitmusChaos on a Raspberry Pi cluster. Here is his detailed blog published on DEV.to: https://dev.to/litmus-chaos/setting-up-litmuschaos-on-raspberry-pi-cluster-3cm4
September was a month full of informative videos that proved useful to the community. Here are the three most popular videos that drove community engagement for the month:
September witnessed the celebration of the Argo community, with ArgoCon taking center stage. We are proud of community members Amit Das and Saranya Jena, who presented a lightning talk at the conference on how you can use Argo workflows to curate chaos scenarios with LitmusChaos.
Lightning Talk: Using Argo Workflows to Curate Chaos Engineering wi… Amit Kumar Das & Saranya Jena
Chaos Engineering for Cloud-Native Application Resiliency
The last video in the series features LitmusChaos maintainer Karthik S joining community member Saiyam Pathak on his Kubesimplify episode to give a full-fledged workshop on how to get started with LitmusChaos and attack your workloads, especially on Kubernetes.
Chaos engineering with Litmus – Complete hands-on workshop
In the end…
The LitmusChaos community continues to grow with amazing contributions (issues, suggestions, PRs) from the community and looks forward to more members joining in and contributing to the growth of the project.
Join the #litmus channel on the Kubernetes Slack to become a part of the community. Learn, Ask, and Contribute by being a part of the community.
Check out the Contributing Guide to get started with contributions.
Subscribe to the LitmusChaos YouTube Channel for the latest videos.
Follow @LitmusChaos on Twitter for the latest social updates.
Check out the LitmusChaos blogs to learn more about LitmusChaos. You can write one too by using the tag #litmuschaos on DEV.to.