Harbor started in 2014 as a humble internal project meant to address a simple use case: storing images for developers leveraging containers. The cloud native landscape was wildly different then, and tools like Kubernetes were just starting to see the light of day. It took a few years for Harbor to mature to the point of being open sourced in 2016, but the project was a breath of fresh air for individuals and organizations attempting to find a solid container registry solution. The strong early growth in Harbor's user base gave us confidence that it was addressing critical use cases.
We were incredibly excited when Harbor was accepted to the Cloud Native Sandbox in the summer of 2018. Although Harbor had been open source for some years by that point, having a vendor-neutral home had an immediate impact on the project: engagement increased both in our community channels and in GitHub activity.
There were many things we began tackling immediately after joining the Sandbox, including addressing some technical debt, laying out a roadmap based solely on community feedback, and expanding the number of contributors to include folks from other organizations who have consistently worked on improving Harbor. We also started a bi-weekly community call where we hear directly from Harbor users about what's working well and what's not. Finally, we ratified a project governance model that defines how the project operates at various levels.
Given Harbor's already-large global user base across organizations small and large, proposing that the project mature into the CNCF Incubator was a natural next step. CNCF defines the process for progressing to Incubation. To be considered, a project must first demonstrate certain growth and maturity characteristics:
- Production usage: There must be users of the project that have deployed it to production environments and depend on its functionality for their business needs. We've worked closely with a number of large organizations leveraging Harbor over the past several years, so: check!
- Healthy maintainer team: There must be a healthy number of members on the team that can approve and accept new contributions to the project from the community. We have a number of maintainers that founded the project and continue to work on it full time, in addition to new maintainers joining the party: check!
- Healthy flow of contributions: The project must have a continuous and ongoing flow of new features and code being submitted and accepted into the codebase. Harbor released v1.6 in the summer of 2018, and we’re on the verge of releasing v1.7: check!
CNCF's Technical Oversight Committee (TOC) evaluated the proposal from the Harbor team and concluded that we had met all the required criteria. It is both deeply humbling and an honor to be in the company of other highly respected incubated projects like gRPC, Fluentd, Envoy, Jaeger, Rook, NATS, and more.
What’s Harbor anyway?
Harbor is an open source cloud native registry that stores, signs, and scans container images for vulnerabilities.
Harbor solves common challenges by delivering trust, compliance, performance, and interoperability. It fills a gap for organizations and applications that cannot use a public or cloud-based registry, or want a consistent experience across clouds.
Harbor addresses the following common use cases:
- On-prem container registry – organizations with the desire to host sensitive production images on-premises can do so with Harbor.
- Vulnerability scanning – organizations can scan images before they are used in production. Images with failed vulnerability scans can be blocked from being pulled.
- Image signing – images can be signed via Notary to ensure provenance.
- Role-based Access Control – integration with LDAP (and AD) to provide user- and group-level permissions.
- Image replication – production images can be replicated to disparate Harbor nodes, providing disaster recovery, load balancing and the ability for organizations to replicate images to different geos to provide a more expedient image pull.
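To make the vulnerability scanning use case concrete, here is a minimal sketch of a pull gate. It is not Harbor's implementation: the `Severity` levels, the `ScanReport` type, the `pull_allowed` function, and the choice to block unscanned images are all assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import IntEnum


class Severity(IntEnum):
    """Ordered severity levels (illustrative names, not Harbor's)."""
    NONE = 0
    LOW = 1
    MEDIUM = 2
    HIGH = 3


@dataclass
class ScanReport:
    image: str
    scanned: bool
    worst_severity: Severity


def pull_allowed(report: ScanReport, block_at: Severity) -> bool:
    """Return True if the image may be pulled under the given policy."""
    if not report.scanned:
        # Assumption: an image that was never scanned is treated as unsafe.
        return False
    # Block when the worst finding reaches the configured threshold.
    return report.worst_severity < block_at


reports = [
    ScanReport("library/web:1.0", True, Severity.LOW),
    ScanReport("library/db:2.3", True, Severity.HIGH),
    ScanReport("library/tmp:dev", False, Severity.NONE),
]

# Evaluate every image against a "block HIGH and above" policy.
decisions = {r.image: pull_allowed(r, Severity.HIGH) for r in reports}
for image, allowed in decisions.items():
    print(image, "allowed" if allowed else "blocked")
```

The threshold is a parameter rather than a constant because the post describes this behavior as configurable.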
The “Harbor stack” is composed of various third-party components, including nginx, Docker Distribution v2, Redis, and PostgreSQL. Harbor also relies on Clair for vulnerability scanning, and Notary for image signing.
The Harbor components, highlighted in blue, are the heart of Harbor and are responsible for most of the heavy lifting in Harbor:
- Core Services provides the API and UI. It intercepts docker pushes / pulls to enforce role-based access control and to prevent vulnerable images from being pulled and subsequently used in production (all of this is configurable).
- Admin service is being phased out for v1.7, with its functionality being merged into the core service.
- Job Service is responsible for running background tasks (e.g., replication, one-shot or recurring vulnerability scans, etc.). Jobs are submitted by the core service and run in the job service component.
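The division of labor between the core service and the job service can be sketched as a simple producer and consumer. This is a toy model, not Harbor's code: the `Job`, `JobService`, and `CoreService` classes, their method names, and the target name `harbor-eu` are hypothetical, and a real job service would run work asynchronously rather than in explicit passes.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Callable, Deque, List


@dataclass
class Job:
    name: str
    run: Callable[[], str]
    recurring: bool = False


@dataclass
class JobService:
    queue: Deque[Job] = field(default_factory=deque)
    log: List[str] = field(default_factory=list)

    def submit(self, job: Job) -> None:
        self.queue.append(job)

    def run_pending(self) -> None:
        """Run each job that is queued at the time of the call, once."""
        for _ in range(len(self.queue)):
            job = self.queue.popleft()
            self.log.append(job.run())
            if job.recurring:
                # Recurring jobs go back on the queue for the next pass.
                self.queue.append(job)


class CoreService:
    """Accepts requests and hands background work to the job service."""

    def __init__(self, jobs: JobService) -> None:
        self.jobs = jobs

    def request_replication(self, image: str, target: str) -> None:
        self.jobs.submit(
            Job("replicate", lambda: f"replicated {image} to {target}")
        )

    def schedule_scan(self, image: str) -> None:
        self.jobs.submit(
            Job("scan", lambda: f"scanned {image}", recurring=True)
        )


jobs = JobService()
core = CoreService(jobs)
core.request_replication("library/web:1.0", "harbor-eu")  # one-shot job
core.schedule_scan("library/web:1.0")                      # recurring job
jobs.run_pending()
jobs.run_pending()
print(jobs.log)
```

After two passes the one-shot replication has run once, while the recurring scan has run twice and is still queued.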
Currently Harbor is packaged as both a docker-compose service definition and a Helm chart.
Want to learn more?
The best ways to learn about Harbor are:
- Our website: https://goharbor.io/
- Harbor’s CNCF webinar: https://www.cncf.io/event/webinar-harbor/
- Slides: https://drive.google.com/file/d/1F6nvZhtw6-bgwdlySXLq3OIvlC6f6Vvb/view?usp=sharing
Community stats and graphs
Harbor has continued an upward trajectory of community growth through 2018. The stats below visualize the consistent growth pre- and post-acceptance into the Cloud Native Sandbox:
Where we are
Harbor is both mature and production-ready. We know of dozens of large organizations leveraging Harbor in production, including at least one serving millions of container images to tens of thousands of compute nodes. The various components that make up Harbor's overall architecture are battle-tested in real-world deployments.
Harbor is API driven and is being used in custom SaaS and on-prem products by various vendors and companies. It’s easy to integrate Harbor in your environment, whether a customer-facing SaaS or an internal development pipeline.
The Harbor team strives to release quarterly. We're currently working on our eighth major release, v1.7, due out soon. Over the last two releases alone we've made marked strides toward our long-term goals:
- Native support of Helm charts
- Initial support for deploying Harbor via Helm chart
- Refactoring of our persistence layer, now relying solely on PostgreSQL and Redis – this will help us achieve our high-availability goals
- Added labels and replication filtering based on labels
- Improvements to RBAC, including LDAP group-based access control
- Architecture simplification (i.e., collapsing admin server component responsibilities into core component)
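Replication filtering based on labels, mentioned in the list above, can be pictured as a subset test. The sketch below is only an illustration of that idea: the `Image` type, the `select_for_replication` function, and the label names are assumptions, and Harbor's actual replication rules may work differently.

```python
from dataclasses import dataclass, field
from typing import FrozenSet, Iterable, List


@dataclass(frozen=True)
class Image:
    name: str
    labels: FrozenSet[str] = field(default_factory=frozenset)


def select_for_replication(
    images: Iterable[Image], required_labels: Iterable[str]
) -> List[Image]:
    """Keep only images that carry every required label."""
    required = frozenset(required_labels)
    return [img for img in images if required <= img.labels]


catalog = [
    Image("library/web:1.0", frozenset({"production", "signed"})),
    Image("library/web:1.1-rc", frozenset({"staging"})),
    Image("library/db:2.3", frozenset({"production"})),
    Image("library/tmp:dev"),
]

# Replicate only images labeled for production use.
selected = select_for_replication(catalog, ["production"])
print([img.name for img in selected])
```

An empty filter selects everything here, since the empty set is a subset of any label set.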
Where we’re going
This is the fun part. 🙂
Harbor is a vibrant community of users – those who use Harbor and publicly share their experiences, the individuals who report and respond to issues, the folks who hang around in our Slack community, and those who spend time on GitHub improving our code and documentation. We're all incredibly fortunate to have such rich and exciting ideas proposed via GitHub issues on a regular basis.
We’re still working on our v1.8 roadmap, but here are some major features we’re considering and might land at some point in the future (timing to be determined, and contributions are welcome!):
- Quotas – system- and project-level quotas; networking quotas; bandwidth quotas; user quotas; etc.
- Replication – the ability to replicate to non-Harbor nodes.
- Image proxying and caching – a docker pull would proxy a request to, say, Docker Hub, then scan the image before providing it to the developer. Alternatively, pre-cache images and block images that do not meet vulnerability requirements.
- One-click upgrades and rollbacks of Harbor.
- Clustering – Harbor nodes should cluster, replicate metadata (users, RBAC and system configuration, vulnerability scan results, etc.). Support for wide-area clustering is a stretch goal.
- BitTorrent-backed storage – images are transparently transferred via BT protocol.
- Improved multi-tenancy – provide additional multi-tenancy construct (system → tenant → project)
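As a thought experiment, the image proxying and caching idea from the list above might look like the following sketch. It is not a design for the feature, which the post describes as only under consideration: the `ProxyCache` class, its injected `fetch_upstream` and `is_safe` callables, and the fake upstream registry are all hypothetical.

```python
from typing import Callable, Dict, Optional


class ProxyCache:
    """Pull-through cache that scans an image before serving it."""

    def __init__(
        self,
        fetch_upstream: Callable[[str], bytes],
        is_safe: Callable[[bytes], bool],
    ) -> None:
        self._fetch = fetch_upstream
        self._is_safe = is_safe
        self._cache: Dict[str, bytes] = {}
        self.upstream_calls = 0

    def pull(self, image: str) -> Optional[bytes]:
        # Serve from the local cache when the image was already vetted.
        if image in self._cache:
            return self._cache[image]
        self.upstream_calls += 1
        blob = self._fetch(image)
        if not self._is_safe(blob):
            # Failed the scan: do not cache it and do not serve it.
            return None
        self._cache[image] = blob
        return blob


# Stand-in for an upstream registry such as Docker Hub.
fake_hub = {
    "library/web:1.0": b"clean-layers",
    "library/old:0.1": b"vulnerable-layers",
}

proxy = ProxyCache(
    fetch_upstream=fake_hub.__getitem__,
    is_safe=lambda blob: b"vulnerable" not in blob,  # toy scanner
)

first = proxy.pull("library/web:1.0")    # fetched upstream, scanned, cached
second = proxy.pull("library/web:1.0")   # served from the cache
blocked = proxy.pull("library/old:0.1")  # fetched upstream, fails the scan
print(first == second, blocked, proxy.upstream_calls)
```

Passing the fetcher and scanner in as callables keeps the sketch independent of any particular registry client or scanner.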
Please feel free to share your wishlist of features via GitHub; just open an issue and share your thoughts. We keep track of the items the community wants and will prioritize them based on demand.
How to get involved
Getting involved in Harbor is easy. Step 1: don’t be shy. We’re a friendly bunch of individuals working on an exciting open source project.
The lowest barrier to entry is joining us on Slack. Ask questions, give feedback, request help, share your ideas on how to improve the project, or just say hello!
We love GitHub issues and pull requests. If you think something can be improved, let us know. If you want to spend a few minutes fixing something yourself – docs, code, error messages, you name it – please feel free to open a PR. We’ve previously discussed how to contribute, so don’t be shy. If you need help with the PR process, the quickest way to get an answer is probably to ping us on Slack.
See you on GitHub!
By James Zabala 詹姆斯 扎巴拉
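Harbor began in 2014 as an internally initiated project meant to solve a simple problem: helping developers store container images. The cloud native landscape was completely different then, and tools like Kubernetes were only beginning to attract attention. By 2016 Harbor had matured and was open sourced, giving individuals and organizations trying to solve similar problems a new option. Continued growth in the user base also convinced us that Harbor was solving critical problems.
We were very excited when Harbor was accepted by CNCF as a “Sandbox” project in the summer of 2018. Although Harbor had already been open source for a few years, having a vendor-neutral “home” immediately increased community activity and user engagement on GitHub.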
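- Healthy maintainer team: There must be enough team members who can approve and accept new contributions to the project from the community. In addition to the newly joined maintainers, the project's founding team members still work on it full time. (Check!)
CNCF's Technical Oversight Committee (TOC) has evaluated the application submitted by the Harbor team and concluded that we have met all the conditions. We are honored to join highly regarded projects such as gRPC, Fluentd, Envoy, Jaeger, Rook, and NATS.
- Core Services: provides the API and UI. It parses docker push/pull requests to enforce role-based access control and to prevent vulnerable images from being pulled and subsequently used in production (all of this is configurable).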
- Harbor home page: https://goharbor.io/
- Harbor CNCF webinar: https://www.cncf.io/event/webinar-harbor/
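The Harbor team aims to release quarterly. We are working on our eighth major release, v1.7, and will ship it soon. In the last two releases alone we have made significant progress toward Harbor's long-term goals:
- Added support for Helm charts
- Support for deploying Harbor via Helm chart
- Simplified architecture (merging the admin server component's responsibilities into the core component)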
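- Image proxying and caching: a docker pull request would be proxied so that, for example, an image pulled from Docker Hub could be scanned before it is provided to developers. Images could also be pre-cached, and images that do not meet vulnerability requirements could be blocked from download.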