Both Swift and Ceph are open source object storage systems. But despite their similarities, there are differences to consider when choosing one for OpenStack storage.
Two of the most common OpenStack storage options are Swift, which is developed as part of the OpenStack project, and Ceph, an independent open source system. Both offer object storage and can be downloaded for free. As a result, it can be difficult to choose between the two. Here are some considerations for evaluating Swift vs. Ceph for OpenStack storage.
Support can be a challenge for both Swift and Ceph — and there are two options. Organizations can add staff to handle both the underlying hardware and open source software, or buy a supported code distribution, which comes with software support and configuration expertise.
Many vendors support Swift, each offering their own OpenStack distribution. Support can be software-only or include hardware, as well, if you buy a vendor’s pre-integrated OpenStack system. Until a couple of years ago, Ceph was supported by a startup company, Inktank, but is now fully supported by Red Hat. There are plenty of vendors selling pre-integrated Ceph appliances and addressing hardware support.
Acquisition and support are on a somewhat level playing field. Ensure that after-sales drive add-ons are reasonably priced, as some major vendors ask for huge markups on drives. Generally, Ceph vendors use commercial off-the-shelf drives and allow users to purchase standard drives from distributors, while some Swift vendors are more proprietary and require you to buy their drives.
This article is translated from "Evaluate Swift vs. Ceph for OpenStack object storage" and does not represent the views of the Ceph China Community.
Compare the functionality and maturity of Swift vs. Ceph
In the Swift vs. Ceph race for OpenStack storage, it would seem that Ceph is winning — at least right now.
Ceph is a mature product, with lots of usage already. But it isn't wrinkle-free: some parts of Ceph, such as the object storage daemon (OSD) code, are still under major renovation. Ceph also supports file and block-IO access modes, and CERN has demonstrated that it scales to very large deployments.
Swift is also mature. However, large OpenStack deployments are still rare, so Swift scalability remains somewhat untested. Swift also entered the arena a couple of years after Ceph and has been playing catch-up since. As a result, some Swift developers are now focused on roadmap details that could help further differentiate Swift from Ceph.
This is currently leading to the development of proprietary Swift APIs that differ not only from Ceph's, but also from Amazon Simple Storage Service (S3). Resistance to yet another set of interfaces is building, and unless there are strong reasons for the divergence, Ceph's market share might grow.
Looking at roadmaps, the Ceph Special Interest Group is articulating a good story. Red Hat and SanDisk recently partnered to improve SSD and flash performance in Ceph, in anticipation of hard drive usage declining in the next few years. One known deficit of Ceph, however, is the intense back-end traffic that can create performance bottlenecks. Erasure coding, as opposed to replication, reduces that traffic, and a Red Hat partnership with Mellanox brought remote direct memory access (RDMA) and fast LAN links to bear, improving throughput and response time.
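The back-end traffic difference between replication and erasure coding can be illustrated with rough arithmetic. A sketch (the 8+3 code parameters below are illustrative, not taken from the article):

```python
def replication_overhead(replicas: int) -> float:
    """Raw bytes written per byte of user data with N-way replication."""
    return float(replicas)

def erasure_coding_overhead(k: int, m: int) -> float:
    """Raw bytes written per byte of user data with a k+m erasure code
    (k data chunks plus m parity chunks)."""
    return (k + m) / k

# 3-way replication pushes 3x the data across the back-end network,
# while an 8+3 erasure code pushes only ~1.4x for comparable durability.
print(replication_overhead(3))        # 3.0
print(erasure_coding_overhead(8, 3))  # 1.375
```

This is why erasure coding eases the back-end bottleneck: less raw data crosses the cluster network per user write, at the cost of extra CPU for encoding.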
Further improvement is in the works, according to Red Hat. For example, Ceph's OSD code, which drives storage devices, is being rewritten and tuned for higher performance. Ceph code is also already structured for software-defined infrastructure and can be easily virtualized and replicated. This makes Ceph suitable for hyper-converged configurations.
Data consistency in Swift vs. Ceph
Swift and Ceph differ in terms of data consistency management. Swift offers eventual consistency, where some of the replicas of a data object are written asynchronously from the first copy. This exposes the possibility of an incomplete update returning wrong data, but it works well when the replicas are in different geographical regions.
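The stale-read window that eventual consistency opens up can be shown with a toy model (this is illustrative Python, not actual Swift code; the class and method names are invented for the sketch):

```python
class Replica:
    """One copy of an object's data."""
    def __init__(self):
        self.value = None

class EventuallyConsistentStore:
    """Toy model: the first replica is written synchronously,
    the rest catch up later via an async replicator pass."""
    def __init__(self, n: int):
        self.replicas = [Replica() for _ in range(n)]

    def write(self, value):
        # Only the first replica is updated immediately.
        self.replicas[0].value = value

    def replicate(self):
        # Simulates the background replicator catching up.
        for r in self.replicas[1:]:
            r.value = self.replicas[0].value

    def read(self, i: int):
        return self.replicas[i].value

store = EventuallyConsistentStore(3)
store.write("v1")
print(store.read(2))   # None: replica 2 is stale until replication runs
store.replicate()
print(store.read(2))   # v1
```

A read routed to a replica that the background process has not yet reached returns old data, which is exactly the exposure described above; once replication completes, all copies converge.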
Ceph uses a synchronous process that requires a quorum of replicas to be written before acknowledging a write as complete. This guarantees consistency, but adds latency if a remote site has to be part of the quorum. You can overcome these issues by choosing the right replica placement or by setting controls. The same applies to Swift's exposure to incomplete writes, where the write_affinity setting can be used to force a quorum based on multiple local writes.
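In Swift, write affinity is configured on the proxy server. A minimal sketch of the relevant fragment (the region name `r1` is a placeholder for your local region):

```ini
# proxy-server.conf (fragment)
[app:proxy-server]
use = egg:swift#proxy
# Prefer nodes in region 1 for the initial object writes
write_affinity = r1
# How many local nodes to try before falling back to remote ones
write_affinity_node_count = 2 * replicas
```

With this in place, the write quorum is satisfied by local nodes first, and remote replicas are filled in asynchronously by the replicators.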
While the write quorum issue can have a huge impact on performance, in either system it can be mitigated by restricting the quorum to local storage.
To complete the OpenStack storage story, it's important to address block-IO. The OpenStack Cinder project addresses this, providing a front end for a wide variety of SAN- and LAN-based networked storage. Traditional block-IO software, such as iSCSI, is used in these systems. There is no competitive software stack to Cinder.