半兽人 posted on: 2015-03-10   Last updated: 2019-11-09 14:17:12

We recommend using multiple drives to get good throughput, and not sharing the drives used for Kafka data with application logs or other OS filesystem activity, to ensure good latency. As of 0.8 you can either RAID these drives together into a single volume, or format and mount each drive as its own directory. Since Kafka has replication, the redundancy provided by RAID can also be provided at the application level. This choice has several tradeoffs.

If you configure multiple data directories, partitions will be assigned round-robin to the data directories. Each partition will reside entirely in one of the data directories. If data is not well balanced among partitions, this can lead to load imbalance between disks.
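
To make the imbalance concrete, here is a minimal sketch of round-robin placement as described above. It is a simplified model, not Kafka's actual code; the function name `assign_round_robin`, the directory paths, and the partition sizes are all hypothetical.

```python
from itertools import cycle

def assign_round_robin(partition_sizes, data_dirs):
    """Place each partition in the next directory in turn.

    Returns (placement, usage): partition -> directory, and
    directory -> total size of the partitions placed there.
    """
    usage = {d: 0 for d in data_dirs}
    placement = {}
    next_dir = cycle(data_dirs)
    for name, size in partition_sizes.items():
        d = next(next_dir)
        placement[name] = d   # a partition lives entirely in one directory
        usage[d] += size
    return placement, usage

# Hypothetical partition sizes in GB; two of the four partitions are much larger.
sizes = {"orders-0": 100, "orders-1": 5, "clicks-0": 80, "clicks-1": 5}
placement, usage = assign_round_robin(sizes, ["/data/kafka-1", "/data/kafka-2"])
print(usage)  # {'/data/kafka-1': 180, '/data/kafka-2': 10}
```

Both directories receive two partitions, yet one disk ends up holding almost all of the data, because the assignment counts partitions and ignores their size.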

RAID can potentially do better at balancing load between disks (although it doesn't always seem to) because it balances load at a lower level. The primary downside of RAID is that it is usually a big performance hit for write throughput and reduces the available disk space.
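
As a rough illustration of the disk space cost, the sketch below applies the standard capacity formulas for a few common RAID levels. The function name `usable_capacity_tb` and the 6 x 4 TB layout are assumptions made for the example, and real arrays lose a little more to metadata and filesystem overhead.

```python
def usable_capacity_tb(level, n_disks, disk_tb):
    """Usable capacity in TB for equally sized disks (textbook formulas)."""
    if level == "jbod":      # each drive mounted as its own data directory
        return n_disks * disk_tb
    if level == "raid10":    # mirrored pairs: half of the raw capacity
        return n_disks * disk_tb / 2
    if level == "raid5":     # one disk's worth of parity
        return (n_disks - 1) * disk_tb
    if level == "raid6":     # two disks' worth of parity
        return (n_disks - 2) * disk_tb
    raise ValueError(f"unsupported level: {level}")

# Hypothetical broker with six 4 TB drives.
for level in ("jbod", "raid10", "raid5", "raid6"):
    print(level, usable_capacity_tb(level, 6, 4))
```

With separate data directories all of the raw capacity holds Kafka data, while the redundant RAID levels give up between one disk and half of the array.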

Another potential benefit of RAID is the ability to tolerate disk failures. However, our experience has been that rebuilding the RAID array is so I/O intensive that it effectively disables the server, so this does not provide much real availability improvement.


I'm CxY, 1 year ago

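Another benefit of RAID is the ability to tolerate disk failures. However, our experience is that rebuilding a RAID array is so I/O intensive that it effectively disables the server, so it does not provide much real improvement in availability.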


半兽人 -> I'm CxY, 1 year ago