mirror of
https://github.com/torvalds/linux.git
synced 2025-04-12 15:25:13 +00:00

The data_err=abort mount option was originally introduced to address
users' worries about data corruption spreading unnoticed. With direct
writes, we can rely on return values to confirm successful writes to
disk. With buffered writes, a successful return only means the data has
been written to memory. Users have no way of knowing whether the data
has actually been written to disk unless they use fsync (which impacts
performance and can sometimes miss errors).

The current data_err=abort implementation relies on the ordered data
list, but past changes have inadvertently altered its behavior. For
example, if an extent is unwritten, we do not add the inode to the
ordered data list, so jbd2 will not wait for the data write-back of that
inode to complete and will not check for errors in the inode mapping.
Moreover, the checks performed by jbd2 can also miss errors.

Now, all buffered writes eventually call ext4_end_bio(), where I/O
errors are checked. We can therefore check for the data_err=abort mode
at this point and abort the journal in a kworker (the abort is deferred
because of the interrupt context).

So, when data_err=abort is enabled and an I/O error is detected in
ext4_end_bio(), the journal is aborted in ext4_end_io_end(), to make
users who are concerned about the contents of the file happy.

Suggested-by: Jan Kara <jack@suse.cz>
Link: https://patch.msgid.link/c7ab26f3-85ad-4b31-b132-0afb0e07bf79@huawei.com
Signed-off-by: Baokun Li <libaokun1@huawei.com>
Reviewed-by: Zhang Yi <yi.zhang@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://patch.msgid.link/20250122110533.4116662-6-libaokun@huaweicloud.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
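
For readers unfamiliar with the pattern, the control flow described in
the message above can be modelled in plain userspace C. This is only a
rough sketch, not the kernel code: every name prefixed with model_ is
hypothetical, the linked list merely stands in for a workqueue, and the
real ext4 structures and locking are omitted. The point it illustrates
is that the completion handler only records the error and defers, while
the worker decides whether to abort the journal.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these are the kernel's real types. */
struct model_journal {
	bool aborted;
};

struct model_sb {
	bool data_err_abort;		/* models the data_err=abort option */
	struct model_journal journal;
};

struct model_io_end {
	struct model_sb *sb;
	int error;			/* first I/O error seen, 0 if none */
	struct model_io_end *next;	/* link in the deferred-work list */
};

/* Deferred-work list, standing in for a workqueue (assumption). */
static struct model_io_end *pending_head;
static struct model_io_end **pending_tail = &pending_head;

/*
 * Completion path, modelling interrupt context: record the error and
 * queue the io_end. Nothing that could sleep is done here.
 */
static void model_end_bio(struct model_io_end *io, int bio_error)
{
	assert(io != NULL && io->sb != NULL);
	if (bio_error && !io->error)
		io->error = bio_error;
	io->next = NULL;
	*pending_tail = io;
	pending_tail = &io->next;
}

/*
 * Worker path, modelling process context: abort the journal only if an
 * error was recorded and data_err=abort is enabled.
 */
static void model_end_io_end(struct model_io_end *io)
{
	if (io->error && io->sb->data_err_abort)
		io->sb->journal.aborted = true;
}

/* Drain the deferred list; returns the number of items processed. */
static int model_run_worker(void)
{
	int processed = 0;

	while (pending_head) {
		struct model_io_end *io = pending_head;

		pending_head = io->next;
		io->next = NULL;
		model_end_io_end(io);
		processed++;
	}
	pending_tail = &pending_head;
	return processed;
}
```

In this model, calling model_end_bio() with a nonzero error (for
example -5, the value of -EIO on Linux) leaves the journal untouched
until model_run_worker() runs, and the journal is marked aborted only
when data_err_abort is set.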