commit d57287729e229188e7d07ef0117fe927664e08cb Author: Greg Kroah-Hartman Date: Thu Jan 12 11:59:20 2023 +0100 Linux 5.15.87 Link: https://lore.kernel.org/r/20230110180031.620810905@linuxfoundation.org Tested-by: Florian Fainelli Tested-by: Shuah Khan Tested-by: Linux Kernel Functional Testing Tested-by: Jon Hunter Tested-by: Sudip Mukherjee Tested-by: Bagas Sanjaya Tested-by: Allen Pais Tested-by: Guenter Roeck Tested-by: Kelsey Steele Tested-by: Ron Economos Signed-off-by: Greg Kroah-Hartman commit 24186c6822882aafe97925f07ac96726b7ccbfd2 Author: Jocelyn Falempe Date: Thu Oct 13 15:28:10 2022 +0200 drm/mgag200: Fix PLL setup for G200_SE_A rev >=4 commit b389286d0234e1edbaf62ed8bc0892a568c33662 upstream. For G200_SE_A, PLL M setting is wrong, which leads to blank screen, or "signal out of range" on VGA display. previous code had "m |= 0x80" which was changed to m |= ((pixpllcn & BIT(8)) >> 1); Tested on G200_SE_A rev 42 This line of code was moved to another file with commit 877507bb954e ("drm/mgag200: Provide per-device callbacks for PIXPLLC") but can be easily backported before this commit. v2: * put BIT(7) First to respect MSB-to-LSB (Thomas) * Add a comment to explain that this bit must be set (Thomas) Fixes: 2dd040946ecf ("drm/mgag200: Store values (not bits) in struct mgag200_pll_values") Cc: stable@vger.kernel.org Signed-off-by: Jocelyn Falempe Reviewed-by: Thomas Zimmermann Link: https://patchwork.freedesktop.org/patch/msgid/20221013132810.521945-1-jfalempe@redhat.com Signed-off-by: Greg Kroah-Hartman commit e326ee018a2486f885b3345ee992805a14bf12cc Author: Harshit Mogalapalli Date: Tue Jan 10 08:46:47 2023 -0800 io_uring: Fix unsigned 'res' comparison with zero in io_fixup_rw_res() Smatch warning: io_fixup_rw_res() warn: unsigned 'res' is never less than zero. Change type of 'res' from unsigned to long. Fixes: d6b7efc722a2 ("io_uring/rw: fix error'ed retry return values") Signed-off-by: Harshit Mogalapalli Signed-off-by: Greg Kroah-Hartman commit b2b6eefab43d68f02513c321900dfb56364ac1bd Author: Ard Biesheuvel Date: Thu Oct 20 10:39:10 2022 +0200 efi: random: combine bootloader provided RNG seed with RNG protocol output commit 196dff2712ca5a2e651977bb2fe6b05474111a83 upstream. Instead of blindly creating the EFI random seed configuration table if the RNG protocol is implemented and works, check whether such a EFI configuration table was provided by an earlier boot stage and if so, concatenate the existing and the new seeds, leaving it up to the core code to mix it in and credit it the way it sees fit. This can be used for, e.g., systemd-boot, to pass an additional seed to Linux in a way that can be consumed by the kernel very early. In that case, the following definitions should be used to pass the seed to the EFI stub: struct linux_efi_random_seed { u32 size; // of the 'seed' array in bytes u8 seed[]; }; The memory for the struct must be allocated as EFI_ACPI_RECLAIM_MEMORY pool memory, and the address of the struct in memory should be installed as a EFI configuration table using the following GUID: LINUX_EFI_RANDOM_SEED_TABLE_GUID 1ce1e5bc-7ceb-42f2-81e5-8aadf180f57b Note that doing so is safe even on kernels that were built without this patch applied, but the seed will simply be overwritten with a seed derived from the EFI RNG protocol, if available. The recommended seed size is 32 bytes, and seeds larger than 512 bytes are considered corrupted and ignored entirely. In order to preserve forward secrecy, seeds from previous bootloaders are memzero'd out, and in order to preserve memory, those older seeds are also freed from memory. Freeing from memory without first memzeroing is not safe to do, as it's possible that nothing else will ever overwrite those pages used by EFI. Reviewed-by: Jason A. Donenfeld [ardb: incorporate Jason's followup changes to extend the maximum seed size on the consumer end, memzero() it and drop a needless printk] Signed-off-by: Ard Biesheuvel Signed-off-by: Jason A. Donenfeld Signed-off-by: Greg Kroah-Hartman commit 99c0759495a048193aa872b669efb1f529ea597a Author: Jan Kara Date: Thu Sep 8 11:10:32 2022 +0200 mbcache: Avoid nesting of cache->c_list_lock under bit locks commit 5fc4cbd9fde5d4630494fd6ffc884148fb618087 upstream. Commit 307af6c87937 ("mbcache: automatically delete entries from cache on freeing") started nesting cache->c_list_lock under the bit locks protecting hash buckets of the mbcache hash table in mb_cache_entry_create(). This causes problems for real-time kernels because there spinlocks are sleeping locks while bitlocks stay atomic. Luckily the nesting is easy to avoid by holding entry reference until the entry is added to the LRU list. This makes sure we cannot race with entry deletion. Cc: stable@kernel.org Fixes: 307af6c87937 ("mbcache: automatically delete entries from cache on freeing") Reported-by: Mike Galbraith Signed-off-by: Jan Kara Link: https://lore.kernel.org/r/20220908091032.10513-1-jack@suse.cz Signed-off-by: Theodore Ts'o Signed-off-by: Greg Kroah-Hartman commit d50d6c193adb98657070951b892bde665c50b2b7 Author: Jie Wang Date: Mon Nov 14 16:20:47 2022 +0800 net: hns3: fix return value check bug of rx copybreak commit 29df7c695ed67a8fa32bb7805bad8fe2a76c1f88 upstream. The refactoring of rx copybreak modifies the original return logic, which will make this feature unavailable. So this patch fixes the return logic of rx copybreak. Fixes: e74a726da2c4 ("net: hns3: refactor hns3_nic_reuse_page()") Fixes: 99f6b5fb5f63 ("net: hns3: use bounce buffer when rx page can not be reused") Signed-off-by: Jie Wang Signed-off-by: Hao Lan Signed-off-by: Paolo Abeni Signed-off-by: Greg Kroah-Hartman commit d4e6a13eb9a3361e2aaa17558687a2bd8b26d97c Author: Qu Wenruo Date: Tue Oct 18 09:56:38 2022 +0800 btrfs: make thaw time super block check to also verify checksum commit 3d17adea74a56a4965f7a603d8ed8c66bb9356d9 upstream. Previous commit a05d3c915314 ("btrfs: check superblock to ensure the fs was not modified at thaw time") only checks the content of the super block, but it doesn't really check if the on-disk super block has a matching checksum. This patch will add the checksum verification to thaw time superblock verification. This involves the following extra changes: - Export btrfs_check_super_csum() As we need to call it in super.c. - Change the argument list of btrfs_check_super_csum() Instead of passing a char *, directly pass struct btrfs_super_block * pointer. - Verify that our checksum type didn't change before checking the checksum value, like it's done at mount time Fixes: a05d3c915314 ("btrfs: check superblock to ensure the fs was not modified at thaw time") Reviewed-by: Johannes Thumshirn Signed-off-by: Qu Wenruo Signed-off-by: David Sterba Signed-off-by: Greg Kroah-Hartman commit 70a1dccd0e58155b4e2a2fb738d77c65d32662d3 Author: Muhammad Usama Anjum Date: Wed Jan 19 15:15:22 2022 +0500 selftests: set the BUILD variable to absolute path commit 5ad51ab618de5d05f4e692ebabeb6fe6289aaa57 upstream. The build of kselftests fails if relative path is specified through KBUILD_OUTPUT or O= method. BUILD variable is used to determine the path of the output objects. When make is run from other directories with relative paths, the exact path of the build objects is ambiguous and build fails. make[1]: Entering directory '/home/usama/repos/kernel/linux_mainline2/tools/testing/selftests/alsa' gcc mixer-test.c -L/usr/lib/x86_64-linux-gnu -lasound -o build/kselftest/alsa/mixer-test /usr/bin/ld: cannot open output file build/kselftest/alsa/mixer-test Set the BUILD variable to the absolute path of the output directory. Make the logic readable and easy to follow. Use spaces instead of tabs for indentation as if with tab indentation is considered recipe in make. Signed-off-by: Muhammad Usama Anjum Signed-off-by: Shuah Khan Signed-off-by: Tyler Hicks (Microsoft) Signed-off-by: Greg Kroah-Hartman commit 58fef3ebc83cfaeffa1f24182f3394248c0da907 Author: Eric Biggers Date: Tue Nov 1 22:33:12 2022 -0700 ext4: don't allow journal inode to have encrypt flag commit 105c78e12468413e426625831faa7db4284e1fec upstream. Mounting a filesystem whose journal inode has the encrypt flag causes a NULL dereference in fscrypt_limit_io_blocks() when the 'inlinecrypt' mount option is used. The problem is that when jbd2_journal_init_inode() calls bmap(), it eventually finds its way into ext4_iomap_begin(), which calls fscrypt_limit_io_blocks(). fscrypt_limit_io_blocks() requires that if the inode is encrypted, then its encryption key must already be set up. That's not the case here, since the journal inode is never "opened" like a normal file would be. Hence the crash. A reproducer is: mkfs.ext4 -F /dev/vdb debugfs -w /dev/vdb -R "set_inode_field <8> flags 0x80808" mount /dev/vdb /mnt -o inlinecrypt To fix this, make ext4 consider journal inodes with the encrypt flag to be invalid. (Note, maybe other flags should be rejected on the journal inode too. For now, this is just the minimal fix for the above issue.) I've marked this as fixing the commit that introduced the call to fscrypt_limit_io_blocks(), since that's what made an actual crash start being possible. But this fix could be applied to any version of ext4 that supports the encrypt feature. Reported-by: syzbot+ba9dac45bc76c490b7c3@syzkaller.appspotmail.com Fixes: 38ea50daa7a4 ("ext4: support direct I/O with fscrypt using blk-crypto") Cc: stable@vger.kernel.org Signed-off-by: Eric Biggers Link: https://lore.kernel.org/r/20221102053312.189962-1-ebiggers@kernel.org Signed-off-by: Theodore Ts'o Cc: stable@kernel.org Signed-off-by: Greg Kroah-Hartman commit bd5dc96fea4edd16d2e22f41b4dd50a4cfbeb919 Author: Matthieu Baerts Date: Fri Dec 9 16:28:10 2022 -0800 mptcp: use proper req destructor for IPv6 commit d3295fee3c756ece33ac0d935e172e68c0a4161b upstream. Before, only the destructor from TCP request sock in IPv4 was called even if the subflow was IPv6. It is important to use the right destructor to avoid memory leaks with some advanced IPv6 features, e.g. when the request socks contain specific IPv6 options. Fixes: 79c0949e9a09 ("mptcp: Add key generation and token tree") Reviewed-by: Mat Martineau Cc: stable@vger.kernel.org Signed-off-by: Matthieu Baerts Signed-off-by: Mat Martineau Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman commit 78bd6ab52c03894a63692e7e09c067842a16af17 Author: Matthieu Baerts Date: Fri Dec 9 16:28:09 2022 -0800 mptcp: dedicated request sock for subflow in v6 commit 34b21d1ddc8ace77a8fa35c1b1e06377209e0dae upstream. tcp_request_sock_ops structure is specific to IPv4. It should then not be used with MPTCP subflows on top of IPv6. For example, it contains the 'family' field, initialised to AF_INET. This 'family' field is used by TCP FastOpen code to generate the cookie but also by TCP Metrics, SELinux and SYN Cookies. Using the wrong family will not lead to crashes but displaying/using/checking wrong things. Note that 'send_reset' callback from request_sock_ops structure is used in some error paths. It is then also important to use the correct one for IPv4 or IPv6. The slab name can also be different in IPv4 and IPv6, it will be used when printing some log messages. The slab pointer will anyway be the same because the object size is the same for both v4 and v6. A BUILD_BUG_ON() has also been added to make sure this size is the same. Fixes: cec37a6e41aa ("mptcp: Handle MP_CAPABLE options for outgoing connections") Reviewed-by: Mat Martineau Cc: stable@vger.kernel.org Signed-off-by: Matthieu Baerts Signed-off-by: Mat Martineau Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman commit 6e9c1aef3e32a97100de151ab7fa55270d9e01cd Author: Mario Limonciello Date: Thu Jan 5 14:10:50 2023 -0600 Revert "ACPI: PM: Add support for upcoming AMD uPEP HID AMDI007" A number of AMD based Rembrandt laptops are not working properly in suspend/resume. This has been root caused to be from the BIOS implementation not populating code for the AMD GUID in uPEP, but instead only the Microsoft one. In later kernels this has been fixed by using the Microsoft GUID instead. The following series of patches has fixed it in newer kernels: commit ed470febf837 ("ACPI: PM: s2idle: Add support for upcoming AMD uPEP HID AMDI008") commit 1a2dcab517cb ("ACPI: PM: s2idle: Use LPS0 idle if ACPI_FADT_LOW_POWER_S0 is unset") commit 100a57379380 ("ACPI: x86: s2idle: Move _HID handling for AMD systems into structures") commit fd894f05cf30 ("ACPI: x86: s2idle: If a new AMD _HID is missing assume Rembrandt") commit a0bc002393d4 ("ACPI: x86: s2idle: Add module parameter to prefer Microsoft GUID") commit d0f61e89f08d ("ACPI: x86: s2idle: Add a quirk for ASUS TUF Gaming A17 FA707RE") commit ddeea2c3cb88 ("ACPI: x86: s2idle: Add a quirk for ASUS ROG Zephyrus G14") commit 888ca9c7955e ("ACPI: x86: s2idle: Add a quirk for Lenovo Slim 7 Pro 14ARH7") commit 631b54519e8e ("ACPI: x86: s2idle: Add a quirk for ASUSTeK COMPUTER INC. ROG Flow X13") commit 39f81776c680 ("ACPI: x86: s2idle: Fix a NULL pointer dereference") commit 54bd1e548701 ("ACPI: x86: s2idle: Add another ID to s2idle_dmi_table") commit 577821f756cf ("ACPI: x86: s2idle: Force AMD GUID/_REV 2 on HP Elitebook 865") commit e6d180a35bc0 ("ACPI: x86: s2idle: Stop using AMD specific codepath for Rembrandt+") This is needlessly complex for 5.15.y though. To accomplish the same effective result revert commit f0c6225531e4 ("ACPI: PM: Add support for upcoming AMD uPEP HID AMDI007") instead. Link: https://lore.kernel.org/stable/MN0PR12MB61015DB3D6EDBFD841157918E2F59@MN0PR12MB6101.namprd12.prod.outlook.com/ Signed-off-by: Mario Limonciello Signed-off-by: Greg Kroah-Hartman commit e32f867b37da7902685c9a106bef819506aa1a92 Author: William Liu Date: Fri Dec 30 13:03:15 2022 +0900 ksmbd: check nt_len to be at least CIFS_ENCPWD_SIZE in ksmbd_decode_ntlmssp_auth_blob commit 797805d81baa814f76cf7bdab35f86408a79d707 upstream. "nt_len - CIFS_ENCPWD_SIZE" is passed directly from ksmbd_decode_ntlmssp_auth_blob to ksmbd_auth_ntlmv2. Malicious requests can set nt_len to less than CIFS_ENCPWD_SIZE, which results in a negative number (or large unsigned value) used for a subsequent memcpy in ksmbd_auth_ntlvm2 and can cause a panic. Fixes: e2f34481b24d ("cifsd: add server-side procedures for SMB3") Cc: stable@vger.kernel.org Signed-off-by: William Liu Signed-off-by: Hrvoje Mišetić Acked-by: Namjae Jeon Signed-off-by: Steve French Signed-off-by: Greg Kroah-Hartman commit 4136f1ac1ecd20ae100e247082a282715ad0e6de Author: Namjae Jeon Date: Sat Dec 31 17:32:31 2022 +0900 ksmbd: fix infinite loop in ksmbd_conn_handler_loop() commit 83dcedd5540d4ac61376ddff5362f7d9f866a6ec upstream. If kernel_recvmsg() return -EAGAIN in ksmbd_tcp_readv() and go round again, It will cause infinite loop issue. And all threads from next connections would be doing that. This patch add max retry count(2) to avoid it. kernel_recvmsg() will wait during 7sec timeout and try to retry two time if -EAGAIN is returned. And add flags of kvmalloc to __GFP_NOWARN and __GFP_NORETRY to disconnect immediately without retrying on memory alloation failure. Fixes: 0626e6641f6b ("cifsd: add server handler for central processing and tranport layers") Cc: stable@vger.kernel.org Reported-by: zdi-disclosures@trendmicro.com # ZDI-CAN-18259 Reviewed-by: Sergey Senozhatsky Signed-off-by: Namjae Jeon Signed-off-by: Steve French Signed-off-by: Greg Kroah-Hartman commit f10defb0be6ac42fb6a97b45920d32da6bd6fde8 Author: Linus Torvalds Date: Wed Jan 4 11:06:28 2023 -0800 hfs/hfsplus: avoid WARN_ON() for sanity check, use proper error handling commit cb7a95af78d29442b8294683eca4897544b8ef46 upstream. Commit 55d1cbbbb29e ("hfs/hfsplus: use WARN_ON for sanity check") fixed a build warning by turning a comment into a WARN_ON(), but it turns out that syzbot then complains because it can trigger said warning with a corrupted hfs image. The warning actually does warn about a bad situation, but we are much better off just handling it as the error it is. So rather than warn about us doing bad things, stop doing the bad things and return -EIO. While at it, also fix a memory leak that was introduced by an earlier fix for a similar syzbot warning situation, and add a check for one case that historically wasn't handled at all (ie neither comment nor subsequent WARN_ON). Reported-by: syzbot+7bb7cd3595533513a9e7@syzkaller.appspotmail.com Fixes: 55d1cbbbb29e ("hfs/hfsplus: use WARN_ON for sanity check") Fixes: 8d824e69d9f3 ("hfs: fix OOB Read in __hfs_brec_find") Link: https://lore.kernel.org/lkml/000000000000dbce4e05f170f289@google.com/ Tested-by: Michael Schmitz Cc: Arnd Bergmann Cc: Matthew Wilcox Cc: Viacheslav Dubeyko Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit 48d9e2e6de01ed35e965eb549758a837c07b601d Author: Arnd Bergmann Date: Mon Nov 8 18:35:04 2021 -0800 hfs/hfsplus: use WARN_ON for sanity check commit 55d1cbbbb29e6656c662ee8f73ba1fc4777532eb upstream. gcc warns about a couple of instances in which a sanity check exists but the author wasn't sure how to react to it failing, which makes it look like a possible bug: fs/hfsplus/inode.c: In function 'hfsplus_cat_read_inode': fs/hfsplus/inode.c:503:37: error: suggest braces around empty body in an 'if' statement [-Werror=empty-body] 503 | /* panic? */; | ^ fs/hfsplus/inode.c:524:37: error: suggest braces around empty body in an 'if' statement [-Werror=empty-body] 524 | /* panic? */; | ^ fs/hfsplus/inode.c: In function 'hfsplus_cat_write_inode': fs/hfsplus/inode.c:582:37: error: suggest braces around empty body in an 'if' statement [-Werror=empty-body] 582 | /* panic? */; | ^ fs/hfsplus/inode.c:608:37: error: suggest braces around empty body in an 'if' statement [-Werror=empty-body] 608 | /* panic? */; | ^ fs/hfs/inode.c: In function 'hfs_write_inode': fs/hfs/inode.c:464:37: error: suggest braces around empty body in an 'if' statement [-Werror=empty-body] 464 | /* panic? */; | ^ fs/hfs/inode.c:485:37: error: suggest braces around empty body in an 'if' statement [-Werror=empty-body] 485 | /* panic? */; | ^ panic() is probably not the correct choice here, but a WARN_ON seems appropriate and avoids the compile-time warning. Link: https://lkml.kernel.org/r/20210927102149.1809384-1-arnd@kernel.org Link: https://lore.kernel.org/all/20210322223249.2632268-1-arnd@kernel.org/ Signed-off-by: Arnd Bergmann Reviewed-by: Christian Brauner Cc: Alexander Viro Cc: Christian Brauner Cc: Greg Kroah-Hartman Cc: Jan Kara Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit f5a9bbf962e2c4b1d9addbfaf16d7ffcc2f63bde Author: Zhenyu Wang Date: Mon Dec 19 22:03:57 2022 +0800 drm/i915/gvt: fix vgpu debugfs clean in remove commit 704f3384f322b40ba24d958473edfb1c9750c8fd upstream. Check carefully on root debugfs available when destroying vgpu, e.g in remove case drm minor's debugfs root might already be destroyed, which led to kernel oops like below. Console: switching to colour dummy device 80x25 i915 0000:00:02.0: MDEV: Unregistering intel_vgpu_mdev b1338b2d-a709-4c23-b766-cc436c36cdf0: Removing from iommu group 14 BUG: kernel NULL pointer dereference, address: 0000000000000150 PGD 0 P4D 0 Oops: 0000 [#1] PREEMPT SMP CPU: 3 PID: 1046 Comm: driverctl Not tainted 6.1.0-rc2+ #6 Hardware name: HP HP ProDesk 600 G3 MT/829D, BIOS P02 Ver. 02.44 09/13/2022 RIP: 0010:__lock_acquire+0x5e2/0x1f90 Code: 87 ad 09 00 00 39 05 e1 1e cc 02 0f 82 f1 09 00 00 ba 01 00 00 00 48 83 c4 48 89 d0 5b 5d 41 5c 41 5d 41 5e 41 5f c3 45 31 ff <48> 81 3f 60 9e c2 b6 45 0f 45 f8 83 fe 01 0f 87 55 fa ff ff 89 f0 RSP: 0018:ffff9f770274f948 EFLAGS: 00010046 RAX: 0000000000000003 RBX: 0000000000000000 RCX: 0000000000000000 RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000150 RBP: 0000000000000000 R08: 0000000000000001 R09: 0000000000000000 R10: ffff8895d1173300 R11: 0000000000000001 R12: 0000000000000000 R13: 0000000000000150 R14: 0000000000000000 R15: 0000000000000000 FS: 00007fc9b2ba0740(0000) GS:ffff889cdfcc0000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000150 CR3: 000000010fd93005 CR4: 00000000003706e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: lock_acquire+0xbf/0x2b0 ? simple_recursive_removal+0xa5/0x2b0 ? lock_release+0x13d/0x2d0 down_write+0x2a/0xd0 ? simple_recursive_removal+0xa5/0x2b0 simple_recursive_removal+0xa5/0x2b0 ? start_creating.part.0+0x110/0x110 ? _raw_spin_unlock+0x29/0x40 debugfs_remove+0x40/0x60 intel_gvt_debugfs_remove_vgpu+0x15/0x30 [kvmgt] intel_gvt_destroy_vgpu+0x60/0x100 [kvmgt] intel_vgpu_release_dev+0xe/0x20 [kvmgt] device_release+0x30/0x80 kobject_put+0x79/0x1b0 device_release_driver_internal+0x1b8/0x230 bus_remove_device+0xec/0x160 device_del+0x189/0x400 ? up_write+0x9c/0x1b0 ? mdev_device_remove_common+0x60/0x60 [mdev] mdev_device_remove_common+0x22/0x60 [mdev] mdev_device_remove_cb+0x17/0x20 [mdev] device_for_each_child+0x56/0x80 mdev_unregister_parent+0x5a/0x81 [mdev] intel_gvt_clean_device+0x2d/0xe0 [kvmgt] intel_gvt_driver_remove+0x2e/0xb0 [i915] i915_driver_remove+0xac/0x100 [i915] i915_pci_remove+0x1a/0x30 [i915] pci_device_remove+0x31/0xa0 device_release_driver_internal+0x1b8/0x230 unbind_store+0xd8/0x100 kernfs_fop_write_iter+0x156/0x210 vfs_write+0x236/0x4a0 ksys_write+0x61/0xd0 do_syscall_64+0x55/0x80 ? find_held_lock+0x2b/0x80 ? lock_release+0x13d/0x2d0 ? up_read+0x17/0x20 ? lock_is_held_type+0xe3/0x140 ? asm_exc_page_fault+0x22/0x30 ? lockdep_hardirqs_on+0x7d/0x100 entry_SYSCALL_64_after_hwframe+0x46/0xb0 RIP: 0033:0x7fc9b2c9e0c4 Code: 15 71 7d 0d 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 00 f3 0f 1e fa 80 3d 3d 05 0e 00 00 74 13 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 54 c3 0f 1f 00 48 83 ec 28 48 89 54 24 18 48 RSP: 002b:00007ffec29c81c8 EFLAGS: 00000202 ORIG_RAX: 0000000000000001 RAX: ffffffffffffffda RBX: 000000000000000d RCX: 00007fc9b2c9e0c4 RDX: 000000000000000d RSI: 0000559f8b5f48a0 RDI: 0000000000000001 RBP: 0000559f8b5f48a0 R08: 0000559f8b5f3540 R09: 00007fc9b2d76d30 R10: 0000000000000000 R11: 0000000000000202 R12: 000000000000000d R13: 00007fc9b2d77780 R14: 000000000000000d R15: 00007fc9b2d72a00 Modules linked in: sunrpc intel_rapl_msr intel_rapl_common intel_pmc_core_pltdrv intel_pmc_core intel_tcc_cooling x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel ee1004 igbvf rapl vfat fat intel_cstate intel_uncore pktcdvd i2c_i801 pcspkr wmi_bmof i2c_smbus acpi_pad vfio_pci vfio_pci_core vfio_virqfd zram fuse dm_multipath kvmgt mdev vfio_iommu_type1 vfio kvm irqbypass i915 nvme e1000e igb nvme_core crct10dif_pclmul crc32_pclmul crc32c_intel polyval_clmulni polyval_generic serio_raw ghash_clmulni_intel sha512_ssse3 dca drm_buddy intel_gtt video wmi drm_display_helper ttm CR2: 0000000000000150 ---[ end trace 0000000000000000 ]--- Cc: Wang Zhi Cc: He Yu Cc: Alex Williamson Cc: stable@vger.kernel.org Reviewed-by: Zhi Wang Tested-by: Yu He Fixes: bc7b0be316ae ("drm/i915/gvt: Add basic debugfs infrastructure") Signed-off-by: Zhenyu Wang Link: http://patchwork.freedesktop.org/patch/msgid/20221219140357.769557-2-zhenyuw@linux.intel.com Signed-off-by: Greg Kroah-Hartman commit ae9a61511736cc71a99f01e8b7b90f6fb6128ed8 Author: Zhenyu Wang Date: Mon Dec 19 22:03:56 2022 +0800 drm/i915/gvt: fix gvt debugfs destroy commit c4b850d1f448a901fbf4f7f36dec38c84009b489 upstream. When gvt debug fs is destroyed, need to have a sane check if drm minor's debugfs root is still available or not, otherwise in case like device remove through unbinding, drm minor's debugfs directory has already been removed, then intel_gvt_debugfs_clean() would act upon dangling pointer like below oops. i915 0000:00:02.0: Direct firmware load for i915/gvt/vid_0x8086_did_0x1926_rid_0x0a.golden_hw_state failed with error -2 i915 0000:00:02.0: MDEV: Registered Console: switching to colour dummy device 80x25 i915 0000:00:02.0: MDEV: Unregistering BUG: kernel NULL pointer dereference, address: 00000000000000a0 PGD 0 P4D 0 Oops: 0002 [#1] PREEMPT SMP PTI CPU: 2 PID: 2486 Comm: gfx-unbind.sh Tainted: G I 6.1.0-rc8+ #15 Hardware name: Dell Inc. XPS 13 9350/0JXC1H, BIOS 1.13.0 02/10/2020 RIP: 0010:down_write+0x1f/0x90 Code: 1d ff ff 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 53 48 89 fb e8 62 c0 ff ff bf 01 00 00 00 e8 28 5e 31 ff 31 c0 ba 01 00 00 00 48 0f b1 13 75 33 65 48 8b 04 25 c0 bd 01 00 48 89 43 08 bf 01 RSP: 0018:ffff9eb3036ffcc8 EFLAGS: 00010246 RAX: 0000000000000000 RBX: 00000000000000a0 RCX: ffffff8100000000 RDX: 0000000000000001 RSI: 0000000000000064 RDI: ffffffffa48787a8 RBP: ffff9eb3036ffd30 R08: ffffeb1fc45a0608 R09: ffffeb1fc45a05c0 R10: 0000000000000002 R11: 0000000000000000 R12: 0000000000000000 R13: ffff91acc33fa328 R14: ffff91acc033f080 R15: ffff91acced533e0 FS: 00007f6947bba740(0000) GS:ffff91ae36d00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00000000000000a0 CR3: 00000001133a2002 CR4: 00000000003706e0 Call Trace: simple_recursive_removal+0x9f/0x2a0 ? start_creating.part.0+0x120/0x120 ? _raw_spin_lock+0x13/0x40 debugfs_remove+0x40/0x60 intel_gvt_debugfs_clean+0x15/0x30 [kvmgt] intel_gvt_clean_device+0x49/0xe0 [kvmgt] intel_gvt_driver_remove+0x2f/0xb0 i915_driver_remove+0xa4/0xf0 i915_pci_remove+0x1a/0x30 pci_device_remove+0x33/0xa0 device_release_driver_internal+0x1b2/0x230 unbind_store+0xe0/0x110 kernfs_fop_write_iter+0x11b/0x1f0 vfs_write+0x203/0x3d0 ksys_write+0x63/0xe0 do_syscall_64+0x37/0x90 entry_SYSCALL_64_after_hwframe+0x63/0xcd RIP: 0033:0x7f6947cb5190 Code: 40 00 48 8b 15 71 9c 0d 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 00 80 3d 51 24 0e 00 00 74 17 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 58 c3 0f 1f 80 00 00 00 00 48 83 ec 28 48 89 RSP: 002b:00007ffcbac45a28 EFLAGS: 00000202 ORIG_RAX: 0000000000000001 RAX: ffffffffffffffda RBX: 000000000000000d RCX: 00007f6947cb5190 RDX: 000000000000000d RSI: 0000555e35c866a0 RDI: 0000000000000001 RBP: 0000555e35c866a0 R08: 0000000000000002 R09: 0000555e358cb97c R10: 0000000000000000 R11: 0000000000000202 R12: 0000000000000001 R13: 000000000000000d R14: 0000000000000000 R15: 0000555e358cb8e0 Modules linked in: kvmgt CR2: 00000000000000a0 ---[ end trace 0000000000000000 ]--- Cc: Wang, Zhi Cc: He, Yu Cc: stable@vger.kernel.org Reviewed-by: Zhi Wang Fixes: bc7b0be316ae ("drm/i915/gvt: Add basic debugfs infrastructure") Signed-off-by: Zhenyu Wang Link: http://patchwork.freedesktop.org/patch/msgid/20221219140357.769557-1-zhenyuw@linux.intel.com Signed-off-by: Greg Kroah-Hartman commit eb3e943a324309497359038e5465b0c5947f4e61 Author: Björn Töpel Date: Mon Jan 2 17:07:48 2023 +0100 riscv, kprobes: Stricter c.jr/c.jalr decoding commit b2d473a6019ef9a54b0156ecdb2e0398c9fa6a24 upstream. In the compressed instruction extension, c.jr, c.jalr, c.mv, and c.add is encoded the following way (each instruction is 16b): ---+-+-----------+-----------+-- 100 0 rs1[4:0]!=0 00000 10 : c.jr 100 1 rs1[4:0]!=0 00000 10 : c.jalr 100 0 rd[4:0]!=0 rs2[4:0]!=0 10 : c.mv 100 1 rd[4:0]!=0 rs2[4:0]!=0 10 : c.add The following logic is used to decode c.jr and c.jalr: insn & 0xf007 == 0x8002 => instruction is an c.jr insn & 0xf007 == 0x9002 => instruction is an c.jalr When 0xf007 is used to mask the instruction, c.mv can be incorrectly decoded as c.jr, and c.add as c.jalr. Correct the decoding by changing the mask from 0xf007 to 0xf07f. Fixes: c22b0bcb1dd0 ("riscv: Add kprobes supported") Signed-off-by: Björn Töpel Reviewed-by: Conor Dooley Reviewed-by: Guo Ren Link: https://lore.kernel.org/r/20230102160748.1307289-1-bjorn@kernel.org Cc: stable@vger.kernel.org Signed-off-by: Palmer Dabbelt Signed-off-by: Greg Kroah-Hartman commit 620a229f576a8bbc2e9d8eb79355ec74423da5e9 Author: Ben Dooks Date: Thu Dec 29 17:05:45 2022 +0000 riscv: uaccess: fix type of 0 variable on error in get_user() commit b9b916aee6715cd7f3318af6dc360c4729417b94 upstream. If the get_user(x, ptr) has x as a pointer, then the setting of (x) = 0 is going to produce the following sparse warning, so fix this by forcing the type of 'x' when access_ok() fails. fs/aio.c:2073:21: warning: Using plain integer as NULL pointer Signed-off-by: Ben Dooks Reviewed-by: Palmer Dabbelt Link: https://lore.kernel.org/r/20221229170545.718264-1-ben-linux@fluff.org Cc: stable@vger.kernel.org Signed-off-by: Palmer Dabbelt Signed-off-by: Greg Kroah-Hartman commit 8e05a993f8aa420b836e93ef46869c287bbb2830 Author: Srinivas Pandruvada Date: Tue Dec 27 16:10:05 2022 -0800 thermal: int340x: Add missing attribute for data rate base commit b878d3ba9bb41cddb73ba4b56e5552f0a638daca upstream. Commit 473be51142ad ("thermal: int340x: processor_thermal: Add RFIM driver")' added rfi_restriction_data_rate_base string, mmio details and documentation, but missed adding attribute to sysfs. Add missing sysfs attribute. Fixes: 473be51142ad ("thermal: int340x: processor_thermal: Add RFIM driver") Cc: 5.11+ # v5.11+ Signed-off-by: Srinivas Pandruvada Signed-off-by: Rafael J. Wysocki Signed-off-by: Greg Kroah-Hartman commit c3222fd2822503ee00e7f4fbf1d1cf8485eccda3 Author: Pavel Begunkov Date: Thu Jan 5 10:49:15 2023 +0000 io_uring: fix CQ waiting timeout handling commit 12521a5d5cb7ff0ad43eadfc9c135d86e1131fa8 upstream. Jiffy to ktime CQ waiting conversion broke how we treat timeouts, in particular we rearm it anew every time we get into io_cqring_wait_schedule() without adjusting the timeout. Waiting for 2 CQEs and getting a task_work in the middle may double the timeout value, or even worse in some cases task may wait indefinitely. Cc: stable@vger.kernel.org Fixes: 228339662b398 ("io_uring: don't convert to jiffies for waiting on timeouts") Signed-off-by: Pavel Begunkov Link: https://lore.kernel.org/r/f7bffddd71b08f28a877d44d37ac953ddb01590d.1672915663.git.asml.silence@gmail.com Signed-off-by: Jens Axboe Signed-off-by: Greg Kroah-Hartman commit b7b9bc93055d69283fe259c38ea9c46a7a6c3421 Author: Jens Axboe Date: Wed Jan 4 08:52:06 2023 -0700 block: don't allow splitting of a REQ_NOWAIT bio commit 9cea62b2cbabff8ed46f2df17778b624ad9dd25a upstream. If we split a bio marked with REQ_NOWAIT, then we can trigger spurious EAGAIN if constituent parts of that split bio end up failing request allocations. Parts will complete just fine, but just a single failure in one of the chained bios will yield an EAGAIN final result for the parent bio. Return EAGAIN early if we end up needing to split such a bio, which allows for saner recovery handling. Cc: stable@vger.kernel.org # 5.15+ Link: https://github.com/axboe/liburing/issues/766 Reported-by: Michael Kelley Reviewed-by: Keith Busch Signed-off-by: Jens Axboe Signed-off-by: Greg Kroah-Hartman commit e1358c878711041204d9b4ec4e063317d75a0ac1 Author: Paul Menzel Date: Mon Jan 2 14:57:30 2023 +0100 fbdev: matroxfb: G200eW: Increase max memory from 1 MB to 16 MB commit f685dd7a8025f2554f73748cfdb8143a21fb92c7 upstream. Commit 62d89a7d49af ("video: fbdev: matroxfb: set maxvram of vbG200eW to the same as vbG200 to avoid black screen") accidently decreases the maximum memory size for the Matrox G200eW (102b:0532) from 8 MB to 1 MB by missing one zero. This caused the driver initialization to fail with the messages below, as the minimum required VRAM size is 2 MB: [ 9.436420] matroxfb: Matrox MGA-G200eW (PCI) detected [ 9.444502] matroxfb: cannot determine memory size [ 9.449316] matroxfb: probe of 0000:0a:03.0 failed with error -1 So, add the missing 0 to make it the intended 16 MB. Successfully tested on the Dell PowerEdge R910/0KYD3D, BIOS 2.10.0 08/29/2013, that the warning is gone. While at it, add a leading 0 to the maxdisplayable entry, so it’s aligned properly. The value could probably also be increased from 8 MB to 16 MB, as the G200 uses the same values, but I have not checked any datasheet. Note, matroxfb is obsolete and superseded by the maintained DRM driver mga200, which is used by default on most systems where both drivers are available. Therefore, on most systems it was only a cosmetic issue. Fixes: 62d89a7d49af ("video: fbdev: matroxfb: set maxvram of vbG200eW to the same as vbG200 to avoid black screen") Link: https://lore.kernel.org/linux-fbdev/972999d3-b75d-5680-fcef-6e6905c52ac5@suse.de/T/#mb6953a9995ebd18acc8552f99d6db39787aec775 Cc: it+linux-fbdev@molgen.mpg.de Cc: Z. Liu Cc: Rich Felker Cc: stable@vger.kernel.org Signed-off-by: Paul Menzel Signed-off-by: Helge Deller Signed-off-by: Greg Kroah-Hartman commit 682a7d064f3546ab03101f9c13adbdd0faefacdf Author: Jeff Layton Date: Tue Dec 13 13:08:26 2022 -0500 nfsd: fix handling of readdir in v4root vs. mount upcall timeout commit cad853374d85fe678d721512cecfabd7636e51f3 upstream. If v4 READDIR operation hits a mountpoint and gets back an error, then it will include that entry in the reply and set RDATTR_ERROR for it to the error. That's fine for "normal" exported filesystems, but on the v4root, we need to be more careful to only expose the existence of dentries that lead to exports. If the mountd upcall times out while checking to see whether a mountpoint on the v4root is exported, then we have no recourse other than to fail the whole operation. Cc: Steve Dickson Link: https://bugzilla.kernel.org/show_bug.cgi?id=216777 Reported-by: JianHong Yin Signed-off-by: Jeff Layton Signed-off-by: Chuck Lever Cc: Signed-off-by: Greg Kroah-Hartman commit cb42aa7b5f726e3fddc8656b8f5c723537d654f1 Author: Rodrigo Branco Date: Tue Jan 3 14:17:51 2023 -0600 x86/bugs: Flush IBP in ib_prctl_set() commit a664ec9158eeddd75121d39c9a0758016097fa96 upstream. We missed the window between the TIF flag update and the next reschedule. Signed-off-by: Rodrigo Branco Reviewed-by: Borislav Petkov (AMD) Signed-off-by: Ingo Molnar Cc: Signed-off-by: Greg Kroah-Hartman commit 554a880a1fff46dd5a355dec21cd77d542a0ddf2 Author: Takashi Iwai Date: Tue Nov 22 12:51:22 2022 +0100 x86/kexec: Fix double-free of elf header buffer commit d00dd2f2645dca04cf399d8fc692f3f69b6dd996 upstream. After b3e34a47f989 ("x86/kexec: fix memory leak of elf header buffer"), freeing image->elf_headers in the error path of crash_load_segments() is not needed because kimage_file_post_load_cleanup() will take care of that later. And not clearing it could result in a double-free. Drop the superfluous vfree() call at the error path of crash_load_segments(). Fixes: b3e34a47f989 ("x86/kexec: fix memory leak of elf header buffer") Signed-off-by: Takashi Iwai Signed-off-by: Borislav Petkov (AMD) Acked-by: Baoquan He Acked-by: Vlastimil Babka Cc: Link: https://lore.kernel.org/r/20221122115122.13937-1-tiwai@suse.de Signed-off-by: Greg Kroah-Hartman commit 264241a6104547104a03d6ce247a2e4f2e900909 Author: Qu Wenruo Date: Wed Aug 24 20:16:22 2022 +0800 btrfs: check superblock to ensure the fs was not modified at thaw time [ Upstream commit a05d3c9153145283ce9c58a1d7a9056fbb85f6a1 ] [BACKGROUND] There is an incident report that, one user hibernated the system, with one btrfs on removable device still mounted. Then by some incident, the btrfs got mounted and modified by another system/OS, then back to the hibernated system. After resuming from the hibernation, new write happened into the victim btrfs. Now the fs is completely broken, since the underlying btrfs is no longer the same one before the hibernation, and the user lost their data due to various transid mismatch. [REPRODUCER] We can emulate the situation using the following small script: truncate -s 1G $dev mkfs.btrfs -f $dev mount $dev $mnt fsstress -w -d $mnt -n 500 sync xfs_freeze -f $mnt cp $dev $dev.backup # There is no way to mount the same cloned fs on the same system, # as the conflicting fsid will be rejected by btrfs. # Thus here we have to wipe the fs using a different btrfs. mkfs.btrfs -f $dev.backup dd if=$dev.backup of=$dev bs=1M xfs_freeze -u $mnt fsstress -w -d $mnt -n 20 umount $mnt btrfs check $dev The final fsck will fail due to some tree blocks has incorrect fsid. This is enough to emulate the problem hit by the unfortunate user. [ENHANCEMENT] Although such case should not be that common, it can still happen from time to time. From the view of btrfs, we can detect any unexpected super block change, and if there is any unexpected change, we just mark the fs read-only, and thaw the fs. By this we can limit the damage to minimal, and I hope no one would lose their data by this anymore. Suggested-by: Goffredo Baroncelli Link: https://lore.kernel.org/linux-btrfs/83bf3b4b-7f4c-387a-b286-9251e3991e34@bluemole.com/ Reviewed-by: Anand Jain Signed-off-by: Qu Wenruo Signed-off-by: David Sterba Signed-off-by: Sasha Levin commit 69f4bda5f4e6b129efbb9158b157d5a6d63f75af Author: Christoph Hellwig Date: Wed Dec 21 10:12:17 2022 +0100 nvme: also return I/O command effects from nvme_command_effects [ Upstream commit 831ed60c2aca2d7c517b2da22897a90224a97d27 ] To be able to use the Commands Supported and Effects Log for allowing unprivileged passtrough, it needs to be corretly reported for I/O commands as well. Return the I/O command effects from nvme_command_effects, and also add a default list of effects for the NVM command set. For other command sets, the Commands Supported and Effects log is required to be present already. Signed-off-by: Christoph Hellwig Reviewed-by: Keith Busch Reviewed-by: Kanchan Joshi Signed-off-by: Sasha Levin commit a6a4b057cd47f7383890d8e8ebfa62eea4bca891 Author: Christoph Hellwig Date: Mon Dec 12 15:20:04 2022 +0100 nvmet: use NVME_CMD_EFFECTS_CSUPP instead of open coding it [ Upstream commit 61f37154c599cf9f2f84dcbd9be842f8645a7099 ] Use NVME_CMD_EFFECTS_CSUPP instead of open coding it and assign a single value to multiple array entries instead of repeated assignments. Signed-off-by: Christoph Hellwig Reviewed-by: Keith Busch Reviewed-by: Sagi Grimberg Reviewed-by: Kanchan Joshi Reviewed-by: Chaitanya Kulkarni Signed-off-by: Sasha Levin commit f9309dcaa9c0e0d0d7b1dc8d3651db460ae6117f Author: Jens Axboe Date: Fri Dec 23 06:37:08 2022 -0700 io_uring: check for valid register opcode earlier [ Upstream commit 343190841a1f22b96996d9f8cfab902a4d1bfd0e ] We only check the register opcode value inside the restricted ring section, move it into the main io_uring_register() function instead and check it up front. Signed-off-by: Jens Axboe Signed-off-by: Sasha Levin commit 4df413d46960f11c8c105238cfc3f5ff4c95c003 Author: Yanjun Zhang Date: Thu Dec 22 09:57:21 2022 +0800 nvme: fix multipath crash caused by flush request when blktrace is enabled [ Upstream commit 3659fb5ac29a5e6102bebe494ac789fd47fb78f4 ] The flush request initialized by blk_kick_flush has NULL bio, and it may be dealt with nvme_end_req during io completion. When blktrace is enabled, nvme_trace_bio_complete with multipath activated trying to access NULL pointer bio from flush request results in the following crash: [ 2517.831677] BUG: kernel NULL pointer dereference, address: 000000000000001a [ 2517.835213] #PF: supervisor read access in kernel mode [ 2517.838724] #PF: error_code(0x0000) - not-present page [ 2517.842222] PGD 7b2d51067 P4D 0 [ 2517.845684] Oops: 0000 [#1] SMP NOPTI [ 2517.849125] CPU: 2 PID: 732 Comm: kworker/2:1H Kdump: loaded Tainted: G S 5.15.67-0.cl9.x86_64 #1 [ 2517.852723] Hardware name: XFUSION 2288H V6/BC13MBSBC, BIOS 1.13 07/27/2022 [ 2517.856358] Workqueue: nvme_tcp_wq nvme_tcp_io_work [nvme_tcp] [ 2517.859993] RIP: 0010:blk_add_trace_bio_complete+0x6/0x30 [ 2517.863628] Code: 1f 44 00 00 48 8b 46 08 31 c9 ba 04 00 10 00 48 8b 80 50 03 00 00 48 8b 78 50 e9 e5 fe ff ff 0f 1f 44 00 00 41 54 49 89 f4 55 <0f> b6 7a 1a 48 89 d5 e8 3e 1c 2b 00 48 89 ee 4c 89 e7 5d 89 c1 ba [ 2517.871269] RSP: 0018:ff7f6a008d9dbcd0 EFLAGS: 00010286 [ 2517.875081] RAX: ff3d5b4be00b1d50 RBX: 0000000002040002 RCX: ff3d5b0a270f2000 [ 2517.878966] RDX: 0000000000000000 RSI: ff3d5b0b021fb9f8 RDI: 0000000000000000 [ 2517.882849] RBP: ff3d5b0b96a6fa00 R08: 0000000000000001 R09: 0000000000000000 [ 2517.886718] R10: 000000000000000c R11: 000000000000000c R12: ff3d5b0b021fb9f8 [ 2517.890575] R13: 0000000002000000 R14: ff3d5b0b021fb1b0 R15: 0000000000000018 [ 2517.894434] FS: 0000000000000000(0000) GS:ff3d5b42bfc80000(0000) knlGS:0000000000000000 [ 2517.898299] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 2517.902157] CR2: 000000000000001a CR3: 00000004f023e005 CR4: 0000000000771ee0 [ 2517.906053] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 2517.909930] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 2517.913761] PKRU: 55555554 [ 2517.917558] Call Trace: [ 2517.921294] [ 2517.924982] nvme_complete_rq+0x1c3/0x1e0 [nvme_core] [ 2517.928715] nvme_tcp_recv_pdu+0x4d7/0x540 [nvme_tcp] [ 2517.932442] nvme_tcp_recv_skb+0x4f/0x240 [nvme_tcp] [ 2517.936137] ? nvme_tcp_recv_pdu+0x540/0x540 [nvme_tcp] [ 2517.939830] tcp_read_sock+0x9c/0x260 [ 2517.943486] nvme_tcp_try_recv+0x65/0xa0 [nvme_tcp] [ 2517.947173] nvme_tcp_io_work+0x64/0x90 [nvme_tcp] [ 2517.950834] process_one_work+0x1e8/0x390 [ 2517.954473] worker_thread+0x53/0x3c0 [ 2517.958069] ? process_one_work+0x390/0x390 [ 2517.961655] kthread+0x10c/0x130 [ 2517.965211] ? set_kthread_struct+0x40/0x40 [ 2517.968760] ret_from_fork+0x1f/0x30 [ 2517.972285] To avoid this situation, add a NULL check for req->bio before calling trace_block_bio_complete. Signed-off-by: Yanjun Zhang Signed-off-by: Christoph Hellwig Signed-off-by: Sasha Levin commit 03ce7921285e8998263ca1257183b54ab827412c Author: Hans de Goede Date: Tue Dec 13 13:32:46 2022 +0100 ASoC: Intel: bytcr_rt5640: Add quirk for the Advantech MICA-071 tablet [ Upstream commit a1dec9d70b6ad97087b60b81d2492134a84208c6 ] The Advantech MICA-071 tablet deviates from the defaults for a non CR Bay Trail based tablet in several ways: 1. It uses an analog MIC on IN3 rather then using DMIC1 2. It only has 1 speaker 3. It needs the OVCD current threshold to be set to 1500uA instead of the default 2000uA to reliable differentiate between headphones vs headsets Add a quirk with these settings for this tablet. Signed-off-by: Hans de Goede Acked-by: Pierre-Louis Bossart Link: https://lore.kernel.org/r/20221213123246.11226-1-hdegoede@redhat.com Signed-off-by: Mark Brown Signed-off-by: Sasha Levin commit 0dca7375e2b918d5b2e5604f654d57bbd26f98d6 Author: Jan Kara Date: Wed Dec 21 17:45:51 2022 +0100 udf: Fix extension of the last extent in the file [ Upstream commit 83c7423d1eb6806d13c521d1002cc1a012111719 ] When extending the last extent in the file within the last block, we wrongly computed the length of the last extent. This is mostly a cosmetical problem since the extent does not contain any data and the length will be fixed up by following operations but still. Fixes: 1f3868f06855 ("udf: Fix extending file within last block") Signed-off-by: Jan Kara Signed-off-by: Sasha Levin commit dc1bc903970bdf63ca40ab923d3ccb765da9a8d9 Author: Zhengchao Shao Date: Wed Jan 4 14:51:46 2023 +0800 caif: fix memory leak in cfctrl_linkup_request() [ Upstream commit fe69230f05897b3de758427b574fc98025dfc907 ] When linktype is unknown or kzalloc failed in cfctrl_linkup_request(), pkt is not released. Add release process to error path. Fixes: b482cd2053e3 ("net-caif: add CAIF core protocol stack") Fixes: 8d545c8f958f ("caif: Disconnect without waiting for response") Signed-off-by: Zhengchao Shao Reviewed-by: Jiri Pirko Link: https://lore.kernel.org/r/20230104065146.1153009-1-shaozhengchao@huawei.com Signed-off-by: Paolo Abeni Signed-off-by: Sasha Levin commit bce3680b48d20143579f1b537fffede7746c5614 Author: Dan Carpenter Date: Tue Nov 15 16:15:18 2022 +0300 drm/i915: unpin on error in intel_vgpu_shadow_mm_pin() [ Upstream commit 3792fc508c095abd84b10ceae12bd773e61fdc36 ] Call intel_vgpu_unpin_mm() on this error path. Fixes: 418741480809 ("drm/i915/gvt: Adding ppgtt to GVT GEM context after shadow pdps settled.") Signed-off-by: Dan Carpenter Signed-off-by: Zhenyu Wang Link: http://patchwork.freedesktop.org/patch/msgid/Y3OQ5tgZIVxyQ/WV@kili Reviewed-by: Zhenyu Wang Signed-off-by: Sasha Levin commit da6a3653b82cd0853e81fb97e24afbe6ee926eb5 Author: Namhyung Kim Date: Tue Jan 3 22:44:02 2023 -0800 perf stat: Fix handling of --for-each-cgroup with --bpf-counters to match non BPF mode [ Upstream commit 54b353a20c7e8be98414754f5aff98c8a68fcc1f ] The --for-each-cgroup can have the same cgroup multiple times, but this confuses BPF counters (since they have the same cgroup id), making only the last cgroup events to be counted. Let's check the cgroup name before adding a new entry to the cgroups list. Before: $ sudo ./perf stat -a --bpf-counters --for-each-cgroup /,/ sleep 1 Performance counter stats for 'system wide': msec cpu-clock / context-switches / cpu-migrations / page-faults / cycles / instructions / branches / branch-misses / 8,016.04 msec cpu-clock / # 7.998 CPUs utilized 6,152 context-switches / # 767.461 /sec 250 cpu-migrations / # 31.187 /sec 442 page-faults / # 55.139 /sec 613,111,487 cycles / # 0.076 GHz 280,599,604 instructions / # 0.46 insn per cycle 57,692,724 branches / # 7.197 M/sec 3,385,168 branch-misses / # 5.87% of all branches 1.002220125 seconds time elapsed After it becomes similar to the non-BPF mode: $ sudo ./perf stat -a --bpf-counters --for-each-cgroup /,/ sleep 1 Performance counter stats for 'system wide': 8,013.38 msec cpu-clock / # 7.998 CPUs utilized 6,859 context-switches / # 855.944 /sec 334 cpu-migrations / # 41.680 /sec 345 page-faults / # 43.053 /sec 782,326,119 cycles / # 0.098 GHz 471,645,724 instructions / # 0.60 insn per cycle 94,963,430 branches / # 11.851 M/sec 3,685,511 branch-misses / # 3.88% of all branches 1.001864539 seconds time elapsed Committer notes: As a reminder, to test with BPF counters one has to use BUILD_BPF_SKEL=1 in the make command line and have clang/llvm installed when building perf, otherwise the --bpf-counters option will not be available: # perf stat -a --bpf-counters --for-each-cgroup /,/ sleep 1 Error: unknown option `bpf-counters' Usage: perf stat [] [] -a, --all-cpus system-wide collection from all CPUs # Fixes: bb1c15b60b981d10 ("perf stat: Support regex pattern in --for-each-cgroup") Signed-off-by: Namhyung Kim Tested-by: Arnaldo Carvalho de Melo Cc: Adrian Hunter Cc: bpf@vger.kernel.org Cc: Ian Rogers Cc: Ingo Molnar Cc: Jiri Olsa Cc: Namhyung Kim Cc: Peter Zijlstra Cc: Song Liu Link: https://lore.kernel.org/r/20230104064402.1551516-5-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo Signed-off-by: Sasha Levin commit 11cd4ec6359d90b13ffb8f85a9df8637f0cf8d95 Author: Szymon Heidrich Date: Tue Jan 3 10:17:09 2023 +0100 usb: rndis_host: Secure rndis_query check against int overflow [ Upstream commit c7dd13805f8b8fc1ce3b6d40f6aff47e66b72ad2 ] Variables off and len typed as uint32 in rndis_query function are controlled by incoming RNDIS response message thus their value may be manipulated. Setting off to a unexpectetly large value will cause the sum with len and 8 to overflow and pass the implemented validation step. Consequently the response pointer will be referring to a location past the expected buffer boundaries allowing information leakage e.g. via RNDIS_OID_802_3_PERMANENT_ADDRESS OID. Fixes: ddda08624013 ("USB: rndis_host, various cleanups") Signed-off-by: Szymon Heidrich Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 6ea5273c71dd2d07c0a2459594eb34bc087939f7 Author: Geetha sowjanya Date: Tue Jan 3 09:20:12 2023 +0530 octeontx2-pf: Fix lmtst ID used in aura free [ Upstream commit 4af1b64f80fbe1275fb02c5f1c0cef099a4a231f ] Current code uses per_cpu pointer to get the lmtst_id mapped to the core on which aura_free() is executed. Using per_cpu pointer without preemption disable causing mismatch between lmtst_id and core on which pointer gets freed. This patch fixes the issue by disabling preemption around aura_free. Fixes: ef6c8da71eaf ("octeontx2-pf: cn10K: Reserve LMTST lines per core") Signed-off-by: Sunil Goutham Signed-off-by: Geetha sowjanya Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 4e5f2c74cbbf5ff700d1ddb04312b36f5348b1c0 Author: Daniil Tatianin Date: Mon Jan 2 12:53:35 2023 +0300 drivers/net/bonding/bond_3ad: return when there's no aggregator [ Upstream commit 9c807965483f42df1d053b7436eedd6cf28ece6f ] Otherwise we would dereference a NULL aggregator pointer when calling __set_agg_ports_ready on the line below. Found by Linux Verification Center (linuxtesting.org) with the SVACE static analysis tool. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Daniil Tatianin Reviewed-by: Jiri Pirko Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 8414983c2e649364d8af29080a0869266b31abb6 Author: Tetsuo Handa Date: Mon Jan 2 23:05:33 2023 +0900 fs/ntfs3: don't hold ni_lock when calling truncate_setsize() [ Upstream commit 0226635c304cfd5c9db9b78c259cb713819b057e ] syzbot is reporting hung task at do_user_addr_fault() [1], for there is a silent deadlock between PG_locked bit and ni_lock lock. Since filemap_update_page() calls filemap_read_folio() after calling folio_trylock() which will set PG_locked bit, ntfs_truncate() must not call truncate_setsize() which will wait for PG_locked bit to be cleared when holding ni_lock lock. Link: https://lore.kernel.org/all/00000000000060d41f05f139aa44@google.com/ Link: https://syzkaller.appspot.com/bug?extid=bed15dbf10294aa4f2ae [1] Reported-by: syzbot Debugged-by: Linus Torvalds Co-developed-by: Hillf Danton Signed-off-by: Hillf Danton Signed-off-by: Tetsuo Handa Fixes: 4342306f0f0d ("fs/ntfs3: Add file operations and implementation") Signed-off-by: Linus Torvalds Signed-off-by: Sasha Levin commit a23e8376e613cf17bfbd4c799bae38951995c922 Author: Philipp Zabel Date: Tue Nov 8 15:14:20 2022 +0100 drm/imx: ipuv3-plane: Fix overlay plane width [ Upstream commit 92d43bd3bc9728c1fb114d7011d46f5ea9489e28 ] ipu_src_rect_width() was introduced to support odd screen resolutions such as 1366x768 by internally rounding up primary plane width to a multiple of 8 and compensating with reduced horizontal blanking. This also caused overlay plane width to be rounded up, which was not intended. Fix overlay plane width by limiting the rounding up to the primary plane. drm_rect_width(&new_state->src) >> 16 is the same value as drm_rect_width(dst) because there is no plane scaling support. Fixes: 94dfec48fca7 ("drm/imx: Add 8 pixel alignment fix") Reviewed-by: Lucas Stach Link: https://lore.kernel.org/r/20221108141420.176696-1-p.zabel@pengutronix.de Signed-off-by: Philipp Zabel Link: https://patchwork.freedesktop.org/patch/msgid/20221108141420.176696-1-p.zabel@pengutronix.de Tested-by: Ian Ray (cherry picked from commit 4333472f8d7befe62359fecb1083cd57a6e07bfc) Signed-off-by: Philipp Zabel Signed-off-by: Sasha Levin commit a8f7fd322f566a9f2f05bc138c4dfc2ae1fe5e74 Author: Miaoqian Lin Date: Thu Dec 29 13:09:00 2022 +0400 perf tools: Fix resources leak in perf_data__open_dir() [ Upstream commit 0a6564ebd953c4590663c9a3c99a3ea9920ade6f ] In perf_data__open_dir(), opendir() opens the directory stream. Add missing closedir() to release it after use. Fixes: eb6176709b235b96 ("perf data: Add perf_data__open_dir_data function") Reviewed-by: Adrian Hunter Signed-off-by: Miaoqian Lin Cc: Alexander Shishkin Cc: Alexey Bayduraev Cc: Ingo Molnar Cc: Jiri Olsa Cc: Mark Rutland Cc: Namhyung Kim Cc: Peter Zijlstra Link: https://lore.kernel.org/r/20221229090903.1402395-1-linmq006@gmail.com Signed-off-by: Arnaldo Carvalho de Melo Signed-off-by: Sasha Levin commit a1e1521b463968b4eca7163f61fb6cc54d008061 Author: Jozsef Kadlecsik Date: Fri Dec 30 13:24:38 2022 +0100 netfilter: ipset: Rework long task execution when adding/deleting entries [ Upstream commit 5e29dc36bd5e2166b834ceb19990d9e68a734d7d ] When adding/deleting large number of elements in one step in ipset, it can take a reasonable amount of time and can result in soft lockup errors. The patch 5f7b51bf09ba ("netfilter: ipset: Limit the maximal range of consecutive elements to add/delete") tried to fix it by limiting the max elements to process at all. However it was not enough, it is still possible that we get hung tasks. Lowering the limit is not reasonable, so the approach in this patch is as follows: rely on the method used at resizing sets and save the state when we reach a smaller internal batch limit, unlock/lock and proceed from the saved state. Thus we can avoid long continuous tasks and at the same time removed the limit to add/delete large number of elements in one step. The nfnl mutex is held during the whole operation which prevents one to issue other ipset commands in parallel. Fixes: 5f7b51bf09ba ("netfilter: ipset: Limit the maximal range of consecutive elements to add/delete") Reported-by: syzbot+9204e7399656300bf271@syzkaller.appspotmail.com Signed-off-by: Jozsef Kadlecsik Signed-off-by: Pablo Neira Ayuso Signed-off-by: Sasha Levin commit 6f19a384836778379f40940aa34e6f4c0189d219 Author: Jozsef Kadlecsik Date: Fri Dec 30 13:24:37 2022 +0100 netfilter: ipset: fix hash:net,port,net hang with /0 subnet [ Upstream commit a31d47be64b9b74f8cfedffe03e0a8a1f9e51f23 ] The hash:net,port,net set type supports /0 subnets. However, the patch commit 5f7b51bf09baca8e titled "netfilter: ipset: Limit the maximal range of consecutive elements to add/delete" did not take into account it and resulted in an endless loop. The bug is actually older but the patch 5f7b51bf09baca8e brings it out earlier. Handle /0 subnets properly in hash:net,port,net set types. Fixes: 5f7b51bf09ba ("netfilter: ipset: Limit the maximal range of consecutive elements to add/delete") Reported-by: Марк Коренберг Signed-off-by: Jozsef Kadlecsik Signed-off-by: Pablo Neira Ayuso Signed-off-by: Sasha Levin commit 774d259749d7883e1b962d42e1ff1a2e1a922c79 Author: Horatiu Vultur Date: Mon Jan 2 13:12:15 2023 +0100 net: sparx5: Fix reading of the MAC address [ Upstream commit 588ab2dc25f60efeb516b4abedb6c551949cc185 ] There is an issue with the checking of the return value of 'of_get_mac_address', which returns 0 on success and negative value on failure. The driver interpretated the result the opposite way. Therefore if there was a MAC address defined in the DT, then the driver was generating a random MAC address otherwise it would use address 0. Fix this by checking correctly the return value of 'of_get_mac_address' Fixes: b74ef9f9cb91 ("net: sparx5: Do not use mac_addr uninitialized in mchp_sparx5_probe()") Signed-off-by: Horatiu Vultur Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 04dc4003e5df33fb38d3dd85568b763910c479d4 Author: Jamal Hadi Salim Date: Sun Jan 1 16:57:44 2023 -0500 net: sched: cbq: dont intepret cls results when asked to drop [ Upstream commit caa4b35b4317d5147b3ab0fbdc9c075c7d2e9c12 ] If asked to drop a packet via TC_ACT_SHOT it is unsafe to assume that res.class contains a valid pointer Sample splat reported by Kyle Zeng [ 5.405624] 0: reclassify loop, rule prio 0, protocol 800 [ 5.406326] ================================================================== [ 5.407240] BUG: KASAN: slab-out-of-bounds in cbq_enqueue+0x54b/0xea0 [ 5.407987] Read of size 1 at addr ffff88800e3122aa by task poc/299 [ 5.408731] [ 5.408897] CPU: 0 PID: 299 Comm: poc Not tainted 5.10.155+ #15 [ 5.409516] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014 [ 5.410439] Call Trace: [ 5.410764] dump_stack+0x87/0xcd [ 5.411153] print_address_description+0x7a/0x6b0 [ 5.411687] ? vprintk_func+0xb9/0xc0 [ 5.411905] ? printk+0x76/0x96 [ 5.412110] ? cbq_enqueue+0x54b/0xea0 [ 5.412323] kasan_report+0x17d/0x220 [ 5.412591] ? cbq_enqueue+0x54b/0xea0 [ 5.412803] __asan_report_load1_noabort+0x10/0x20 [ 5.413119] cbq_enqueue+0x54b/0xea0 [ 5.413400] ? __kasan_check_write+0x10/0x20 [ 5.413679] __dev_queue_xmit+0x9c0/0x1db0 [ 5.413922] dev_queue_xmit+0xc/0x10 [ 5.414136] ip_finish_output2+0x8bc/0xcd0 [ 5.414436] __ip_finish_output+0x472/0x7a0 [ 5.414692] ip_finish_output+0x5c/0x190 [ 5.414940] ip_output+0x2d8/0x3c0 [ 5.415150] ? ip_mc_finish_output+0x320/0x320 [ 5.415429] __ip_queue_xmit+0x753/0x1760 [ 5.415664] ip_queue_xmit+0x47/0x60 [ 5.415874] __tcp_transmit_skb+0x1ef9/0x34c0 [ 5.416129] tcp_connect+0x1f5e/0x4cb0 [ 5.416347] tcp_v4_connect+0xc8d/0x18c0 [ 5.416577] __inet_stream_connect+0x1ae/0xb40 [ 5.416836] ? local_bh_enable+0x11/0x20 [ 5.417066] ? lock_sock_nested+0x175/0x1d0 [ 5.417309] inet_stream_connect+0x5d/0x90 [ 5.417548] ? __inet_stream_connect+0xb40/0xb40 [ 5.417817] __sys_connect+0x260/0x2b0 [ 5.418037] __x64_sys_connect+0x76/0x80 [ 5.418267] do_syscall_64+0x31/0x50 [ 5.418477] entry_SYSCALL_64_after_hwframe+0x61/0xc6 [ 5.418770] RIP: 0033:0x473bb7 [ 5.418952] Code: 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 2a 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 18 89 54 24 0c 48 89 34 24 89 [ 5.420046] RSP: 002b:00007fffd20eb0f8 EFLAGS: 00000246 ORIG_RAX: 000000000000002a [ 5.420472] RAX: ffffffffffffffda RBX: 00007fffd20eb578 RCX: 0000000000473bb7 [ 5.420872] RDX: 0000000000000010 RSI: 00007fffd20eb110 RDI: 0000000000000007 [ 5.421271] RBP: 00007fffd20eb150 R08: 0000000000000001 R09: 0000000000000004 [ 5.421671] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000001 [ 5.422071] R13: 00007fffd20eb568 R14: 00000000004fc740 R15: 0000000000000002 [ 5.422471] [ 5.422562] Allocated by task 299: [ 5.422782] __kasan_kmalloc+0x12d/0x160 [ 5.423007] kasan_kmalloc+0x5/0x10 [ 5.423208] kmem_cache_alloc_trace+0x201/0x2e0 [ 5.423492] tcf_proto_create+0x65/0x290 [ 5.423721] tc_new_tfilter+0x137e/0x1830 [ 5.423957] rtnetlink_rcv_msg+0x730/0x9f0 [ 5.424197] netlink_rcv_skb+0x166/0x300 [ 5.424428] rtnetlink_rcv+0x11/0x20 [ 5.424639] netlink_unicast+0x673/0x860 [ 5.424870] netlink_sendmsg+0x6af/0x9f0 [ 5.425100] __sys_sendto+0x58d/0x5a0 [ 5.425315] __x64_sys_sendto+0xda/0xf0 [ 5.425539] do_syscall_64+0x31/0x50 [ 5.425764] entry_SYSCALL_64_after_hwframe+0x61/0xc6 [ 5.426065] [ 5.426157] The buggy address belongs to the object at ffff88800e312200 [ 5.426157] which belongs to the cache kmalloc-128 of size 128 [ 5.426955] The buggy address is located 42 bytes to the right of [ 5.426955] 128-byte region [ffff88800e312200, ffff88800e312280) [ 5.427688] The buggy address belongs to the page: [ 5.427992] page:000000009875fabc refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0xe312 [ 5.428562] flags: 0x100000000000200(slab) [ 5.428812] raw: 0100000000000200 dead000000000100 dead000000000122 ffff888007843680 [ 5.429325] raw: 0000000000000000 0000000000100010 00000001ffffffff ffff88800e312401 [ 5.429875] page dumped because: kasan: bad access detected [ 5.430214] page->mem_cgroup:ffff88800e312401 [ 5.430471] [ 5.430564] Memory state around the buggy address: [ 5.430846] ffff88800e312180: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 5.431267] ffff88800e312200: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 fc [ 5.431705] >ffff88800e312280: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 5.432123] ^ [ 5.432391] ffff88800e312300: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 fc [ 5.432810] ffff88800e312380: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 5.433229] ================================================================== [ 5.433648] Disabling lock debugging due to kernel taint Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Reported-by: Kyle Zeng Signed-off-by: Jamal Hadi Salim Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit f02327a4877a06cbc8277e22d4834cb189565187 Author: Jamal Hadi Salim Date: Sun Jan 1 16:57:43 2023 -0500 net: sched: atm: dont intepret cls results when asked to drop [ Upstream commit a2965c7be0522eaa18808684b7b82b248515511b ] If asked to drop a packet via TC_ACT_SHOT it is unsafe to assume res.class contains a valid pointer Fixes: b0188d4dbe5f ("[NET_SCHED]: sch_atm: Lindent") Signed-off-by: Jamal Hadi Salim Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 95da1882ce9372ba20278f87cdb7a34f9812c4b5 Author: Miaoqian Lin Date: Mon Jan 2 12:20:39 2023 +0400 gpio: sifive: Fix refcount leak in sifive_gpio_probe [ Upstream commit 694175cd8a1643cde3acb45c9294bca44a8e08e9 ] of_irq_find_parent() returns a node pointer with refcount incremented, We should use of_node_put() on it when not needed anymore. Add missing of_node_put() to avoid refcount leak. Fixes: 96868dce644d ("gpio/sifive: Add GPIO driver for SiFive SoCs") Signed-off-by: Miaoqian Lin Signed-off-by: Bartosz Golaszewski Signed-off-by: Sasha Levin commit da9c9883ec96670bf902a79b1e87532397553a72 Author: Xiubo Li Date: Thu Nov 17 10:43:21 2022 +0800 ceph: switch to vfs_inode_has_locks() to fix file lock bug [ Upstream commit 461ab10ef7e6ea9b41a0571a7fc6a72af9549a3c ] For the POSIX locks they are using the same owner, which is the thread id. And multiple POSIX locks could be merged into single one, so when checking whether the 'file' has locks may fail. For a file where some openers use locking and others don't is a really odd usage pattern though. Locks are like stoplights -- they only work if everyone pays attention to them. Just switch ceph_get_caps() to check whether any locks are set on the inode. If there are POSIX/OFD/FLOCK locks on the file at the time, we should set CHECK_FILELOCK, regardless of what fd was used to set the lock. Fixes: ff5d913dfc71 ("ceph: return -EIO if read/write against filp that lost file locks") Signed-off-by: Xiubo Li Reviewed-by: Jeff Layton Reviewed-by: Ilya Dryomov Signed-off-by: Ilya Dryomov Signed-off-by: Sasha Levin commit 54e72ce5f1d7bb0eccd293f65cc2571a1b3a1a8f Author: Jeff Layton Date: Mon Nov 14 08:33:09 2022 -0500 filelock: new helper: vfs_inode_has_locks [ Upstream commit ab1ddef98a715eddb65309ffa83267e4e84a571e ] Ceph has a need to know whether a particular inode has any locks set on it. It's currently tracking that by a num_locks field in its filp->private_data, but that's problematic as it tries to decrement this field when releasing locks and that can race with the file being torn down. Add a new vfs_inode_has_locks helper that just returns whether any locks are currently held on the inode. Reviewed-by: Xiubo Li Reviewed-by: Christoph Hellwig Signed-off-by: Jeff Layton Stable-dep-of: 461ab10ef7e6 ("ceph: switch to vfs_inode_has_locks() to fix file lock bug") Signed-off-by: Sasha Levin commit f34b03ce3a86af33a32d9a576391b771f01f2cba Author: Carlo Caione Date: Mon Dec 19 09:43:05 2022 +0100 drm/meson: Reduce the FIFO lines held when AFBC is not used [ Upstream commit 3b754ed6d1cd90017e66e5cc16f3923e4a952ffc ] Having a bigger number of FIFO lines held after vsync is only useful to SoCs using AFBC to give time to the AFBC decoder to be reset, configured and enabled again. For SoCs not using AFBC this, on the contrary, is causing on some displays issues and a few pixels vertical offset in the displayed image. Conditionally increase the number of lines held after vsync only for SoCs using AFBC, leaving the default value for all the others. Fixes: 24e0d4058eff ("drm/meson: hold 32 lines after vsync to give time for AFBC start") Signed-off-by: Carlo Caione Acked-by: Martin Blumenstingl Acked-by: Neil Armstrong [narmstrong: added fixes tag] Signed-off-by: Neil Armstrong Link: https://patchwork.freedesktop.org/patch/msgid/20221216-afbc_s905x-v1-0-033bebf780d9@baylibre.com Signed-off-by: Sasha Levin commit 05a8410b0fce194c5006d48244cd3546b92d022f Author: Maor Gottlieb Date: Wed Dec 28 14:56:10 2022 +0200 RDMA/mlx5: Fix validation of max_rd_atomic caps for DC [ Upstream commit 8de8482fe5732fbef4f5af82bc0c0362c804cd1f ] Currently, when modifying DC, we validate max_rd_atomic user attribute against the RC cap, validate against DC. RC and DC QP types have different device limitations. This can cause userspace created DC QPs to malfunction. Fixes: c32a4f296e1d ("IB/mlx5: Add support for DC Initiator QP") Link: https://lore.kernel.org/r/0c5aee72cea188c3bb770f4207cce7abc9b6fc74.1672231736.git.leonro@nvidia.com Signed-off-by: Maor Gottlieb Signed-off-by: Leon Romanovsky Signed-off-by: Sasha Levin commit 8d89870d63758363b07ace5c2df82d6bf865f78b Author: Shay Drory Date: Wed Dec 28 14:56:09 2022 +0200 RDMA/mlx5: Fix mlx5_ib_get_hw_stats when used for device [ Upstream commit 38b50aa44495d5eb4218f0b82fc2da76505cec53 ] Currently, when mlx5_ib_get_hw_stats() is used for device (port_num = 0), there is a special handling in order to use the correct counters, but, port_num is being passed down the stack without any change. Also, some functions assume that port_num >=1. As a result, the following oops can occur. BUG: unable to handle page fault for address: ffff89510294f1a8 #PF: supervisor write access in kernel mode #PF: error_code(0x0002) - not-present page PGD 0 P4D 0 Oops: 0002 [#1] SMP CPU: 8 PID: 1382 Comm: devlink Tainted: G W 6.1.0-rc4_for_upstream_base_2022_11_10_16_12 #1 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 RIP: 0010:_raw_spin_lock+0xc/0x20 Call Trace: mlx5_ib_get_native_port_mdev+0x73/0xe0 [mlx5_ib] do_get_hw_stats.constprop.0+0x109/0x160 [mlx5_ib] mlx5_ib_get_hw_stats+0xad/0x180 [mlx5_ib] ib_setup_device_attrs+0xf0/0x290 [ib_core] ib_register_device+0x3bb/0x510 [ib_core] ? atomic_notifier_chain_register+0x67/0x80 __mlx5_ib_add+0x2b/0x80 [mlx5_ib] mlx5r_probe+0xb8/0x150 [mlx5_ib] ? auxiliary_match_id+0x6a/0x90 auxiliary_bus_probe+0x3c/0x70 ? driver_sysfs_add+0x6b/0x90 really_probe+0xcd/0x380 __driver_probe_device+0x80/0x170 driver_probe_device+0x1e/0x90 __device_attach_driver+0x7d/0x100 ? driver_allows_async_probing+0x60/0x60 ? driver_allows_async_probing+0x60/0x60 bus_for_each_drv+0x7b/0xc0 __device_attach+0xbc/0x200 bus_probe_device+0x87/0xa0 device_add+0x404/0x940 ? dev_set_name+0x53/0x70 __auxiliary_device_add+0x43/0x60 add_adev+0x99/0xe0 [mlx5_core] mlx5_attach_device+0xc8/0x120 [mlx5_core] mlx5_load_one_devl_locked+0xb2/0xe0 [mlx5_core] devlink_reload+0x133/0x250 devlink_nl_cmd_reload+0x480/0x570 ? devlink_nl_pre_doit+0x44/0x2b0 genl_family_rcv_msg_doit.isra.0+0xc2/0x110 genl_rcv_msg+0x180/0x2b0 ? devlink_nl_cmd_region_read_dumpit+0x540/0x540 ? devlink_reload+0x250/0x250 ? devlink_put+0x50/0x50 ? genl_family_rcv_msg_doit.isra.0+0x110/0x110 netlink_rcv_skb+0x54/0x100 genl_rcv+0x24/0x40 netlink_unicast+0x1f6/0x2c0 netlink_sendmsg+0x237/0x490 sock_sendmsg+0x33/0x40 __sys_sendto+0x103/0x160 ? handle_mm_fault+0x10e/0x290 ? do_user_addr_fault+0x1c0/0x5f0 __x64_sys_sendto+0x25/0x30 do_syscall_64+0x3d/0x90 entry_SYSCALL_64_after_hwframe+0x46/0xb0 Fix it by setting port_num to 1 in order to get device status and remove unused variable. Fixes: aac4492ef23a ("IB/mlx5: Update counter implementation for dual port RoCE") Link: https://lore.kernel.org/r/98b82994c3cd3fa593b8a75ed3f3901e208beb0f.1672231736.git.leonro@nvidia.com Signed-off-by: Shay Drory Reviewed-by: Patrisious Haddad Signed-off-by: Leon Romanovsky Signed-off-by: Sasha Levin commit 4d112f001612c79927c1ecf29522b34c4fa292e0 Author: Miaoqian Lin Date: Thu Dec 29 10:29:25 2022 +0400 net: phy: xgmiitorgmii: Fix refcount leak in xgmiitorgmii_probe [ Upstream commit d039535850ee47079d59527e96be18d8e0daa84b ] of_phy_find_device() return device node with refcount incremented. Call put_device() to relese it when not needed anymore. Fixes: ab4e6ee578e8 ("net: phy: xgmiitorgmii: Check phy_driver ready before accessing") Signed-off-by: Miaoqian Lin Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit e5fbeb3d16b44c91063fe0bc951d0182b5699388 Author: David Arinzon Date: Thu Dec 29 07:30:11 2022 +0000 net: ena: Update NUMA TPH hint register upon NUMA node update [ Upstream commit a8ee104f986e720cea52133885cc822d459398c7 ] The device supports a PCIe optimization hint, which indicates on which NUMA the queue is currently processed. This hint is utilized by PCIe in order to reduce its access time by accessing the correct NUMA resources and maintaining cache coherence. The driver calls the register update for the hint (called TPH - TLP Processing Hint) during the NAPI loop. Though the update is expected upon a NUMA change (when a queue is moved from one NUMA to the other), the current logic performs a register update when the queue is moved to a different CPU, but the CPU is not necessarily in a different NUMA. The changes include: 1. Performing the TPH update only when the queue has switched a NUMA node. 2. Moving the TPH update call to be triggered only when NAPI was scheduled from interrupt context, as opposed to a busy-polling loop. This is due to the fact that during busy-polling, the frequency of CPU switches for a particular queue is significantly higher, thus, the likelihood to switch NUMA is much higher. Therefore, providing the frequent updates to the device upon a NUMA update are unlikely to be beneficial. Fixes: 1738cd3ed342 ("net: ena: Add a driver for Amazon Elastic Network Adapters (ENA)") Signed-off-by: David Arinzon Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 7840b93cfd4c445e2f5701e9d815853ad1c9a650 Author: David Arinzon Date: Thu Dec 29 07:30:10 2022 +0000 net: ena: Set default value for RX interrupt moderation [ Upstream commit e712f3e4920b3a1a5e6b536827d118e14862896c ] RX ring can be NULL in XDP use cases where only TX queues are configured. In this scenario, the RX interrupt moderation value sent to the device remains in its default value of 0. In this change, setting the default value of the RX interrupt moderation to be the same as of the TX. Fixes: 548c4940b9f1 ("net: ena: Implement XDP_TX action") Signed-off-by: David Arinzon Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit d09b7a9d2f347a0d8387d1a0ad857b4adb92ebca Author: David Arinzon Date: Thu Dec 29 07:30:09 2022 +0000 net: ena: Fix rx_copybreak value update [ Upstream commit c7062aaee099f2f43d6f07a71744b44b94b94b34 ] Make the upper bound on rx_copybreak tighter, by making sure it is smaller than the minimum of mtu and ENA_PAGE_SIZE. With the current upper bound of mtu, rx_copybreak can be larger than a page. Such large rx_copybreak will not bring any performance benefit to the user and therefore makes no sense. In addition, the value update was only reflected in the adapter structure, but not applied for each ring, causing it to not take effect. Fixes: 1738cd3ed342 ("net: ena: Add a driver for Amazon Elastic Network Adapters (ENA)") Signed-off-by: Osama Abboud Signed-off-by: Arthur Kiyanovski Signed-off-by: David Arinzon Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 0e7ad9b006d703c03dd3dcf618f898b7db3b81c9 Author: David Arinzon Date: Thu Dec 29 07:30:08 2022 +0000 net: ena: Use bitmask to indicate packet redirection [ Upstream commit 59811faa2c54dbcf44d575b5a8f6e7077da88dc2 ] Redirecting packets with XDP Redirect is done in two phases: 1. A packet is passed by the driver to the kernel using xdp_do_redirect(). 2. After finishing polling for new packets the driver lets the kernel know that it can now process the redirected packet using xdp_do_flush_map(). The packets' redirection is handled in the napi context of the queue that called xdp_do_redirect() To avoid calling xdp_do_flush_map() each time the driver first checks whether any packets were redirected, using xdp_flags |= xdp_verdict; and if (xdp_flags & XDP_REDIRECT) xdp_do_flush_map() essentially treating XDP instructions as a bitmask, which isn't the case: enum xdp_action { XDP_ABORTED = 0, XDP_DROP, XDP_PASS, XDP_TX, XDP_REDIRECT, }; Given the current possible values of xdp_action, the current design doesn't have a bug (since XDP_REDIRECT = 100b), but it is still flawed. This patch makes the driver use a bitmask instead, to avoid future issues. Fixes: a318c70ad152 ("net: ena: introduce XDP redirect implementation") Signed-off-by: Shay Agroskin Signed-off-by: David Arinzon Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 5d4964984b994400d11621e3be32f7e00521f399 Author: David Arinzon Date: Thu Dec 29 07:30:07 2022 +0000 net: ena: Account for the number of processed bytes in XDP [ Upstream commit c7f5e34d906320fdc996afa616676161c029cc02 ] The size of packets that were forwarded or dropped by XDP wasn't added to the total processed bytes statistic. Fixes: 548c4940b9f1 ("net: ena: Implement XDP_TX action") Signed-off-by: Shay Agroskin Signed-off-by: David Arinzon Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit f17d9aec07dec852e5988988fc002ba9c9dae48d Author: David Arinzon Date: Thu Dec 29 07:30:06 2022 +0000 net: ena: Don't register memory info on XDP exchange [ Upstream commit 9c9e539956fa67efb8a65e32b72a853740b33445 ] Since the queues aren't destroyed when we only exchange XDP programs, there's no need to re-register them again. Fixes: 548c4940b9f1 ("net: ena: Implement XDP_TX action") Signed-off-by: Shay Agroskin Signed-off-by: David Arinzon Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit a4aa727ad0b29813be09038283204db04ee689dd Author: David Arinzon Date: Thu Dec 29 07:30:05 2022 +0000 net: ena: Fix toeplitz initial hash value [ Upstream commit 332b49ff637d6c1a75b971022a8b992cf3c57db1 ] On driver initialization, RSS hash initial value is set to zero, instead of the default value. This happens because we pass NULL as the RSS key parameter, which caused us to never initialize the RSS hash value. This patch fixes it by making sure the initial value is set, no matter what the value of the RSS key is. Fixes: 91a65b7d3ed8 ("net: ena: fix potential crash when rxfh key is NULL") Signed-off-by: Nati Koler Signed-off-by: David Arinzon Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 0bec17f1ce31c6c948396d8c633810a3e0e4a242 Author: Jiguang Xiao Date: Wed Dec 28 16:14:47 2022 +0800 net: amd-xgbe: add missed tasklet_kill [ Upstream commit d530ece70f16f912e1d1bfeea694246ab78b0a4b ] The driver does not call tasklet_kill in several places. Add the calls to fix it. Fixes: 85b85c853401 ("amd-xgbe: Re-issue interrupt if interrupt status not cleared") Signed-off-by: Jiguang Xiao Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit cb2f74685f76388b16f4750faaf5dafed6163dc0 Author: Adham Faris Date: Wed Dec 14 16:02:57 2022 +0200 net/mlx5e: Fix hw mtu initializing at XDP SQ allocation [ Upstream commit 1e267ab88dc44c48f556218f7b7f14c76f7aa066 ] Current xdp xmit functions logic (mlx5e_xmit_xdp_frame_mpwqe or mlx5e_xmit_xdp_frame), validates xdp packet length by comparing it to hw mtu (configured at xdp sq allocation) before xmiting it. This check does not account for ethernet fcs length (calculated and filled by the nic). Hence, when we try sending packets with length > (hw-mtu - ethernet-fcs-size), the device port drops it and tx_errors_phy is incremented. Desired behavior is to catch these packets and drop them by the driver. Fix this behavior in XDP SQ allocation function (mlx5e_alloc_xdpsq) by subtracting ethernet FCS header size (4 Bytes) from current hw mtu value, since ethernet FCS is calculated and written to ethernet frames by the nic. Fixes: d8bec2b29a82 ("net/mlx5e: Support bpf_xdp_adjust_head()") Signed-off-by: Adham Faris Reviewed-by: Tariq Toukan Signed-off-by: Saeed Mahameed Signed-off-by: Sasha Levin commit 6c72abb78b01cc286e8440b0e8ce1cffa86645fe Author: Chris Mi Date: Mon Dec 5 09:22:50 2022 +0800 net/mlx5e: Always clear dest encap in neigh-update-del [ Upstream commit 2951b2e142ecf6e0115df785ba91e91b6da74602 ] The cited commit introduced a bug for multiple encapsulations flow. If one dest encap becomes invalid, the flow is set slow path flag. But when other dests encap become invalid, they are not cleared due to slow path flag of the flow. When neigh-update-add is running, it will use invalid encap. Fix it by checking slow path flag after clearing dest encap. Fixes: 9a5f9cc794e1 ("net/mlx5e: Fix possible use-after-free deleting fdb rule") Signed-off-by: Chris Mi Reviewed-by: Roi Dayan Signed-off-by: Saeed Mahameed Signed-off-by: Sasha Levin commit b36783bc11d1683f40c81da2824529a5d9e5c6bc Author: Roi Dayan Date: Thu Nov 25 14:32:58 2021 +0200 net/mlx5e: TC, Refactor mlx5e_tc_add_flow_mod_hdr() to get flow attr [ Upstream commit ff99316700799b84e842f819a44db608557bae3e ] In later commit we are going to instantiate multiple attr instances for flow instead of single attr. Make sure mlx5e_tc_add_flow_mod_hdr() use the correct attr and not flow->attr. Signed-off-by: Roi Dayan Reviewed-by: Oz Shlomo Signed-off-by: Saeed Mahameed Stable-dep-of: 2951b2e142ec ("net/mlx5e: Always clear dest encap in neigh-update-del") Signed-off-by: Sasha Levin commit f8c10eeba31bcaafbe84f6d03dd29d3070e2bc36 Author: Dragos Tatulea Date: Mon Nov 28 15:24:21 2022 +0200 net/mlx5e: IPoIB, Don't allow CQE compression to be turned on by default [ Upstream commit b12d581e83e3ae1080c32ab83f123005bd89a840 ] mlx5e_build_nic_params will turn CQE compression on if the hardware capability is enabled and the slow_pci_heuristic condition is detected. As IPoIB doesn't support CQE compression, make sure to disable the feature in the IPoIB profile init. Please note that the feature is not exposed to the user for IPoIB interfaces, so it can't be subsequently turned on. Fixes: b797a684b0dd ("net/mlx5e: Enable CQE compression when PCI is slower than link") Signed-off-by: Dragos Tatulea Reviewed-by: Gal Pressman Signed-off-by: Saeed Mahameed Signed-off-by: Sasha Levin commit 7227bbb7c140ed8516c4678b64e39a5a76150e3b Author: Shay Drory Date: Thu Nov 24 13:34:12 2022 +0200 net/mlx5: Avoid recovery in probe flows [ Upstream commit 9078e843efec530f279a155f262793c58b0746bd ] Currently, recovery is done without considering whether the device is still in probe flow. This may lead to recovery before device have finished probed successfully. e.g.: while mlx5_init_one() is running. Recovery flow is using functionality that is loaded only by mlx5_init_one(), and there is no point in running recovery without mlx5_init_one() finished successfully. Fix it by waiting for probe flow to finish and checking whether the device is probed before trying to perform recovery. Fixes: 51d138c2610a ("net/mlx5: Fix health error state handling") Signed-off-by: Shay Drory Reviewed-by: Moshe Shemesh Signed-off-by: Saeed Mahameed Signed-off-by: Sasha Levin commit 9369b9afa8b0012d92f70b68803d67d457211c02 Author: Jiri Pirko Date: Tue Oct 18 12:51:52 2022 +0200 net/mlx5: Add forgotten cleanup calls into mlx5_init_once() error path [ Upstream commit 2a35b2c2e6a252eda2134aae6a756861d9299531 ] There are two cleanup calls missing in mlx5_init_once() error path. Add them making the error path flow to be the same as mlx5_cleanup_once(). Fixes: 52ec462eca9b ("net/mlx5: Add reserved-gids support") Fixes: 7c39afb394c7 ("net/mlx5: PTP code migration to driver core section") Signed-off-by: Jiri Pirko Signed-off-by: Saeed Mahameed Signed-off-by: Sasha Levin commit d966f2ee4b8e4d11efed465ddfdbf641a11dd9ca Author: Moshe Shemesh Date: Mon Dec 12 10:42:15 2022 +0200 net/mlx5: E-Switch, properly handle ingress tagged packets on VST [ Upstream commit 1f0ae22ab470946143485a02cc1cd7e05c0f9120 ] Fix SRIOV VST mode behavior to insert cvlan when a guest tag is already present in the frame. Previous VST mode behavior was to drop packets or override existing tag, depending on the device version. In this patch we fix this behavior by correctly building the HW steering rule with a push vlan action, or for older devices we ask the FW to stack the vlan when a vlan is already present. Fixes: 07bab9502641 ("net/mlx5: E-Switch, Refactor eswitch ingress acl codes") Fixes: dfcb1ed3c331 ("net/mlx5: E-Switch, Vport ingress/egress ACLs rules for VST mode") Signed-off-by: Moshe Shemesh Reviewed-by: Mark Bloch Signed-off-by: Saeed Mahameed Signed-off-by: Sasha Levin commit 6a37a01aba5df1bbee5afff559a37727852c2726 Author: Stefano Garzarella Date: Thu Nov 10 15:13:35 2022 +0100 vdpa_sim: fix vringh initialization in vdpasim_queue_ready() [ Upstream commit 794ec498c9fa79e6bfd71b931410d5897a9c00d4 ] When we initialize vringh, we should pass the features and the number of elements in the virtqueue negotiated with the driver, otherwise operations with vringh may fail. This was discovered in a case where the driver sets a number of elements in the virtqueue different from the value returned by .get_vq_num_max(). In vdpasim_vq_reset() is safe to initialize the vringh with default values, since the virtqueue will not be used until vdpasim_queue_ready() is called again. Fixes: 2c53d0f64c06 ("vdpasim: vDPA device simulator") Signed-off-by: Stefano Garzarella Message-Id: <20221110141335.62171-1-sgarzare@redhat.com> Signed-off-by: Michael S. Tsirkin Acked-by: Jason Wang Acked-by: Eugenio Pérez Signed-off-by: Sasha Levin commit e3462410c36d7fa265e62255809ce55ae4bd0326 Author: Stefano Garzarella Date: Wed Nov 9 11:25:03 2022 +0100 vhost: fix range used in translate_desc() [ Upstream commit 98047313cdb46828093894d0ac8b1183b8b317f9 ] vhost_iotlb_itree_first() requires `start` and `last` parameters to search for a mapping that overlaps the range. In translate_desc() we cyclically call vhost_iotlb_itree_first(), incrementing `addr` by the amount already translated, so rightly we move the `start` parameter passed to vhost_iotlb_itree_first(), but we should hold the `last` parameter constant. Let's fix it by saving the `last` parameter value before incrementing `addr` in the loop. Fixes: a9709d6874d5 ("vhost: convert pre sorted vhost memory array to interval tree") Acked-by: Jason Wang Signed-off-by: Stefano Garzarella Message-Id: <20221109102503.18816-3-sgarzare@redhat.com> Signed-off-by: Michael S. Tsirkin Signed-off-by: Sasha Levin commit 13871f60ec2f603752451f3c2e88be671d7ea56d Author: Stefano Garzarella Date: Wed Nov 9 11:25:02 2022 +0100 vringh: fix range used in iotlb_translate() [ Upstream commit f85efa9b0f5381874f727bd98f56787840313f0b ] vhost_iotlb_itree_first() requires `start` and `last` parameters to search for a mapping that overlaps the range. In iotlb_translate() we cyclically call vhost_iotlb_itree_first(), incrementing `addr` by the amount already translated, so rightly we move the `start` parameter passed to vhost_iotlb_itree_first(), but we should hold the `last` parameter constant. Let's fix it by saving the `last` parameter value before incrementing `addr` in the loop. Fixes: 9ad9c49cfe97 ("vringh: IOTLB support") Acked-by: Jason Wang Signed-off-by: Stefano Garzarella Message-Id: <20221109102503.18816-2-sgarzare@redhat.com> Signed-off-by: Michael S. Tsirkin Signed-off-by: Sasha Levin commit e05d4c8c287a0deb99494ab4bf4aac5af123aa64 Author: Yuan Can Date: Tue Nov 8 10:17:05 2022 +0000 vhost/vsock: Fix error handling in vhost_vsock_init() [ Upstream commit 7a4efe182ca61fb3e5307e69b261c57cbf434cd4 ] A problem about modprobe vhost_vsock failed is triggered with the following log given: modprobe: ERROR: could not insert 'vhost_vsock': Device or resource busy The reason is that vhost_vsock_init() returns misc_register() directly without checking its return value, if misc_register() failed, it returns without calling vsock_core_unregister() on vhost_transport, resulting the vhost_vsock can never be installed later. A simple call graph is shown as below: vhost_vsock_init() vsock_core_register() # register vhost_transport misc_register() device_create_with_groups() device_create_groups_vargs() dev = kzalloc(...) # OOM happened # return without unregister vhost_transport Fix by calling vsock_core_unregister() when misc_register() returns error. Fixes: 433fc58e6bf2 ("VSOCK: Introduce vhost_vsock.ko") Signed-off-by: Yuan Can Message-Id: <20221108101705.45981-1-yuancan@huawei.com> Signed-off-by: Michael S. Tsirkin Reviewed-by: Stefano Garzarella Acked-by: Jason Wang Signed-off-by: Sasha Levin commit 586e6fd7d581f987f7d0d2592edf0b26397e783e Author: ruanjinjie Date: Thu Nov 10 16:23:48 2022 +0800 vdpa_sim: fix possible memory leak in vdpasim_net_init() and vdpasim_blk_init() [ Upstream commit aeca7ff254843d49a8739f07f7dab1341450111d ] Inject fault while probing module, if device_register() fails in vdpasim_net_init() or vdpasim_blk_init(), but the refcount of kobject is not decreased to 0, the name allocated in dev_set_name() is leaked. Fix this by calling put_device(), so that name can be freed in callback function kobject_cleanup(). (vdpa_sim_net) unreferenced object 0xffff88807eebc370 (size 16): comm "modprobe", pid 3848, jiffies 4362982860 (age 18.153s) hex dump (first 16 bytes): 76 64 70 61 73 69 6d 5f 6e 65 74 00 6b 6b 6b a5 vdpasim_net.kkk. backtrace: [] __kmalloc_node_track_caller+0x4e/0x150 [] kstrdup+0x33/0x60 [] kobject_set_name_vargs+0x41/0x110 [] dev_set_name+0xab/0xe0 [] device_add+0xe3/0x1a80 [] 0xffffffffa0270013 [] do_one_initcall+0x87/0x2e0 [] do_init_module+0x1ab/0x640 [] load_module+0x5d00/0x77f0 [] __do_sys_finit_module+0x110/0x1b0 [] do_syscall_64+0x35/0x80 [] entry_SYSCALL_64_after_hwframe+0x46/0xb0 (vdpa_sim_blk) unreferenced object 0xffff8881070c1250 (size 16): comm "modprobe", pid 6844, jiffies 4364069319 (age 17.572s) hex dump (first 16 bytes): 76 64 70 61 73 69 6d 5f 62 6c 6b 00 6b 6b 6b a5 vdpasim_blk.kkk. backtrace: [] __kmalloc_node_track_caller+0x4e/0x150 [] kstrdup+0x33/0x60 [] kobject_set_name_vargs+0x41/0x110 [] dev_set_name+0xab/0xe0 [] device_add+0xe3/0x1a80 [] 0xffffffffa0220013 [] do_one_initcall+0x87/0x2e0 [] do_init_module+0x1ab/0x640 [] load_module+0x5d00/0x77f0 [] __do_sys_finit_module+0x110/0x1b0 [] do_syscall_64+0x35/0x80 [] entry_SYSCALL_64_after_hwframe+0x46/0xb0 Fixes: 899c4d187f6a ("vdpa_sim_blk: add support for vdpa management tool") Fixes: a3c06ae158dd ("vdpa_sim_net: Add support for user supported devices") Signed-off-by: ruanjinjie Reviewed-by: Stefano Garzarella Message-Id: <20221110082348.4105476-1-ruanjinjie@huawei.com> Signed-off-by: Michael S. Tsirkin Acked-by: Jason Wang Signed-off-by: Sasha Levin commit b63bc2db244c1b57e36f16ea5f2a1becda413f68 Author: Miaoqian Lin Date: Fri Dec 23 11:37:18 2022 +0400 nfc: Fix potential resource leaks [ Upstream commit df49908f3c52d211aea5e2a14a93bbe67a2cb3af ] nfc_get_device() take reference for the device, add missing nfc_put_device() to release it when not need anymore. Also fix the style warnning by use error EOPNOTSUPP instead of ENOTSUPP. Fixes: 5ce3f32b5264 ("NFC: netlink: SE API implementation") Fixes: 29e76924cf08 ("nfc: netlink: Add capability to reply to vendor_cmd with data") Signed-off-by: Miaoqian Lin Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 945e58bdaf6faf6e3f957d182244fa830acddab4 Author: Johnny S. Lee Date: Thu Dec 22 22:34:05 2022 +0800 net: dsa: mv88e6xxx: depend on PTP conditionally [ Upstream commit 30e725537546248bddc12eaac2fe0a258917f190 ] PTP hardware timestamping related objects are not linked when PTP support for MV88E6xxx (NET_DSA_MV88E6XXX_PTP) is disabled, therefore NET_DSA_MV88E6XXX should not depend on PTP_1588_CLOCK_OPTIONAL regardless of NET_DSA_MV88E6XXX_PTP. Instead, condition more strictly on how NET_DSA_MV88E6XXX_PTP's dependencies are met, making sure that it cannot be enabled when NET_DSA_MV88E6XXX=y and PTP_1588_CLOCK=m. In other words, this commit allows NET_DSA_MV88E6XXX to be built-in while PTP_1588_CLOCK is a module, as long as NET_DSA_MV88E6XXX_PTP is prevented from being enabled. Fixes: e5f31552674e ("ethernet: fix PTP_1588_CLOCK dependencies") Signed-off-by: Johnny S. Lee Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 95df720e64a6409d8152827a776c43f615e3321a Author: Daniil Tatianin Date: Thu Dec 22 14:52:28 2022 +0300 qlcnic: prevent ->dcb use-after-free on qlcnic_dcb_enable() failure [ Upstream commit 13a7c8964afcd8ca43c0b6001ebb0127baa95362 ] adapter->dcb would get silently freed inside qlcnic_dcb_enable() in case qlcnic_dcb_attach() would return an error, which always happens under OOM conditions. This would lead to use-after-free because both of the existing callers invoke qlcnic_dcb_get_info() on the obtained pointer, which is potentially freed at that point. Propagate errors from qlcnic_dcb_enable(), and instead free the dcb pointer at callsite using qlcnic_dcb_free(). This also removes the now unused qlcnic_clear_dcb_ops() helper, which was a simple wrapper around kfree() also causing memory leaks for partially initialized dcb. Found by Linux Verification Center (linuxtesting.org) with the SVACE static analysis tool. Fixes: 3c44bba1d270 ("qlcnic: Disable DCB operations from SR-IOV VFs") Reviewed-by: Michal Swiatkowski Signed-off-by: Daniil Tatianin Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 6c55953e232ea668731091d111066521f3b7719b Author: Hawkins Jiawei Date: Thu Dec 22 11:51:19 2022 +0800 net: sched: fix memory leak in tcindex_set_parms [ Upstream commit 399ab7fe0fa0d846881685fd4e57e9a8ef7559f7 ] Syzkaller reports a memory leak as follows: ==================================== BUG: memory leak unreferenced object 0xffff88810c287f00 (size 256): comm "syz-executor105", pid 3600, jiffies 4294943292 (age 12.990s) hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [] kmalloc_trace+0x20/0x90 mm/slab_common.c:1046 [] kmalloc include/linux/slab.h:576 [inline] [] kmalloc_array include/linux/slab.h:627 [inline] [] kcalloc include/linux/slab.h:659 [inline] [] tcf_exts_init include/net/pkt_cls.h:250 [inline] [] tcindex_set_parms+0xa7/0xbe0 net/sched/cls_tcindex.c:342 [] tcindex_change+0xdf/0x120 net/sched/cls_tcindex.c:553 [] tc_new_tfilter+0x4f2/0x1100 net/sched/cls_api.c:2147 [] rtnetlink_rcv_msg+0x4dc/0x5d0 net/core/rtnetlink.c:6082 [] netlink_rcv_skb+0x87/0x1d0 net/netlink/af_netlink.c:2540 [] netlink_unicast_kernel net/netlink/af_netlink.c:1319 [inline] [] netlink_unicast+0x397/0x4c0 net/netlink/af_netlink.c:1345 [] netlink_sendmsg+0x396/0x710 net/netlink/af_netlink.c:1921 [] sock_sendmsg_nosec net/socket.c:714 [inline] [] sock_sendmsg+0x56/0x80 net/socket.c:734 [] ____sys_sendmsg+0x178/0x410 net/socket.c:2482 [] ___sys_sendmsg+0xa8/0x110 net/socket.c:2536 [] __sys_sendmmsg+0x105/0x330 net/socket.c:2622 [] __do_sys_sendmmsg net/socket.c:2651 [inline] [] __se_sys_sendmmsg net/socket.c:2648 [inline] [] __x64_sys_sendmmsg+0x24/0x30 net/socket.c:2648 [] do_syscall_x64 arch/x86/entry/common.c:50 [inline] [] do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80 [] entry_SYSCALL_64_after_hwframe+0x63/0xcd ==================================== Kernel uses tcindex_change() to change an existing filter properties. Yet the problem is that, during the process of changing, if `old_r` is retrieved from `p->perfect`, then kernel uses tcindex_alloc_perfect_hash() to newly allocate filter results, uses tcindex_filter_result_init() to clear the old filter result, without destroying its tcf_exts structure, which triggers the above memory leak. To be more specific, there are only two source for the `old_r`, according to the tcindex_lookup(). `old_r` is retrieved from `p->perfect`, or `old_r` is retrieved from `p->h`. * If `old_r` is retrieved from `p->perfect`, kernel uses tcindex_alloc_perfect_hash() to newly allocate the filter results. Then `r` is assigned with `cp->perfect + handle`, which is newly allocated. So condition `old_r && old_r != r` is true in this situation, and kernel uses tcindex_filter_result_init() to clear the old filter result, without destroying its tcf_exts structure * If `old_r` is retrieved from `p->h`, then `p->perfect` is NULL according to the tcindex_lookup(). Considering that `cp->h` is directly copied from `p->h` and `p->perfect` is NULL, `r` is assigned with `tcindex_lookup(cp, handle)`, whose value should be the same as `old_r`, so condition `old_r && old_r != r` is false in this situation, kernel ignores using tcindex_filter_result_init() to clear the old filter result. So only when `old_r` is retrieved from `p->perfect` does kernel use tcindex_filter_result_init() to clear the old filter result, which triggers the above memory leak. Considering that there already exists a tc_filter_wq workqueue to destroy the old tcindex_data by tcindex_partial_destroy_work() at the end of tcindex_set_parms(), this patch solves this memory leak bug by removing this old filter result clearing part and delegating it to the tc_filter_wq workqueue. Note that this patch doesn't introduce any other issues. If `old_r` is retrieved from `p->perfect`, this patch just delegates old filter result clearing part to the tc_filter_wq workqueue; If `old_r` is retrieved from `p->h`, kernel doesn't reach the old filter result clearing part, so removing this part has no effect. [Thanks to the suggestion from Jakub Kicinski, Cong Wang, Paolo Abeni and Dmitry Vyukov] Fixes: b9a24bb76bf6 ("net_sched: properly handle failure case of tcf_exts_init()") Link: https://lore.kernel.org/all/0000000000001de5c505ebc9ec59@google.com/ Reported-by: syzbot+232ebdbd36706c965ebf@syzkaller.appspotmail.com Tested-by: syzbot+232ebdbd36706c965ebf@syzkaller.appspotmail.com Cc: Cong Wang Cc: Jakub Kicinski Cc: Paolo Abeni Cc: Dmitry Vyukov Acked-by: Paolo Abeni Signed-off-by: Hawkins Jiawei Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit d14a4b24d58eb2f34a3ccbbe88c078b7710ecacf Author: Jian Shen Date: Thu Dec 22 14:43:43 2022 +0800 net: hns3: fix VF promisc mode not update when mac table full [ Upstream commit 8ee57c7b8406c7aa8ca31e014440c87c6383f429 ] Currently, it missed set HCLGE_VPORT_STATE_PROMISC_CHANGE flag for VF when vport->overflow_promisc_flags changed. So the VF won't check whether to update promisc mode in this case. So add it. Fixes: 1e6e76101fd9 ("net: hns3: configure promisc mode for VF asynchronously") Signed-off-by: Jian Shen Signed-off-by: Hao Lan Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit 7ed205b9478d9428b80cb8db075069f0589d67ad Author: Jian Shen Date: Thu Dec 22 14:43:42 2022 +0800 net: hns3: fix miss L3E checking for rx packet [ Upstream commit 7d89b53cea1a702f97117fb4361523519bb1e52c ] For device supports RXD advanced layout, the driver will return directly if the hardware finish the checksum calculate. It cause missing L3E checking for ip packets. Fixes it. Fixes: 1ddc028ac849 ("net: hns3: refactor out RX completion checksum") Signed-off-by: Jian Shen Signed-off-by: Hao Lan Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit 47868cb77f8ff81ab7235fb39eb9672a30a7607c Author: Peng Li Date: Thu Dec 2 16:35:55 2021 +0800 net: hns3: extract macro to simplify ring stats update code [ Upstream commit e6d72f6ac2ad4965491354d74b48e35a60abf298 ] As the code to update ring stats is alike for different ring stats type, this patch extract macro to simplify ring stats update code. Signed-off-by: Peng Li Signed-off-by: Guangbin Huang Signed-off-by: David S. Miller Stable-dep-of: 7d89b53cea1a ("net: hns3: fix miss L3E checking for rx packet") Signed-off-by: Sasha Levin commit 7457c5a7761ad88b6a8452d5f875555aaf76e013 Author: Hao Chen Date: Mon Nov 29 22:00:19 2021 +0800 net: hns3: refactor hns3_nic_reuse_page() [ Upstream commit e74a726da2c4dcedb8b0631f423d0044c7901a20 ] Split rx copybreak handle into a separate function from function hns3_nic_reuse_page() to improve code simplicity. Signed-off-by: Hao Chen Signed-off-by: Guangbin Huang Signed-off-by: David S. Miller Stable-dep-of: 7d89b53cea1a ("net: hns3: fix miss L3E checking for rx packet") Signed-off-by: Sasha Levin commit 4a6e9fb534c526f21b40e310fc0e15aa75d51512 Author: Jie Wang Date: Thu Dec 22 14:43:41 2022 +0800 net: hns3: add interrupts re-initialization while doing VF FLR [ Upstream commit 09e6b30eeb254f1818a008cace3547159e908dfd ] Currently keep alive message between PF and VF may be lost and the VF is unalive in PF. So the VF will not do reset during PF FLR reset process. This would make the allocated interrupt resources of VF invalid and VF would't receive or respond to PF any more. So this patch adds VF interrupts re-initialization during VF FLR for VF recovery in above cases. Fixes: 862d969a3a4d ("net: hns3: do VF's pci re-initialization while PF doing FLR") Signed-off-by: Jie Wang Signed-off-by: Hao Lan Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit 5e48ed805c4f9150c97297efb61e52faae60edd6 Author: Jeff Layton Date: Thu Dec 22 09:51:30 2022 -0500 nfsd: shut down the NFSv4 state objects before the filecache [ Upstream commit 789e1e10f214c00ca18fc6610824c5b9876ba5f2 ] Currently, we shut down the filecache before trying to clean up the stateids that depend on it. This leads to the kernel trying to free an nfsd_file twice, and a refcount overput on the nf_mark. Change the shutdown procedure to tear down all of the stateids prior to shutting down the filecache. Reported-and-tested-by: Wang Yugui Signed-off-by: Jeff Layton Fixes: 5e113224c17e ("nfsd: nfsd_file cache entries should be per net namespace") Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 7e2825f5fb84bd34105b4a19d847657e31a5faf6 Author: Shawn Bohrer Date: Tue Dec 20 12:59:03 2022 -0600 veth: Fix race with AF_XDP exposing old or uninitialized descriptors [ Upstream commit fa349e396e4886d742fd6501c599ec627ef1353b ] When AF_XDP is used on on a veth interface the RX ring is updated in two steps. veth_xdp_rcv() removes packet descriptors from the FILL ring fills them and places them in the RX ring updating the cached_prod pointer. Later xdp_do_flush() syncs the RX ring prod pointer with the cached_prod pointer allowing user-space to see the recently filled in descriptors. The rings are intended to be SPSC, however the existing order in veth_poll allows the xdp_do_flush() to run concurrently with another CPU creating a race condition that allows user-space to see old or uninitialized descriptors in the RX ring. This bug has been observed in production systems. To summarize, we are expecting this ordering: CPU 0 __xsk_rcv_zc() CPU 0 __xsk_map_flush() CPU 2 __xsk_rcv_zc() CPU 2 __xsk_map_flush() But we are seeing this order: CPU 0 __xsk_rcv_zc() CPU 2 __xsk_rcv_zc() CPU 0 __xsk_map_flush() CPU 2 __xsk_map_flush() This occurs because we rely on NAPI to ensure that only one napi_poll handler is running at a time for the given veth receive queue. napi_schedule_prep() will prevent multiple instances from getting scheduled. However calling napi_complete_done() signals that this napi_poll is complete and allows subsequent calls to napi_schedule_prep() and __napi_schedule() to succeed in scheduling a concurrent napi_poll before the xdp_do_flush() has been called. For the veth driver a concurrent call to napi_schedule_prep() and __napi_schedule() can occur on a different CPU because the veth xmit path can additionally schedule a napi_poll creating the race. The fix as suggested by Magnus Karlsson, is to simply move the xdp_do_flush() call before napi_complete_done(). This syncs the producer ring pointers before another instance of napi_poll can be scheduled on another CPU. It will also slightly improve performance by moving the flush closer to when the descriptors were placed in the RX ring. Fixes: d1396004dd86 ("veth: Add XDP TX and REDIRECT") Suggested-by: Magnus Karlsson Signed-off-by: Shawn Bohrer Link: https://lore.kernel.org/r/20221220185903.1105011-1-sbohrer@cloudflare.com Signed-off-by: Paolo Abeni Signed-off-by: Sasha Levin commit ac95cdafaca8d5ffdb44319ae8a23d9720acab3b Author: Pablo Neira Ayuso Date: Mon Dec 19 20:10:12 2022 +0100 netfilter: nf_tables: honor set timeout and garbage collection updates [ Upstream commit 123b99619cca94bdca0bf7bde9abe28f0a0dfe06 ] Set timeout and garbage collection interval updates are ignored on updates. Add transaction to update global set element timeout and garbage collection interval. Fixes: 96518518cc41 ("netfilter: add nftables") Suggested-by: Florian Westphal Signed-off-by: Pablo Neira Ayuso Signed-off-by: Sasha Levin commit 49677ea1513e753bd90ed1a1e2c206076b7da9ec Author: Ronak Doshi Date: Tue Dec 20 12:25:55 2022 -0800 vmxnet3: correctly report csum_level for encapsulated packet [ Upstream commit 3d8f2c4269d08f8793e946279dbdf5e972cc4911 ] Commit dacce2be3312 ("vmxnet3: add geneve and vxlan tunnel offload support") added support for encapsulation offload. However, the pathc did not report correctly the csum_level for encapsulated packet. This patch fixes this issue by reporting correct csum level for the encapsulated packet. Fixes: dacce2be3312 ("vmxnet3: add geneve and vxlan tunnel offload support") Signed-off-by: Ronak Doshi Acked-by: Peng Li Link: https://lore.kernel.org/r/20221220202556.24421-1-doshir@vmware.com Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit 9d30cb442156ebca5524927b73358cb5b8baf5be Author: Pablo Neira Ayuso Date: Mon Dec 19 20:09:00 2022 +0100 netfilter: nf_tables: perform type checking for existing sets [ Upstream commit f6594c372afd5cec8b1e9ee9ea8f8819d59c6fb1 ] If a ruleset declares a set name that matches an existing set in the kernel, then validate that this declaration really refers to the same set, otherwise bail out with EEXIST. Currently, the kernel reports success when adding a set that already exists in the kernel. This usually results in EINVAL errors at a later stage, when the user adds elements to the set, if the set declaration mismatches the existing set representation in the kernel. Add a new function to check that the set declaration really refers to the same existing set in the kernel. Fixes: 96518518cc41 ("netfilter: add nftables") Reported-by: Florian Westphal Signed-off-by: Pablo Neira Ayuso Signed-off-by: Sasha Levin commit c3bfb7784a09a10bca3e42f6d20ff02346a5bd40 Author: Pablo Neira Ayuso Date: Mon Dec 19 18:00:10 2022 +0100 netfilter: nf_tables: add function to create set stateful expressions [ Upstream commit a8fe4154fa5a1bae590b243ed60f871e5a5e1378 ] Add a helper function to allocate and initialize the stateful expressions that are defined in a set. This patch allows to reuse this code from the set update path, to check that type of the update matches the existing set in the kernel. Signed-off-by: Pablo Neira Ayuso Stable-dep-of: f6594c372afd ("netfilter: nf_tables: perform type checking for existing sets") Signed-off-by: Sasha Levin commit 996cd779c2a43e59a8717ad74b5088ab239bc54d Author: Pablo Neira Ayuso Date: Mon Dec 19 20:07:52 2022 +0100 netfilter: nf_tables: consolidate set description [ Upstream commit bed4a63ea4ae77cfe5aae004ef87379f0655260a ] Add the following fields to the set description: - key type - data type - object type - policy - gc_int: garbage collection interval) - timeout: element timeout This prepares for stricter set type checks on updates in a follow up patch. Signed-off-by: Pablo Neira Ayuso Stable-dep-of: f6594c372afd ("netfilter: nf_tables: perform type checking for existing sets") Signed-off-by: Sasha Levin commit 4f1105ee72d8c7c35d90e3491b31b2d9d6b7e33a Author: Steven Price Date: Mon Dec 19 14:01:30 2022 +0000 drm/panfrost: Fix GEM handle creation ref-counting [ Upstream commit 4217c6ac817451d5116687f3cc6286220dc43d49 ] panfrost_gem_create_with_handle() previously returned a BO but with the only reference being from the handle, which user space could in theory guess and release, causing a use-after-free. Additionally if the call to panfrost_gem_mapping_get() in panfrost_ioctl_create_bo() failed then a(nother) reference on the BO was dropped. The _create_with_handle() is a problematic pattern, so ditch it and instead create the handle in panfrost_ioctl_create_bo(). If the call to panfrost_gem_mapping_get() fails then this means that user space has indeed gone behind our back and freed the handle. In which case just return an error code. Reported-by: Rob Clark Fixes: f3ba91228e8e ("drm/panfrost: Add initial panfrost driver") Signed-off-by: Steven Price Reviewed-by: Rob Clark Link: https://patchwork.freedesktop.org/patch/msgid/20221219140130.410578-1-steven.price@arm.com Signed-off-by: Sasha Levin commit df493f676fb0b204d4cb4dc4699990f83c295fff Author: Jakub Kicinski Date: Mon Dec 19 16:47:00 2022 -0800 bpf: pull before calling skb_postpull_rcsum() [ Upstream commit 54c3f1a81421f85e60ae2eaae7be3727a09916ee ] Anand hit a BUG() when pulling off headers on egress to a SW tunnel. We get to skb_checksum_help() with an invalid checksum offset (commit d7ea0d9df2a6 ("net: remove two BUG() from skb_checksum_help()") converted those BUGs to WARN_ONs()). He points out oddness in how skb_postpull_rcsum() gets used. Indeed looks like we should pull before "postpull", otherwise the CHECKSUM_PARTIAL fixup from skb_postpull_rcsum() will not be able to do its job: if (skb->ip_summed == CHECKSUM_PARTIAL && skb_checksum_start_offset(skb) < 0) skb->ip_summed = CHECKSUM_NONE; Reported-by: Anand Parthasarathy Fixes: 6578171a7ff0 ("bpf: add bpf_skb_change_proto helper") Signed-off-by: Jakub Kicinski Acked-by: Stanislav Fomichev Link: https://lore.kernel.org/r/20221220004701.402165-1-kuba@kernel.org Signed-off-by: Martin KaFai Lau Signed-off-by: Sasha Levin commit d7e817e689b12238901fe0fa20162bf906f954dd Author: Sasha Levin Date: Sun Jan 8 08:24:19 2023 -0500 btrfs: fix an error handling path in btrfs_defrag_leaves() [ Upstream commit db0a4a7b8e95f9312a59a67cbd5bc589f090e13d ] All error handling paths end to 'out', except this memory allocation failure. This is spurious. So branch to the error handling path also in this case. It will add a call to: memset(&root->defrag_progress, 0, sizeof(root->defrag_progress)); Fixes: 6702ed490ca0 ("Btrfs: Add run time btree defrag, and an ioctl to force btree defrag") Signed-off-by: Christophe JAILLET Reviewed-by: David Sterba Signed-off-by: David Sterba Signed-off-by: Sasha Levin commit 4d69cdba2c27c805d474365f6cbc2a3207f058da Author: minoura makoto Date: Tue Dec 13 13:14:31 2022 +0900 SUNRPC: ensure the matching upcall is in-flight upon downcall [ Upstream commit b18cba09e374637a0a3759d856a6bca94c133952 ] Commit 9130b8dbc6ac ("SUNRPC: allow for upcalls for the same uid but different gss service") introduced `auth` argument to __gss_find_upcall(), but in gss_pipe_downcall() it was left as NULL since it (and auth->service) was not (yet) determined. When multiple upcalls with the same uid and different service are ongoing, it could happen that __gss_find_upcall(), which returns the first match found in the pipe->in_downcall list, could not find the correct gss_msg corresponding to the downcall we are looking for. Moreover, it might return a msg which is not sent to rpc.gssd yet. We could see mount.nfs process hung in D state with multiple mount.nfs are executed in parallel. The call trace below is of CentOS 7.9 kernel-3.10.0-1160.24.1.el7.x86_64 but we observed the same hang w/ elrepo kernel-ml-6.0.7-1.el7. PID: 71258 TASK: ffff91ebd4be0000 CPU: 36 COMMAND: "mount.nfs" #0 [ffff9203ca3234f8] __schedule at ffffffffa3b8899f #1 [ffff9203ca323580] schedule at ffffffffa3b88eb9 #2 [ffff9203ca323590] gss_cred_init at ffffffffc0355818 [auth_rpcgss] #3 [ffff9203ca323658] rpcauth_lookup_credcache at ffffffffc0421ebc [sunrpc] #4 [ffff9203ca3236d8] gss_lookup_cred at ffffffffc0353633 [auth_rpcgss] #5 [ffff9203ca3236e8] rpcauth_lookupcred at ffffffffc0421581 [sunrpc] #6 [ffff9203ca323740] rpcauth_refreshcred at ffffffffc04223d3 [sunrpc] #7 [ffff9203ca3237a0] call_refresh at ffffffffc04103dc [sunrpc] #8 [ffff9203ca3237b8] __rpc_execute at ffffffffc041e1c9 [sunrpc] #9 [ffff9203ca323820] rpc_execute at ffffffffc0420a48 [sunrpc] The scenario is like this. Let's say there are two upcalls for services A and B, A -> B in pipe->in_downcall, B -> A in pipe->pipe. When rpc.gssd reads pipe to get the upcall msg corresponding to service B from pipe->pipe and then writes the response, in gss_pipe_downcall the msg corresponding to service A will be picked because only uid is used to find the msg and it is before the one for B in pipe->in_downcall. And the process waiting for the msg corresponding to service A will be woken up. Actual scheduing of that process might be after rpc.gssd processes the next msg. In rpc_pipe_generic_upcall it clears msg->errno (for A). The process is scheduled to see gss_msg->ctx == NULL and gss_msg->msg.errno == 0, therefore it cannot break the loop in gss_create_upcall and is never woken up after that. This patch adds a simple check to ensure that a msg which is not sent to rpc.gssd yet is not chosen as the matching upcall upon receiving a downcall. Signed-off-by: minoura makoto Signed-off-by: Hiroshi Shimamoto Tested-by: Hiroshi Shimamoto Cc: Trond Myklebust Fixes: 9130b8dbc6ac ("SUNRPC: allow for upcalls for same uid but different gss service") Signed-off-by: Trond Myklebust Signed-off-by: Sasha Levin commit af0265dfeffa42256a4d52b73509f91c7e9ed113 Author: Matthew Auld Date: Mon Dec 6 11:25:38 2021 +0000 drm/i915/migrate: fix length calculation [ Upstream commit 31d70749bfe110593fbe8bf45e7c7788c7d85035 ] No need to insert PTEs for the PTE window itself, also foreach expects a length not an end offset, which could be gigantic here with a second engine. Signed-off-by: Matthew Auld Cc: Thomas Hellström Cc: Ramalingam C Reviewed-by: Ramalingam C Link: https://patchwork.freedesktop.org/patch/msgid/20211206112539.3149779-3-matthew.auld@intel.com Signed-off-by: Sasha Levin commit 8b25a526a5e9b7040e8faaf2e54e9c331e8047fe Author: Matthew Auld Date: Mon Dec 6 11:25:37 2021 +0000 drm/i915/migrate: fix offset calculation [ Upstream commit 08c7c122ad90799cc3ae674e7f29f236f91063ce ] Ensure we add the engine base only after we calculate the qword offset into the PTE window. Signed-off-by: Matthew Auld Cc: Thomas Hellström Cc: Ramalingam C Reviewed-by: Ramalingam C Link: https://patchwork.freedesktop.org/patch/msgid/20211206112539.3149779-2-matthew.auld@intel.com Signed-off-by: Sasha Levin commit a3d1e6f9b67808286f09ee6a905b5a2926182729 Author: Matthew Auld Date: Mon Dec 6 11:25:36 2021 +0000 drm/i915/migrate: don't check the scratch page [ Upstream commit 8eb7fcce34d16f77ac8efa80e8dfecec2503e8c5 ] The scratch page might not be allocated in LMEM(like on DG2), so instead of using that as the deciding factor for where the paging structures live, let's just query the pt before mapping it. Signed-off-by: Matthew Auld Cc: Thomas Hellström Cc: Ramalingam C Reviewed-by: Ramalingam C Link: https://patchwork.freedesktop.org/patch/msgid/20211206112539.3149779-1-matthew.auld@intel.com Signed-off-by: Sasha Levin commit 5bc0b2fda4b47c86278f7c6d30c211f425bf51cf Author: Jan Kara Date: Wed Nov 23 20:39:50 2022 +0100 ext4: fix deadlock due to mbcache entry corruption [ Upstream commit a44e84a9b7764c72896f7241a0ec9ac7e7ef38dd ] When manipulating xattr blocks, we can deadlock infinitely looping inside ext4_xattr_block_set() where we constantly keep finding xattr block for reuse in mbcache but we are unable to reuse it because its reference count is too big. This happens because cache entry for the xattr block is marked as reusable (e_reusable set) although its reference count is too big. When this inconsistency happens, this inconsistent state is kept indefinitely and so ext4_xattr_block_set() keeps retrying indefinitely. The inconsistent state is caused by non-atomic update of e_reusable bit. e_reusable is part of a bitfield and e_reusable update can race with update of e_referenced bit in the same bitfield resulting in loss of one of the updates. Fix the problem by using atomic bitops instead. This bug has been around for many years, but it became *much* easier to hit after commit 65f8b80053a1 ("ext4: fix race when reusing xattr blocks"). Cc: stable@vger.kernel.org Fixes: 6048c64b2609 ("mbcache: add reusable flag to cache entries") Fixes: 65f8b80053a1 ("ext4: fix race when reusing xattr blocks") Reported-and-tested-by: Jeremi Piotrowski Reported-by: Thilo Fromm Link: https://lore.kernel.org/r/c77bf00f-4618-7149-56f1-b8d1664b9d07@linux.microsoft.com/ Signed-off-by: Jan Kara Reviewed-by: Andreas Dilger Link: https://lore.kernel.org/r/20221123193950.16758-1-jack@suse.cz Signed-off-by: Theodore Ts'o Signed-off-by: Sasha Levin commit a6e4094faf3cc192795d19bed2a6e6e9763f93a2 Author: Jan Kara Date: Tue Jul 12 12:54:29 2022 +0200 mbcache: automatically delete entries from cache on freeing [ Upstream commit 307af6c879377c1c63e71cbdd978201f9c7ee8df ] Use the fact that entries with elevated refcount are not removed from the hash and just move removal of the entry from the hash to the entry freeing time. When doing this we also change the generic code to hold one reference to the cache entry, not two of them, which makes code somewhat more obvious. Signed-off-by: Jan Kara Link: https://lore.kernel.org/r/20220712105436.32204-10-jack@suse.cz Signed-off-by: Theodore Ts'o Stable-dep-of: a44e84a9b776 ("ext4: fix deadlock due to mbcache entry corruption") Signed-off-by: Sasha Levin commit 187254912971c9e1154e9b0bd3516ecc05d8dbe5 Author: Baokun Li Date: Wed Nov 9 15:43:43 2022 +0800 ext4: correct inconsistent error msg in nojournal mode [ Upstream commit 89481b5fa8c0640e62ba84c6020cee895f7ac643 ] When we used the journal_async_commit mounting option in nojournal mode, the kernel told me that "can't mount with journal_checksum", was very confusing. I find that when we mount with journal_async_commit, both the JOURNAL_ASYNC_COMMIT and EXPLICIT_JOURNAL_CHECKSUM flags are set. However, in the error branch, CHECKSUM is checked before ASYNC_COMMIT. As a result, the above inconsistency occurs, and the ASYNC_COMMIT branch becomes dead code that cannot be executed. Therefore, we exchange the positions of the two judgments to make the error msg more accurate. Signed-off-by: Baokun Li Reviewed-by: Jan Kara Link: https://lore.kernel.org/r/20221109074343.4184862-1-libaokun1@huawei.com Signed-off-by: Theodore Ts'o Cc: stable@kernel.org Signed-off-by: Sasha Levin commit 761f88f82e0fd2701356bc958933cfd5ac167004 Author: Jason Yan Date: Fri Sep 16 22:15:12 2022 +0800 ext4: goto right label 'failed_mount3a' [ Upstream commit 43bd6f1b49b61f43de4d4e33661b8dbe8c911f14 ] Before these two branches neither loaded the journal nor created the xattr cache. So the right label to goto is 'failed_mount3a'. Although this did not cause any issues because the error handler validated if the pointer is null. However this still made me confused when reading the code. So it's still worth to modify to goto the right label. Signed-off-by: Jason Yan Reviewed-by: Jan Kara Reviewed-by: Ritesh Harjani (IBM) Link: https://lore.kernel.org/r/20220916141527.1012715-2-yanaijie@huawei.com Signed-off-by: Theodore Ts'o Stable-dep-of: 89481b5fa8c0 ("ext4: correct inconsistent error msg in nojournal mode") Signed-off-by: Sasha Levin commit eb16602140f0acec393b7e37021c95665afd6edc Author: Biju Das Date: Wed Dec 14 10:51:18 2022 +0000 ravb: Fix "failed to switch device to config mode" message during unbind [ Upstream commit c72a7e42592b2e18d862cf120876070947000d7a ] This patch fixes the error "ravb 11c20000.ethernet eth0: failed to switch device to config mode" during unbind. We are doing register access after pm_runtime_put_sync(). We usually do cleanup in reverse order of init. Currently in remove(), the "pm_runtime_put_sync" is not in reverse order. Probe reset_control_deassert(rstc); pm_runtime_enable(&pdev->dev); pm_runtime_get_sync(&pdev->dev); remove pm_runtime_put_sync(&pdev->dev); unregister_netdev(ndev); .. ravb_mdio_release(priv); pm_runtime_disable(&pdev->dev); Consider the call to unregister_netdev() unregister_netdev->unregister_netdevice_queue->rollback_registered_many that calls the below functions which access the registers after pm_runtime_put_sync() 1) ravb_get_stats 2) ravb_close Fixes: c156633f1353 ("Renesas Ethernet AVB driver proper") Cc: stable@vger.kernel.org Signed-off-by: Biju Das Reviewed-by: Leon Romanovsky Link: https://lore.kernel.org/r/20221214105118.2495313-1-biju.das.jz@bp.renesas.com Signed-off-by: Paolo Abeni Signed-off-by: Sasha Levin commit 4216995dbd934bfcb69fef94852f7a63ac92a2dc Author: Masami Hiramatsu (Google) Date: Sat Nov 5 12:01:14 2022 +0900 perf probe: Fix to get the DW_AT_decl_file and DW_AT_call_file as unsinged data [ Upstream commit a9dfc46c67b52ad43b8e335e28f4cf8002c67793 ] DWARF version 5 standard Sec 2.14 says that Any debugging information entry representing the declaration of an object, module, subprogram or type may have DW_AT_decl_file, DW_AT_decl_line and DW_AT_decl_column attributes, each of whose value is an unsigned integer constant. So it should be an unsigned integer data. Also, even though the standard doesn't clearly say the DW_AT_call_file is signed or unsigned, the elfutils (eu-readelf) interprets it as unsigned integer data and it is natural to handle it as unsigned integer data as same as DW_AT_decl_file. This changes the DW_AT_call_file as unsigned integer data too. Fixes: 3f4460a28fb2f73d ("perf probe: Filter out redundant inline-instances") Signed-off-by: Masami Hiramatsu Acked-by: Namhyung Kim Cc: Alexander Shishkin Cc: Ingo Molnar Cc: Jiri Olsa Cc: Mark Rutland Cc: Masami Hiramatsu Cc: Peter Zijlstra Cc: stable@vger.kernel.org Cc: Steven Rostedt (VMware) Link: https://lore.kernel.org/r/166761727445.480106.3738447577082071942.stgit@devnote3 Signed-off-by: Arnaldo Carvalho de Melo Signed-off-by: Sasha Levin commit d8bbbf2b52b2e1bf68b2fc137d2e87492e461bf2 Author: Masami Hiramatsu (Google) Date: Tue Nov 1 22:48:39 2022 +0900 perf probe: Use dwarf_attr_integrate as generic DWARF attr accessor [ Upstream commit f828929ab7f0dc3353e4a617f94f297fa8f3dec3 ] Use dwarf_attr_integrate() instead of dwarf_attr() for generic attribute acccessor functions, so that it can find the specified attribute from abstact origin DIE etc. Signed-off-by: Masami Hiramatsu Acked-by: Namhyung Kim Cc: Alexander Shishkin Cc: Ingo Molnar Cc: Jiri Olsa Cc: Mark Rutland Cc: Peter Zijlstra Cc: Steven Rostedt (VMware) Link: https://lore.kernel.org/r/166731051988.2100653.13595339994343449770.stgit@devnote3 Signed-off-by: Arnaldo Carvalho de Melo Stable-dep-of: a9dfc46c67b5 ("perf probe: Fix to get the DW_AT_decl_file and DW_AT_call_file as unsinged data") Signed-off-by: Sasha Levin commit b131b5f1361e686a0f11e7b3c3ea4c332fe6d277 Author: Smitha T Murthy Date: Wed Sep 7 16:02:25 2022 +0530 media: s5p-mfc: Fix in register read and write for H264 [ Upstream commit 06710cd5d2436135046898d7e4b9408c8bb99446 ] Few of the H264 encoder registers written were not getting reflected since the read values were not stored and getting overwritten. Fixes: 6a9c6f681257 ("[media] s5p-mfc: Add variants to access mfc registers") Cc: stable@vger.kernel.org Cc: linux-fsd@tesla.com Signed-off-by: Smitha T Murthy Signed-off-by: Hans Verkuil Signed-off-by: Sasha Levin commit ff27800c0a6d81571671b33f696109804d015409 Author: Smitha T Murthy Date: Wed Sep 7 16:02:26 2022 +0530 media: s5p-mfc: Clear workbit to handle error condition [ Upstream commit d3f3c2fe54e30b0636496d842ffbb5ad3a547f9b ] During error on CLOSE_INSTANCE command, ctx_work_bits was not getting cleared. During consequent mfc execution NULL pointer dereferencing of this context led to kernel panic. This patch fixes this issue by making sure to clear ctx_work_bits always. Fixes: 818cd91ab8c6 ("[media] s5p-mfc: Extract open/close MFC instance commands") Cc: stable@vger.kernel.org Cc: linux-fsd@tesla.com Signed-off-by: Smitha T Murthy Signed-off-by: Hans Verkuil Signed-off-by: Sasha Levin commit 4653ba32adcd6bfdd04465c3c8d409a292be523c Author: Smitha T Murthy Date: Wed Sep 7 16:02:27 2022 +0530 media: s5p-mfc: Fix to handle reference queue during finishing [ Upstream commit d8a46bc4e1e0446459daa77c4ce14218d32dacf9 ] On receiving last buffer driver puts MFC to MFCINST_FINISHING state which in turn skips transferring of frame from SRC to REF queue. This causes driver to stop MFC encoding and last frame is lost. This patch guarantees safe handling of frames during MFCINST_FINISHING and correct clearing of workbit to avoid early stopping of encoding. Fixes: af9357467810 ("[media] MFC: Add MFC 5.1 V4L2 driver") Cc: stable@vger.kernel.org Cc: linux-fsd@tesla.com Signed-off-by: Smitha T Murthy Signed-off-by: Hans Verkuil Signed-off-by: Sasha Levin commit 1bd7283dc0bee2067398a8f1a4847449cb9dfbee Author: Yazen Ghannam Date: Tue Jun 21 15:59:43 2022 +0000 x86/MCE/AMD: Clear DFR errors found in THR handler [ Upstream commit bc1b705b0eee4c645ad8b3bbff3c8a66e9688362 ] AMD's MCA Thresholding feature counts errors of all severity levels, not just correctable errors. If a deferred error causes the threshold limit to be reached (it was the error that caused the overflow), then both a deferred error interrupt and a thresholding interrupt will be triggered. The order of the interrupts is not guaranteed. If the threshold interrupt handler is executed first, then it will clear MCA_STATUS for the error. It will not check or clear MCA_DESTAT which also holds a copy of the deferred error. When the deferred error interrupt handler runs it will not find an error in MCA_STATUS, but it will find the error in MCA_DESTAT. This will cause two errors to be logged. Check for deferred errors when handling a threshold interrupt. If a bank contains a deferred error, then clear the bank's MCA_DESTAT register. Define a new helper function to do the deferred error check and clearing of MCA_DESTAT. [ bp: Simplify, convert comment to passive voice. ] Fixes: 37d43acfd79f ("x86/mce/AMD: Redo error logging from APIC LVT interrupt handlers") Signed-off-by: Yazen Ghannam Signed-off-by: Borislav Petkov Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20220621155943.33623-1-yazen.ghannam@amd.com Signed-off-by: Sasha Levin commit 5ddcd349d9af5e3c4298340e1b6a21dab478bf3c Author: Borislav Petkov Date: Thu Sep 2 13:33:22 2021 +0200 x86/mce: Get rid of msr_ops [ Upstream commit 8121b8f947be0033f567619be204639a50cad298 ] Avoid having indirect calls and use a normal function which returns the proper MSR address based on ->smca setting. No functional changes. Signed-off-by: Borislav Petkov Reviewed-by: Tony Luck Link: https://lkml.kernel.org/r/20210922165101.18951-4-bp@alien8.de Stable-dep-of: bc1b705b0eee ("x86/MCE/AMD: Clear DFR errors found in THR handler") Signed-off-by: Sasha Levin commit b8e7ed42bc3ca0d0e4191ee394d34962d3624c22 Author: void0red Date: Wed Nov 23 22:39:45 2022 +0800 btrfs: fix extent map use-after-free when handling missing device in read_one_chunk [ Upstream commit 1742e1c90c3da344f3bb9b1f1309b3f47482756a ] Store the error code before freeing the extent_map. Though it's reference counted structure, in that function it's the first and last allocation so this would lead to a potential use-after-free. The error can happen eg. when chunk is stored on a missing device and the degraded mount option is missing. Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=216721 Reported-by: eriri <1527030098@qq.com> Fixes: adfb69af7d8c ("btrfs: add_missing_dev() should return the actual error") CC: stable@vger.kernel.org # 4.9+ Signed-off-by: void0red Reviewed-by: David Sterba Signed-off-by: David Sterba Signed-off-by: Sasha Levin commit 9c3beebd21f33f6e5bb48e5d0c3c6d3f0074847e Author: Nikolay Borisov Date: Tue Jan 11 18:00:26 2022 +0200 btrfs: move missing device handling in a dedicate function [ Upstream commit ff37c89f94be14b0e22a532d1e6d57187bfd5bb8 ] This simplifies the code flow in read_one_chunk and makes error handling when handling missing devices a bit simpler by reducing it to a single check if something went wrong. No functional changes. Reviewed-by: Su Yue Signed-off-by: Nikolay Borisov Reviewed-by: David Sterba Signed-off-by: David Sterba Stable-dep-of: 1742e1c90c3d ("btrfs: fix extent map use-after-free when handling missing device in read_one_chunk") Signed-off-by: Sasha Levin commit 7528b21cebe0a470a39dc49c2f37f4fed64e108b Author: Sasha Levin Date: Wed Jan 4 11:14:45 2023 -0500 btrfs: replace strncpy() with strscpy() [ Upstream commit 63d5429f68a3d4c4aa27e65a05196c17f86c41d6 ] Using strncpy() on NUL-terminated strings are deprecated. To avoid possible forming of non-terminated string strscpy() should be used. Found by Linux Verification Center (linuxtesting.org) with SVACE. CC: stable@vger.kernel.org # 4.9+ Signed-off-by: Artem Chernyshev Reviewed-by: David Sterba Signed-off-by: David Sterba Signed-off-by: Sasha Levin commit 4cef44525f4fe5c993108d11785a4a1c21d511f2 Author: Sasha Levin Date: Wed Jan 4 11:35:11 2023 -0500 phy: qcom-qmp-combo: fix out-of-bounds clock access [ Upstream commit d8a5b59c5fc75c99ba17e3eb1a8f580d8d172b28 ] The SM8250 only uses three clocks but the DP configuration erroneously described four clocks. In case the DP part of the PHY is initialised before the USB part, this would lead to uninitialised memory beyond the bulk-clocks array to be treated as a clock pointer as the clocks are requested based on the USB configuration. Fixes: aff188feb5e1 ("phy: qcom-qmp: add support for sm8250-usb3-dp phy") Cc: stable@vger.kernel.org # 5.13 Reviewed-by: Dmitry Baryshkov Signed-off-by: Johan Hovold Link: https://lore.kernel.org/r/20221114081346.5116-2-johan+linaro@kernel.org Signed-off-by: Vinod Koul Signed-off-by: Sasha Levin commit 855edc4ec64c197427e492c029dbf3230cc5f94c Author: Jens Axboe Date: Wed Jan 4 07:48:37 2023 -0700 ARM: renumber bits related to _TIF_WORK_MASK commit 191f8453fc99a537ea78b727acea739782378b0d upstream. We want to ensure that the mask related to calling do_work_pending() is within the first 16 bits. Move bits unrelated to that outside of that range, to avoid spuriously calling do_work_pending() when we don't need to. Cc: stable@vger.kernel.org Fixes: 32d59773da38 ("arm: add support for TIF_NOTIFY_SIGNAL") Reported-and-tested-by: Hui Tang Suggested-by: Russell King (Oracle) Link: https://lore.kernel.org/lkml/7ecb8f3c-2aeb-a905-0d4a-aa768b9649b5@huawei.com/ Signed-off-by: Jens Axboe Signed-off-by: Greg Kroah-Hartman commit 18f28f13301d1afb8cea9c4ddcecdbff14488ec6 Author: Eric Biggers Date: Wed Jan 4 23:13:59 2023 -0800 ext4: fix off-by-one errors in fast-commit block filling From: Eric Biggers commit 48a6a66db82b8043d298a630f22c62d43550cae5 upstream. Due to several different off-by-one errors, or perhaps due to a late change in design that wasn't fully reflected in the code that was actually merged, there are several very strange constraints on how fast-commit blocks are filled with tlv entries: - tlvs must start at least 10 bytes before the end of the block, even though the minimum tlv length is 8. Otherwise, the replay code will ignore them. (BUG: ext4_fc_reserve_space() could violate this requirement if called with a len of blocksize - 9 or blocksize - 8. Fortunately, this doesn't seem to happen currently.) - tlvs must end at least 1 byte before the end of the block. Otherwise the replay code will consider them to be invalid. This quirk contributed to a bug (fixed by an earlier commit) where uninitialized memory was being leaked to disk in the last byte of blocks. Also, strangely these constraints don't apply to the replay code in e2fsprogs, which will accept any tlvs in the blocks (with no bounds checks at all, but that is a separate issue...). Given that this all seems to be a bug, let's fix it by just filling blocks with tlv entries in the natural way. Note that old kernels will be unable to replay fast-commit journals created by kernels that have this commit. Fixes: aa75f4d3daae ("ext4: main fast-commit commit path") Cc: # v5.10+ Signed-off-by: Eric Biggers Link: https://lore.kernel.org/r/20221106224841.279231-7-ebiggers@kernel.org Signed-off-by: Theodore Ts'o Signed-off-by: Greg Kroah-Hartman commit b205332b6b87dacec803925342a1e662be1053e7 Author: Eric Biggers Date: Wed Jan 4 23:13:58 2023 -0800 ext4: fix unaligned memory access in ext4_fc_reserve_space() From: Eric Biggers commit 8415ce07ecf0cc25efdd5db264a7133716e503cf upstream. As is done elsewhere in the file, build the struct ext4_fc_tl on the stack and memcpy() it into the buffer, rather than directly writing it to a potentially-unaligned location in the buffer. Fixes: aa75f4d3daae ("ext4: main fast-commit commit path") Cc: # v5.10+ Signed-off-by: Eric Biggers Link: https://lore.kernel.org/r/20221106224841.279231-6-ebiggers@kernel.org Signed-off-by: Theodore Ts'o Signed-off-by: Greg Kroah-Hartman commit 9c197dcbacc4662aea893b2440c60908e5228e79 Author: Eric Biggers Date: Wed Jan 4 23:13:57 2023 -0800 ext4: add missing validation of fast-commit record lengths From: Eric Biggers commit 64b4a25c3de81a69724e888ec2db3533b43816e2 upstream. Validate the inode and filename lengths in fast-commit journal records so that a malicious fast-commit journal cannot cause a crash by having invalid values for these. Also validate EXT4_FC_TAG_DEL_RANGE. Fixes: aa75f4d3daae ("ext4: main fast-commit commit path") Cc: # v5.10+ Signed-off-by: Eric Biggers Link: https://lore.kernel.org/r/20221106224841.279231-5-ebiggers@kernel.org Signed-off-by: Theodore Ts'o Signed-off-by: Greg Kroah-Hartman commit 6220ec405571ded17efedc56587190b542adf246 Author: Eric Biggers Date: Wed Jan 4 23:13:56 2023 -0800 ext4: don't set up encryption key during jbd2 transaction From: Eric Biggers commit 4c0d5778385cb3618ff26a561ce41de2b7d9de70 upstream. Commit a80f7fcf1867 ("ext4: fixup ext4_fc_track_* functions' signature") extended the scope of the transaction in ext4_unlink() too far, making it include the call to ext4_find_entry(). However, ext4_find_entry() can deadlock when called from within a transaction because it may need to set up the directory's encryption key. Fix this by restoring the transaction to its original scope. Reported-by: syzbot+1a748d0007eeac3ab079@syzkaller.appspotmail.com Fixes: a80f7fcf1867 ("ext4: fixup ext4_fc_track_* functions' signature") Cc: # v5.10+ Signed-off-by: Eric Biggers Link: https://lore.kernel.org/r/20221106224841.279231-3-ebiggers@kernel.org Signed-off-by: Theodore Ts'o Signed-off-by: Greg Kroah-Hartman commit 6482d42baff5626299e90bc1d5ca28c08ddd1e0f Author: Eric Biggers Date: Wed Jan 4 23:13:55 2023 -0800 ext4: disable fast-commit of encrypted dir operations From: Eric Biggers commit 0fbcb5251fc81b58969b272c4fb7374a7b922e3e upstream. fast-commit of create, link, and unlink operations in encrypted directories is completely broken because the unencrypted filenames are being written to the fast-commit journal instead of the encrypted filenames. These operations can't be replayed, as encryption keys aren't present at journal replay time. It is also an information leak. Until if/when we can get this working properly, make encrypted directory operations ineligible for fast-commit. Note that fast-commit operations on encrypted regular files continue to be allowed, as they seem to work. Fixes: aa75f4d3daae ("ext4: main fast-commit commit path") Cc: # v5.10+ Signed-off-by: Eric Biggers Link: https://lore.kernel.org/r/20221106224841.279231-2-ebiggers@kernel.org Signed-off-by: Theodore Ts'o Signed-off-by: Greg Kroah-Hartman commit 6969367c1500c15eddc38fda12f6d15518ad6d03 Author: Eric Biggers Date: Wed Jan 4 23:13:54 2023 -0800 ext4: fix potential out of bound read in ext4_fc_replay_scan() From: Ye Bin commit 1b45cc5c7b920fd8bf72e5a888ec7abeadf41e09 upstream. For scan loop must ensure that at least EXT4_FC_TAG_BASE_LEN space. If remain space less than EXT4_FC_TAG_BASE_LEN which will lead to out of bound read when mounting corrupt file system image. ADD_RANGE/HEAD/TAIL is needed to add extra check when do journal scan, as this three tags will read data during scan, tag length couldn't less than data length which will read. Cc: stable@kernel.org Signed-off-by: Ye Bin Link: https://lore.kernel.org/r/20220924075233.2315259-4-yebin10@huawei.com Signed-off-by: Theodore Ts'o Signed-off-by: Eric Biggers Signed-off-by: Greg Kroah-Hartman commit 818175ae3bd2ca1af5a3950b4c062898e2e92a1f Author: Eric Biggers Date: Wed Jan 4 23:13:53 2023 -0800 ext4: factor out ext4_fc_get_tl() From: Ye Bin commit dcc5827484d6e53ccda12334f8bbfafcc593ceda upstream. Factor out ext4_fc_get_tl() to fill 'tl' with host byte order. Signed-off-by: Ye Bin Link: https://lore.kernel.org/r/20220924075233.2315259-3-yebin10@huawei.com Signed-off-by: Theodore Ts'o Signed-off-by: Eric Biggers Signed-off-by: Greg Kroah-Hartman commit ffd84d0bc5dcc8f13739ba678bcbd7ac25e14a0c Author: Eric Biggers Date: Wed Jan 4 23:13:52 2023 -0800 ext4: introduce EXT4_FC_TAG_BASE_LEN helper From: Ye Bin commit fdc2a3c75dd8345c5b48718af90bad1a7811bedb upstream. Introduce EXT4_FC_TAG_BASE_LEN helper for calculate length of struct ext4_fc_tl. Signed-off-by: Ye Bin Link: https://lore.kernel.org/r/20220924075233.2315259-2-yebin10@huawei.com Signed-off-by: Theodore Ts'o Signed-off-by: Eric Biggers Signed-off-by: Greg Kroah-Hartman commit 37914e029bec3cc87c52cdf0384a91ab19cd8b0e Author: Eric Biggers Date: Wed Jan 4 23:13:51 2023 -0800 ext4: use ext4_debug() instead of jbd_debug() From: Jan Kara commit 4978c659e7b5c1926cdb4b556e4ca1fd2de8ad42 upstream. We use jbd_debug() in some places in ext4. It seems a bit strange to use jbd2 debugging output function for ext4 code. Also these days ext4_debug() uses dynamic printk so each debug message can be enabled / disabled on its own so the time when it made some sense to have these combined (to allow easier common selecting of messages to report) has passed. Just convert all jbd_debug() uses in ext4 to ext4_debug(). Signed-off-by: Jan Kara Reviewed-by: Lukas Czerner Link: https://lore.kernel.org/r/20220608112355.4397-1-jack@suse.cz Signed-off-by: Theodore Ts'o Signed-off-by: Eric Biggers Signed-off-by: Greg Kroah-Hartman commit b0ed9a032e52a175683d18e2e2e8eec0f9ba1ff9 Author: Eric Biggers Date: Wed Jan 4 23:13:50 2023 -0800 ext4: remove unused enum EXT4_FC_COMMIT_FAILED From: Ritesh Harjani commit c864ccd182d6ff2730a0f5b636c6b7c48f6f4f7f upstream. Below commit removed all references of EXT4_FC_COMMIT_FAILED. commit 0915e464cb274 ("ext4: simplify updating of fast commit stats") Just remove it since it is not used anymore. Signed-off-by: Ritesh Harjani Reviewed-by: Jan Kara Reviewed-by: Harshad Shirwadkar Link: https://lore.kernel.org/r/c941357e476be07a1138c7319ca5faab7fb80fc6.1647057583.git.riteshh@linux.ibm.com Signed-off-by: Theodore Ts'o Signed-off-by: Eric Biggers Signed-off-by: Greg Kroah-Hartman commit 394514ddf90e48ccdedaef4b7085262c9dd0ed0d Author: Zheng Yejian Date: Wed Dec 7 17:15:57 2022 +0800 tracing: Fix issue of missing one synthetic field commit ff4837f7fe59ff018eca4705a70eca5e0b486b97 upstream. The maximum number of synthetic fields supported is defined as SYNTH_FIELDS_MAX which value currently is 64, but it actually fails when try to generate a synthetic event with 64 fields by executing like: # echo "my_synth_event int v1; int v2; int v3; int v4; int v5; int v6;\ int v7; int v8; int v9; int v10; int v11; int v12; int v13; int v14;\ int v15; int v16; int v17; int v18; int v19; int v20; int v21; int v22;\ int v23; int v24; int v25; int v26; int v27; int v28; int v29; int v30;\ int v31; int v32; int v33; int v34; int v35; int v36; int v37; int v38;\ int v39; int v40; int v41; int v42; int v43; int v44; int v45; int v46;\ int v47; int v48; int v49; int v50; int v51; int v52; int v53; int v54;\ int v55; int v56; int v57; int v58; int v59; int v60; int v61; int v62;\ int v63; int v64" >> /sys/kernel/tracing/synthetic_events Correct the field counting to fix it. Link: https://lore.kernel.org/linux-trace-kernel/20221207091557.3137904-1-zhengyejian1@huawei.com Cc: Cc: Cc: stable@vger.kernel.org Fixes: c9e759b1e845 ("tracing: Rework synthetic event command parsing") Signed-off-by: Zheng Yejian Signed-off-by: Steven Rostedt (Google) [Fix conflict due to lack of c24be24aed405d64ebcf04526614c13b2adfb1d2] Signed-off-by: Zheng Yejian Signed-off-by: Greg Kroah-Hartman commit 5234dd5d205b3b7ff4f636f705ec0b162749a0e0 Author: Damien Le Moal Date: Thu Nov 24 11:12:07 2022 +0900 block: mq-deadline: Fix dd_finish_request() for zoned devices commit 2820e5d0820ac4daedff1272616a53d9c7682fd2 upstream. dd_finish_request() tests if the per prio fifo_list is not empty to determine if request dispatching must be restarted for handling blocked write requests to zoned devices with a call to blk_mq_sched_mark_restart_hctx(). While simple, this implementation has 2 problems: 1) Only the priority level of the completed request is considered. However, writes to a zone may be blocked due to other writes to the same zone using a different priority level. While this is unlikely to happen in practice, as writing a zone with different IO priorirites does not make sense, nothing in the code prevents this from happening. 2) The use of list_empty() is dangerous as dd_finish_request() does not take dd->lock and may run concurrently with the insert and dispatch code. Fix these 2 problems by testing the write fifo list of all priority levels using the new helper dd_has_write_work(), and by testing each fifo list using list_empty_careful(). Fixes: c807ab520fc3 ("block/mq-deadline: Add I/O priority support") Cc: Signed-off-by: Damien Le Moal Reviewed-by: Johannes Thumshirn Link: https://lore.kernel.org/r/20221124021208.242541-2-damien.lemoal@opensource.wdc.com Signed-off-by: Jens Axboe Signed-off-by: Greg Kroah-Hartman commit 78623b10fc9f8231802536538c85527dc54640a0 Author: Alex Deucher Date: Wed Dec 7 11:08:53 2022 -0500 drm/amdgpu: make display pinning more flexible (v2) commit 81d0bcf9900932633d270d5bc4a54ff599c6ebdb upstream. Only apply the static threshold for Stoney and Carrizo. This hardware has certain requirements that don't allow mixing of GTT and VRAM. Newer asics do not have these requirements so we should be able to be more flexible with where buffers end up. Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/2270 Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/2291 Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/2255 Acked-by: Luben Tuikov Reviewed-by: Christian König Signed-off-by: Alex Deucher Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman commit 6363da2c854a826ff812f62a79ba6a78a5338cac Author: Alex Deucher Date: Mon Nov 21 15:52:19 2022 -0500 drm/amdgpu: handle polaris10/11 overlap asics (v2) commit 1d4624cd72b912b2680c08d0be48338a1629a858 upstream. Some special polaris 10 chips overlap with the polaris11 DID range. Handle this properly in the driver. v2: use local flags for other function calls. Acked-by: Luben Tuikov Signed-off-by: Alex Deucher Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman commit 2771c7a0eedc43e86cf69c355dc7f044a738bc69 Author: Ye Bin Date: Thu Dec 8 10:32:31 2022 +0800 ext4: allocate extended attribute value in vmalloc area commit cc12a6f25e07ed05d5825a1664b67a970842b2ca upstream. Now, extended attribute value maximum length is 64K. The memory requested here does not need continuous physical addresses, so it is appropriate to use kvmalloc to request memory. At the same time, it can also cope with the situation that the extended attribute will become longer in the future. Signed-off-by: Ye Bin Reviewed-by: Jan Kara Link: https://lore.kernel.org/r/20221208023233.1231330-3-yebin@huaweicloud.com Signed-off-by: Theodore Ts'o Cc: stable@kernel.org Signed-off-by: Greg Kroah-Hartman commit e995ff918e660cd8bc8e763f3e47a641996d35a7 Author: Jan Kara Date: Wed Dec 7 12:59:28 2022 +0100 ext4: avoid unaccounted block allocation when expanding inode commit 8994d11395f8165b3deca1971946f549f0822630 upstream. When expanding inode space in ext4_expand_extra_isize_ea() we may need to allocate external xattr block. If quota is not initialized for the inode, the block allocation will not be accounted into quota usage. Make sure the quota is initialized before we try to expand inode space. Reported-by: Pengfei Xu Link: https://lore.kernel.org/all/Y5BT+k6xWqthZc1P@xpf.sh.intel.com Signed-off-by: Jan Kara Cc: stable@kernel.org Link: https://lore.kernel.org/r/20221207115937.26601-2-jack@suse.cz Signed-off-by: Theodore Ts'o Signed-off-by: Greg Kroah-Hartman commit 877247222a0c5a67ac41a1d04efe41471c47ce67 Author: Jan Kara Date: Wed Dec 7 12:59:27 2022 +0100 ext4: initialize quota before expanding inode in setproject ioctl commit 1485f726c6dec1a1f85438f2962feaa3d585526f upstream. Make sure we initialize quotas before possibly expanding inode space (and thus maybe needing to allocate external xattr block) in ext4_ioctl_setproject(). This prevents not accounting the necessary block allocation. Signed-off-by: Jan Kara Cc: stable@kernel.org Link: https://lore.kernel.org/r/20221207115937.26601-1-jack@suse.cz Signed-off-by: Theodore Ts'o Signed-off-by: Greg Kroah-Hartman commit 322cf639b0b7f137543072c55545adab782b3a25 Author: Ye Bin Date: Thu Dec 8 10:32:33 2022 +0800 ext4: fix inode leak in ext4_xattr_inode_create() on an error path commit e4db04f7d3dbbe16680e0ded27ea2a65b10f766a upstream. There is issue as follows when do setxattr with inject fault: [localhost]# fsck.ext4 -fn /dev/sda e2fsck 1.46.6-rc1 (12-Sep-2022) Pass 1: Checking inodes, blocks, and sizes Pass 2: Checking directory structure Pass 3: Checking directory connectivity Pass 4: Checking reference counts Unattached zero-length inode 15. Clear? no Unattached inode 15 Connect to /lost+found? no Pass 5: Checking group summary information /dev/sda: ********** WARNING: Filesystem still has errors ********** /dev/sda: 15/655360 files (0.0% non-contiguous), 66755/2621440 blocks This occurs in 'ext4_xattr_inode_create()'. If 'ext4_mark_inode_dirty()' fails, dropping i_nlink of the inode is needed. Or will lead to inode leak. Signed-off-by: Ye Bin Reviewed-by: Jan Kara Link: https://lore.kernel.org/r/20221208023233.1231330-5-yebin@huaweicloud.com Signed-off-by: Theodore Ts'o Cc: stable@kernel.org Signed-off-by: Greg Kroah-Hartman commit 6380a93b57dbdea6df19e550491de4dd78ff4c63 Author: Ye Bin Date: Tue Dec 6 22:41:34 2022 +0800 ext4: fix kernel BUG in 'ext4_write_inline_data_end()' commit 5c099c4fdc438014d5893629e70a8ba934433ee8 upstream. Syzbot report follow issue: ------------[ cut here ]------------ kernel BUG at fs/ext4/inline.c:227! invalid opcode: 0000 [#1] PREEMPT SMP KASAN CPU: 1 PID: 3629 Comm: syz-executor212 Not tainted 6.1.0-rc5-syzkaller-00018-g59d0d52c30d4 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/26/2022 RIP: 0010:ext4_write_inline_data+0x344/0x3e0 fs/ext4/inline.c:227 RSP: 0018:ffffc90003b3f368 EFLAGS: 00010293 RAX: 0000000000000000 RBX: ffff8880704e16c0 RCX: 0000000000000000 RDX: ffff888021763a80 RSI: ffffffff821e31a4 RDI: 0000000000000006 RBP: 000000000006818e R08: 0000000000000006 R09: 0000000000068199 R10: 0000000000000079 R11: 0000000000000000 R12: 000000000000000b R13: 0000000000068199 R14: ffffc90003b3f408 R15: ffff8880704e1c82 FS: 000055555723e3c0(0000) GS:ffff8880b9b00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007fffe8ac9080 CR3: 0000000079f81000 CR4: 0000000000350ee0 Call Trace: ext4_write_inline_data_end+0x2a3/0x12f0 fs/ext4/inline.c:768 ext4_write_end+0x242/0xdd0 fs/ext4/inode.c:1313 ext4_da_write_end+0x3ed/0xa30 fs/ext4/inode.c:3063 generic_perform_write+0x316/0x570 mm/filemap.c:3764 ext4_buffered_write_iter+0x15b/0x460 fs/ext4/file.c:285 ext4_file_write_iter+0x8bc/0x16e0 fs/ext4/file.c:700 call_write_iter include/linux/fs.h:2191 [inline] do_iter_readv_writev+0x20b/0x3b0 fs/read_write.c:735 do_iter_write+0x182/0x700 fs/read_write.c:861 vfs_iter_write+0x74/0xa0 fs/read_write.c:902 iter_file_splice_write+0x745/0xc90 fs/splice.c:686 do_splice_from fs/splice.c:764 [inline] direct_splice_actor+0x114/0x180 fs/splice.c:931 splice_direct_to_actor+0x335/0x8a0 fs/splice.c:886 do_splice_direct+0x1ab/0x280 fs/splice.c:974 do_sendfile+0xb19/0x1270 fs/read_write.c:1255 __do_sys_sendfile64 fs/read_write.c:1323 [inline] __se_sys_sendfile64 fs/read_write.c:1309 [inline] __x64_sys_sendfile64+0x1d0/0x210 fs/read_write.c:1309 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x39/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd ---[ end trace 0000000000000000 ]--- Above issue may happens as follows: ext4_da_write_begin ext4_da_write_inline_data_begin ext4_da_convert_inline_data_to_extent ext4_clear_inode_state(inode, EXT4_STATE_MAY_INLINE_DATA); ext4_da_write_end ext4_run_li_request ext4_mb_prefetch ext4_read_block_bitmap_nowait ext4_validate_block_bitmap ext4_mark_group_bitmap_corrupted(sb, block_group, EXT4_GROUP_INFO_BBITMAP_CORRUPT) percpu_counter_sub(&sbi->s_freeclusters_counter,grp->bb_free); -> sbi->s_freeclusters_counter become zero ext4_da_write_begin if (ext4_nonda_switch(inode->i_sb)) -> As freeclusters_counter is zero will return true *fsdata = (void *)FALL_BACK_TO_NONDELALLOC; ext4_write_begin ext4_da_write_end if (write_mode == FALL_BACK_TO_NONDELALLOC) ext4_write_end if (inline_data) ext4_write_inline_data_end ext4_write_inline_data BUG_ON(pos + len > EXT4_I(inode)->i_inline_size); -> As inode is already convert to extent, so 'pos + len' > inline_size -> then trigger BUG. To solve this issue, instead of checking ext4_has_inline_data() which is only cleared after data has been written back, check the EXT4_STATE_MAY_INLINE_DATA flag in ext4_write_end(). Fixes: f19d5870cbf7 ("ext4: add normal write support for inline data") Reported-by: syzbot+4faa160fa96bfba639f8@syzkaller.appspotmail.com Reported-by: Jun Nie Signed-off-by: Ye Bin Link: https://lore.kernel.org/r/20221206144134.1919987-1-yebin@huaweicloud.com Signed-off-by: Theodore Ts'o Cc: stable@kernel.org Signed-off-by: Greg Kroah-Hartman commit dc3bbc9753f44970e1b7066a0d84559851d2f83a Author: Jan Kara Date: Mon Nov 21 14:09:29 2022 +0100 ext4: avoid BUG_ON when creating xattrs commit b40ebaf63851b3a401b0dc9263843538f64f5ce6 upstream. Commit fb0a387dcdcd ("ext4: limit block allocations for indirect-block files to < 2^32") added code to try to allocate xattr block with 32-bit block number for indirect block based files on the grounds that these files cannot use larger block numbers. It also added BUG_ON when allocated block could not fit into 32 bits. This is however bogus reasoning because xattr block is stored in inode->i_file_acl and inode->i_file_acl_hi and as such even indirect block based files can happily use full 48 bits for xattr block number. The proper handling seems to be there basically since 64-bit block number support was added. So remove the bogus limitation and BUG_ON. Cc: Eric Sandeen Fixes: fb0a387dcdcd ("ext4: limit block allocations for indirect-block files to < 2^32") Signed-off-by: Jan Kara Link: https://lore.kernel.org/r/20221121130929.32031-1-jack@suse.cz Signed-off-by: Theodore Ts'o Cc: stable@kernel.org Signed-off-by: Greg Kroah-Hartman commit 844c405552976bd739ba7677f2862a9f7f63f3ba Author: Luís Henriques Date: Wed Nov 9 18:14:45 2022 +0000 ext4: fix error code return to user-space in ext4_get_branch() commit 26d75a16af285a70863ba6a81f85d81e7e65da50 upstream. If a block is out of range in ext4_get_branch(), -ENOMEM will be returned to user-space. Obviously, this error code isn't really useful. This patch fixes it by making sure the right error code (-EFSCORRUPTED) is propagated to user-space. EUCLEAN is more informative than ENOMEM. Signed-off-by: Luís Henriques Link: https://lore.kernel.org/r/20221109181445.17843-1-lhenriques@suse.de Signed-off-by: Theodore Ts'o Cc: stable@kernel.org Signed-off-by: Greg Kroah-Hartman commit b870b28e29f604c2f996f843392d8c8201ea6e93 Author: Baokun Li Date: Thu Nov 17 12:03:41 2022 +0800 ext4: fix corruption when online resizing a 1K bigalloc fs commit 0aeaa2559d6d53358fca3e3fce73807367adca74 upstream. When a backup superblock is updated in update_backups(), the primary superblock's offset in the group (that is, sbi->s_sbh->b_blocknr) is used as the backup superblock's offset in its group. However, when the block size is 1K and bigalloc is enabled, the two offsets are not equal. This causes the backup group descriptors to be overwritten by the superblock in update_backups(). Moreover, if meta_bg is enabled, the file system will be corrupted because this feature uses backup group descriptors. To solve this issue, we use a more accurate ext4_group_first_block_no() as the offset of the backup superblock in its group. Fixes: d77147ff443b ("ext4: add support for online resizing with bigalloc") Signed-off-by: Baokun Li Reviewed-by: Jan Kara Cc: stable@kernel.org Link: https://lore.kernel.org/r/20221117040341.1380702-4-libaokun1@huawei.com Signed-off-by: Theodore Ts'o Signed-off-by: Greg Kroah-Hartman commit d440d6427a5e3a877c1c259b8d2b216ddb65e185 Author: Eric Whitney Date: Thu Nov 17 10:22:07 2022 -0500 ext4: fix delayed allocation bug in ext4_clu_mapped for bigalloc + inline commit 131294c35ed6f777bd4e79d42af13b5c41bf2775 upstream. When converting files with inline data to extents, delayed allocations made on a file system created with both the bigalloc and inline options can result in invalid extent status cache content, incorrect reserved cluster counts, kernel memory leaks, and potential kernel panics. With bigalloc, the code that determines whether a block must be delayed allocated searches the extent tree to see if that block maps to a previously allocated cluster. If not, the block is delayed allocated, and otherwise, it isn't. However, if the inline option is also used, and if the file containing the block is marked as able to store data inline, there isn't a valid extent tree associated with the file. The current code in ext4_clu_mapped() calls ext4_find_extent() to search the non-existent tree for a previously allocated cluster anyway, which typically finds nothing, as desired. However, a side effect of the search can be to cache invalid content from the non-existent tree (garbage) in the extent status tree, including bogus entries in the pending reservation tree. To fix this, avoid searching the extent tree when allocating blocks for bigalloc + inline files that are being converted from inline to extent mapped. Signed-off-by: Eric Whitney Link: https://lore.kernel.org/r/20221117152207.2424-1-enwlinux@gmail.com Signed-off-by: Theodore Ts'o Cc: stable@kernel.org Signed-off-by: Greg Kroah-Hartman commit def7a39091e60e1c4a2f623629082a00092602be Author: Ye Bin Date: Mon Nov 7 09:53:35 2022 +0800 ext4: init quota for 'old.inode' in 'ext4_rename' commit fae381a3d79bb94aa2eb752170d47458d778b797 upstream. Syzbot found the following issue: ext4_parse_param: s_want_extra_isize=128 ext4_inode_info_init: s_want_extra_isize=32 ext4_rename: old.inode=ffff88823869a2c8 old.dir=ffff888238699828 new.inode=ffff88823869d7e8 new.dir=ffff888238699828 __ext4_mark_inode_dirty: inode=ffff888238699828 ea_isize=32 want_ea_size=128 __ext4_mark_inode_dirty: inode=ffff88823869a2c8 ea_isize=32 want_ea_size=128 ext4_xattr_block_set: inode=ffff88823869a2c8 ------------[ cut here ]------------ WARNING: CPU: 13 PID: 2234 at fs/ext4/xattr.c:2070 ext4_xattr_block_set.cold+0x22/0x980 Modules linked in: RIP: 0010:ext4_xattr_block_set.cold+0x22/0x980 RSP: 0018:ffff888227d3f3b0 EFLAGS: 00010202 RAX: 0000000000000001 RBX: ffff88823007a000 RCX: 0000000000000000 RDX: 0000000000000a03 RSI: 0000000000000040 RDI: ffff888230078178 RBP: 0000000000000000 R08: 000000000000002c R09: ffffed1075c7df8e R10: ffff8883ae3efc6b R11: ffffed1075c7df8d R12: 0000000000000000 R13: ffff88823869a2c8 R14: ffff8881012e0460 R15: dffffc0000000000 FS: 00007f350ac1f740(0000) GS:ffff8883ae200000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f350a6ed6a0 CR3: 0000000237456000 CR4: 00000000000006e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: ? ext4_xattr_set_entry+0x3b7/0x2320 ? ext4_xattr_block_set+0x0/0x2020 ? ext4_xattr_set_entry+0x0/0x2320 ? ext4_xattr_check_entries+0x77/0x310 ? ext4_xattr_ibody_set+0x23b/0x340 ext4_xattr_move_to_block+0x594/0x720 ext4_expand_extra_isize_ea+0x59a/0x10f0 __ext4_expand_extra_isize+0x278/0x3f0 __ext4_mark_inode_dirty.cold+0x347/0x410 ext4_rename+0xed3/0x174f vfs_rename+0x13a7/0x2510 do_renameat2+0x55d/0x920 __x64_sys_rename+0x7d/0xb0 do_syscall_64+0x3b/0xa0 entry_SYSCALL_64_after_hwframe+0x72/0xdc As 'ext4_rename' will modify 'old.inode' ctime and mark inode dirty, which may trigger expand 'extra_isize' and allocate block. If inode didn't init quota will lead to warning. To solve above issue, init 'old.inode' firstly in 'ext4_rename'. Reported-by: syzbot+98346927678ac3059c77@syzkaller.appspotmail.com Signed-off-by: Ye Bin Reviewed-by: Jan Kara Link: https://lore.kernel.org/r/20221107015335.2524319-1-yebin@huaweicloud.com Signed-off-by: Theodore Ts'o Cc: stable@kernel.org Signed-off-by: Greg Kroah-Hartman commit 3c31d8d3ad95aef8cc17a4fcf317e46217148439 Author: Ye Bin Date: Thu Nov 17 15:36:03 2022 +0800 ext4: fix uninititialized value in 'ext4_evict_inode' commit 7ea71af94eaaaf6d9aed24bc94a05b977a741cb9 upstream. Syzbot found the following issue: ===================================================== BUG: KMSAN: uninit-value in ext4_evict_inode+0xdd/0x26b0 fs/ext4/inode.c:180 ext4_evict_inode+0xdd/0x26b0 fs/ext4/inode.c:180 evict+0x365/0x9a0 fs/inode.c:664 iput_final fs/inode.c:1747 [inline] iput+0x985/0xdd0 fs/inode.c:1773 __ext4_new_inode+0xe54/0x7ec0 fs/ext4/ialloc.c:1361 ext4_mknod+0x376/0x840 fs/ext4/namei.c:2844 vfs_mknod+0x79d/0x830 fs/namei.c:3914 do_mknodat+0x47d/0xaa0 __do_sys_mknodat fs/namei.c:3992 [inline] __se_sys_mknodat fs/namei.c:3989 [inline] __ia32_sys_mknodat+0xeb/0x150 fs/namei.c:3989 do_syscall_32_irqs_on arch/x86/entry/common.c:112 [inline] __do_fast_syscall_32+0xa2/0x100 arch/x86/entry/common.c:178 do_fast_syscall_32+0x33/0x70 arch/x86/entry/common.c:203 do_SYSENTER_32+0x1b/0x20 arch/x86/entry/common.c:246 entry_SYSENTER_compat_after_hwframe+0x70/0x82 Uninit was created at: __alloc_pages+0x9f1/0xe80 mm/page_alloc.c:5578 alloc_pages+0xaae/0xd80 mm/mempolicy.c:2285 alloc_slab_page mm/slub.c:1794 [inline] allocate_slab+0x1b5/0x1010 mm/slub.c:1939 new_slab mm/slub.c:1992 [inline] ___slab_alloc+0x10c3/0x2d60 mm/slub.c:3180 __slab_alloc mm/slub.c:3279 [inline] slab_alloc_node mm/slub.c:3364 [inline] slab_alloc mm/slub.c:3406 [inline] __kmem_cache_alloc_lru mm/slub.c:3413 [inline] kmem_cache_alloc_lru+0x6f3/0xb30 mm/slub.c:3429 alloc_inode_sb include/linux/fs.h:3117 [inline] ext4_alloc_inode+0x5f/0x860 fs/ext4/super.c:1321 alloc_inode+0x83/0x440 fs/inode.c:259 new_inode_pseudo fs/inode.c:1018 [inline] new_inode+0x3b/0x430 fs/inode.c:1046 __ext4_new_inode+0x2a7/0x7ec0 fs/ext4/ialloc.c:959 ext4_mkdir+0x4d5/0x1560 fs/ext4/namei.c:2992 vfs_mkdir+0x62a/0x870 fs/namei.c:4035 do_mkdirat+0x466/0x7b0 fs/namei.c:4060 __do_sys_mkdirat fs/namei.c:4075 [inline] __se_sys_mkdirat fs/namei.c:4073 [inline] __ia32_sys_mkdirat+0xc4/0x120 fs/namei.c:4073 do_syscall_32_irqs_on arch/x86/entry/common.c:112 [inline] __do_fast_syscall_32+0xa2/0x100 arch/x86/entry/common.c:178 do_fast_syscall_32+0x33/0x70 arch/x86/entry/common.c:203 do_SYSENTER_32+0x1b/0x20 arch/x86/entry/common.c:246 entry_SYSENTER_compat_after_hwframe+0x70/0x82 CPU: 1 PID: 4625 Comm: syz-executor.2 Not tainted 6.1.0-rc4-syzkaller-62821-gcb231e2f67ec #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/26/2022 ===================================================== Now, 'ext4_alloc_inode()' didn't init 'ei->i_flags'. If new inode failed before set 'ei->i_flags' in '__ext4_new_inode()', then do 'iput()'. As after 6bc0d63dad7f commit will access 'ei->i_flags' in 'ext4_evict_inode()' which will lead to access uninit-value. To solve above issue just init 'ei->i_flags' in 'ext4_alloc_inode()'. Reported-by: syzbot+57b25da729eb0b88177d@syzkaller.appspotmail.com Signed-off-by: Ye Bin Fixes: 6bc0d63dad7f ("ext4: remove EA inode entry from mbcache on inode eviction") Reviewed-by: Jan Kara Reviewed-by: Eric Biggers Link: https://lore.kernel.org/r/20221117073603.2598882-1-yebin@huaweicloud.com Signed-off-by: Theodore Ts'o Cc: stable@kernel.org Signed-off-by: Greg Kroah-Hartman commit 871800770d7f2f952c7249ad52485c3564dab44e Author: Eric Biggers Date: Sun Nov 6 14:48:37 2022 -0800 ext4: fix leaking uninitialized memory in fast-commit journal commit 594bc43b410316d70bb42aeff168837888d96810 upstream. When space at the end of fast-commit journal blocks is unused, make sure to zero it out so that uninitialized memory is not leaked to disk. Fixes: aa75f4d3daae ("ext4: main fast-commit commit path") Cc: # v5.10+ Signed-off-by: Eric Biggers Link: https://lore.kernel.org/r/20221106224841.279231-4-ebiggers@kernel.org Signed-off-by: Theodore Ts'o Signed-off-by: Greg Kroah-Hartman commit d480a49c15c465cb9a16db1379f4996e9b5bb9cc Author: Baokun Li Date: Wed Oct 26 12:23:10 2022 +0800 ext4: fix bug_on in __es_tree_search caused by bad boot loader inode commit 991ed014de0840c5dc405b679168924afb2952ac upstream. We got a issue as fllows: ================================================================== kernel BUG at fs/ext4/extents_status.c:203! invalid opcode: 0000 [#1] PREEMPT SMP CPU: 1 PID: 945 Comm: cat Not tainted 6.0.0-next-20221007-dirty #349 RIP: 0010:ext4_es_end.isra.0+0x34/0x42 RSP: 0018:ffffc9000143b768 EFLAGS: 00010203 RAX: 0000000000000000 RBX: ffff8881769cd0b8 RCX: 0000000000000000 RDX: 0000000000000000 RSI: ffffffff8fc27cf7 RDI: 00000000ffffffff RBP: ffff8881769cd0bc R08: 0000000000000000 R09: ffffc9000143b5f8 R10: 0000000000000001 R11: 0000000000000001 R12: ffff8881769cd0a0 R13: ffff8881768e5668 R14: 00000000768e52f0 R15: 0000000000000000 FS: 00007f359f7f05c0(0000)GS:ffff88842fd00000(0000)knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f359f5a2000 CR3: 000000017130c000 CR4: 00000000000006e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: __es_tree_search.isra.0+0x6d/0xf5 ext4_es_cache_extent+0xfa/0x230 ext4_cache_extents+0xd2/0x110 ext4_find_extent+0x5d5/0x8c0 ext4_ext_map_blocks+0x9c/0x1d30 ext4_map_blocks+0x431/0xa50 ext4_mpage_readpages+0x48e/0xe40 ext4_readahead+0x47/0x50 read_pages+0x82/0x530 page_cache_ra_unbounded+0x199/0x2a0 do_page_cache_ra+0x47/0x70 page_cache_ra_order+0x242/0x400 ondemand_readahead+0x1e8/0x4b0 page_cache_sync_ra+0xf4/0x110 filemap_get_pages+0x131/0xb20 filemap_read+0xda/0x4b0 generic_file_read_iter+0x13a/0x250 ext4_file_read_iter+0x59/0x1d0 vfs_read+0x28f/0x460 ksys_read+0x73/0x160 __x64_sys_read+0x1e/0x30 do_syscall_64+0x35/0x80 entry_SYSCALL_64_after_hwframe+0x63/0xcd ================================================================== In the above issue, ioctl invokes the swap_inode_boot_loader function to swap inode<5> and inode<12>. However, inode<5> contain incorrect imode and disordered extents, and i_nlink is set to 1. The extents check for inode in the ext4_iget function can be bypassed bacause 5 is EXT4_BOOT_LOADER_INO. While links_count is set to 1, the extents are not initialized in swap_inode_boot_loader. After the ioctl command is executed successfully, the extents are swapped to inode<12>, in this case, run the `cat` command to view inode<12>. And Bug_ON is triggered due to the incorrect extents. When the boot loader inode is not initialized, its imode can be one of the following: 1) the imode is a bad type, which is marked as bad_inode in ext4_iget and set to S_IFREG. 2) the imode is good type but not S_IFREG. 3) the imode is S_IFREG. The BUG_ON may be triggered by bypassing the check in cases 1 and 2. Therefore, when the boot loader inode is bad_inode or its imode is not S_IFREG, initialize the inode to avoid triggering the BUG. Signed-off-by: Baokun Li Reviewed-by: Jason Yan Reviewed-by: Jan Kara Link: https://lore.kernel.org/r/20221026042310.3839669-5-libaokun1@huawei.com Signed-off-by: Theodore Ts'o Cc: stable@kernel.org Signed-off-by: Greg Kroah-Hartman commit 91009e361e8cb2cbd1dc9496cb5fb4f8de3f4b11 Author: Zhang Yi Date: Wed Jun 29 19:26:47 2022 +0800 ext4: check and assert if marking an no_delete evicting inode dirty commit 318cdc822c63b6e2befcfdc2088378ae6fa18def upstream. In ext4_evict_inode(), if we evicting an inode in the 'no_delete' path, it cannot be raced by another mark_inode_dirty(). If it happens, someone else may accidentally dirty it without holding inode refcount and probably cause use-after-free issues in the writeback procedure. It's indiscoverable and hard to debug, so add an WARN_ON_ONCE() to check and detect this issue in advance. Suggested-by: Jan Kara Signed-off-by: Zhang Yi Reviewed-by: Jan Kara Link: https://lore.kernel.org/r/20220629112647.4141034-2-yi.zhang@huawei.com Signed-off-by: Theodore Ts'o Cc: stable@kernel.org Signed-off-by: Greg Kroah-Hartman commit 820eacbc4e4f66ffb82385aa98eb920f6d61a59c Author: Ye Bin Date: Thu Dec 8 11:34:24 2022 +0800 ext4: fix reserved cluster accounting in __es_remove_extent() commit 1da18e38cb97e9521e93d63034521a9649524f64 upstream. When bigalloc is enabled, reserved cluster accounting for delayed allocation is handled in extent_status.c. With a corrupted file system, it's possible for this accounting to be incorrect, dsicovered by Syzbot: EXT4-fs error (device loop0): ext4_validate_block_bitmap:398: comm rep: bg 0: block 5: invalid block bitmap EXT4-fs (loop0): Delayed block allocation failed for inode 18 at logical offset 0 with max blocks 32 with error 28 EXT4-fs (loop0): This should not happen!! Data will be lost EXT4-fs (loop0): Total free blocks count 0 EXT4-fs (loop0): Free/Dirty block details EXT4-fs (loop0): free_blocks=0 EXT4-fs (loop0): dirty_blocks=32 EXT4-fs (loop0): Block reservation details EXT4-fs (loop0): i_reserved_data_blocks=2 EXT4-fs (loop0): Inode 18 (00000000845cd634): i_reserved_data_blocks (1) not cleared! Above issue happens as follows: Assume: sbi->s_cluster_ratio = 16 Step1: Insert delay block [0, 31] -> ei->i_reserved_data_blocks=2 Step2: ext4_writepages mpage_map_and_submit_extent -> return failed mpage_release_unused_pages -> to release [0, 30] ext4_es_remove_extent -> remove lblk=0 end=30 __es_remove_extent -> len1=0 len2=31-30=1 __es_remove_extent: ... if (len2 > 0) { ... if (len1 > 0) { ... } else { es->es_lblk = end + 1; es->es_len = len2; ... } if (count_reserved) count_rsvd(inode, lblk, ...); goto out; -> will return but didn't calculate 'reserved' ... Step3: ext4_destroy_inode -> trigger "i_reserved_data_blocks (1) not cleared!" To solve above issue if 'len2>0' call 'get_rsvd()' before goto out. Reported-by: syzbot+05a0f0ccab4a25626e38@syzkaller.appspotmail.com Fixes: 8fcc3a580651 ("ext4: rework reserved cluster accounting when invalidating pages") Signed-off-by: Ye Bin Reviewed-by: Eric Whitney Link: https://lore.kernel.org/r/20221208033426.1832460-2-yebin@huaweicloud.com Signed-off-by: Theodore Ts'o Cc: stable@kernel.org Signed-off-by: Greg Kroah-Hartman commit 0dcbf4dc3d54aab5990952cfd832042fb300dbe3 Author: Baokun Li Date: Wed Oct 26 12:23:07 2022 +0800 ext4: fix bug_on in __es_tree_search caused by bad quota inode commit d323877484765aaacbb2769b06e355c2041ed115 upstream. We got a issue as fllows: ================================================================== kernel BUG at fs/ext4/extents_status.c:202! invalid opcode: 0000 [#1] PREEMPT SMP CPU: 1 PID: 810 Comm: mount Not tainted 6.1.0-rc1-next-g9631525255e3 #352 RIP: 0010:__es_tree_search.isra.0+0xb8/0xe0 RSP: 0018:ffffc90001227900 EFLAGS: 00010202 RAX: 0000000000000000 RBX: 0000000077512a0f RCX: 0000000000000000 RDX: 0000000000000002 RSI: 0000000000002a10 RDI: ffff8881004cd0c8 RBP: ffff888177512ac8 R08: 47ffffffffffffff R09: 0000000000000001 R10: 0000000000000001 R11: 00000000000679af R12: 0000000000002a10 R13: ffff888177512d88 R14: 0000000077512a10 R15: 0000000000000000 FS: 00007f4bd76dbc40(0000)GS:ffff88842fd00000(0000)knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00005653bf993cf8 CR3: 000000017bfdf000 CR4: 00000000000006e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: ext4_es_cache_extent+0xe2/0x210 ext4_cache_extents+0xd2/0x110 ext4_find_extent+0x5d5/0x8c0 ext4_ext_map_blocks+0x9c/0x1d30 ext4_map_blocks+0x431/0xa50 ext4_getblk+0x82/0x340 ext4_bread+0x14/0x110 ext4_quota_read+0xf0/0x180 v2_read_header+0x24/0x90 v2_check_quota_file+0x2f/0xa0 dquot_load_quota_sb+0x26c/0x760 dquot_load_quota_inode+0xa5/0x190 ext4_enable_quotas+0x14c/0x300 __ext4_fill_super+0x31cc/0x32c0 ext4_fill_super+0x115/0x2d0 get_tree_bdev+0x1d2/0x360 ext4_get_tree+0x19/0x30 vfs_get_tree+0x26/0xe0 path_mount+0x81d/0xfc0 do_mount+0x8d/0xc0 __x64_sys_mount+0xc0/0x160 do_syscall_64+0x35/0x80 entry_SYSCALL_64_after_hwframe+0x63/0xcd ================================================================== Above issue may happen as follows: ------------------------------------- ext4_fill_super ext4_orphan_cleanup ext4_enable_quotas ext4_quota_enable ext4_iget --> get error inode <5> ext4_ext_check_inode --> Wrong imode makes it escape inspection make_bad_inode(inode) --> EXT4_BOOT_LOADER_INO set imode dquot_load_quota_inode vfs_setup_quota_inode --> check pass dquot_load_quota_sb v2_check_quota_file v2_read_header ext4_quota_read ext4_bread ext4_getblk ext4_map_blocks ext4_ext_map_blocks ext4_find_extent ext4_cache_extents ext4_es_cache_extent __es_tree_search.isra.0 ext4_es_end --> Wrong extents trigger BUG_ON In the above issue, s_usr_quota_inum is set to 5, but inode<5> contains incorrect imode and disordered extents. Because 5 is EXT4_BOOT_LOADER_INO, the ext4_ext_check_inode check in the ext4_iget function can be bypassed, finally, the extents that are not checked trigger the BUG_ON in the __es_tree_search function. To solve this issue, check whether the inode is bad_inode in vfs_setup_quota_inode(). Signed-off-by: Baokun Li Reviewed-by: Chaitanya Kulkarni Reviewed-by: Jason Yan Reviewed-by: Jan Kara Link: https://lore.kernel.org/r/20221026042310.3839669-2-libaokun1@huawei.com Signed-off-by: Theodore Ts'o Cc: stable@kernel.org Signed-off-by: Greg Kroah-Hartman commit 06a20a68bb6d76153a0831cd33051285e0444245 Author: Baokun Li Date: Wed Oct 26 12:23:08 2022 +0800 ext4: add helper to check quota inums commit 07342ec259df2a35d6a34aebce010567a80a0e15 upstream. Before quota is enabled, a check on the preset quota inums in ext4_super_block is added to prevent wrong quota inodes from being loaded. In addition, when the quota fails to be enabled, the quota type and quota inum are printed to facilitate fault locating. Signed-off-by: Baokun Li Reviewed-by: Jason Yan Reviewed-by: Jan Kara Link: https://lore.kernel.org/r/20221026042310.3839669-3-libaokun1@huawei.com Signed-off-by: Theodore Ts'o Cc: stable@kernel.org Signed-off-by: Greg Kroah-Hartman commit f7e6b5548f915d7aa435d0764d41eacfb49c6e09 Author: Baokun Li Date: Wed Oct 26 12:23:09 2022 +0800 ext4: add EXT4_IGET_BAD flag to prevent unexpected bad inode commit 63b1e9bccb71fe7d7e3ddc9877dbdc85e5d2d023 upstream. There are many places that will get unhappy (and crash) when ext4_iget() returns a bad inode. However, if iget the boot loader inode, allows a bad inode to be returned, because the inode may not be initialized. This mechanism can be used to bypass some checks and cause panic. To solve this problem, we add a special iget flag EXT4_IGET_BAD. Only with this flag we'd be returning bad inode from ext4_iget(), otherwise we always return the error code if the inode is bad inode.(suggested by Jan Kara) Signed-off-by: Baokun Li Reviewed-by: Jason Yan Reviewed-by: Jan Kara Link: https://lore.kernel.org/r/20221026042310.3839669-4-libaokun1@huawei.com Signed-off-by: Theodore Ts'o Cc: stable@kernel.org Signed-off-by: Greg Kroah-Hartman commit 205ac16628aca9093931fcbdb4bcd00d0eb94132 Author: Gaosheng Cui Date: Mon Oct 31 13:58:33 2022 +0800 ext4: fix undefined behavior in bit shift for ext4_check_flag_values commit 3bf678a0f9c017c9ba7c581541dbc8453452a7ae upstream. Shifting signed 32-bit value by 31 bits is undefined, so changing significant bit to unsigned. The UBSAN warning calltrace like below: UBSAN: shift-out-of-bounds in fs/ext4/ext4.h:591:2 left shift of 1 by 31 places cannot be represented in type 'int' Call Trace: dump_stack_lvl+0x7d/0xa5 dump_stack+0x15/0x1b ubsan_epilogue+0xe/0x4e __ubsan_handle_shift_out_of_bounds+0x1e7/0x20c ext4_init_fs+0x5a/0x277 do_one_initcall+0x76/0x430 kernel_init_freeable+0x3b3/0x422 kernel_init+0x24/0x1e0 ret_from_fork+0x1f/0x30 Fixes: 9a4c80194713 ("ext4: ensure Inode flags consistency are checked at build time") Signed-off-by: Gaosheng Cui Link: https://lore.kernel.org/r/20221031055833.3966222-1-cuigaosheng1@huawei.com Signed-off-by: Theodore Ts'o Cc: stable@kernel.org Signed-off-by: Greg Kroah-Hartman commit cf0e0817b0f925b70d101d7014ea81b7094e1159 Author: Baokun Li Date: Wed Nov 2 16:06:33 2022 +0800 ext4: fix use-after-free in ext4_orphan_cleanup commit a71248b1accb2b42e4980afef4fa4a27fa0e36f5 upstream. I caught a issue as follows: ================================================================== BUG: KASAN: use-after-free in __list_add_valid+0x28/0x1a0 Read of size 8 at addr ffff88814b13f378 by task mount/710 CPU: 1 PID: 710 Comm: mount Not tainted 6.1.0-rc3-next #370 Call Trace: dump_stack_lvl+0x73/0x9f print_report+0x25d/0x759 kasan_report+0xc0/0x120 __asan_load8+0x99/0x140 __list_add_valid+0x28/0x1a0 ext4_orphan_cleanup+0x564/0x9d0 [ext4] __ext4_fill_super+0x48e2/0x5300 [ext4] ext4_fill_super+0x19f/0x3a0 [ext4] get_tree_bdev+0x27b/0x450 ext4_get_tree+0x19/0x30 [ext4] vfs_get_tree+0x49/0x150 path_mount+0xaae/0x1350 do_mount+0xe2/0x110 __x64_sys_mount+0xf0/0x190 do_syscall_64+0x35/0x80 entry_SYSCALL_64_after_hwframe+0x63/0xcd [...] ================================================================== Above issue may happen as follows: ------------------------------------- ext4_fill_super ext4_orphan_cleanup --- loop1: assume last_orphan is 12 --- list_add(&EXT4_I(inode)->i_orphan, &EXT4_SB(sb)->s_orphan) ext4_truncate --> return 0 ext4_inode_attach_jinode --> return -ENOMEM iput(inode) --> free inode<12> --- loop2: last_orphan is still 12 --- list_add(&EXT4_I(inode)->i_orphan, &EXT4_SB(sb)->s_orphan); // use inode<12> and trigger UAF To solve this issue, we need to propagate the return value of ext4_inode_attach_jinode() appropriately. Signed-off-by: Baokun Li Reviewed-by: Jan Kara Link: https://lore.kernel.org/r/20221102080633.1630225-1-libaokun1@huawei.com Signed-off-by: Theodore Ts'o Cc: stable@kernel.org Signed-off-by: Greg Kroah-Hartman commit 970bfd7a4188dc1e6cdb82bad0690f12ffb5c79b Author: Alexander Potapenko Date: Mon Nov 21 12:21:30 2022 +0100 fs: ext4: initialize fsdata in pagecache_write() commit 956510c0c7439e90b8103aaeaf4da92878c622f0 upstream. When aops->write_begin() does not initialize fsdata, KMSAN reports an error passing the latter to aops->write_end(). Fix this by unconditionally initializing fsdata. Cc: Eric Biggers Fixes: c93d8f885809 ("ext4: add basic fs-verity support") Reported-by: syzbot+9767be679ef5016b6082@syzkaller.appspotmail.com Signed-off-by: Alexander Potapenko Reviewed-by: Eric Biggers Link: https://lore.kernel.org/r/20221121112134.407362-1-glider@google.com Signed-off-by: Theodore Ts'o Cc: stable@kernel.org Signed-off-by: Greg Kroah-Hartman commit 744bbde378a53b90287bb7111daaa46a2c849e88 Author: Luís Henriques Date: Tue Oct 11 16:57:58 2022 +0100 ext4: remove trailing newline from ext4_msg() message commit 78742d4d056df7d2fad241c90185d281bf924844 upstream. The ext4_msg() function adds a new line to the message. Remove extra '\n' from call to ext4_msg() in ext4_orphan_cleanup(). Signed-off-by: Luís Henriques Link: https://lore.kernel.org/r/20221011155758.15287-1-lhenriques@suse.de Signed-off-by: Theodore Ts'o Cc: stable@kernel.org Signed-off-by: Greg Kroah-Hartman commit 7192afa5e4bfa1316a3ad4875562e9b123af7c06 Author: Baokun Li Date: Wed Aug 17 21:27:01 2022 +0800 ext4: add inode table check in __ext4_get_inode_loc to aovid possible infinite loop commit eee22187b53611e173161e38f61de1c7ecbeb876 upstream. In do_writepages, if the value returned by ext4_writepages is "-ENOMEM" and "wbc->sync_mode == WB_SYNC_ALL", retry until the condition is not met. In __ext4_get_inode_loc, if the bh returned by sb_getblk is NULL, the function returns -ENOMEM. In __getblk_slow, if the return value of grow_buffers is less than 0, the function returns NULL. When the three processes are connected in series like the following stack, an infinite loop may occur: do_writepages <--- keep retrying ext4_writepages mpage_map_and_submit_extent mpage_map_one_extent ext4_map_blocks ext4_ext_map_blocks ext4_ext_handle_unwritten_extents ext4_ext_convert_to_initialized ext4_split_extent ext4_split_extent_at __ext4_ext_dirty __ext4_mark_inode_dirty ext4_reserve_inode_write ext4_get_inode_loc __ext4_get_inode_loc <--- return -ENOMEM sb_getblk __getblk_gfp __getblk_slow <--- return NULL grow_buffers grow_dev_page <--- return -ENXIO ret = (block < end_block) ? 1 : -ENXIO; In this issue, bg_inode_table_hi is overwritten as an incorrect value. As a result, `block < end_block` cannot be met in grow_dev_page. Therefore, __ext4_get_inode_loc always returns '-ENOMEM' and do_writepages keeps retrying. As a result, the writeback process is in the D state due to an infinite loop. Add a check on inode table block in the __ext4_get_inode_loc function by referring to ext4_read_inode_bitmap to avoid this infinite loop. Cc: stable@kernel.org Signed-off-by: Baokun Li Reviewed-by: Ritesh Harjani (IBM) Link: https://lore.kernel.org/r/20220817132701.3015912-3-libaokun1@huawei.com Signed-off-by: Theodore Ts'o Signed-off-by: Greg Kroah-Hartman commit 0d041b7251c13679a0f6c7926751ce1d8a7237c1 Author: Zhang Yi Date: Wed Jun 29 19:26:46 2022 +0800 ext4: silence the warning when evicting inode with dioread_nolock commit bc12ac98ea2e1b70adc6478c8b473a0003b659d3 upstream. When evicting an inode with default dioread_nolock, it could be raced by the unwritten extents converting kworker after writeback some new allocated dirty blocks. It convert unwritten extents to written, the extents could be merged to upper level and free extent blocks, so it could mark the inode dirty again even this inode has been marked I_FREEING. But the inode->i_io_list check and warning in ext4_evict_inode() missing this corner case. Fortunately, ext4_evict_inode() will wait all extents converting finished before this check, so it will not lead to inode use-after-free problem, every thing is OK besides this warning. The WARN_ON_ONCE was originally designed for finding inode use-after-free issues in advance, but if we add current dioread_nolock case in, it will become not quite useful, so fix this warning by just remove this check. ====== WARNING: CPU: 7 PID: 1092 at fs/ext4/inode.c:227 ext4_evict_inode+0x875/0xc60 ... RIP: 0010:ext4_evict_inode+0x875/0xc60 ... Call Trace: evict+0x11c/0x2b0 iput+0x236/0x3a0 do_unlinkat+0x1b4/0x490 __x64_sys_unlinkat+0x4c/0xb0 do_syscall_64+0x3b/0x90 entry_SYSCALL_64_after_hwframe+0x46/0xb0 RIP: 0033:0x7fa933c1115b ====== rm kworker ext4_end_io_end() vfs_unlink() ext4_unlink() ext4_convert_unwritten_io_end_vec() ext4_convert_unwritten_extents() ext4_map_blocks() ext4_ext_map_blocks() ext4_ext_try_to_merge_up() __mark_inode_dirty() check !I_FREEING locked_inode_to_wb_and_lock_list() iput() iput_final() evict() ext4_evict_inode() truncate_inode_pages_final() //wait release io_end inode_io_list_move_locked() ext4_release_io_end() trigger WARN_ON_ONCE() Cc: stable@kernel.org Fixes: ceff86fddae8 ("ext4: Avoid freeing inodes on dirty list") Signed-off-by: Zhang Yi Reviewed-by: Jan Kara Link: https://lore.kernel.org/r/20220629112647.4141034-1-yi.zhang@huawei.com Signed-off-by: Theodore Ts'o Signed-off-by: Greg Kroah-Hartman commit af4ceb00ebeae5cc025ebeaf858e0a9785acee47 Author: Yuan Can Date: Fri Nov 4 06:45:12 2022 +0000 drm/ingenic: Fix missing platform_driver_unregister() call in ingenic_drm_init() commit 47078311b8efebdefd5b3b2f87e2b02b14f49c66 upstream. A problem about modprobe ingenic-drm failed is triggered with the following log given: [ 303.561088] Error: Driver 'ingenic-ipu' is already registered, aborting... modprobe: ERROR: could not insert 'ingenic_drm': Device or resource busy The reason is that ingenic_drm_init() returns platform_driver_register() directly without checking its return value, if platform_driver_register() failed, it returns without unregistering ingenic_ipu_driver_ptr, resulting the ingenic-drm can never be installed later. A simple call graph is shown as below: ingenic_drm_init() platform_driver_register() # ingenic_ipu_driver_ptr are registered platform_driver_register() driver_register() bus_add_driver() priv = kzalloc(...) # OOM happened # return without unregister ingenic_ipu_driver_ptr Fixing this problem by checking the return value of platform_driver_register() and do platform_unregister_drivers() if error happened. Fixes: fc1acf317b01 ("drm/ingenic: Add support for the IPU") Signed-off-by: Yuan Can Cc: stable@vger.kernel.org Signed-off-by: Paul Cercueil Link: https://patchwork.freedesktop.org/patch/msgid/20221104064512.8569-1-yuancan@huawei.com Signed-off-by: Greg Kroah-Hartman commit c919e1154b8c25d0da7b45655f6026761dcef5c6 Author: Mikko Kovanen Date: Sat Nov 26 13:27:13 2022 +0000 drm/i915/dsi: fix VBT send packet port selection for dual link DSI commit f9cdf4130671d767071607d0a7568c9bd36a68d0 upstream. intel_dsi->ports contains bitmask of enabled ports and correspondingly logic for selecting port for VBT packet sending must use port specific bitmask when deciding appropriate port. Fixes: 08c59dde71b7 ("drm/i915/dsi: fix VBT send packet port selection for ICL+") Cc: stable@vger.kernel.org Signed-off-by: Mikko Kovanen Reviewed-by: Jani Nikula Signed-off-by: Jani Nikula Link: https://patchwork.freedesktop.org/patch/msgid/DBBPR09MB466592B16885D99ABBF2393A91119@DBBPR09MB4665.eurprd09.prod.outlook.com (cherry picked from commit 8d58bb7991c45f6b60710cc04c9498c6ea96db90) Signed-off-by: Rodrigo Vivi Signed-off-by: Greg Kroah-Hartman commit 6948e570f54f2044dd4da444b10471373a047eeb Author: Zack Rusin Date: Tue Oct 25 23:19:35 2022 -0400 drm/vmwgfx: Validate the box size for the snooped cursor commit 4cf949c7fafe21e085a4ee386bb2dade9067316e upstream. Invalid userspace dma surface copies could potentially overflow the memcpy from the surface to the snooped image leading to crashes. To fix it the dimensions of the copybox have to be validated against the expected size of the snooped cursor. Signed-off-by: Zack Rusin Fixes: 2ac863719e51 ("vmwgfx: Snoop DMA transfers with non-covering sizes") Cc: # v3.2+ Reviewed-by: Michael Banack Reviewed-by: Martin Krastev Link: https://patchwork.freedesktop.org/patch/msgid/20221026031936.1004280-1-zack@kde.org Signed-off-by: Greg Kroah-Hartman commit 5594fde1ef533ffb2fe8a132d9edb5120148389d Author: Simon Ser Date: Mon Oct 17 15:32:01 2022 +0000 drm/connector: send hotplug uevent on connector cleanup commit 6fdc2d490ea1369d17afd7e6eb66fecc5b7209bc upstream. A typical DP-MST unplug removes a KMS connector. However care must be taken to properly synchronize with user-space. The expected sequence of events is the following: 1. The kernel notices that the DP-MST port is gone. 2. The kernel marks the connector as disconnected, then sends a uevent to make user-space re-scan the connector list. 3. User-space notices the connector goes from connected to disconnected, disables it. 4. Kernel handles the IOCTL disabling the connector. On success, the very last reference to the struct drm_connector is dropped and drm_connector_cleanup() is called. 5. The connector is removed from the list, and a uevent is sent to tell user-space that the connector disappeared. The very last step was missing. As a result, user-space thought the connector still existed and could try to disable it again. Since the kernel no longer knows about the connector, that would end up with EINVAL and confused user-space. Fix this by sending a hotplug uevent from drm_connector_cleanup(). Signed-off-by: Simon Ser Cc: stable@vger.kernel.org Cc: Daniel Vetter Cc: Lyude Paul Cc: Jonas Ådahl Tested-by: Jonas Ådahl Reviewed-by: Lyude Paul Link: https://patchwork.freedesktop.org/patch/msgid/20221017153150.60675-2-contact@emersion.fr Signed-off-by: Greg Kroah-Hartman commit 317ebe61a6d4681c63e90a679740b6aca0aff498 Author: Wang Weiyang Date: Tue Oct 25 19:31:01 2022 +0800 device_cgroup: Roll back to original exceptions after copy failure commit e68bfbd3b3c3a0ec3cf8c230996ad8cabe90322f upstream. When add the 'a *:* rwm' entry to devcgroup A's whitelist, at first A's exceptions will be cleaned and A's behavior is changed to DEVCG_DEFAULT_ALLOW. Then parent's exceptions will be copyed to A's whitelist. If copy failure occurs, just return leaving A to grant permissions to all devices. And A may grant more permissions than parent. Backup A's whitelist and recover original exceptions after copy failure. Cc: stable@vger.kernel.org Fixes: 4cef7299b478 ("device_cgroup: add proper checking when changing default behavior") Signed-off-by: Wang Weiyang Reviewed-by: Aristeu Rozanski Signed-off-by: Paul Moore Signed-off-by: Greg Kroah-Hartman commit ac838c663ba1fd6bff35a817fd89a47ab55e88e0 Author: Shang XiaoJing Date: Thu Nov 17 10:45:14 2022 +0800 parisc: led: Fix potential null-ptr-deref in start_task() commit 41f563ab3c33698bdfc3403c7c2e6c94e73681e4 upstream. start_task() calls create_singlethread_workqueue() and not checked the ret value, which may return NULL. And a null-ptr-deref may happen: start_task() create_singlethread_workqueue() # failed, led_wq is NULL queue_delayed_work() queue_delayed_work_on() __queue_delayed_work() # warning here, but continue __queue_work() # access wq->flags, null-ptr-deref Check the ret value and return -ENOMEM if it is NULL. Fixes: 3499495205a6 ("[PARISC] Use work queue in LED/LCD driver instead of tasklet.") Signed-off-by: Shang XiaoJing Signed-off-by: Helge Deller Cc: Signed-off-by: Greg Kroah-Hartman commit 2c1881f0816aa1a35f808808bde94fbf9cd491e2 Author: Maria Yu Date: Tue Dec 6 09:59:57 2022 +0800 remoteproc: core: Do pm_relax when in RPROC_OFFLINE state commit 11c7f9e3131ad14b27a957496088fa488b153a48 upstream. Make sure that pm_relax() happens even when the remoteproc is stopped before the crash handler work is scheduled. Signed-off-by: Maria Yu Cc: stable Fixes: a781e5aa5911 ("remoteproc: core: Prevent system suspend during remoteproc recovery") Link: https://lore.kernel.org/r/20221206015957.2616-2-quic_aiquny@quicinc.com Signed-off-by: Mathieu Poirier Signed-off-by: Greg Kroah-Hartman commit 9b615f957ca78dbe1864676fc699331e63c70e4b Author: Kim Phillips Date: Mon Sep 19 10:56:37 2022 -0500 iommu/amd: Fix ivrs_acpihid cmdline parsing code commit 5f18e9f8868c6d4eae71678e7ebd4977b7d8c8cf upstream. The second (UID) strcmp in acpi_dev_hid_uid_match considers "0" and "00" different, which can prevent device registration. Have the AMD IOMMU driver's ivrs_acpihid parsing code remove any leading zeroes to make the UID strcmp succeed. Now users can safely specify "AMDxxxxx:00" or "AMDxxxxx:0" and expect the same behaviour. Fixes: ca3bf5d47cec ("iommu/amd: Introduces ivrs_acpihid kernel parameter") Signed-off-by: Kim Phillips Cc: stable@vger.kernel.org Cc: Suravee Suthikulpanit Cc: Joerg Roedel Link: https://lore.kernel.org/r/20220919155638.391481-1-kim.phillips@amd.com Signed-off-by: Joerg Roedel Signed-off-by: Greg Kroah-Hartman commit 35b792179b108000ed5e8cae047f3f6bf75eea2f Author: Johan Hovold Date: Mon Nov 14 09:13:43 2022 +0100 phy: qcom-qmp-combo: fix sc8180x reset commit 910dd4883d757af5faac92590f33f0f7da963032 upstream. The SC8180X has two resets but the DP configuration erroneously described only one. In case the DP part of the PHY is initialised before the USB part (e.g. depending on probe order), then only the first reset would be asserted. Fixes: 1633802cd4ac ("phy: qcom: qmp: Add SC8180x USB/DP combo") Cc: stable@vger.kernel.org # 5.15 Reviewed-by: Dmitry Baryshkov Signed-off-by: Johan Hovold Link: https://lore.kernel.org/r/20221114081346.5116-4-johan+linaro@kernel.org Signed-off-by: Vinod Koul Signed-off-by: Greg Kroah-Hartman commit dfd05a13355693dd55beea21a49eed0d9dff1217 Author: Isaac J. Manjarres Date: Tue Sep 20 17:14:13 2022 -0700 driver core: Fix bus_type.match() error handling in __driver_attach() commit 27c0d217340e47ec995557f61423ef415afba987 upstream. When a driver registers with a bus, it will attempt to match with every device on the bus through the __driver_attach() function. Currently, if the bus_type.match() function encounters an error that is not -EPROBE_DEFER, __driver_attach() will return a negative error code, which causes the driver registration logic to stop trying to match with the remaining devices on the bus. This behavior is not correct; a failure while matching a driver to a device does not mean that the driver won't be able to match and bind with other devices on the bus. Update the logic in __driver_attach() to reflect this. Fixes: 656b8035b0ee ("ARM: 8524/1: driver cohandle -EPROBE_DEFER from bus_type.match()") Cc: stable@vger.kernel.org Cc: Saravana Kannan Signed-off-by: Isaac J. Manjarres Link: https://lore.kernel.org/r/20220921001414.4046492-1-isaacmanjarres@google.com Signed-off-by: Greg Kroah-Hartman commit 44618a3397412f202972c2f2be1bdeacb74fd615 Author: Mario Limonciello Date: Wed Sep 28 13:45:05 2022 -0500 crypto: ccp - Add support for TEE for PCI ID 0x14CA commit 10da230a4df1dfe32a58eb09246f5ffe82346f27 upstream. SoCs containing 0x14CA are present both in datacenter parts that support SEV as well as client parts that support TEE. Cc: stable@vger.kernel.org # 5.15+ Tested-by: Rijo-john Thomas Signed-off-by: Mario Limonciello Acked-by: Tom Lendacky Signed-off-by: Herbert Xu Signed-off-by: Greg Kroah-Hartman commit c55507a94bc66b3a4b15de3dcc8480c67d17771d Author: Corentin Labbe Date: Thu Oct 6 04:34:19 2022 +0000 crypto: n2 - add missing hash statesize commit 76a4e874593543a2dff91d249c95bac728df2774 upstream. Add missing statesize to hash templates. This is mandatory otherwise no algorithms can be registered as the core requires statesize to be set. CC: stable@kernel.org # 4.3+ Reported-by: Rolf Eike Beer Tested-by: Rolf Eike Beer Fixes: 0a625fd2abaa ("crypto: n2 - Add Niagara2 crypto driver") Signed-off-by: Corentin Labbe Signed-off-by: Herbert Xu Signed-off-by: Greg Kroah-Hartman commit 48307506964ef1f8bff0d2dd949ade3d89778dfc Author: Sergey Matyukevich Date: Mon Aug 29 23:52:19 2022 +0300 riscv: mm: notify remote harts about mmu cache updates commit 4bd1d80efb5af640f99157f39b50fb11326ce641 upstream. Current implementation of update_mmu_cache function performs local TLB flush. It does not take into account ASID information. Besides, it does not take into account other harts currently running the same mm context or possible migration of the running context to other harts. Meanwhile TLB flush is not performed for every context switch if ASID support is enabled. Patch [1] proposed to add ASID support to update_mmu_cache to avoid flushing local TLB entirely. This patch takes into account other harts currently running the same mm context as well as possible migration of this context to other harts. For this purpose the approach from flush_icache_mm is reused. Remote harts currently running the same mm context are informed via SBI calls that they need to flush their local TLBs. All the other harts are marked as needing a deferred TLB flush when this mm context runs on them. [1] https://lore.kernel.org/linux-riscv/20220821013926.8968-1-tjytimi@163.com/ Signed-off-by: Sergey Matyukevich Fixes: 65d4b9c53017 ("RISC-V: Implement ASID allocator") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/linux-riscv/20220829205219.283543-1-geomatsi@gmail.com/#t Signed-off-by: Palmer Dabbelt Signed-off-by: Greg Kroah-Hartman commit 16b6d9525da60c0fc5f0a333cc500693e61ba9b3 Author: Guo Ren Date: Wed Nov 9 01:49:36 2022 -0500 riscv: stacktrace: Fixup ftrace_graph_ret_addr retp argument commit 5c3022e4a616d800cf5f4c3a981d7992179e44a1 upstream. The 'retp' is a pointer to the return address on the stack, so we must pass the current return address pointer as the 'retp' argument to ftrace_push_return_trace(). Not parent function's return address on the stack. Fixes: b785ec129bd9 ("riscv/ftrace: Add HAVE_FUNCTION_GRAPH_RET_ADDR_PTR support") Signed-off-by: Guo Ren Signed-off-by: Guo Ren Link: https://lore.kernel.org/r/20221109064937.3643993-2-guoren@kernel.org Cc: stable@vger.kernel.org Signed-off-by: Palmer Dabbelt Signed-off-by: Greg Kroah-Hartman commit 657b440a270c094fda9fe3a35f4e183bf61c59a2 Author: Sascha Hauer Date: Tue Nov 8 17:05:59 2022 -0600 PCI/sysfs: Fix double free in error path commit aa382ffa705bea9931ec92b6f3c70e1fdb372195 upstream. When pci_create_attr() fails, pci_remove_resource_files() is called which will iterate over the res_attr[_wc] arrays and frees every non NULL entry. To avoid a double free here set the array entry only after it's clear we successfully initialized it. Fixes: b562ec8f74e4 ("PCI: Don't leak memory if sysfs_create_bin_file() fails") Link: https://lore.kernel.org/r/20221007070735.GX986@pengutronix.de/ Signed-off-by: Sascha Hauer Signed-off-by: Bjorn Helgaas Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman commit 67fd41bbb0f51aa648a47f728b99e6f1fa2ccc34 Author: Michael S. Tsirkin Date: Wed Oct 26 02:11:21 2022 -0400 PCI: Fix pci_device_is_present() for VFs by checking PF commit 98b04dd0b4577894520493d96bc4623387767445 upstream. pci_device_is_present() previously didn't work for VFs because it reads the Vendor and Device ID, which are 0xffff for VFs, which looks like they aren't present. Check the PF instead. Wei Gong reported that if virtio I/O is in progress when the driver is unbound or "0" is written to /sys/.../sriov_numvfs, the virtio I/O operation hangs, which may result in output like this: task:bash state:D stack: 0 pid: 1773 ppid: 1241 flags:0x00004002 Call Trace: schedule+0x4f/0xc0 blk_mq_freeze_queue_wait+0x69/0xa0 blk_mq_freeze_queue+0x1b/0x20 blk_cleanup_queue+0x3d/0xd0 virtblk_remove+0x3c/0xb0 [virtio_blk] virtio_dev_remove+0x4b/0x80 ... device_unregister+0x1b/0x60 unregister_virtio_device+0x18/0x30 virtio_pci_remove+0x41/0x80 pci_device_remove+0x3e/0xb0 This happened because pci_device_is_present(VF) returned "false" in virtio_pci_remove(), so it called virtio_break_device(). The broken vq meant that vring_interrupt() skipped the vq.callback() that would have completed the virtio I/O operation via virtblk_done(). [bhelgaas: commit log, simplify to always use pci_physfn(), add stable tag] Link: https://lore.kernel.org/r/20221026060912.173250-1-mst@redhat.com Reported-by: Wei Gong Tested-by: Wei Gong Signed-off-by: Michael S. Tsirkin Signed-off-by: Bjorn Helgaas Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman commit bfce073089cb81482521c65061835aaa6d1a6cc0 Author: Dan Carpenter Date: Tue Nov 15 16:17:43 2022 +0300 ipmi: fix use after free in _ipmi_destroy_user() commit a92ce570c81dc0feaeb12a429b4bc65686d17967 upstream. The intf_free() function frees the "intf" pointer so we cannot dereference it again on the next line. Fixes: cbb79863fc31 ("ipmi: Don't allow device module unload when in use") Signed-off-by: Dan Carpenter Message-Id: Cc: # 5.5+ Signed-off-by: Corey Minyard Signed-off-by: Greg Kroah-Hartman commit 3b4984035c404dc20eec0e4b3a02f47aa68866ba Author: Huaxin Lu Date: Thu Nov 3 00:09:49 2022 +0800 ima: Fix a potential NULL pointer access in ima_restore_measurement_list commit 11220db412edae8dba58853238f53258268bdb88 upstream. In restore_template_fmt, when kstrdup fails, a non-NULL value will still be returned, which causes a NULL pointer access in template_desc_init_fields. Fixes: c7d09367702e ("ima: support restoring multiple template formats") Cc: stable@kernel.org Co-developed-by: Jiaming Li Signed-off-by: Jiaming Li Signed-off-by: Huaxin Lu Reviewed-by: Stefan Berger Signed-off-by: Mimi Zohar Signed-off-by: Greg Kroah-Hartman commit a843699f1665835666c711eef2f002c81fa22daa Author: Alexander Sverdlin Date: Fri Nov 19 09:14:12 2021 +0100 mtd: spi-nor: Check for zero erase size in spi_nor_find_best_erase_type() commit 2ebc336be08160debfe27f87660cf550d710f3e9 upstream. Erase can be zeroed in spi_nor_parse_4bait() or spi_nor_init_non_uniform_erase_map(). In practice it happened with mt25qu256a, which supports 4K, 32K, 64K erases with 3b address commands, but only 4K and 64K erase with 4b address commands. Fixes: dc92843159a7 ("mtd: spi-nor: fix erase_type array to indicate current map conf") Signed-off-by: Alexander Sverdlin Signed-off-by: Tudor Ambarus Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20211119081412.29732-1-alexander.sverdlin@nokia.com Signed-off-by: Greg Kroah-Hartman commit 24f4649cd8fcbb9ab988933890257f450260fd3d Author: Zhang Yuchen Date: Fri Oct 7 17:26:16 2022 +0800 ipmi: fix long wait in unload when IPMI disconnect commit f6f1234d98cce69578bfac79df147a1f6660596c upstream. When fixing the problem mentioned in PATCH1, we also found the following problem: If the IPMI is disconnected and in the sending process, the uninstallation driver will be stuck for a long time. The main problem is that uninstalling the driver waits for curr_msg to be sent or HOSED. After stopping tasklet, the only place to trigger the timeout mechanism is the circular poll in shutdown_smi. The poll function delays 10us and calls smi_event_handler(smi_info,10). Smi_event_handler deducts 10us from kcs->ibf_timeout. But the poll func is followed by schedule_timeout_uninterruptible(1). The time consumed here is not counted in kcs->ibf_timeout. So when 10us is deducted from kcs->ibf_timeout, at least 1 jiffies has actually passed. The waiting time has increased by more than a hundredfold. Now instead of calling poll(). call smi_event_handler() directly and calculate the elapsed time. For verification, you can directly use ebpf to check the kcs-> ibf_timeout for each call to kcs_event() when IPMI is disconnected. Decrement at normal rate before unloading. The decrement rate becomes very slow after unloading. $ bpftrace -e 'kprobe:kcs_event {printf("kcs->ibftimeout : %d\n", *(arg0+584));}' Signed-off-by: Zhang Yuchen Message-Id: <20221007092617.87597-3-zhangyuchen.lcr@bytedance.com> Signed-off-by: Corey Minyard Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman commit fa6bbb4894b9b947063c6ff90018a954c5f9f4b3 Author: Maximilian Luz Date: Thu Sep 8 00:44:09 2022 +0200 ipu3-imgu: Fix NULL pointer dereference in imgu_subdev_set_selection() commit dc608edf7d45ba0c2ad14c06eccd66474fec7847 upstream. Calling v4l2_subdev_get_try_crop() and v4l2_subdev_get_try_compose() with a subdev state of NULL leads to a NULL pointer dereference. This can currently happen in imgu_subdev_set_selection() when the state passed in is NULL, as this method first gets pointers to both the "try" and "active" states and only then decides which to use. The same issue has been addressed for imgu_subdev_get_selection() with commit 30d03a0de650 ("ipu3-imgu: Fix NULL pointer dereference in active selection access"). However the issue still persists in imgu_subdev_set_selection(). Therefore, apply a similar fix as done in the aforementioned commit to imgu_subdev_set_selection(). To keep things a bit cleaner, introduce helper functions for "crop" and "compose" access and use them in both imgu_subdev_set_selection() and imgu_subdev_get_selection(). Fixes: 0d346d2a6f54 ("media: v4l2-subdev: add subdev-wide state struct") Cc: stable@vger.kernel.org # for v5.14 and later Signed-off-by: Maximilian Luz Signed-off-by: Sakari Ailus Signed-off-by: Greg Kroah-Hartman commit cdb208b090f3249b27ca12b3beb27dc284da9dc4 Author: Aidan MacDonald Date: Sun Oct 23 15:33:20 2022 +0100 ASoC: jz4740-i2s: Handle independent FIFO flush bits commit 8b3a9ad86239f80ed569e23c3954a311f66481d6 upstream. On the JZ4740, there is a single bit that flushes (empties) both the transmit and receive FIFO. Later SoCs have independent flush bits for each FIFO. Independent FIFOs can be flushed before the snd_soc_dai_active() check because it won't disturb other active streams. This ensures that the FIFO we're about to use is always flushed before starting up. With shared FIFOs we can't do that because if another substream is active, flushing its FIFO would cause underrun errors. This also fixes a bug: since we were only setting the JZ4740's flush bit, which corresponds to the TX FIFO flush bit on other SoCs, other SoCs were not having their RX FIFO flushed at all. Fixes: 967beb2e8777 ("ASoC: jz4740: Add jz4780 support") Reviewed-by: Paul Cercueil Cc: stable@vger.kernel.org Signed-off-by: Aidan MacDonald Link: https://lore.kernel.org/r/20221023143328.160866-2-aidanmacdonald.0x0@gmail.com Signed-off-by: Mark Brown Signed-off-by: Greg Kroah-Hartman commit 2d0d083d8ae6d3ed6d44358e3f2f60d93e40ed28 Author: Michael Walle Date: Thu Oct 27 19:12:21 2022 +0200 wifi: wilc1000: sdio: fix module autoloading commit 57d545b5a3d6ce3a8fb6b093f02bfcbb908973f3 upstream. There are no SDIO module aliases included in the driver, therefore, module autoloading isn't working. Add the proper MODULE_DEVICE_TABLE(). Cc: stable@vger.kernel.org Signed-off-by: Michael Walle Signed-off-by: Kalle Valo Link: https://lore.kernel.org/r/20221027171221.491937-1-michael@walle.cc Signed-off-by: Greg Kroah-Hartman commit 2e4a088804c1268d2614c5c9b1eb5019129b8636 Author: Aditya Garg Date: Thu Oct 27 10:01:43 2022 +0000 efi: Add iMac Pro 2017 to uefi skip cert quirk commit 0be56a116220f9e5731a6609e66a11accfe8d8e2 upstream. The iMac Pro 2017 is also a T2 Mac. Thus add it to the list of uefi skip cert. Cc: stable@vger.kernel.org Fixes: 155ca952c7ca ("efi: Do not import certificates from UEFI Secure Boot for T2 Macs") Link: https://lore.kernel.org/linux-integrity/9D46D92F-1381-4F10-989C-1A12CD2FFDD8@live.com/ Signed-off-by: Aditya Garg Signed-off-by: Mimi Zohar Signed-off-by: Greg Kroah-Hartman commit c49fb9b760d34cde062afcee0654e9e8f9e3286b Author: Florian-Ewald Mueller Date: Tue Oct 25 09:37:05 2022 +0200 md/bitmap: Fix bitmap chunk size overflow issues commit 4555211190798b6b6fa2c37667d175bf67945c78 upstream. - limit bitmap chunk size internal u64 variable to values not overflowing the u32 bitmap superblock structure variable stored on persistent media - assign bitmap chunk size internal u64 variable from unsigned values to avoid possible sign extension artifacts when assigning from a s32 value The bug has been there since at least kernel 4.0. Steps to reproduce it: 1: mdadm -C /dev/mdx -l 1 --bitmap=internal --bitmap-chunk=256M -e 1.2 -n2 /dev/rnbd1 /dev/rnbd2 2 resize member device rnbd1 and rnbd2 to 8 TB 3 mdadm --grow /dev/mdx --size=max The bitmap_chunksize will overflow without patch. Cc: stable@vger.kernel.org Signed-off-by: Florian-Ewald Mueller Signed-off-by: Jack Wang Signed-off-by: Song Liu Signed-off-by: Greg Kroah-Hartman commit 94fe975d54ab8877d215a0b14543f4383f70c461 Author: Damien Le Moal Date: Thu Nov 24 11:12:08 2022 +0900 block: mq-deadline: Do not break sequential write streams to zoned HDDs commit 015d02f48537cf2d1a65eeac50717566f9db6eec upstream. mq-deadline ensures an in order dispatching of write requests to zoned block devices using a per zone lock (a bit). This implies that for any purely sequential write workload, the drive is exercised most of the time at a maximum queue depth of one. However, when such sequential write workload crosses a zone boundary (when sequentially writing multiple contiguous zones), zone write locking may prevent the last write to one zone to be issued (as the previous write is still being executed) but allow the first write to the following zone to be issued (as that zone is not yet being writen and not locked). This result in an out of order delivery of the sequential write commands to the device every time a zone boundary is crossed. While such behavior does not break the sequential write constraint of zoned block devices (and does not generate any write error), some zoned hard-disks react badly to seeing these out of order writes, resulting in lower write throughput. This problem can be addressed by always dispatching the first request of a stream of sequential write requests, regardless of the zones targeted by these sequential writes. To do so, the function deadline_skip_seq_writes() is introduced and used in deadline_next_request() to select the next write command to issue if the target device is an HDD (blk_queue_nonrot() being false). deadline_fifo_request() is modified using the new deadline_earlier_request() and deadline_is_seq_write() helpers to ignore requests in the fifo list that have a preceding request in lba order that is sequential. With this fix, a sequential write workload executed with the following fio command: fio --name=seq-write --filename=/dev/sda --zonemode=zbd --direct=1 \ --size=68719476736 --ioengine=libaio --iodepth=32 --rw=write \ --bs=65536 results in an increase from 225 MB/s to 250 MB/s of the write throughput of an SMR HDD (11% increase). Cc: Signed-off-by: Damien Le Moal Reviewed-by: Johannes Thumshirn Link: https://lore.kernel.org/r/20221124021208.242541-3-damien.lemoal@opensource.wdc.com Signed-off-by: Jens Axboe Signed-off-by: Greg Kroah-Hartman commit 8e91679f7bd2fb05089040b1a2727518aea4c1a9 Author: Ian Abbott Date: Thu Oct 27 17:32:49 2022 +0100 rtc: ds1347: fix value written to century register commit 4dfe05bdc1ade79b943d4979a2e2a8b5ef68fbb5 upstream. In `ds1347_set_time()`, the wrong value is being written to the `DS1347_CENTURY_REG` register. It needs to be converted to BCD. Fix it. Fixes: 147dae76dbb9 ("rtc: ds1347: handle century register") Cc: # v5.5+ Signed-off-by: Ian Abbott Link: https://lore.kernel.org/r/20221027163249.447416-1-abbotti@mev.co.uk Signed-off-by: Alexandre Belloni Signed-off-by: Greg Kroah-Hartman commit 5eb8296d73da5319e908591560498ad377f4573f Author: Steve French Date: Sun Dec 11 13:54:21 2022 -0600 cifs: fix missing display of three mount options commit 2bfd81043e944af0e52835ef6d9b41795af22341 upstream. Three mount options: "tcpnodelay" and "noautotune" and "noblocksend" were not displayed when passed in on cifs/smb3 mounts (e.g. displayed in /proc/mounts e.g.). No change to defaults so these are not displayed if not specified on mount. Cc: stable@vger.kernel.org Reviewed-by: Paulo Alcantara (SUSE) Signed-off-by: Steve French Signed-off-by: Greg Kroah-Hartman commit cfa9f66f917290412af77204a2de63caab4143b4 Author: Paulo Alcantara Date: Fri Dec 16 22:03:41 2022 -0300 cifs: fix confusing debug message commit a85ceafd41927e41a4103d228a993df7edd8823b upstream. Since rc was initialised to -ENOMEM in cifs_get_smb_ses(), when an existing smb session was found, free_xid() would be called and then print CIFS: fs/cifs/connect.c: Existing tcp session with server found CIFS: fs/cifs/connect.c: VFS: in cifs_get_smb_ses as Xid: 44 with uid: 0 CIFS: fs/cifs/connect.c: Existing smb sess found (status=1) CIFS: fs/cifs/connect.c: VFS: leaving cifs_get_smb_ses (xid = 44) rc = -12 Fix this by initialising rc to 0 and then let free_xid() print this instead CIFS: fs/cifs/connect.c: Existing tcp session with server found CIFS: fs/cifs/connect.c: VFS: in cifs_get_smb_ses as Xid: 14 with uid: 0 CIFS: fs/cifs/connect.c: Existing smb sess found (status=1) CIFS: fs/cifs/connect.c: VFS: leaving cifs_get_smb_ses (xid = 14) rc = 0 Signed-off-by: Paulo Alcantara (SUSE) Cc: stable@vger.kernel.org Signed-off-by: Steve French Signed-off-by: Greg Kroah-Hartman commit 8b45a3b19a2e909e830d09a90a7e1ec8601927d9 Author: Takashi Iwai Date: Mon Oct 31 11:02:45 2022 +0100 media: dvb-core: Fix UAF due to refcount races at releasing commit fd3d91ab1c6ab0628fe642dd570b56302c30a792 upstream. The dvb-core tries to sync the releases of opened files at dvb_dmxdev_release() with two refcounts: dvbdev->users and dvr_dvbdev->users. A problem is present in those two syncs: when yet another dvb_demux_open() is called during those sync waits, dvb_demux_open() continues to process even if the device is being closed. This includes the increment of the former refcount, resulting in the leftover refcount after the sync of the latter refcount at dvb_dmxdev_release(). It ends up with use-after-free, since the function believes that all usages were gone and releases the resources. This patch addresses the problem by adding the check of dmxdev->exit flag at dvb_demux_open(), just like dvb_dvr_open() already does. With the exit flag check, the second call of dvb_demux_open() fails, hence the further corruption can be avoided. Also for avoiding the races of the dmxdev->exit flag reference, this patch serializes the dmxdev->exit set up and the sync waits with the dmxdev->mutex lock at dvb_dmxdev_release(). Without the mutex lock, dvb_demux_open() (or dvb_dvr_open()) may run concurrently with dvb_dmxdev_release(), which allows to skip the exit flag check and continue the open process that is being closed. CVE-2022-41218 is assigned to those bugs above. Reported-by: Hyunwoo Kim Cc: Link: https://lore.kernel.org/20220908132754.30532-1-tiwai@suse.de Signed-off-by: Takashi Iwai Signed-off-by: Hans Verkuil Signed-off-by: Greg Kroah-Hartman commit acf984a3718c2458eb9e08b6714490a04f213c58 Author: Keita Suzuki Date: Tue Apr 26 06:29:19 2022 +0100 media: dvb-core: Fix double free in dvb_register_device() commit 6b0d0477fce747d4137aa65856318b55fba72198 upstream. In function dvb_register_device() -> dvb_register_media_device() -> dvb_create_media_entity(), dvb->entity is allocated and initialized. If the initialization fails, it frees the dvb->entity, and return an error code. The caller takes the error code and handles the error by calling dvb_media_device_free(), which unregisters the entity and frees the field again if it is not NULL. As dvb->entity may not NULLed in dvb_create_media_entity() when the allocation of dvbdev->pad fails, a double free may occur. This may also cause an Use After free in media_device_unregister_entity(). Fix this by storing NULL to dvb->entity when it is freed. Link: https://lore.kernel.org/linux-media/20220426052921.2088416-1-keitasuzuki.park@sslab.ics.keio.ac.jp Fixes: fcd5ce4b3936 ("media: dvb-core: fix a memory leak bug") Cc: stable@vger.kernel.org Cc: Wenwen Wang Signed-off-by: Keita Suzuki Signed-off-by: Mauro Carvalho Chehab Signed-off-by: Greg Kroah-Hartman commit 5fac317bee18ec02eae59d5d135ecd952bb5a0c4 Author: Nick Desaulniers Date: Tue Oct 11 20:00:12 2022 +0100 ARM: 9256/1: NWFPE: avoid compiler-generated __aeabi_uldivmod commit 3220022038b9a3845eea762af85f1c5694b9f861 upstream. clang-15's ability to elide loops completely became more aggressive when it can deduce how a variable is being updated in a loop. Counting down one variable by an increment of another can be replaced by a modulo operation. For 64b variables on 32b ARM EABI targets, this can result in the compiler generating calls to __aeabi_uldivmod, which it does for a do while loop in float64_rem(). For the kernel, we'd generally prefer that developers not open code 64b division via binary / operators and instead use the more explicit helpers from div64.h. On arm-linux-gnuabi targets, failure to do so can result in linkage failures due to undefined references to __aeabi_uldivmod(). While developers can avoid open coding divisions on 64b variables, the compiler doesn't know that the Linux kernel has a partial implementation of a compiler runtime (--rtlib) to enforce this convention. It's also undecidable for the compiler whether the code in question would be faster to execute the loop vs elide it and do the 64b division. While I actively avoid using the internal -mllvm command line flags, I think we get better code than using barrier() here, which will force reloads+spills in the loop for all toolchains. Link: https://github.com/ClangBuiltLinux/linux/issues/1666 Reported-by: Nathan Chancellor Reviewed-by: Arnd Bergmann Signed-off-by: Nick Desaulniers Tested-by: Nathan Chancellor Cc: stable@vger.kernel.org Signed-off-by: Russell King (Oracle) Signed-off-by: Greg Kroah-Hartman commit ce50c612458091d926ccb05d7db11d9f93532db2 Author: Luca Ceresoli Date: Wed Nov 2 12:01:02 2022 +0100 staging: media: tegra-video: fix device_node use after free commit c4d344163c3a7f90712525f931a6c016bbb35e18 upstream. At probe time this code path is followed: * tegra_csi_init * tegra_csi_channels_alloc * for_each_child_of_node(node, channel) -- iterates over channels * automatically gets 'channel' * tegra_csi_channel_alloc() * saves into chan->of_node a pointer to the channel OF node * automatically gets and puts 'channel' * now the node saved in chan->of_node has refcount 0, can disappear * tegra_csi_channels_init * iterates over channels * tegra_csi_channel_init -- uses chan->of_node After that, chan->of_node keeps storing the node until the device is removed. of_node_get() the node and of_node_put() it during teardown to avoid any risk. Fixes: 1ebaeb09830f ("media: tegra-video: Add support for external sensor capture") Cc: stable@vger.kernel.org Cc: Sowjanya Komatineni Signed-off-by: Luca Ceresoli Signed-off-by: Hans Verkuil Signed-off-by: Greg Kroah-Hartman commit 6b16758215f668f7792abb66beb5ba0f7e5bd955 Author: Luca Ceresoli Date: Wed Nov 2 12:01:01 2022 +0100 staging: media: tegra-video: fix chan->mipi value on error commit 10b5ce6743c839fa75336042c64e2479caec9430 upstream. chan->mipi takes the return value of tegra_mipi_request() which can be a valid pointer or an error. However chan->mipi is checked in several places, including error-cleanup code in tegra_csi_channels_cleanup(), as 'if (chan->mipi)', which suggests the initial intent was that chan->mipi should be either NULL or a valid pointer, never an error. As a consequence, cleanup code in case of tegra_mipi_request() errors would dereference an invalid pointer. Fix by ensuring chan->mipi always contains either NULL or a void pointer. Also add that to the documentation. Fixes: 523c857e34ce ("media: tegra-video: Add CSI MIPI pads calibration") Cc: stable@vger.kernel.org Reported-by: Dan Carpenter Signed-off-by: Luca Ceresoli Signed-off-by: Hans Verkuil Signed-off-by: Greg Kroah-Hartman commit 4f5de49d8c52c223a35c062a93c66ddfd3f3d5a4 Author: Yang Jihong Date: Tue Nov 29 19:30:09 2022 +0800 tracing: Fix infinite loop in tracing_read_pipe on overflowed print_trace_line commit c1ac03af6ed45d05786c219d102f37eb44880f28 upstream. print_trace_line may overflow seq_file buffer. If the event is not consumed, the while loop keeps peeking this event, causing a infinite loop. Link: https://lkml.kernel.org/r/20221129113009.182425-1-yangjihong1@huawei.com Cc: Masami Hiramatsu Cc: stable@vger.kernel.org Fixes: 088b1e427dbba ("ftrace: pipe fixes") Signed-off-by: Yang Jihong Signed-off-by: Steven Rostedt (Google) Signed-off-by: Greg Kroah-Hartman commit 17becbc4dd67979623eb2e5a3198057af4c61039 Author: Steven Rostedt (Google) Date: Tue Nov 22 12:23:45 2022 -0500 tracing/probes: Handle system names with hyphens commit 575b76cb885532aae13a9d979fd476bb2b156cb9 upstream. When creating probe names, a check is done to make sure it matches basic C standard variable naming standards. Basically, starts with alphabetic or underline, and then the rest of the characters have alpha-numeric or underline in them. But system names do not have any true naming conventions, as they are created by the TRACE_SYSTEM macro and nothing tests to see what they are. The "xhci-hcd" trace events has a '-' in the system name. When trying to attach a eprobe to one of these trace points, it fails because the system name does not follow the variable naming convention because of the hyphen, and the eprobe checks fail on this. Allow hyphens in the system name so that eprobes can attach to the "xhci-hcd" trace events. Link: https://lore.kernel.org/all/Y3eJ8GiGnEvVd8%2FN@macondo/ Link: https://lore.kernel.org/linux-trace-kernel/20221122122345.160f5077@gandalf.local.home Cc: Masami Hiramatsu Cc: stable@vger.kernel.org Fixes: 5b7a96220900e ("tracing/probe: Check event/group naming rule at parsing") Reported-by: Rafael Mendonca Signed-off-by: Steven Rostedt (Google) Signed-off-by: Greg Kroah-Hartman commit 2442e655a6932542c78b2e6b8e9a544c59be6ba7 Author: Zheng Yejian Date: Wed Dec 7 11:46:35 2022 +0800 tracing/hist: Fix wrong return value in parse_action_params() commit 2cc6a528882d0e0ccbc1bca5f95b8c963cedac54 upstream. When number of synth fields is more than SYNTH_FIELDS_MAX, parse_action_params() should return -EINVAL. Link: https://lore.kernel.org/linux-trace-kernel/20221207034635.2253990-1-zhengyejian1@huawei.com Cc: Cc: Cc: stable@vger.kernel.org Fixes: c282a386a397 ("tracing: Add 'onmatch' hist trigger action support") Signed-off-by: Zheng Yejian Signed-off-by: Steven Rostedt (Google) Signed-off-by: Greg Kroah-Hartman commit 2a81ff5ce893cf2c5875e55e9558b2c408846e3a Author: Masami Hiramatsu (Google) Date: Tue Dec 6 23:18:01 2022 +0900 tracing: Fix complicated dependency of CONFIG_TRACER_MAX_TRACE commit e25e43a4e5d8cb2323553d8b6a7ba08d2ebab21f upstream. Both CONFIG_OSNOISE_TRACER and CONFIG_HWLAT_TRACER partially enables the CONFIG_TRACER_MAX_TRACE code, but that is complicated and has introduced a bug; It declares tracing_max_lat_fops data structure outside of #ifdefs, but since it is defined only when CONFIG_TRACER_MAX_TRACE=y or CONFIG_HWLAT_TRACER=y, if only CONFIG_OSNOISE_TRACER=y, that declaration comes to a definition(!). To fix this issue, and do not repeat the similar problem, makes CONFIG_OSNOISE_TRACER and CONFIG_HWLAT_TRACER enables the CONFIG_TRACER_MAX_TRACE always. It has there benefits; - Fix the tracing_max_lat_fops bug - Simplify the #ifdefs - CONFIG_TRACER_MAX_TRACE code is fully enabled, or not. Link: https://lore.kernel.org/linux-trace-kernel/167033628155.4111793.12185405690820208159.stgit@devnote3 Fixes: 424b650f35c7 ("tracing: Fix missing osnoise tracer on max_latency") Cc: Daniel Bristot de Oliveira Cc: stable@vger.kernel.org Reported-by: David Howells Reported-by: kernel test robot Signed-off-by: Masami Hiramatsu (Google) Link: https://lore.kernel.org/all/166992525941.1716618.13740663757583361463.stgit@warthog.procyon.org.uk/ (original thread and v1) Link: https://lore.kernel.org/all/202212052253.VuhZ2ulJ-lkp@intel.com/T/#u (v1 error report) Signed-off-by: Steven Rostedt (Google) Signed-off-by: Greg Kroah-Hartman commit fe8c35c6ffa2add6372d0ee4c3ec1332cba618b5 Author: Steven Rostedt (Google) Date: Thu Nov 17 21:42:49 2022 -0500 tracing: Fix race where eprobes can be called before the event commit d5f30a7da8ea8e6450250275cec5670cee3c4264 upstream. The flag that tells the event to call its triggers after reading the event is set for eprobes after the eprobe is enabled. This leads to a race where the eprobe may be triggered at the beginning of the event where the record information is NULL. The eprobe then dereferences the NULL record causing a NULL kernel pointer bug. Test for a NULL record to keep this from happening. Link: https://lore.kernel.org/linux-trace-kernel/20221116192552.1066630-1-rafaelmendsr@gmail.com/ Link: https://lore.kernel.org/all/20221117214249.2addbe10@gandalf.local.home/ Cc: stable@vger.kernel.org Fixes: 7491e2c442781 ("tracing: Add a probe that attaches to trace events") Reported-by: Rafael Mendonca Signed-off-by: Steven Rostedt (Google) Acked-by: Masami Hiramatsu (Google) Signed-off-by: Masami Hiramatsu (Google) Signed-off-by: Greg Kroah-Hartman commit eb20f6ed373304a0a613f024bdc9b03e6ba94fcc Author: Masami Hiramatsu (Google) Date: Mon Dec 19 23:35:19 2022 +0900 x86/kprobes: Fix optprobe optimization check with CONFIG_RETHUNK commit 63dc6325ff41ee9e570bde705ac34a39c5dbeb44 upstream. Since the CONFIG_RETHUNK and CONFIG_SLS will use INT3 for stopping speculative execution after function return, kprobe jump optimization always fails on the functions with such INT3 inside the function body. (It already checks the INT3 padding between functions, but not inside the function) To avoid this issue, as same as kprobes, check whether the INT3 comes from kgdb or not, and if so, stop decoding and make it fail. The other INT3 will come from CONFIG_RETHUNK/CONFIG_SLS and those can be treated as a one-byte instruction. Fixes: e463a09af2f0 ("x86: Add straight-line-speculation mitigation") Suggested-by: Peter Zijlstra Signed-off-by: Masami Hiramatsu (Google) Signed-off-by: Peter Zijlstra (Intel) Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/167146051929.1374301.7419382929328081706.stgit@devnote3 Signed-off-by: Greg Kroah-Hartman commit 3e0fbc06db12807103f66693854dd4900cdf0048 Author: Masami Hiramatsu (Google) Date: Mon Dec 19 23:35:10 2022 +0900 x86/kprobes: Fix kprobes instruction boudary check with CONFIG_RETHUNK commit 1993bf97992df2d560287f3c4120eda57426843d upstream. Since the CONFIG_RETHUNK and CONFIG_SLS will use INT3 for stopping speculative execution after RET instruction, kprobes always failes to check the probed instruction boundary by decoding the function body if the probed address is after such sequence. (Note that some conditional code blocks will be placed after function return, if compiler decides it is not on the hot path.) This is because kprobes expects kgdb puts the INT3 as a software breakpoint and it will replace the original instruction. But these INT3 are not such purpose, it doesn't need to recover the original instruction. To avoid this issue, kprobes checks whether the INT3 is owned by kgdb or not, and if so, stop decoding and make it fail. The other INT3 will come from CONFIG_RETHUNK/CONFIG_SLS and those can be treated as a one-byte instruction. Fixes: e463a09af2f0 ("x86: Add straight-line-speculation mitigation") Suggested-by: Peter Zijlstra Signed-off-by: Masami Hiramatsu (Google) Signed-off-by: Peter Zijlstra (Intel) Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/167146051026.1374301.392728975473572291.stgit@devnote3 Signed-off-by: Greg Kroah-Hartman commit 6268a0704b97e65d2753de9bbc7c8a43ef383494 Author: Steven Rostedt (Google) Date: Fri Dec 9 10:52:47 2022 -0500 ftrace/x86: Add back ftrace_expected for ftrace bug reports commit fd3dc56253acbe9c641a66d312d8393cd55eb04c upstream. After someone reported a bug report with a failed modification due to the expected value not matching what was found, it came to my attention that the ftrace_expected is no longer set when that happens. This makes for debugging the issue a bit more difficult. Set ftrace_expected to the expected code before calling ftrace_bug, so that it shows what was expected and why it failed. Link: https://lore.kernel.org/all/CA+wXwBQ-VhK+hpBtYtyZP-NiX4g8fqRRWithFOHQW-0coQ3vLg@mail.gmail.com/ Link: https://lore.kernel.org/linux-trace-kernel/20221209105247.01d4e51d@gandalf.local.home Cc: Masami Hiramatsu Cc: Andrew Morton Cc: Peter Zijlstra Cc: Thomas Gleixner Cc: "x86@kernel.org" Cc: Borislav Petkov Cc: Ingo Molnar Cc: stable@vger.kernel.org Fixes: 768ae4406a5c ("x86/ftrace: Use text_poke()") Signed-off-by: Steven Rostedt (Google) Signed-off-by: Greg Kroah-Hartman commit c95cf30dd44723016abc490b3f5af48f798819b3 Author: Ashok Raj Date: Tue Nov 29 13:08:27 2022 -0800 x86/microcode/intel: Do not retry microcode reloading on the APs commit be1b670f61443aa5d0d01782e9b8ea0ee825d018 upstream. The retries in load_ucode_intel_ap() were in place to support systems with mixed steppings. Mixed steppings are no longer supported and there is only one microcode image at a time. Any retries will simply reattempt to apply the same image over and over without making progress. [ bp: Zap the circumstantial reasoning from the commit message. ] Fixes: 06b8534cb728 ("x86/microcode: Rework microcode loading") Signed-off-by: Ashok Raj Signed-off-by: Borislav Petkov (AMD) Reviewed-by: Thomas Gleixner Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20221129210832.107850-3-ashok.raj@intel.com Signed-off-by: Greg Kroah-Hartman commit f8fe2f41784b690d4daf902f86c58de3041213f3 Author: Sean Christopherson Date: Tue Dec 13 06:23:03 2022 +0000 KVM: nVMX: Properly expose ENABLE_USR_WAIT_PAUSE control to L1 commit 31de69f4eea77b28a9724b3fa55aae104fc91fc7 upstream. Set ENABLE_USR_WAIT_PAUSE in KVM's supported VMX MSR configuration if the feature is supported in hardware and enabled in KVM's base, non-nested configuration, i.e. expose ENABLE_USR_WAIT_PAUSE to L1 if it's supported. This fixes a bug where saving/restoring, i.e. migrating, a vCPU will fail if WAITPKG (the associated CPUID feature) is enabled for the vCPU, and obviously allows L1 to enable the feature for L2. KVM already effectively exposes ENABLE_USR_WAIT_PAUSE to L1 by stuffing the allowed-1 control ina vCPU's virtual MSR_IA32_VMX_PROCBASED_CTLS2 when updating secondary controls in response to KVM_SET_CPUID(2), but (a) that depends on flawed code (KVM shouldn't touch VMX MSRs in response to CPUID updates) and (b) runs afoul of vmx_restore_control_msr()'s restriction that the guest value must be a strict subset of the supported host value. Although no past commit explicitly enabled nested support for WAITPKG, doing so is safe and functionally correct from an architectural perspective as no additional KVM support is needed to virtualize TPAUSE, UMONITOR, and UMWAIT for L2 relative to L1, and KVM already forwards VM-Exits to L1 as necessary (commit bf653b78f960, "KVM: vmx: Introduce handle_unexpected_vmexit and handle WAITPKG vmexit"). Note, KVM always keeps the hosts MSR_IA32_UMWAIT_CONTROL resident in hardware, i.e. always runs both L1 and L2 with the host's power management settings for TPAUSE and UMWAIT. See commit bf09fb6cba4f ("KVM: VMX: Stop context switching MSR_IA32_UMWAIT_CONTROL") for more details. Fixes: e69e72faa3a0 ("KVM: x86: Add support for user wait instructions") Cc: stable@vger.kernel.org Reported-by: Aaron Lewis Reported-by: Yu Zhang Signed-off-by: Sean Christopherson Reviewed-by: Jim Mattson Message-Id: <20221213062306.667649-2-seanjc@google.com> Signed-off-by: Paolo Bonzini Signed-off-by: Greg Kroah-Hartman commit ca3483d71bd56542641345f03e157f621b8895ef Author: Sean Christopherson Date: Thu Oct 6 00:19:56 2022 +0000 KVM: nVMX: Inject #GP, not #UD, if "generic" VMXON CR0/CR4 check fails commit 9cc409325ddd776f6fd6293d5ce93ce1248af6e4 upstream. Inject #GP for if VMXON is attempting with a CR0/CR4 that fails the generic "is CRx valid" check, but passes the CR4.VMXE check, and do the generic checks _after_ handling the post-VMXON VM-Fail. The CR4.VMXE check, and all other #UD cases, are special pre-conditions that are enforced prior to pivoting on the current VMX mode, i.e. occur before interception if VMXON is attempted in VMX non-root mode. All other CR0/CR4 checks generate #GP and effectively have lower priority than the post-VMXON check. Per the SDM: IF (register operand) or (CR0.PE = 0) or (CR4.VMXE = 0) or ... THEN #UD; ELSIF not in VMX operation THEN IF (CPL > 0) or (in A20M mode) or (the values of CR0 and CR4 are not supported in VMX operation) THEN #GP(0); ELSIF in VMX non-root operation THEN VMexit; ELSIF CPL > 0 THEN #GP(0); ELSE VMfail("VMXON executed in VMX root operation"); FI; which, if re-written without ELSIF, yields: IF (register operand) or (CR0.PE = 0) or (CR4.VMXE = 0) or ... THEN #UD IF in VMX non-root operation THEN VMexit; IF CPL > 0 THEN #GP(0) IF in VMX operation THEN VMfail("VMXON executed in VMX root operation"); IF (in A20M mode) or (the values of CR0 and CR4 are not supported in VMX operation) THEN #GP(0); Note, KVM unconditionally forwards VMXON VM-Exits that occur in L2 to L1, i.e. there is no need to check the vCPU is not in VMX non-root mode. Add a comment to explain why unconditionally forwarding such exits is functionally correct. Reported-by: Eric Li Fixes: c7d855c2aff2 ("KVM: nVMX: Inject #UD if VMXON is attempted with incompatible CR0/CR4") Cc: stable@vger.kernel.org Signed-off-by: Sean Christopherson Link: https://lore.kernel.org/r/20221006001956.329314-1-seanjc@google.com Signed-off-by: Greg Kroah-Hartman commit 2c73b349fd7846c9776377696ddc18cf298b32c0 Author: Sean Christopherson Date: Fri Sep 30 23:31:32 2022 +0000 KVM: VMX: Resume guest immediately when injecting #GP on ECREATE commit eb3992e833d3a17f9b0a3e0371d0b1d3d566f740 upstream. Resume the guest immediately when injecting a #GP on ECREATE due to an invalid enclave size, i.e. don't attempt ECREATE in the host. The #GP is a terminal fault, e.g. skipping the instruction if ECREATE is successful would result in KVM injecting #GP on the instruction following ECREATE. Fixes: 70210c044b4e ("KVM: VMX: Add SGX ENCLS[ECREATE] handler to enforce CPUID restrictions") Cc: stable@vger.kernel.org Cc: Kai Huang Signed-off-by: Sean Christopherson Reviewed-by: Kai Huang Link: https://lore.kernel.org/r/20220930233132.1723330-1-seanjc@google.com Signed-off-by: Greg Kroah-Hartman commit 4a19f48bee09396b1eaef8995cb6b692d0e6465d Author: Rob Herring Date: Mon Nov 28 14:24:39 2022 -0600 of/kexec: Fix reading 32-bit "linux,initrd-{start,end}" values commit e553ad8d7957697385e81034bf76db3b2cb2cf27 upstream. "linux,initrd-start" and "linux,initrd-end" can be 32-bit values even on a 64-bit platform. Ideally, the size should be based on '#address-cells', but that has never been enforced in the kernel's FDT boot parsing code (early_init_dt_check_for_initrd()). Bootloader behavior is known to vary. For example, kexec always writes these as 64-bit. The result of incorrectly reading 32-bit values is most likely the reserved memory for the original initrd will still be reserved for the new kernel. The original arm64 equivalent of this code failed to release the initrd reserved memory in *all* cases. Use of_read_number() to mirror the early_init_dt_check_for_initrd() code. Fixes: b30be4dc733e ("of: Add a common kexec FDT setup function") Cc: stable@vger.kernel.org Reported-by: Peter Maydell Link: https://lore.kernel.org/r/20221128202440.1411895-1-robh@kernel.org Signed-off-by: Rob Herring Signed-off-by: Greg Kroah-Hartman commit 7eddcdb09f6218f30af89c8af1e84cd1daf902ef Author: Namhyung Kim Date: Tue Dec 20 14:31:40 2022 -0800 perf/core: Call LSM hook after copying perf_event_attr commit 0a041ebca4956292cadfb14a63ace3a9c1dcb0a3 upstream. It passes the attr struct to the security_perf_event_open() but it's not initialized yet. Fixes: da97e18458fb ("perf_event: Add support for LSM and SELinux checks") Signed-off-by: Namhyung Kim Signed-off-by: Peter Zijlstra (Intel) Reviewed-by: Joel Fernandes (Google) Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20221220223140.4020470-1-namhyung@kernel.org Signed-off-by: Greg Kroah-Hartman commit 15697f653399253f9be4ed2a1e03d795f3cfee94 Author: Zheng Yejian Date: Wed Dec 7 11:51:43 2022 +0800 tracing/hist: Fix out-of-bound write on 'action_data.var_ref_idx' commit 82470f7d9044842618c847a7166de2b7458157a7 upstream. When generate a synthetic event with many params and then create a trace action for it [1], kernel panic happened [2]. It is because that in trace_action_create() 'data->n_params' is up to SYNTH_FIELDS_MAX (current value is 64), and array 'data->var_ref_idx' keeps indices into array 'hist_data->var_refs' for each synthetic event param, but the length of 'data->var_ref_idx' is TRACING_MAP_VARS_MAX (current value is 16), so out-of-bound write happened when 'data->n_params' more than 16. In this case, 'data->match_data.event' is overwritten and eventually cause the panic. To solve the issue, adjust the length of 'data->var_ref_idx' to be SYNTH_FIELDS_MAX and add sanity checks to avoid out-of-bound write. [1] # cd /sys/kernel/tracing/ # echo "my_synth_event int v1; int v2; int v3; int v4; int v5; int v6;\ int v7; int v8; int v9; int v10; int v11; int v12; int v13; int v14;\ int v15; int v16; int v17; int v18; int v19; int v20; int v21; int v22;\ int v23; int v24; int v25; int v26; int v27; int v28; int v29; int v30;\ int v31; int v32; int v33; int v34; int v35; int v36; int v37; int v38;\ int v39; int v40; int v41; int v42; int v43; int v44; int v45; int v46;\ int v47; int v48; int v49; int v50; int v51; int v52; int v53; int v54;\ int v55; int v56; int v57; int v58; int v59; int v60; int v61; int v62;\ int v63" >> synthetic_events # echo 'hist:keys=pid:ts0=common_timestamp.usecs if comm=="bash"' >> \ events/sched/sched_waking/trigger # echo "hist:keys=next_pid:onmatch(sched.sched_waking).my_synth_event(\ pid,pid,pid,pid,pid,pid,pid,pid,pid,pid,pid,pid,pid,pid,pid,pid,pid,pid,\ pid,pid,pid,pid,pid,pid,pid,pid,pid,pid,pid,pid,pid,pid,pid,pid,pid,pid,\ pid,pid,pid,pid,pid,pid,pid,pid,pid,pid,pid,pid,pid,pid,pid,pid,pid,pid,\ pid,pid,pid,pid,pid,pid,pid,pid,pid)" >> events/sched/sched_switch/trigger [2] BUG: unable to handle page fault for address: ffff91c900000000 PGD 61001067 P4D 61001067 PUD 0 Oops: 0000 [#1] PREEMPT SMP NOPTI CPU: 2 PID: 322 Comm: bash Tainted: G W 6.1.0-rc8+ #229 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.15.0-0-g2dd4b9b3f840-prebuilt.qemu.org 04/01/2014 RIP: 0010:strcmp+0xc/0x30 Code: 75 f7 31 d2 44 0f b6 04 16 44 88 04 11 48 83 c2 01 45 84 c0 75 ee c3 cc cc cc cc 0f 1f 00 31 c0 eb 08 48 83 c0 01 84 d2 74 13 <0f> b6 14 07 3a 14 06 74 ef 19 c0 83 c8 01 c3 cc cc cc cc 31 c3 RSP: 0018:ffff9b3b00f53c48 EFLAGS: 00000246 RAX: 0000000000000000 RBX: ffffffffba958a68 RCX: 0000000000000000 RDX: 0000000000000010 RSI: ffff91c943d33a90 RDI: ffff91c900000000 RBP: ffff91c900000000 R08: 00000018d604b529 R09: 0000000000000000 R10: ffff91c9483eddb1 R11: ffff91ca483eddab R12: ffff91c946171580 R13: ffff91c9479f0538 R14: ffff91c9457c2848 R15: ffff91c9479f0538 FS: 00007f1d1cfbe740(0000) GS:ffff91c9bdc80000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffff91c900000000 CR3: 0000000006316000 CR4: 00000000000006e0 Call Trace: __find_event_file+0x55/0x90 action_create+0x76c/0x1060 event_hist_trigger_parse+0x146d/0x2060 ? event_trigger_write+0x31/0xd0 trigger_process_regex+0xbb/0x110 event_trigger_write+0x6b/0xd0 vfs_write+0xc8/0x3e0 ? alloc_fd+0xc0/0x160 ? preempt_count_add+0x4d/0xa0 ? preempt_count_add+0x70/0xa0 ksys_write+0x5f/0xe0 do_syscall_64+0x3b/0x90 entry_SYSCALL_64_after_hwframe+0x63/0xcd RIP: 0033:0x7f1d1d0cf077 Code: 64 89 02 48 c7 c0 ff ff ff ff eb bb 0f 1f 80 00 00 00 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 48 89 54 24 18 48 89 74 RSP: 002b:00007ffcebb0e568 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 RAX: ffffffffffffffda RBX: 0000000000000143 RCX: 00007f1d1d0cf077 RDX: 0000000000000143 RSI: 00005639265aa7e0 RDI: 0000000000000001 RBP: 00005639265aa7e0 R08: 000000000000000a R09: 0000000000000142 R10: 000056392639c017 R11: 0000000000000246 R12: 0000000000000143 R13: 00007f1d1d1ae6a0 R14: 00007f1d1d1aa4a0 R15: 00007f1d1d1a98a0 Modules linked in: CR2: ffff91c900000000 ---[ end trace 0000000000000000 ]--- RIP: 0010:strcmp+0xc/0x30 Code: 75 f7 31 d2 44 0f b6 04 16 44 88 04 11 48 83 c2 01 45 84 c0 75 ee c3 cc cc cc cc 0f 1f 00 31 c0 eb 08 48 83 c0 01 84 d2 74 13 <0f> b6 14 07 3a 14 06 74 ef 19 c0 83 c8 01 c3 cc cc cc cc 31 c3 RSP: 0018:ffff9b3b00f53c48 EFLAGS: 00000246 RAX: 0000000000000000 RBX: ffffffffba958a68 RCX: 0000000000000000 RDX: 0000000000000010 RSI: ffff91c943d33a90 RDI: ffff91c900000000 RBP: ffff91c900000000 R08: 00000018d604b529 R09: 0000000000000000 R10: ffff91c9483eddb1 R11: ffff91ca483eddab R12: ffff91c946171580 R13: ffff91c9479f0538 R14: ffff91c9457c2848 R15: ffff91c9479f0538 FS: 00007f1d1cfbe740(0000) GS:ffff91c9bdc80000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffff91c900000000 CR3: 0000000006316000 CR4: 00000000000006e0 Link: https://lore.kernel.org/linux-trace-kernel/20221207035143.2278781-1-zhengyejian1@huawei.com Cc: Cc: Cc: stable@vger.kernel.org Fixes: d380dcde9a07 ("tracing: Fix now invalid var_ref_vals assumption in trace action") Signed-off-by: Zheng Yejian Signed-off-by: Steven Rostedt (Google) Signed-off-by: Greg Kroah-Hartman commit fd52b86a72480f35fae2c2cde970bebc7ea173be Author: Mike Snitzer Date: Wed Nov 30 14:02:47 2022 -0500 dm cache: set needs_check flag after aborting metadata commit 6b9973861cb2e96dcd0bb0f1baddc5c034207c5c upstream. Otherwise the commit that will be aborted will be associated with the metadata objects that will be torn down. Must write needs_check flag to metadata with a reset block manager. Found through code-inspection (and compared against dm-thin.c). Cc: stable@vger.kernel.org Fixes: 028ae9f76f29 ("dm cache: add fail io mode and needs_check flag") Signed-off-by: Mike Snitzer Signed-off-by: Greg Kroah-Hartman commit d2a0b298ebf83ab6236f66788a3541e91ce75a70 Author: Luo Meng Date: Tue Nov 29 10:48:49 2022 +0800 dm cache: Fix UAF in destroy() commit 6a459d8edbdbe7b24db42a5a9f21e6aa9e00c2aa upstream. Dm_cache also has the same UAF problem when dm_resume() and dm_destroy() are concurrent. Therefore, cancelling timer again in destroy(). Cc: stable@vger.kernel.org Fixes: c6b4fcbad044e ("dm: add cache target") Signed-off-by: Luo Meng Signed-off-by: Mike Snitzer Signed-off-by: Greg Kroah-Hartman commit 856edd0e92f3fe89606b704c86a93daedddfe6ec Author: Luo Meng Date: Tue Nov 29 10:48:48 2022 +0800 dm clone: Fix UAF in clone_dtr() commit e4b5957c6f749a501c464f92792f1c8e26b61a94 upstream. Dm_clone also has the same UAF problem when dm_resume() and dm_destroy() are concurrent. Therefore, cancelling timer again in clone_dtr(). Cc: stable@vger.kernel.org Fixes: 7431b7835f554 ("dm: add clone target") Signed-off-by: Luo Meng Signed-off-by: Mike Snitzer Signed-off-by: Greg Kroah-Hartman commit 9215b25f2e105032114e9b92c9783a2a84ee8af9 Author: Luo Meng Date: Tue Nov 29 10:48:50 2022 +0800 dm integrity: Fix UAF in dm_integrity_dtr() commit f50cb2cbabd6c4a60add93d72451728f86e4791c upstream. Dm_integrity also has the same UAF problem when dm_resume() and dm_destroy() are concurrent. Therefore, cancelling timer again in dm_integrity_dtr(). Cc: stable@vger.kernel.org Fixes: 7eada909bfd7a ("dm: add integrity target") Signed-off-by: Luo Meng Signed-off-by: Mike Snitzer Signed-off-by: Greg Kroah-Hartman commit 34cd15d83b7206188d440b29b68084fcafde9395 Author: Luo Meng Date: Tue Nov 29 10:48:47 2022 +0800 dm thin: Fix UAF in run_timer_softirq() commit 88430ebcbc0ec637b710b947738839848c20feff upstream. When dm_resume() and dm_destroy() are concurrent, it will lead to UAF, as follows: BUG: KASAN: use-after-free in __run_timers+0x173/0x710 Write of size 8 at addr ffff88816d9490f0 by task swapper/0/0 Call Trace: dump_stack_lvl+0x73/0x9f print_report.cold+0x132/0xaa2 _raw_spin_lock_irqsave+0xcd/0x160 __run_timers+0x173/0x710 kasan_report+0xad/0x110 __run_timers+0x173/0x710 __asan_store8+0x9c/0x140 __run_timers+0x173/0x710 call_timer_fn+0x310/0x310 pvclock_clocksource_read+0xfa/0x250 kvm_clock_read+0x2c/0x70 kvm_clock_get_cycles+0xd/0x20 ktime_get+0x5c/0x110 lapic_next_event+0x38/0x50 clockevents_program_event+0xf1/0x1e0 run_timer_softirq+0x49/0x90 __do_softirq+0x16e/0x62c __irq_exit_rcu+0x1fa/0x270 irq_exit_rcu+0x12/0x20 sysvec_apic_timer_interrupt+0x8e/0xc0 One of the concurrency UAF can be shown as below: use free do_resume | __find_device_hash_cell | dm_get | atomic_inc(&md->holders) | | dm_destroy | __dm_destroy | if (!dm_suspended_md(md)) | atomic_read(&md->holders) | msleep(1) dm_resume | __dm_resume | dm_table_resume_targets | pool_resume | do_waker #add delay work | dm_put | atomic_dec(&md->holders) | | dm_table_destroy | pool_dtr | __pool_dec | __pool_destroy | destroy_workqueue | kfree(pool) # free pool time out __do_softirq run_timer_softirq # pool has already been freed This can be easily reproduced using: 1. create thin-pool 2. dmsetup suspend pool 3. dmsetup resume pool 4. dmsetup remove_all # Concurrent with 3 The root cause of this UAF bug is that dm_resume() adds timer after dm_destroy() skips cancelling the timer because of suspend status. After timeout, it will call run_timer_softirq(), however pool has already been freed. The concurrency UAF bug will happen. Therefore, cancelling timer again in __pool_destroy(). Cc: stable@vger.kernel.org Fixes: 991d9fa02da0d ("dm: add thin provisioning target") Signed-off-by: Luo Meng Signed-off-by: Mike Snitzer Signed-off-by: Greg Kroah-Hartman commit ac362c40e3e9d10f7c6f04cfad748d3b43720d2c Author: Luo Meng Date: Wed Nov 30 10:09:45 2022 +0800 dm thin: resume even if in FAIL mode commit 19eb1650afeb1aa86151f61900e9e5f1de5d8d02 upstream. If a thinpool set fail_io while suspending, resume will fail with: device-mapper: resume ioctl on vg-thinpool failed: Invalid argument The thin-pool also can't be removed if an in-flight bio is in the deferred list. This can be easily reproduced using: echo "offline" > /sys/block/sda/device/state dd if=/dev/zero of=/dev/mapper/thin bs=4K count=1 dmsetup suspend /dev/mapper/pool mkfs.ext4 /dev/mapper/thin dmsetup resume /dev/mapper/pool The root cause is maybe_resize_data_dev() will check fail_io and return error before called dm_resume. Fix this by adding FAIL mode check at the end of pool_preresume(). Cc: stable@vger.kernel.org Fixes: da105ed5fd7e ("dm thin metadata: introduce dm_pool_abort_metadata") Signed-off-by: Luo Meng Signed-off-by: Mike Snitzer Signed-off-by: Greg Kroah-Hartman commit 4b710e8481ade7c9200e94d3018e99dc42a0a0e8 Author: Zhihao Cheng Date: Thu Dec 8 22:28:02 2022 +0800 dm thin: Use last transaction's pmd->root when commit failed commit 7991dbff6849f67e823b7cc0c15e5a90b0549b9f upstream. Recently we found a softlock up problem in dm thin pool btree lookup code due to corrupted metadata: Kernel panic - not syncing: softlockup: hung tasks CPU: 7 PID: 2669225 Comm: kworker/u16:3 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996) Workqueue: dm-thin do_worker [dm_thin_pool] Call Trace: dump_stack+0x9c/0xd3 panic+0x35d/0x6b9 watchdog_timer_fn.cold+0x16/0x25 __run_hrtimer+0xa2/0x2d0 RIP: 0010:__relink_lru+0x102/0x220 [dm_bufio] __bufio_new+0x11f/0x4f0 [dm_bufio] new_read+0xa3/0x1e0 [dm_bufio] dm_bm_read_lock+0x33/0xd0 [dm_persistent_data] ro_step+0x63/0x100 [dm_persistent_data] btree_lookup_raw.constprop.0+0x44/0x220 [dm_persistent_data] dm_btree_lookup+0x16f/0x210 [dm_persistent_data] dm_thin_find_block+0x12c/0x210 [dm_thin_pool] __process_bio_read_only+0xc5/0x400 [dm_thin_pool] process_thin_deferred_bios+0x1a4/0x4a0 [dm_thin_pool] process_one_work+0x3c5/0x730 Following process may generate a broken btree mixed with fresh and stale btree nodes, which could get dm thin trapped in an infinite loop while looking up data block: Transaction 1: pmd->root = A, A->B->C // One path in btree pmd->root = X, X->Y->Z // Copy-up Transaction 2: X,Z is updated on disk, Y write failed. // Commit failed, dm thin becomes read-only. process_bio_read_only dm_thin_find_block __find_block dm_btree_lookup(pmd->root) The pmd->root points to a broken btree, Y may contain stale node pointing to any block, for example X, which gets dm thin trapped into a dead loop while looking up Z. Fix this by setting pmd->root in __open_metadata(), so that dm thin will use the last transaction's pmd->root if commit failed. Fetch a reproducer in [Link]. Linke: https://bugzilla.kernel.org/show_bug.cgi?id=216790 Cc: stable@vger.kernel.org Fixes: 991d9fa02da0 ("dm: add thin provisioning target") Signed-off-by: Zhihao Cheng Acked-by: Joe Thornber Signed-off-by: Mike Snitzer Signed-off-by: Greg Kroah-Hartman commit f8c26c33fef588ee54852cffa7cbb9f9d9869405 Author: Zhihao Cheng Date: Wed Nov 30 21:31:34 2022 +0800 dm thin: Fix ABBA deadlock between shrink_slab and dm_pool_abort_metadata commit 8111964f1b8524c4bb56b02cd9c7a37725ea21fd upstream. Following concurrent processes: P1(drop cache) P2(kworker) drop_caches_sysctl_handler drop_slab shrink_slab down_read(&shrinker_rwsem) - LOCK A do_shrink_slab super_cache_scan prune_icache_sb dispose_list evict ext4_evict_inode ext4_clear_inode ext4_discard_preallocations ext4_mb_load_buddy_gfp ext4_mb_init_cache ext4_read_block_bitmap_nowait ext4_read_bh_nowait submit_bh dm_submit_bio do_worker process_deferred_bios commit metadata_operation_failed dm_pool_abort_metadata down_write(&pmd->root_lock) - LOCK B __destroy_persistent_data_objects dm_block_manager_destroy dm_bufio_client_destroy unregister_shrinker down_write(&shrinker_rwsem) thin_map | dm_thin_find_block ↓ down_read(&pmd->root_lock) --> ABBA deadlock , which triggers hung task: [ 76.974820] INFO: task kworker/u4:3:63 blocked for more than 15 seconds. [ 76.976019] Not tainted 6.1.0-rc4-00011-g8f17dd350364-dirty #910 [ 76.978521] task:kworker/u4:3 state:D stack:0 pid:63 ppid:2 [ 76.978534] Workqueue: dm-thin do_worker [ 76.978552] Call Trace: [ 76.978564] __schedule+0x6ba/0x10f0 [ 76.978582] schedule+0x9d/0x1e0 [ 76.978588] rwsem_down_write_slowpath+0x587/0xdf0 [ 76.978600] down_write+0xec/0x110 [ 76.978607] unregister_shrinker+0x2c/0xf0 [ 76.978616] dm_bufio_client_destroy+0x116/0x3d0 [ 76.978625] dm_block_manager_destroy+0x19/0x40 [ 76.978629] __destroy_persistent_data_objects+0x5e/0x70 [ 76.978636] dm_pool_abort_metadata+0x8e/0x100 [ 76.978643] metadata_operation_failed+0x86/0x110 [ 76.978649] commit+0x6a/0x230 [ 76.978655] do_worker+0xc6e/0xd90 [ 76.978702] process_one_work+0x269/0x630 [ 76.978714] worker_thread+0x266/0x630 [ 76.978730] kthread+0x151/0x1b0 [ 76.978772] INFO: task test.sh:2646 blocked for more than 15 seconds. [ 76.979756] Not tainted 6.1.0-rc4-00011-g8f17dd350364-dirty #910 [ 76.982111] task:test.sh state:D stack:0 pid:2646 ppid:2459 [ 76.982128] Call Trace: [ 76.982139] __schedule+0x6ba/0x10f0 [ 76.982155] schedule+0x9d/0x1e0 [ 76.982159] rwsem_down_read_slowpath+0x4f4/0x910 [ 76.982173] down_read+0x84/0x170 [ 76.982177] dm_thin_find_block+0x4c/0xd0 [ 76.982183] thin_map+0x201/0x3d0 [ 76.982188] __map_bio+0x5b/0x350 [ 76.982195] dm_submit_bio+0x2b6/0x930 [ 76.982202] __submit_bio+0x123/0x2d0 [ 76.982209] submit_bio_noacct_nocheck+0x101/0x3e0 [ 76.982222] submit_bio_noacct+0x389/0x770 [ 76.982227] submit_bio+0x50/0xc0 [ 76.982232] submit_bh_wbc+0x15e/0x230 [ 76.982238] submit_bh+0x14/0x20 [ 76.982241] ext4_read_bh_nowait+0xc5/0x130 [ 76.982247] ext4_read_block_bitmap_nowait+0x340/0xc60 [ 76.982254] ext4_mb_init_cache+0x1ce/0xdc0 [ 76.982259] ext4_mb_load_buddy_gfp+0x987/0xfa0 [ 76.982263] ext4_discard_preallocations+0x45d/0x830 [ 76.982274] ext4_clear_inode+0x48/0xf0 [ 76.982280] ext4_evict_inode+0xcf/0xc70 [ 76.982285] evict+0x119/0x2b0 [ 76.982290] dispose_list+0x43/0xa0 [ 76.982294] prune_icache_sb+0x64/0x90 [ 76.982298] super_cache_scan+0x155/0x210 [ 76.982303] do_shrink_slab+0x19e/0x4e0 [ 76.982310] shrink_slab+0x2bd/0x450 [ 76.982317] drop_slab+0xcc/0x1a0 [ 76.982323] drop_caches_sysctl_handler+0xb7/0xe0 [ 76.982327] proc_sys_call_handler+0x1bc/0x300 [ 76.982331] proc_sys_write+0x17/0x20 [ 76.982334] vfs_write+0x3d3/0x570 [ 76.982342] ksys_write+0x73/0x160 [ 76.982347] __x64_sys_write+0x1e/0x30 [ 76.982352] do_syscall_64+0x35/0x80 [ 76.982357] entry_SYSCALL_64_after_hwframe+0x63/0xcd Function metadata_operation_failed() is called when operations failed on dm pool metadata, dm pool will destroy and recreate metadata. So, shrinker will be unregistered and registered, which could down write shrinker_rwsem under pmd_write_lock. Fix it by allocating dm_block_manager before locking pmd->root_lock and destroying old dm_block_manager after unlocking pmd->root_lock, then old dm_block_manager is replaced with new dm_block_manager under pmd->root_lock. So, shrinker register/unregister could be done without holding pmd->root_lock. Fetch a reproducer in [Link]. Link: https://bugzilla.kernel.org/show_bug.cgi?id=216676 Cc: stable@vger.kernel.org #v5.2+ Fixes: e49e582965b3 ("dm thin: add read only and fail io modes") Signed-off-by: Zhihao Cheng Signed-off-by: Mike Snitzer Signed-off-by: Greg Kroah-Hartman commit 28d307f380df88a598bc0186d527462902d9bda1 Author: Mike Snitzer Date: Wed Nov 30 13:26:32 2022 -0500 dm cache: Fix ABBA deadlock between shrink_slab and dm_cache_metadata_abort commit 352b837a5541690d4f843819028cf2b8be83d424 upstream. Same ABBA deadlock pattern fixed in commit 4b60f452ec51 ("dm thin: Fix ABBA deadlock between shrink_slab and dm_pool_abort_metadata") to DM-cache's metadata. Reported-by: Zhihao Cheng Cc: stable@vger.kernel.org Fixes: 028ae9f76f29 ("dm cache: add fail io mode and needs_check flag") Signed-off-by: Mike Snitzer Signed-off-by: Greg Kroah-Hartman commit a9e89a567f48a08405caa7e078f135a39dbdc8f0 Author: Matthieu Baerts Date: Fri Dec 9 16:28:08 2022 -0800 mptcp: remove MPTCP 'ifdef' in TCP SYN cookies commit 3fff88186f047627bb128d65155f42517f8e448f upstream. To ease the maintenance, it is often recommended to avoid having #ifdef preprocessor conditions. Here the section related to CONFIG_MPTCP was quite short but the next commit needs to add more code around. It is then cleaner to move specific MPTCP code to functions located in net/mptcp directory. Now that mptcp_subflow_request_sock_ops structure can be static, it can also be marked as "read only after init". Suggested-by: Paolo Abeni Reviewed-by: Mat Martineau Cc: stable@vger.kernel.org Signed-off-by: Matthieu Baerts Signed-off-by: Mat Martineau Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman commit 13b9fd0dee936e626245b12f1252cb84961925c0 Author: Florian Westphal Date: Tue Feb 15 18:11:29 2022 -0800 mptcp: mark ops structures as ro_after_init commit 51fa7f8ebf0e25c7a9039fa3988a623d5f3855aa upstream. These structures are initialised from the init hooks, so we can't make them 'const'. But no writes occur afterwards, so we can use ro_after_init. Also, remove bogus EXPORT_SYMBOL, the only access comes from ip stack, not from kernel modules. Signed-off-by: Florian Westphal Signed-off-by: Mat Martineau Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman commit b2120ed7fd75157a6d83d0671ddf2050627718e3 Author: Alexander Aring Date: Thu Oct 27 16:45:12 2022 -0400 fs: dlm: retry accept() until -EAGAIN or error returns commit f0f4bb431bd543ed7bebbaea3ce326cfcd5388bc upstream. This patch fixes a race if we get two times an socket data ready event while the listen connection worker is queued. Currently it will be served only once but we need to do it (in this case twice) until we hit -EAGAIN which tells us there is no pending accept going on. This patch wraps an do while loop until we receive a return value which is different than 0 as it was done before commit d11ccd451b65 ("fs: dlm: listen socket out of connection hash"). Cc: stable@vger.kernel.org Fixes: d11ccd451b65 ("fs: dlm: listen socket out of connection hash") Signed-off-by: Alexander Aring Signed-off-by: David Teigland Signed-off-by: Greg Kroah-Hartman commit 5b4478615f701c71ce9a521283a7858fa326116d Author: Alexander Aring Date: Thu Oct 27 16:45:11 2022 -0400 fs: dlm: fix sock release if listen fails commit 08ae0547e75ec3d062b6b6b9cf4830c730df68df upstream. This patch fixes a double sock_release() call when the listen() is called for the dlm lowcomms listen socket. The caller of dlm_listen_for_all should never care about releasing the socket if dlm_listen_for_all() fails, it's done now only once if listen() fails. Cc: stable@vger.kernel.org Fixes: 2dc6b1158c28 ("fs: dlm: introduce generic listen") Signed-off-by: Alexander Aring Signed-off-by: David Teigland Signed-off-by: Greg Kroah-Hartman commit b7ede8a63dd9a48f2caf4df93564f11fb632c217 Author: Chris Chiu Date: Mon Dec 26 19:43:03 2022 +0800 ALSA: hda/realtek: Apply dual codec fixup for Dell Latitude laptops [ Upstream commit a4517c4f3423c7c448f2c359218f97c1173523a1 ] The Dell Latiture 3340/3440/3540 laptops with Realtek ALC3204 have dual codecs and need the ALC1220_FIXUP_GB_DUAL_CODECS to fix the conflicts of Master controls. The existing headset mic fixup for Dell is also required to enable the jack sense and the headset mic. Introduce a new fixup to fix the dual codec and headset mic issues for particular Dell laptops since other old Dell laptops with the same codec configuration are already well handled by the fixup in alc269_fallback_pin_fixup_tbl[]. Signed-off-by: Chris Chiu Cc: Link: https://lore.kernel.org/r/20221226114303.4027500-1-chris.chiu@canonical.com Signed-off-by: Takashi Iwai Signed-off-by: Sasha Levin commit dbd1f301915f9aa2cdb2f7c2a0c8709f4231fc3b Author: Philipp Jungkamp Date: Mon Dec 5 17:37:13 2022 +0100 ALSA: patch_realtek: Fix Dell Inspiron Plus 16 [ Upstream commit 2912cdda734d9136615ed05636d9fcbca2a7a3c5 ] The Dell Inspiron Plus 16, in both laptop and 2in1 form factor, has top speakers connected on NID 0x17, which the codec reports as unconnected. These speakers should be connected to the DAC on NID 0x03. Signed-off-by: Philipp Jungkamp Link: https://lore.kernel.org/r/20221205163713.7476-1-p.jungkamp@gmx.net Signed-off-by: Takashi Iwai Stable-dep-of: a4517c4f3423 ("ALSA: hda/realtek: Apply dual codec fixup for Dell Latitude laptops") Signed-off-by: Sasha Levin commit 8fb4c98f20dfca1237de2e3dfdbe78d156784fd3 Author: Yongqiang Liu Date: Thu Nov 10 14:23:07 2022 +0000 cpufreq: Init completion before kobject_init_and_add() commit 5c51054896bcce1d33d39fead2af73fec24f40b6 upstream. In cpufreq_policy_alloc(), it will call uninitialed completion in cpufreq_sysfs_release() when kobject_init_and_add() fails. And that will cause a crash such as the following page fault in complete: BUG: unable to handle page fault for address: fffffffffffffff8 [..] RIP: 0010:complete+0x98/0x1f0 [..] Call Trace: kobject_put+0x1be/0x4c0 cpufreq_online.cold+0xee/0x1fd cpufreq_add_dev+0x183/0x1e0 subsys_interface_register+0x3f5/0x4e0 cpufreq_register_driver+0x3b7/0x670 acpi_cpufreq_init+0x56c/0x1000 [acpi_cpufreq] do_one_initcall+0x13d/0x780 do_init_module+0x1c3/0x630 load_module+0x6e67/0x73b0 __do_sys_finit_module+0x181/0x240 do_syscall_64+0x35/0x80 entry_SYSCALL_64_after_hwframe+0x63/0xcd Fixes: 4ebe36c94aed ("cpufreq: Fix kobject memleak") Signed-off-by: Yongqiang Liu Acked-by: Viresh Kumar Cc: 5.2+ # 5.2+ Signed-off-by: Rafael J. Wysocki Signed-off-by: Greg Kroah-Hartman commit 876c6ab96782343de2d8ada8b38ea7189c98bbc6 Author: Kant Fan Date: Tue Oct 25 15:21:09 2022 +0800 PM/devfreq: governor: Add a private governor_data for governor commit 5fdded8448924e3631d466eea499b11606c43640 upstream. The member void *data in the structure devfreq can be overwrite by governor_userspace. For example: 1. The device driver assigned the devfreq governor to simple_ondemand by the function devfreq_add_device() and init the devfreq member void *data to a pointer of a static structure devfreq_simple_ondemand_data by the function devfreq_add_device(). 2. The user changed the devfreq governor to userspace by the command "echo userspace > /sys/class/devfreq/.../governor". 3. The governor userspace alloced a dynamic memory for the struct userspace_data and assigend the member void *data of devfreq to this memory by the function userspace_init(). 4. The user changed the devfreq governor back to simple_ondemand by the command "echo simple_ondemand > /sys/class/devfreq/.../governor". 5. The governor userspace exited and assigned the member void *data in the structure devfreq to NULL by the function userspace_exit(). 6. The governor simple_ondemand fetched the static information of devfreq_simple_ondemand_data in the function devfreq_simple_ondemand_func() but the member void *data of devfreq was assigned to NULL by the function userspace_exit(). 7. The information of upthreshold and downdifferential is lost and the governor simple_ondemand can't work correctly. The member void *data in the structure devfreq is designed for a static pointer used in a governor and inited by the function devfreq_add_device(). This patch add an element named governor_data in the devfreq structure which can be used by a governor(E.g userspace) who want to assign a private data to do some private things. Fixes: ce26c5bb9569 ("PM / devfreq: Add basic governors") Cc: stable@vger.kernel.org # 5.10+ Reviewed-by: Chanwoo Choi Acked-by: MyungJoo Ham Signed-off-by: Kant Fan Signed-off-by: Chanwoo Choi Signed-off-by: Greg Kroah-Hartman commit 0e945ea733eaf44007e7eedbfc85a06493c5b5ab Author: Mickaël Salaün Date: Fri Sep 9 12:39:01 2022 +0200 selftests: Use optional USERCFLAGS and USERLDFLAGS commit de3ee3f63400a23954e7c1ad1cb8c20f29ab6fe3 upstream. This change enables to extend CFLAGS and LDFLAGS from command line, e.g. to extend compiler checks: make USERCFLAGS=-Werror USERLDFLAGS=-static USERCFLAGS and USERLDFLAGS are documented in Documentation/kbuild/makefiles.rst and Documentation/kbuild/kbuild.rst This should be backported (down to 5.10) to improve previous kernel versions testing as well. Cc: Shuah Khan Cc: stable@vger.kernel.org Signed-off-by: Mickaël Salaün Link: https://lore.kernel.org/r/20220909103901.1503436-1-mic@digikod.net Signed-off-by: Shuah Khan Signed-off-by: Greg Kroah-Hartman commit 31697c5953ff9fe5fa6a14e689a23e0b86baf84d Author: Krzysztof Kozlowski Date: Fri Sep 30 21:20:37 2022 +0200 arm64: dts: qcom: sdm850-lenovo-yoga-c630: correct I2C12 pins drive strength commit fd49776d8f458bba5499384131eddc0b8bcaf50c upstream. The pin configuration (done with generic pin controller helpers and as expressed by bindings) requires children nodes with either: 1. "pins" property and the actual configuration, 2. another set of nodes with above point. The qup_i2c12_default pin configuration used second method - with a "pinmux" child. Fixes: 44acee207844 ("arm64: dts: qcom: Add Lenovo Yoga C630") Cc: Signed-off-by: Krzysztof Kozlowski Tested-by: Steev Klimaszewski Reviewed-by: Konrad Dybcio Signed-off-by: Bjorn Andersson Link: https://lore.kernel.org/r/20220930192039.240486-1-krzysztof.kozlowski@linaro.org Signed-off-by: Greg Kroah-Hartman commit 16304986603059d8f92116f977da982005101db4 Author: Jason A. Donenfeld Date: Tue Nov 8 13:37:55 2022 +0100 ARM: ux500: do not directly dereference __iomem commit 65b0e307a1a9193571db12910f382f84195a3d29 upstream. Sparse reports that calling add_device_randomness() on `uid` is a violation of address spaces. And indeed the next usage uses readl() properly, but that was left out when passing it toadd_device_ randomness(). So instead copy the whole thing to the stack first. Fixes: 4040d10a3d44 ("ARM: ux500: add DB serial number to entropy pool") Cc: Linus Walleij Cc: stable@vger.kernel.org Link: https://lore.kernel.org/all/202210230819.loF90KDh-lkp@intel.com/ Reported-by: kernel test robot Signed-off-by: Jason A. Donenfeld Link: https://lore.kernel.org/r/20221108123755.207438-1-Jason@zx2c4.com Signed-off-by: Linus Walleij Signed-off-by: Greg Kroah-Hartman commit 99590f29b2b7567fda2b503aa3d81a0d3e09dce5 Author: Boris Burkov Date: Wed Dec 14 15:05:08 2022 -0800 btrfs: fix resolving backrefs for inline extent followed by prealloc commit 560840afc3e63bbe5d9c5ef6b2ecf8f3589adff6 upstream. If a file consists of an inline extent followed by a regular or prealloc extent, then a legitimate attempt to resolve a logical address in the non-inline region will result in add_all_parents reading the invalid offset field of the inline extent. If the inline extent item is placed in the leaf eb s.t. it is the first item, attempting to access the offset field will not only be meaningless, it will go past the end of the eb and cause this panic: [17.626048] BTRFS warning (device dm-2): bad eb member end: ptr 0x3fd4 start 30834688 member offset 16377 size 8 [17.631693] general protection fault, probably for non-canonical address 0x5088000000000: 0000 [#1] SMP PTI [17.635041] CPU: 2 PID: 1267 Comm: btrfs Not tainted 5.12.0-07246-g75175d5adc74-dirty #199 [17.637969] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014 [17.641995] RIP: 0010:btrfs_get_64+0xe7/0x110 [17.649890] RSP: 0018:ffffc90001f73a08 EFLAGS: 00010202 [17.651652] RAX: 0000000000000001 RBX: ffff88810c42d000 RCX: 0000000000000000 [17.653921] RDX: 0005088000000000 RSI: ffffc90001f73a0f RDI: 0000000000000001 [17.656174] RBP: 0000000000000ff9 R08: 0000000000000007 R09: c0000000fffeffff [17.658441] R10: ffffc90001f73790 R11: ffffc90001f73788 R12: ffff888106afe918 [17.661070] R13: 0000000000003fd4 R14: 0000000000003f6f R15: cdcdcdcdcdcdcdcd [17.663617] FS: 00007f64e7627d80(0000) GS:ffff888237c80000(0000) knlGS:0000000000000000 [17.666525] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [17.668664] CR2: 000055d4a39152e8 CR3: 000000010c596002 CR4: 0000000000770ee0 [17.671253] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [17.673634] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [17.676034] PKRU: 55555554 [17.677004] Call Trace: [17.677877] add_all_parents+0x276/0x480 [17.679325] find_parent_nodes+0xfae/0x1590 [17.680771] btrfs_find_all_leafs+0x5e/0xa0 [17.682217] iterate_extent_inodes+0xce/0x260 [17.683809] ? btrfs_inode_flags_to_xflags+0x50/0x50 [17.685597] ? iterate_inodes_from_logical+0xa1/0xd0 [17.687404] iterate_inodes_from_logical+0xa1/0xd0 [17.689121] ? btrfs_inode_flags_to_xflags+0x50/0x50 [17.691010] btrfs_ioctl_logical_to_ino+0x131/0x190 [17.692946] btrfs_ioctl+0x104a/0x2f60 [17.694384] ? selinux_file_ioctl+0x182/0x220 [17.695995] ? __x64_sys_ioctl+0x84/0xc0 [17.697394] __x64_sys_ioctl+0x84/0xc0 [17.698697] do_syscall_64+0x33/0x40 [17.700017] entry_SYSCALL_64_after_hwframe+0x44/0xae [17.701753] RIP: 0033:0x7f64e72761b7 [17.709355] RSP: 002b:00007ffefb067f58 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 [17.712088] RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007f64e72761b7 [17.714667] RDX: 00007ffefb067fb0 RSI: 00000000c0389424 RDI: 0000000000000003 [17.717386] RBP: 00007ffefb06d188 R08: 000055d4a390d2b0 R09: 00007f64e7340a60 [17.719938] R10: 0000000000000231 R11: 0000000000000246 R12: 0000000000000001 [17.722383] R13: 0000000000000000 R14: 00000000c0389424 R15: 000055d4a38fd2a0 [17.724839] Modules linked in: Fix the bug by detecting the inline extent item in add_all_parents and skipping to the next extent item. CC: stable@vger.kernel.org # 4.9+ Reviewed-by: Qu Wenruo Signed-off-by: Boris Burkov Signed-off-by: David Sterba Signed-off-by: Greg Kroah-Hartman commit 1f9cf4daf2d357f51a69a9380b4cccc76f484ece Author: Wenchao Chen Date: Wed Dec 7 13:19:09 2022 +0800 mmc: sdhci-sprd: Disable CLK_AUTO when the clock is less than 400K commit ff874dbc4f868af128b412a9bd92637103cf11d7 upstream. When the clock is less than 400K, some SD cards fail to initialize because CLK_AUTO is enabled. Fixes: fb8bd90f83c4 ("mmc: sdhci-sprd: Add Spreadtrum's initial host controller") Signed-off-by: Wenchao Chen Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20221207051909.32126-1-wenchao.chen@unisoc.com Signed-off-by: Ulf Hansson Signed-off-by: Greg Kroah-Hartman commit 58d53ff30a00903c62eb0cc65559ddd0bd916cc3 Author: Krzysztof Kozlowski Date: Mon Oct 10 07:44:13 2022 -0400 arm64: dts: qcom: sdm845-db845c: correct SPI2 pins drive strength commit 9905370560d9c29adc15f4937c5a0c0dac05f0b4 upstream. The pin configuration (done with generic pin controller helpers and as expressed by bindings) requires children nodes with either: 1. "pins" property and the actual configuration, 2. another set of nodes with above point. The qup_spi2_default pin configuration uses alreaady the second method with a "pinmux" child, so configure drive-strength similarly in "pinconf". Otherwise the PIN drive strength would not be applied. Fixes: 8d23a0040475 ("arm64: dts: qcom: db845c: add Low speed expansion i2c and spi nodes") Cc: Signed-off-by: Krzysztof Kozlowski Reviewed-by: Douglas Anderson Reviewed-by: Neil Armstrong Signed-off-by: Bjorn Andersson Link: https://lore.kernel.org/r/20221010114417.29859-2-krzysztof.kozlowski@linaro.org Signed-off-by: Greg Kroah-Hartman commit a777b90a057519346e96b2723f6aee7c12aedff8 Author: Alexander Antonov Date: Thu Nov 17 12:28:25 2022 +0000 perf/x86/intel/uncore: Clear attr_update properly commit 6532783310e2b2f50dc13f46c49aa6546cb6e7a3 upstream. Current clear_attr_update procedure in pmu_set_mapping() sets attr_update field in NULL that is not correct because intel_uncore_type pmu types can contain several groups in attr_update field. For example, SPR platform already has uncore_alias_group to update and then UPI topology group will be added in next patches. Fix current behavior and clear attr_update group related to mapping only. Fixes: bb42b3d39781 ("perf/x86/intel/uncore: Expose an Uncore unit to IIO PMON mapping") Signed-off-by: Alexander Antonov Signed-off-by: Peter Zijlstra (Intel) Reviewed-by: Kan Liang Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20221117122833.3103580-4-alexander.antonov@linux.intel.com Signed-off-by: Greg Kroah-Hartman commit ca77ac238c1e735cf40d3df01f6d2ef36928c493 Author: Alexander Antonov Date: Thu Nov 17 12:28:26 2022 +0000 perf/x86/intel/uncore: Disable I/O stacks to PMU mapping on ICX-D commit efe062705d149b20a15498cb999a9edbb8241e6f upstream. Current implementation of I/O stacks to PMU mapping doesn't support ICX-D. Detect ICX-D system to disable mapping. Fixes: 10337e95e04c ("perf/x86/intel/uncore: Enable I/O stacks to IIO PMON mapping on ICX") Signed-off-by: Alexander Antonov Signed-off-by: Peter Zijlstra (Intel) Reviewed-by: Kan Liang Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20221117122833.3103580-5-alexander.antonov@linux.intel.com Signed-off-by: Greg Kroah-Hartman commit df06e7777cf93851e91d66f251b95001f859a212 Author: Bixuan Cui Date: Tue Oct 11 19:33:44 2022 +0800 jbd2: use the correct print format commit d87a7b4c77a997d5388566dd511ca8e6b8e8a0a8 upstream. The print format error was found when using ftrace event: <...>-1406 [000] .... 23599442.895823: jbd2_end_commit: dev 252,8 transaction -1866216965 sync 0 head -1866217368 <...>-1406 [000] .... 23599442.896299: jbd2_start_commit: dev 252,8 transaction -1866216964 sync 0 Use the correct print format for transaction, head and tid. Fixes: 879c5e6b7cb4 ('jbd2: convert instrumentation from markers to tracepoints') Signed-off-by: Bixuan Cui Reviewed-by: Jason Yan Link: https://lore.kernel.org/r/1665488024-95172-1-git-send-email-cuibixuan@linux.alibaba.com Signed-off-by: Theodore Ts'o Cc: stable@kernel.org Signed-off-by: Greg Kroah-Hartman commit 8e75b1dd4b165d18f9dc70a185a65f62bd255a8e Author: Steven Rostedt Date: Fri Dec 2 11:59:36 2022 -0500 ktest.pl minconfig: Unset configs instead of just removing them commit ef784eebb56425eed6e9b16e7d47e5c00dcf9c38 upstream. After a full run of a make_min_config test, I noticed there were a lot of CONFIGs still enabled that really should not be. Looking at them, I noticed they were all defined as "default y". The issue is that the test simple removes the config and re-runs make oldconfig, which enables it again because it is set to default 'y'. Instead, explicitly disable the config with writing "# CONFIG_FOO is not set" to the file to keep it from being set again. With this change, one of my box's minconfigs went from 768 configs set, down to 521 configs set. Link: https://lkml.kernel.org/r/20221202115936.016fce23@gandalf.local.home Cc: stable@vger.kernel.org Fixes: 0a05c769a9de5 ("ktest: Added config_bisect test type") Reviewed-by: John 'Warthog9' Hawley (VMware) Signed-off-by: Steven Rostedt (Google) Signed-off-by: Greg Kroah-Hartman commit 55e5e8b44561df3e28af511f3b9fad2dd694fdcd Author: Steven Rostedt Date: Wed Nov 30 17:54:34 2022 -0500 kest.pl: Fix grub2 menu handling for rebooting commit 26df05a8c1420ad3de314fdd407e7fc2058cc7aa upstream. grub2 has submenus where to use grub-reboot, it requires: grub-reboot X>Y where X is the main index and Y is the submenu. Thus if you have: menuentry 'Debian GNU/Linux' --class debian --class gnu-linux ... [...] } submenu 'Advanced options for Debian GNU/Linux' $menuentry_id_option ... menuentry 'Debian GNU/Linux, with Linux 6.0.0-4-amd64' --class debian --class gnu-linux ... [...] } menuentry 'Debian GNU/Linux, with Linux 6.0.0-4-amd64 (recovery mode)' --class debian --class gnu-linux ... [...] } menuentry 'Debian GNU/Linux, with Linux test' --class debian --class gnu-linux ... [...] } And wanted to boot to the "Linux test" kernel, you need to run: # grub-reboot 1>2 As 1 is the second top menu (the submenu) and 2 is the third of the sub menu entries. Have the grub.cfg parsing for grub2 handle such cases. Cc: stable@vger.kernel.org Fixes: a15ba91361d46 ("ktest: Add support for grub2") Reviewed-by: John 'Warthog9' Hawley (VMware) Signed-off-by: Steven Rostedt Signed-off-by: Greg Kroah-Hartman commit 823fed7c400fbba5d0aaa606cf60615fb32a0c48 Author: Manivannan Sadhasivam Date: Tue Nov 29 12:41:59 2022 +0530 soc: qcom: Select REMAP_MMIO for LLCC driver commit 5d2fe2d7b616b8baa18348ead857b504fc2de336 upstream. LLCC driver uses REGMAP_MMIO for accessing the hardware registers. So select the dependency in Kconfig. Without this, there will be errors while building the driver with COMPILE_TEST only: ERROR: modpost: "__devm_regmap_init_mmio_clk" [drivers/soc/qcom/llcc-qcom.ko] undefined! make[1]: *** [scripts/Makefile.modpost:126: Module.symvers] Error 1 make: *** [Makefile:1944: modpost] Error 2 Cc: # 4.19 Fixes: a3134fb09e0b ("drivers: soc: Add LLCC driver") Reported-by: Borislav Petkov Signed-off-by: Manivannan Sadhasivam Signed-off-by: Bjorn Andersson Link: https://lore.kernel.org/r/20221129071201.30024-2-manivannan.sadhasivam@linaro.org Signed-off-by: Greg Kroah-Hartman commit 8dabeeb1ff89ea341b458f477305821a382fa8d9 Author: Jason A. Donenfeld Date: Mon Oct 24 17:23:43 2022 +0200 media: stv0288: use explicitly signed char commit 7392134428c92a4cb541bd5c8f4f5c8d2e88364d upstream. With char becoming unsigned by default, and with `char` alone being ambiguous and based on architecture, signed chars need to be marked explicitly as such. Use `s8` and `u8` types here, since that's what surrounding code does. This fixes: drivers/media/dvb-frontends/stv0288.c:471 stv0288_set_frontend() warn: assigning (-9) to unsigned variable 'tm' drivers/media/dvb-frontends/stv0288.c:471 stv0288_set_frontend() warn: we never enter this loop Cc: Mauro Carvalho Chehab Cc: linux-media@vger.kernel.org Cc: stable@vger.kernel.org Signed-off-by: Jason A. Donenfeld Signed-off-by: Greg Kroah-Hartman commit d167ebea9086d225c2e061c1967fba8274fd39b2 Author: Eric Dumazet Date: Thu Jun 2 09:18:59 2022 -0700 net/af_packet: make sure to pull mac header commit e9d3f80935b6607dcdc5682b00b1d4b28e0a0c5d upstream. GSO assumes skb->head contains link layer headers. tun device in some case can provide base 14 bytes, regardless of VLAN being used or not. After blamed commit, we can end up setting a network header offset of 18+, we better pull the missing bytes to avoid a posible crash in GSO. syzbot report was: kernel BUG at include/linux/skbuff.h:2699! invalid opcode: 0000 [#1] PREEMPT SMP KASAN CPU: 1 PID: 3601 Comm: syz-executor210 Not tainted 5.18.0-syzkaller-11338-g2c5ca23f7414 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:__skb_pull include/linux/skbuff.h:2699 [inline] RIP: 0010:skb_mac_gso_segment+0x48f/0x530 net/core/gro.c:136 Code: 00 48 c7 c7 00 96 d4 8a c6 05 cb d3 45 06 01 e8 26 bb d0 01 e9 2f fd ff ff 49 c7 c4 ea ff ff ff e9 f1 fe ff ff e8 91 84 19 fa <0f> 0b 48 89 df e8 97 44 66 fa e9 7f fd ff ff e8 ad 44 66 fa e9 48 RSP: 0018:ffffc90002e2f4b8 EFLAGS: 00010293 RAX: 0000000000000000 RBX: 0000000000000012 RCX: 0000000000000000 RDX: ffff88805bb58000 RSI: ffffffff8760ed0f RDI: 0000000000000004 RBP: 0000000000005dbc R08: 0000000000000004 R09: 0000000000000fe0 R10: 0000000000000fe4 R11: 0000000000000000 R12: 0000000000000fe0 R13: ffff88807194d780 R14: 1ffff920005c5e9b R15: 0000000000000012 FS: 000055555730f300(0000) GS:ffff8880b9d00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00000000200015c0 CR3: 0000000071ff8000 CR4: 0000000000350ee0 Call Trace: __skb_gso_segment+0x327/0x6e0 net/core/dev.c:3411 skb_gso_segment include/linux/netdevice.h:4749 [inline] validate_xmit_skb+0x6bc/0xf10 net/core/dev.c:3669 validate_xmit_skb_list+0xbc/0x120 net/core/dev.c:3719 sch_direct_xmit+0x3d1/0xbe0 net/sched/sch_generic.c:327 __dev_xmit_skb net/core/dev.c:3815 [inline] __dev_queue_xmit+0x14a1/0x3a00 net/core/dev.c:4219 packet_snd net/packet/af_packet.c:3071 [inline] packet_sendmsg+0x21cb/0x5550 net/packet/af_packet.c:3102 sock_sendmsg_nosec net/socket.c:714 [inline] sock_sendmsg+0xcf/0x120 net/socket.c:734 ____sys_sendmsg+0x6eb/0x810 net/socket.c:2492 ___sys_sendmsg+0xf3/0x170 net/socket.c:2546 __sys_sendmsg net/socket.c:2575 [inline] __do_sys_sendmsg net/socket.c:2584 [inline] __se_sys_sendmsg net/socket.c:2582 [inline] __x64_sys_sendmsg+0x132/0x220 net/socket.c:2582 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x46/0xb0 RIP: 0033:0x7f4b95da06c9 Code: 28 c3 e8 4a 15 00 00 66 2e 0f 1f 84 00 00 00 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 c0 ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007ffd7defc4c8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e RAX: ffffffffffffffda RBX: 00007ffd7defc4f0 RCX: 00007f4b95da06c9 RDX: 0000000000000000 RSI: 0000000020000140 RDI: 0000000000000003 RBP: 0000000000000003 R08: bb1414ac00000050 R09: bb1414ac00000050 R10: 0000000000000004 R11: 0000000000000246 R12: 0000000000000000 R13: 00007ffd7defc4e0 R14: 00007ffd7defc4d8 R15: 00007ffd7defc4d4 Fixes: dfed913e8b55 ("net/af_packet: add VLAN support for AF_PACKET SOCK_RAW GSO") Signed-off-by: Eric Dumazet Reported-by: syzbot Acked-by: Hangbin Liu Acked-by: Willem de Bruijn Cc: Michael S. Tsirkin Signed-off-by: Jakub Kicinski Signed-off-by: Tudor Ambarus Signed-off-by: Greg Kroah-Hartman commit 9ff46c36df2e0a1ac352f2f4038eaf3f0f7361b8 Author: Hangbin Liu Date: Mon Apr 25 09:45:02 2022 +0800 net/af_packet: add VLAN support for AF_PACKET SOCK_RAW GSO commit dfed913e8b55a0c2c4906f1242fd38fd9a116e49 upstream. Currently, the kernel drops GSO VLAN tagged packet if it's created with socket(AF_PACKET, SOCK_RAW, 0) plus virtio_net_hdr. The reason is AF_PACKET doesn't adjust the skb network header if there is a VLAN tag. Then after virtio_net_hdr_set_proto() called, the skb->protocol will be set to ETH_P_IP/IPv6. And in later inet/ipv6_gso_segment() the skb is dropped as network header position is invalid. Let's handle VLAN packets by adjusting network header position in packet_parse_headers(). The adjustment is safe and does not affect the later xmit as tap device also did that. In packet_snd(), packet_parse_headers() need to be moved before calling virtio_net_hdr_set_proto(), so we can set correct skb->protocol and network header first. There is no need to update tpacket_snd() as it calls packet_parse_headers() in tpacket_fill_skb(), which is already before calling virtio_net_hdr_* functions. skb->no_fcs setting is also moved upper to make all skb settings together and keep consistency with function packet_sendmsg_spkt(). Signed-off-by: Hangbin Liu Acked-by: Willem de Bruijn Acked-by: Michael S. Tsirkin Link: https://lore.kernel.org/r/20220425014502.985464-1-liuhangbin@gmail.com Signed-off-by: Paolo Abeni Signed-off-by: Tudor Ambarus Signed-off-by: Greg Kroah-Hartman commit cd0f597c8aa8fd6128dc4058e33a68427835ae19 Author: Paul E. McKenney Date: Wed Jul 28 10:53:41 2021 -0700 rcu-tasks: Simplify trc_read_check_handler() atomic operations commit 96017bf9039763a2e02dcc6adaa18592cd73a39d upstream. Currently, trc_wait_for_one_reader() atomically increments the trc_n_readers_need_end counter before sending the IPI invoking trc_read_check_handler(). All failure paths out of trc_read_check_handler() and also from the smp_call_function_single() within trc_wait_for_one_reader() must carefully atomically decrement this counter. This is more complex than it needs to be. This commit therefore simplifies things and saves a few lines of code by dispensing with the atomic decrements in favor of having trc_read_check_handler() do the atomic increment only in the success case. In theory, this represents no change in functionality. Signed-off-by: Paul E. McKenney Cc: Joel Fernandes Signed-off-by: Greg Kroah-Hartman commit 593ca696687c70b39041a19fa02917194fee334b Author: Pierre-Louis Bossart Date: Fri Dec 24 10:10:31 2021 +0800 ASoC/SoundWire: dai: expand 'stream' concept beyond SoundWire commit e8444560b4d9302a511f0996f4cfdf85b628f4ca upstream. The HDAudio ASoC support relies on the set_tdm_slots() helper to store the HDaudio stream tag in the tx_mask. This only works because of the pre-existing order in soc-pcm.c, where the hw_params() is handled for codec_dais *before* cpu_dais. When the order is reversed, the stream_tag is used as a mask in the codec fixup functions: /* fixup params based on TDM slot masks */ if (substream->stream == SNDRV_PCM_STREAM_PLAYBACK && codec_dai->tx_mask) soc_pcm_codec_params_fixup(&codec_params, codec_dai->tx_mask); As a result of this confusion, the codec_params_fixup() ends-up generating bad channel masks, depending on what stream_tag was allocated. We could add a flag to state that the tx_mask is really not a mask, but it would be quite ugly to persist in overloading concepts. Instead, this patch suggests a more generic get/set 'stream' API based on the existing model for SoundWire. We can expand the concept to store 'stream' opaque information that is specific to different DAI types. In the case of HDAudio DAIs, we only need to store a stream tag as an unsigned char pointer. The TDM rx_ and tx_masks should really only be used to store masks. Rename get_sdw_stream/set_sdw_stream callbacks and helpers as get_stream/set_stream. No functionality change beyond the rename. Signed-off-by: Pierre-Louis Bossart Reviewed-by: Rander Wang Reviewed-by: Ranjani Sridharan Signed-off-by: Bard Liao Acked-By: Vinod Koul Link: https://lore.kernel.org/r/20211224021034.26635-5-yung-chuan.liao@linux.intel.com Signed-off-by: Mark Brown Cc: Takashi Iwai Signed-off-by: Greg Kroah-Hartman commit a7874dac6ba678a9936255bc459c302060c740e3 Author: Pierre-Louis Bossart Date: Fri Dec 24 10:10:32 2021 +0800 ASoC: Intel/SOF: use set_stream() instead of set_tdm_slots() for HDAudio commit 636110411ca726f19ef8e87b0be51bb9a4cdef06 upstream. Overloading the tx_mask with a linear value is asking for trouble and only works because the codec_dai hw_params() is called before the cpu_dai hw_params(). Move to the more generic set_stream() API to pass the hdac_stream information. Signed-off-by: Pierre-Louis Bossart Reviewed-by: Rander Wang Reviewed-by: Ranjani Sridharan Signed-off-by: Bard Liao Link: https://lore.kernel.org/r/20211224021034.26635-6-yung-chuan.liao@linux.intel.com Signed-off-by: Mark Brown Cc: Takashi Iwai Signed-off-by: Greg Kroah-Hartman commit ae4f70b2fed409b528663a2e2d9de7e140549b1d Author: Marco Elver Date: Mon Sep 12 11:45:40 2022 +0200 kcsan: Instrument memcpy/memset/memmove with newer Clang commit 7c201739beef1a586d806463f1465429cdce34c5 upstream. With Clang version 16+, -fsanitize=thread will turn memcpy/memset/memmove calls in instrumented functions into __tsan_memcpy/__tsan_memset/__tsan_memmove calls respectively. Add these functions to the core KCSAN runtime, so that we (a) catch data races with mem* functions, and (b) won't run into linker errors with such newer compilers. Cc: stable@vger.kernel.org # v5.10+ Signed-off-by: Marco Elver Signed-off-by: Paul E. McKenney [ elver@google.com: adjust check_access() call for v5.15 and earlier. ] Signed-off-by: Greg Kroah-Hartman commit d01fa993eb7fbc305f0a9c3e8bfac6513efc13b6 Author: Chuck Lever Date: Sat Nov 26 15:55:18 2022 -0500 SUNRPC: Don't leak netobj memory when gss_read_proxy_verf() fails commit da522b5fe1a5f8b7c20a0023e87b52a150e53bf5 upstream. Fixes: 030d794bf498 ("SUNRPC: Use gssproxy upcall for server RPCGSS authentication.") Signed-off-by: Chuck Lever Cc: Reviewed-by: Jeff Layton Signed-off-by: Greg Kroah-Hartman commit 43135fb098126ef2cd6ed584900fd7bfa25f95ce Author: Hanjun Guo Date: Thu Nov 17 19:23:42 2022 +0800 tpm: tpm_tis: Add the missed acpi_put_table() to fix memory leak commit db9622f762104459ff87ecdf885cc42c18053fd9 upstream. In check_acpi_tpm2(), we get the TPM2 table just to make sure the table is there, not used after the init, so the acpi_put_table() should be added to release the ACPI memory. Fixes: 4cb586a188d4 ("tpm_tis: Consolidate the platform and acpi probe flow") Cc: stable@vger.kernel.org Signed-off-by: Hanjun Guo Signed-off-by: Jarkko Sakkinen Signed-off-by: Greg Kroah-Hartman commit 986cd9a9b95423e35a2cbb8e9105aec0e0d7f337 Author: Hanjun Guo Date: Thu Nov 17 19:23:41 2022 +0800 tpm: tpm_crb: Add the missed acpi_put_table() to fix memory leak commit 37e90c374dd11cf4919c51e847c6d6ced0abc555 upstream. In crb_acpi_add(), we get the TPM2 table to retrieve information like start method, and then assign them to the priv data, so the TPM2 table is not used after the init, should be freed, call acpi_put_table() to fix the memory leak. Fixes: 30fc8d138e91 ("tpm: TPM 2.0 CRB Interface") Cc: stable@vger.kernel.org Signed-off-by: Hanjun Guo Reviewed-by: Jarkko Sakkinen Signed-off-by: Jarkko Sakkinen Signed-off-by: Greg Kroah-Hartman commit 638cd298dfebce46919cbd6cf1884701215f506d Author: Hanjun Guo Date: Thu Nov 17 19:23:40 2022 +0800 tpm: acpi: Call acpi_put_table() to fix memory leak commit 8740a12ca2e2959531ad253bac99ada338b33d80 upstream. The start and length of the event log area are obtained from TPM2 or TCPA table, so we call acpi_get_table() to get the ACPI information, but the acpi_get_table() should be coupled with acpi_put_table() to release the ACPI memory, add the acpi_put_table() properly to fix the memory leak. While we are at it, remove the redundant empty line at the end of the tpm_read_log_acpi(). Fixes: 0bfb23746052 ("tpm: Move eventlog files to a subdirectory") Fixes: 85467f63a05c ("tpm: Add support for event log pointer found in TPM2 ACPI table") Cc: stable@vger.kernel.org Signed-off-by: Hanjun Guo Reviewed-by: Jarkko Sakkinen Signed-off-by: Jarkko Sakkinen Signed-off-by: Greg Kroah-Hartman commit d58289fc77f8c1f879c818bddaf7ef524c73658b Author: Deren Wu Date: Sun Dec 4 16:24:16 2022 +0800 mmc: vub300: fix warning - do not call blocking ops when !TASK_RUNNING commit 4a44cd249604e29e7b90ae796d7692f5773dd348 upstream. vub300_enable_sdio_irq() works with mutex and need TASK_RUNNING here. Ensure that we mark current as TASK_RUNNING for sleepable context. [ 77.554641] do not call blocking ops when !TASK_RUNNING; state=1 set at [] sdio_irq_thread+0x17d/0x5b0 [ 77.554652] WARNING: CPU: 2 PID: 1983 at kernel/sched/core.c:9813 __might_sleep+0x116/0x160 [ 77.554905] CPU: 2 PID: 1983 Comm: ksdioirqd/mmc1 Tainted: G OE 6.1.0-rc5 #1 [ 77.554910] Hardware name: Intel(R) Client Systems NUC8i7BEH/NUC8BEB, BIOS BECFL357.86A.0081.2020.0504.1834 05/04/2020 [ 77.554912] RIP: 0010:__might_sleep+0x116/0x160 [ 77.554920] RSP: 0018:ffff888107b7fdb8 EFLAGS: 00010282 [ 77.554923] RAX: 0000000000000000 RBX: ffff888118c1b740 RCX: 0000000000000000 [ 77.554926] RDX: 0000000000000001 RSI: 0000000000000004 RDI: ffffed1020f6ffa9 [ 77.554928] RBP: ffff888107b7fde0 R08: 0000000000000001 R09: ffffed1043ea60ba [ 77.554930] R10: ffff88821f5305cb R11: ffffed1043ea60b9 R12: ffffffff93aa3a60 [ 77.554932] R13: 000000000000011b R14: 7fffffffffffffff R15: ffffffffc0558660 [ 77.554934] FS: 0000000000000000(0000) GS:ffff88821f500000(0000) knlGS:0000000000000000 [ 77.554937] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 77.554939] CR2: 00007f8a44010d68 CR3: 000000024421a003 CR4: 00000000003706e0 [ 77.554942] Call Trace: [ 77.554944] [ 77.554952] mutex_lock+0x78/0xf0 [ 77.554973] vub300_enable_sdio_irq+0x103/0x3c0 [vub300] [ 77.554981] sdio_irq_thread+0x25c/0x5b0 [ 77.555006] kthread+0x2b8/0x370 [ 77.555017] ret_from_fork+0x1f/0x30 [ 77.555023] [ 77.555025] ---[ end trace 0000000000000000 ]--- Fixes: 88095e7b473a ("mmc: Add new VUB300 USB-to-SD/SDIO/MMC driver") Signed-off-by: Deren Wu Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/87dc45b122d26d63c80532976813c9365d7160b3.1670140888.git.deren.wu@mediatek.com Signed-off-by: Ulf Hansson Signed-off-by: Greg Kroah-Hartman commit 7eb57bc92f1ba0e2d27b0c8f16f2c69ae65fce70 Author: Jaegeuk Kim Date: Tue Nov 8 17:59:34 2022 -0800 f2fs: allow to read node block after shutdown commit e6ecb142429183cef4835f31d4134050ae660032 upstream. If block address is still alive, we should give a valid node block even after shutdown. Otherwise, we can see zero data when reading out a file. Cc: stable@vger.kernel.org Fixes: 83a3bfdb5a8a ("f2fs: indicate shutdown f2fs to allow unmount successfully") Reviewed-by: Chao Yu Signed-off-by: Jaegeuk Kim Signed-off-by: Greg Kroah-Hartman commit acc13987fdea7ef6361f7528fec1d2884e377c12 Author: Pavel Machek Date: Mon Oct 24 19:30:12 2022 +0200 f2fs: should put a page when checking the summary info commit c3db3c2fd9992c08f49aa93752d3c103c3a4f6aa upstream. The commit introduces another bug. Cc: stable@vger.kernel.org Fixes: c6ad7fd16657e ("f2fs: fix to do sanity check on summary info") Signed-off-by: Pavel Machek Reviewed-by: Chao Yu Signed-off-by: Jaegeuk Kim Signed-off-by: Greg Kroah-Hartman commit 35d8a89862e69f743715613d4751a08094e6aa1c Author: NARIBAYASHI Akira Date: Wed Oct 26 20:24:38 2022 +0900 mm, compaction: fix fast_isolate_around() to stay within boundaries commit be21b32afe470c5ae98e27e49201158a47032942 upstream. Depending on the memory configuration, isolate_freepages_block() may scan pages out of the target range and causes panic. Panic can occur on systems with multiple zones in a single pageblock. The reason it is rare is that it only happens in special configurations. Depending on how many similar systems there are, it may be a good idea to fix this problem for older kernels as well. The problem is that pfn as argument of fast_isolate_around() could be out of the target range. Therefore we should consider the case where pfn < start_pfn, and also the case where end_pfn < pfn. This problem should have been addressd by the commit 6e2b7044c199 ("mm, compaction: make fast_isolate_freepages() stay within zone") but there was an oversight. Case1: pfn < start_pfn | node X's zone | node Y's zone +-----------------+------------------------------... pageblock ^ ^ ^ +-----------+-----------+-----------+-----------+... ^ ^ ^ ^ ^ end_pfn ^ start_pfn = cc->zone->zone_start_pfn pfn <---------> scanned range by "Scan After" Case2: end_pfn < pfn | node X's zone | node Y's zone +-----------------+------------------------------... pageblock ^ ^ ^ +-----------+-----------+-----------+-----------+... ^ ^ ^ ^ ^ pfn ^ end_pfn start_pfn <---------> scanned range by "Scan Before" It seems that there is no good reason to skip nr_isolated pages just after given pfn. So let perform simple scan from start to end instead of dividing the scan into "Before" and "After". Link: https://lkml.kernel.org/r/20221026112438.236336-1-a.naribayashi@fujitsu.com Fixes: 6e2b7044c199 ("mm, compaction: make fast_isolate_freepages() stay within zone"). Signed-off-by: NARIBAYASHI Akira Cc: David Rientjes Cc: Mel Gorman Cc: Vlastimil Babka Cc: Signed-off-by: Andrew Morton Signed-off-by: Greg Kroah-Hartman commit 91bd504128a51776472445070e11a3b0f9348c90 Author: Mikulas Patocka Date: Fri Nov 4 09:53:38 2022 -0400 md: fix a crash in mempool_free commit 341097ee53573e06ab9fc675d96a052385b851fa upstream. There's a crash in mempool_free when running the lvm test shell/lvchange-rebuild-raid.sh. The reason for the crash is this: * super_written calls atomic_dec_and_test(&mddev->pending_writes) and wake_up(&mddev->sb_wait). Then it calls rdev_dec_pending(rdev, mddev) and bio_put(bio). * so, the process that waited on sb_wait and that is woken up is racing with bio_put(bio). * if the process wins the race, it calls bioset_exit before bio_put(bio) is executed. * bio_put(bio) attempts to free a bio into a destroyed bio set - causing a crash in mempool_free. We fix this bug by moving bio_put before atomic_dec_and_test. We also move rdev_dec_pending before atomic_dec_and_test as suggested by Neil Brown. The function md_end_flush has a similar bug - we must call bio_put before we decrement the number of in-progress bios. BUG: kernel NULL pointer dereference, address: 0000000000000000 #PF: supervisor write access in kernel mode #PF: error_code(0x0002) - not-present page PGD 11557f0067 P4D 11557f0067 PUD 0 Oops: 0002 [#1] PREEMPT SMP CPU: 0 PID: 73 Comm: kworker/0:1 Not tainted 6.1.0-rc3 #5 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/01/2014 Workqueue: kdelayd flush_expired_bios [dm_delay] RIP: 0010:mempool_free+0x47/0x80 Code: 48 89 ef 5b 5d ff e0 f3 c3 48 89 f7 e8 32 45 3f 00 48 63 53 08 48 89 c6 3b 53 04 7d 2d 48 8b 43 10 8d 4a 01 48 89 df 89 4b 08 <48> 89 2c d0 e8 b0 45 3f 00 48 8d 7b 30 5b 5d 31 c9 ba 01 00 00 00 RSP: 0018:ffff88910036bda8 EFLAGS: 00010093 RAX: 0000000000000000 RBX: ffff8891037b65d8 RCX: 0000000000000001 RDX: 0000000000000000 RSI: 0000000000000202 RDI: ffff8891037b65d8 RBP: ffff8891447ba240 R08: 0000000000012908 R09: 00000000003d0900 R10: 0000000000000000 R11: 0000000000173544 R12: ffff889101a14000 R13: ffff8891562ac300 R14: ffff889102b41440 R15: ffffe8ffffa00d05 FS: 0000000000000000(0000) GS:ffff88942fa00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 0000001102e99000 CR4: 00000000000006b0 Call Trace: clone_endio+0xf4/0x1c0 [dm_mod] clone_endio+0xf4/0x1c0 [dm_mod] __submit_bio+0x76/0x120 submit_bio_noacct_nocheck+0xb6/0x2a0 flush_expired_bios+0x28/0x2f [dm_delay] process_one_work+0x1b4/0x300 worker_thread+0x45/0x3e0 ? rescuer_thread+0x380/0x380 kthread+0xc2/0x100 ? kthread_complete_and_exit+0x20/0x20 ret_from_fork+0x1f/0x30 Modules linked in: brd dm_delay dm_raid dm_mod af_packet uvesafb cfbfillrect cfbimgblt cn cfbcopyarea fb font fbdev tun autofs4 binfmt_misc configfs ipv6 virtio_rng virtio_balloon rng_core virtio_net pcspkr net_failover failover qemu_fw_cfg button mousedev raid10 raid456 libcrc32c async_raid6_recov async_memcpy async_pq raid6_pq async_xor xor async_tx raid1 raid0 md_mod sd_mod t10_pi crc64_rocksoft crc64 virtio_scsi scsi_mod evdev psmouse bsg scsi_common [last unloaded: brd] CR2: 0000000000000000 ---[ end trace 0000000000000000 ]--- Signed-off-by: Mikulas Patocka Cc: stable@vger.kernel.org Signed-off-by: Song Liu Signed-off-by: Greg Kroah-Hartman commit 29328fbce56c367a3ab3c9b29080fbbf9d9ef384 Author: ChiYuan Huang Date: Thu Sep 29 10:00:17 2022 +0800 mfd: mt6360: Add bounds checking in Regmap read/write call-backs commit 5f4f94e9f26cca6514474b307b59348b8485e711 upstream. Fix the potential risk of OOB read if bank index is over the maximum. Refer to the discussion list for the experiment result on mt6370. https://lore.kernel.org/all/20220914013345.GA5802@cyhuang-hp-elitebook-840-g3.rt/ If not to check the bound, there is the same issue on mt6360. Cc: stable@vger.kernel.org Fixes: 3b0850440a06c (mfd: mt6360: Merge different sub-devices I2C read/write) Signed-off-by: ChiYuan Huang Signed-off-by: Lee Jones Link: https://lore.kernel.org/r/1664416817-31590-1-git-send-email-u0084500@gmail.com Signed-off-by: Greg Kroah-Hartman commit c24cc476acd8bccb5af54849aac5e779d8223bf5 Author: Christian Brauner Date: Sat Dec 17 22:28:40 2022 +0100 pnode: terminate at peers of source commit 11933cf1d91d57da9e5c53822a540bbdc2656c16 upstream. The propagate_mnt() function handles mount propagation when creating mounts and propagates the source mount tree @source_mnt to all applicable nodes of the destination propagation mount tree headed by @dest_mnt. Unfortunately it contains a bug where it fails to terminate at peers of @source_mnt when looking up copies of the source mount that become masters for copies of the source mount tree mounted on top of slaves in the destination propagation tree causing a NULL dereference. Once the mechanics of the bug are understood it's easy to trigger. Because of unprivileged user namespaces it is available to unprivileged users. While fixing this bug we've gotten confused multiple times due to unclear terminology or missing concepts. So let's start this with some clarifications: * The terms "master" or "peer" denote a shared mount. A shared mount belongs to a peer group. * A peer group is a set of shared mounts that propagate to each other. They are identified by a peer group id. The peer group id is available in @shared_mnt->mnt_group_id. Shared mounts within the same peer group have the same peer group id. The peers in a peer group can be reached via @shared_mnt->mnt_share. * The terms "slave mount" or "dependent mount" denote a mount that receives propagation from a peer in a peer group. IOW, shared mounts may have slave mounts and slave mounts have shared mounts as their master. Slave mounts of a given peer in a peer group are listed on that peers slave list available at @shared_mnt->mnt_slave_list. * The term "master mount" denotes a mount in a peer group. IOW, it denotes a shared mount or a peer mount in a peer group. The term "master mount" - or "master" for short - is mostly used when talking in the context of slave mounts that receive propagation from a master mount. A master mount of a slave identifies the closest peer group a slave mount receives propagation from. The master mount of a slave can be identified via @slave_mount->mnt_master. Different slaves may point to different masters in the same peer group. * Multiple peers in a peer group can have non-empty ->mnt_slave_lists. Non-empty ->mnt_slave_lists of peers don't intersect. Consequently, to ensure all slave mounts of a peer group are visited the ->mnt_slave_lists of all peers in a peer group have to be walked. * Slave mounts point to a peer in the closest peer group they receive propagation from via @slave_mnt->mnt_master (see above). Together with these peers they form a propagation group (see below). The closest peer group can thus be identified through the peer group id @slave_mnt->mnt_master->mnt_group_id of the peer/master that a slave mount receives propagation from. * A shared-slave mount is a slave mount to a peer group pg1 while also a peer in another peer group pg2. IOW, a peer group may receive propagation from another peer group. If a peer group pg1 is a slave to another peer group pg2 then all peers in peer group pg1 point to the same peer in peer group pg2 via ->mnt_master. IOW, all peers in peer group pg1 appear on the same ->mnt_slave_list. IOW, they cannot be slaves to different peer groups. * A pure slave mount is a slave mount that is a slave to a peer group but is not a peer in another peer group. * A propagation group denotes the set of mounts consisting of a single peer group pg1 and all slave mounts and shared-slave mounts that point to a peer in that peer group via ->mnt_master. IOW, all slave mounts such that @slave_mnt->mnt_master->mnt_group_id is equal to @shared_mnt->mnt_group_id. The concept of a propagation group makes it easier to talk about a single propagation level in a propagation tree. For example, in propagate_mnt() the immediate peers of @dest_mnt and all slaves of @dest_mnt's peer group form a propagation group propg1. So a shared-slave mount that is a slave in propg1 and that is a peer in another peer group pg2 forms another propagation group propg2 together with all slaves that point to that shared-slave mount in their ->mnt_master. * A propagation tree refers to all mounts that receive propagation starting from a specific shared mount. For example, for propagate_mnt() @dest_mnt is the start of a propagation tree. The propagation tree ecompasses all mounts that receive propagation from @dest_mnt's peer group down to the leafs. With that out of the way let's get to the actual algorithm. We know that @dest_mnt is guaranteed to be a pure shared mount or a shared-slave mount. This is guaranteed by a check in attach_recursive_mnt(). So propagate_mnt() will first propagate the source mount tree to all peers in @dest_mnt's peer group: for (n = next_peer(dest_mnt); n != dest_mnt; n = next_peer(n)) { ret = propagate_one(n); if (ret) goto out; } Notice, that the peer propagation loop of propagate_mnt() doesn't propagate @dest_mnt itself. @dest_mnt is mounted directly in attach_recursive_mnt() after we propagated to the destination propagation tree. The mount that will be mounted on top of @dest_mnt is @source_mnt. This copy was created earlier even before we entered attach_recursive_mnt() and doesn't concern us a lot here. It's just important to notice that when propagate_mnt() is called @source_mnt will not yet have been mounted on top of @dest_mnt. Thus, @source_mnt->mnt_parent will either still point to @source_mnt or - in the case @source_mnt is moved and thus already attached - still to its former parent. For each peer @m in @dest_mnt's peer group propagate_one() will create a new copy of the source mount tree and mount that copy @child on @m such that @child->mnt_parent points to @m after propagate_one() returns. propagate_one() will stash the last destination propagation node @m in @last_dest and the last copy it created for the source mount tree in @last_source. Hence, if we call into propagate_one() again for the next destination propagation node @m, @last_dest will point to the previous destination propagation node and @last_source will point to the previous copy of the source mount tree and mounted on @last_dest. Each new copy of the source mount tree is created from the previous copy of the source mount tree. This will become important later. The peer loop in propagate_mnt() is straightforward. We iterate through the peers copying and updating @last_source and @last_dest as we go through them and mount each copy of the source mount tree @child on a peer @m in @dest_mnt's peer group. After propagate_mnt() handled the peers in @dest_mnt's peer group propagate_mnt() will propagate the source mount tree down the propagation tree that @dest_mnt's peer group propagates to: for (m = next_group(dest_mnt, dest_mnt); m; m = next_group(m, dest_mnt)) { /* everything in that slave group */ n = m; do { ret = propagate_one(n); if (ret) goto out; n = next_peer(n); } while (n != m); } The next_group() helper will recursively walk the destination propagation tree, descending into each propagation group of the propagation tree. The important part is that it takes care to propagate the source mount tree to all peers in the peer group of a propagation group before it propagates to the slaves to those peers in the propagation group. IOW, it creates and mounts copies of the source mount tree that become masters before it creates and mounts copies of the source mount tree that become slaves to these masters. It is important to remember that propagating the source mount tree to each mount @m in the destination propagation tree simply means that we create and mount new copies @child of the source mount tree on @m such that @child->mnt_parent points to @m. Since we know that each node @m in the destination propagation tree headed by @dest_mnt's peer group will be overmounted with a copy of the source mount tree and since we know that the propagation properties of each copy of the source mount tree we create and mount at @m will mostly mirror the propagation properties of @m. We can use that information to create and mount the copies of the source mount tree that become masters before their slaves. The easy case is always when @m and @last_dest are peers in a peer group of a given propagation group. In that case we know that we can simply copy @last_source without having to figure out what the master for the new copy @child of the source mount tree needs to be as we've done that in a previous call to propagate_one(). The hard case is when we're dealing with a slave mount or a shared-slave mount @m in a destination propagation group that we need to create and mount a copy of the source mount tree on. For each propagation group in the destination propagation tree we propagate the source mount tree to we want to make sure that the copies @child of the source mount tree we create and mount on slaves @m pick an ealier copy of the source mount tree that we mounted on a master @m of the destination propagation group as their master. This is a mouthful but as far as we can tell that's the core of it all. But, if we keep track of the masters in the destination propagation tree @m we can use the information to find the correct master for each copy of the source mount tree we create and mount at the slaves in the destination propagation tree @m. Let's walk through the base case as that's still fairly easy to grasp. If we're dealing with the first slave in the propagation group that @dest_mnt is in then we don't yet have marked any masters in the destination propagation tree. We know the master for the first slave to @dest_mnt's peer group is simple @dest_mnt. So we expect this algorithm to yield a copy of the source mount tree that was mounted on a peer in @dest_mnt's peer group as the master for the copy of the source mount tree we want to mount at the first slave @m: for (n = m; ; n = p) { p = n->mnt_master; if (p == dest_master || IS_MNT_MARKED(p)) break; } For the first slave we walk the destination propagation tree all the way up to a peer in @dest_mnt's peer group. IOW, the propagation hierarchy can be walked by walking up the @mnt->mnt_master hierarchy of the destination propagation tree @m. We will ultimately find a peer in @dest_mnt's peer group and thus ultimately @dest_mnt->mnt_master. Btw, here the assumption we listed at the beginning becomes important. Namely, that peers in a peer group pg1 that are slaves in another peer group pg2 appear on the same ->mnt_slave_list. IOW, all slaves who are peers in peer group pg1 point to the same peer in peer group pg2 via their ->mnt_master. Otherwise the termination condition in the code above would be wrong and next_group() would be broken too. So the first iteration sets: n = m; p = n->mnt_master; such that @p now points to a peer or @dest_mnt itself. We walk up one more level since we don't have any marked mounts. So we end up with: n = dest_mnt; p = dest_mnt->mnt_master; If @dest_mnt's peer group is not slave to another peer group then @p is now NULL. If @dest_mnt's peer group is a slave to another peer group then @p now points to @dest_mnt->mnt_master points which is a master outside the propagation tree we're dealing with. Now we need to figure out the master for the copy of the source mount tree we're about to create and mount on the first slave of @dest_mnt's peer group: do { struct mount *parent = last_source->mnt_parent; if (last_source == first_source) break; done = parent->mnt_master == p; if (done && peers(n, parent)) break; last_source = last_source->mnt_master; } while (!done); We know that @last_source->mnt_parent points to @last_dest and @last_dest is the last peer in @dest_mnt's peer group we propagated to in the peer loop in propagate_mnt(). Consequently, @last_source is the last copy we created and mount on that last peer in @dest_mnt's peer group. So @last_source is the master we want to pick. We know that @last_source->mnt_parent->mnt_master points to @last_dest->mnt_master. We also know that @last_dest->mnt_master is either NULL or points to a master outside of the destination propagation tree and so does @p. Hence: done = parent->mnt_master == p; is trivially true in the base condition. We also know that for the first slave mount of @dest_mnt's peer group that @last_dest either points @dest_mnt itself because it was initialized to: last_dest = dest_mnt; at the beginning of propagate_mnt() or it will point to a peer of @dest_mnt in its peer group. In both cases it is guaranteed that on the first iteration @n and @parent are peers (Please note the check for peers here as that's important.): if (done && peers(n, parent)) break; So, as we expected, we select @last_source, which referes to the last copy of the source mount tree we mounted on the last peer in @dest_mnt's peer group, as the master of the first slave in @dest_mnt's peer group. The rest is taken care of by clone_mnt(last_source, ...). We'll skip over that part otherwise this becomes a blogpost. At the end of propagate_mnt() we now mark @m->mnt_master as the first master in the destination propagation tree that is distinct from @dest_mnt->mnt_master. IOW, we mark @dest_mnt itself as a master. By marking @dest_mnt or one of it's peers we are able to easily find it again when we later lookup masters for other copies of the source mount tree we mount copies of the source mount tree on slaves @m to @dest_mnt's peer group. This, in turn allows us to find the master we selected for the copies of the source mount tree we mounted on master in the destination propagation tree again. The important part is to realize that the code makes use of the fact that the last copy of the source mount tree stashed in @last_source was mounted on top of the previous destination propagation node @last_dest. What this means is that @last_source allows us to walk the destination propagation hierarchy the same way each destination propagation node @m does. If we take @last_source, which is the copy of @source_mnt we have mounted on @last_dest in the previous iteration of propagate_one(), then we know @last_source->mnt_parent points to @last_dest but we also know that as we walk through the destination propagation tree that @last_source->mnt_master will point to an earlier copy of the source mount tree we mounted one an earlier destination propagation node @m. IOW, @last_source->mnt_parent will be our hook into the destination propagation tree and each consecutive @last_source->mnt_master will lead us to an earlier propagation node @m via @last_source->mnt_master->mnt_parent. Hence, by walking up @last_source->mnt_master, each of which is mounted on a node that is a master @m in the destination propagation tree we can also walk up the destination propagation hierarchy. So, for each new destination propagation node @m we use the previous copy of @last_source and the fact it's mounted on the previous propagation node @last_dest via @last_source->mnt_master->mnt_parent to determine what the master of the new copy of @last_source needs to be. The goal is to find the _closest_ master that the new copy of the source mount tree we are about to create and mount on a slave @m in the destination propagation tree needs to pick. IOW, we want to find a suitable master in the propagation group. As the propagation structure of the source mount propagation tree we create mirrors the propagation structure of the destination propagation tree we can find @m's closest master - i.e., a marked master - which is a peer in the closest peer group that @m receives propagation from. We store that closest master of @m in @p as before and record the slave to that master in @n We then search for this master @p via @last_source by walking up the master hierarchy starting from the last copy of the source mount tree stored in @last_source that we created and mounted on the previous destination propagation node @m. We will try to find the master by walking @last_source->mnt_master and by comparing @last_source->mnt_master->mnt_parent->mnt_master to @p. If we find @p then we can figure out what earlier copy of the source mount tree needs to be the master for the new copy of the source mount tree we're about to create and mount at the current destination propagation node @m. If @last_source->mnt_master->mnt_parent and @n are peers then we know that the closest master they receive propagation from is @last_source->mnt_master->mnt_parent->mnt_master. If not then the closest immediate peer group that they receive propagation from must be one level higher up. This builds on the earlier clarification at the beginning that all peers in a peer group which are slaves of other peer groups all point to the same ->mnt_master, i.e., appear on the same ->mnt_slave_list, of the closest peer group that they receive propagation from. However, terminating the walk has corner cases. If the closest marked master for a given destination node @m cannot be found by walking up the master hierarchy via @last_source->mnt_master then we need to terminate the walk when we encounter @source_mnt again. This isn't an arbitrary termination. It simply means that the new copy of the source mount tree we're about to create has a copy of the source mount tree we created and mounted on a peer in @dest_mnt's peer group as its master. IOW, @source_mnt is the peer in the closest peer group that the new copy of the source mount tree receives propagation from. We absolutely have to stop @source_mnt because @last_source->mnt_master either points outside the propagation hierarchy we're dealing with or it is NULL because @source_mnt isn't a shared-slave. So continuing the walk past @source_mnt would cause a NULL dereference via @last_source->mnt_master->mnt_parent. And so we have to stop the walk when we encounter @source_mnt again. One scenario where this can happen is when we first handled a series of slaves of @dest_mnt's peer group and then encounter peers in a new peer group that is a slave to @dest_mnt's peer group. We handle them and then we encounter another slave mount to @dest_mnt that is a pure slave to @dest_mnt's peer group. That pure slave will have a peer in @dest_mnt's peer group as its master. Consequently, the new copy of the source mount tree will need to have @source_mnt as it's master. So we walk the propagation hierarchy all the way up to @source_mnt based on @last_source->mnt_master. So terminate on @source_mnt, easy peasy. Except, that the check misses something that the rest of the algorithm already handles. If @dest_mnt has peers in it's peer group the peer loop in propagate_mnt(): for (n = next_peer(dest_mnt); n != dest_mnt; n = next_peer(n)) { ret = propagate_one(n); if (ret) goto out; } will consecutively update @last_source with each previous copy of the source mount tree we created and mounted at the previous peer in @dest_mnt's peer group. So after that loop terminates @last_source will point to whatever copy of the source mount tree was created and mounted on the last peer in @dest_mnt's peer group. Furthermore, if there is even a single additional peer in @dest_mnt's peer group then @last_source will __not__ point to @source_mnt anymore. Because, as we mentioned above, @dest_mnt isn't even handled in this loop but directly in attach_recursive_mnt(). So it can't even accidently come last in that peer loop. So the first time we handle a slave mount @m of @dest_mnt's peer group the copy of the source mount tree we create will make the __last copy of the source mount tree we created and mounted on the last peer in @dest_mnt's peer group the master of the new copy of the source mount tree we create and mount on the first slave of @dest_mnt's peer group__. But this means that the termination condition that checks for @source_mnt is wrong. The @source_mnt cannot be found anymore by propagate_one(). Instead it will find the last copy of the source mount tree we created and mounted for the last peer of @dest_mnt's peer group again. And that is a peer of @source_mnt not @source_mnt itself. IOW, we fail to terminate the loop correctly and ultimately dereference @last_source->mnt_master->mnt_parent. When @source_mnt's peer group isn't slave to another peer group then @last_source->mnt_master is NULL causing the splat below. For example, assume @dest_mnt is a pure shared mount and has three peers in its peer group: =================================================================================== mount-id mount-parent-id peer-group-id =================================================================================== (@dest_mnt) mnt_master[216] 309 297 shared:216 \ (@source_mnt) mnt_master[218]: 609 609 shared:218 (1) mnt_master[216]: 607 605 shared:216 \ (P1) mnt_master[218]: 624 607 shared:218 (2) mnt_master[216]: 576 574 shared:216 \ (P2) mnt_master[218]: 625 576 shared:218 (3) mnt_master[216]: 545 543 shared:216 \ (P3) mnt_master[218]: 626 545 shared:218 After this sequence has been processed @last_source will point to (P3), the copy generated for the third peer in @dest_mnt's peer group we handled. So the copy of the source mount tree (P4) we create and mount on the first slave of @dest_mnt's peer group: =================================================================================== mount-id mount-parent-id peer-group-id =================================================================================== mnt_master[216] 309 297 shared:216 / / (S0) mnt_slave 483 481 master:216 \ \ (P3) mnt_master[218] 626 545 shared:218 \ / \/ (P4) mnt_slave 627 483 master:218 will pick the last copy of the source mount tree (P3) as master, not (S0). When walking the propagation hierarchy via @last_source's master hierarchy we encounter (P3) but not (S0), i.e., @source_mnt. We can fix this in multiple ways: (1) By setting @last_source to @source_mnt after we processed the peers in @dest_mnt's peer group right after the peer loop in propagate_mnt(). (2) By changing the termination condition that relies on finding exactly @source_mnt to finding a peer of @source_mnt. (3) By only moving @last_source when we actually venture into a new peer group or some clever variant thereof. The first two options are minimally invasive and what we want as a fix. The third option is more intrusive but something we'd like to explore in the near future. This passes all LTP tests and specifically the mount propagation testsuite part of it. It also holds up against all known reproducers of this issues. Final words. First, this is a clever but __worringly__ underdocumented algorithm. There isn't a single detailed comment to be found in next_group(), propagate_one() or anywhere else in that file for that matter. This has been a giant pain to understand and work through and a bug like this is insanely difficult to fix without a detailed understanding of what's happening. Let's not talk about the amount of time that was sunk into fixing this. Second, all the cool kids with access to unshare --mount --user --map-root --propagation=unchanged are going to have a lot of fun. IOW, triggerable by unprivileged users while namespace_lock() lock is held. [ 115.848393] BUG: kernel NULL pointer dereference, address: 0000000000000010 [ 115.848967] #PF: supervisor read access in kernel mode [ 115.849386] #PF: error_code(0x0000) - not-present page [ 115.849803] PGD 0 P4D 0 [ 115.850012] Oops: 0000 [#1] PREEMPT SMP PTI [ 115.850354] CPU: 0 PID: 15591 Comm: mount Not tainted 6.1.0-rc7 #3 [ 115.850851] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006 [ 115.851510] RIP: 0010:propagate_one.part.0+0x7f/0x1a0 [ 115.851924] Code: 75 eb 4c 8b 05 c2 25 37 02 4c 89 ca 48 8b 4a 10 49 39 d0 74 1e 48 3b 81 e0 00 00 00 74 26 48 8b 92 e0 00 00 00 be 01 00 00 00 <48> 8b 4a 10 49 39 d0 75 e2 40 84 f6 74 38 4c 89 05 84 25 37 02 4d [ 115.853441] RSP: 0018:ffffb8d5443d7d50 EFLAGS: 00010282 [ 115.853865] RAX: ffff8e4d87c41c80 RBX: ffff8e4d88ded780 RCX: ffff8e4da4333a00 [ 115.854458] RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffff8e4d88ded780 [ 115.855044] RBP: ffff8e4d88ded780 R08: ffff8e4da4338000 R09: ffff8e4da43388c0 [ 115.855693] R10: 0000000000000002 R11: ffffb8d540158000 R12: ffffb8d5443d7da8 [ 115.856304] R13: ffff8e4d88ded780 R14: 0000000000000000 R15: 0000000000000000 [ 115.856859] FS: 00007f92c90c9800(0000) GS:ffff8e4dfdc00000(0000) knlGS:0000000000000000 [ 115.857531] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 115.858006] CR2: 0000000000000010 CR3: 0000000022f4c002 CR4: 00000000000706f0 [ 115.858598] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 115.859393] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 115.860099] Call Trace: [ 115.860358] [ 115.860535] propagate_mnt+0x14d/0x190 [ 115.860848] attach_recursive_mnt+0x274/0x3e0 [ 115.861212] path_mount+0x8c8/0xa60 [ 115.861503] __x64_sys_mount+0xf6/0x140 [ 115.861819] do_syscall_64+0x5b/0x80 [ 115.862117] ? do_faccessat+0x123/0x250 [ 115.862435] ? syscall_exit_to_user_mode+0x17/0x40 [ 115.862826] ? do_syscall_64+0x67/0x80 [ 115.863133] ? syscall_exit_to_user_mode+0x17/0x40 [ 115.863527] ? do_syscall_64+0x67/0x80 [ 115.863835] ? do_syscall_64+0x67/0x80 [ 115.864144] ? do_syscall_64+0x67/0x80 [ 115.864452] ? exc_page_fault+0x70/0x170 [ 115.864775] entry_SYSCALL_64_after_hwframe+0x63/0xcd [ 115.865187] RIP: 0033:0x7f92c92b0ebe [ 115.865480] Code: 48 8b 0d 75 4f 0c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 49 89 ca b8 a5 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 42 4f 0c 00 f7 d8 64 89 01 48 [ 115.866984] RSP: 002b:00007fff000aa728 EFLAGS: 00000246 ORIG_RAX: 00000000000000a5 [ 115.867607] RAX: ffffffffffffffda RBX: 000055a77888d6b0 RCX: 00007f92c92b0ebe [ 115.868240] RDX: 000055a77888d8e0 RSI: 000055a77888e6e0 RDI: 000055a77888e620 [ 115.868823] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000001 [ 115.869403] R10: 0000000000001000 R11: 0000000000000246 R12: 000055a77888e620 [ 115.869994] R13: 000055a77888d8e0 R14: 00000000ffffffff R15: 00007f92c93e4076 [ 115.870581] [ 115.870763] Modules linked in: nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip_set rfkill nf_tables nfnetlink qrtr snd_intel8x0 sunrpc snd_ac97_codec ac97_bus snd_pcm snd_timer intel_rapl_msr intel_rapl_common snd vboxguest intel_powerclamp video rapl joydev soundcore i2c_piix4 wmi fuse zram xfs vmwgfx crct10dif_pclmul crc32_pclmul crc32c_intel polyval_clmulni polyval_generic drm_ttm_helper ttm e1000 ghash_clmulni_intel serio_raw ata_generic pata_acpi scsi_dh_rdac scsi_dh_emc scsi_dh_alua dm_multipath [ 115.875288] CR2: 0000000000000010 [ 115.875641] ---[ end trace 0000000000000000 ]--- [ 115.876135] RIP: 0010:propagate_one.part.0+0x7f/0x1a0 [ 115.876551] Code: 75 eb 4c 8b 05 c2 25 37 02 4c 89 ca 48 8b 4a 10 49 39 d0 74 1e 48 3b 81 e0 00 00 00 74 26 48 8b 92 e0 00 00 00 be 01 00 00 00 <48> 8b 4a 10 49 39 d0 75 e2 40 84 f6 74 38 4c 89 05 84 25 37 02 4d [ 115.878086] RSP: 0018:ffffb8d5443d7d50 EFLAGS: 00010282 [ 115.878511] RAX: ffff8e4d87c41c80 RBX: ffff8e4d88ded780 RCX: ffff8e4da4333a00 [ 115.879128] RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffff8e4d88ded780 [ 115.879715] RBP: ffff8e4d88ded780 R08: ffff8e4da4338000 R09: ffff8e4da43388c0 [ 115.880359] R10: 0000000000000002 R11: ffffb8d540158000 R12: ffffb8d5443d7da8 [ 115.880962] R13: ffff8e4d88ded780 R14: 0000000000000000 R15: 0000000000000000 [ 115.881548] FS: 00007f92c90c9800(0000) GS:ffff8e4dfdc00000(0000) knlGS:0000000000000000 [ 115.882234] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 115.882713] CR2: 0000000000000010 CR3: 0000000022f4c002 CR4: 00000000000706f0 [ 115.883314] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 115.883966] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Fixes: f2ebb3a921c1 ("smarter propagate_mnt()") Fixes: 5ec0811d3037 ("propogate_mnt: Handle the first propogated copy being a slave") Cc: Reported-by: Ditang Chen Signed-off-by: Seth Forshee (Digital Ocean) Signed-off-by: Christian Brauner (Microsoft) Signed-off-by: Greg Kroah-Hartman commit 0c9118e381ff538874e00fd4e66a768273c150fb Author: Artem Egorkine Date: Sun Dec 25 12:57:28 2022 +0200 ALSA: line6: fix stack overflow in line6_midi_transmit commit b8800d324abb50160560c636bfafe2c81001b66c upstream. Correctly calculate available space including the size of the chunk buffer. This fixes a buffer overflow when multiple MIDI sysex messages are sent to a PODxt device. Signed-off-by: Artem Egorkine Cc: Link: https://lore.kernel.org/r/20221225105728.1153989-2-arteme@gmail.com Signed-off-by: Takashi Iwai Signed-off-by: Greg Kroah-Hartman commit ac4b4fdf32626397dd943c86d86e3f5d9d9812a4 Author: Artem Egorkine Date: Sun Dec 25 12:57:27 2022 +0200 ALSA: line6: correct midi status byte when receiving data from podxt commit 8508fa2e7472f673edbeedf1b1d2b7a6bb898ecc upstream. A PODxt device sends 0xb2, 0xc2 or 0xf2 as a status byte for MIDI messages over USB that should otherwise have a 0xb0, 0xc0 or 0xf0 status byte. This is usually corrected by the driver on other OSes. This fixes MIDI sysex messages sent by PODxt. [ tiwai: fixed white spaces ] Signed-off-by: Artem Egorkine Cc: Link: https://lore.kernel.org/r/20221225105728.1153989-1-arteme@gmail.com Signed-off-by: Takashi Iwai Signed-off-by: Greg Kroah-Hartman commit 83c44f0ebfd075ec7526487ec1835c44aeb74e79 Author: Zhang Tianci Date: Thu Sep 1 16:29:29 2022 +0800 ovl: Use ovl mounter's fsuid and fsgid in ovl_link() commit 5b0db51215e895a361bc63132caa7cca36a53d6a upstream. There is a wrong case of link() on overlay: $ mkdir /lower /fuse /merge $ mount -t fuse /fuse $ mkdir /fuse/upper /fuse/work $ mount -t overlay /merge -o lowerdir=/lower,upperdir=/fuse/upper,\ workdir=work $ touch /merge/file $ chown bin.bin /merge/file // the file's caller becomes "bin" $ ln /merge/file /merge/lnkfile Then we will get an error(EACCES) because fuse daemon checks the link()'s caller is "bin", it denied this request. In the changing history of ovl_link(), there are two key commits: The first is commit bb0d2b8ad296 ("ovl: fix sgid on directory") which overrides the cred's fsuid/fsgid using the new inode. The new inode's owner is initialized by inode_init_owner(), and inode->fsuid is assigned to the current user. So the override fsuid becomes the current user. We know link() is actually modifying the directory, so the caller must have the MAY_WRITE permission on the directory. The current caller may should have this permission. This is acceptable to use the caller's fsuid. The second is commit 51f7e52dc943 ("ovl: share inode for hard link") which removed the inode creation in ovl_link(). This commit move inode_init_owner() into ovl_create_object(), so the ovl_link() just give the old inode to ovl_create_or_link(). Then the override fsuid becomes the old inode's fsuid, neither the caller nor the overlay's mounter! So this is incorrect. Fix this bug by using ovl mounter's fsuid/fsgid to do underlying fs's link(). Link: https://lore.kernel.org/all/20220817102952.xnvesg3a7rbv576x@wittgenstein/T Link: https://lore.kernel.org/lkml/20220825130552.29587-1-zhangtianci.1997@bytedance.com/t Signed-off-by: Zhang Tianci Signed-off-by: Jiachen Zhang Reviewed-by: Christian Brauner (Microsoft) Fixes: 51f7e52dc943 ("ovl: share inode for hard link") Cc: # v4.8 Signed-off-by: Miklos Szeredi Signed-off-by: Greg Kroah-Hartman commit fcb94283e01422cdaeba3d2771759962a17f4113 Author: Wang Yufen Date: Fri Dec 2 09:41:01 2022 +0800 binfmt: Fix error return code in load_elf_fdpic_binary() commit e7f703ff2507f4e9f496da96cd4b78fd3026120c upstream. Fix to return a negative error code from create_elf_fdpic_tables() instead of 0. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Cc: stable@vger.kernel.org Signed-off-by: Wang Yufen Signed-off-by: Kees Cook Link: https://lore.kernel.org/r/1669945261-30271-1-git-send-email-wangyufen@huawei.com Signed-off-by: Greg Kroah-Hartman commit ed9947277b2d29c1663fbe6cc9e94fa23d8cafe8 Author: Aditya Garg Date: Wed Dec 7 03:05:40 2022 +0000 hfsplus: fix bug causing custom uid and gid being unable to be assigned with mount commit 9f2b5debc07073e6dfdd774e3594d0224b991927 upstream. Despite specifying UID and GID in mount command, the specified UID and GID were not being assigned. This patch fixes this issue. Link: https://lkml.kernel.org/r/C0264BF5-059C-45CF-B8DA-3A3BD2C803A2@live.com Signed-off-by: Aditya Garg Reviewed-by: Viacheslav Dubeyko Cc: Signed-off-by: Andrew Morton Signed-off-by: Greg Kroah-Hartman commit 76d52b54127c65ef7211eb2e638e0ad5cb6c6630 Author: Qiujun Huang Date: Sun Sep 4 23:17:13 2022 +0800 pstore/zone: Use GFP_ATOMIC to allocate zone buffer commit 99b3b837855b987563bcfb397cf9ddd88262814b upstream. There is a case found when triggering a panic_on_oom, pstore fails to dump kmsg. Because psz_kmsg_write_record can't get the new buffer. Handle this by using GFP_ATOMIC to allocate a buffer at lower watermark. Signed-off-by: Qiujun Huang Fixes: 335426c6dcdd ("pstore/zone: Provide way to skip "broken" zone for MTD devices") Cc: WeiXiong Liao Cc: stable@vger.kernel.org Signed-off-by: Kees Cook Link: https://lore.kernel.org/r/CAJRQjofRCF7wjrYmw3D7zd5QZnwHQq+F8U-mJDJ6NZ4bddYdLA@mail.gmail.com Signed-off-by: Greg Kroah-Hartman commit 74b0a2fcc31a98a049cbc57815247532024b32f3 Author: Luca Stefani Date: Thu Dec 22 14:10:49 2022 +0100 pstore: Properly assign mem_type property commit beca3e311a49cd3c55a056096531737d7afa4361 upstream. If mem-type is specified in the device tree it would end up overriding the record_size field instead of populating mem_type. As record_size is currently parsed after the improper assignment with default size 0 it continued to work as expected regardless of the value found in the device tree. Simply changing the target field of the struct is enough to get mem-type working as expected. Fixes: 9d843e8fafc7 ("pstore: Add mem_type property DT parsing support") Cc: stable@vger.kernel.org Signed-off-by: Luca Stefani Signed-off-by: Kees Cook Link: https://lore.kernel.org/r/20221222131049.286288-1-luca@osomprivacy.com Signed-off-by: Greg Kroah-Hartman commit d25aac3489af809485ee83b779218cfb3ad7e8a8 Author: Terry Junge Date: Thu Dec 8 15:05:06 2022 -0800 HID: plantronics: Additional PIDs for double volume key presses quirk [ Upstream commit 3d57f36c89d8ba32b2c312f397a37fd1a2dc7cfc ] I no longer work for Plantronics (aka Poly, aka HP) and do not have access to the headsets in order to test. However, as noted by Maxim, the other 32xx models that share the same base code set as the 3220 would need the same quirk. This patch adds the PIDs for the rest of the Blackwire 32XX product family that require the quirk. Plantronics Blackwire 3210 Series (047f:c055) Plantronics Blackwire 3215 Series (047f:c057) Plantronics Blackwire 3225 Series (047f:c058) Quote from previous patch by Maxim Mikityanskiy Plantronics Blackwire 3220 Series (047f:c056) sends HID reports twice for each volume key press. This patch adds a quirk to hid-plantronics for this product ID, which will ignore the second volume key press if it happens within 5 ms from the last one that was handled. The patch was tested on the mentioned model only, it shouldn't affect other models, however, this quirk might be needed for them too. Auto-repeat (when a key is held pressed) is not affected, because the rate is about 3 times per second, which is far less frequent than once in 5 ms. End quote Signed-off-by: Terry Junge Signed-off-by: Jiri Kosina Signed-off-by: Sasha Levin commit 9d4294545c1db15d85247ad776a2e2e13e0fd856 Author: José Expósito Date: Mon Nov 28 17:57:05 2022 +0100 HID: multitouch: fix Asus ExpertBook P2 P2451FA trackpoint [ Upstream commit 4eab1c2fe06c98a4dff258dd64800b6986c101e9 ] The HID descriptor of this device contains two mouse collections, one for mouse emulation and the other for the trackpoint. Both collections get merged and, because the first one defines X and Y, the movemenent events reported by the trackpoint collection are ignored. Set the MT_CLS_WIN_8_FORCE_MULTI_INPUT class for this device to be able to receive its reports. This fix is similar to/based on commit 40d5bb87377a ("HID: multitouch: enable multi-input as a quirk for some devices"). Link: https://gitlab.freedesktop.org/libinput/libinput/-/issues/825 Reported-by: Akito Tested-by: Akito Signed-off-by: José Expósito Signed-off-by: Jiri Kosina Signed-off-by: Sasha Levin commit 7280fdb80bf0fe35d9b799fc7009f2cbe0a397d7 Author: Nathan Lynch Date: Fri Nov 18 09:07:42 2022 -0600 powerpc/rtas: avoid scheduling in rtas_os_term() [ Upstream commit 6c606e57eecc37d6b36d732b1ff7e55b7dc32dd4 ] It's unsafe to use rtas_busy_delay() to handle a busy status from the ibm,os-term RTAS function in rtas_os_term(): Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b BUG: sleeping function called from invalid context at arch/powerpc/kernel/rtas.c:618 in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 1, name: swapper/0 preempt_count: 2, expected: 0 CPU: 7 PID: 1 Comm: swapper/0 Tainted: G D 6.0.0-rc5-02182-gf8553a572277-dirty #9 Call Trace: [c000000007b8f000] [c000000001337110] dump_stack_lvl+0xb4/0x110 (unreliable) [c000000007b8f040] [c0000000002440e4] __might_resched+0x394/0x3c0 [c000000007b8f0e0] [c00000000004f680] rtas_busy_delay+0x120/0x1b0 [c000000007b8f100] [c000000000052d04] rtas_os_term+0xb8/0xf4 [c000000007b8f180] [c0000000001150fc] pseries_panic+0x50/0x68 [c000000007b8f1f0] [c000000000036354] ppc_panic_platform_handler+0x34/0x50 [c000000007b8f210] [c0000000002303c4] notifier_call_chain+0xd4/0x1c0 [c000000007b8f2b0] [c0000000002306cc] atomic_notifier_call_chain+0xac/0x1c0 [c000000007b8f2f0] [c0000000001d62b8] panic+0x228/0x4d0 [c000000007b8f390] [c0000000001e573c] do_exit+0x140c/0x1420 [c000000007b8f480] [c0000000001e586c] make_task_dead+0xdc/0x200 Use rtas_busy_delay_time() instead, which signals without side effects whether to attempt the ibm,os-term RTAS call again. Signed-off-by: Nathan Lynch Reviewed-by: Nicholas Piggin Reviewed-by: Andrew Donnellan Signed-off-by: Michael Ellerman Link: https://lore.kernel.org/r/20221118150751.469393-5-nathanl@linux.ibm.com Signed-off-by: Sasha Levin commit d8939315b7342860df143afe0adda6212cdd3193 Author: Nathan Lynch Date: Fri Nov 18 09:07:41 2022 -0600 powerpc/rtas: avoid device tree lookups in rtas_os_term() [ Upstream commit ed2213bfb192ab51f09f12e9b49b5d482c6493f3 ] rtas_os_term() is called during panic. Its behavior depends on a couple of conditions in the /rtas node of the device tree, the traversal of which entails locking and local IRQ state changes. If the kernel panics while devtree_lock is held, rtas_os_term() as currently written could hang. Instead of discovering the relevant characteristics at panic time, cache them in file-static variables at boot. Note the lookup for "ibm,extended-os-term" is converted to of_property_read_bool() since it is a boolean property, not an RTAS function token. Signed-off-by: Nathan Lynch Reviewed-by: Nicholas Piggin Reviewed-by: Andrew Donnellan [mpe: Incorporate suggested change from Nick] Signed-off-by: Michael Ellerman Link: https://lore.kernel.org/r/20221118150751.469393-4-nathanl@linux.ibm.com Signed-off-by: Sasha Levin commit 23a249b1185cdd5bfb6971d1608ba49e589f2288 Author: Christophe Leroy Date: Mon Nov 14 23:27:46 2022 +0530 objtool: Fix SEGFAULT [ Upstream commit efb11fdb3e1a9f694fa12b70b21e69e55ec59c36 ] find_insn() will return NULL in case of failure. Check insn in order to avoid a kernel Oops for NULL pointer dereference. Tested-by: Naveen N. Rao Reviewed-by: Naveen N. Rao Acked-by: Josh Poimboeuf Acked-by: Peter Zijlstra (Intel) Signed-off-by: Christophe Leroy Signed-off-by: Michael Ellerman Link: https://lore.kernel.org/r/20221114175754.1131267-9-sv@linux.ibm.com Signed-off-by: Sasha Levin commit ed686e7a26dd19ae6b46bb662f735acfa88ff7bc Author: Yin Xiujiang Date: Mon Dec 6 10:40:45 2021 +0800 fs/ntfs3: Fix slab-out-of-bounds in r_page [ Upstream commit ecfbd57cf9c5ca225184ae266ce44ae473792132 ] When PAGE_SIZE is 64K, if read_log_page is called by log_read_rst for the first time, the size of *buffer would be equal to DefaultLogPageSize(4K).But for *buffer operations like memcpy, if the memory area size(n) which being assigned to buffer is larger than 4K (log->page_size(64K) or bytes(64K-page_off)), it will cause an out of boundary error. Call trace: [...] kasan_report+0x44/0x130 check_memory_region+0xf8/0x1a0 memcpy+0xc8/0x100 ntfs_read_run_nb+0x20c/0x460 read_log_page+0xd0/0x1f4 log_read_rst+0x110/0x75c log_replay+0x1e8/0x4aa0 ntfs_loadlog_and_replay+0x290/0x2d0 ntfs_fill_super+0x508/0xec0 get_tree_bdev+0x1fc/0x34c [...] Fix this by setting variable r_page to NULL in log_read_rst. Signed-off-by: Yin Xiujiang Signed-off-by: Konstantin Komarov Signed-off-by: Sasha Levin commit dd34665cb00449504c9da90a0ef6d3f3da56b035 Author: Dan Carpenter Date: Sat Oct 15 11:28:55 2022 +0300 fs/ntfs3: Delete duplicate condition in ntfs_read_mft() [ Upstream commit 658015167a8432b88f5d032e9d85d8fd50e5bf2c ] There were two patches which addressed the same bug and added the same condition: commit 6db620863f85 ("fs/ntfs3: Validate data run offset") commit 887bfc546097 ("fs/ntfs3: Fix slab-out-of-bounds read in run_unpack") Delete one condition. Signed-off-by: Dan Carpenter Signed-off-by: Konstantin Komarov Signed-off-by: Sasha Levin commit a9847a11b683784d78893a8c8c07e5ceeda8b1c6 Author: Tetsuo Handa Date: Sun Oct 2 23:54:11 2022 +0900 fs/ntfs3: Use __GFP_NOWARN allocation at ntfs_fill_super() [ Upstream commit 59bfd7a483da36bd202532a3d9ea1f14f3bf3aaf ] syzbot is reporting too large allocation at ntfs_fill_super() [1], for a crafted filesystem can contain bogus inode->i_size. Add __GFP_NOWARN in order to avoid too large allocation warning, than exhausting memory by using kvmalloc(). Link: https://syzkaller.appspot.com/bug?extid=33f3faaa0c08744f7d40 [1] Reported-by: syzot Signed-off-by: Tetsuo Handa Signed-off-by: Konstantin Komarov Signed-off-by: Sasha Levin commit abd2ee2cf42f7fe5bb49c0d86e7d540765b0fbae Author: Tetsuo Handa Date: Sun Oct 2 23:39:15 2022 +0900 fs/ntfs3: Use __GFP_NOWARN allocation at wnd_init() [ Upstream commit 0d0f659bf713662fabed973f9996b8f23c59ca51 ] syzbot is reporting too large allocation at wnd_init() [1], for a crafted filesystem can become wnd->nwnd close to UINT_MAX. Add __GFP_NOWARN in order to avoid too large allocation warning, than exhausting memory by using kvcalloc(). Link: https://syzkaller.appspot.com/bug?extid=fa4648a5446460b7b963 [1] Reported-by: syzot Signed-off-by: Tetsuo Handa Signed-off-by: Konstantin Komarov Signed-off-by: Sasha Levin commit d7ce7bb6881aae186e50f57eea935cff8d504751 Author: Edward Lo Date: Fri Sep 30 09:58:40 2022 +0800 fs/ntfs3: Validate index root when initialize NTFS security [ Upstream commit bfcdbae0523bd95eb75a739ffb6221a37109881e ] This enhances the sanity check for $SDH and $SII while initializing NTFS security, guarantees these index root are legit. [ 162.459513] BUG: KASAN: use-after-free in hdr_find_e.isra.0+0x10c/0x320 [ 162.460176] Read of size 2 at addr ffff8880037bca99 by task mount/243 [ 162.460851] [ 162.461252] CPU: 0 PID: 243 Comm: mount Not tainted 6.0.0-rc7 #42 [ 162.461744] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014 [ 162.462609] Call Trace: [ 162.462954] [ 162.463276] dump_stack_lvl+0x49/0x63 [ 162.463822] print_report.cold+0xf5/0x689 [ 162.464608] ? unwind_get_return_address+0x3a/0x60 [ 162.465766] ? hdr_find_e.isra.0+0x10c/0x320 [ 162.466975] kasan_report+0xa7/0x130 [ 162.467506] ? _raw_spin_lock_irq+0xc0/0xf0 [ 162.467998] ? hdr_find_e.isra.0+0x10c/0x320 [ 162.468536] __asan_load2+0x68/0x90 [ 162.468923] hdr_find_e.isra.0+0x10c/0x320 [ 162.469282] ? cmp_uints+0xe0/0xe0 [ 162.469557] ? cmp_sdh+0x90/0x90 [ 162.469864] ? ni_find_attr+0x214/0x300 [ 162.470217] ? ni_load_mi+0x80/0x80 [ 162.470479] ? entry_SYSCALL_64_after_hwframe+0x63/0xcd [ 162.470931] ? ntfs_bread_run+0x190/0x190 [ 162.471307] ? indx_get_root+0xe4/0x190 [ 162.471556] ? indx_get_root+0x140/0x190 [ 162.471833] ? indx_init+0x1e0/0x1e0 [ 162.472069] ? fnd_clear+0x115/0x140 [ 162.472363] ? _raw_spin_lock_irqsave+0x100/0x100 [ 162.472731] indx_find+0x184/0x470 [ 162.473461] ? sysvec_apic_timer_interrupt+0x57/0xc0 [ 162.474429] ? indx_find_buffer+0x2d0/0x2d0 [ 162.474704] ? do_syscall_64+0x3b/0x90 [ 162.474962] dir_search_u+0x196/0x2f0 [ 162.475381] ? ntfs_nls_to_utf16+0x450/0x450 [ 162.475661] ? ntfs_security_init+0x3d6/0x440 [ 162.475906] ? is_sd_valid+0x180/0x180 [ 162.476191] ntfs_extend_init+0x13f/0x2c0 [ 162.476496] ? ntfs_fix_post_read+0x130/0x130 [ 162.476861] ? iput.part.0+0x286/0x320 [ 162.477325] ntfs_fill_super+0x11e0/0x1b50 [ 162.477709] ? put_ntfs+0x1d0/0x1d0 [ 162.477970] ? vsprintf+0x20/0x20 [ 162.478258] ? set_blocksize+0x95/0x150 [ 162.478538] get_tree_bdev+0x232/0x370 [ 162.478789] ? put_ntfs+0x1d0/0x1d0 [ 162.479038] ntfs_fs_get_tree+0x15/0x20 [ 162.479374] vfs_get_tree+0x4c/0x130 [ 162.479729] path_mount+0x654/0xfe0 [ 162.480124] ? putname+0x80/0xa0 [ 162.480484] ? finish_automount+0x2e0/0x2e0 [ 162.480894] ? putname+0x80/0xa0 [ 162.481467] ? kmem_cache_free+0x1c4/0x440 [ 162.482280] ? putname+0x80/0xa0 [ 162.482714] do_mount+0xd6/0xf0 [ 162.483264] ? path_mount+0xfe0/0xfe0 [ 162.484782] ? __kasan_check_write+0x14/0x20 [ 162.485593] __x64_sys_mount+0xca/0x110 [ 162.486024] do_syscall_64+0x3b/0x90 [ 162.486543] entry_SYSCALL_64_after_hwframe+0x63/0xcd [ 162.487141] RIP: 0033:0x7f9d374e948a [ 162.488324] Code: 48 8b 0d 11 fa 2a 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 49 89 ca b8 a5 00 00 008 [ 162.489728] RSP: 002b:00007ffe30e73d18 EFLAGS: 00000206 ORIG_RAX: 00000000000000a5 [ 162.490971] RAX: ffffffffffffffda RBX: 0000561cdb43a060 RCX: 00007f9d374e948a [ 162.491669] RDX: 0000561cdb43a260 RSI: 0000561cdb43a2e0 RDI: 0000561cdb442af0 [ 162.492050] RBP: 0000000000000000 R08: 0000561cdb43a280 R09: 0000000000000020 [ 162.492459] R10: 00000000c0ed0000 R11: 0000000000000206 R12: 0000561cdb442af0 [ 162.493183] R13: 0000561cdb43a260 R14: 0000000000000000 R15: 00000000ffffffff [ 162.493644] [ 162.493908] [ 162.494214] The buggy address belongs to the physical page: [ 162.494761] page:000000003e38a3d5 refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x37bc [ 162.496064] flags: 0xfffffc0000000(node=0|zone=1|lastcpupid=0x1fffff) [ 162.497278] raw: 000fffffc0000000 ffffea00000df1c8 ffffea00000df008 0000000000000000 [ 162.498928] raw: 0000000000000000 0000000000240000 00000000ffffffff 0000000000000000 [ 162.500542] page dumped because: kasan: bad access detected [ 162.501057] [ 162.501242] Memory state around the buggy address: [ 162.502230] ffff8880037bc980: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff [ 162.502977] ffff8880037bca00: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff [ 162.503522] >ffff8880037bca80: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff [ 162.503963] ^ [ 162.504370] ffff8880037bcb00: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff [ 162.504766] ffff8880037bcb80: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff Signed-off-by: Edward Lo Signed-off-by: Konstantin Komarov Signed-off-by: Sasha Levin commit f29676cc3a46dfc669488472d0575fc241482573 Author: Pierre-Louis Bossart Date: Tue Oct 18 09:25:00 2022 +0800 soundwire: dmi-quirks: add quirk variant for LAPBC710 NUC15 [ Upstream commit f74495761df10c25a98256d16ea7465191b6e2cd ] Some NUC15 LAPBC710 devices don't expose the same DMI information as the Intel reference, add additional entry in the match table. BugLink: https://github.com/thesofproject/linux/issues/3885 Signed-off-by: Pierre-Louis Bossart Reviewed-by: Ranjani Sridharan Signed-off-by: Bard Liao Link: https://lore.kernel.org/r/20221018012500.1592994-1-yung-chuan.liao@linux.intel.com Signed-off-by: Vinod Koul Signed-off-by: Sasha Levin commit 9c8471a17f1f15b18cb7b96cba86e6f9bd6aae1c Author: Hawkins Jiawei Date: Fri Sep 23 19:09:04 2022 +0800 fs/ntfs3: Fix slab-out-of-bounds read in run_unpack [ Upstream commit 887bfc546097fbe8071dac13b2fef73b77920899 ] Syzkaller reports slab-out-of-bounds bug as follows: ================================================================== BUG: KASAN: slab-out-of-bounds in run_unpack+0x8b7/0x970 fs/ntfs3/run.c:944 Read of size 1 at addr ffff88801bbdff02 by task syz-executor131/3611 [...] Call Trace: __dump_stack lib/dump_stack.c:88 [inline] dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:106 print_address_description mm/kasan/report.c:317 [inline] print_report.cold+0x2ba/0x719 mm/kasan/report.c:433 kasan_report+0xb1/0x1e0 mm/kasan/report.c:495 run_unpack+0x8b7/0x970 fs/ntfs3/run.c:944 run_unpack_ex+0xb0/0x7c0 fs/ntfs3/run.c:1057 ntfs_read_mft fs/ntfs3/inode.c:368 [inline] ntfs_iget5+0xc20/0x3280 fs/ntfs3/inode.c:501 ntfs_loadlog_and_replay+0x124/0x5d0 fs/ntfs3/fsntfs.c:272 ntfs_fill_super+0x1eff/0x37f0 fs/ntfs3/super.c:1018 get_tree_bdev+0x440/0x760 fs/super.c:1323 vfs_get_tree+0x89/0x2f0 fs/super.c:1530 do_new_mount fs/namespace.c:3040 [inline] path_mount+0x1326/0x1e20 fs/namespace.c:3370 do_mount fs/namespace.c:3383 [inline] __do_sys_mount fs/namespace.c:3591 [inline] __se_sys_mount fs/namespace.c:3568 [inline] __x64_sys_mount+0x27f/0x300 fs/namespace.c:3568 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd [...] The buggy address belongs to the physical page: page:ffffea00006ef600 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x1bbd8 head:ffffea00006ef600 order:3 compound_mapcount:0 compound_pincount:0 flags: 0xfff00000010200(slab|head|node=0|zone=1|lastcpupid=0x7ff) page dumped because: kasan: bad access detected Memory state around the buggy address: ffff88801bbdfe00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ffff88801bbdfe80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc >ffff88801bbdff00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ^ ffff88801bbdff80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ffff88801bbe0000: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ================================================================== Kernel will tries to read record and parse MFT from disk in ntfs_read_mft(). Yet the problem is that during enumerating attributes in record, kernel doesn't check whether run_off field loading from the disk is a valid value. To be more specific, if attr->nres.run_off is larger than attr->size, kernel will passes an invalid argument run_buf_size in run_unpack_ex(), which having an integer overflow. Then this invalid argument will triggers the slab-out-of-bounds Read bug as above. This patch solves it by adding the sanity check between the offset to packed runs and attribute size. link: https://lore.kernel.org/all/0000000000009145fc05e94bd5c3@google.com/#t Reported-and-tested-by: syzbot+8d6fbb27a6aded64b25b@syzkaller.appspotmail.com Signed-off-by: Hawkins Jiawei Signed-off-by: Konstantin Komarov Signed-off-by: Sasha Levin commit 3a52f17867727818ae8dbcfd9425033df32f92e0 Author: Edward Lo Date: Fri Sep 23 00:50:23 2022 +0800 fs/ntfs3: Validate resident attribute name [ Upstream commit 54e45702b648b7c0000e90b3e9b890e367e16ea8 ] Though we already have some sanity checks while enumerating attributes, resident attribute names aren't included. This patch checks the resident attribute names are in the valid ranges. [ 259.209031] BUG: KASAN: slab-out-of-bounds in ni_create_attr_list+0x1e1/0x850 [ 259.210770] Write of size 426 at addr ffff88800632f2b2 by task exp/255 [ 259.211551] [ 259.212035] CPU: 0 PID: 255 Comm: exp Not tainted 6.0.0-rc6 #37 [ 259.212955] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014 [ 259.214387] Call Trace: [ 259.214640] [ 259.214895] dump_stack_lvl+0x49/0x63 [ 259.215284] print_report.cold+0xf5/0x689 [ 259.215565] ? kasan_poison+0x3c/0x50 [ 259.215778] ? kasan_unpoison+0x28/0x60 [ 259.215991] ? ni_create_attr_list+0x1e1/0x850 [ 259.216270] kasan_report+0xa7/0x130 [ 259.216481] ? ni_create_attr_list+0x1e1/0x850 [ 259.216719] kasan_check_range+0x15a/0x1d0 [ 259.216939] memcpy+0x3c/0x70 [ 259.217136] ni_create_attr_list+0x1e1/0x850 [ 259.217945] ? __rcu_read_unlock+0x5b/0x280 [ 259.218384] ? ni_remove_attr+0x2e0/0x2e0 [ 259.218712] ? kernel_text_address+0xcf/0xe0 [ 259.219064] ? __kernel_text_address+0x12/0x40 [ 259.219434] ? arch_stack_walk+0x9e/0xf0 [ 259.219668] ? __this_cpu_preempt_check+0x13/0x20 [ 259.219904] ? sysvec_apic_timer_interrupt+0x57/0xc0 [ 259.220140] ? asm_sysvec_apic_timer_interrupt+0x1b/0x20 [ 259.220561] ni_ins_attr_ext+0x52c/0x5c0 [ 259.220984] ? ni_create_attr_list+0x850/0x850 [ 259.221532] ? run_deallocate+0x120/0x120 [ 259.221972] ? vfs_setxattr+0x128/0x300 [ 259.222688] ? setxattr+0x126/0x140 [ 259.222921] ? path_setxattr+0x164/0x180 [ 259.223431] ? __x64_sys_setxattr+0x6d/0x80 [ 259.223828] ? entry_SYSCALL_64_after_hwframe+0x63/0xcd [ 259.224417] ? mi_find_attr+0x3c/0xf0 [ 259.224772] ni_insert_attr+0x1ba/0x420 [ 259.225216] ? ni_ins_attr_ext+0x5c0/0x5c0 [ 259.225504] ? ntfs_read_ea+0x119/0x450 [ 259.225775] ni_insert_resident+0xc0/0x1c0 [ 259.226316] ? ni_insert_nonresident+0x400/0x400 [ 259.227001] ? __kasan_kmalloc+0x88/0xb0 [ 259.227468] ? __kmalloc+0x192/0x320 [ 259.227773] ntfs_set_ea+0x6bf/0xb30 [ 259.228216] ? ftrace_graph_ret_addr+0x2a/0xb0 [ 259.228494] ? entry_SYSCALL_64_after_hwframe+0x63/0xcd [ 259.228838] ? ntfs_read_ea+0x450/0x450 [ 259.229098] ? is_bpf_text_address+0x24/0x40 [ 259.229418] ? kernel_text_address+0xcf/0xe0 [ 259.229681] ? __kernel_text_address+0x12/0x40 [ 259.229948] ? unwind_get_return_address+0x3a/0x60 [ 259.230271] ? write_profile+0x270/0x270 [ 259.230537] ? arch_stack_walk+0x9e/0xf0 [ 259.230836] ntfs_setxattr+0x114/0x5c0 [ 259.231099] ? ntfs_set_acl_ex+0x2e0/0x2e0 [ 259.231529] ? evm_protected_xattr_common+0x6d/0x100 [ 259.231817] ? posix_xattr_acl+0x13/0x80 [ 259.232073] ? evm_protect_xattr+0x1f7/0x440 [ 259.232351] __vfs_setxattr+0xda/0x120 [ 259.232635] ? xattr_resolve_name+0x180/0x180 [ 259.232912] __vfs_setxattr_noperm+0x93/0x300 [ 259.233219] __vfs_setxattr_locked+0x141/0x160 [ 259.233492] ? kasan_poison+0x3c/0x50 [ 259.233744] vfs_setxattr+0x128/0x300 [ 259.234002] ? __vfs_setxattr_locked+0x160/0x160 [ 259.234837] do_setxattr+0xb8/0x170 [ 259.235567] ? vmemdup_user+0x53/0x90 [ 259.236212] setxattr+0x126/0x140 [ 259.236491] ? do_setxattr+0x170/0x170 [ 259.236791] ? debug_smp_processor_id+0x17/0x20 [ 259.237232] ? kasan_quarantine_put+0x57/0x180 [ 259.237605] ? putname+0x80/0xa0 [ 259.237870] ? __kasan_slab_free+0x11c/0x1b0 [ 259.238234] ? putname+0x80/0xa0 [ 259.238500] ? preempt_count_sub+0x18/0xc0 [ 259.238775] ? __mnt_want_write+0xaa/0x100 [ 259.238990] ? mnt_want_write+0x8b/0x150 [ 259.239290] path_setxattr+0x164/0x180 [ 259.239605] ? setxattr+0x140/0x140 [ 259.239849] ? debug_smp_processor_id+0x17/0x20 [ 259.240174] ? fpregs_assert_state_consistent+0x67/0x80 [ 259.240411] __x64_sys_setxattr+0x6d/0x80 [ 259.240715] do_syscall_64+0x3b/0x90 [ 259.240934] entry_SYSCALL_64_after_hwframe+0x63/0xcd [ 259.241697] RIP: 0033:0x7fc6b26e4469 [ 259.242647] Code: 00 f3 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 088 [ 259.244512] RSP: 002b:00007ffc3c7841f8 EFLAGS: 00000217 ORIG_RAX: 00000000000000bc [ 259.245086] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fc6b26e4469 [ 259.246025] RDX: 00007ffc3c784380 RSI: 00007ffc3c7842e0 RDI: 00007ffc3c784238 [ 259.246961] RBP: 00007ffc3c788410 R08: 0000000000000001 R09: 00007ffc3c7884f8 [ 259.247775] R10: 000000000000007f R11: 0000000000000217 R12: 00000000004004e0 [ 259.248534] R13: 00007ffc3c7884f0 R14: 0000000000000000 R15: 0000000000000000 [ 259.249368] [ 259.249644] [ 259.249888] Allocated by task 255: [ 259.250283] kasan_save_stack+0x26/0x50 [ 259.250957] __kasan_kmalloc+0x88/0xb0 [ 259.251826] __kmalloc+0x192/0x320 [ 259.252745] ni_create_attr_list+0x11e/0x850 [ 259.253298] ni_ins_attr_ext+0x52c/0x5c0 [ 259.253685] ni_insert_attr+0x1ba/0x420 [ 259.253974] ni_insert_resident+0xc0/0x1c0 [ 259.254311] ntfs_set_ea+0x6bf/0xb30 [ 259.254629] ntfs_setxattr+0x114/0x5c0 [ 259.254859] __vfs_setxattr+0xda/0x120 [ 259.255155] __vfs_setxattr_noperm+0x93/0x300 [ 259.255445] __vfs_setxattr_locked+0x141/0x160 [ 259.255862] vfs_setxattr+0x128/0x300 [ 259.256251] do_setxattr+0xb8/0x170 [ 259.256522] setxattr+0x126/0x140 [ 259.256911] path_setxattr+0x164/0x180 [ 259.257308] __x64_sys_setxattr+0x6d/0x80 [ 259.257637] do_syscall_64+0x3b/0x90 [ 259.257970] entry_SYSCALL_64_after_hwframe+0x63/0xcd [ 259.258550] [ 259.258772] The buggy address belongs to the object at ffff88800632f000 [ 259.258772] which belongs to the cache kmalloc-1k of size 1024 [ 259.260190] The buggy address is located 690 bytes inside of [ 259.260190] 1024-byte region [ffff88800632f000, ffff88800632f400) [ 259.261412] [ 259.261743] The buggy address belongs to the physical page: [ 259.262354] page:0000000081e8cac9 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x632c [ 259.263722] head:0000000081e8cac9 order:2 compound_mapcount:0 compound_pincount:0 [ 259.264284] flags: 0xfffffc0010200(slab|head|node=0|zone=1|lastcpupid=0x1fffff) [ 259.265312] raw: 000fffffc0010200 ffffea0000060d00 dead000000000004 ffff888001041dc0 [ 259.265772] raw: 0000000000000000 0000000080080008 00000001ffffffff 0000000000000000 [ 259.266305] page dumped because: kasan: bad access detected [ 259.266588] [ 259.266728] Memory state around the buggy address: [ 259.267225] ffff88800632f300: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [ 259.267841] ffff88800632f380: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [ 259.269111] >ffff88800632f400: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 259.269626] ^ [ 259.270162] ffff88800632f480: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 259.270810] ffff88800632f500: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc Signed-off-by: Edward Lo Signed-off-by: Konstantin Komarov Signed-off-by: Sasha Levin commit 3cd9e5b41b83bb57ac3cf9888f9fef2a6ef8ed96 Author: Edward Lo Date: Thu Sep 22 15:30:44 2022 +0800 fs/ntfs3: Validate buffer length while parsing index [ Upstream commit 4d42ecda239cc13738d6fd84d098a32e67b368b9 ] indx_read is called when we have some NTFS directory operations that need more information from the index buffers. This adds a sanity check to make sure the returned index buffer length is legit, or we may have some out-of-bound memory accesses. [ 560.897595] BUG: KASAN: slab-out-of-bounds in hdr_find_e.isra.0+0x10c/0x320 [ 560.898321] Read of size 2 at addr ffff888009497238 by task exp/245 [ 560.898760] [ 560.899129] CPU: 0 PID: 245 Comm: exp Not tainted 6.0.0-rc6 #37 [ 560.899505] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014 [ 560.900170] Call Trace: [ 560.900407] [ 560.900732] dump_stack_lvl+0x49/0x63 [ 560.901108] print_report.cold+0xf5/0x689 [ 560.901395] ? hdr_find_e.isra.0+0x10c/0x320 [ 560.901716] kasan_report+0xa7/0x130 [ 560.901950] ? hdr_find_e.isra.0+0x10c/0x320 [ 560.902208] __asan_load2+0x68/0x90 [ 560.902427] hdr_find_e.isra.0+0x10c/0x320 [ 560.902846] ? cmp_uints+0xe0/0xe0 [ 560.903363] ? cmp_sdh+0x90/0x90 [ 560.903883] ? ntfs_bread_run+0x190/0x190 [ 560.904196] ? rwsem_down_read_slowpath+0x750/0x750 [ 560.904969] ? ntfs_fix_post_read+0xe0/0x130 [ 560.905259] ? __kasan_check_write+0x14/0x20 [ 560.905599] ? up_read+0x1a/0x90 [ 560.905853] ? indx_read+0x22c/0x380 [ 560.906096] indx_find+0x2ef/0x470 [ 560.906352] ? indx_find_buffer+0x2d0/0x2d0 [ 560.906692] ? __kasan_kmalloc+0x88/0xb0 [ 560.906977] dir_search_u+0x196/0x2f0 [ 560.907220] ? ntfs_nls_to_utf16+0x450/0x450 [ 560.907464] ? __kasan_check_write+0x14/0x20 [ 560.907747] ? mutex_lock+0x8f/0xe0 [ 560.907970] ? __mutex_lock_slowpath+0x20/0x20 [ 560.908214] ? kmem_cache_alloc+0x143/0x4b0 [ 560.908459] ntfs_lookup+0xe0/0x100 [ 560.908788] __lookup_slow+0x116/0x220 [ 560.909050] ? lookup_fast+0x1b0/0x1b0 [ 560.909309] ? lookup_fast+0x13f/0x1b0 [ 560.909601] walk_component+0x187/0x230 [ 560.909944] link_path_walk.part.0+0x3f0/0x660 [ 560.910285] ? handle_lookup_down+0x90/0x90 [ 560.910618] ? path_init+0x642/0x6e0 [ 560.911084] ? percpu_counter_add_batch+0x6e/0xf0 [ 560.912559] ? __alloc_file+0x114/0x170 [ 560.913008] path_openat+0x19c/0x1d10 [ 560.913419] ? getname_flags+0x73/0x2b0 [ 560.913815] ? kasan_save_stack+0x3a/0x50 [ 560.914125] ? kasan_save_stack+0x26/0x50 [ 560.914542] ? __kasan_slab_alloc+0x6d/0x90 [ 560.914924] ? kmem_cache_alloc+0x143/0x4b0 [ 560.915339] ? getname_flags+0x73/0x2b0 [ 560.915647] ? getname+0x12/0x20 [ 560.916114] ? __x64_sys_open+0x4c/0x60 [ 560.916460] ? path_lookupat.isra.0+0x230/0x230 [ 560.916867] ? __isolate_free_page+0x2e0/0x2e0 [ 560.917194] do_filp_open+0x15c/0x1f0 [ 560.917448] ? may_open_dev+0x60/0x60 [ 560.917696] ? expand_files+0xa4/0x3a0 [ 560.917923] ? __kasan_check_write+0x14/0x20 [ 560.918185] ? _raw_spin_lock+0x88/0xdb [ 560.918409] ? _raw_spin_lock_irqsave+0x100/0x100 [ 560.918783] ? _find_next_bit+0x4a/0x130 [ 560.919026] ? _raw_spin_unlock+0x19/0x40 [ 560.919276] ? alloc_fd+0x14b/0x2d0 [ 560.919635] do_sys_openat2+0x32a/0x4b0 [ 560.920035] ? file_open_root+0x230/0x230 [ 560.920336] ? __rcu_read_unlock+0x5b/0x280 [ 560.920813] do_sys_open+0x99/0xf0 [ 560.921208] ? filp_open+0x60/0x60 [ 560.921482] ? exit_to_user_mode_prepare+0x49/0x180 [ 560.921867] __x64_sys_open+0x4c/0x60 [ 560.922128] do_syscall_64+0x3b/0x90 [ 560.922369] entry_SYSCALL_64_after_hwframe+0x63/0xcd [ 560.923030] RIP: 0033:0x7f7dff2e4469 [ 560.923681] Code: 00 f3 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 088 [ 560.924451] RSP: 002b:00007ffd41a210b8 EFLAGS: 00000206 ORIG_RAX: 0000000000000002 [ 560.925168] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f7dff2e4469 [ 560.925655] RDX: 0000000000000000 RSI: 0000000000000002 RDI: 00007ffd41a211f0 [ 560.926085] RBP: 00007ffd41a252a0 R08: 00007f7dff60fba0 R09: 00007ffd41a25388 [ 560.926405] R10: 0000000000400b80 R11: 0000000000000206 R12: 00000000004004e0 [ 560.926867] R13: 00007ffd41a25380 R14: 0000000000000000 R15: 0000000000000000 [ 560.927241] [ 560.927491] [ 560.927755] Allocated by task 245: [ 560.928409] kasan_save_stack+0x26/0x50 [ 560.929271] __kasan_kmalloc+0x88/0xb0 [ 560.929778] __kmalloc+0x192/0x320 [ 560.930023] indx_read+0x249/0x380 [ 560.930224] indx_find+0x2a2/0x470 [ 560.930695] dir_search_u+0x196/0x2f0 [ 560.930892] ntfs_lookup+0xe0/0x100 [ 560.931115] __lookup_slow+0x116/0x220 [ 560.931323] walk_component+0x187/0x230 [ 560.931570] link_path_walk.part.0+0x3f0/0x660 [ 560.931791] path_openat+0x19c/0x1d10 [ 560.932008] do_filp_open+0x15c/0x1f0 [ 560.932226] do_sys_openat2+0x32a/0x4b0 [ 560.932413] do_sys_open+0x99/0xf0 [ 560.932709] __x64_sys_open+0x4c/0x60 [ 560.933417] do_syscall_64+0x3b/0x90 [ 560.933776] entry_SYSCALL_64_after_hwframe+0x63/0xcd [ 560.934235] [ 560.934486] The buggy address belongs to the object at ffff888009497000 [ 560.934486] which belongs to the cache kmalloc-512 of size 512 [ 560.935239] The buggy address is located 56 bytes to the right of [ 560.935239] 512-byte region [ffff888009497000, ffff888009497200) [ 560.936153] [ 560.937326] The buggy address belongs to the physical page: [ 560.938228] page:0000000062a3dfae refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x9496 [ 560.939616] head:0000000062a3dfae order:1 compound_mapcount:0 compound_pincount:0 [ 560.940219] flags: 0xfffffc0010200(slab|head|node=0|zone=1|lastcpupid=0x1fffff) [ 560.942702] raw: 000fffffc0010200 ffffea0000164f80 dead000000000005 ffff888001041c80 [ 560.943932] raw: 0000000000000000 0000000080080008 00000001ffffffff 0000000000000000 [ 560.944568] page dumped because: kasan: bad access detected [ 560.945735] [ 560.946112] Memory state around the buggy address: [ 560.946870] ffff888009497100: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [ 560.947242] ffff888009497180: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [ 560.947611] >ffff888009497200: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 560.947915] ^ [ 560.948249] ffff888009497280: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 560.948687] ffff888009497300: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc Signed-off-by: Edward Lo Signed-off-by: Konstantin Komarov Signed-off-by: Sasha Levin commit c878a915bcb992c12a97ebae1013e377158f560a Author: Edward Lo Date: Fri Sep 9 09:04:00 2022 +0800 fs/ntfs3: Validate attribute name offset [ Upstream commit 4f1dc7d9756e66f3f876839ea174df2e656b7f79 ] Although the attribute name length is checked before comparing it to some common names (e.g., $I30), the offset isn't. This adds a sanity check for the attribute name offset, guarantee the validity and prevent possible out-of-bound memory accesses. [ 191.720056] BUG: unable to handle page fault for address: ffffebde00000008 [ 191.721060] #PF: supervisor read access in kernel mode [ 191.721586] #PF: error_code(0x0000) - not-present page [ 191.722079] PGD 0 P4D 0 [ 191.722571] Oops: 0000 [#1] PREEMPT SMP KASAN NOPTI [ 191.723179] CPU: 0 PID: 244 Comm: mount Not tainted 6.0.0-rc4 #28 [ 191.723749] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014 [ 191.724832] RIP: 0010:kfree+0x56/0x3b0 [ 191.725870] Code: 80 48 01 d8 0f 82 65 03 00 00 48 c7 c2 00 00 00 80 48 2b 15 2c 06 dd 01 48 01 d0 48 c1 e8 0c 48 c1 e0 06 48 03 05 0a 069 [ 191.727375] RSP: 0018:ffff8880076f7878 EFLAGS: 00000286 [ 191.727897] RAX: ffffebde00000000 RBX: 0000000000000040 RCX: ffffffff8528d5b9 [ 191.728531] RDX: 0000777f80000000 RSI: ffffffff8522d49c RDI: 0000000000000040 [ 191.729183] RBP: ffff8880076f78a0 R08: 0000000000000000 R09: 0000000000000000 [ 191.729628] R10: ffff888008949fd8 R11: ffffed10011293fd R12: 0000000000000040 [ 191.730158] R13: ffff888008949f98 R14: ffff888008949ec0 R15: ffff888008949fb0 [ 191.730645] FS: 00007f3520cd7e40(0000) GS:ffff88805ba00000(0000) knlGS:0000000000000000 [ 191.731328] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 191.731667] CR2: ffffebde00000008 CR3: 0000000009704000 CR4: 00000000000006f0 [ 191.732568] Call Trace: [ 191.733231] [ 191.733860] kvfree+0x2c/0x40 [ 191.734632] ni_clear+0x180/0x290 [ 191.735085] ntfs_evict_inode+0x45/0x70 [ 191.735495] evict+0x199/0x280 [ 191.735996] iput.part.0+0x286/0x320 [ 191.736438] iput+0x32/0x50 [ 191.736811] iget_failed+0x23/0x30 [ 191.737270] ntfs_iget5+0x337/0x1890 [ 191.737629] ? ntfs_clear_mft_tail+0x20/0x260 [ 191.738201] ? ntfs_get_block_bmap+0x70/0x70 [ 191.738482] ? ntfs_objid_init+0xf6/0x140 [ 191.738779] ? ntfs_reparse_init+0x140/0x140 [ 191.739266] ntfs_fill_super+0x121b/0x1b50 [ 191.739623] ? put_ntfs+0x1d0/0x1d0 [ 191.739984] ? asm_sysvec_apic_timer_interrupt+0x1b/0x20 [ 191.740466] ? put_ntfs+0x1d0/0x1d0 [ 191.740787] ? sb_set_blocksize+0x6a/0x80 [ 191.741272] get_tree_bdev+0x232/0x370 [ 191.741829] ? put_ntfs+0x1d0/0x1d0 [ 191.742669] ntfs_fs_get_tree+0x15/0x20 [ 191.743132] vfs_get_tree+0x4c/0x130 [ 191.743457] path_mount+0x654/0xfe0 [ 191.743938] ? putname+0x80/0xa0 [ 191.744271] ? finish_automount+0x2e0/0x2e0 [ 191.744582] ? putname+0x80/0xa0 [ 191.745053] ? kmem_cache_free+0x1c4/0x440 [ 191.745403] ? putname+0x80/0xa0 [ 191.745616] do_mount+0xd6/0xf0 [ 191.745887] ? path_mount+0xfe0/0xfe0 [ 191.746287] ? __kasan_check_write+0x14/0x20 [ 191.746582] __x64_sys_mount+0xca/0x110 [ 191.746850] do_syscall_64+0x3b/0x90 [ 191.747122] entry_SYSCALL_64_after_hwframe+0x63/0xcd [ 191.747517] RIP: 0033:0x7f351fee948a [ 191.748332] Code: 48 8b 0d 11 fa 2a 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 49 89 ca b8 a5 00 00 008 [ 191.749341] RSP: 002b:00007ffd51cf3af8 EFLAGS: 00000202 ORIG_RAX: 00000000000000a5 [ 191.749960] RAX: ffffffffffffffda RBX: 000055b903733060 RCX: 00007f351fee948a [ 191.750589] RDX: 000055b903733260 RSI: 000055b9037332e0 RDI: 000055b90373bce0 [ 191.751115] RBP: 0000000000000000 R08: 000055b903733280 R09: 0000000000000020 [ 191.751537] R10: 00000000c0ed0000 R11: 0000000000000202 R12: 000055b90373bce0 [ 191.751946] R13: 000055b903733260 R14: 0000000000000000 R15: 00000000ffffffff [ 191.752519] [ 191.752782] Modules linked in: [ 191.753785] CR2: ffffebde00000008 [ 191.754937] ---[ end trace 0000000000000000 ]--- [ 191.755429] RIP: 0010:kfree+0x56/0x3b0 [ 191.755725] Code: 80 48 01 d8 0f 82 65 03 00 00 48 c7 c2 00 00 00 80 48 2b 15 2c 06 dd 01 48 01 d0 48 c1 e8 0c 48 c1 e0 06 48 03 05 0a 069 [ 191.756744] RSP: 0018:ffff8880076f7878 EFLAGS: 00000286 [ 191.757218] RAX: ffffebde00000000 RBX: 0000000000000040 RCX: ffffffff8528d5b9 [ 191.757580] RDX: 0000777f80000000 RSI: ffffffff8522d49c RDI: 0000000000000040 [ 191.758016] RBP: ffff8880076f78a0 R08: 0000000000000000 R09: 0000000000000000 [ 191.758570] R10: ffff888008949fd8 R11: ffffed10011293fd R12: 0000000000000040 [ 191.758957] R13: ffff888008949f98 R14: ffff888008949ec0 R15: ffff888008949fb0 [ 191.759317] FS: 00007f3520cd7e40(0000) GS:ffff88805ba00000(0000) knlGS:0000000000000000 [ 191.759711] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 191.760118] CR2: ffffebde00000008 CR3: 0000000009704000 CR4: 00000000000006f0 Signed-off-by: Edward Lo Signed-off-by: Konstantin Komarov Signed-off-by: Sasha Levin commit f62506f5e45afbb01c84c3f28a2878b320a0b0f7 Author: Edward Lo Date: Fri Sep 9 09:03:10 2022 +0800 fs/ntfs3: Add null pointer check for inode operations [ Upstream commit c1ca8ef0262b25493631ecbd9cb8c9893e1481a1 ] This adds a sanity check for the i_op pointer of the inode which is returned after reading Root directory MFT record. We should check the i_op is valid before trying to create the root dentry, otherwise we may encounter a NPD while mounting a image with a funny Root directory MFT record. [ 114.484325] BUG: kernel NULL pointer dereference, address: 0000000000000008 [ 114.484811] #PF: supervisor read access in kernel mode [ 114.485084] #PF: error_code(0x0000) - not-present page [ 114.485606] PGD 0 P4D 0 [ 114.485975] Oops: 0000 [#1] PREEMPT SMP KASAN NOPTI [ 114.486570] CPU: 0 PID: 237 Comm: mount Tainted: G B 6.0.0-rc4 #28 [ 114.486977] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014 [ 114.488169] RIP: 0010:d_flags_for_inode+0xe0/0x110 [ 114.488816] Code: 24 f7 ff 49 83 3e 00 74 41 41 83 cd 02 66 44 89 6b 02 eb 92 48 8d 7b 20 e8 6d 24 f7 ff 4c 8b 73 20 49 8d 7e 08 e8 60 241 [ 114.490326] RSP: 0018:ffff8880065e7aa8 EFLAGS: 00000296 [ 114.490695] RAX: 0000000000000001 RBX: ffff888008ccd750 RCX: ffffffff84af2aea [ 114.490986] RDX: 0000000000000001 RSI: 0000000000000008 RDI: ffffffff87abd020 [ 114.491364] RBP: ffff8880065e7ac8 R08: 0000000000000001 R09: fffffbfff0f57a05 [ 114.491675] R10: ffffffff87abd027 R11: fffffbfff0f57a04 R12: 0000000000000000 [ 114.491954] R13: 0000000000000008 R14: 0000000000000000 R15: ffff888008ccd750 [ 114.492397] FS: 00007fdc8a627e40(0000) GS:ffff888058200000(0000) knlGS:0000000000000000 [ 114.492797] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 114.493150] CR2: 0000000000000008 CR3: 00000000013ba000 CR4: 00000000000006f0 [ 114.493671] Call Trace: [ 114.493890] [ 114.494075] __d_instantiate+0x24/0x1c0 [ 114.494505] d_instantiate.part.0+0x35/0x50 [ 114.494754] d_make_root+0x53/0x80 [ 114.494998] ntfs_fill_super+0x1232/0x1b50 [ 114.495260] ? put_ntfs+0x1d0/0x1d0 [ 114.495499] ? vsprintf+0x20/0x20 [ 114.495723] ? set_blocksize+0x95/0x150 [ 114.495964] get_tree_bdev+0x232/0x370 [ 114.496272] ? put_ntfs+0x1d0/0x1d0 [ 114.496502] ntfs_fs_get_tree+0x15/0x20 [ 114.496859] vfs_get_tree+0x4c/0x130 [ 114.497099] path_mount+0x654/0xfe0 [ 114.497507] ? putname+0x80/0xa0 [ 114.497933] ? finish_automount+0x2e0/0x2e0 [ 114.498362] ? putname+0x80/0xa0 [ 114.498571] ? kmem_cache_free+0x1c4/0x440 [ 114.498819] ? putname+0x80/0xa0 [ 114.499069] do_mount+0xd6/0xf0 [ 114.499343] ? path_mount+0xfe0/0xfe0 [ 114.499683] ? __kasan_check_write+0x14/0x20 [ 114.500133] __x64_sys_mount+0xca/0x110 [ 114.500592] do_syscall_64+0x3b/0x90 [ 114.500930] entry_SYSCALL_64_after_hwframe+0x63/0xcd [ 114.501294] RIP: 0033:0x7fdc898e948a [ 114.501542] Code: 48 8b 0d 11 fa 2a 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 49 89 ca b8 a5 00 00 008 [ 114.502716] RSP: 002b:00007ffd793e58f8 EFLAGS: 00000202 ORIG_RAX: 00000000000000a5 [ 114.503175] RAX: ffffffffffffffda RBX: 0000564b2228f060 RCX: 00007fdc898e948a [ 114.503588] RDX: 0000564b2228f260 RSI: 0000564b2228f2e0 RDI: 0000564b22297ce0 [ 114.504925] RBP: 0000000000000000 R08: 0000564b2228f280 R09: 0000000000000020 [ 114.505484] R10: 00000000c0ed0000 R11: 0000000000000202 R12: 0000564b22297ce0 [ 114.505823] R13: 0000564b2228f260 R14: 0000000000000000 R15: 00000000ffffffff [ 114.506562] [ 114.506887] Modules linked in: [ 114.507648] CR2: 0000000000000008 [ 114.508884] ---[ end trace 0000000000000000 ]--- [ 114.509675] RIP: 0010:d_flags_for_inode+0xe0/0x110 [ 114.510140] Code: 24 f7 ff 49 83 3e 00 74 41 41 83 cd 02 66 44 89 6b 02 eb 92 48 8d 7b 20 e8 6d 24 f7 ff 4c 8b 73 20 49 8d 7e 08 e8 60 241 [ 114.511762] RSP: 0018:ffff8880065e7aa8 EFLAGS: 00000296 [ 114.512401] RAX: 0000000000000001 RBX: ffff888008ccd750 RCX: ffffffff84af2aea [ 114.513103] RDX: 0000000000000001 RSI: 0000000000000008 RDI: ffffffff87abd020 [ 114.513512] RBP: ffff8880065e7ac8 R08: 0000000000000001 R09: fffffbfff0f57a05 [ 114.513831] R10: ffffffff87abd027 R11: fffffbfff0f57a04 R12: 0000000000000000 [ 114.514757] R13: 0000000000000008 R14: 0000000000000000 R15: ffff888008ccd750 [ 114.515411] FS: 00007fdc8a627e40(0000) GS:ffff888058200000(0000) knlGS:0000000000000000 [ 114.515794] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 114.516208] CR2: 0000000000000008 CR3: 00000000013ba000 CR4: 00000000000006f0 Signed-off-by: Edward Lo Signed-off-by: Konstantin Komarov Signed-off-by: Sasha Levin commit 2dd9ccfb06bcdad30ad92d96c3affa38a458679e Author: Shigeru Yoshida Date: Tue Aug 23 19:32:05 2022 +0900 fs/ntfs3: Fix memory leak on ntfs_fill_super() error path [ Upstream commit 51e76a232f8c037f1d9e9922edc25b003d5f3414 ] syzbot reported kmemleak as below: BUG: memory leak unreferenced object 0xffff8880122f1540 (size 32): comm "a.out", pid 6664, jiffies 4294939771 (age 25.500s) hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00 00 00 00 00 00 00 00 ed ff ed ff 00 00 00 00 ................ backtrace: [] ntfs_init_fs_context+0x22/0x1c0 [] alloc_fs_context+0x217/0x430 [] path_mount+0x704/0x1080 [] __x64_sys_mount+0x18c/0x1d0 [] do_syscall_64+0x34/0xb0 [] entry_SYSCALL_64_after_hwframe+0x63/0xcd This patch fixes this issue by freeing mount options on error path of ntfs_fill_super(). Reported-by: syzbot+9d67170b20e8f94351c8@syzkaller.appspotmail.com Signed-off-by: Shigeru Yoshida Signed-off-by: Konstantin Komarov Signed-off-by: Sasha Levin commit ea6b3598406c58c5d09b6f4328e09616c077597f Author: Edward Lo Date: Sun Aug 7 01:05:18 2022 +0800 fs/ntfs3: Add null pointer check to attr_load_runs_vcn [ Upstream commit 2681631c29739509eec59cc0b34e977bb04c6cf1 ] Some metadata files are handled before MFT. This adds a null pointer check for some corner cases that could lead to NPD while reading these metadata files for a malformed NTFS image. [ 240.190827] BUG: kernel NULL pointer dereference, address: 0000000000000158 [ 240.191583] #PF: supervisor read access in kernel mode [ 240.191956] #PF: error_code(0x0000) - not-present page [ 240.192391] PGD 0 P4D 0 [ 240.192897] Oops: 0000 [#1] PREEMPT SMP KASAN NOPTI [ 240.193805] CPU: 0 PID: 242 Comm: mount Tainted: G B 5.19.0+ #17 [ 240.194477] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014 [ 240.195152] RIP: 0010:ni_find_attr+0xae/0x300 [ 240.195679] Code: c8 48 c7 45 88 c0 4e 5e 86 c7 00 f1 f1 f1 f1 c7 40 04 00 f3 f3 f3 65 48 8b 04 25 28 00 00 00 48 89 45 d0 31 c0 e8 e2 d9f [ 240.196642] RSP: 0018:ffff88800812f690 EFLAGS: 00000286 [ 240.197019] RAX: 0000000000000001 RBX: 0000000000000000 RCX: ffffffff85ef037a [ 240.197523] RDX: 0000000000000001 RSI: 0000000000000008 RDI: ffffffff88e95f60 [ 240.197877] RBP: ffff88800812f738 R08: 0000000000000001 R09: fffffbfff11d2bed [ 240.198292] R10: ffffffff88e95f67 R11: fffffbfff11d2bec R12: 0000000000000000 [ 240.198647] R13: 0000000000000080 R14: 0000000000000000 R15: 0000000000000000 [ 240.199410] FS: 00007f233c33be40(0000) GS:ffff888058200000(0000) knlGS:0000000000000000 [ 240.199895] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 240.200314] CR2: 0000000000000158 CR3: 0000000004d32000 CR4: 00000000000006f0 [ 240.200839] Call Trace: [ 240.201104] [ 240.201502] ? ni_load_mi+0x80/0x80 [ 240.202297] ? ___slab_alloc+0x465/0x830 [ 240.202614] attr_load_runs_vcn+0x8c/0x1a0 [ 240.202886] ? __kasan_slab_alloc+0x32/0x90 [ 240.203157] ? attr_data_write_resident+0x250/0x250 [ 240.203543] mi_read+0x133/0x2c0 [ 240.203785] mi_get+0x70/0x140 [ 240.204012] ni_load_mi_ex+0xfa/0x190 [ 240.204346] ? ni_std5+0x90/0x90 [ 240.204588] ? __kasan_kmalloc+0x88/0xb0 [ 240.204859] ni_enum_attr_ex+0xf1/0x1c0 [ 240.205107] ? ni_fname_type.part.0+0xd0/0xd0 [ 240.205600] ? ntfs_load_attr_list+0xbe/0x300 [ 240.205864] ? ntfs_cmp_names_cpu+0x125/0x180 [ 240.206157] ntfs_iget5+0x56c/0x1870 [ 240.206510] ? ntfs_get_block_bmap+0x70/0x70 [ 240.206776] ? __kasan_kmalloc+0x88/0xb0 [ 240.207030] ? set_blocksize+0x95/0x150 [ 240.207545] ntfs_fill_super+0xb8f/0x1e20 [ 240.207839] ? put_ntfs+0x1d0/0x1d0 [ 240.208069] ? vsprintf+0x20/0x20 [ 240.208467] ? mutex_unlock+0x81/0xd0 [ 240.208846] ? set_blocksize+0x95/0x150 [ 240.209221] get_tree_bdev+0x232/0x370 [ 240.209804] ? put_ntfs+0x1d0/0x1d0 [ 240.210519] ntfs_fs_get_tree+0x15/0x20 [ 240.210991] vfs_get_tree+0x4c/0x130 [ 240.211455] path_mount+0x645/0xfd0 [ 240.211806] ? putname+0x80/0xa0 [ 240.212112] ? finish_automount+0x2e0/0x2e0 [ 240.212559] ? kmem_cache_free+0x110/0x390 [ 240.212906] ? putname+0x80/0xa0 [ 240.213329] do_mount+0xd6/0xf0 [ 240.213829] ? path_mount+0xfd0/0xfd0 [ 240.214246] ? __kasan_check_write+0x14/0x20 [ 240.214774] __x64_sys_mount+0xca/0x110 [ 240.215080] do_syscall_64+0x3b/0x90 [ 240.215442] entry_SYSCALL_64_after_hwframe+0x63/0xcd [ 240.215811] RIP: 0033:0x7f233b4e948a [ 240.216104] Code: 48 8b 0d 11 fa 2a 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 49 89 ca b8 a5 00 00 008 [ 240.217615] RSP: 002b:00007fff02211ec8 EFLAGS: 00000202 ORIG_RAX: 00000000000000a5 [ 240.218718] RAX: ffffffffffffffda RBX: 0000561cdc35b060 RCX: 00007f233b4e948a [ 240.219556] RDX: 0000561cdc35b260 RSI: 0000561cdc35b2e0 RDI: 0000561cdc363af0 [ 240.219975] RBP: 0000000000000000 R08: 0000561cdc35b280 R09: 0000000000000020 [ 240.220403] R10: 00000000c0ed0000 R11: 0000000000000202 R12: 0000561cdc363af0 [ 240.220803] R13: 0000561cdc35b260 R14: 0000000000000000 R15: 00000000ffffffff [ 240.221256] [ 240.221567] Modules linked in: [ 240.222028] CR2: 0000000000000158 [ 240.223291] ---[ end trace 0000000000000000 ]--- [ 240.223669] RIP: 0010:ni_find_attr+0xae/0x300 [ 240.224058] Code: c8 48 c7 45 88 c0 4e 5e 86 c7 00 f1 f1 f1 f1 c7 40 04 00 f3 f3 f3 65 48 8b 04 25 28 00 00 00 48 89 45 d0 31 c0 e8 e2 d9f [ 240.225033] RSP: 0018:ffff88800812f690 EFLAGS: 00000286 [ 240.225968] RAX: 0000000000000001 RBX: 0000000000000000 RCX: ffffffff85ef037a [ 240.226624] RDX: 0000000000000001 RSI: 0000000000000008 RDI: ffffffff88e95f60 [ 240.227307] RBP: ffff88800812f738 R08: 0000000000000001 R09: fffffbfff11d2bed [ 240.227816] R10: ffffffff88e95f67 R11: fffffbfff11d2bec R12: 0000000000000000 [ 240.228330] R13: 0000000000000080 R14: 0000000000000000 R15: 0000000000000000 [ 240.228729] FS: 00007f233c33be40(0000) GS:ffff888058200000(0000) knlGS:0000000000000000 [ 240.229281] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 240.230298] CR2: 0000000000000158 CR3: 0000000004d32000 CR4: 00000000000006f0 Signed-off-by: Edward Lo Signed-off-by: Konstantin Komarov Signed-off-by: Sasha Levin commit de5e0955248ff90a2ae91e7f5c108392b52152d0 Author: Edward Lo Date: Sat Aug 6 00:47:27 2022 +0800 fs/ntfs3: Validate data run offset [ Upstream commit 6db620863f8528ed9a9aa5ad323b26554a17881d ] This adds sanity checks for data run offset. We should make sure data run offset is legit before trying to unpack them, otherwise we may encounter use-after-free or some unexpected memory access behaviors. [ 82.940342] BUG: KASAN: use-after-free in run_unpack+0x2e3/0x570 [ 82.941180] Read of size 1 at addr ffff888008a8487f by task mount/240 [ 82.941670] [ 82.942069] CPU: 0 PID: 240 Comm: mount Not tainted 5.19.0+ #15 [ 82.942482] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014 [ 82.943720] Call Trace: [ 82.944204] [ 82.944471] dump_stack_lvl+0x49/0x63 [ 82.944908] print_report.cold+0xf5/0x67b [ 82.945141] ? __wait_on_bit+0x106/0x120 [ 82.945750] ? run_unpack+0x2e3/0x570 [ 82.946626] kasan_report+0xa7/0x120 [ 82.947046] ? run_unpack+0x2e3/0x570 [ 82.947280] __asan_load1+0x51/0x60 [ 82.947483] run_unpack+0x2e3/0x570 [ 82.947709] ? memcpy+0x4e/0x70 [ 82.947927] ? run_pack+0x7a0/0x7a0 [ 82.948158] run_unpack_ex+0xad/0x3f0 [ 82.948399] ? mi_enum_attr+0x14a/0x200 [ 82.948717] ? run_unpack+0x570/0x570 [ 82.949072] ? ni_enum_attr_ex+0x1b2/0x1c0 [ 82.949332] ? ni_fname_type.part.0+0xd0/0xd0 [ 82.949611] ? mi_read+0x262/0x2c0 [ 82.949970] ? ntfs_cmp_names_cpu+0x125/0x180 [ 82.950249] ntfs_iget5+0x632/0x1870 [ 82.950621] ? ntfs_get_block_bmap+0x70/0x70 [ 82.951192] ? evict+0x223/0x280 [ 82.951525] ? iput.part.0+0x286/0x320 [ 82.951969] ntfs_fill_super+0x1321/0x1e20 [ 82.952436] ? put_ntfs+0x1d0/0x1d0 [ 82.952822] ? vsprintf+0x20/0x20 [ 82.953188] ? mutex_unlock+0x81/0xd0 [ 82.953379] ? set_blocksize+0x95/0x150 [ 82.954001] get_tree_bdev+0x232/0x370 [ 82.954438] ? put_ntfs+0x1d0/0x1d0 [ 82.954700] ntfs_fs_get_tree+0x15/0x20 [ 82.955049] vfs_get_tree+0x4c/0x130 [ 82.955292] path_mount+0x645/0xfd0 [ 82.955615] ? putname+0x80/0xa0 [ 82.955955] ? finish_automount+0x2e0/0x2e0 [ 82.956310] ? kmem_cache_free+0x110/0x390 [ 82.956723] ? putname+0x80/0xa0 [ 82.957023] do_mount+0xd6/0xf0 [ 82.957411] ? path_mount+0xfd0/0xfd0 [ 82.957638] ? __kasan_check_write+0x14/0x20 [ 82.957948] __x64_sys_mount+0xca/0x110 [ 82.958310] do_syscall_64+0x3b/0x90 [ 82.958719] entry_SYSCALL_64_after_hwframe+0x63/0xcd [ 82.959341] RIP: 0033:0x7fd0d1ce948a [ 82.960193] Code: 48 8b 0d 11 fa 2a 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 49 89 ca b8 a5 00 00 008 [ 82.961532] RSP: 002b:00007ffe59ff69a8 EFLAGS: 00000202 ORIG_RAX: 00000000000000a5 [ 82.962527] RAX: ffffffffffffffda RBX: 0000564dcc107060 RCX: 00007fd0d1ce948a [ 82.963266] RDX: 0000564dcc107260 RSI: 0000564dcc1072e0 RDI: 0000564dcc10fce0 [ 82.963686] RBP: 0000000000000000 R08: 0000564dcc107280 R09: 0000000000000020 [ 82.964272] R10: 00000000c0ed0000 R11: 0000000000000202 R12: 0000564dcc10fce0 [ 82.964785] R13: 0000564dcc107260 R14: 0000000000000000 R15: 00000000ffffffff Signed-off-by: Edward Lo Signed-off-by: Konstantin Komarov Signed-off-by: Sasha Levin commit d4489ba8fb806e07b43eecca5e9af5865d94cbf6 Author: edward lo Date: Mon Aug 1 18:20:51 2022 +0800 fs/ntfs3: Add overflow check for attribute size [ Upstream commit e19c6277652efba203af4ecd8eed4bd30a0054c9 ] The offset addition could overflow and pass the used size check given an attribute with very large size (e.g., 0xffffff7f) while parsing MFT attributes. This could lead to out-of-bound memory R/W if we try to access the next attribute derived by Add2Ptr(attr, asize) [ 32.963847] BUG: unable to handle page fault for address: ffff956a83c76067 [ 32.964301] #PF: supervisor read access in kernel mode [ 32.964526] #PF: error_code(0x0000) - not-present page [ 32.964893] PGD 4dc01067 P4D 4dc01067 PUD 0 [ 32.965316] Oops: 0000 [#1] PREEMPT SMP NOPTI [ 32.965727] CPU: 0 PID: 243 Comm: mount Not tainted 5.19.0+ #6 [ 32.966050] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014 [ 32.966628] RIP: 0010:mi_enum_attr+0x44/0x110 [ 32.967239] Code: 89 f0 48 29 c8 48 89 c1 39 c7 0f 86 94 00 00 00 8b 56 04 83 fa 17 0f 86 88 00 00 00 89 d0 01 ca 48 01 f0 8d 4a 08 39 f9a [ 32.968101] RSP: 0018:ffffba15c06a7c38 EFLAGS: 00000283 [ 32.968364] RAX: ffff956a83c76067 RBX: ffff956983c76050 RCX: 000000000000006f [ 32.968651] RDX: 0000000000000067 RSI: ffff956983c760e8 RDI: 00000000000001c8 [ 32.968963] RBP: ffffba15c06a7c38 R08: 0000000000000064 R09: 00000000ffffff7f [ 32.969249] R10: 0000000000000007 R11: ffff956983c760e8 R12: ffff95698225e000 [ 32.969870] R13: 0000000000000000 R14: ffffba15c06a7cd8 R15: ffff95698225e170 [ 32.970655] FS: 00007fdab8189e40(0000) GS:ffff9569fdc00000(0000) knlGS:0000000000000000 [ 32.971098] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 32.971378] CR2: ffff956a83c76067 CR3: 0000000002c58000 CR4: 00000000000006f0 [ 32.972098] Call Trace: [ 32.972842] [ 32.973341] ni_enum_attr_ex+0xda/0xf0 [ 32.974087] ntfs_iget5+0x1db/0xde0 [ 32.974386] ? slab_post_alloc_hook+0x53/0x270 [ 32.974778] ? ntfs_fill_super+0x4c7/0x12a0 [ 32.975115] ntfs_fill_super+0x5d6/0x12a0 [ 32.975336] get_tree_bdev+0x175/0x270 [ 32.975709] ? put_ntfs+0x150/0x150 [ 32.975956] ntfs_fs_get_tree+0x15/0x20 [ 32.976191] vfs_get_tree+0x2a/0xc0 [ 32.976374] ? capable+0x19/0x20 [ 32.976572] path_mount+0x484/0xaa0 [ 32.977025] ? putname+0x57/0x70 [ 32.977380] do_mount+0x80/0xa0 [ 32.977555] __x64_sys_mount+0x8b/0xe0 [ 32.978105] do_syscall_64+0x3b/0x90 [ 32.978830] entry_SYSCALL_64_after_hwframe+0x63/0xcd [ 32.979311] RIP: 0033:0x7fdab72e948a [ 32.980015] Code: 48 8b 0d 11 fa 2a 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 49 89 ca b8 a5 00 00 008 [ 32.981251] RSP: 002b:00007ffd15b87588 EFLAGS: 00000206 ORIG_RAX: 00000000000000a5 [ 32.981832] RAX: ffffffffffffffda RBX: 0000557de0aaf060 RCX: 00007fdab72e948a [ 32.982234] RDX: 0000557de0aaf260 RSI: 0000557de0aaf2e0 RDI: 0000557de0ab7ce0 [ 32.982714] RBP: 0000000000000000 R08: 0000557de0aaf280 R09: 0000000000000020 [ 32.983046] R10: 00000000c0ed0000 R11: 0000000000000206 R12: 0000557de0ab7ce0 [ 32.983494] R13: 0000557de0aaf260 R14: 0000000000000000 R15: 00000000ffffffff [ 32.984094] [ 32.984352] Modules linked in: [ 32.984753] CR2: ffff956a83c76067 [ 32.985911] ---[ end trace 0000000000000000 ]--- [ 32.986555] RIP: 0010:mi_enum_attr+0x44/0x110 [ 32.987217] Code: 89 f0 48 29 c8 48 89 c1 39 c7 0f 86 94 00 00 00 8b 56 04 83 fa 17 0f 86 88 00 00 00 89 d0 01 ca 48 01 f0 8d 4a 08 39 f9a [ 32.988232] RSP: 0018:ffffba15c06a7c38 EFLAGS: 00000283 [ 32.988532] RAX: ffff956a83c76067 RBX: ffff956983c76050 RCX: 000000000000006f [ 32.988916] RDX: 0000000000000067 RSI: ffff956983c760e8 RDI: 00000000000001c8 [ 32.989356] RBP: ffffba15c06a7c38 R08: 0000000000000064 R09: 00000000ffffff7f [ 32.989994] R10: 0000000000000007 R11: ffff956983c760e8 R12: ffff95698225e000 [ 32.990415] R13: 0000000000000000 R14: ffffba15c06a7cd8 R15: ffff95698225e170 [ 32.991011] FS: 00007fdab8189e40(0000) GS:ffff9569fdc00000(0000) knlGS:0000000000000000 [ 32.991524] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 32.991936] CR2: ffff956a83c76067 CR3: 0000000002c58000 CR4: 00000000000006f0 This patch adds an overflow check Signed-off-by: edward lo Signed-off-by: Konstantin Komarov Signed-off-by: Sasha Levin commit af7a195deae349f15baa765d000a5188920d61dd Author: edward lo Date: Mon Aug 1 15:37:31 2022 +0800 fs/ntfs3: Validate BOOT record_size [ Upstream commit 0b66046266690454dc04e6307bcff4a5605b42a1 ] When the NTFS BOOT record_size field < 0, it represents a shift value. However, there is no sanity check on the shift result and the sbi->record_bits calculation through blksize_bits() assumes the size always > 256, which could lead to NPD while mounting a malformed NTFS image. [ 318.675159] BUG: kernel NULL pointer dereference, address: 0000000000000158 [ 318.675682] #PF: supervisor read access in kernel mode [ 318.675869] #PF: error_code(0x0000) - not-present page [ 318.676246] PGD 0 P4D 0 [ 318.676502] Oops: 0000 [#1] PREEMPT SMP NOPTI [ 318.676934] CPU: 0 PID: 259 Comm: mount Not tainted 5.19.0 #5 [ 318.677289] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014 [ 318.678136] RIP: 0010:ni_find_attr+0x2d/0x1c0 [ 318.678656] Code: 89 ca 4d 89 c7 41 56 41 55 41 54 41 89 cc 55 48 89 fd 53 48 89 d3 48 83 ec 20 65 48 8b 04 25 28 00 00 00 48 89 44 24 180 [ 318.679848] RSP: 0018:ffffa6c8c0297bd8 EFLAGS: 00000246 [ 318.680104] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000080 [ 318.680790] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 [ 318.681679] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000 [ 318.682577] R10: 0000000000000000 R11: 0000000000000005 R12: 0000000000000080 [ 318.683015] R13: ffff8d5582e68400 R14: 0000000000000100 R15: 0000000000000000 [ 318.683618] FS: 00007fd9e1c81e40(0000) GS:ffff8d55fdc00000(0000) knlGS:0000000000000000 [ 318.684280] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 318.684651] CR2: 0000000000000158 CR3: 0000000002e1a000 CR4: 00000000000006f0 [ 318.685623] Call Trace: [ 318.686607] [ 318.686872] ? ntfs_alloc_inode+0x1a/0x60 [ 318.687235] attr_load_runs_vcn+0x2b/0xa0 [ 318.687468] mi_read+0xbb/0x250 [ 318.687576] ntfs_iget5+0x114/0xd90 [ 318.687750] ntfs_fill_super+0x588/0x11b0 [ 318.687953] ? put_ntfs+0x130/0x130 [ 318.688065] ? snprintf+0x49/0x70 [ 318.688164] ? put_ntfs+0x130/0x130 [ 318.688256] get_tree_bdev+0x16a/0x260 [ 318.688407] vfs_get_tree+0x20/0xb0 [ 318.688519] path_mount+0x2dc/0x9b0 [ 318.688877] do_mount+0x74/0x90 [ 318.689142] __x64_sys_mount+0x89/0xd0 [ 318.689636] do_syscall_64+0x3b/0x90 [ 318.689998] entry_SYSCALL_64_after_hwframe+0x63/0xcd [ 318.690318] RIP: 0033:0x7fd9e133c48a [ 318.690687] Code: 48 8b 0d 11 fa 2a 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 49 89 ca b8 a5 00 00 008 [ 318.691357] RSP: 002b:00007ffd374406c8 EFLAGS: 00000202 ORIG_RAX: 00000000000000a5 [ 318.691632] RAX: ffffffffffffffda RBX: 0000564d0b051080 RCX: 00007fd9e133c48a [ 318.691920] RDX: 0000564d0b051280 RSI: 0000564d0b051300 RDI: 0000564d0b0596a0 [ 318.692123] RBP: 0000000000000000 R08: 0000564d0b0512a0 R09: 0000000000000020 [ 318.692349] R10: 00000000c0ed0000 R11: 0000000000000202 R12: 0000564d0b0596a0 [ 318.692673] R13: 0000564d0b051280 R14: 0000000000000000 R15: 00000000ffffffff [ 318.693007] [ 318.693271] Modules linked in: [ 318.693614] CR2: 0000000000000158 [ 318.694446] ---[ end trace 0000000000000000 ]--- [ 318.694779] RIP: 0010:ni_find_attr+0x2d/0x1c0 [ 318.694952] Code: 89 ca 4d 89 c7 41 56 41 55 41 54 41 89 cc 55 48 89 fd 53 48 89 d3 48 83 ec 20 65 48 8b 04 25 28 00 00 00 48 89 44 24 180 [ 318.696042] RSP: 0018:ffffa6c8c0297bd8 EFLAGS: 00000246 [ 318.696531] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000080 [ 318.698114] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 [ 318.699286] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000 [ 318.699795] R10: 0000000000000000 R11: 0000000000000005 R12: 0000000000000080 [ 318.700236] R13: ffff8d5582e68400 R14: 0000000000000100 R15: 0000000000000000 [ 318.700973] FS: 00007fd9e1c81e40(0000) GS:ffff8d55fdc00000(0000) knlGS:0000000000000000 [ 318.701688] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 318.702190] CR2: 0000000000000158 CR3: 0000000002e1a000 CR4: 00000000000006f0 [ 318.726510] mount (259) used greatest stack depth: 13320 bytes left This patch adds a sanity check. Signed-off-by: edward lo Signed-off-by: Konstantin Komarov Signed-off-by: Sasha Levin commit 8e228ac90c39a02e6e6c95e31d497b1929d87049 Author: Christoph Hellwig Date: Wed Dec 21 09:51:19 2022 +0100 nvmet: don't defer passthrough commands with trivial effects to the workqueue [ Upstream commit 2a459f6933e1c459bffb7cc73fd6c900edc714bd ] Mask out the "Command Supported" and "Logical Block Content Change" bits and only defer execution of commands that have non-trivial effects to the workqueue for synchronous execution. This allows to execute admin commands asynchronously on controllers that provide a Command Supported and Effects log page, and will keep allowing to execute Write commands asynchronously once command effects on I/O commands are taken into account. Fixes: c1fef73f793b ("nvmet: add passthru code to process commands") Signed-off-by: Christoph Hellwig Reviewed-by: Keith Busch Reviewed-by: Sagi Grimberg Reviewed-by: Kanchan Joshi Signed-off-by: Sasha Levin commit f068a7315a9eaa68ca6025fe483dfd2757da3050 Author: Christoph Hellwig Date: Wed Dec 21 10:30:45 2022 +0100 nvme: fix the NVME_CMD_EFFECTS_CSE_MASK definition [ Upstream commit 685e6311637e46f3212439ce2789f8a300e5050f ] 3 << 16 does not generate the correct mask for bits 16, 17 and 18. Use the GENMASK macro to generate the correct mask instead. Fixes: 84fef62d135b ("nvme: check admin passthru command effects") Signed-off-by: Christoph Hellwig Reviewed-by: Keith Busch Reviewed-by: Sagi Grimberg Reviewed-by: Kanchan Joshi Signed-off-by: Sasha Levin commit 576502f25f7912cdfd65fe22ee4dd2a40b6e27a9 Author: Adam Vodopjan Date: Fri Dec 9 09:26:34 2022 +0000 ata: ahci: Fix PCS quirk application for suspend [ Upstream commit 37e14e4f3715428b809e4df9a9958baa64c77d51 ] Since kernel 5.3.4 my laptop (ICH8M controller) does not see Kingston SV300S37A60G SSD disk connected into a SATA connector on wake from suspend. The problem was introduced in c312ef176399 ("libata/ahci: Drop PCS quirk for Denverton and beyond"): the quirk is not applied on wake from suspend as it originally was. It is worth to mention the commit contained another bug: the quirk is not applied at all to controllers which require it. The fix commit 09d6ac8dc51a ("libata/ahci: Fix PCS quirk application") landed in 5.3.8. So testing my patch anywhere between commits c312ef176399 and 09d6ac8dc51a is pointless. Not all disks trigger the problem. For example nothing bad happens with Western Digital WD5000LPCX HDD. Test hardware: - Acer 5920G with ICH8M SATA controller - sda: some SATA HDD connnected into the DVD drive IDE port with a SATA-IDE caddy. It is a boot disk - sdb: Kingston SV300S37A60G SSD connected into the only SATA port Sample "dmesg --notime | grep -E '^(sd |ata)'" output on wake: sd 0:0:0:0: [sda] Starting disk sd 2:0:0:0: [sdb] Starting disk ata4: SATA link down (SStatus 4 SControl 300) ata3: SATA link down (SStatus 4 SControl 300) ata1.00: ACPI cmd ef/03:0c:00:00:00:a0 (SET FEATURES) filtered out ata1.00: ACPI cmd ef/03:42:00:00:00:a0 (SET FEATURES) filtered out ata1: FORCE: cable set to 80c ata5: SATA link down (SStatus 0 SControl 300) ata3: SATA link down (SStatus 4 SControl 300) ata3: SATA link down (SStatus 4 SControl 300) ata3.00: disabled sd 2:0:0:0: rejecting I/O to offline device ata3.00: detaching (SCSI 2:0:0:0) sd 2:0:0:0: [sdb] Start/Stop Unit failed: Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK sd 2:0:0:0: [sdb] Synchronizing SCSI cache sd 2:0:0:0: [sdb] Synchronize Cache(10) failed: Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK sd 2:0:0:0: [sdb] Stopping disk sd 2:0:0:0: [sdb] Start/Stop Unit failed: Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK Commit c312ef176399 dropped ahci_pci_reset_controller() which internally calls ahci_reset_controller() and applies the PCS quirk if needed after that. It was called each time a reset was required instead of just ahci_reset_controller(). This patch puts the function back in place. Fixes: c312ef176399 ("libata/ahci: Drop PCS quirk for Denverton and beyond") Signed-off-by: Adam Vodopjan Signed-off-by: Damien Le Moal Signed-off-by: Sasha Levin commit 7949b0df3dd9f4817ed4a4e989fa9ee81df6205f Author: Yu Kuai Date: Mon Dec 26 11:06:05 2022 +0800 block, bfq: fix uaf for bfqq in bfq_exit_icq_bfqq [ Upstream commit 246cf66e300b76099b5dbd3fdd39e9a5dbc53f02 ] Commit 64dc8c732f5c ("block, bfq: fix possible uaf for 'bfqq->bic'") will access 'bic->bfqq' in bic_set_bfqq(), however, bfq_exit_icq_bfqq() can free bfqq first, and then call bic_set_bfqq(), which will cause uaf. Fix the problem by moving bfq_exit_bfqq() behind bic_set_bfqq(). Fixes: 64dc8c732f5c ("block, bfq: fix possible uaf for 'bfqq->bic'") Reported-by: Yi Zhang Signed-off-by: Yu Kuai Link: https://lore.kernel.org/r/20221226030605.1437081-1-yukuai1@huaweicloud.com Signed-off-by: Jens Axboe Signed-off-by: Sasha Levin commit ff3d9ab51cd572f18cb493683b9389f732febff8 Author: Adrian Freund Date: Tue Dec 13 21:13:11 2022 +0100 ACPI: resource: do IRQ override on Lenovo 14ALC7 [ Upstream commit f3cb9b740869712d448edf3b9ef5952b847caf8b ] Commit bfcdf58380b1 ("ACPI: resource: do IRQ override on LENOVO IdeaPad") added an override for Lenovo IdeaPad 5 16ALC7. The 14ALC7 variant also suffers from a broken touchscreen and trackpad. Fixes: 9946e39fe8d0 ("ACPI: resource: skip IRQ override on AMD Zen platforms") Link: https://bugzilla.kernel.org/show_bug.cgi?id=216804 Signed-off-by: Adrian Freund Signed-off-by: Rafael J. Wysocki Signed-off-by: Sasha Levin commit 698a0813ce69b46284e96a71323a1cc7c14d294a Author: Erik Schumacher Date: Sun Dec 11 14:33:22 2022 +0100 ACPI: resource: do IRQ override on XMG Core 15 [ Upstream commit 7592b79ba4a91350b38469e05238308bcfe1019b ] The Schenker XMG CORE 15 (M22) is Ryzen-6 based and needs IRQ overriding for the keyboard to work. Adding an entry for this laptop to the override_table makes the internal keyboard functional again. Signed-off-by: Erik Schumacher Signed-off-by: Rafael J. Wysocki Stable-dep-of: f3cb9b740869 ("ACPI: resource: do IRQ override on Lenovo 14ALC7") Signed-off-by: Sasha Levin commit a9ac7633bbe5f86977555e7dcf251c8dfcd14375 Author: Jiri Slaby (SUSE) Date: Tue Oct 4 12:33:40 2022 +0200 ACPI: resource: do IRQ override on LENOVO IdeaPad [ Upstream commit bfcdf58380b1d9be564a78a9370da722ed1a9965 ] LENOVO IdeaPad Flex 5 is ryzen-5 based and the commit below removed IRQ overriding for those. This broke touchscreen and trackpad: i2c_designware AMDI0010:00: controller timed out i2c_designware AMDI0010:03: controller timed out i2c_hid_acpi i2c-MSFT0001:00: failed to reset device: -61 i2c_designware AMDI0010:03: controller timed out ... i2c_hid_acpi i2c-MSFT0001:00: can't add hid device: -61 i2c_hid_acpi: probe of i2c-MSFT0001:00 failed with error -61 White-list this specific model in the override_table. For this to work, the ZEN test needs to be put below the table walk. Fixes: 37c81d9f1d1b (ACPI: resource: skip IRQ override on AMD Zen platforms) Link: https://bugzilla.suse.com/show_bug.cgi?id=1203794 Signed-off-by: Jiri Slaby (SUSE) Signed-off-by: Rafael J. Wysocki Stable-dep-of: f3cb9b740869 ("ACPI: resource: do IRQ override on Lenovo 14ALC7") Signed-off-by: Sasha Levin commit 5fe31f29501c0def0b7f6c997801a1bdc162a7d2 Author: Tamim Khan Date: Sun Aug 28 23:04:19 2022 -0400 ACPI: resource: Skip IRQ override on Asus Vivobook K3402ZA/K3502ZA [ Upstream commit e12dee3736731e24b1e7367f87d66ac0fcd73ce7 ] In the ACPI DSDT table for Asus VivoBook K3402ZA/K3502ZA IRQ 1 is described as ActiveLow; however, the kernel overrides it to Edge_High. This prevents the internal keyboard from working on these laptops. In order to fix this add these laptops to the skip_override_table so that the kernel does not override IRQ 1 to Edge_High. Link: https://bugzilla.kernel.org/show_bug.cgi?id=216158 Reviewed-by: Hui Wang Tested-by: Tamim Khan Tested-by: Sunand Signed-off-by: Tamim Khan Signed-off-by: Rafael J. Wysocki Stable-dep-of: f3cb9b740869 ("ACPI: resource: do IRQ override on Lenovo 14ALC7") Signed-off-by: Sasha Levin commit 4c5fee0d883abfb68794a4df9fa89ea6f072f1b8 Author: Keith Busch Date: Mon Dec 19 13:54:55 2022 -0800 nvme-pci: fix page size checks [ Upstream commit 841734234a28fd5cd0889b84bd4d93a0988fa11e ] The size allocated out of the dma pool is at most NVME_CTRL_PAGE_SIZE, which may be smaller than the PAGE_SIZE. Fixes: c61b82c7b7134 ("nvme-pci: fix PRP pool size") Signed-off-by: Keith Busch Signed-off-by: Christoph Hellwig Signed-off-by: Sasha Levin commit 9141144b37f30e3e7fa024bcfa0a13011e546ba9 Author: Keith Busch Date: Mon Dec 19 10:59:06 2022 -0800 nvme-pci: fix mempool alloc size [ Upstream commit c89a529e823d51dd23c7ec0c047c7a454a428541 ] Convert the max size to bytes to match the units of the divisor that calculates the worst-case number of PRP entries. The result is used to determine how many PRP Lists are required. The code was previously rounding this to 1 list, but we can require 2 in the worst case. In that scenario, the driver would corrupt memory beyond the size provided by the mempool. While unlikely to occur (you'd need a 4MB in exactly 127 phys segments on a queue that doesn't support SGLs), this memory corruption has been observed by kfence. Cc: Jens Axboe Fixes: 943e942e6266f ("nvme-pci: limit max IO size and segments to avoid high order allocations") Signed-off-by: Keith Busch Reviewed-by: Jens Axboe Reviewed-by: Kanchan Joshi Reviewed-by: Chaitanya Kulkarni Signed-off-by: Christoph Hellwig Signed-off-by: Sasha Levin commit f17cf8fa2c9d94709801502e4f9e36dc7f8e064e Author: Klaus Jensen Date: Tue Dec 13 09:58:07 2022 +0100 nvme-pci: fix doorbell buffer value endianness [ Upstream commit b5f96cb719d8ba220b565ddd3ba4ac0d8bcfb130 ] When using shadow doorbells, the event index and the doorbell values are written to host memory. Prior to this patch, the values written would erroneously be written in host endianness. This causes trouble on big-endian platforms. Fix this by adding missing endian conversions. This issue was noticed by Guenter while testing various big-endian platforms under QEMU[1]. A similar fix required for hw/nvme in QEMU is up for review as well[2]. [1]: https://lore.kernel.org/qemu-devel/20221209110022.GA3396194@roeck-us.net/ [2]: https://lore.kernel.org/qemu-devel/20221212114409.34972-4-its@irrelevant.dk/ Fixes: f9f38e33389c ("nvme: improve performance for virtual NVMe devices") Reported-by: Guenter Roeck Signed-off-by: Klaus Jensen Signed-off-by: Christoph Hellwig Signed-off-by: Sasha Levin commit ead99ec669b58f8b3d4836d23d6c012ab952aacf Author: Sasha Levin Date: Sat Dec 31 10:14:21 2022 -0500 Revert "selftests/bpf: Add test for unstable CT lookup API" This reverts commit f463a1295c4fa73eac0b16fbfbdfc5726b06445d. Signed-off-by: Sasha Levin commit bf0543b93740916ee91956f9a63da6fc0d79daaa Author: Paulo Alcantara Date: Sun Dec 11 18:18:55 2022 -0300 cifs: fix oops during encryption [ Upstream commit f7f291e14dde32a07b1f0aa06921d28f875a7b54 ] When running xfstests against Azure the following oops occurred on an arm64 system Unable to handle kernel write to read-only memory at virtual address ffff0001221cf000 Mem abort info: ESR = 0x9600004f EC = 0x25: DABT (current EL), IL = 32 bits SET = 0, FnV = 0 EA = 0, S1PTW = 0 FSC = 0x0f: level 3 permission fault Data abort info: ISV = 0, ISS = 0x0000004f CM = 0, WnR = 1 swapper pgtable: 4k pages, 48-bit VAs, pgdp=00000000294f3000 [ffff0001221cf000] pgd=18000001ffff8003, p4d=18000001ffff8003, pud=18000001ff82e003, pmd=18000001ff71d003, pte=00600001221cf787 Internal error: Oops: 9600004f [#1] PREEMPT SMP ... pstate: 80000005 (Nzcv daif -PAN -UAO -TCO BTYPE=--) pc : __memcpy+0x40/0x230 lr : scatterwalk_copychunks+0xe0/0x200 sp : ffff800014e92de0 x29: ffff800014e92de0 x28: ffff000114f9de80 x27: 0000000000000008 x26: 0000000000000008 x25: ffff800014e92e78 x24: 0000000000000008 x23: 0000000000000001 x22: 0000040000000000 x21: ffff000000000000 x20: 0000000000000001 x19: ffff0001037c4488 x18: 0000000000000014 x17: 235e1c0d6efa9661 x16: a435f9576b6edd6c x15: 0000000000000058 x14: 0000000000000001 x13: 0000000000000008 x12: ffff000114f2e590 x11: ffffffffffffffff x10: 0000040000000000 x9 : ffff8000105c3580 x8 : 2e9413b10000001a x7 : 534b4410fb86b005 x6 : 534b4410fb86b005 x5 : ffff0001221cf008 x4 : ffff0001037c4490 x3 : 0000000000000001 x2 : 0000000000000008 x1 : ffff0001037c4488 x0 : ffff0001221cf000 Call trace: __memcpy+0x40/0x230 scatterwalk_map_and_copy+0x98/0x100 crypto_ccm_encrypt+0x150/0x180 crypto_aead_encrypt+0x2c/0x40 crypt_message+0x750/0x880 smb3_init_transform_rq+0x298/0x340 smb_send_rqst.part.11+0xd8/0x180 smb_send_rqst+0x3c/0x100 compound_send_recv+0x534/0xbc0 smb2_query_info_compound+0x32c/0x440 smb2_set_ea+0x438/0x4c0 cifs_xattr_set+0x5d4/0x7c0 This is because in scatterwalk_copychunks(), we attempted to write to a buffer (@sign) that was allocated in the stack (vmalloc area) by crypt_message() and thus accessing its remaining 8 (x2) bytes ended up crossing a page boundary. To simply fix it, we could just pass @sign kmalloc'd from crypt_message() and then we're done. Luckily, we don't seem to pass any other vmalloc'd buffers in smb_rqst::rq_iov... Instead, let's map the correct pages and offsets from vmalloc buffers as well in cifs_sg_set_buf() and then avoiding such oopses. Signed-off-by: Paulo Alcantara (SUSE) Cc: stable@vger.kernel.org Signed-off-by: Steve French Signed-off-by: Sasha Levin commit 56f6de394f0f57928cd401255a5c7866b68a77e3 Author: Miaoqian Lin Date: Tue Dec 6 12:17:31 2022 +0400 usb: dwc3: qcom: Fix memory leak in dwc3_qcom_interconnect_init [ Upstream commit 97a48da1619ba6bd42a0e5da0a03aa490a9496b1 ] of_icc_get() alloc resources for path handle, we should release it when not need anymore. Like the release in dwc3_qcom_interconnect_exit() function. Add icc_put() in error handling to fix this. Fixes: bea46b981515 ("usb: dwc3: qcom: Add interconnect support in dwc3 driver") Cc: stable Acked-by: Thinh Nguyen Signed-off-by: Miaoqian Lin Link: https://lore.kernel.org/r/20221206081731.818107-1-linmq006@gmail.com Signed-off-by: Greg Kroah-Hartman Signed-off-by: Sasha Levin