commit 798d3c532b82dce20bcdc512572f542093142d02 Author: Greg Kroah-Hartman Date: Sat Apr 26 17:19:26 2014 -0700 Linux 3.14.2 commit 0af83220a4e41797edb0915b177dae84e6e6b57f Author: Oleg Nesterov Date: Mon Apr 7 15:38:29 2014 -0700 exit: call disassociate_ctty() before exit_task_namespaces() commit c39df5fa37b0623589508c95515b4aa1531c524e upstream. Commit 8aac62706ada ("move exit_task_namespaces() outside of exit_notify()") breaks pppd and the exiting service crashes the kernel: BUG: unable to handle kernel NULL pointer dereference at 0000000000000028 IP: ppp_register_channel+0x13/0x20 [ppp_generic] Call Trace: ppp_asynctty_open+0x12b/0x170 [ppp_async] tty_ldisc_open.isra.2+0x27/0x60 tty_ldisc_hangup+0x1e3/0x220 __tty_hangup+0x2c4/0x440 disassociate_ctty+0x61/0x270 do_exit+0x7f2/0xa50 ppp_register_channel() needs ->net_ns and current->nsproxy == NULL. Move disassociate_ctty() before exit_task_namespaces(), it doesn't make sense to delay it after perf_event_exit_task() or cgroup_exit(). This also allows to use task_work_add() inside the (nontrivial) code paths in disassociate_ctty(). Investigated by Peter Hurley. Signed-off-by: Oleg Nesterov Reported-by: Sree Harsha Totakura Cc: Peter Hurley Cc: Sree Harsha Totakura Cc: "Eric W. Biederman" Cc: Jeff Dike Cc: Ingo Molnar Cc: Andrey Vagin Cc: Al Viro Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit de661dfb1d9fee1a482ae22213d431fb901d86dd Author: Oleg Nesterov Date: Mon Apr 7 15:38:41 2014 -0700 wait: fix reparent_leader() vs EXIT_DEAD->EXIT_ZOMBIE race commit dfccbb5e49a621c1b21a62527d61fc4305617aca upstream. wait_task_zombie() first does EXIT_ZOMBIE->EXIT_DEAD transition and drops tasklist_lock. If this task is not the natural child and it is traced, we change its state back to EXIT_ZOMBIE for ->real_parent. The last transition is racy, this is even documented in 50b8d257486a "ptrace: partially fix the do_wait(WEXITED) vs EXIT_DEAD->EXIT_ZOMBIE race". wait_consider_task() tries to detect this transition and clear ->notask_error but we can't rely on ptrace_reparented(), debugger can exit and do ptrace_unlink() before its sub-thread sets EXIT_ZOMBIE. And there is another problem which were missed before: this transition can also race with reparent_leader() which doesn't reset >exit_signal if EXIT_DEAD, assuming that this task must be reaped by someone else. So the tracee can be re-parented with ->exit_signal != SIGCHLD, and if /sbin/init doesn't use __WALL it becomes unreapable. Change reparent_leader() to update ->exit_signal even if EXIT_DEAD. Note: this is the simple temporary hack for -stable, it doesn't try to solve all problems, it will be reverted by the next changes. Signed-off-by: Oleg Nesterov Reported-by: Jan Kratochvil Reported-by: Michal Schmidt Tested-by: Michal Schmidt Cc: Al Viro Cc: Lennart Poettering Cc: Roland McGrath Cc: Tejun Heo Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit 5e7d38dd797403f965854d3e25e5982a739343d6 Author: Li Zefan Date: Wed Feb 12 12:44:57 2014 -0800 jffs2: remove from wait queue after schedule() commit 3ead9578443b66ddb3d50ed4f53af8a0c0298ec5 upstream. @wait is a local variable, so if we don't remove it from the wait queue list, later wake_up() may end up accessing invalid memory. This was spotted by eyes. Signed-off-by: Li Zefan Cc: David Woodhouse Cc: Artem Bityutskiy Signed-off-by: Andrew Morton Signed-off-by: Brian Norris Signed-off-by: Greg Kroah-Hartman commit bd82a0a4284847fb03dc76f9a36bd993f542afe1 Author: Li Zefan Date: Wed Feb 12 12:44:56 2014 -0800 jffs2: avoid soft-lockup in jffs2_reserve_space_gc() commit 13b546d96207c131eeae15dc7b26c6e7d0f1cad7 upstream. We triggered soft-lockup under stress test on 2.6.34 kernel. BUG: soft lockup - CPU#1 stuck for 60009ms! [lockf2.test:14488] ... [] (jffs2_do_reserve_space+0x420/0x440 [jffs2]) [] (jffs2_reserve_space_gc+0x34/0x78 [jffs2]) [] (jffs2_garbage_collect_dnode.isra.3+0x264/0x478 [jffs2]) [] (jffs2_garbage_collect_pass+0x9c0/0xe4c [jffs2]) [] (jffs2_reserve_space+0x104/0x2a8 [jffs2]) [] (jffs2_write_inode_range+0x5c/0x4d4 [jffs2]) [] (jffs2_write_end+0x198/0x2c0 [jffs2]) [] (generic_file_buffered_write+0x158/0x200) [] (__generic_file_aio_write+0x3a4/0x414) [] (generic_file_aio_write+0x5c/0xbc) [] (do_sync_write+0x98/0xd4) [] (vfs_write+0xa8/0x150) [] (sys_write+0x3c/0xc0)] Fix this by adding a cond_resched() in the while loop. [akpm@linux-foundation.org: don't initialize `ret'] Signed-off-by: Li Zefan Cc: David Woodhouse Cc: Artem Bityutskiy Signed-off-by: Andrew Morton Signed-off-by: Brian Norris Signed-off-by: Greg Kroah-Hartman commit a29cb38d48d9656752fb461177b728c263f44a77 Author: Ajesh Kunhipurayil Vijayan Date: Mon Jan 6 19:06:55 2014 +0530 jffs2: Fix crash due to truncation of csize commit 41bf1a24c1001f4d0d41a78e1ac575d2f14789d7 upstream. mounting JFFS2 partition sometimes crashes with this call trace: [ 1322.240000] Kernel bug detected[#1]: [ 1322.244000] Cpu 2 [ 1322.244000] $ 0 : 0000000000000000 0000000000000018 000000003ff00070 0000000000000001 [ 1322.252000] $ 4 : 0000000000000000 c0000000f3980150 0000000000000000 0000000000010000 [ 1322.260000] $ 8 : ffffffffc09cd5f8 0000000000000001 0000000000000088 c0000000ed300de8 [ 1322.268000] $12 : e5e19d9c5f613a45 ffffffffc046d464 0000000000000000 66227ba5ea67b74e [ 1322.276000] $16 : c0000000f1769c00 c0000000ed1e0200 c0000000f3980150 0000000000000000 [ 1322.284000] $20 : c0000000f3a80000 00000000fffffffc c0000000ed2cfbd8 c0000000f39818f0 [ 1322.292000] $24 : 0000000000000004 0000000000000000 [ 1322.300000] $28 : c0000000ed2c0000 c0000000ed2cfab8 0000000000010000 ffffffffc039c0b0 [ 1322.308000] Hi : 000000000000023c [ 1322.312000] Lo : 000000000003f802 [ 1322.316000] epc : ffffffffc039a9f8 check_tn_node+0x88/0x3b0 [ 1322.320000] Not tainted [ 1322.324000] ra : ffffffffc039c0b0 jffs2_do_read_inode_internal+0x1250/0x1e48 [ 1322.332000] Status: 5400f8e3 KX SX UX KERNEL EXL IE [ 1322.336000] Cause : 00800034 [ 1322.340000] PrId : 000c1004 (Netlogic XLP) [ 1322.344000] Modules linked in: [ 1322.348000] Process jffs2_gcd_mtd7 (pid: 264, threadinfo=c0000000ed2c0000, task=c0000000f0e68dd8, tls=0000000000000000) [ 1322.356000] Stack : c0000000f1769e30 c0000000ed010780 c0000000ed010780 c0000000ed300000 c0000000f1769c00 c0000000f3980150 c0000000f3a80000 00000000fffffffc c0000000ed2cfbd8 ffffffffc039c0b0 ffffffffc09c6340 0000000000001000 0000000000000dec ffffffffc016c9d8 c0000000f39805a0 c0000000f3980180 0000008600000000 0000000000000000 0000000000000000 0000000000000000 0001000000000dec c0000000f1769d98 c0000000ed2cfb18 0000000000010000 0000000000010000 0000000000000044 c0000000f3a80000 c0000000f1769c00 c0000000f3d207a8 c0000000f1769d98 c0000000f1769de0 ffffffffc076f9c0 0000000000000009 0000000000000000 0000000000000000 ffffffffc039cf90 0000000000000017 ffffffffc013fbdc 0000000000000001 000000010003e61c ... [ 1322.424000] Call Trace: [ 1322.428000] [] check_tn_node+0x88/0x3b0 [ 1322.432000] [] jffs2_do_read_inode_internal+0x1250/0x1e48 [ 1322.440000] [] jffs2_do_crccheck_inode+0x70/0xd0 [ 1322.448000] [] jffs2_garbage_collect_pass+0x160/0x870 [ 1322.452000] [] jffs2_garbage_collect_thread+0xdc/0x1f0 [ 1322.460000] [] kthread+0xb8/0xc0 [ 1322.464000] [] kernel_thread_helper+0x10/0x18 [ 1322.472000] [ 1322.472000] Code: 67bd0050 94a4002c 2c830001 <00038036> de050218 2403fffc 0080a82d 00431824 24630044 [ 1322.480000] ---[ end trace b052bb90e97dfbf5 ]--- The variable csize in structure jffs2_tmp_dnode_info is of type uint16_t, but it is used to hold the compressed data length(csize) which is declared as uint32_t. So, when the value of csize exceeds 16bits, it gets truncated when assigned to tn->csize. This is causing a kernel BUG. Changing the definition of csize in jffs2_tmp_dnode_info to uint32_t fixes the issue. Signed-off-by: Ajesh Kunhipurayil Vijayan Signed-off-by: Kamlakant Patel Signed-off-by: Brian Norris Signed-off-by: Greg Kroah-Hartman commit 882b3a9e70f4513aae524af53d0604730f52151b Author: Kamlakant Patel Date: Mon Jan 6 19:06:54 2014 +0530 jffs2: Fix segmentation fault found in stress test commit 3367da5610c50e6b83f86d366d72b41b350b06a2 upstream. Creating a large file on a JFFS2 partition sometimes crashes with this call trace: [ 306.476000] CPU 13 Unable to handle kernel paging request at virtual address c0000000dfff8002, epc == ffffffffc03a80a8, ra == ffffffffc03a8044 [ 306.488000] Oops[#1]: [ 306.488000] Cpu 13 [ 306.492000] $ 0 : 0000000000000000 0000000000000000 0000000000008008 0000000000008007 [ 306.500000] $ 4 : c0000000dfff8002 000000000000009f c0000000e0007cde c0000000ee95fa58 [ 306.508000] $ 8 : 0000000000000001 0000000000008008 0000000000010000 ffffffffffff8002 [ 306.516000] $12 : 0000000000007fa9 000000000000ff0e 000000000000ff0f 80e55930aebb92bb [ 306.524000] $16 : c0000000e0000000 c0000000ee95fa5c c0000000efc80000 ffffffffc09edd70 [ 306.532000] $20 : ffffffffc2b60000 c0000000ee95fa58 0000000000000000 c0000000efc80000 [ 306.540000] $24 : 0000000000000000 0000000000000004 [ 306.548000] $28 : c0000000ee950000 c0000000ee95f738 0000000000000000 ffffffffc03a8044 [ 306.556000] Hi : 00000000000574a5 [ 306.560000] Lo : 6193b7a7e903d8c9 [ 306.564000] epc : ffffffffc03a80a8 jffs2_rtime_compress+0x98/0x198 [ 306.568000] Tainted: G W [ 306.572000] ra : ffffffffc03a8044 jffs2_rtime_compress+0x34/0x198 [ 306.580000] Status: 5000f8e3 KX SX UX KERNEL EXL IE [ 306.584000] Cause : 00800008 [ 306.588000] BadVA : c0000000dfff8002 [ 306.592000] PrId : 000c1100 (Netlogic XLP) [ 306.596000] Modules linked in: [ 306.596000] Process dd (pid: 170, threadinfo=c0000000ee950000, task=c0000000ee6e0858, tls=0000000000c47490) [ 306.608000] Stack : 7c547f377ddc7ee4 7ffc7f967f5d7fae 7f617f507fc37ff4 7e7d7f817f487f5f 7d8e7fec7ee87eb3 7e977ff27eec7f9e 7d677ec67f917f67 7f3d7e457f017ed7 7fd37f517f867eb2 7fed7fd17ca57e1d 7e5f7fe87f257f77 7fd77f0d7ede7fdb 7fba7fef7e197f99 7fde7fe07ee37eb5 7f5c7f8c7fc67f65 7f457fb87f847e93 7f737f3e7d137cd9 7f8e7e9c7fc47d25 7dbb7fac7fb67e52 7ff17f627da97f64 7f6b7df77ffa7ec5 80057ef17f357fb3 7f767fa27dfc7fd5 7fe37e8e7fd07e53 7e227fcf7efb7fa1 7f547e787fa87fcc 7fcb7fc57f5a7ffb 7fc07f6c7ea97e80 7e2d7ed17e587ee0 7fb17f9d7feb7f31 7f607e797e887faa 7f757fdd7c607ff3 7e877e657ef37fbd 7ec17fd67fe67ff7 7ff67f797ff87dc4 7eef7f3a7c337fa6 7fe57fc97ed87f4b 7ebe7f097f0b8003 7fe97e2a7d997cba 7f587f987f3c7fa9 ... [ 306.676000] Call Trace: [ 306.680000] [] jffs2_rtime_compress+0x98/0x198 [ 306.684000] [] jffs2_selected_compress+0x110/0x230 [ 306.692000] [] jffs2_compress+0x5c/0x388 [ 306.696000] [] jffs2_write_inode_range+0xd8/0x388 [ 306.704000] [] jffs2_write_end+0x16c/0x2d0 [ 306.708000] [] generic_file_buffered_write+0xf8/0x2b8 [ 306.716000] [] __generic_file_aio_write+0x1ac/0x350 [ 306.720000] [] generic_file_aio_write+0x80/0x168 [ 306.728000] [] do_sync_write+0x94/0xf8 [ 306.732000] [] vfs_write+0xa4/0x1a0 [ 306.736000] [] SyS_write+0x50/0x90 [ 306.744000] [] handle_sys+0x180/0x1a0 [ 306.748000] [ 306.748000] Code: 020b202d 0205282d 90a50000 <90840000> 14a40038 00000000 0060602d 0000282d 016c5823 [ 306.760000] ---[ end trace 79dd088435be02d0 ]--- Segmentation fault This crash is caused because the 'positions' is declared as an array of signed short. The value of position is in the range 0..65535, and will be converted to a negative number when the position is greater than 32767 and causes a corruption and crash. Changing the definition to 'unsigned short' fixes this issue Signed-off-by: Jayachandran C Signed-off-by: Kamlakant Patel Signed-off-by: Brian Norris Signed-off-by: Greg Kroah-Hartman commit d4654552f7290647d0767201b2afd1a19de94177 Author: Dan Carpenter Date: Fri Feb 14 12:05:49 2014 +0300 fs: NULL dereference in posix_acl_to_xattr() commit 47ba9734403770a4c5e685b01f0a72b835dd4fff upstream. This patch moves the dereference of "buffer" after the check for NULL. The only place which passes a NULL parameter is gfs2_set_acl(). Signed-off-by: Dan Carpenter Signed-off-by: Steven Whitehouse Signed-off-by: Greg Kroah-Hartman commit c80e9ae4c5ac98314d8b3d1d5702bb9ceae4d346 Author: Eric Whitney Date: Tue Apr 1 19:49:30 2014 -0400 ext4: fix premature freeing of partial clusters split across leaf blocks commit ad6599ab3ac98a4474544086e048ce86ec15a4d1 upstream. Xfstests generic/311 and shared/298 fail when run on a bigalloc file system. Kernel error messages produced during the tests report that blocks to be freed are already on the to-be-freed list. When e2fsck is run at the end of the tests, it typically reports bad i_blocks and bad free blocks counts. The bug that causes these failures is located in ext4_ext_rm_leaf(). Code at the end of the function frees a partial cluster if it's not shared with an extent remaining in the leaf. However, if all the extents in the leaf have been removed, the code dereferences an invalid extent pointer (off the front of the leaf) when the check for sharing is made. This generally has the effect of unconditionally freeing the partial cluster, which leads to the observed failures when the partial cluster is shared with the last extent in the next leaf. Fix this by attempting to free the cluster only if extents remain in the leaf. Any remaining partial cluster will be freed if possible when the next leaf is processed or when leaf removal is complete. Signed-off-by: Eric Whitney Signed-off-by: "Theodore Ts'o" Signed-off-by: Greg Kroah-Hartman commit 5ffaffe424119ade4c03bdbe08387a952e318db2 Author: Eric Whitney Date: Thu Mar 13 23:34:16 2014 -0400 ext4: fix partial cluster handling for bigalloc file systems commit c06344939422bbd032ac967223a7863de57496b5 upstream. Commit 9cb00419fa, which enables hole punching for bigalloc file systems, exposed a bug introduced by commit 6ae06ff51e in an earlier release. When run on a bigalloc file system, xfstests generic/013, 068, 075, 083, 091, 100, 112, 127, 263, 269, and 270 fail with e2fsck errors or cause kernel error messages indicating that previously freed blocks are being freed again. The latter commit optimizes the selection of the starting extent in ext4_ext_rm_leaf() when hole punching by beginning with the extent supplied in the path argument rather than with the last extent in the leaf node (as is still done when truncating). However, the code in rm_leaf that initially sets partial_cluster to track cluster sharing on extent boundaries is only guaranteed to run if rm_leaf starts with the last node in the leaf. Consequently, partial_cluster is not correctly initialized when hole punching, and a cluster on the boundary of a punched region that should be retained may instead be deallocated. Signed-off-by: Eric Whitney Signed-off-by: "Theodore Ts'o" Signed-off-by: Greg Kroah-Hartman commit dc5af0f7e234498decf671185ee9224bd1e15657 Author: Eric Whitney Date: Wed Feb 19 18:52:39 2014 -0500 ext4: fix error return from ext4_ext_handle_uninitialized_extents() commit ce37c42919608e96ade3748fe23c3062a0a966c5 upstream. Commit 3779473246 breaks the return of error codes from ext4_ext_handle_uninitialized_extents() in ext4_ext_map_blocks(). A portion of the patch assigns that function's signed integer return value to an unsigned int. Consequently, negatively valued error codes are lost and can be treated as a bogus allocated block count. Signed-off-by: Eric Whitney Signed-off-by: "Theodore Ts'o" Signed-off-by: Greg Kroah-Hartman commit 2891d0f1023283a2d931c91193ff3f3696be7780 Author: Josef Bacik Date: Thu Mar 27 19:41:34 2014 -0400 Btrfs: check for an extent_op on the locked ref commit 573a075567f0174551e2fad2a3164afd2af788f2 upstream. We could have possibly added an extent_op to the locked_ref while we dropped locked_ref->lock, so check for this case as well and loop around. Otherwise we could lose flag updates which would lead to extent tree corruption. Thanks, Signed-off-by: Josef Bacik Signed-off-by: Chris Mason Signed-off-by: Greg Kroah-Hartman commit 50b51ee3114bfce0ab7cadead016f2801588f1ca Author: Josef Bacik Date: Thu Mar 6 19:01:07 2014 -0500 Btrfs: fix deadlock with nested trans handles commit 3bbb24b20a8800158c33eca8564f432dd14d0bf3 upstream. Zach found this deadlock that would happen like this btrfs_end_transaction <- reduce trans->use_count to 0 btrfs_run_delayed_refs btrfs_cow_block find_free_extent btrfs_start_transaction <- increase trans->use_count to 1 allocate chunk btrfs_end_transaction <- decrease trans->use_count to 0 btrfs_run_delayed_refs lock tree block we are cowing above ^^ We need to only decrease trans->use_count if it is above 1, otherwise leave it alone. This will make nested trans be the only ones who decrease their added ref, and will let us get rid of the trans->use_count++ hack if we have to commit the transaction. Thanks, Reported-by: Zach Brown Signed-off-by: Josef Bacik Tested-by: Zach Brown Signed-off-by: Chris Mason Signed-off-by: Greg Kroah-Hartman commit 53dcb28868912eea366550ce30c0f5e17cbf0099 Author: Hidetoshi Seto Date: Wed Feb 5 16:34:38 2014 +0900 Btrfs: skip submitting barrier for missing device commit f88ba6a2a44ee98e8d59654463dc157bb6d13c43 upstream. I got an error on v3.13: BTRFS error (device sdf1) in write_all_supers:3378: errno=-5 IO failure (errors while submitting device barriers.) how to reproduce: > mkfs.btrfs -f -d raid1 /dev/sdf1 /dev/sdf2 > wipefs -a /dev/sdf2 > mount -o degraded /dev/sdf1 /mnt > btrfs balance start -f -sconvert=single -mconvert=single -dconvert=single /mnt The reason of the error is that barrier_all_devices() failed to submit barrier to the missing device. However it is clear that we cannot do anything on missing device, and also it is not necessary to care chunks on the missing device. This patch stops sending/waiting barrier if device is missing. Signed-off-by: Hidetoshi Seto Signed-off-by: Josef Bacik Signed-off-by: Greg Kroah-Hartman commit 7de24f7b0ddb815d7a8375354a9612264092edcb Author: Mark Tinguely Date: Fri Apr 4 07:10:49 2014 +1100 xfs: fix directory hash ordering bug commit c88547a8119e3b581318ab65e9b72f27f23e641d upstream. Commit f5ea1100 ("xfs: add CRCs to dir2/da node blocks") introduced in 3.10 incorrectly converted the btree hash index array pointer in xfs_da3_fixhashpath(). It resulted in the the current hash always being compared against the first entry in the btree rather than the current block index into the btree block's hash entry array. As a result, it was comparing the wrong hashes, and so could misorder the entries in the btree. For most cases, this doesn't cause any problems as it requires hash collisions to expose the ordering problem. However, when there are hash collisions within a directory there is a very good probability that the entries will be ordered incorrectly and that actually matters when duplicate hashes are placed into or removed from the btree block hash entry array. This bug results in an on-disk directory corruption and that results in directory verifier functions throwing corruption warnings into the logs. While no data or directory entries are lost, access to them may be compromised, and attempts to remove entries from a directory that has suffered from this corruption may result in a filesystem shutdown. xfs_repair will fix the directory hash ordering without data loss occuring. [dchinner: wrote useful a commit message] Reported-by: Hannes Frederic Sowa Signed-off-by: Mark Tinguely Reviewed-by: Ben Myers Signed-off-by: Dave Chinner Signed-off-by: Greg Kroah-Hartman commit 1c23ab6f8860f1d5823868d868314f674333a8b3 Author: Jan Kara Date: Thu Apr 3 14:46:23 2014 -0700 bdi: avoid oops on device removal commit 5acda9d12dcf1ad0d9a5a2a7c646de3472fa7555 upstream. After commit 839a8e8660b6 ("writeback: replace custom worker pool implementation with unbound workqueue") when device is removed while we are writing to it we crash in bdi_writeback_workfn() -> set_worker_desc() because bdi->dev is NULL. This can happen because even though bdi_unregister() cancels all pending flushing work, nothing really prevents new ones from being queued from balance_dirty_pages() or other places. Fix the problem by clearing BDI_registered bit in bdi_unregister() and checking it before scheduling of any flushing work. Fixes: 839a8e8660b6777e7fe4e80af1a048aebe2b5977 Reviewed-by: Tejun Heo Signed-off-by: Jan Kara Cc: Derek Basehore Cc: Jens Axboe Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit 67001f3a0bc4373452a2a3241e4b038bdc796cda Author: Derek Basehore Date: Thu Apr 3 14:46:22 2014 -0700 backing_dev: fix hung task on sync commit 6ca738d60c563d5c6cf6253ee4b8e76fa77b2b9e upstream. bdi_wakeup_thread_delayed() used the mod_delayed_work() function to schedule work to writeback dirty inodes. The problem with this is that it can delay work that is scheduled for immediate execution, such as the work from sync_inodes_sb(). This can happen since mod_delayed_work() can now steal work from a work_queue. This fixes the problem by using queue_delayed_work() instead. This is a regression caused by commit 839a8e8660b6 ("writeback: replace custom worker pool implementation with unbound workqueue"). The reason that this causes a problem is that laptop-mode will change the delay, dirty_writeback_centisecs, to 60000 (10 minutes) by default. In the case that bdi_wakeup_thread_delayed() races with sync_inodes_sb(), sync will be stopped for 10 minutes and trigger a hung task. Even if dirty_writeback_centisecs is not long enough to cause a hung task, we still don't want to delay sync for that long. We fix the problem by using queue_delayed_work() when we want to schedule writeback sometime in future. This function doesn't change the timer if it is already armed. For the same reason, we also change bdi_writeback_workfn() to immediately queue the work again in the case that the work_list is not empty. The same problem can happen if the sync work is run on the rescue worker. [jack@suse.cz: update changelog, add comment, use bdi_wakeup_thread_delayed()] Signed-off-by: Derek Basehore Reviewed-by: Jan Kara Cc: Alexander Viro Reviewed-by: Tejun Heo Cc: Greg Kroah-Hartman Cc: "Darrick J. Wong" Cc: Derek Basehore Cc: Kees Cook Cc: Benson Leung Cc: Sonny Rao Cc: Luigi Semenzato Cc: Jens Axboe Cc: Dave Chinner Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit e5cf61cfe83f493f118d93853d4bad367367e2b8 Author: Roberto Sassu Date: Mon Feb 3 13:56:04 2014 +0100 ima: restore the original behavior for sending data with ima template commit c019e307ad82a8ee652b8ccbacf69ae94263b07b upstream. With the new template mechanism introduced in IMA since kernel 3.13, the format of data sent through the binary_runtime_measurements interface is slightly changed. Now, for a generic measurement, the format of template data (after the template name) is: template_len | field1_len | field1 | ... | fieldN_len | fieldN In addition, fields containing a string now include the '\0' termination character. Instead, the format for the 'ima' template should be: SHA1 digest | event name length | event name It must be noted that while in the IMA 3.13 code 'event name length' is 'IMA_EVENT_NAME_LEN_MAX + 1' (256 bytes), so that the template digest is calculated correctly, and 'event name' contains '\0', in the pre 3.13 code 'event name length' is exactly the string length and 'event name' does not contain the termination character. The patch restores the behavior of the IMA code pre 3.13 for the 'ima' template so that legacy userspace tools obtain a consistent behavior when receiving data from the binary_runtime_measurements interface regardless of which kernel version is used. Signed-off-by: Roberto Sassu Signed-off-by: Mimi Zohar Signed-off-by: Greg Kroah-Hartman commit cf151ce1869c43c271163eba3de13d13ceac282e Author: Claudio Takahasi Date: Thu Jul 25 16:34:24 2013 -0300 Bluetooth: Fix removing Long Term Key commit 5981a8821b774ada0be512fd9bad7c241e17657e upstream. This patch fixes authentication failure on LE link re-connection when BlueZ acts as slave (peripheral). LTK is removed from the internal list after its first use causing PIN or Key missing reply when re-connecting the link. The LE Long Term Key Request event indicates that the master is attempting to encrypt or re-encrypt the link. Pre-condition: BlueZ host paired and running as slave. How to reproduce(master): 1) Establish an ACL LE encrypted link 2) Disconnect the link 3) Try to re-establish the ACL LE encrypted link (fails) > HCI Event: LE Meta Event (0x3e) plen 19 LE Connection Complete (0x01) Status: Success (0x00) Handle: 64 Role: Slave (0x01) ... @ Device Connected: 00:02:72:DC:29:C9 (1) flags 0x0000 > HCI Event: LE Meta Event (0x3e) plen 13 LE Long Term Key Request (0x05) Handle: 64 Random number: 875be18439d9aa37 Encryption diversifier: 0x76ed < HCI Command: LE Long Term Key Request Reply (0x08|0x001a) plen 18 Handle: 64 Long term key: 2aa531db2fce9f00a0569c7d23d17409 > HCI Event: Command Complete (0x0e) plen 6 LE Long Term Key Request Reply (0x08|0x001a) ncmd 1 Status: Success (0x00) Handle: 64 > HCI Event: Encryption Change (0x08) plen 4 Status: Success (0x00) Handle: 64 Encryption: Enabled with AES-CCM (0x01) ... @ Device Disconnected: 00:02:72:DC:29:C9 (1) reason 3 < HCI Command: LE Set Advertise Enable (0x08|0x000a) plen 1 Advertising: Enabled (0x01) > HCI Event: Command Complete (0x0e) plen 4 LE Set Advertise Enable (0x08|0x000a) ncmd 1 Status: Success (0x00) > HCI Event: LE Meta Event (0x3e) plen 19 LE Connection Complete (0x01) Status: Success (0x00) Handle: 64 Role: Slave (0x01) ... @ Device Connected: 00:02:72:DC:29:C9 (1) flags 0x0000 > HCI Event: LE Meta Event (0x3e) plen 13 LE Long Term Key Request (0x05) Handle: 64 Random number: 875be18439d9aa37 Encryption diversifier: 0x76ed < HCI Command: LE Long Term Key Request Neg Reply (0x08|0x001b) plen 2 Handle: 64 > HCI Event: Command Complete (0x0e) plen 6 LE Long Term Key Request Neg Reply (0x08|0x001b) ncmd 1 Status: Success (0x00) Handle: 64 > HCI Event: Disconnect Complete (0x05) plen 4 Status: Success (0x00) Handle: 64 Reason: Authentication Failure (0x05) @ Device Disconnected: 00:02:72:DC:29:C9 (1) reason 0 Signed-off-by: Claudio Takahasi Signed-off-by: Johan Hedberg Signed-off-by: Greg Kroah-Hartman commit d0aed5c2b40ad7226b025c36d238e437a98f1162 Author: Oleg Nesterov Date: Wed Apr 2 17:45:05 2014 +0200 pid_namespace: pidns_get() should check task_active_pid_ns() != NULL commit d23082257d83e4bc89727d5aedee197e907999d2 upstream. pidns_get()->get_pid_ns() can hit ns == NULL. This task_struct can't go away, but task_active_pid_ns(task) is NULL if release_task(task) was already called. Alternatively we could change get_pid_ns(ns) to check ns != NULL, but it seems that other callers are fine. Signed-off-by: Oleg Nesterov Cc: Eric W. Biederman ebiederm@xmission.com> Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit 0561b77b0f366aa168fa23aa07b606f5f49638c8 Author: Alan Stern Date: Wed Jan 15 15:37:04 2014 -0500 SCSI: sd: don't fail if the device doesn't recognize SYNCHRONIZE CACHE commit 7aae51347b21eb738dc1981df1365b57a6c5ee4e upstream. Evidently some wacky USB-ATA bridges don't recognize the SYNCHRONIZE CACHE command, as shown in this email thread: http://marc.info/?t=138978356200002&r=1&w=2 The fact that we can't tell them to drain their caches shouldn't prevent the system from going into suspend. Therefore sd_sync_cache() shouldn't return an error if the device replies with an Invalid Command ASC. Signed-off-by: Alan Stern Reported-by: Sven Neumann Tested-by: Daniel Mack Signed-off-by: James Bottomley Signed-off-by: Greg Kroah-Hartman commit 4f1f4df2c1aee858da70f91970f6c9cb651a63de Author: Peter Hurley Date: Sat Feb 22 07:31:21 2014 -0500 tty: Fix low_latency BUG commit a9c3f68f3cd8d55f809fbdb0c138ed061ea1bd25 upstream. The user-settable knob, low_latency, has been the source of several BUG reports which stem from flush_to_ldisc() running in interrupt context. Since 3.12, which added several sleeping locks (termios_rwsem and buf->lock) to the input processing path, the frequency of these BUG reports has increased. Note that changes in 3.12 did not introduce this regression; sleeping locks were first added to the input processing path with the removal of the BKL from N_TTY in commit a88a69c91256418c5907c2f1f8a0ec0a36f9e6cc, 'n_tty: Fix loss of echoed characters and remove bkl from n_tty' and later in commit 38db89799bdf11625a831c5af33938dcb11908b6, 'tty: throttling race fix'. Since those changes, executing flush_to_ldisc() in interrupt_context (ie, low_latency set), is unsafe. However, since most devices do not validate if the low_latency setting is appropriate for the context (process or interrupt) in which they receive data, some reports are due to misconfiguration. Further, serial dma devices for which dma fails, resort to interrupt receiving as a backup without resetting low_latency. Historically, low_latency was used to force wake-up the reading process rather than wait for the next scheduler tick. The effect was to trim multiple milliseconds of latency from when the process would receive new data. Recent tests [1] have shown that the reading process now receives data with only 10's of microseconds latency without low_latency set. Remove the low_latency rx steering from tty_flip_buffer_push(); however, leave the knob as an optional hint to drivers that can tune their rx fifos and such like. Cleanup stale code comments regarding low_latency. [1] https://lkml.org/lkml/2014/2/20/434 "Yay.. thats an annoying historical pain in the butt gone." -- Alan Cox Reported-by: Beat Bolli Reported-by: Pavel Roskin Acked-by: David Sterba Cc: Grant Edwards Cc: Stanislaw Gruszka Cc: Hal Murray Signed-off-by: Peter Hurley Signed-off-by: Alan Cox Signed-off-by: Greg Kroah-Hartman commit 192ae8160c9465dd3f0c6650b29cc4ad019bc4e2 Author: Hannes Reinecke Date: Thu Feb 27 12:30:51 2014 +0100 tty: Set correct tty name in 'active' sysfs attribute commit 723abd87f6e536f1353c8f64f621520bc29523a3 upstream. The 'active' sysfs attribute should refer to the currently active tty devices the console is running on, not the currently active console. The console structure doesn't refer to any device in sysfs, only the tty the console is running on has. So we need to print out the tty names in 'active', not the console names. There is one special-case, which is tty0. If the console is directed to it, we want 'tty0' to show up in the file, so user-space knows that the messages get forwarded to the active VT. The ->device() callback would resolve tty0, though. Hence, treat it special and don't call into the VT layer to resolve it (plymouth is known to depend on it). Cc: Lennart Poettering Cc: Kay Sievers Cc: Jiri Slaby Signed-off-by: Werner Fink Signed-off-by: Hannes Reinecke Signed-off-by: David Herrmann Signed-off-by: Greg Kroah-Hartman commit 29b311867df2f1f7aee9ec42517becbbb4aeeeef Author: Tejun Heo Date: Wed Apr 2 16:40:52 2014 -0400 kernfs: protect lazy kernfs_iattrs allocation with mutex commit 4afddd60a770560d370d6f85c5aef57c16bf7502 upstream. kernfs_iattrs is allocated lazily when operations which require it take place; unfortunately, the lazy allocation and returning weren't properly synchronized and when there are multiple concurrent operations, it might end up returning kernfs_iattrs which hasn't finished initialization yet or different copies to different callers. Fix it by synchronizing with a mutex. This can be smarter with memory barriers but let's go there if it actually turns out to be necessary. Signed-off-by: Tejun Heo Link: http://lkml.kernel.org/g/533ABA32.9080602@oracle.com Reported-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit 52d6c48c9db5b020e6595067a7f1b00562d26434 Author: Richard Cochran Date: Wed Mar 5 17:10:52 2014 +0100 kernfs: fix off by one error. commit 88391d49abb7d8dee91d405f96bd9e003cb6798d upstream. The hash values 0 and 1 are reserved for magic directory entries, but the code only prevents names hashing to 0. This patch fixes the test to also prevent hash value 1. Signed-off-by: Richard Cochran Reviewed-by: "Eric W. Biederman" Signed-off-by: Greg Kroah-Hartman commit 541d5f25b500f84ed324df3e2b06f2235cefae52 Author: Ian Abbott Date: Thu Apr 10 19:41:57 2014 +0100 staging: comedi: fix circular locking dependency in comedi_mmap() commit b34aa86f12e8848ba453215602c8c50fa63c4cb3 upstream. Mmapping a comedi data buffer with lockdep checking enabled produced the following kernel debug messages: ====================================================== [ INFO: possible circular locking dependency detected ] 3.5.0-rc3-ija1+ #9 Tainted: G C ------------------------------------------------------- comedi_test/4160 is trying to acquire lock: (&dev->mutex#2){+.+.+.}, at: [] comedi_mmap+0x57/0x1d9 [comedi] but task is already holding lock: (&mm->mmap_sem){++++++}, at: [] vm_mmap_pgoff+0x41/0x76 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #1 (&mm->mmap_sem){++++++}: [] lock_acquire+0x97/0x105 [] might_fault+0x6d/0x90 [] do_devinfo_ioctl.isra.7+0x11e/0x14c [comedi] [] comedi_unlocked_ioctl+0x256/0xe48 [comedi] [] vfs_ioctl+0x18/0x34 [] do_vfs_ioctl+0x382/0x43c [] sys_ioctl+0x42/0x65 [] system_call_fastpath+0x16/0x1b -> #0 (&dev->mutex#2){+.+.+.}: [] __lock_acquire+0x101d/0x1591 [] lock_acquire+0x97/0x105 [] mutex_lock_nested+0x46/0x2a4 [] comedi_mmap+0x57/0x1d9 [comedi] [] mmap_region+0x281/0x492 [] do_mmap_pgoff+0x26b/0x2a7 [] vm_mmap_pgoff+0x5d/0x76 [] sys_mmap_pgoff+0xc7/0x10d [] sys_mmap+0x16/0x20 [] system_call_fastpath+0x16/0x1b other info that might help us debug this: Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(&mm->mmap_sem); lock(&dev->mutex#2); lock(&mm->mmap_sem); lock(&dev->mutex#2); *** DEADLOCK *** To avoid the circular dependency, just try to get the lock in `comedi_mmap()` instead of blocking. Since the comedi device's main mutex is heavily used, do a down-read of its `attach_lock` rwsemaphore instead. Trying to down-read `attach_lock` should only fail if some task has down-write locked it, and that is only done while the comedi device is being attached to or detached from a low-level hardware device. Unfortunately, acquiring the `attach_lock` doesn't prevent another task replacing the comedi data buffer we are trying to mmap. The details of the buffer are held in a `struct comedi_buf_map` and pointed to by `s->async->buf_map` where `s` is the comedi subdevice whose buffer we are trying to map. The `struct comedi_buf_map` is already reference counted with a `struct kref`, so we can stop it being freed prematurely. Modify `comedi_mmap()` to call new function `comedi_buf_map_from_subdev_get()` to read the subdevice's current buffer map pointer and increment its reference instead of accessing `async->buf_map` directly. Call `comedi_buf_map_put()` to decrement the reference once the buffer map structure has been dealt with. (Note that `comedi_buf_map_put()` does nothing if passed a NULL pointer.) `comedi_buf_map_from_subdev_get()` checks the subdevice's buffer map pointer has been set and the buffer map has been initialized enough for `comedi_mmap()` to deal with it (specifically, check the `n_pages` member has been set to a non-zero value). If all is well, the buffer map's reference is incremented and a pointer to it is returned. The comedi subdevice's spin-lock is used to protect the checks. Also use the spin-lock in `__comedi_buf_alloc()` and `__comedi_buf_free()` to protect changes to the subdevice's buffer map structure pointer and the buffer map structure's `n_pages` member. (This checking of `n_pages` is a bit clunky and I [Ian Abbott] plan to deal with it in the future.) Signed-off-by: Ian Abbott Signed-off-by: Greg Kroah-Hartman commit a0409781aaa8cea0daed9292ec0bd7326ba13368 Author: Ian Abbott Date: Thu Mar 13 15:30:39 2014 +0000 staging: comedi: 8255_pci: initialize MITE data window commit 268d1e799663b795cba15c64f5d29407786a9dd4 upstream. According to National Instruments' PCI-DIO-96/PXI-6508/PCI-6503 User Manual, the physical address in PCI BAR1 needs to be OR'ed with 0x80 and written to register offset 0xC0 in the "MITE" registers (BAR0). Do so during initialization of the National Instruments boards handled by the "8255_pci" driver. The boards were previously handled by the "ni_pcidio" driver, where the initialization was done by `mite_setup()` in the "mite" module. The "mite" module comes with too much extra baggage for the "8255_pci" driver to deal with so use a local, simpler initialization function. Signed-off-by: Ian Abbott Signed-off-by: Greg Kroah-Hartman commit 03132845a32cdd2cc4b1b913253e0f8a9f1b4494 Author: Lan Tianyu Date: Sat Mar 15 13:37:13 2014 -0400 ACPI / button: Add ACPI Button event via netlink routine commit 0bf6368ee8f25826d0645c0f7a4f17c8845356a4 upstream. Commit 1696d9d (ACPI: Remove the old /proc/acpi/event interface) removed ACPI Button event which originally was sent to userspace via /proc/acpi/event. This caused ACPI shutdown regression on gentoo in VirtualBox. Now ACPI events are sent to userspace via netlink, so add ACPI Button event back via netlink routine. References: https://bugzilla.kernel.org/show_bug.cgi?id=71721 Reported-and-tested-by: Richard Musil Signed-off-by: Lan Tianyu Signed-off-by: Rafael J. Wysocki Signed-off-by: Greg Kroah-Hartman commit 6b3ad9432a345bcea1fad2eab578db6cf4cbe76d Author: Mohit Kumar Date: Wed Apr 16 10:23:34 2014 -0600 PCI: designware: Fix iATU programming for cfg1, io and mem viewport commit 017fcdc30cdae18c0946eef1ece1f14b4c7897ba upstream. This patch corrects iATU programming for cfg1, io and mem viewport. Enable ATU only after configuring it. Signed-off-by: Mohit Kumar Signed-off-by: Ajay Khandelwal Signed-off-by: Bjorn Helgaas Acked-by: Jingoo Han Signed-off-by: Greg Kroah-Hartman commit 3bbf1685d4e0ad97e5b62fdc33e8144218ca595d Author: Mohit Kumar Date: Wed Feb 19 17:34:35 2014 +0530 PCI: designware: Fix RC BAR to be single 64-bit non-prefetchable memory BAR commit dbffdd6862e67d60703f2df66c558bf448f81d6e upstream. The Synopsys PCIe core provides one pair of 32-bit BARs (BAR 0 and BAR 1). The BARs can be configured as follows: - One 64-bit BAR: BARs 0 and 1 are combined to form a single 64-bit BAR - Two 32-bit BARs: BARs 0 and 1 are two independent 32-bit BARs This patch corrects 64-bit, non-prefetchable memory BAR configuration implemented in dw driver. Signed-off-by: Mohit Kumar Signed-off-by: Bjorn Helgaas Cc: Pratyush Anand Cc: Jingoo Han Cc: Arnd Bergmann Signed-off-by: Greg Kroah-Hartman commit 91800e9977fdf5cd3de2705b1ad1d803ccda48a8 Author: Neil Horman Date: Wed Mar 12 14:44:33 2014 -0400 x86: Adjust irq remapping quirk for older revisions of 5500/5520 chipsets commit 6f8a1b335fde143b7407036e2368d3cd6eb55674 upstream. Commit 03bbcb2e7e2 (iommu/vt-d: add quirk for broken interrupt remapping on 55XX chipsets) properly disables irq remapping on the 5500/5520 chipsets that don't correctly perform that feature. However, when I wrote it, I followed the errata sheet linked in that commit too closely, and explicitly tied the activation of the quirk to revision 0x13 of the chip, under the assumption that earlier revisions were not in the field. Recently a system was reported to be suffering from this remap bug and the quirk hadn't triggered, because the revision id register read at a lower value that 0x13, so the quirk test failed improperly. Given this, it seems only prudent to adjust this quirk so that any revision less than 0x13 has the quirk asserted. [ tglx: Removed the 0x12 comparison of pci id 3405 as this is covered by the <= 0x13 check already ] Signed-off-by: Neil Horman Cc: Ingo Molnar Cc: "H. Peter Anvin" Cc: x86@kernel.org Link: http://lkml.kernel.org/r/1394649873-14913-1-git-send-email-nhorman@tuxdriver.com Signed-off-by: Thomas Gleixner Signed-off-by: Greg Kroah-Hartman commit f2355ea2f4437d66b7807dbc75eb5f7874fffdfa Author: Jason Wang Date: Fri Feb 28 11:30:29 2014 +0800 x86, hyperv: Bypass the timer_irq_works() check commit ca3ba2a2f4a49a308e7d78c784d51b2332064f15 upstream. This patch bypass the timer_irq_works() check for hyperv guest since: - It was guaranteed to work. - timer_irq_works() may fail sometime due to the lpj calibration were inaccurate in a hyperv guest or a buggy host. In the future, we should get the tsc frequency from hypervisor and use preset lpj instead. [ hpa: I would prefer to not defer things to "the future" in the future... ] Cc: K. Y. Srinivasan Cc: Haiyang Zhang Acked-by: K. Y. Srinivasan Signed-off-by: Jason Wang Link: http://lkml.kernel.org/r/1393558229-14755-1-git-send-email-jasowang@redhat.com Signed-off-by: H. Peter Anvin Signed-off-by: Greg Kroah-Hartman commit d202b0192fad23c7b7d309770e305b27e0d787f1 Author: Jiri Slaby Date: Mon Apr 14 09:46:50 2014 -0500 Char: ipmi_bt_sm, fix infinite loop commit a94cdd1f4d30f12904ab528152731fb13a812a16 upstream. In read_all_bytes, we do unsigned char i; ... bt->read_data[0] = BMC2HOST; bt->read_count = bt->read_data[0]; ... for (i = 1; i <= bt->read_count; i++) bt->read_data[i] = BMC2HOST; If bt->read_data[0] == bt->read_count == 255, we loop infinitely in the 'for' loop. Make 'i' an 'int' instead of 'char' to get rid of the overflow and finish the loop after 255 iterations every time. Signed-off-by: Jiri Slaby Reported-and-debugged-by: Rui Hui Dian Cc: Tomas Cech Cc: Corey Minyard Cc: Signed-off-by: Corey Minyard Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit 99c3556cdd0d809f4082e790a3df0ab86ded5913 Author: Mikulas Patocka Date: Mon Apr 14 16:58:55 2014 -0400 user namespace: fix incorrect memory barriers commit e79323bd87808fdfbc68ce6c5371bd224d9672ee upstream. smp_read_barrier_depends() can be used if there is data dependency between the readers - i.e. if the read operation after the barrier uses address that was obtained from the read operation before the barrier. In this file, there is only control dependency, no data dependecy, so the use of smp_read_barrier_depends() is incorrect. The code could fail in the following way: * the cpu predicts that idx < entries is true and starts executing the body of the for loop * the cpu fetches map->extent[0].first and map->extent[0].count * the cpu fetches map->nr_extents * the cpu verifies that idx < extents is true, so it commits the instructions in the body of the for loop The problem is that in this scenario, the cpu read map->extent[0].first and map->nr_extents in the wrong order. We need a full read memory barrier to prevent it. Signed-off-by: Mikulas Patocka Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman