commit 485f25fcc014f2744754f22de395f745f2c7e492
Author: Greg Kroah-Hartman
Date:   Thu Jun 20 12:01:46 2013 -0700

    Linux 3.9.7

commit 8e8577e87943a83aa1b84b3d5202f1f4e8f088d0
Author: Nicolas Schichan
Date:   Thu Jun 6 19:00:46 2013 +0200

    ARM: Kirkwood: handle mv88f6282 cpu in __kirkwood_variant().

    commit 4089fe95bfed295c8ad38251d5fe02b6b0ba684c upstream.

    MPP_F6281_MASK would previously be returned when on mv88f6282, which
    would disallow some valid MPP configurations.

    Commit 830f8b91 (arm: plat-orion: fix printing of "MPP config
    unavailable on this hardware") made this problem visible as an invalid
    MPP configuration is now correctly detected and not applied.

    Signed-off-by: Nicolas Schichan
    Signed-off-by: Jason Cooper
    Signed-off-by: Greg Kroah-Hartman

commit 43553a33ff54c011225967861e34f3f4047a3b7e
Author: Nithin Sujir
Date:   Wed Jun 12 11:08:59 2013 -0700

    tg3: Wait for boot code to finish after power on

    commit df465abfe06f7dc4f33f4a96d17f096e9e8ac917 upstream.

    Some systems that don't need wake-on-lan may choose to power down the
    chip on system standby. Upon resume, the power on causes the boot code
    to start up and initialize the hardware. On one new platform, this is
    causing the device to go into a bad state due to a race between the
    driver and boot code, once every several hundred resumes. The same race
    exists on open since we come up from a power on.

    This patch adds a wait for boot code signature at the beginning of
    tg3_init_hw() which is common to both cases. If there has not been a
    power-off or the boot code has already completed, the signature will be
    present and poll_fw() returns immediately. Also return immediately if
    the device does not have firmware.

    Signed-off-by: Nithin Nayak Sujir
    Signed-off-by: Michael Chan
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

commit 70b6bf5ba36ef71e7301553ee79f718380a546a7
Author: Johan Hovold
Date:   Mon Jun 10 18:29:39 2013 +0200

    USB: spcp8x5: fix device initialisation at open

    commit 5e4211f1c47560c36a8b3d4544dfd866dcf7ccd0 upstream.

    Do not use uninitialised termios data to determine when to configure the
    device at open.

    Signed-off-by: Johan Hovold
    Signed-off-by: Greg Kroah-Hartman

commit 4f0856222860481ce20b248cdd824dce7c995d74
Author: Johan Hovold
Date:   Mon Jun 10 18:29:37 2013 +0200

    USB: f81232: fix device initialisation at open

    commit 21886725d58e92188159731c7c1aac803dd6b9dc upstream.

    Do not use uninitialised termios data to determine when to configure the
    device at open.

    This also prevents stack data from leaking to userspace.

    Signed-off-by: Johan Hovold
    Signed-off-by: Greg Kroah-Hartman

commit 96341bdc2ab71183c94d8fa5332eef78ada20423
Author: Johan Hovold
Date:   Mon Jun 10 18:29:38 2013 +0200

    USB: pl2303: fix device initialisation at open

    commit 2d8f4447b58bba5f8cb895c07690434c02307eaf upstream.

    Do not use uninitialised termios data to determine when to configure the
    device at open.

    This also prevents stack data from leaking to userspace in the OOM error
    path.

    Signed-off-by: Johan Hovold
    Signed-off-by: Greg Kroah-Hartman

commit 139f4ebcf7a9f71a55666a3537c16b3eaa7f18f0
Author: Alexander Shishkin
Date:   Tue Jun 11 13:41:48 2013 +0300

    usb: chipidea: fix id change handling

    commit 0c3f3dc68bb6e6950e8cd7851e7778c550e8dfb4 upstream.

    Re-enable chipidea irq even if there's no role changing to do. This is
    a problem since b183c19f ("USB: chipidea: re-order irq handling to avoid
    unhandled irqs"); when it manifests, chipidea irq gets disabled for good.

    Signed-off-by: Alexander Shishkin
    Signed-off-by: Greg Kroah-Hartman

commit 37759cb02de6e811c7e3d2c11ac2944666710392
Author: Benjamin Herrenschmidt
Date:   Sat Jun 15 12:13:40 2013 +1000

    powerpc: Fix missing/delayed calls to irq_work

    commit 230b3034793247f61e6a0b08c44cf415f6d92981 upstream.

    When replaying interrupts (as a result of the interrupt occurring
    while soft-disabled), in the case of the decrementer, we are exclusively
    testing for a pending timer target. However we also use decrementer
    interrupts to trigger the new "irq_work", which in this case would
    be missed.

    This changes the logic to force a replay in both cases of a timer
    boundary reached and a decrementer interrupt having actually
    occurred while disabled. The former test is still useful to catch
    cases where a CPU having been hard-disabled for a long time completely
    misses the interrupt due to a decrementer rollover.

    Signed-off-by: Benjamin Herrenschmidt
    Tested-by: Steven Rostedt
    Signed-off-by: Greg Kroah-Hartman

commit 5cf1b34ca312b3626201ab7bafe5dc884e028616
Author: Paul Mackerras
Date:   Fri Jun 14 20:07:41 2013 +1000

    powerpc: Fix emulation of illegal instructions on PowerNV platform

    commit bf593907f7236e95698a76b7c7a2bbf8b1165327 upstream.

    Normally, the kernel emulates a few instructions that are unimplemented
    on some processors (e.g. the old dcba instruction), or privileged (e.g.
    mfpvr). The emulation of unimplemented instructions is currently not
    working on the PowerNV platform. The reason is that on these machines,
    unimplemented and illegal instructions cause a hypervisor emulation
    assist interrupt, rather than a program interrupt as on older CPUs.
    Our vector for the emulation assist interrupt just calls
    program_check_exception() directly, without setting the bit in SRR1
    that indicates an illegal instruction interrupt. This fixes it by
    making the emulation assist interrupt set that bit before calling
    program_check_exception(). With this, old programs that use
    no-longer implemented instructions such as dcba now work again.

    Signed-off-by: Paul Mackerras
    Signed-off-by: Benjamin Herrenschmidt
    Signed-off-by: Greg Kroah-Hartman

commit a1fb10e6d2ee287036139b4f4a75763ffbf9b70b
Author: Michael Ellerman
Date:   Thu Jun 13 21:04:56 2013 +1000

    powerpc: Fix stack overflow crash in resume_kernel when ftracing

    commit 0e37739b1c96d65e6433998454985de994383019 upstream.

    It's possible for us to crash when running with ftrace enabled, eg:

      Bad kernel stack pointer bffffd12 at c00000000000a454
      cpu 0x3: Vector: 300 (Data Access) at [c00000000ffe3d40]
          pc: c00000000000a454: resume_kernel+0x34/0x60
          lr: c00000000000335c: performance_monitor_common+0x15c/0x180
          sp: bffffd12
         msr: 8000000000001032
         dar: bffffd12
       dsisr: 42000000

    If we look at current's stack (paca->__current->stack) we see it is
    equal to c0000002ecab0000. Our stack is 16K, and comparing to
    paca->kstack (c0000002ecab3e30) we can see that we have overflowed our
    kernel stack. This leads to us writing over our struct thread_info, and
    in this case we have corrupted thread_info->flags and set
    _TIF_EMULATE_STACK_STORE.

    Dumping the stack we see:

      3:mon> t c0000002ecab0000
      [c0000002ecab0000] c00000000002131c .performance_monitor_exception+0x5c/0x70
      [c0000002ecab0080] c00000000000335c performance_monitor_common+0x15c/0x180
      --- Exception: f01 (Performance Monitor) at c0000000000fb2ec .trace_hardirqs_off+0x1c/0x30
      [c0000002ecab0370] c00000000016fdb0 .trace_graph_entry+0xb0/0x280 (unreliable)
      [c0000002ecab0410] c00000000003d038 .prepare_ftrace_return+0x98/0x130
      [c0000002ecab04b0] c00000000000a920 .ftrace_graph_caller+0x14/0x28
      [c0000002ecab0520] c0000000000d6b58 .idle_cpu+0x18/0x90
      [c0000002ecab05a0] c00000000000a934 .return_to_handler+0x0/0x34
      [c0000002ecab0620] c00000000001e660 .timer_interrupt+0x160/0x300
      [c0000002ecab06d0] c0000000000025dc decrementer_common+0x15c/0x180
      --- Exception: 901 (Decrementer) at c0000000000104d4 .arch_local_irq_restore+0x74/0xa0
      [c0000002ecab09c0] c0000000000fe044 .trace_hardirqs_on+0x14/0x30 (unreliable)
      [c0000002ecab0fb0] c00000000016fe3c .trace_graph_entry+0x13c/0x280
      [c0000002ecab1050] c00000000003d038 .prepare_ftrace_return+0x98/0x130
      [c0000002ecab10f0] c00000000000a920 .ftrace_graph_caller+0x14/0x28
      [c0000002ecab1160] c0000000000161f0 .__ppc64_runlatch_on+0x10/0x40
      [c0000002ecab11d0] c00000000000a934 .return_to_handler+0x0/0x34
      --- Exception: 901 (Decrementer) at c0000000000104d4 .arch_local_irq_restore+0x74/0xa0

      ... and so on

    __ppc64_runlatch_on() is called from RUNLATCH_ON in the exception entry
    path. At that point the irq state is not consistent, ie. interrupts are
    hard disabled (by the exception entry), but the paca soft-enabled flag
    may be out of sync.

    This leads to the local_irq_restore() in trace_graph_entry() actually
    enabling interrupts, which we do not want. Because we have not yet
    reprogrammed the decrementer we immediately take another decrementer
    exception, and recurse.

    The fix is twofold. Firstly make sure we call DISABLE_INTS before
    calling RUNLATCH_ON.
    The badly named DISABLE_INTS actually reconciles the irq state in the
    paca with the hardware, making it safe again to call
    local_irq_save/restore().

    Although that should be sufficient to fix the bug, we also mark the
    runlatch routines as notrace. They are called very early in the
    exception entry and we are asking for trouble tracing them. They are
    also fairly uninteresting and tracing them just adds unnecessary
    overhead.

    [ This regression was introduced by fe1952fc0afb9a2e4c79f103c08aef5d13db1873
      "powerpc: Rework runlatch code" by myself --BenH
    ]

    Signed-off-by: Michael Ellerman
    Signed-off-by: Benjamin Herrenschmidt
    Signed-off-by: Greg Kroah-Hartman

commit d20f2aa7720253fee3cac2ac0dc179517a16efca
Author: Matthew Garrett
Date:   Sat Jun 1 16:06:20 2013 -0400

    Modify UEFI anti-bricking code

    commit f8b8404337de4e2466e2e1139ea68b1f8295974f upstream.

    This patch reworks the UEFI anti-bricking code, including an effective
    reversion of cc5a080c and 31ff2f20. It turns out that calling
    QueryVariableInfo() from boot services results in some firmware
    implementations jumping to physical addresses even after entering
    virtual mode, so until we have 1:1 mappings for UEFI runtime space this
    isn't going to work so well.

    Reverting these gets us back to the situation where we'd refuse to
    create variables on some systems because they classify deleted variables
    as "used" until the firmware triggers a garbage collection run, which
    they won't do until they reach a lower threshold. This results in it
    being impossible to install a bootloader, which is unhelpful.

    Feedback from Samsung indicates that the firmware doesn't need more than
    5KB of storage space for its own purposes, so that seems like a
    reasonable threshold. However, there's still no guarantee that a
    platform will attempt garbage collection merely because it drops below
    this threshold. It seems that this is often only triggered if an attempt
    to write generates a genuine EFI_OUT_OF_RESOURCES error.
    We can force that by attempting to create a variable larger than the
    remaining space. This should fail, but if it somehow succeeds we can
    then immediately delete it.

    I've tested this on the UEFI machines I have available, but I don't have
    a Samsung and so can't verify that it avoids the bricking problem.

    Signed-off-by: Matthew Garrett
    Signed-off-by: Lee, Chun-Y [ dummy variable cleanup ]
    Signed-off-by: Matt Fleming
    Signed-off-by: Greg Kroah-Hartman

commit 507761dc72908cfc95a18b4ffabfed746f9cfcd5
Author: Sage Weil
Date:   Mon Mar 25 10:26:30 2013 -0700

    libceph: wrap auth methods in a mutex

    commit e9966076cdd952e19f2dd4854cd719be0d7cbebc upstream.

    The auth code is called from a variety of contexts, including the
    mon_client (protected by the monc's mutex) and the messenger callbacks
    (currently protected by nothing). Avoid chaos by protecting all auth
    state with a mutex. Nothing is blocking, so this should be simple and
    lightweight.

    Signed-off-by: Sage Weil
    Reviewed-by: Alex Elder
    Signed-off-by: Greg Kroah-Hartman

commit 1a14fad287936577daf0533caa94f5c5038842e7
Author: Sage Weil
Date:   Mon Mar 25 10:26:14 2013 -0700

    libceph: wrap auth ops in wrapper functions

    commit 27859f9773e4a0b2042435b13400ee2c891a61f4 upstream.

    Use wrapper functions that check whether the auth op exists so that
    callers do not need a bunch of conditional checks. Simplifies the
    external interface.

    Signed-off-by: Sage Weil
    Reviewed-by: Alex Elder
    Signed-off-by: Greg Kroah-Hartman

commit d2c7223497cf8228416c70e3f4238ddd6c5bdf3c
Author: Sage Weil
Date:   Mon Mar 25 10:26:01 2013 -0700

    libceph: add update_authorizer auth method

    commit 0bed9b5c523d577378b6f83eab5835fe30c27208 upstream.

    Currently the messenger calls out to a get_authorizer con op, which will
    create a new authorizer if it doesn't yet have one. In the meantime, when
    we rotate our service keys, the authorizer doesn't get updated.
    Eventually it will be rejected by the server on a new connection attempt
    and get invalidated, and we will then rebuild a new authorizer, but this
    is not ideal.

    Instead, if we do have an authorizer, call a new update_authorizer op
    that will verify that the current authorizer is using the latest secret.
    If it is not, we will build a new one that does. This avoids the
    transient failure.

    This fixes one of the sorry sequence of events for bug

        http://tracker.ceph.com/issues/4282

    Signed-off-by: Sage Weil
    Reviewed-by: Alex Elder
    Signed-off-by: Greg Kroah-Hartman

commit def71c018b0cc6c1eb7f30aa3268e764f48e9cf0
Author: Sage Weil
Date:   Mon Mar 25 10:25:49 2013 -0700

    libceph: fix authorizer invalidation

    commit 4b8e8b5d78b8322351d44487c1b76f7e9d3412bc upstream.

    We were invalidating the authorizer by removing the ticket handler
    entirely. This was effective in inducing us to request a new authorizer,
    but in the meantime it meant that any authorizer we generated would get
    a new and initialized handler with secret_id=0, which would always be
    rejected by the server side with a confusing error message:

        auth: could not find secret_id=0
        cephx: verify_authorizer could not get service secret for service osd secret_id=0

    Instead, simply clear the validity field. This will still induce the
    auth code to request a new secret, but will let us continue to use the
    old ticket in the meantime. The messenger code will probably continue to
    fail, but the exponential backoff will kick in, and eventually we will
    get a new (hopefully more valid) ticket from the mon and be able to
    continue.

    Signed-off-by: Sage Weil
    Reviewed-by: Alex Elder
    Signed-off-by: Greg Kroah-Hartman

commit 5725b3e6e06bc80daaed802e4fd58e094c20a191
Author: Sage Weil
Date:   Mon Mar 25 09:30:13 2013 -0700

    libceph: clear messenger auth_retry flag when we authenticate

    commit 20e55c4cc758e4dccdfd92ae8e9588dd624b2cd7 upstream.

    We maintain a counter of failed auth attempts to allow us to retry once
    before failing.
    However, if the second attempt succeeds, the flag isn't cleared, which
    makes us think auth failed again later when the connection resets for
    other reasons (like a socket error).

    This is one part of the sorry sequence of events in bug

        http://tracker.ceph.com/issues/4282

    Signed-off-by: Sage Weil
    Reviewed-by: Alex Elder
    Signed-off-by: Greg Kroah-Hartman

commit aed4802d2a0d2f6c7179a2c03e184860741788c0
Author: Ben Skeggs
Date:   Mon Jun 3 16:40:14 2013 +1000

    drm/nv50/kms: use dac loadval from vbios, where it's available

    commit d40ee48acde16894fb3b241d7e896d5fa84e0f10 upstream.

    Regression from merging the old nv50/nvd9 code together, and may be
    needed to fully fix fdo#64904.

    The value is ignored completely by the hardware starting from nva3.

    Reported-by: Emil Velikov
    Signed-off-by: Ben Skeggs

commit b242745947ed7562ad91dad9e9fde1a92e40d666
Author: Ben Skeggs
Date:   Mon Jun 3 16:07:06 2013 +1000

    drm/nv50/disp: force dac power state during load detect

    commit ea9197cc323839ef3d5280c0453b2c622caa6bc7 upstream.

    fdo#64904

    Reported-by: Gerhard Bräunlich
    Signed-off-by: Ben Skeggs

commit 3aaf1b31e90c528ce6ff6e9aa71497cd50f31b34
Author: Kees Cook
Date:   Wed Jun 5 11:47:18 2013 -0700

    x86: Fix typo in kexec register clearing

    commit c8a22d19dd238ede87aa0ac4f7dbea8da039b9c1 upstream.

    Fixes a typo in register clearing code. Thanks to PaX Team for fixing
    this originally, and James Troup for pointing it out.

    Signed-off-by: Kees Cook
    Link: http://lkml.kernel.org/r/20130605184718.GA8396@www.outflux.net
    Cc: PaX Team
    Signed-off-by: H. Peter Anvin
    Signed-off-by: Greg Kroah-Hartman

commit 661b492837f4fc200be6aea14306c0563c4310aa
Author: Yinghai Lu
Date:   Fri May 31 08:53:07 2013 -0700

    x86: Fix adjust_range_size_mask calling position

    commit 7de3d66b1387ddf5a37d9689e5eb8510fb75c765 upstream.

    Commit

        8d57470d x86, mm: setup page table in top-down

    causes a kernel panic while setting mem=2G.

         [mem 0x00000000-0x000fffff] page 4k
         [mem 0x7fe00000-0x7fffffff] page 1G
         [mem 0x7c000000-0x7fdfffff] page 1G
         [mem 0x00100000-0x001fffff] page 4k
         [mem 0x00200000-0x7bffffff] page 2M

    The last entry is not what we want; we should have

         [mem 0x00200000-0x3fffffff] page 2M
         [mem 0x40000000-0x7bffffff] page 1G

    Actually we merge the continuous ranges with the same page size too
    early. In this case, before merging we have

         [mem 0x00200000-0x3fffffff] page 2M
         [mem 0x40000000-0x7bffffff] page 2M

    and after merging them we get

         [mem 0x00200000-0x7bffffff] page 2M

    even though we can use a 1G page to map

         [mem 0x40000000-0x7bffffff]

    That will cause a problem, because we already map

         [mem 0x7fe00000-0x7fffffff] page 1G
         [mem 0x7c000000-0x7fdfffff] page 1G

    with a 1G page, aka [0x40000000-0x7fffffff] is mapped with a 1G page
    already. During phys_pud_init() for [0x40000000-0x7bffffff], it will
    not reuse the existing pud page; it allocates a new one and then tries
    to use 2M pages to map it instead, as page_size_mask does not include
    PG_LEVEL_1G. At the end we will have [7c000000-0x7fffffff] not mapped,
    because the loop in phys_pmd_init stops mapping at 0x7bffffff.

    That is the right behavior: it maps the exact range with the exact page
    size that we ask for, and we should explicitly call it to map
    [7c000000-0x7fffffff] before or after mapping 0x40000000-0x7bffffff.
    Anyway we need to make sure each range's page_size_mask is correct and
    consistent after split_mem_range.

    Fix that by calling adjust_range_size_mask before merging ranges with
    the same page size.

    -v2: update change log.
    -v3: add more explanation why [7c000000-0x7fffffff] is not mapped, and
         it causes panic.

    Bisected-by: "Xie, ChanglongX"
    Bisected-by: Yuanhan Liu
    Reported-and-tested-by: Yuanhan Liu
    Signed-off-by: Yinghai Lu
    Link: http://lkml.kernel.org/r/1370015587-20835-1-git-send-email-yinghai@kernel.org
    Signed-off-by: H. Peter Anvin
    Signed-off-by: Greg Kroah-Hartman

commit 27b67d8738f469deb6a1515a6c713a16e6d13959
Author: Naoya Horiguchi
Date:   Wed Jun 12 14:05:04 2013 -0700

    mm: migration: add migrate_entry_wait_huge()

    commit 30dad30922ccc733cfdbfe232090cf674dc374dc upstream.

    When we have a page fault for an address which is backed by a hugepage
    under migration, the kernel can't wait correctly and busy loops on the
    hugepage fault until the migration finishes. As a result, users who try
    to kick hugepage migration (via soft offlining, for example)
    occasionally experience a long delay or soft lockup.

    This is because pte_offset_map_lock() can't get a correct migration
    entry or a correct page table lock for a hugepage. This patch
    introduces migration_entry_wait_huge() to solve this.

    Signed-off-by: Naoya Horiguchi
    Reviewed-by: Rik van Riel
    Reviewed-by: Wanpeng Li
    Reviewed-by: Michal Hocko
    Cc: Mel Gorman
    Cc: Andi Kleen
    Cc: KOSAKI Motohiro
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Greg Kroah-Hartman

commit 77646c60ccd2acc62510764d52bdcaad442397e6
Author: Tomasz Stanislawski
Date:   Wed Jun 12 14:05:02 2013 -0700

    mm/page_alloc.c: fix watermark check in __zone_watermark_ok()

    commit 026b08147923142e925a7d0aaa39038055ae0156 upstream.

    The watermark check consists of two sub-checks. The first one is:

            if (free_pages <= min + lowmem_reserve)
                    return false;

    The check assures that there is a minimal amount of RAM in the zone. If
    CMA is used then free_pages is reduced by the number of free pages in
    CMA prior to the above-mentioned check.

            if (!(alloc_flags & ALLOC_CMA))
                    free_pages -= zone_page_state(z, NR_FREE_CMA_PAGES);

    This prevents the zone from being drained of pages available for
    non-movable allocations.

    The second check prevents the zone from getting too fragmented.

            for (o = 0; o < order; o++) {
                    free_pages -= z->free_area[o].nr_free << o;
                    min >>= 1;
                    if (free_pages <= min)
                            return false;
            }

    The field z->free_area[o].nr_free is equal to the number of free pages
    including free CMA pages. Therefore the CMA pages are subtracted twice.
    This may cause a false positive fail of __zone_watermark_ok() if the CMA
    area gets strongly fragmented. In such a case there are many 0-order
    free pages located in CMA. Those pages are subtracted twice therefore
    they will quickly drain free_pages during the check against
    fragmentation. The test fails even though there are many free non-cma
    pages in the zone.

    This patch fixes this issue by subtracting CMA pages only for a purpose
    of (free_pages <= min + lowmem_reserve) check.

    Laura said:

      We were observing allocation failures of higher order pages (order 5 =
      128K typically) under tight memory conditions resulting in driver
      failure. The output from the page allocation failure showed plenty of
      free pages of the appropriate order/type/zone and mostly CMA pages in
      the lower orders.

      For full disclosure, we still observed some page allocation failures
      even after applying the patch but the number was drastically reduced
      and those failures were attributed to fragmentation/other system
      issues.

    Signed-off-by: Tomasz Stanislawski
    Signed-off-by: Kyungmin Park
    Tested-by: Laura Abbott
    Cc: Bartlomiej Zolnierkiewicz
    Acked-by: Minchan Kim
    Cc: Mel Gorman
    Tested-by: Marek Szyprowski
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Greg Kroah-Hartman

commit 8ccf6cfb157419847f3cb2bfdfbcdbd39860e8e9
Author: NeilBrown
Date:   Wed Jun 12 11:01:22 2013 +1000

    md/raid1,raid10: use freeze_array in place of raise_barrier in various places.

    commit e2d59925221cd562e07fee38ec8839f7209ae603 upstream.

    Various places in raid1 and raid10 are calling raise_barrier when they
    really should call freeze_array. The former is only intended to be
    called from "make_request".
    The latter has extra checks for 'nr_queued' and makes a call to
    flush_pending_writes(), so it is safe to call it from within the
    management thread.

    Using raise_barrier will sometimes deadlock. Using freeze_array
    should not.

    As 'freeze_array' currently expects one request to be pending (in
    handle_read_error - the only previous caller), we need to pass
    it the number of pending requests (extra) to ignore.

    The deadlock was made particularly noticeable by commits
    050b66152f87c7 (raid10) and 6b740b8d79252f13 (raid1) which
    appeared in 3.4, so the fix is appropriate for any -stable
    kernel since then.

    This patch probably won't apply directly to some early kernels and
    will need to be applied by hand.

    Reported-by: Alexander Lyakas
    Signed-off-by: NeilBrown
    Signed-off-by: Greg Kroah-Hartman

commit 8f44eed48ec962b50e6b7a7067aa76f41f6de3f3
Author: H. Peter Anvin
Date:   Wed Jun 12 07:37:43 2013 -0700

    md/raid1,5,10: Disable WRITE SAME until a recovery strategy is in place

    commit 5026d7a9b2f3eb1f9bda66c18ac6bc3036ec9020 upstream.

    There are cases where the kernel will believe that the WRITE SAME
    command is supported by a block device which does not, in fact,
    support WRITE SAME. This currently happens for SATA drives behind a
    SAS controller, but there are probably a hundred other ways that can
    happen, including drive firmware bugs.

    After receiving an error for WRITE SAME the block layer will retry the
    request as a plain write of zeroes, but mdraid will consider the
    failure as fatal and consider the drive failed. This has the effect
    that all the mirrors containing a specific set of data are each
    offlined in very rapid succession resulting in data loss.

    However, just bouncing the request back up to the block layer isn't
    ideal either, because the whole initial request-retry sequence should
    be inside the write bitmap fence, which probably means that md needs
    to do its own conversion of WRITE SAME to write zero.

    Until the failure scenario has been sorted out, disable WRITE SAME for
    raid1, raid5, and raid10.

    [neilb: added raid5]

    This patch is appropriate for any -stable since 3.7 when write_same
    support was added.

    Signed-off-by: H. Peter Anvin
    Signed-off-by: NeilBrown
    Signed-off-by: Greg Kroah-Hartman

commit db9b5815031211b16f24fec0851553e8204b38a1
Author: Alex Lyakas
Date:   Tue Jun 4 20:42:21 2013 +0300

    md/raid1: consider WRITE as successful only if at least one non-Faulty and non-rebuilding drive completed it.

    commit 3056e3aec8d8ba61a0710fb78b2d562600aa2ea7 upstream.

    Without that fix, the following scenario could happen:

    - RAID1 with drives A and B; drive B was freshly-added and is rebuilding
    - Drive A fails
    - WRITE request arrives to the array. It is failed by drive A, so
      r1_bio is marked as R1BIO_WriteError, but the rebuilding drive B
      succeeds in writing it, so the same r1_bio is marked as
      R1BIO_Uptodate.
    - r1_bio arrives to handle_write_finished, badblocks are disabled,
      md_error()->error() does nothing because we don't fail the last drive
      of raid1
    - raid_end_bio_io() calls call_bio_endio()
    - As a result, in call_bio_endio():

            if (!test_bit(R1BIO_Uptodate, &r1_bio->state))
                    clear_bit(BIO_UPTODATE, &bio->bi_flags);

      this code doesn't clear the BIO_UPTODATE flag, and the whole master
      WRITE succeeds, back to the upper layer.

    So we returned success to the upper layer, even though we had written
    the data onto the rebuilding drive only. But when we want to read the
    data back, we would not read from the rebuilding drive, so this data
    is lost.

    [neilb - applied identical change to raid10 as well]

    This bug can result in lost data, so it is suitable for any
    -stable kernel.
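
    The rule named in the subject line can be sketched in plain C as
    follows. This is only an illustration under stated assumptions, not the
    raid1 code: struct mirror_result and write_counts_as_uptodate() are
    hypothetical names standing in for the per-drive state and the
    R1BIO_Uptodate decision.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-mirror result; not the kernel's struct md_rdev. */
struct mirror_result {
    bool faulty;     /* drive is marked Faulty */
    bool in_sync;    /* false while the drive is still rebuilding */
    bool write_ok;   /* this drive completed the WRITE without error */
};

/*
 * Report success only if at least one drive that is neither Faulty
 * nor rebuilding completed the WRITE.
 */
bool write_counts_as_uptodate(const struct mirror_result *m, size_t n)
{
    assert(m != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (m[i].write_ok && m[i].in_sync && !m[i].faulty)
            return true;
    }
    return false;
}
```

    In the scenario above, drive A (in sync, write failed) and drive B
    (rebuilding, write succeeded) together yield false, so the error is
    propagated to the upper layer instead of being hidden.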

    Signed-off-by: Alex Lyakas
    Signed-off-by: NeilBrown
    Signed-off-by: Greg Kroah-Hartman

commit 2ce822c553f52a08a1e4d72adc6cffe22bbd3240
Author: Rafael Aquini
Date:   Wed Jun 12 14:04:49 2013 -0700

    swap: avoid read_swap_cache_async() race to deadlock while waiting on discard I/O completion

    commit cbab0e4eec299e9059199ebe6daf48730be46d2b upstream.

    read_swap_cache_async() can race against get_swap_page(), and stumble
    across a SWAP_HAS_CACHE entry in the swap map whose page wasn't brought
    into the swapcache yet.

    This transient swap_map state is expected to be transitory, but the
    actual placement of discard at scan_swap_map() inserts a wait for I/O
    completion thus making the thread at read_swap_cache_async() to loop
    around its -EEXIST case, while the other end at get_swap_page() is
    scheduled away at scan_swap_map(). This can leave the system deadlocked
    if the I/O completion happens to be waiting on the CPU waitqueue where
    read_swap_cache_async() is busy looping and !CONFIG_PREEMPT.

    This patch introduces a cond_resched() call to make the aforementioned
    read_swap_cache_async() busy loop condition to bail out when necessary,
    thus avoiding the subtle race window.

    Signed-off-by: Rafael Aquini
    Acked-by: Johannes Weiner
    Acked-by: KOSAKI Motohiro
    Acked-by: Hugh Dickins
    Cc: Shaohua Li
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Greg Kroah-Hartman

commit 0cfab5ee6f9122303c47c50220c33a540a3b6818
Author: Daniel Vetter
Date:   Mon Jun 10 09:47:58 2013 +0200

    drm/i915: prefer VBT modes for SVDO-LVDS over EDID

    commit c3456fb3e4712d0448592af3c5d644c9472cd3c1 upstream.

    In

    commit 53d3b4d7778daf15900867336c85d3f8dd70600c
    Author: Egbert Eich
    Date:   Tue Jun 4 17:13:21 2013 +0200

        drm/i915/sdvo: Use &intel_sdvo->ddc instead of intel_sdvo->i2c for DDC

    Egbert Eich fixed a long-standing bug where we simply used a
    non-working i2c controller to read the EDID for SDVO-LVDS panels.
    Unfortunately some machines seem to not be able to cope with the mode
    provided in the EDID.
    Specifically they seem to not be able to cope with a 4x pixel
    multiplier instead of a 2x one, which seems to have been worked around
    by slightly changing the panel's native mode in the VBT so that the
    dotclock is just barely above 50MHz.

    Since it took forever to notice the breakage it's fairly safe to
    assume that at least for SDVO-LVDS panels the VBT contains fairly sane
    data. So just switch around the order and use VBT modes first.

    v2: Also add EDID modes just in case, and spell Egbert correctly.

    v3: Elaborate a bit more about what's going on on Chris' machine.

    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=65524
    Reported-and-tested-by: Chris Wilson
    Cc: Egbert Eich
    Signed-off-by: Daniel Vetter
    Signed-off-by: Greg Kroah-Hartman

commit 2977621c945f771688d6dc94a58763c6d080004b
Author: Luciano Coelho
Date:   Fri May 10 10:19:38 2013 +0300

    wl12xx: fix minimum required firmware version for wl127x multirole

    commit 60c28cf18f970e1c1bd40d615596eeab6efbd9d7 upstream.

    There was a typo in commit 8675f9 (wlcore/wl12xx/wl18xx: verify
    multi-role and single-role fw versions), which was causing the
    multirole firmware for wl127x (WiLink6) to be rejected. The actual
    minimum version needed for wl127x multirole is 6.5.7.0.42.

    Reported-by: Levi Pearson
    Reported-by: Michael Scott
    Signed-off-by: Luciano Coelho
    Signed-off-by: John W. Linville
    Signed-off-by: Greg Kroah-Hartman

commit 76fc35ceb84355a7f881731494666bbd60311517
Author: Andrey Vagin
Date:   Wed Jun 12 14:04:42 2013 -0700

    memcg: don't initialize kmem-cache destroying work for root caches

    commit f101a9464bfbda42730b54a66f926d75ed2cd31e upstream.

    struct memcg_cache_params has a union. Different parts of this union
    are used for root and non-root caches. A part with destroying work is
    used only for non-root caches.
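
    The hazard described here can be sketched with a small stand-alone
    union. This is an illustration only: struct cache_params, its field
    names and params_init() are hypothetical and do not reproduce the real
    struct memcg_cache_params layout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a params struct whose union is split by role. */
struct cache_params {
    bool is_root_cache;
    union {
        void *child_caches;            /* meaningful for root caches only */
        struct {
            void *owner;               /* meaningful for non-root caches only */
            int destroy_work_ready;    /* stands in for the destroying work */
        } child;
    } u;
};

/*
 * Initialise only the half of the union that matches the cache's role.
 * Setting up the destroy work for a root cache would overwrite the
 * storage that child_caches lives in.
 */
void params_init(struct cache_params *p, bool is_root, void *payload)
{
    assert(p != NULL);
    p->is_root_cache = is_root;
    if (is_root) {
        p->u.child_caches = payload;
        return;                        /* never touch the non-root half */
    }
    p->u.child.owner = payload;
    p->u.child.destroy_work_ready = 1;
}
```

    Because both halves share the same storage, the guard on is_root is
    what keeps the root cache's pointer intact.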

      BUG: unable to handle kernel paging request at 0000000fffffffe0
      IP: kmem_cache_alloc+0x41/0x1f0
      Modules linked in: netlink_diag af_packet_diag udp_diag tcp_diag inet_diag unix_diag ip6table_filter ip6_tables i2c_piix4 virtio_net virtio_balloon microcode i2c_core pcspkr floppy
      CPU: 0 PID: 1929 Comm: lt-vzctl Tainted: G D 3.10.0-rc1+ #2
      Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
      RIP: kmem_cache_alloc+0x41/0x1f0
      Call Trace:
       getname_flags.part.34+0x30/0x140
       getname+0x38/0x60
       do_sys_open+0xc5/0x1e0
       SyS_open+0x22/0x30
       system_call_fastpath+0x16/0x1b
      Code: f4 53 48 83 ec 18 8b 05 8e 53 b7 00 4c 8b 4d 08 21 f0 a8 10 74 0d 4c 89 4d c0 e8 1b 76 4a 00 4c 8b 4d c0 e9 92 00 00 00 4d 89 f5 <4d> 8b 45 00 65 4c 03 04 25 48 cd 00 00 49 8b 50 08 4d 8b 38 49
      RIP [] kmem_cache_alloc+0x41/0x1f0

    Signed-off-by: Andrey Vagin
    Cc: Konstantin Khlebnikov
    Cc: Glauber Costa
    Cc: Johannes Weiner
    Cc: Balbir Singh
    Cc: KAMEZAWA Hiroyuki
    Reviewed-by: Michal Hocko
    Cc: Li Zefan
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Greg Kroah-Hartman

commit 98f22a82a3a43983a0201d2e28efccad6bc38e67
Author: Stephen M. Cameron
Date:   Wed Jun 12 14:04:47 2013 -0700

    cciss: fix broken mutex usage in ioctl

    commit 03f47e888daf56c8e9046c674719a0bcc644eed5 upstream.

    If a new logical drive is added and the CCISS_REGNEWD ioctl is invoked
    (as is normal with the Array Configuration Utility) the process will
    hang as below. It attempts to acquire the same mutex twice, once in
    do_ioctl() and once in cciss_unlocked_open(). The BKL was recursive,
    the mutex isn't.

      Linux version 3.10.0-rc2 (scameron@localhost.localdomain) (gcc version 4.4.7 20120313 (Red Hat 4.4.7-3) (GCC) ) #1 SMP Fri May 24 14:32:12 CDT 2013
      [...]
      acu D 0000000000000001 0 3246 3191 0x00000080
      Call Trace:
       schedule+0x29/0x70
       schedule_preempt_disabled+0xe/0x10
       __mutex_lock_slowpath+0x17b/0x220
       mutex_lock+0x2b/0x50
       cciss_unlocked_open+0x2f/0x110 [cciss]
       __blkdev_get+0xd3/0x470
       blkdev_get+0x5c/0x1e0
       register_disk+0x182/0x1a0
       add_disk+0x17c/0x310
       cciss_add_disk+0x13a/0x170 [cciss]
       cciss_update_drive_info+0x39b/0x480 [cciss]
       rebuild_lun_table+0x258/0x370 [cciss]
       cciss_ioctl+0x34f/0x470 [cciss]
       do_ioctl+0x49/0x70 [cciss]
       __blkdev_driver_ioctl+0x28/0x30
       blkdev_ioctl+0x200/0x7b0
       block_ioctl+0x3c/0x40
       do_vfs_ioctl+0x89/0x350
       SyS_ioctl+0xa1/0xb0
       system_call_fastpath+0x16/0x1b

    This mutex usage was added into the ioctl path when the big kernel lock
    was removed. As it turns out, these paths are all thread safe anyway
    (or can easily be made so) and we don't want ioctl() to be single
    threaded in any case.

    Signed-off-by: Stephen M. Cameron
    Cc: Jens Axboe
    Cc: Mike Miller
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Greg Kroah-Hartman

commit 3864881bab5235d7e1e7a49298370fbc26d99be5
Author: Kees Cook
Date:   Wed Jun 12 14:04:39 2013 -0700

    kmsg: honor dmesg_restrict sysctl on /dev/kmsg

    commit 637241a900cbd982f744d44646b48a273d609b34 upstream.

    The dmesg_restrict sysctl currently covers the syslog method for
    accessing dmesg, however /dev/kmsg isn't covered by the same
    protections. Most people haven't noticed because util-linux dmesg(1)
    defaults to using the syslog method for access in older versions.
    Newer util-linux dmesg(1) defaults to reading directly from /dev/kmsg.

    To fix /dev/kmsg, let's compare the existing interfaces and what they
    allow:

     - /proc/kmsg allows:
      - open (SYSLOG_ACTION_OPEN) if CAP_SYSLOG since it uses a destructive
        single-reader interface (SYSLOG_ACTION_READ).
      - everything, after an open.

     - syslog syscall allows:
      - anything, if CAP_SYSLOG.
      - SYSLOG_ACTION_READ_ALL and SYSLOG_ACTION_SIZE_BUFFER, if
        dmesg_restrict==0.
      - nothing else (EPERM).
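
    The syslog syscall rules listed above can be written out as a small
    decision function. This is a sketch of the rules as stated in this
    message, not the kernel's permission check; syslog_syscall_check() is
    a hypothetical name, and the action codes follow the syslog(2)
    numbering.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Action codes as numbered in syslog(2); only the ones used here. */
enum {
    SYSLOG_ACTION_READ = 2,
    SYSLOG_ACTION_READ_ALL = 3,
    SYSLOG_ACTION_CLEAR = 5,
    SYSLOG_ACTION_SIZE_BUFFER = 10,
};

/* Returns 0 when the action is allowed, -EPERM otherwise. */
int syslog_syscall_check(int action, bool has_cap_syslog, int dmesg_restrict)
{
    assert(action >= 0);

    /* Anything is allowed with CAP_SYSLOG. */
    if (has_cap_syslog)
        return 0;

    /* Non-destructive reads are allowed when dmesg_restrict is off. */
    if (dmesg_restrict == 0 &&
        (action == SYSLOG_ACTION_READ_ALL ||
         action == SYSLOG_ACTION_SIZE_BUFFER))
        return 0;

    /* Nothing else. */
    return -EPERM;
}
```

    An unprivileged SYSLOG_ACTION_READ_ALL therefore succeeds only while
    dmesg_restrict is 0, and destructive actions such as
    SYSLOG_ACTION_CLEAR always need CAP_SYSLOG.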
The use-cases were: - dmesg(1) needs to do non-destructive SYSLOG_ACTION_READ_ALLs. - sysklog(1) needs to open /proc/kmsg, drop privs, and still issue the destructive SYSLOG_ACTION_READs. AIUI, dmesg(1) is moving to /dev/kmsg, and systemd-journald doesn't clear the ring buffer. Based on the comments in devkmsg_llseek, it sounds like actions besides reading aren't going to be supported by /dev/kmsg (i.e. SYSLOG_ACTION_CLEAR), so we have a strict subset of the non-destructive syslog syscall actions. To this end, move the check as Josh had done, but also rename the constants to reflect their new uses (SYSLOG_FROM_CALL becomes SYSLOG_FROM_READER, and SYSLOG_FROM_FILE becomes SYSLOG_FROM_PROC). SYSLOG_FROM_READER allows non-destructive actions, and SYSLOG_FROM_PROC allows destructive actions after a capabilities-constrained SYSLOG_ACTION_OPEN check. - /dev/kmsg allows: - open if CAP_SYSLOG or dmesg_restrict==0 - reading/polling, after open Addresses https://bugzilla.redhat.com/show_bug.cgi?id=903192 [akpm@linux-foundation.org: use pr_warn_once()] Signed-off-by: Kees Cook Reported-by: Christian Kujau Tested-by: Josh Boyer Cc: Kay Sievers Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit 67842e50729b9f9c9ebd1449735f2f3d129e2325 Author: Robin Holt Date: Wed Jun 12 14:04:37 2013 -0700 reboot: rigrate shutdown/reboot to boot cpu commit cf7df378aa4ff7da3a44769b7ff6e9eef1a9f3db upstream. We recently noticed that reboot of a 1024 cpu machine takes approx 16 minutes of just stopping the cpus. The slowdown was tracked to commit f96972f2dc63 ("kernel/sys.c: call disable_nonboot_cpus() in kernel_restart()"). The current implementation does all the work of hot removing the cpus before halting the system. We are switching to just migrating to the boot cpu and then continuing with shutdown/reboot. This also has the effect of not breaking x86's command line parameter for specifying the reboot cpu. 
Note, this code was shamelessly copied from arch/x86/kernel/reboot.c with bits removed pertaining to the reboot_cpu command line parameter. Signed-off-by: Robin Holt Tested-by: Shawn Guo Cc: "Srivatsa S. Bhat" Cc: H. Peter Anvin Cc: Thomas Gleixner Cc: Ingo Molnar Cc: Russ Anderson Cc: Robin Holt Cc: Russell King Cc: Guan Xuetao Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit b0a52d36f57da68631c4500ed83aadaf7eb0d3c9 Author: Srivatsa S. Bhat Date: Wed Jun 12 14:04:36 2013 -0700 CPU hotplug: provide a generic helper to disable/enable CPU hotplug commit 16e53dbf10a2d7e228709a7286310e629ede5e45 upstream. There are instances in the kernel where we would like to disable CPU hotplug (from sysfs) during some important operation. Today the freezer code depends on this and the code to do it was kinda tailor-made for that. Restructure the code and make it generic enough to be useful for other usecases too. Signed-off-by: Srivatsa S. Bhat Signed-off-by: Robin Holt Cc: H. Peter Anvin Cc: Thomas Gleixner Cc: Ingo Molnar Cc: Russ Anderson Cc: Robin Holt Cc: Russell King Cc: Guan Xuetao Cc: Shawn Guo Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit e5ccf170714c26fbc135a8926d576d31595bb1f6 Author: Sujith Manoharan Date: Thu Jun 6 10:06:29 2013 +0530 ath9k: Use minstrel rate control by default commit 5efac94999ff218e0101f67a059e44abb4b0b523 upstream. The ath9k rate control algorithm has various architectural issues that make it a poor fit in scenarios like congested environments etc. An example: https://bugzilla.redhat.com/show_bug.cgi?id=927191 Change the default to minstrel which is more robust in such cases. The ath9k RC code is left in the driver for now, maybe it can be removed altogether later on. Signed-off-by: Sujith Manoharan Cc: Jouni Malinen Cc: Linus Torvalds Signed-off-by: John W. 
Linville Signed-off-by: Greg Kroah-Hartman commit 6aebd1e4af6dea065fb735201f0ab29c1103276c Author: Felix Fietkau Date: Mon Jun 3 11:18:57 2013 +0200 Revert "ath9k_hw: Update rx gain initval to improve rx sensitivity" commit 96005931785238e1a24febf65ffb5016273e8225 upstream. This reverts commit 68d9e1fa24d9c7c2e527f49df8d18fb8cf0ec943. This change reduces rx sensitivity with no apparent extra benefit. It looks like it was meant for testing in a specific scenario, but it was never properly validated. Signed-off-by: Felix Fietkau Cc: rmanohar@qca.qualcomm.com Signed-off-by: John W. Linville Signed-off-by: Greg Kroah-Hartman commit 7777c471687ebcc7f2ffef072fd31291247ea2fc Author: Sujith Manoharan Date: Sat Jun 1 07:08:09 2013 +0530 ath9k: Disable PowerSave by default commit 531671cb17af07281e6f28c1425f754346e65c41 upstream. Almost all the DMA issues which have plagued ath9k (in station mode) for years are related to PS. Disabling PS usually "fixes" the user's connection stability. Reports of DMA problems are still trickling in and are sitting in the kernel bugzilla. Until the PS code in ath9k is given a thorough review, disable it by default. The slight increase in chip power consumption is a small price to pay for improved link stability. Signed-off-by: Sujith Manoharan Signed-off-by: John W. Linville Signed-off-by: Greg Kroah-Hartman commit 15794d484f6404b4794867a0da892c01b20ccf24 Author: Johan Hedberg Date: Wed May 29 09:51:29 2013 +0300 Bluetooth: Fix mgmt handling of power on failures commit 96570ffcca0b872dc8626e97569d2697f374d868 upstream. If hci_dev_open fails we need to ensure that the corresponding mgmt_set_powered command gets an appropriate response. This patch fixes the missing response by adding a new mgmt_set_powered_failed function that's used to indicate a power on failure to mgmt. Since a situation with the device being rfkilled may require special handling in user space the patch uses a new dedicated mgmt status code for this.
Signed-off-by: Johan Hedberg Acked-by: Marcel Holtmann Signed-off-by: Gustavo Padovan Signed-off-by: John W. Linville Signed-off-by: Greg Kroah-Hartman commit 22b74ee1ace2536bb4889c33f1de932d2814376a Author: Johan Hedberg Date: Tue May 28 13:46:30 2013 +0300 Bluetooth: Fix missing length checks for L2CAP signalling PDUs commit cb3b3152b2f5939d67005cff841a1ca748b19888 upstream. There has been code in place to check that the L2CAP length header matches the amount of data received, but many PDU handlers have not been checking that the data received actually matches that expected by the specific PDU. This patch adds passing the length header to the specific handler functions and ensures that those functions fail cleanly in the case of an incorrect amount of data. Signed-off-by: Johan Hedberg Signed-off-by: Gustavo Padovan Signed-off-by: John W. Linville Signed-off-by: Greg Kroah-Hartman commit a534b5b5b7c501151f8a337a59525d7493733a35 Author: Patrik Jakobsson Date: Sat Jun 8 20:23:08 2013 +0200 drm/gma500/cdv: Unpin framebuffer on crtc disable commit 22e7c385a80d771aaf3a15ae7ccea3b0686bbe10 upstream. The framebuffer needs to be unpinned in the crtc->disable callback because of previous pinning in psb_intel_pipe_set_base(). This will fix a memory leak where the framebuffer was released but not unpinned properly. This patch only affects Cedarview. Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=889511 Bugzilla: https://bugzilla.novell.com/show_bug.cgi?id=812113 Reviewed-by: Daniel Vetter Signed-off-by: Patrik Jakobsson Signed-off-by: Greg Kroah-Hartman commit 96a04bcecfc634032c29719ad1903d61bff77e4f Author: Patrik Jakobsson Date: Wed Jun 5 14:24:01 2013 +0200 drm/gma500/psb: Unpin framebuffer on crtc disable commit 820de86a90089ee607d7864538c98a23b503c846 upstream. The framebuffer needs to be unpinned in the crtc->disable callback because of previous pinning in psb_intel_pipe_set_base(). 
This will fix a memory leak where the framebuffer was released but not unpinned properly. This patch only affects Poulsbo. Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=889511 Bugzilla: https://bugzilla.novell.com/show_bug.cgi?id=812113 Reviewed-by: Daniel Vetter Signed-off-by: Patrik Jakobsson Signed-off-by: Greg Kroah-Hartman commit 15b62555751201d0e6c548b0c14a4027152aad0d Author: Tony Lindgren Date: Wed Jun 12 14:04:48 2013 -0700 drivers/rtc/rtc-twl.c: fix missing device_init_wakeup() when booted with device tree commit 24b8256a1fb28d357bc6fa09184ba29b4255ba5c upstream. When booted in legacy mode device_init_wakeup() gets called by drivers/mfd/twl-core.c when the children are initialized. However, when booted using device tree, the children are created with of_platform_populate() instead of add_children(). This means that the RTC driver will not have device_init_wakeup() set, and we need to call it from the driver probe like RTC drivers typically do. Without this we cannot test PM wake-up events on omaps for cases where there may not be any physical wake-up event. Signed-off-by: Tony Lindgren Reported-by: Kevin Hilman Cc: Alessandro Zummo Cc: Jingoo Han Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit 528a7a8bf60a2a2aa3c10a6f1e6c03b4fd1f1f09 Author: Alex Elder Date: Thu May 16 15:04:20 2013 -0500 rbd: don't destroy ceph_opts in rbd_add() commit 7262cfca430a1a0e0707149af29ae86bc0ded230 upstream. Whether rbd_client_create() successfully creates a new client or not, it takes responsibility for getting the ceph_opts structure it's passed destroyed. If successful, the structure becomes associated with the created client; if not, rbd_client_create() will destroy it. Previously, rbd_get_client() would call ceph_destroy_options() if rbd_get_client() failed, and that meant it got called twice. That led to freeing various pointers more than once, which is never a good idea.
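The ownership rule described above (the create function consumes the options structure on both success and failure, so the caller must never destroy it afterwards) can be sketched in plain C. This is a user-space illustration under stated assumptions, not the rbd code: `struct opts`, `client_create()`, `add_device()` and the `destroy_calls` counter are all hypothetical names.

```c
#include <stdlib.h>

/* Counts how many times an options structure has been destroyed. */
int destroy_calls;

struct opts { int placeholder; };
struct client { struct opts *opts; };

void opts_destroy(struct opts *o)
{
    free(o);
    destroy_calls++;
}

/* Takes ownership of 'o' in every case: on success it is attached to the
 * new client, on failure it is destroyed here. */
struct client *client_create(struct opts *o, int simulate_failure)
{
    struct client *c = simulate_failure ? NULL : malloc(sizeof(*c));

    if (!c) {
        opts_destroy(o);
        return NULL;
    }
    c->opts = o;
    return c;
}

void client_destroy(struct client *c)
{
    opts_destroy(c->opts);
    free(c);
}

/* Caller: does NOT destroy the options on failure, because
 * client_create() has already done so. A second opts_destroy() here
 * would be the double free described in the text. */
int add_device(int simulate_failure)
{
    struct opts *o = calloc(1, sizeof(*o));
    struct client *c;

    if (!o)
        return -1;
    c = client_create(o, simulate_failure);
    if (!c)
        return -1;
    client_destroy(c);
    return 0;
}
```

In this sketch each call to `add_device()` destroys its options exactly once, whether or not client creation fails.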
This resolves: http://tracker.ceph.com/issues/4559 Reported-by: Dan van der Ster Signed-off-by: Alex Elder Reviewed-by: Josh Durgin Signed-off-by: Greg Kroah-Hartman commit 4a798296068abd043608c10ccb69209a7c2f0579 Author: Jim Schutt Date: Wed May 15 13:03:35 2013 -0500 ceph: ceph_pagelist_append might sleep while atomic commit 39be95e9c8c0b5668c9f8806ffe29bf9f4bc0f40 upstream. Ceph's encode_caps_cb() worked hard to not call __page_cache_alloc() while holding a lock, but it's spoiled because ceph_pagelist_addpage() always calls kmap(), which might sleep. Here's the result: [13439.295457] ceph: mds0 reconnect start [13439.300572] BUG: sleeping function called from invalid context at include/linux/highmem.h:58 [13439.309243] in_atomic(): 1, irqs_disabled(): 0, pid: 12059, name: kworker/1:1 . . . [13439.376225] Call Trace: [13439.378757] __might_sleep+0xfc/0x110 [13439.384353] ceph_pagelist_append+0x120/0x1b0 [libceph] [13439.391491] ceph_encode_locks+0x89/0x190 [ceph] [13439.398035] ? _raw_spin_lock+0x49/0x50 [13439.403775] ? lock_flocks+0x15/0x20 [13439.409277] encode_caps_cb+0x41f/0x4a0 [ceph] [13439.415622] ? igrab+0x28/0x70 [13439.420610] ? iterate_session_caps+0xe8/0x250 [ceph] [13439.427584] iterate_session_caps+0x115/0x250 [ceph] [13439.434499] ? set_request_path_attr+0x2d0/0x2d0 [ceph] [13439.441646] send_mds_reconnect+0x238/0x450 [ceph] [13439.448363] ? ceph_mdsmap_decode+0x5e2/0x770 [ceph] [13439.455250] check_new_map+0x352/0x500 [ceph] [13439.461534] ceph_mdsc_handle_map+0x1bd/0x260 [ceph] [13439.468432] ? mutex_unlock+0xe/0x10 [13439.473934] extra_mon_dispatch+0x22/0x30 [ceph] [13439.480464] dispatch+0xbc/0x110 [libceph] [13439.486492] process_message+0x1ad/0x1d0 [libceph] [13439.493190] ? read_partial_message+0x3e8/0x520 [libceph] . . .
[13439.587132] ceph: mds0 reconnect success [13490.720032] ceph: mds0 caps stale [13501.235257] ceph: mds0 recovery completed [13501.300419] ceph: mds0 caps renewed Fix it up by encoding locks into a buffer first, and when the number of encoded locks is stable, copy that into a ceph_pagelist. [elder@inktank.com: abbreviated the stack info a bit.] Signed-off-by: Jim Schutt Reviewed-by: Alex Elder Signed-off-by: Greg Kroah-Hartman commit 7662e04a958484d22fa9f18c7c6c35e0230d9207 Author: Jim Schutt Date: Wed May 15 13:03:35 2013 -0500 ceph: add cpu_to_le32() calls when encoding a reconnect capability commit c420276a532a10ef59849adc2681f45306166b89 upstream. In his review, Alex Elder mentioned that he hadn't checked that num_fcntl_locks and num_flock_locks were properly decoded on the server side, from a le32 over-the-wire type to a cpu type. I checked, and AFAICS it is done; those interested can consult Locker::_do_cap_update() in src/mds/Locker.cc and src/include/encoding.h in the Ceph server code (git://github.com/ceph/ceph). I also checked the server side for flock_len decoding, and I believe that also happens correctly, by virtue of having been declared __le32 in struct ceph_mds_cap_reconnect, in src/include/ceph_fs.h. Signed-off-by: Jim Schutt Reviewed-by: Alex Elder Signed-off-by: Greg Kroah-Hartman commit 7b1b7a82e10d5d2eef133260bca9c5709f8f257c Author: Alex Elder Date: Wed May 15 16:28:33 2013 -0500 libceph: must hold mutex for reset_changed_osds() commit 14d2f38df67fadee34625fcbd282ee22514c4846 upstream. An osd client has a red-black tree describing its osds, and occasionally we would get crashes due to one of these trees becoming corrupt somehow. The problem turned out to be that reset_changed_osds() was being called without protection of the osd client request mutex.
That function would call __reset_osd() for any osd that had changed, and __reset_osd() would call __remove_osd() for any osd with no outstanding requests, and finally __remove_osd() would remove the corresponding entry from the red-black tree. Thus, the tree was getting modified without having any lock protection, and was vulnerable to problems due to concurrent updates. This appears to be the only osd tree updating path that has this problem. It can be fairly easily fixed by moving the call up a few lines, to just before the request mutex gets dropped in kick_requests(). This resolves: http://tracker.ceph.com/issues/5043 Signed-off-by: Alex Elder Reviewed-by: Sage Weil Signed-off-by: Greg Kroah-Hartman commit a274282929a27092f580702f963da551a7ca880a Author: Rafael J. Wysocki Date: Mon Jun 10 13:00:29 2013 +0200 ACPI / video: Do not bind to device objects with a scan handler commit 8c9b7a7b2fc2750af418ddc28e707c42e78aa0bf upstream. With the introduction of ACPI scan handlers, ACPI device objects with an ACPI scan handler attached to them must not be bound to by ACPI drivers any more. Unfortunately, however, the ACPI video driver attempts to do just that if there is a _ROM ACPI control method defined under a device object with an ACPI scan handler. Prevent that from happening by making the video driver's "add" routine check if the device object already has an ACPI scan handler attached to it and return an error code in that case. That is not sufficient, though, because acpi_bus_driver_init() would then clear the device object's driver_data that may be set by its scan handler, so for the fix to work acpi_bus_driver_init() has to be modified to leave driver_data as is on errors. References: https://bugzilla.kernel.org/show_bug.cgi?id=58091 Bisected-and-tested-by: Dmitry S. Demin Reported-and-tested-by: Jason Cassell Tracked-down-by: Aaron Lu Signed-off-by: Rafael J. 
Wysocki Reviewed-by: Aaron Lu Signed-off-by: Greg Kroah-Hartman commit e8cc120c909edf7498b38f249ce375741951ca17 Author: Kees Cook Date: Fri May 10 14:48:21 2013 -0700 b43: stop format string leaking into error msgs commit e0e29b683d6784ef59bbc914eac85a04b650e63c upstream. The module parameter "fwpostfix" is userspace controllable, unfiltered, and is used to define the firmware filename. b43_do_request_fw() populates ctx->errors[] on error, containing the firmware filename. b43err() parses its arguments as a format string. For systems with b43 hardware, this could lead to a uid-0 to ring-0 escalation. CVE-2013-2852 Signed-off-by: Kees Cook Signed-off-by: John W. Linville Signed-off-by: Greg Kroah-Hartman commit 979d73667f6eac3e7d6010e88de9de04245ad1c3 Author: Oleg Nesterov Date: Wed Jun 12 14:04:46 2013 -0700 audit: wait_for_auditd() should use TASK_UNINTERRUPTIBLE commit f000cfdde5de4fc15dead5ccf524359c07eadf2b upstream. audit_log_start() does wait_for_auditd() in a loop until audit_backlog_wait_time passes or audit_skb_queue has room. If signal_pending() is true this becomes a busy-wait loop, because schedule() in TASK_INTERRUPTIBLE won't block. Thanks to Guy for fully investigating and explaining the problem. (akpm: that'll cause the system to lock up on a non-preemptible uniprocessor kernel) (Guy: "Our customer was in fact running a uniprocessor machine, and they reported a system hang.") Signed-off-by: Oleg Nesterov Reported-by: Guy Streeter Cc: Eric Paris Cc: Al Viro Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman
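The b43 entry earlier in this section describes a format-string problem: text that userspace can influence is handed to a printf-style function as the format itself. The usual remedy is to pass such text as an argument to a constant "%s" format. The sketch below illustrates that pattern in user space; it is an assumption-laden model, not the driver code, and `report()`, `report_error_safe()` and `last_report()` are hypothetical stand-ins for a logger such as b43err().

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Buffer holding the most recently formatted message. */
static char last_msg[128];

/* Hypothetical printf-style logger: its first argument is a format. */
void report(const char *fmt, ...)
{
    va_list ap;

    assert(fmt != NULL);
    va_start(ap, fmt);
    vsnprintf(last_msg, sizeof(last_msg), fmt, ap);
    va_end(ap);
}

/* Unsafe pattern (not called here): report(errmsg);
 * Any '%' directives inside errmsg would be interpreted by the logger. */

/* Safe pattern: the untrusted text is data, never the format. */
void report_error_safe(const char *errmsg)
{
    report("%s", errmsg);
}

const char *last_report(void)
{
    return last_msg;
}
```

Calling `report_error_safe()` with a string that contains `%x` or `%n` stores that string verbatim, because the directives are never parsed.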