From 42a06f2166f2f6f7bf04f32b4e823eacdceafdc9 Mon Sep 17 00:00:00 2001 From: Mathieu Desnoyers Date: Sun, 17 May 2009 10:23:52 -0400 Subject: [CPUFREQ] remove rwsem lock from CPUFREQ_GOV_STOP call * Rafael J. Wysocki (rjw@sisk.pl) wrote: > This message has been generated automatically as a part of a report > of regressions introduced between 2.6.28 and 2.6.29. > > The following bug entry is on the current list of known regressions > introduced between 2.6.28 and 2.6.29. Please verify if it still should > be listed and let me know (either way). > > > Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=13186 > Subject : cpufreq timer teardown problem > Submitter : Mathieu Desnoyers > Date : 2009-04-23 14:00 (24 days old) > References : http://marc.info/?l=linux-kernel&m=124049523515036&w=4 > Handled-By : Mathieu Desnoyers > Patch : http://patchwork.kernel.org/patch/19754/ > http://patchwork.kernel.org/patch/19753/ The patches linked above depend on the following patch to remove circular locking dependency : cpufreq: remove rwsem lock from CPUFREQ_GOV_STOP call (the following issue was faced when using cancel_delayed_work_sync() in the timer teardown (which fixes a race). * KOSAKI Motohiro (kosaki.motohiro@jp.fujitsu.com) wrote: > Hi > > my box output following warnings. > it seems regression by commit 7ccc7608b836e58fbacf65ee4f8eefa288e86fac. > > A: work -> do_dbs_timer() -> cpu_policy_rwsem > B: store() -> cpu_policy_rwsem -> cpufreq_governor_dbs() -> work > > Hrm, I think it must be due to my attempt to fix the timer teardown race in ondemand governor mixed with new locking behavior in 2.6.30-rc. The rwlock seems to be taken around the whole call to cpufreq_governor_dbs(), when it should be only taken around accesses to the locked data, and especially *not* around the call to dbs_timer_exit(). Reverting my fix attempt would put the teardown race back in place (replacing the cancel_delayed_work_sync by cancel_delayed_work). Instead, a proper fix would imply modifying this critical section : cpufreq.c: __cpufreq_remove_dev() ... if (cpufreq_driver->target) __cpufreq_governor(data, CPUFREQ_GOV_STOP); unlock_policy_rwsem_write(cpu); To make sure the __cpufreq_governor() callback is not called with rwsem held. This would allow execution of cancel_delayed_work_sync() without being nested within the rwsem. Applies on top of the 2.6.30-rc5 tree. Required to remove circular dep in teardown of both conservative and ondemande governors so they can use cancel_delayed_work_sync(). CPUFREQ_GOV_STOP does not modify the policy, therefore this locking seemed unneeded. Signed-off-by: Mathieu Desnoyers CC: KOSAKI Motohiro Cc: Greg KH CC: Ingo Molnar CC: "Rafael J. Wysocki" CC: Ben Slusky CC: Chris Wright CC: Andrew Morton Signed-off-by: Dave Jones --- drivers/cpufreq/cpufreq.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) (limited to 'drivers') diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c index d270e8eb3e6..47d2ad0ae07 100644 --- a/drivers/cpufreq/cpufreq.c +++ b/drivers/cpufreq/cpufreq.c @@ -1070,11 +1070,11 @@ static int __cpufreq_remove_dev(struct sys_device *sys_dev) spin_unlock_irqrestore(&cpufreq_driver_lock, flags); #endif + unlock_policy_rwsem_write(cpu); + if (cpufreq_driver->target) __cpufreq_governor(data, CPUFREQ_GOV_STOP); - unlock_policy_rwsem_write(cpu); - kobject_put(&data->kobj); /* we need to make sure that the underlying kobj is actually -- cgit v1.2.3 From b253d2b2d28ead6fed012feb54694b3d0562839a Mon Sep 17 00:00:00 2001 From: Mathieu Desnoyers Date: Sun, 17 May 2009 10:29:33 -0400 Subject: [CPUFREQ] fix timer teardown in conservative governor * Rafael J. Wysocki (rjw@sisk.pl) wrote: > This message has been generated automatically as a part of a report > of regressions introduced between 2.6.28 and 2.6.29. > > The following bug entry is on the current list of known regressions > introduced between 2.6.28 and 2.6.29. Please verify if it still should > be listed and let me know (either way). > > > Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=13186 > Subject : cpufreq timer teardown problem > Submitter : Mathieu Desnoyers > Date : 2009-04-23 14:00 (24 days old) > References : http://marc.info/?l=linux-kernel&m=124049523515036&w=4 > Handled-By : Mathieu Desnoyers > Patch : http://patchwork.kernel.org/patch/19754/ > http://patchwork.kernel.org/patch/19753/ > (re-send with updated changelog) cpufreq fix timer teardown in conservative governor The problem is that dbs_timer_exit() uses cancel_delayed_work() when it should use cancel_delayed_work_sync(). cancel_delayed_work() does not wait for the workqueue handler to exit. The ondemand governor does not seem to be affected because the "if (!dbs_info->enable)" check at the beginning of the workqueue handler returns immediately without rescheduling the work. The conservative governor in 2.6.30-rc has the same check as the ondemand governor, which makes things usually run smoothly. However, if the governor is quickly stopped and then started, this could lead to the following race : dbs_enable could be reenabled and multiple do_dbs_timer handlers would run. This is why a synchronized teardown is required. Depends on patch cpufreq: remove rwsem lock from CPUFREQ_GOV_STOP call The following patch applies to 2.6.30-rc2. Stable kernels have a similar issue which should also be fixed, but the code changed between 2.6.29 and 2.6.30, so this patch only applies to 2.6.30-rc. Signed-off-by: Mathieu Desnoyers CC: Andrew Morton CC: gregkh@suse.de CC: stable@kernel.org CC: cpufreq@vger.kernel.org CC: Ingo Molnar CC: rjw@sisk.pl CC: Ben Slusky Signed-off-by: Dave Jones --- drivers/cpufreq/cpufreq_conservative.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) (limited to 'drivers') diff --git a/drivers/cpufreq/cpufreq_conservative.c b/drivers/cpufreq/cpufreq_conservative.c index 2ecd95e4ab1..7a74d175287 100644 --- a/drivers/cpufreq/cpufreq_conservative.c +++ b/drivers/cpufreq/cpufreq_conservative.c @@ -91,6 +91,9 @@ static unsigned int dbs_enable; /* number of CPUs using this policy */ * (like __cpufreq_driver_target()) is being called with dbs_mutex taken, then * cpu_hotplug lock should be taken before that. Note that cpu_hotplug lock * is recursive for the same process. -Venki + * DEADLOCK ALERT! (2) : do_dbs_timer() must not take the dbs_mutex, because it + * would deadlock with cancel_delayed_work_sync(), which is needed for proper + * raceless workqueue teardown. */ static DEFINE_MUTEX(dbs_mutex); @@ -542,7 +545,7 @@ static inline void dbs_timer_init(struct cpu_dbs_info_s *dbs_info) static inline void dbs_timer_exit(struct cpu_dbs_info_s *dbs_info) { dbs_info->enable = 0; - cancel_delayed_work(&dbs_info->work); + cancel_delayed_work_sync(&dbs_info->work); } static int cpufreq_governor_dbs(struct cpufreq_policy *policy, -- cgit v1.2.3 From b14893a62c73af0eca414cfed505b8c09efc613c Mon Sep 17 00:00:00 2001 From: Mathieu Desnoyers Date: Sun, 17 May 2009 10:30:45 -0400 Subject: [CPUFREQ] fix timer teardown in ondemand governor * Rafael J. Wysocki (rjw@sisk.pl) wrote: > This message has been generated automatically as a part of a report > of regressions introduced between 2.6.28 and 2.6.29. > > The following bug entry is on the current list of known regressions > introduced between 2.6.28 and 2.6.29. Please verify if it still should > be listed and let me know (either way). > > > Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=13186 > Subject : cpufreq timer teardown problem > Submitter : Mathieu Desnoyers > Date : 2009-04-23 14:00 (24 days old) > References : http://marc.info/?l=linux-kernel&m=124049523515036&w=4 > Handled-By : Mathieu Desnoyers > Patch : http://patchwork.kernel.org/patch/19754/ > http://patchwork.kernel.org/patch/19753/ > (updated changelog) cpufreq fix timer teardown in ondemand governor The problem is that dbs_timer_exit() uses cancel_delayed_work() when it should use cancel_delayed_work_sync(). cancel_delayed_work() does not wait for the workqueue handler to exit. The ondemand governor does not seem to be affected because the "if (!dbs_info->enable)" check at the beginning of the workqueue handler returns immediately without rescheduling the work. The conservative governor in 2.6.30-rc has the same check as the ondemand governor, which makes things usually run smoothly. However, if the governor is quickly stopped and then started, this could lead to the following race : dbs_enable could be reenabled and multiple do_dbs_timer handlers would run. This is why a synchronized teardown is required. The following patch applies to, at least, 2.6.28.x, 2.6.29.1, 2.6.30-rc2. Depends on patch cpufreq: remove rwsem lock from CPUFREQ_GOV_STOP call Signed-off-by: Mathieu Desnoyers CC: Andrew Morton CC: gregkh@suse.de CC: stable@kernel.org CC: cpufreq@vger.kernel.org CC: Ingo Molnar CC: rjw@sisk.pl CC: Ben Slusky Signed-off-by: Dave Jones --- drivers/cpufreq/cpufreq_ondemand.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) (limited to 'drivers') diff --git a/drivers/cpufreq/cpufreq_ondemand.c b/drivers/cpufreq/cpufreq_ondemand.c index 338f428a15b..e741c339df7 100644 --- a/drivers/cpufreq/cpufreq_ondemand.c +++ b/drivers/cpufreq/cpufreq_ondemand.c @@ -98,6 +98,9 @@ static unsigned int dbs_enable; /* number of CPUs using this policy */ * (like __cpufreq_driver_target()) is being called with dbs_mutex taken, then * cpu_hotplug lock should be taken before that. Note that cpu_hotplug lock * is recursive for the same process. -Venki + * DEADLOCK ALERT! (2) : do_dbs_timer() must not take the dbs_mutex, because it + * would deadlock with cancel_delayed_work_sync(), which is needed for proper + * raceless workqueue teardown. */ static DEFINE_MUTEX(dbs_mutex); @@ -562,7 +565,7 @@ static inline void dbs_timer_init(struct cpu_dbs_info_s *dbs_info) static inline void dbs_timer_exit(struct cpu_dbs_info_s *dbs_info) { dbs_info->enable = 0; - cancel_delayed_work(&dbs_info->work); + cancel_delayed_work_sync(&dbs_info->work); } static int cpufreq_governor_dbs(struct cpufreq_policy *policy, -- cgit v1.2.3