Linux Dynamic Frequency Scaling: the cpufreq Framework

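I. Introduction

The Linux kernel implements power management for the CPU cores of an SMP system through three mechanisms:

cpu hotplug: enables or disables CPU cores according to the use case.
cpuidle framework: puts a CPU core into an idle state when it has no task to run.
cpufreq framework: adjusts the voltage and frequency of a CPU core according to the use case and the system load.

For a CPU core, power consumption and performance pull in opposite directions. Adjusting the CPU's voltage and frequency lets the system strike a balance between the two.

Because the adjustment happens while the system is running, this capability of the cpufreq framework is also called Dynamic Voltage/Frequency Scaling (DVFS).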

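II. Overview of the cpufreq framework

The core job of the cpufreq framework is to balance performance against power consumption by adjusting the voltage and frequency of the CPU cores: when high performance is not needed, it lowers voltage and frequency to save power; when it is needed, it raises them.

The framework is built on three concepts: a policy defines the allowed frequency range, a governor decides how to compute a suitable frequency within that range, and a cpufreq_driver (platform specific) carries out the actual frequency change.

The cpufreq framework supports two ways of scaling frequency:

For CPUs that can scale on their own, the CPU adjusts voltage and frequency itself according to its load. The framework only supplies the allowed frequency range and a rough usage hint (for example, performance-oriented or power-saving), and no governor is involved.
For CPUs that cannot, a governor computes a suitable frequency for the use case, and the driver sets the CPU frequency and voltage (built on the clock framework and the regulator framework).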

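1. Common governor types

Performance: always keeps the CPU in its highest-power, highest-performance state, i.e. the highest frequency/voltage the hardware supports.
Powersave: always keeps the CPU in its lowest-power, lowest-performance state, i.e. the lowest frequency/voltage the hardware supports.
Ondemand: uses a CPU load threshold T. When the load is below T, it selects the lowest frequency/voltage that still satisfies the current load; when the load exceeds T, it jumps straight to the highest performance state.
Conservative: similar to Ondemand, but when the load exceeds T it raises frequency/voltage step by step instead of jumping straight to the highest performance state.
Userspace: exposes the control interface through sysfs so that user space can implement its own policy.
Schedutil: introduced in Linux 4.7. It scales voltage/frequency based on the CPU utilization information provided by the scheduler. The effect resembles Ondemand, but it is more precise and natural because the scheduler has the best view of how the CPU is being used.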
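The difference between the Ondemand and Conservative descriptions above can be sketched in a few lines of user-space C. This is only an illustrative model, not kernel code: the function names, the fixed step size and the separate down threshold for the conservative case are assumptions made for the example.

```c
#include <assert.h>

/* Hypothetical model of an ondemand-style decision: above the threshold,
 * jump straight to the maximum; otherwise scale linearly with the load. */
static unsigned int ondemand_next(unsigned int load, unsigned int up_threshold,
                                  unsigned int min_khz, unsigned int max_khz)
{
    assert(load <= 100 && min_khz <= max_khz);
    if (load > up_threshold)
        return max_khz;
    return min_khz + load * (max_khz - min_khz) / 100;
}

/* Hypothetical model of a conservative-style decision: move one fixed step
 * up or down from the current frequency and stay inside [min_khz, max_khz]. */
static unsigned int conservative_next(unsigned int load,
                                      unsigned int up_threshold,
                                      unsigned int down_threshold,
                                      unsigned int cur_khz,
                                      unsigned int step_khz,
                                      unsigned int min_khz,
                                      unsigned int max_khz)
{
    assert(min_khz <= cur_khz && cur_khz <= max_khz);
    if (load > up_threshold) {
        unsigned int next = cur_khz + step_khz;
        return next > max_khz ? max_khz : next;
    }
    if (load < down_threshold) {
        if (cur_khz < min_khz + step_khz)
            return min_khz;
        return cur_khz - step_khz;
    }
    return cur_khz; /* load is between the thresholds: keep the frequency */
}
```

With the same 90% load and an 80% threshold, the first function returns the maximum immediately while the second only moves one step up from the current frequency.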

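2. User-space interface

The sysfs directory is /sys/devices/system/cpu/cpu0/cpufreq.

Name	Description
cpuinfo_max_freq/cpuinfo_min_freq	Highest and lowest operating frequency supported by the CPU hardware
cpuinfo_cur_freq	Current operating frequency as read from the CPU hardware registers
affected_cpus	CPUs affected by this cpufreq policy (offline CPUs are not shown)
related_cpus	CPUs affected by this cpufreq policy (all CPUs, online and offline)
scaling_max_freq/scaling_min_freq	Highest and lowest operating frequency allowed by the cpufreq policy
scaling_cur_freq	Operating frequency currently set by the cpufreq policy
scaling_available_governors	Governors supported by the running system
scaling_available_frequencies	Frequencies available for scaling
scaling_driver	cpufreq driver currently in use
scaling_governor	Governor currently in use
scaling_setspeed	Usable only after switching the governor to userspace; echoing a value into this file switches the frequency
The following tunables appear in the ondemand directory created after switching the governor to ondemand:
sampling_rate	Sampling interval currently in use, in microseconds
sampling_rate_min	Shortest allowed sampling interval
sampling_rate_max	Longest allowed sampling interval
up_threshold	Load threshold; when the load exceeds it, the governor raises the CPU frequency
ignore_nice_load	0 or 1 (default 0). When set to 1, CPU time spent in processes with a positive nice value is not counted toward CPU utilization; when set to 0, all processes are counted.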
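The attributes in the table can be read from a program with ordinary file I/O. The sketch below shows one way to do it in C; the helper names cpufreq_attr_path and cpufreq_read_attr are made up for this example, and it assumes the per-CPU layout /sys/devices/system/cpu/cpuN/cpufreq/ described above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Build "/sys/devices/system/cpu/cpu<N>/cpufreq/<attr>" into buf.
 * Returns 0 on success, -1 if the buffer is too small. */
static int cpufreq_attr_path(char *buf, size_t len, unsigned int cpu,
                             const char *attr)
{
    int n = snprintf(buf, len, "/sys/devices/system/cpu/cpu%u/cpufreq/%s",
                     cpu, attr);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Read the first line of a cpufreq attribute into out (newline stripped).
 * Returns 0 on success, -1 if the file is missing or unreadable. */
static int cpufreq_read_attr(unsigned int cpu, const char *attr,
                             char *out, size_t len)
{
    char path[128];
    FILE *f;

    assert(out && len > 0);
    if (cpufreq_attr_path(path, sizeof(path), cpu, attr) != 0)
        return -1;
    f = fopen(path, "r");
    if (!f)
        return -1;
    if (!fgets(out, (int)len, f)) {
        fclose(f);
        return -1;
    }
    fclose(f);
    out[strcspn(out, "\n")] = '\0';
    return 0;
}
```

A caller would, for instance, pass "scaling_governor" or "scaling_cur_freq" as attr and print the returned string. On a machine without cpufreq support the read simply fails with -1.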

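III. cpufreq software architecture

This section describes the software layers at a conceptual level, together with the main responsibilities of each part and the key members of the related structures. The implementation is covered in the next section.

cpufreq core: common logic and interface code factored out so that it is independent of both the platform and the governor.
governor core: a governor needs a frequency range and parameters (thresholds and so on) to compute a suitable frequency; the governor core factors out the logic and interface code shared by governors, independent of any particular governor.
governor: a concrete governor implementation. Each governor computes the frequency differently, and the implementation is platform independent.
cpufreq driver: performs platform-specific initialization and controls CPU frequency and voltage, building on the cpu subsystem driver, OPP, the clock framework, the regulator framework and similar modules.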

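1. cpufreq core

The cpufreq core is the central module of the cpufreq framework. Like other kernel frameworks, it serves three roles:

Upward, it exposes a uniform interface to user space through sysfs and notifies other drivers of frequency changes through notifiers.
Downward, it provides the driver framework for controlling CPU frequency and voltage, which simplifies writing low-level drivers, and the governor framework for implementing different scaling strategies.
Internally, it encapsulates the logic needed to make this work. That logic revolves around three data structures: struct cpufreq_policy, struct cpufreq_driver and struct cpufreq_governor.

1.1 struct cpufreq_policy

The kernel uses a cpufreq policy to abstract cpufreq. A policy is essentially the range within which the frequency may be adjusted, and to some extent it represents the properties of cpufreq itself. That is what struct cpufreq_policy stands for.

The kernel abstracts a CPU bus on which all CPU devices hang. cpufreq is one particular capability of a CPU device and is modelled as a subsys interface (struct subsys_interface). All such capabilities on a bus are linked into the "interfaces" list of the bus's struct subsys_private.

Put simply, cpufreq is registered on the cpu_subsys bus as a capability of every CPU device. The matching process is as follows.

(1) register_cpu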

//drivers/base/cpu.c
register_cpu
    cpu->dev.bus = &cpu_subsys;
    device_register
        device_add
            bus_add_device
                error = device_add_groups(dev, bus->dev_groups);//add the bus's default attribute groups to the device
                klist_add_tail(&dev->p->knode_bus, &bus->p->klist_devices);//add the device to the device list in subsys_private

Each CPU registers its dev on the cpu_subsys bus, in bus->p->klist_devices.

(2) cpufreq initialization

//drivers/cpufreq/cpufreq.c
cpufreq_register_driver
	subsys_interface_register(&cpufreq_interface);//entry point

static struct subsys_interface cpufreq_interface = {
	.name		= "cpufreq",
	.subsys		= &cpu_subsys,
	.add_dev	= cpufreq_add_dev,
	.remove_dev	= cpufreq_remove_dev,
};

//drivers/base/bus.c
int subsys_interface_register(struct subsys_interface *sif)
{
	/* ... */
	list_add_tail(&sif->node, &subsys->p->interfaces);//add cpufreq to the subsys->p->interfaces list mentioned above
	if (sif->add_dev) {
		subsys_dev_iter_init(&iter, subsys, NULL, NULL);
		//walk bus->p->klist_devices of cpu_subsys and call the add_dev callback for every device
		while ((dev = subsys_dev_iter_next(&iter)))
			sif->add_dev(dev, sif);
		subsys_dev_iter_exit(&iter);
	}
	/* ... */
}
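The registration pattern above (devices on a bus, an interface whose add_dev callback is run for every device) can be modelled in user space. The following is a toy sketch, not the driver-core implementation: all toy_ names are invented, the bus holds a single interface, and a fixed-size array replaces the kernel's klist.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_DEVS 8

struct toy_dev {
    int id;
    int has_cpufreq;            /* set once the interface has seen this device */
};

struct toy_interface {
    const char *name;
    int (*add_dev)(struct toy_dev *dev);
};

struct toy_bus {
    struct toy_dev *devs[TOY_MAX_DEVS];
    size_t ndevs;
    struct toy_interface *iface; /* the model supports one interface only */
};

/* Add a device to the bus; an already registered interface sees it at once. */
static int toy_bus_add_device(struct toy_bus *bus, struct toy_dev *dev)
{
    assert(bus && dev);
    if (bus->ndevs == TOY_MAX_DEVS)
        return -1;
    bus->devs[bus->ndevs++] = dev;
    if (bus->iface && bus->iface->add_dev)
        bus->iface->add_dev(dev);
    return 0;
}

/* Register the interface and run add_dev for every device already present. */
static int toy_interface_register(struct toy_bus *bus,
                                  struct toy_interface *iface)
{
    size_t i;

    assert(bus && iface);
    bus->iface = iface;
    if (iface->add_dev)
        for (i = 0; i < bus->ndevs; i++)
            iface->add_dev(bus->devs[i]);
    return 0;
}

/* Stand-in for cpufreq_add_dev: just mark the device as handled. */
static int toy_cpufreq_add_dev(struct toy_dev *dev)
{
    dev->has_cpufreq = 1;
    return 0;
}
```

The point of the pattern is ordering independence: it does not matter whether the CPUs or the cpufreq interface show up first, every CPU device ends up being passed to add_dev exactly once.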

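With the position of cpufreq_policy on the bus understood, here is its definition:

//include/linux/cpufreq.h
struct cpufreq_cpuinfo {
	unsigned int		max_freq;//maximum frequency supported by the CPU hardware
	unsigned int		min_freq;//minimum frequency supported by the CPU hardware

	/* in 10^(-9) s = nanoseconds */
	unsigned int		transition_latency;//frequency transition latency
};

/* abridged */
struct cpufreq_policy {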
	/* CPUs sharing clock, require sw coordination */
	cpumask_var_t		cpus;	/* Online CPUs only */
	cpumask_var_t		related_cpus; /* Online + Offline CPUs */
	cpumask_var_t		real_cpus; /* Related and present */

	unsigned int		cpu;    /* cpu managing this policy, must be online */

	unsigned int		min;    /* minimum frequency allowed by the policy (kHz) */
	unsigned int		max;    /* maximum frequency allowed by the policy (kHz) */
	unsigned int		cur;    /* current frequency set by the policy (kHz), only needed if cpufreq governors are used */

	struct cpufreq_governor	*governor; /* governor attached to this policy */
	void			*governor_data;		/* private data exchanged with the governor, e.g. thresholds and sampling rate */

	struct work_struct	update; /* work item for deferred policy updates, also used by cpufreq QoS */

	struct freq_constraints	constraints;//cpufreq QoS frequency constraints
	struct freq_qos_request	*min_freq_req;
	struct freq_qos_request	*max_freq_req;

	struct cpufreq_frequency_table	*freq_table;//table of currently available frequencies

	/* cpufreq-stats */
	struct cpufreq_stats	*stats;

	/* Pointer to the cooling device if used for thermal mitigation */
	struct thermal_cooling_device *cdev;

	struct notifier_block nb_min;//notifier blocks used by cpufreq QoS
	struct notifier_block nb_max;
};

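On some platforms the frequency and voltage of all CPU cores are controlled together, so changing the frequency of one core affects the others. In that case cpufreq only needs to be implemented for one core (usually cpu0), and the cpufreq directories of the other cores are symbolic links to it.

On other platforms the cores can be controlled individually, and the cpufreq directories under the different cpu directories differ.

How many CPU cores does a given cpufreq policy control? The affected_cpus and related_cpus sysfs files answer this: affected_cpus lists the CPUs the policy affects, excluding offline ones, while related_cpus lists all of them, online and offline. affected_cpus corresponds to policy->cpus and related_cpus to policy->related_cpus.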

1.2 struct cpufreq_driver

struct cpufreq_driver abstracts a cpufreq driver and is the structure platform driver engineers deal with most. Its definition:

//include/linux/cpufreq.h
/* abridged */
struct cpufreq_driver {
	char	name[CPUFREQ_NAME_LEN];//driver name, reported through scaling_driver; only one cpufreq driver can be registered at a time
	u8		flags;
	void	*driver_data;

	/* needed by all drivers */
	int		(*init)(struct cpufreq_policy *policy);//initialization: fills in the policy
	int		(*verify)(struct cpufreq_policy *policy);//checks that the policy contents meet the hardware requirements

	/* define one out of two */
	int		(*setpolicy)(struct cpufreq_policy *policy);//first interface: sets the range for dynamic frequency scaling

	/* second interface: the driver implements one of these two (target is the old one and is deprecated) to set the CPU frequency and the matching voltage */
	int		(*target)(struct cpufreq_policy *policy,
				  unsigned int target_freq,
				  unsigned int relation);	/* Deprecated */
	int		(*target_index)(struct cpufreq_policy *policy,
					unsigned int index);

	/* platform specific boost support code */
	bool		boost_enabled;
	int		(*set_boost)(int state);
};

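1.3 struct cpufreq_governor

For CPUs that cannot scale on their own, software must compute a suitable frequency. Different use cases call for different schemes, and that is the governor's job.

//include/linux/cpufreq.h
/* abridged */
struct cpufreq_governor {
	char	name[CPUFREQ_NAME_LEN];//governor name

	/* callbacks a governor must implement: init, start, stop and so on */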
	int	(*init)(struct cpufreq_policy *policy);
	void	(*exit)(struct cpufreq_policy *policy);
	int	(*start)(struct cpufreq_policy *policy);
	void	(*stop)(struct cpufreq_policy *policy);
	void	(*limits)(struct cpufreq_policy *policy);
	
	/* callbacks behind the sysfs "setspeed" attribute file */
	ssize_t	(*show_setspeed)	(struct cpufreq_policy *policy, char *buf);
	int	(*store_setspeed)	(struct cpufreq_policy *policy, unsigned int freq);

	struct list_head	governor_list;//links the governor into the list of governors available on the system
};

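2. cpufreq driver

Implements the struct cpufreq_driver callbacks defined by the cpufreq core; platform specific.

3. cpufreq governor

A cpufreq governor module implements the struct cpufreq_governor defined by the cpufreq core. The kernel ships several governors. In earlier kernel versions each governor was almost entirely separate code that monitored system load in its own way, even though the monitoring logic was often very similar; what really differs between governors is how they compute a frequency from the measured load. To reduce duplication, newer kernels factor the common logic out into drivers/cpufreq/cpufreq_governor.c, and each governor only implements callbacks.

struct dbs_governor is the central data structure:

//drivers/cpufreq/cpufreq_governor.h
/* abridged */
struct dbs_governor {
	struct cpufreq_governor gov;//interface to the cpufreq core, i.e. policy->governor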
	struct kobj_type kobj_type;

	/*
	 * Common data for platforms that don't set
	 * CPUFREQ_HAVE_GOVERNOR_PER_POLICY
	 */
	struct dbs_data *gdbs_data;//tunables shared by all policies, reached through policy->governor_data->dbs_data

	/* callbacks each governor implements */
	unsigned int (*gov_dbs_update)(struct cpufreq_policy *policy);//the frequency update entry point
	struct policy_dbs_info *(*alloc)(void);
	void (*free)(struct policy_dbs_info *policy_dbs);
	int (*init)(struct dbs_data *dbs_data);
	void (*exit)(struct dbs_data *dbs_data);
	void (*start)(struct cpufreq_policy *policy);
};

struct dbs_data holds the parameters a governor uses to compute the frequency, such as the threshold and the sampling rate.

//drivers/cpufreq/cpufreq_governor.h
struct dbs_data {
	struct gov_attr_set attr_set;
	void *tuners;
	unsigned int ignore_nice_load;
	unsigned int sampling_rate;
	unsigned int sampling_down_factor;
	unsigned int up_threshold;
	unsigned int io_is_busy;
};

struct policy_dbs_info is the private data passed between a policy and its governor; it can be viewed as a wrapper around struct dbs_data.

//drivers/cpufreq/cpufreq_governor.h
struct policy_dbs_info {
	struct cpufreq_policy *policy;
	/*
	 * Per policy mutex that serializes load evaluation from limit-change
	 * and work-handler.
	 */
	struct mutex update_mutex;

	u64 last_sample_time;
	s64 sample_delay_ns;
	atomic_t work_count;
	struct irq_work irq_work;
	struct work_struct work;
	/* dbs_data may be shared between multiple policy objects */
	struct dbs_data *dbs_data;
	struct list_head list;
	/* Multiplier for increasing sample delay temporarily. */
	unsigned int rate_mult;
	unsigned int idle_periods;	/* For conservative */
	/* Status indicators */
	bool is_shared;		/* This object is used by multiple CPUs */
	bool work_in_progress;	/* Work is being queued up or in progress */
};

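struct cpu_dbs_info gathers the helper variables needed to compute CPU load; it is a per-CPU variable.

//drivers/cpufreq/cpufreq_governor.h
struct cpu_dbs_info {
	u64 prev_cpu_idle;//total time this CPU had spent idle as of the previous sample
	u64 prev_update_time;//timestamp of the previous sample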
	u64 prev_cpu_nice;
	/*
	 * Used to keep track of load in the previous interval. However, when
	 * explicitly set to zero, it is used as a flag to ensure that we copy
	 * the previous load to the current interval only once, upon the first
	 * wake-up from idle.
	 */
	unsigned int prev_load;//load measured in the previous sample
	struct update_util_data update_util;//hook registered with the scheduler
	struct policy_dbs_info *policy_dbs;//private data passed between the policy and the governor
};

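IV. cpufreq core

The previous section covered the main contents of each data structure. This section follows the cpufreq initialization flow to show how dynamic frequency scaling works.

cpufreq_register_driver is the entry point for registering a cpufreq driver: the driver calls it during initialization and passes in its struct cpufreq_driver. As noted in the discussion of struct cpufreq_policy, cpufreq_register_driver calls subsys_interface_register, which ends up invoking the cpufreq_add_dev callback. In detail:

//drivers/cpufreq/cpufreq.c
cpufreq_add_dev
	cpufreq_online//core initialization function
		policy = cpufreq_policy_alloc(cpu);//on first initialization, allocate a policy structure; it is later stored in the per-CPU cpufreq_cpu_data
			// cpufreq_global_kobject is the /sys/devices/system/cpu/cpufreq directory; the basic cpufreq sysfs node is created here (1)
			ret = kobject_init_and_add(&policy->kobj, &ktype_cpufreq, cpufreq_global_kobject, "policy%u", cpu);
			freq_constraints_init(&policy->constraints);//initialize the constraints (2)
			freq_qos_add_notifier//register the constraint notifier chains
			INIT_WORK(&policy->update, handle_update);
		cpufreq_driver->init(policy);//call the driver's init callback, which initializes the policy and fills in the cpufreq table
		cpufreq_table_validate_and_sort//validate and sort the cpufreq table, fill in policy->max and policy->cpuinfo.max_freq
		add_cpu_dev_symlink//create the cpufreq symlink under /sys/devices/system/cpu/cpuX
		freq_qos_add_request//add the constraint requests
		blocking_notifier_call_chain//notify that the policy has been created (3)
		cpufreq_add_dev_interface//create sysfs nodes: optional attributes under /sys/devices/system/cpu/cpufreq/policyX (4)
		cpufreq_stats_create_table(policy);//CONFIG_CPU_FREQ_STAT feature (5)
		cpufreq_times_create_policy(policy);//CONFIG_CPU_FREQ_TIMES feature (6)
		cpufreq_init_policy//initialize the policy with the default governor (7)
		cpufreq_thermal_control_enabled//thermal cooling device (8)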

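(1) cpufreq_global_kobject

//drivers/cpufreq/cpufreq.c
cpufreq_core_init
	cpufreq_global_kobject = kobject_create_and_add("cpufreq", 
						&cpu_subsys.dev_root->kobj);//creates a cpufreq directory on the cpu bus

(2) Constraints
These are policy->max and policy->min, which bound dynamic frequency scaling. The cpufreq QoS mechanism was introduced to manage them; see the freq QoS section below.

(3) cpufreq notifiers
The cpufreq notification system uses the kernel's standard notifier interface. It offers two kinds of notification: policy notifications and transition notifications.

Policy notifications tell other modules that a policy has changed. The events are:

CPUFREQ_CREATE_POLICY ===> cpufreq_online
CPUFREQ_REMOVE_POLICY ===> cpufreq_policy_free

The transition notifier chain informs registered listeners when the driver changes the CPU frequency. Each frequency change sends two events:

CPUFREQ_PRECHANGE: sent before the change.
CPUFREQ_POSTCHANGE: sent after the change has completed.

The corresponding notifier heads:
SRCU_NOTIFIER_HEAD_STATIC(cpufreq_transition_notifier_list);
static BLOCKING_NOTIFIER_HEAD(cpufreq_policy_notifier_list);

(4) cpufreq_add_dev_interface
Creates sysfs nodes: optional attributes under /sys/devices/system/cpu/cpufreq/policyX, including

extra sysfs attributes provided by the driver (requires driver support, cpufreq_driver->attr)
cpuinfo_cur_freq: the frequency the CPU is currently running at (requires driver support, cpufreq_driver->get)
scaling_cur_freq: the frequency currently set by the policy (always created)
bios_limit: hardware limit (requires driver support, cpufreq_driver->bios_limit)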

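(5) CONFIG_CPU_FREQ_STAT
Records how long each CPU has spent at each frequency of its policy. The kernel's cpufreq-stats documentation gives the unit as 10 ms (the value is converted from jiffies to user-space clock ticks). The corresponding node: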

root@kylin-cp302l2:/sys/devices/system/cpu/cpu0/cpufreq/stats# cat time_in_state 
2000000 5582
1750000 38
1500000 33
1250000 186
1000000 955
750000 2147

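When a frequency change of a cpufreq policy completes, the cpufreq driver sends a notification through cpufreq_freq_transition_end -> cpufreq_notify_transition(CPUFREQ_POSTCHANGE), and the cpufreq_stats function cpufreq_stats_record_transition records the new target frequency and updates the time accounting.

The cpufreq_stats code lives in drivers/cpufreq/cpufreq_stats.c.

(6) CONFIG_CPU_FREQ_TIMES
Records how long each process has spent at each frequency of a policy, in ticks. The corresponding node: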

root@kylin-cp302l2:/proc/1099# cat time_in_state 
cpu0
2000000 81
1750000 25
1500000 18
1250000 14
1000000 10
750000 32
cpu1
2000000 186
1750000 0
1500000 0
1250000 0
1000000 0
750000 0
cpu2
2000000 260
1750000 0
1500000 0
1250000 0
1000000 0
750000 0
cpu3
2000000 322
1750000 0
1500000 0
1250000 0
1000000 0
750000 0

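When a frequency change of a cpufreq policy completes, the cpufreq driver sends a notification through cpufreq_freq_transition_end -> cpufreq_notify_transition(CPUFREQ_POSTCHANGE), and the cpufreq_times function cpufreq_times_record_transition records the new target frequency.
When the cputime code handles a timer interrupt, it calls cpufreq_acct_update_power to add that tick to the current task's statistics for the current frequency in cpufreq_times.

The cpufreq_times code lives in drivers/cpufreq/cpufreq_times.c.

Both features require a frequency table.

(7) Initializing the policy with the default governor
If the driver implements setpolicy, that callback is invoked directly; if it implements target, the corresponding governor is initialized. See the cpufreq governor section below.

(8) Thermal cooling device integration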

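V. cpufreq driver

The cpufreq driver performs the platform-specific control of CPU frequency and voltage. Within the cpufreq framework it is a fairly simple module, and its work consists of the following:

Platform-specific initialization, including obtaining and initializing the CPU's clock and regulator.
Building the frequency table, i.e. the list of frequency/voltage pairs the CPU supports.
Defining a struct cpufreq_driver variable, filling in the required fields and implementing the callbacks that suit the platform.
Calling cpufreq_register_driver to register the driver with the cpufreq framework.
When a CPU device is added, the cpufreq core calls driver->init, in which the driver must initialize the struct cpufreq_policy variable.
While the system runs, the cpufreq core calls the driver's setpolicy or target/target_index callbacks as needed to set the scaling policy or the frequency.
On system suspend, the CPU frequency is set to a specified value or the driver's suspend callback is called; on resume, the driver's resume callback is called.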

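VI. frequency table

A frequency table is a set of frequency/voltage pairs at which the CPU core runs correctly. One reason it exists: because the table maps each frequency to exactly one voltage, the cpufreq framework only has to reason about frequency, and every strategy is called a "frequency scaling" strategy. While changing the frequency, the cpufreq driver can look up the matching voltage in the table and adjust the core voltage as well, which gives voltage scaling and keeps the design simple.

//include/linux/cpufreq.h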
/* Special Values of .frequency field */
#define CPUFREQ_ENTRY_INVALID   ~0u
#define CPUFREQ_TABLE_END       ~1u
/* Special Values of .flags field */
#define CPUFREQ_BOOST_FREQ      (1 << 0)
 
struct cpufreq_frequency_table {
    
        unsigned int    flags;
        unsigned int    driver_data; /* driver specific data, not used by core */
        unsigned int    frequency; /* kHz - doesn't need to be in ascending
                                    * order */
};

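Name	Description
flags	CPUFREQ_BOOST_FREQ marks this entry as a boost frequency
driver_data	Used by the driver, which defines its meaning (for example a voltage)
frequency	Frequency in kHz; entries need not be sorted. CPUFREQ_ENTRY_INVALID marks an invalid entry and CPUFREQ_TABLE_END marks the end of the table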
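The sentinel values make the table easy to walk. The user-space sketch below mirrors the layout of the structure with invented toy_ names; storing a voltage in microvolts in driver_data is an assumption made for the example, since the meaning of that field is up to the driver.

```c
#include <assert.h>

#define TOY_ENTRY_INVALID (~0u)   /* same value as CPUFREQ_ENTRY_INVALID */
#define TOY_TABLE_END     (~1u)   /* same value as CPUFREQ_TABLE_END */

struct toy_freq_entry {
    unsigned int flags;
    unsigned int driver_data;     /* assumption: voltage in microvolts */
    unsigned int frequency;       /* kHz, not necessarily sorted */
};

/* Find the lowest and highest valid frequency. Returns -1 if none is valid. */
static int toy_table_limits(const struct toy_freq_entry *t,
                            unsigned int *min_khz, unsigned int *max_khz)
{
    unsigned int lo = ~0u, hi = 0;
    int valid = 0;

    assert(t && min_khz && max_khz);
    for (; t->frequency != TOY_TABLE_END; t++) {
        if (t->frequency == TOY_ENTRY_INVALID)
            continue;             /* skip holes in the table */
        if (t->frequency < lo)
            lo = t->frequency;
        if (t->frequency > hi)
            hi = t->frequency;
        valid++;
    }
    if (!valid)
        return -1;
    *min_khz = lo;
    *max_khz = hi;
    return 0;
}

/* Look up the voltage paired with an exact frequency. Returns -1 if absent. */
static int toy_table_voltage(const struct toy_freq_entry *t,
                             unsigned int khz, unsigned int *uv)
{
    assert(t && uv);
    for (; t->frequency != TOY_TABLE_END; t++) {
        if (t->frequency != TOY_ENTRY_INVALID && t->frequency == khz) {
            *uv = t->driver_data;
            return 0;
        }
    }
    return -1;
}
```

The first helper corresponds to how the minimum and maximum of cpuinfo can be derived from the table; the second shows why the framework can talk about frequency only and leave the voltage lookup to the driver.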

Boost refers to dynamic overclocking, a feature found on x86 platforms; see "turbo-boost-technology.html" for details. It is not covered further here.

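init

The main job of the init callback is to initialize struct cpufreq_policy. A driver should not have to care about the structure's internals, and the cpufreq framework works toward that goal, for example by wrapping the initialization in an API. The members a driver mainly cares about are:

cpus: tells the cpufreq core which CPUs this policy applies to. In most cases all CPUs in the system share the same hardware logic, so a single policy can manage all CPU cores.
clk: clock pointer, which the cpufreq core can use to obtain the actual current frequency.
cpuinfo: fixed scaling information for this CPU: maximum frequency, minimum frequency and transition latency. The maximum and minimum can be derived from the frequency table.
min, max: minimum and maximum frequency of the scaling policy. At initialization they can equal the cpuinfo minimum and maximum.
freq_table: the corresponding frequency table.

cpuinfo, min, max and freq_table can all be initialized through the cpufreq_generic_init helper provided by the cpufreq core:

//drivers/cpufreq/cpufreq.c
void cpufreq_generic_init(struct cpufreq_policy *policy,
		struct cpufreq_frequency_table *table, unsigned int transition_latency)

The helper takes the policy to initialize, the frequency table and the transition latency, extracts from the table what the policy needs, and initializes the policy.

In most cases calling cpufreq_generic_init from init is enough.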

verify

When upper-layer software wants to set a new policy, driver->verify is called to check that the policy is valid. The cpufreq core provides two helpers for this:

//drivers/cpufreq/freq_table.c
int cpufreq_frequency_table_verify(struct cpufreq_policy *policy,
                                    struct cpufreq_frequency_table *table);
int cpufreq_generic_frequency_table_verify(struct cpufreq_policy *policy);

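cpufreq_frequency_table_verify checks the policy against the given frequency table. The logic is simple: it checks whether the policy's range {min, max} goes beyond the range in policy->cpuinfo or beyond the frequencies in the table.
cpufreq_generic_frequency_table_verify wraps the former and uses policy->freq_table.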
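The range check can be pictured as clamping a requested {min, max} pair into the hardware range. The sketch below is a simplified user-space model under that assumption: the toy_ names are invented, and it leaves out the additional check that at least one table frequency falls inside the resulting range.

```c
#include <assert.h>

struct toy_limits {
    unsigned int min;   /* requested minimum, kHz */
    unsigned int max;   /* requested maximum, kHz */
};

/* Clamp a requested range into [hw_min, hw_max] and keep min <= max. */
static void toy_verify_within(struct toy_limits *req,
                              unsigned int hw_min, unsigned int hw_max)
{
    assert(req && hw_min <= hw_max);
    if (req->min < hw_min)
        req->min = hw_min;
    if (req->max < hw_min)
        req->max = hw_min;
    if (req->min > hw_max)
        req->min = hw_max;
    if (req->max > hw_max)
        req->max = hw_max;
    if (req->min > req->max)
        req->min = req->max;   /* an inverted request collapses onto max */
}
```

For example, a request of 500000 to 3000000 kHz against hardware limits of 750000 to 2000000 kHz comes back as 750000 to 2000000 kHz, so an out-of-range request is adjusted rather than rejected in this model.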

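The cpufreq framework deals with "frequency" at several levels.
(1) At the bottom are the frequencies defined in the frequency table: a finite set of discrete frequencies that represents what the CPU can do.
(2) Above that is the range in policy->cpuinfo, a simple restriction on scaling. It may match the frequency table or be narrower. It must be set when the driver initializes and cannot be changed afterwards.
(3) Above that is the policy's own range, which represents the scaling policy. For CPUs that scale on their own, this range is simply handed to the CPU and is the basic unit of scaling. For CPUs that do not, it is a software-level limit. It can be changed through sysfs.
(4) At the top is policy->cur. For CPUs that cannot scale on their own, this is the frequency the CPU runs at.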

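setpolicy

For CPUs that scale on their own, the driver must provide this callback, which passes the scaling range to the CPU.

target

Deprecated, so it is not covered here.

target_index

For CPUs that cannot scale on their own, this callback sets the CPU's operating frequency. index is an index into the frequency table.

The driver uses index to fetch the frequency and sets the CPU to that frequency through the clock framework API.

The driver can also call the OPP interface to get the voltage for that frequency and set the CPU voltage through the regulator framework API.

get

Returns the frequency of the given CPU; drivers should provide it whenever possible. If init assigned policy->clk, the generic helper from the cpufreq framework can be used:

//drivers/cpufreq/cpufreq.c
unsigned int cpufreq_generic_get(unsigned int cpu);

This helper calls the clock framework API directly to read the frequency from policy->clk.

freq_attr

If a cpufreq driver needs to expose extra sysfs attributes, it can define them with the attribute macros below and store them in the cpufreq_driver->attr array. For the frequency table, the kernel already defines cpufreq_freq_attr_scaling_available_freqs, which can be used directly.

//include/linux/cpufreq.h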
struct freq_attr {
    
	struct attribute attr;
	ssize_t (*show)(struct cpufreq_policy *, char *);
	ssize_t (*store)(struct cpufreq_policy *, const char *, size_t count);
};

#define cpufreq_freq_attr_ro(_name)		\
static struct freq_attr _name =			\
__ATTR(_name, 0444, show_##_name, NULL)

#define cpufreq_freq_attr_ro_perm(_name, _perm)	\
static struct freq_attr _name =			\
__ATTR(_name, _perm, show_##_name, NULL)

#define cpufreq_freq_attr_rw(_name)		\
static struct freq_attr _name =			\
__ATTR(_name, 0644, show_##_name, store_##_name)

#define cpufreq_freq_attr_wo(_name)		\
static struct freq_attr _name =			\
__ATTR(_name, 0200, NULL, store_##_name)

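Others

exit: the counterpart of init, called when the CPU device is removed.
stop_cpu: called when the CPU is stopped.
suspend and resume: when the system suspends, the clock and regulator drivers may be suspended too, so the CPU must be put at a known frequency beforehand. The driver can do that in its suspend callback or through the policy's suspend_freq field (the cpufreq core then switches automatically).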

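VII. cpufreq governor

1. Registering a cpufreq_governor

Several governors can coexist in the system, and a policy is tied to one of them through the cpufreq_policy->governor pointer. Before a governor can be used, it must be registered with the cpufreq framework:

//drivers/cpufreq/cpufreq.c
int cpufreq_register_governor(struct cpufreq_governor *governor)
{
	/* ... */
	if (__find_governor(governor->name) == NULL) {
		err = 0;
		list_add(&governor->governor_list, &cpufreq_governor_list);
	}
	/* ... */
}

The cpufreq core defines a global list, cpufreq_governor_list. The registration function first looks the governor up by name with __find_governor(); if it has not been registered yet, the structure representing it is added to cpufreq_governor_list.

Every governor module calls this function from its module entry point, so a governor is registered with the cpufreq framework as soon as it is built in or its module is loaded.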

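2. Initialization flow

The kernel selects a default governor, def_gov, through options such as CONFIG_CPU_FREQ_DEFAULT_GOV_PERFORMANCE.
Initializing a conventional governor requires the cpufreq_driver to provide a target callback; intel_pstate does so when booted with intel_pstate=passive.

The initialization flow in the cpufreq core is as follows:

//drivers/cpufreq/cpufreq.c
cpufreq_online
	cpufreq_init_policy
		def_gov = cpufreq_default_governor();
		if (has_target()) {
			gov = def_gov;
			new_policy.governor = gov;
		} else {
			cpufreq_parse_policy(def_gov->name, &new_policy);//(1)
		}

		cpufreq_set_policy(policy, &new_policy);

(1) If the driver provides driver->setpolicy, the CPU can choose its own operating frequency within the range given by the policy, and no governor is involved. If the selected governor is performance or powersave, however, the driver still needs to be told so that driver->setpolicy can set the frequency range appropriately.

The driver is told through the policy->policy variable, whose possible values are CPUFREQ_POLICY_PERFORMANCE and CPUFREQ_POLICY_POWERSAVE.

Next, cpufreq_set_policy is called:

//drivers/cpufreq/cpufreq.c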
int cpufreq_set_policy(struct cpufreq_policy *policy,
		       struct cpufreq_policy *new_policy)
{
    
	struct cpufreq_governor *old_gov;
	int ret;

	pr_debug("setting new policy for CPU %u: %u - %u kHz\n",
		 new_policy->cpu, new_policy->min, new_policy->max);

	memcpy(&new_policy->cpuinfo, &policy->cpuinfo, sizeof(policy->cpuinfo));

	/*
	 * PM QoS framework collects all the requests from users and provide us
	 * the final aggregated value here.
	 */
	new_policy->min = freq_qos_read_value(&policy->constraints, FREQ_QOS_MIN);
	new_policy->max = freq_qos_read_value(&policy->constraints, FREQ_QOS_MAX);

	/* verify the cpu speed can be set within this limit */
	ret = cpufreq_driver->verify(new_policy);
	if (ret)
		return ret;

	policy->min = new_policy->min;
	policy->max = new_policy->max;
	trace_cpu_frequency_limits(policy);

	policy->cached_target_freq = UINT_MAX;

	pr_debug("new min and max freqs are %u - %u kHz\n",
		 policy->min, policy->max);

	//if the driver implements setpolicy, call it directly and skip the governor handling below
	if (cpufreq_driver->setpolicy) {
		policy->policy = new_policy->policy;
		pr_debug("setting range\n");
		return cpufreq_driver->setpolicy(policy);
	}

	//if the governor is unchanged, only refresh its limits and return
	if (new_policy->governor == policy->governor) {
		pr_debug("governor limits update\n");
		cpufreq_governor_limits(policy);
		return 0;
	}

	pr_debug("governor switch\n");

	/* save old, working values */
	old_gov = policy->governor;
	/* end old governor */
	if (old_gov) {
    
		cpufreq_stop_governor(policy);
		cpufreq_exit_governor(policy);
	}

	/* start new governor */
	policy->governor = new_policy->governor;
	ret = cpufreq_init_governor(policy);//core function that initializes the governor
	if (!ret) {
		ret = cpufreq_start_governor(policy);//start the governor
		if (!ret) {
			pr_debug("governor change\n");
			sched_cpufreq_governor_change(policy, old_gov);
			return 0;
		}
		cpufreq_exit_governor(policy);
	}

	/* new governor failed, so re-start old one */
	pr_debug("starting governor %s failed\n", policy->governor->name);
	if (old_gov) {
    
		policy->governor = old_gov;
		if (cpufreq_init_governor(policy))
			policy->governor = NULL;
		else
			cpufreq_start_governor(policy);
	}

	return ret;
}

//drivers/cpufreq/cpufreq.c
cpufreq_init_governor(policy);
	policy->governor->init(policy);//enters the governor core in drivers/cpufreq/cpufreq_governor.c (1)
		policy_dbs = alloc_policy_dbs_info(policy, gov);//allocate the private data passed between this policy and the governor
		dbs_data = gov->gdbs_data;
		if (dbs_data) {//(2)
			if (WARN_ON(have_governor_per_policy())) {
				ret = -EINVAL;
				goto free_policy_dbs_info;
			}
			policy_dbs->dbs_data = dbs_data;
			policy->governor_data = policy_dbs;
	
			gov_attr_set_get(&dbs_data->attr_set, &policy_dbs->list);
			goto out;
		}
		dbs_data = kzalloc(sizeof(*dbs_data), GFP_KERNEL);
		gov->init(dbs_data);//callback implemented by the specific governor
		dbs_data->sampling_rate = max_t(unsigned int,
						CPUFREQ_DBS_MIN_SAMPLING_INTERVAL,
						cpufreq_policy_transition_delay_us(policy));
		if (!have_governor_per_policy())//(2)
			gov->gdbs_data = dbs_data;
		policy_dbs->dbs_data = dbs_data;
		policy->governor_data = policy_dbs;

(1) The governor core gathers the common initialization in one place: the generic function cpufreq_dbs_governor_init.
(2) policy_dbs wraps dbs_data and is stored in each policy structure. If cpufreq_driver->flags contains CPUFREQ_HAVE_GOVERNOR_PER_POLICY, each policy can have its own governor tunables. Without the flag, policy_dbs->dbs_data is stored in governor->gdbs_data and shared.

Starting the governor

cpufreq_start_governor(policy);
	policy->governor->start(policy);//generic governor core function cpufreq_dbs_governor_start
		gov->start(policy);//callback implemented by the specific governor
		gov_set_update_util(policy_dbs, sampling_rate);
			cpufreq_add_update_util_hook(cpu, &cdbs->update_util, //install the utilization update callback
					     dbs_update_util_handler);

The important step when starting a governor is installing this callback: it is the function through which a suitable frequency is computed when scaling actually happens.

3. How the frequency is computed

Take od_dbs_update, the update function of the ondemand governor, as an example:

//drivers/cpufreq/cpufreq_ondemand.c
od_dbs_update
	od_update

static void od_update(struct cpufreq_policy *policy)
{
    
	struct policy_dbs_info *policy_dbs = policy->governor_data;
	struct od_policy_dbs_info *dbs_info = to_dbs_info(policy_dbs);
	struct dbs_data *dbs_data = policy_dbs->dbs_data;
	struct od_dbs_tuners *od_tuners = dbs_data->tuners;
	unsigned int load = dbs_update(policy);//load in percent (1)

	dbs_info->freq_lo = 0;

	/* Check for frequency increase */
	if (load > dbs_data->up_threshold) {
		//load is above the configured threshold: switch straight to the maximum frequency
		/* If switching to max speed, apply sampling_down_factor */
		if (policy->cur < policy->max)
			policy_dbs->rate_mult = dbs_data->sampling_down_factor;
		dbs_freq_increase(policy, policy->max);
	} else {
    
		/* Calculate the next frequency proportional to load */
		unsigned int freq_next, min_f, max_f;

		min_f = policy->cpuinfo.min_freq;
		max_f = policy->cpuinfo.max_freq;
		freq_next = min_f + load * (max_f - min_f) / 100;//pick a frequency within the range in proportion to the load

		/* No longer fully busy, reset rate_mult */
		policy_dbs->rate_mult = 1;

		if (od_tuners->powersave_bias)//(2)
			freq_next = od_ops.powersave_bias_target(policy,
								 freq_next,
								 CPUFREQ_RELATION_L);

		__cpufreq_driver_target(policy, freq_next, CPUFREQ_RELATION_C);//set the frequency
	}
}

(1) Current load: load = 100 * (time_elapsed - idle_time) / time_elapsed
idle_time = idle time at this sample - idle time at the previous sample
time_elapsed = total time at this sample - total time at the previous sample
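The load formula in (1) is easy to check with a small user-space function. This is a sketch of the arithmetic only: the structure and function names are invented, times are assumed to be cumulative counters in microseconds, and taking the highest per-CPU load as the load of the policy is an assumption about how the per-CPU values are combined.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Cumulative counters sampled at one point in time (assumed microseconds). */
struct toy_cpu_sample {
    uint64_t wall_us;   /* total elapsed time */
    uint64_t idle_us;   /* total time spent idle */
};

/* load = 100 * (time_elapsed - idle_time) / time_elapsed, in percent. */
static unsigned int toy_load_percent(const struct toy_cpu_sample *prev,
                                     const struct toy_cpu_sample *cur)
{
    uint64_t elapsed, idle;

    assert(cur->wall_us >= prev->wall_us && cur->idle_us >= prev->idle_us);
    elapsed = cur->wall_us - prev->wall_us;
    idle = cur->idle_us - prev->idle_us;
    if (elapsed == 0 || idle > elapsed)
        return 0;       /* nothing measurable in this interval */
    return (unsigned int)(100 * (elapsed - idle) / elapsed);
}

/* Combine per-CPU loads of one policy by taking the highest one. */
static unsigned int toy_policy_load(const unsigned int *loads, size_t n)
{
    unsigned int max = 0;
    size_t i;

    for (i = 0; i < n; i++)
        if (loads[i] > max)
            max = loads[i];
    return max;
}
```

With a previous sample of (1000, 400) and a current sample of (11000, 2900), the interval is 10000 us of which 2500 us were idle, so the load is 75%.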

(2) To save more power, the newly computed frequency is reduced further by the fraction set in powersave_bias. powersave_bias ranges from 0 to 1000, and each step represents 0.1%.

For powersave_bias, see Documentation/admin-guide/pm/cpufreq.rst:514
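The effect of powersave_bias on the computed frequency comes down to one line of arithmetic. The sketch below shows only that reduction step with an invented function name; it does not model how a governor then maps the reduced value onto the frequencies actually present in the table.

```c
#include <assert.h>
#include <stdint.h>

/* Reduce freq_khz by bias * 0.1%, where bias is in the range 0..1000. */
static unsigned int toy_apply_powersave_bias(unsigned int freq_khz,
                                             unsigned int bias)
{
    assert(bias <= 1000);
    /* 64-bit intermediate so that freq_khz * bias cannot overflow */
    return freq_khz - (unsigned int)((uint64_t)freq_khz * bias / 1000);
}
```

A bias of 100 therefore lowers a 2000000 kHz target by 10% to 1800000 kHz, and a bias of 0 leaves the target unchanged.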

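VIII. How the frequency gets set

The sections above covered what each part of the cpufreq framework does and how it is initialized. This section covers the scaling flow.

performance and powersave keep the CPU at the highest/lowest frequency, so no scaling takes place.
ondemand and conservative are driven by the scheduler.

ondemand and conservative used to run a timer that periodically computed the load of each CPU. After the commit
"cpufreq: governor: Replace timers with utilization update callbacks" they are driven by the scheduler instead.

The per-CPU variable cpufreq_update_util_data is the bridge between the scheduler and the cpufreq code: cpufreq_update_util calls the callback stored in it to trigger scaling.

//kernel/sched/cpufreq.c

DEFINE_PER_CPU(struct update_util_data __rcu *, cpufreq_update_util_data);
//register the scaling hook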
void cpufreq_add_update_util_hook(int cpu, struct update_util_data *data,
			void (*func)(struct update_util_data *data, u64 time,
				     unsigned int flags))
{
    
	if (WARN_ON(!data || !func))
		return;

	if (WARN_ON(per_cpu(cpufreq_update_util_data, cpu)))
		return;

	data->func = func;
	rcu_assign_pointer(per_cpu(cpufreq_update_util_data, cpu), data);
}


//include/linux/sched/cpufreq.h

struct update_util_data {
       void (*func)(struct update_util_data *data, u64 time, unsigned int flags);
};

//kernel/sched/sched.h
//invoke the scaling hook
static inline void cpufreq_update_util(struct rq *rq, unsigned int flags)
{
    
	struct update_util_data *data;

	data = rcu_dereference_sched(*per_cpu_ptr(&cpufreq_update_util_data,
						  cpu_of(rq)));
	if (data)
		data->func(data, rq_clock(rq), flags);
}

When scaling is triggered, taking the scheduler tick under CFS as an example:

时钟中断 --> 
scheduler_tick --> 
curr->sched_class->task_tick(CFS：task_tick_fair) --> 
entity_tick --> 
update_load_avg --> 
cfs_rq_util_change --> 
cpufreq_update_util(rq, flags)
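The hook mechanism (one registered callback per CPU, invoked by the scheduler path shown above) can be modelled in user space. The sketch below uses invented toy_ names, a plain array instead of a per-CPU variable, and no RCU, so it only illustrates the registration and dispatch logic, including how a governor can recover its own per-CPU structure from the embedded hook.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_NR_CPUS 4

struct toy_update_util_data {
    void (*func)(struct toy_update_util_data *data, uint64_t time,
                 unsigned int flags);
};

/* One hook slot per CPU; stands in for the per-CPU pointer. */
static struct toy_update_util_data *toy_hooks[TOY_NR_CPUS];

/* Register a hook for a CPU. Fails if arguments are bad or a hook exists. */
static int toy_add_hook(int cpu, struct toy_update_util_data *data,
                        void (*func)(struct toy_update_util_data *data,
                                     uint64_t time, unsigned int flags))
{
    if (cpu < 0 || cpu >= TOY_NR_CPUS || !data || !func || toy_hooks[cpu])
        return -1;
    data->func = func;
    toy_hooks[cpu] = data;
    return 0;
}

static void toy_remove_hook(int cpu)
{
    assert(cpu >= 0 && cpu < TOY_NR_CPUS);
    toy_hooks[cpu] = NULL;
}

/* Scheduler side: call the hook of this CPU if one is registered. */
static void toy_update_util(int cpu, uint64_t now, unsigned int flags)
{
    struct toy_update_util_data *data;

    assert(cpu >= 0 && cpu < TOY_NR_CPUS);
    data = toy_hooks[cpu];
    if (data)
        data->func(data, now, flags);
}

/* Governor side: the hook is embedded in a larger per-CPU structure. */
struct toy_cpu_dbs {
    unsigned int calls;
    uint64_t last_time;
    struct toy_update_util_data update_util;
};

static void toy_dbs_handler(struct toy_update_util_data *data, uint64_t time,
                            unsigned int flags)
{
    /* Recover the enclosing structure from the embedded member. */
    struct toy_cpu_dbs *cdbs = (struct toy_cpu_dbs *)
        ((char *)data - offsetof(struct toy_cpu_dbs, update_util));

    (void)flags;
    cdbs->calls++;
    cdbs->last_time = time;
}
```

Passing only a pointer to the small hook structure keeps the scheduler unaware of governor internals, while the governor can still reach its own state from inside the callback.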

As described earlier, starting a governor registers dbs_update_util_handler as the hook. The scaling flow from there:

//drivers/cpufreq/cpufreq_governor.c
dbs_update_util_handler
	irq_work_queue(&policy_dbs->irq_work);//init_irq_work(&policy_dbs->irq_work, dbs_irq_work);
		schedule_work_on(smp_processor_id(), &policy_dbs->work);//INIT_WORK(&policy_dbs->work, dbs_work_handler);
			gov_update_sample_delay(policy_dbs, gov->gov_dbs_update(policy));//gov_dbs_update callback, ondemand used as the example
			od_dbs_update
				od_update
					dbs_update//compute the load
					__cpufreq_driver_target//call the target callback of cpufreq_driver to set the frequency

For CPUs that scale on their own, the driver can likewise call cpufreq_add_update_util_hook to register its own hook; the principle is the same.

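IX. freq QoS and the frequency limiting flow

freq QoS is used in the cpufreq framework to adjust policy->max and policy->min. In short, each policy has two constraints, a maximum and a minimum frequency. A module that wants to change a policy's frequency limits submits a request to freq QoS through the relevant interface. freq QoS applies its rules to decide which limit takes effect and, if the effective value changed, uses a notifier block to tell the cpufreq framework to update the policy.

This section describes the design of freq QoS and how the cpufreq framework uses it.

1. Initialization

//include/linux/pm_qos.h (types) and kernel/power/qos.c (functions)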
#define FREQ_QOS_MIN_DEFAULT_VALUE    0
#define FREQ_QOS_MAX_DEFAULT_VALUE    S32_MAX

enum pm_qos_type {
    
    PM_QOS_UNITIALIZED,
    PM_QOS_MAX,        /* return the largest value */
    PM_QOS_MIN,        /* return the smallest value */
};

struct pm_qos_constraints {
    
    struct plist_head list; //all frequency limit requests from other modules, sorted in ascending order
    /* Do not change to 64 bit */
    s32 target_value; //frequency limit currently in effect
    s32 default_value; 
    s32 no_constraint_value;
    enum pm_qos_type type;
    struct blocking_notifier_head *notifiers;
};

//the policy->constraints member, holding the min and max frequency constraints
struct freq_constraints {
    struct pm_qos_constraints min_freq;
    struct blocking_notifier_head min_freq_notifiers;
    struct pm_qos_constraints max_freq;
    struct blocking_notifier_head max_freq_notifiers;
};

/**
 * freq_constraints_init - Initialize frequency QoS constraints.
 * @qos: Frequency QoS constraints to initialize.
 */
void freq_constraints_init(struct freq_constraints *qos)
{
    
	struct pm_qos_constraints *c;

	c = &qos->min_freq;
	plist_head_init(&c->list);
	c->target_value = FREQ_QOS_MIN_DEFAULT_VALUE;
	c->default_value = FREQ_QOS_MIN_DEFAULT_VALUE;
	c->no_constraint_value = FREQ_QOS_MIN_DEFAULT_VALUE;
	c->type = PM_QOS_MAX;
	c->notifiers = &qos->min_freq_notifiers;
	BLOCKING_INIT_NOTIFIER_HEAD(c->notifiers);

	c = &qos->max_freq;
	plist_head_init(&c->list);
	c->target_value = FREQ_QOS_MAX_DEFAULT_VALUE;
	c->default_value = FREQ_QOS_MAX_DEFAULT_VALUE;
	c->no_constraint_value = FREQ_QOS_MAX_DEFAULT_VALUE;
	c->type = PM_QOS_MIN;
	c->notifiers = &qos->max_freq_notifiers;
	BLOCKING_INIT_NOTIFIER_HEAD(c->notifiers);
}

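A newly allocated freq_constraints structure can be passed straight to freq_constraints_init, which initializes min_freq and max_freq in turn.

Note how type is initialized: the min_freq constraint gets PM_QOS_MAX, while the max_freq constraint gets PM_QOS_MIN.

As a result, when several requests limit the maximum frequency, the smallest value on the pm_qos_constraints->list takes effect: the tightest cap wins. When several requests limit the minimum frequency, the largest value takes effect: the highest floor wins. The functions below illustrate this in more detail.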

2. 通知机制

//kernel/power/qos.c
/**
 * freq_qos_add_notifier - Add frequency QoS change notifier.
 * @qos: List of requests to add the notifier to.
 * @type: Request type.
 * @notifier: Notifier block to add.
 */
int freq_qos_add_notifier(struct freq_constraints *qos,
			  enum freq_qos_req_type type,
			  struct notifier_block *notifier)
{
    
	int ret;

	if (IS_ERR_OR_NULL(qos) || !notifier)
		return -EINVAL;

	switch (type) {
    
	case FREQ_QOS_MIN:
		ret = blocking_notifier_chain_register(qos->min_freq.notifiers,
						       notifier);
		break;
	case FREQ_QOS_MAX:
		ret = blocking_notifier_chain_register(qos->max_freq.notifiers,
						       notifier);
		break;
	default:
		WARN_ON(1);
		ret = -EINVAL;
	}

	return ret;
}


//drivers/cpufreq/cpufreq.c
cpufreq_policy_alloc
	policy->nb_min.notifier_call = cpufreq_notifier_min;
	policy->nb_max.notifier_call = cpufreq_notifier_max;

	freq_qos_add_notifier(&policy->constraints, FREQ_QOS_MIN, &policy->nb_min);
	freq_qos_add_notifier(&policy->constraints, FREQ_QOS_MAX, &policy->nb_max);

This registers a notifier on each constraint; a notification is sent when the effective frequency limit changes.

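3. Rules

As mentioned under initialization, for the maximum frequency the smallest requested limit wins, and for the minimum frequency the largest requested limit wins. This is implemented in pm_qos_get_value, which picks the effective limit from the current freq QoS requests.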


static int pm_qos_get_value(struct pm_qos_constraints *c)
{
    
    if (plist_head_empty(&c->list))
        return c->no_constraint_value;

    switch (c->type) {
    
    case PM_QOS_MIN:
        return plist_first(&c->list)->prio; //minimum: return the first element

    case PM_QOS_MAX:
        return plist_last(&c->list)->prio; //maximum: return the last element

    default:
        WARN(1, "Unknown PM QoS type in %s\n", __func__);
        return PM_QOS_DEFAULT_VALUE;
    }
}

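The pm_qos_constraints->list is sorted in ascending order. The min_freq constraint has type PM_QOS_MAX, so the last element, the largest value, is returned; the max_freq constraint has type PM_QOS_MIN, so the first element, the smallest value, is returned. This is the core rule of freq QoS.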
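The aggregation rule can be tried out with a small user-space function. This sketch uses invented toy_ names and a plain unsorted array instead of the kernel's priority list, so it scans for the extreme value rather than reading the first or last list element, but the result is the same.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum toy_qos_type {
    TOY_QOS_MAX,    /* effective value is the largest request */
    TOY_QOS_MIN,    /* effective value is the smallest request */
};

/* Return the effective constraint for a set of requests, or the
 * no-constraint default when there are no requests at all. */
static int32_t toy_qos_value(enum toy_qos_type type, const int32_t *reqs,
                             size_t n, int32_t no_constraint)
{
    int32_t v;
    size_t i;

    if (n == 0)
        return no_constraint;
    assert(reqs);
    v = reqs[0];
    for (i = 1; i < n; i++) {
        if (type == TOY_QOS_MAX ? reqs[i] > v : reqs[i] < v)
            v = reqs[i];
    }
    return v;
}
```

Requests of 750000 and 1000000 kHz for the minimum frequency (type MAX) give an effective floor of 1000000 kHz, while requests of 2000000, 1500000 and 1750000 kHz for the maximum frequency (type MIN) give an effective cap of 1500000 kHz.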

4. Adding a request

//kernel/power/qos.c
/**
 * freq_qos_add_request - Insert new frequency QoS request into a given list.
 * @qos: Constraints to update.
 * @req: Preallocated request object.
 * @type: Request type.
 * @value: Request value.
 *
 * Insert a new entry into the @qos list of requests, recompute the effective
 * QoS constraint value for that list and initialize the @req object.  The
 * caller needs to save that object for later use in updates and removal.
 *
 * Return 1 if the effective constraint value has changed, 0 if the effective
 * constraint value has not changed, or a negative error code on failures.
 */
int freq_qos_add_request(struct freq_constraints *qos,
			 struct freq_qos_request *req,
			 enum freq_qos_req_type type, s32 value)
{
	int ret;

	if (IS_ERR_OR_NULL(qos) || !req)
		return -EINVAL;

	if (WARN(freq_qos_request_active(req),
		 "%s() called for active request\n", __func__))
		return -EINVAL;

	req->qos = qos;
	req->type = type;
	ret = freq_qos_apply(req, PM_QOS_ADD_REQ, value);
	if (ret < 0) {
		req->qos = NULL;
		req->type = 0;
	}

	return ret;
}


/**
 * freq_qos_apply - Add/modify/remove frequency QoS request.
 * @req: Constraint request to apply.
 * @action: Action to perform (add/update/remove).
 * @value: Value to assign to the QoS request.
 */
static int freq_qos_apply(struct freq_qos_request *req,
			  enum pm_qos_req_action action, s32 value)
{
	int ret;

	switch(req->type) {
	case FREQ_QOS_MIN:
		ret = pm_qos_update_target(&req->qos->min_freq, &req->pnode,
					   action, value);
		break;
	case FREQ_QOS_MAX:
		ret = pm_qos_update_target(&req->qos->max_freq, &req->pnode,
					   action, value);
		break;
	default:
		ret = -EINVAL;
	}

	return ret;
}

As shown, freq_qos_add_request ends up calling pm_qos_update_target to add the request to either the min or the max constraint:

//kernel/power/qos.c
/**
 * pm_qos_update_target - manages the constraints list and calls the notifiers
 *  if needed
 * @c: constraints data struct
 * @node: request to add to the list, to update or to remove
 * @action: action to take on the constraints list
 * @value: value of the request to add or update
 *
 * This function returns 1 if the aggregated constraint value has changed, 0
 *  otherwise.
 */
int pm_qos_update_target(struct pm_qos_constraints *c, struct plist_node *node,
			 enum pm_qos_req_action action, int value)
{
	unsigned long flags;
	int prev_value, curr_value, new_value;
	int ret;

	spin_lock_irqsave(&pm_qos_lock, flags);
	prev_value = pm_qos_get_value(c);//get the current effective limit according to the QoS rule
	if (value == PM_QOS_DEFAULT_VALUE)
		new_value = c->default_value;
	else
		new_value = value;

	switch (action) {
	case PM_QOS_REMOVE_REQ:
		plist_del(node, &c->list);
		break;
	case PM_QOS_UPDATE_REQ:
		/*
		 * to change the list, we atomically remove, reinit
		 * with new value and add, then see if the extremal
		 * changed
		 */
		plist_del(node, &c->list);
		/* fall through */
	case PM_QOS_ADD_REQ:
		plist_node_init(node, new_value);
		plist_add(node, &c->list);//add the request to this constraint's list, sorted in ascending order
		break;
	default:
		/* no action */
		;
	}

	curr_value = pm_qos_get_value(c);//after the list change, read the effective limit again
	pm_qos_set_value(c, curr_value);

	spin_unlock_irqrestore(&pm_qos_lock, flags);

	trace_pm_qos_update_target(action, prev_value, curr_value);
	if (prev_value != curr_value) {
		//if the effective limit changed, send a notification
		ret = 1;
		if (c->notifiers)
			blocking_notifier_call_chain(c->notifiers,
						     (unsigned long)curr_value,
						     NULL);
	} else {
		ret = 0;
	}
	return ret;
}

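So adding a frequency limit does not necessarily change the effective limit; that happens only when the new request wins under the rules above.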
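
A small user-space model shows when an added request changes the effective limit and when it does not. This is a sketch under stated assumptions: max_constraint, max_constraint_value and max_constraint_add are hypothetical names, a fixed-size sorted array replaces the plist, and INT_MAX stands in for the "no constraint" value. It models only the max_freq side, where the smallest value wins.

```c
#include <stddef.h>
#include <limits.h>

#define MAX_REQS 16

/* Hypothetical model of the max_freq constraint. */
struct max_constraint {
	int values[MAX_REQS];	/* kept sorted in ascending order */
	size_t count;
};

/* Effective limit: the first (smallest) element, or INT_MAX when empty. */
int max_constraint_value(const struct max_constraint *c)
{
	return c->count ? c->values[0] : INT_MAX;
}

/* Insert a request in sorted position. Returns 1 if the effective limit
 * changed, 0 if it did not, -1 if the list is full. */
int max_constraint_add(struct max_constraint *c, int value)
{
	int prev;
	size_t i;

	if (c->count == MAX_REQS)
		return -1;

	prev = max_constraint_value(c);

	/* Shift larger entries right to keep ascending order. */
	i = c->count;
	while (i > 0 && c->values[i - 1] > value) {
		c->values[i] = c->values[i - 1];
		i--;
	}
	c->values[i] = value;
	c->count++;

	return prev != max_constraint_value(c);
}
```

Starting from an empty constraint, adding 1500000 returns 1, adding 1800000 afterwards returns 0 because 1500000 is still the lowest cap, and adding 1200000 returns 1 again. This matches the 1/0 return convention documented for freq_qos_add_request.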

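There is also freq_qos_update_request, which changes the value of an existing request: only that request's node is removed from the list and re-inserted with the new value (the PM_QOS_UPDATE_REQ branch above), while other requests are left untouched.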


int freq_qos_update_request(struct freq_qos_request *req, s32 new_value)
{
	if (!req)
		return -EINVAL;

	if (WARN(!freq_qos_request_active(req),
		 "%s() called for unknown object\n", __func__))
		return -EINVAL;

	if (req->pnode.prio == new_value)
		return 0;

	return freq_qos_apply(req, PM_QOS_UPDATE_REQ, new_value);
}
EXPORT_SYMBOL_GPL(freq_qos_update_request);

When user space runs echo xx > /sys/devices/system/cpu/cpu0/cpufreq/scaling_max_freq, the cpufreq core ends up calling freq_qos_update_request.
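
The effect of such an update can be sketched as follows. This is a hypothetical user-space model, not the kernel implementation: max_requests, max_requests_effective and max_requests_update are invented names, and the scenario of one user request plus one request from another kernel component is an assumption chosen for illustration.

```c
#include <stddef.h>
#include <limits.h>

/* Hypothetical model: each slot holds one request against the max limit. */
struct max_requests {
	int values[8];
	size_t count;
};

/* Effective max limit: the smallest requested value wins. */
int max_requests_effective(const struct max_requests *r)
{
	int best = INT_MAX;
	size_t i;

	for (i = 0; i < r->count; i++)
		if (r->values[i] < best)
			best = r->values[i];
	return best;
}

/* Change only request idx, leaving all others alone. Returns 1 if the
 * effective limit changed, 0 if it did not, -1 for an unknown request. */
int max_requests_update(struct max_requests *r, size_t idx, int new_value)
{
	int prev;

	if (idx >= r->count)
		return -1;
	if (r->values[idx] == new_value)
		return 0;	/* same early exit as the prio check above */

	prev = max_requests_effective(r);
	r->values[idx] = new_value;
	return prev != max_requests_effective(r);
}
```

Suppose slot 0 is the user's scaling_max_freq request at 1800000 and slot 1 is another component's request at 1200000. Lowering slot 0 to 1500000 returns 0, because 1200000 still wins and slot 1 is unchanged. Raising slot 1 to 2000000 then returns 1, and the effective limit becomes 1500000.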

5. Updating the frequency limits

The following shows how the callbacks registered by the cpufreq framework update the frequency limits:

//drivers/cpufreq/cpufreq.c
cpufreq_notifier_min
	schedule_work(&policy->update);//INIT_WORK(&policy->update, handle_update);
		handle_update
			refresh_frequency_limits
				cpufreq_set_policy
					new_policy->min = freq_qos_read_value(&policy->constraints, FREQ_QOS_MIN);//read the effective frequency limit
					cpufreq_driver->setpolicy(policy);//only when the driver provides setpolicy

Overclocking

Automatic core scaling on MIPS

References

This article is a set of notes compiled while studying the materials below; it will be removed on request if it infringes.

Attribute descriptions
https://blog.csdn.net/leerobin83/article/details/7476386

Flow walkthrough
https://blog.csdn.net/DroidPhone/article/details/9346981
https://blog.csdn.net/DroidPhone/article/details/9385745
https://blog.csdn.net/droidphone/article/details/9532999

When frequency scaling is triggered
https://www.cnblogs.com/hellokitty2/p/14909197.html
http://kernel.meizu.com/cpufreq-sched.html
https://www.cnblogs.com/hellokitty2/p/15690200.html

cpufreq architecture overview from wowotech (wowo科技)
http://www.wowotech.net/pm_subsystem/cpufreq_overview.html
http://www.wowotech.net/pm_subsystem/cpufreq_driver.html
http://www.wowotech.net/pm_subsystem/cpufreq_core.html
http://www.wowotech.net/pm_subsystem/cpufreq_governor.html

QoS policy, freq QoS, and the frequency-limiting flow
https://www.cnblogs.com/hellokitty2/p/15685048.html

Overclocking overview
https://blog.csdn.net/yin262/article/details/45697221

Time statistics feature
https://blog.csdn.net/feelabclihu/article/details/121299114

Original article: https://blog.csdn.net/bsp_mpu6050/article/details/123715731
