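I: Background
1. The story
To capture this kind of information, looking at a dump is no help; you have to attach a "camera" to the program. On Windows this is easy: configure the Microsoft-Windows-DotNETRuntime:ContentionKeyword event in PerfView and you are done.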
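II: Exploring dotnet-trace
1. How to monitor lock contention
dotnet-trace is a small cross-platform tool written by the CLR team for collecting all kinds of events from a .NET program; you can think of it as a subset of PerfView. I will skip installation here; see the official documentation: https://learn.microsoft.com/en-us/dotnet/core/diagnostics/dotnet-trace
To monitor lock contention, just enable the contention event via --clrevents. For details see: https://learn.microsoft.com/en-us/dotnet/fundamentals/diagnostics/runtime-contention-events
2. Test case
The test program below produces the lock convoy phenomenon; reference code: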
    using System.Threading.Tasks;

    internal class Program
    {
        public static object lockMe = new object();

        static void Main(string[] args)
        {
            long i = 10;

            // Four workers hammer the same lock so that they keep contending for it.
            Parallel.For(0, int.MaxValue, new ParallelOptions() { MaxDegreeOfParallelism = 4 }, (j) =>
            {
                lock (lockMe) i++;
                lock (lockMe) i++;
                lock (lockMe) i++;
                lock (lockMe) i++;
                lock (lockMe) i++;
                // The original listing repeats this exact statement many more
                // times back to back to keep the lock under constant pressure.
            });
        }
    }
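Before reaching for a tracer, it can help to confirm that a workload like this really does contend. The following is only a minimal sketch, not part of the original test case: it assumes .NET Core 3.0 or later, where Monitor.LockContentionCount is available, and the names ContentionProbe, Run and Gate are made up for illustration. The counter is process-wide, so the delta also includes contention on any other locks taken during the run.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper (not from the original post): runs a small version of
// the test case and reports how many lock acquisitions had to wait.
internal static class ContentionProbe
{
    private static readonly object Gate = new object();

    // Runs `iterations` loop bodies on up to `workers` threads. Each body takes
    // the same lock `locksPerIteration` times. Returns the final counter and
    // the change in the process-wide contention count during the run.
    public static (long Counter, long Contentions) Run(int iterations, int workers, int locksPerIteration)
    {
        long counter = 0;
        long before = Monitor.LockContentionCount;

        Parallel.For(0, iterations,
            new ParallelOptions { MaxDegreeOfParallelism = workers },
            _ =>
            {
                for (int k = 0; k < locksPerIteration; k++)
                {
                    lock (Gate) counter++;
                }
            });

        long after = Monitor.LockContentionCount;
        return (counter, after - before);
    }

    private static void Main()
    {
        var (counter, contentions) = Run(iterations: 1_000_000, workers: 4, locksPerIteration: 8);
        Console.WriteLine($"counter={counter}, contended acquisitions={contentions}");
    }
}
```

This only gives a single number; it does not say when or where the contention happened, which is what the trace is for.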
Once the program is running, use dotnet-trace ps to find the PID, then use dotnet-trace collect to trace it; here the trace runs for 1 minute.

    [root@localhost ~]# dotnet-trace ps
    3316  dotnet  /usr/share/dotnet/dotnet  dotnet ConsoleApp3.dll
    [root@localhost ~]# dotnet-trace collect -p 3316 --clrevents contention --duration 00:00:01:00

    Provider Name                    Keywords            Level             Enabled By
    Microsoft-Windows-DotNETRuntime  0x0000000000004000  Informational(4)  --clrevents

    Process        : /usr/share/dotnet/dotnet
    Output File    : /root/dotnet_20230509_105906.nettrace
    Trace Duration : 00:00:01:00

    [00:00:01:00]   Recording trace 29.7885  (MB)
    Press <Enter> or <Ctrl+C> to exit...148  (MB)
    Stopping the trace. This may take several minutes depending on the application being traced.

    Trace completed.
    [root@localhost ~]# ls
    anaconda-ks.cfg  dotnet_20230509_105906.nettrace  Music     Templates
    Desktop          Downloads                        Pictures  Videos
    Documents        initial-setup-ks.cfg             Public
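The keyword 0x0000000000004000 in the output above is the contention keyword of the Microsoft-Windows-DotNETRuntime provider. As a rough sketch of what dotnet-trace enables from the outside, the same provider and keyword can also be switched on from inside the process with an EventListener. This is my own illustration, not something the tool does for you: the class names ContentionListener and ListenerDemo are hypothetical, and matching event names by the prefixes "ContentionStart" and "ContentionStop" is an assumption made to tolerate versioned names such as ContentionStart_V1.

```csharp
using System;
using System.Diagnostics.Tracing;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical in-process listener that counts the runtime's contention events.
internal sealed class ContentionListener : EventListener
{
    private const string RuntimeProvider = "Microsoft-Windows-DotNETRuntime";
    private const EventKeywords ContentionKeyword = (EventKeywords)0x4000;

    private long _starts;
    private long _stops;

    public long Starts => Interlocked.Read(ref _starts);
    public long Stops => Interlocked.Read(ref _stops);

    // Called for every EventSource in the process; enable only the runtime
    // provider, at Informational level, with the contention keyword.
    protected override void OnEventSourceCreated(EventSource source)
    {
        if (source.Name == RuntimeProvider)
        {
            EnableEvents(source, EventLevel.Informational, ContentionKeyword);
        }
    }

    // Assumption: start/stop events are recognised by name prefix.
    protected override void OnEventWritten(EventWrittenEventArgs e)
    {
        string name = e.EventName ?? string.Empty;
        if (name.StartsWith("ContentionStart", StringComparison.Ordinal))
            Interlocked.Increment(ref _starts);
        else if (name.StartsWith("ContentionStop", StringComparison.Ordinal))
            Interlocked.Increment(ref _stops);
    }
}

internal static class ListenerDemo
{
    private static readonly object Gate = new object();

    private static void Main()
    {
        using var listener = new ContentionListener();

        long counter = 0;
        Parallel.For(0, 1_000_000, new ParallelOptions { MaxDegreeOfParallelism = 4 }, _ =>
        {
            lock (Gate) counter++;
        });

        // Runtime events are delivered asynchronously; give them a moment.
        Thread.Sleep(500);
        Console.WriteLine($"counter={counter}, starts={listener.Starts}, stops={listener.Stops}");
    }
}
```

An in-process listener requires changing and redeploying the program, so for a process that is already running, dotnet-trace remains the more practical choice.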
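3. Analyzing the nettrace file
There are plenty of tools for analyzing dotnet_20230509_105906.nettrace: dotnet-trace, perf, PerfView, Visual Studio. I personally recommend PerfView because it gives deeper insight. After opening the file in PerfView, click EventStats to look at the statistics: roughly 2 million Start and Stop events.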
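If PerfView is not at hand, the same Start/Stop counts can in principle be reproduced in code. The sketch below assumes the Microsoft.Diagnostics.Tracing.TraceEvent NuGet package is referenced; the class name NettraceStats and the command-line handling are my own illustration, and the file path is whatever nettrace file you pass in.

```csharp
using System;
using Microsoft.Diagnostics.Tracing;          // NuGet: Microsoft.Diagnostics.Tracing.TraceEvent
using Microsoft.Diagnostics.Tracing.Parsers;

// Hypothetical helper: counts contention Start/Stop events in a .nettrace file.
internal static class NettraceStats
{
    private static int Main(string[] args)
    {
        if (args.Length != 1)
        {
            Console.Error.WriteLine("usage: NettraceStats <file.nettrace>");
            return 1;
        }

        long starts = 0;
        long stops = 0;

        using (var source = new EventPipeEventSource(args[0]))
        {
            // The CLR parser raises one callback per contention event in the file.
            source.Clr.ContentionStart += _ => starts++;
            source.Clr.ContentionStop += _ => stops++;

            // Reads the whole file and dispatches the callbacks synchronously.
            source.Process();
        }

        Console.WriteLine($"Contention/Start: {starts}");
        Console.WriteLine($"Contention/Stop:  {stops}");
        return 0;
    }
}
```

This only tallies events; for call stacks and per-event inspection PerfView is still the more convenient tool.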
In the Events panel, open the Microsoft-Windows-DotNETRuntime/Contention/Start event; the records show the start time of every contention. Thread 4232, for example, got to execute twice in a row.
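In the record HasStack="True" ThreadID="3,316" ProcessorNumber="0" ContentionFlags="Managed" ClrInstanceID="0", the HasStack="True" field indicates that call stack information is available for the event. Right-click in the Time MSec column and choose Open Any Stacks.
The stack shows that the contention is triggered by Parallel.For in the Main method, which is very clear.
III: Summary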
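dotnet-trace is built on top of EventPipe, and its defining feature is that it is cross-platform. Besides lock contention there are many other interesting events; interested readers can look them up in the documentation.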