Investigating lock contention with dotnet-trace


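I: Background

1. The story

To get this kind of information, looking at a dump is of no use; you have to install a "camera" on the program. On Windows this is easy: enable the Microsoft-Windows-DotNETRuntime:ContentionKeyword event in PerfView.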

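II: Exploring dotnet-trace

1. How to monitor lock contention

dotnet-trace is a small cross-platform tool written by the CLR team, dedicated to collecting the various events of a .NET program; you can think of it as a subset of PerfView. I will skip installation here; see the official documentation: https://learn.microsoft.com/en-us/dotnet/core/diagnostics/dotnet-trace

Just pass contention to --clrevents; for details see the documentation: https://learn.microsoft.com/en-us/dotnet/fundamentals/diagnostics/runtime-contention-events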
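To get a feel for what the contention events look like before reaching for the tool, the same provider can also be observed from inside the process. The following is only a minimal sketch, assuming .NET Core 3.0 or later: ContentionListener, Demo, and Gate are hypothetical names, and 0x4000 is the contention keyword value as it appears in the dotnet-trace output of this article. The event names are matched by prefix because their version suffix differs between runtime versions.

```csharp
using System;
using System.Diagnostics.Tracing;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper: counts runtime contention events in-process.
sealed class ContentionListener : EventListener
{
    // Contention keyword of the Microsoft-Windows-DotNETRuntime provider.
    private const EventKeywords ContentionKeyword = (EventKeywords)0x4000;

    private long _starts;
    private long _stops;

    public long Starts => Interlocked.Read(ref _starts);
    public long Stops => Interlocked.Read(ref _stops);

    protected override void OnEventSourceCreated(EventSource source)
    {
        // Subscribe only to the runtime provider, and only to contention events.
        if (source.Name == "Microsoft-Windows-DotNETRuntime")
            EnableEvents(source, EventLevel.Informational, ContentionKeyword);
    }

    protected override void OnEventWritten(EventWrittenEventArgs e)
    {
        // Prefix match: the suffix (_V1, _V2, ...) depends on the runtime version.
        string name = e.EventName ?? string.Empty;
        if (name.StartsWith("ContentionStart", StringComparison.Ordinal))
            Interlocked.Increment(ref _starts);
        else if (name.StartsWith("ContentionStop", StringComparison.Ordinal))
            Interlocked.Increment(ref _stops);
    }
}

static class Demo
{
    static readonly object Gate = new object();

    static void Main()
    {
        using var listener = new ContentionListener();
        long counter = 0;

        // Four workers competing for one lock, enough to produce contention events.
        Parallel.For(0, 1_000_000, new ParallelOptions { MaxDegreeOfParallelism = 4 }, _ =>
        {
            lock (Gate) counter++;
        });

        Console.WriteLine($"counter={counter}, starts={listener.Starts}, stops={listener.Stops}");
    }
}
```

The counts vary from run to run. An in-process listener is handy for a quick experiment, but for a production process an out-of-process tool such as dotnet-trace is less intrusive.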

2. Test case

A lock convoy scenario; the reference code is as follows:


    using System.Threading.Tasks;

    internal class Program
    {
        // Single shared lock object that every worker competes for.
        public static object lockMe = new object();

        static void Main(string[] args)
        {
            long i = 10;

            // Four workers keep taking and releasing the same lock, which provokes a lock convoy.
            Parallel.For(0, int.MaxValue, new ParallelOptions() { MaxDegreeOfParallelism = 4 }, (j) =>
            {
                lock (lockMe) i++; lock (lockMe) i++; lock (lockMe) i++; lock (lockMe) i++;
                lock (lockMe) i++; lock (lockMe) i++; lock (lockMe) i++; lock (lockMe) i++;
                lock (lockMe) i++; lock (lockMe) i++; lock (lockMe) i++; lock (lockMe) i++;
                lock (lockMe) i++; lock (lockMe) i++; lock (lockMe) i++; lock (lockMe) i++;
                lock (lockMe) i++; lock (lockMe) i++; lock (lockMe) i++; lock (lockMe) i++;
                lock (lockMe) i++; lock (lockMe) i++; lock (lockMe) i++; lock (lockMe) i++;
                lock (lockMe) i++; lock (lockMe) i++; lock (lockMe) i++; lock (lockMe) i++;
                lock (lockMe) i++; lock (lockMe) i++; lock (lockMe) i++; lock (lockMe) i++;
                lock (lockMe) i++; lock (lockMe) i++; lock (lockMe) i++; lock (lockMe) i++;
                lock (lockMe) i++; lock (lockMe) i++; lock (lockMe) i++; lock (lockMe) i++;
                lock (lockMe) i++; lock (lockMe) i++; lock (lockMe) i++; lock (lockMe) i++;
                lock (lockMe) i++; lock (lockMe) i++; lock (lockMe) i++; lock (lockMe) i++;
                lock (lockMe) i++; lock (lockMe) i++; lock (lockMe) i++; lock (lockMe) i++;
                lock (lockMe) i++; lock (lockMe) i++; lock (lockMe) i++; lock (lockMe) i++;
                lock (lockMe) i++; lock (lockMe) i++; lock (lockMe) i++; lock (lockMe) i++;
                lock (lockMe) i++; lock (lockMe) i++; lock (lockMe) i++; lock (lockMe) i++;
                lock (lockMe) i++; lock (lockMe) i++; lock (lockMe) i++; lock (lockMe) i++;
                lock (lockMe) i++; lock (lockMe) i++; lock (lockMe) i++; lock (lockMe) i++;
                lock (lockMe) i++; lock (lockMe) i++; lock (lockMe) i++; lock (lockMe) i++;
                lock (lockMe) i++; lock (lockMe) i++; lock (lockMe) i++; lock (lockMe) i++;
            });
        }
    }
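Before attaching a tracer, a cheap sanity check that a workload like this really contends is the Monitor.LockContentionCount property (available since .NET Core 3.0). The snippet below is only a sketch with a much smaller loop than the test case; ContentionCountDemo and Gate are hypothetical names used for illustration.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

static class ContentionCountDemo
{
    static readonly object Gate = new object();

    static void Main()
    {
        // Process-wide count of contended monitor acquisitions so far.
        long before = Monitor.LockContentionCount;

        long counter = 0;
        Parallel.For(0, 1_000_000, new ParallelOptions { MaxDegreeOfParallelism = 4 }, _ =>
        {
            lock (Gate) counter++;
        });

        long after = Monitor.LockContentionCount;
        Console.WriteLine($"contended acquisitions during the loop: {after - before}");
    }
}
```

The counter only says how often threads had to wait for a lock. It does not say where or when, which is what the trace is for.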

Once the program is running, use dotnet-trace ps to find the PID, then trace it with dotnet-trace; here the trace runs for 1 minute.


[root@localhost ~]# dotnet-trace ps
 3316  dotnet  /usr/share/dotnet/dotnet  dotnet ConsoleApp3.dll  

[root@localhost ~]# dotnet-trace collect -p 3316 --clrevents contention --duration 00:00:01:00

Provider Name                           Keywords            Level               Enabled By
Microsoft-Windows-DotNETRuntime         0x0000000000004000  Informational(4)    --clrevents

Process        : /usr/share/dotnet/dotnet
Output File    : /root/dotnet_20230509_105906.nettrace
Trace Duration : 00:00:01:00
[00:00:01:00]	Recording trace 29.7885  (MB)
Press <Enter> or <Ctrl+C> to exit...
Stopping the trace. This may take several minutes depending on the application being traced.

Trace completed.

[root@localhost ~]# ls

anaconda-ks.cfg  dotnet_20230509_105906.nettrace  Music     Templates
Desktop          Downloads                        Pictures  Videos
Documents        initial-setup-ks.cfg             Public

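3. Analyzing the nettrace file

There are plenty of tools for analyzing dotnet_20230509_105906.nettrace: dotnet-trace, perf, perfview, visualstudio. Personally I still recommend PerfView, because it offers better insight. After opening the file in PerfView, click EventStats to look at the statistics:

2 million Start and Stop events.

In the Events panel, the Microsoft-Windows-DotNETRuntime/Contention/Start event records show the start time of every single contention.

Thread 4232 got to execute twice in a row.

In HasStack="True" ThreadID="3,316" ProcessorNumber="0" ContentionFlags="Managed" ClrInstanceID="0", the HasStack="True" tells us that call stack information is available for the event; right-click in the Time MSec column and choose Open Any Stacks.

It was triggered by the Parallel.For in the Main method, which is very clear.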
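Once the stack points at a hot lock like this one, one common way to relieve it is to let each worker accumulate privately and merge once at the end. This is not part of the original investigation, only a hedged sketch using the thread-local overload of Parallel.For; LocalSumDemo and the iteration count are hypothetical.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

static class LocalSumDemo
{
    static void Main()
    {
        long total = 10;

        Parallel.For(
            0, 1_000_000,
            new ParallelOptions { MaxDegreeOfParallelism = 4 },
            () => 0L,                                      // per-worker subtotal, no sharing
            (j, state, local) => local + 1,                // hot path touches only the local value
            local => Interlocked.Add(ref total, local));   // one synchronized merge per worker

        Console.WriteLine(total);
    }
}
```

The shared variable is now touched once per worker instead of once per iteration, so threads have far fewer chances to queue up behind each other.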

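III: Summary

dotnet-trace is built on top of EventPipe, and its defining feature is that it is cross-platform. Besides lock contention, there are all kinds of other interesting events; interested readers can look them up.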
