Bug 48707 - Mono Queryable implementation is leaking memory upon materialization
Summary: Mono Queryable implementation is leaking memory upon materialization
Alias: None
Product: Runtime
Classification: Mono
Component: General (show other bugs)
Version: master
Hardware: PC Linux
: --- normal
Target Milestone: ---
Assignee: Aleksey Kliger
Depends on:
Reported: 2016-12-02 14:06 UTC by Ali Osman ISIK
Modified: 2018-03-19 13:37 UTC (History)
7 users (show)

See Also:
Tags: bugpool-archived
Is this bug a regression?: ---
Last known good build:

Sample code (758 bytes, text/plain)
2016-12-02 14:06 UTC, Ali Osman ISIK
dynamic method creation master and 3.4.1 (4.48 KB, text/plain)
2017-01-03 19:13 UTC, Vlad Brezae
dynamic method vs no dynamic method (825 bytes, text/plain)
2017-01-03 19:14 UTC, Vlad Brezae

Description Ali Osman ISIK 2016-12-02 14:06:05 UTC
Created attachment 18753 [details]
Sample code

I came across a possible memory leak issue in one of my assignments and when I dig into the problem, I suspected of Queryable implementation of Mono. I reproduced the leak in the program attached.

I built the program both with .Net and Mono compiler but result did not change. I'm running the program on Ubuntu 14.04, Mono 4.6.2.

When I run the same program on Windows, there is no memory leak.

Also, when I change the line 

                result = Foos.AsQueryable().FirstOrDefault(foo => foo.Bar == "Bar1");


                result = Foos.FirstOrDefault(foo => foo.Bar == "Bar1");

there is no leak.

Comment 1 Marek Safar 2016-12-20 18:06:40 UTC
I can confirm the memory just goes up and up but there is nothing specific in the code so I suspect reflection leak or sgen bug
Comment 2 Vlad Brezae 2017-01-03 19:12:01 UTC
This is not a gc issue. The leak comes from managed code. More specifically each invocation leads to the compilation of a dynamic method which is leaked.

The first question would be if we care about the leak itself. As far as I know we don't free jit-ed code and I don't think it is in our main priorities. Maybe vargaz can confirm here.

Probably a more important question would be why we emit a method every time we invoke that. The answer to this probably lies in the managed world, related to linq expressions. I've attached another test case which shows great performance difference between 2 ways of accessing the first element of an IQueryable (the .First case does the dyanmic method compilation). I also attached two stack traces one on master and one on an ancient 3.4 that I had, which show the point where the method is created.
Comment 3 Vlad Brezae 2017-01-03 19:13:58 UTC
Created attachment 19075 [details]
dynamic method creation master and 3.4.1
Comment 4 Vlad Brezae 2017-01-03 19:14:37 UTC
Created attachment 19076 [details]
dynamic method vs no dynamic method
Comment 5 Zoltan Varga 2017-01-03 19:24:18 UTC
We attempt to free memory for dynamic methods, but there could be a leak there.
Comment 6 Ludovic Henry 2018-01-16 20:08:28 UTC
Partial fix: https://github.com/mono/mono/pull/6522
Comment 7 Vlad Brezae 2018-01-16 20:55:08 UTC
Additional leaks seem related to generic type creation. The leaked memory resides in the mempool for a MonoImage and is allocated in mono_metadata_parse_type_internal and mono_metadata_parse_generic_param. Memory leakage is fairly slow, it can take a few tens of minutes to leak 100MB.
Comment 8 Aleksey Kliger 2018-02-26 17:04:44 UTC
I believe the metadata leaks are fixed on mono master by https://github.com/mono/mono/commit/dabdacd99b40d144d93e46fb89e0bb27b7e82931

The fix for mono_metadata_parse_type_internal() was to create more transient MonoGenericInst instances in mono_metadata_parse_generic_inst().  This could be cherry-picked to earlier branches.

The fix for mono_metadata_parse_generic_param was to change how an anonymous MonoGenericParam and its MonoClass are cached.  Previously we cached in the MonoImage just the MonoClass for an anon gparam.  The underlying MonoGenericParam was allocated from the MonoImage mempool every time it was encountered.  In pathological cases like the repros in this bug, every loop iteration would allocate an anonymous gparam which would grow the mempool by a couple of bytes.

The solution was to instead cache the anonymous gparams (so that repeated loop iterations would find the previous anonymous gparam).  The MonoClass is now cached in the gparam itself (in MonoGenericParamInfo:pklass).  In order to make this change, we did away with the distinction between "small" and "full" generic params.  A "small" generic param was a MonoGenericParam and a full one was MonoGenericParamFull - the difference being an additional MonoGenericParamInfo field.  Now all params are MonoGenericParamFull and MonoGenericParam is just a typedef for a full param.

Because this second change is quite fundamental, I this it's pretty risky to cherry-pick to stable branches.

Note You need to log in before you can comment on or make changes to this bug.