Bug 17944

Summary: mono: mini-amd64.c:492: amd64_patch: Assertion `0' failed.
Product: [Mono] Runtime
Reporter: Alexandre Faria <spigaz>
Component: General
Assignee: Bugzilla <bugzilla>
Status: ASSIGNED ---
Severity: normal
CC: apartamail, ludovic, lupus, mono-bugs+mono, mono-bugs+runtime, vargaz, vladimir.kargov
Priority: ---
Version: unspecified
Target Milestone: ---
Hardware: PC
OS: Linux
Tags:
Is this bug a regression?: ---
Last known good build:
Bug Depends on: 17977
Bug Blocks:

Description Alexandre Faria 2014-02-21 16:24:39 UTC
I ran a sample that I use to push Mono to its limits, and it crashed after creating AppDomain number 65523.

So far I have no production case that requires this, but I might if I move my CI from a console application to a server.

So I have to ask: is this a limit or a bug? 65523 is suspiciously close to 2^16 (65536).

Are the AppDomains actually being unloaded? Linux reports more memory in use than I would expect.

Sample:
using System;

public class Example
{
  public static void Main()
  {
    for(int i=0; i<100000; i++)
    {
      System.Console.WriteLine("\n\nIteration " + i);

      AppDomain ad = AppDomain.CreateDomain("ChildDomain");

      AppDomain.Unload(ad);
    }
  }
}



Stacktrace:

Iteration 65523
mono: mini-amd64.c:492: amd64_patch: Assertion `0' failed.
Stacktrace:

  at <unknown> <0xffffffff>
  at System.Collections.Generic.Dictionary`2.Init (int,System.Collections.Generic.IEqualityComparer`1<TKey>) <0x00053>
  at System.Collections.Generic.Dictionary`2..ctor () <0x0001f>
  at System.Security.Cryptography.CryptoConfig/CryptoHandler..ctor (System.Collections.Generic.IDictionary`2<string, System.Type>,System.Collections.Generic.IDictionary`2<string, string>) <0x000a1>
  at System.Security.Cryptography.CryptoConfig.LoadConfig (string,System.Collections.Generic.IDictionary`2<string, System.Type>,System.Collections.Generic.IDictionary`2<string, string>) <0x00093>
  at System.Security.Cryptography.CryptoConfig.Initialize () <0x01a5f>
  at System.Security.Cryptography.CryptoConfig.CreateFromName (string,object[]) <0x00081>
  at System.Security.Cryptography.CryptoConfig.CreateFromName (string) <0x0001a>
  at System.Security.Cryptography.RandomNumberGenerator.Create (string) <0x0001a>
  at System.Security.Cryptography.RandomNumberGenerator.Create () <0x0001a>
  at System.Guid.NewGuid () <0x00081>
  at System.Runtime.Remoting.RemotingServices.NewUri () <0x00081>
  at System.Runtime.Remoting.RemotingServices.Marshal (System.MarshalByRefObject,string,System.Type) <0x002b9>
  at System.AppDomain.GetMarshalledDomainObjRef () <0x0002c>
  at (wrapper runtime-invoke) <Module>.runtime_invoke_object__this__ (object,intptr,intptr,intptr) <0xffffffff>
  at <unknown> <0xffffffff>
  at (wrapper managed-to-native) System.Reflection.MonoMethod.InternalInvoke (System.Reflection.MonoMethod,object,object[],System.Exception&) <0xffffffff>
  at System.AppDomain.InvokeInDomain (System.AppDomain,System.Reflection.MethodInfo,object,object[]) <0x000a2>
  at System.Runtime.Remoting.RemotingServices.GetDomainProxy (System.AppDomain) <0x00055>
  at System.AppDomain.CreateDomain (string,System.Security.Policy.Evidence,System.AppDomainSetup) <0x00205>
  at System.AppDomain.CreateDomain (string) <0x00014>
  at Example.Main () <0x0006e>
  at (wrapper runtime-invoke) object.runtime_invoke_void (object,intptr,intptr,intptr) <0xffffffff>

Native stacktrace:

	mono() [0x4b7bd8]
	/lib/x86_64-linux-gnu/libpthread.so.0(+0xfbb0) [0x7f190efffbb0]
	/lib/x86_64-linux-gnu/libc.so.6(gsignal+0x37) [0x7f190ec5ef77]
	/lib/x86_64-linux-gnu/libc.so.6(abort+0x148) [0x7f190ec625e8]
	/lib/x86_64-linux-gnu/libc.so.6(+0x2fd43) [0x7f190ec57d43]
	/lib/x86_64-linux-gnu/libc.so.6(+0x2fdf2) [0x7f190ec57df2]
	mono() [0x4eb9af]
	mono() [0x514155]
	mono() [0x4bad9e]
	mono() [0x41fd76]
	mono() [0x4efa64]
	mono() [0x4207cc]
	mono() [0x4223d7]
	mono() [0x42586f]
	mono() [0x4267e2]
	mono(mono_runtime_invoke+0x3d) [0x5b57cd]
	mono() [0x5b5d5c]
	[0x406ce86e]
Comment 1 Alexandre Faria 2014-02-22 14:40:53 UTC
I re-ran this test case and it apparently crashed at exactly the same spot.

But I found something strange: if I use Parallel.For, it crashes much faster.
(If I'm not mistaken, the stack trace varies and isn't clear.)

Original:
real	285m1.612s
user	283m44.342s
sys	0m53.125s

Parallel.For:
real	9m41.530s
user	38m16.772s
sys	2m57.618s


using System;
using System.Threading;
using System.Threading.Tasks;

public class Example
{
  public static void Main()
  {
    Parallel.For(0, 100000, i =>
    {
      System.Console.WriteLine("\n\nIteration " + i);
      AppDomain ad = AppDomain.CreateDomain("ChildDomain");
      AppDomain.Unload(ad);
    });
  }
}


Stacktrace:


Native stacktrace:

	mono() [0x4baee8]
	mono() [0x51389b]
	mono() [0x425272]
	/lib/x86_64-linux-gnu/libpthread.so.0(+0xfbb0) [0x7f9965aaebb0]
	mono(mono_domain_free+0x16d) [0x5aec7d]
	mono() [0x5a8e1e]
	mono() [0x63791e]
	/lib/x86_64-linux-gnu/libpthread.so.0(+0x7f6e) [0x7f9965aa6f6e]
	/lib/x86_64-linux-gnu/libc.so.6(clone+0x6d) [0x7f99657d19cd]
Comment 2 Zoltan Varga 2014-02-24 14:32:35 UTC
This is probably a dup of #17805. Could you try again with master?
Comment 3 Alexandre Faria 2014-02-24 15:13:45 UTC
I'm afraid it's not. I also reported #17805, and that one is fixed.

This one is a heavier version of the same test case: it tries to create 10x more AppDomains. I only run this one once the other one passes.
Comment 4 Zoltan Varga 2014-02-24 15:59:57 UTC
The crash with the first testcase could be due to memory fragmentation: each unload has to free a lot of memory, and over time that could fragment the heap. It's hard to debug because it takes so long to occur.

The crash with the second testcase is a completely different problem; I opened
https://bugzilla.xamarin.com/show_bug.cgi?id=17977
for it.
Comment 5 Alexandre Faria 2014-02-24 19:53:18 UTC
If I'm not mistaken, it happens on exactly the same iteration.

I'll have to work more on these issues: if I can crash Mono with that test case, I will certainly crash it far more often in production.

Is there a way to avoid memory fragmentation?

I'm also seeing a non-linear, possibly exponential, increase in execution time. Could memory fragmentation be the cause?
Comment 6 Zoltan Varga 2014-02-26 15:04:08 UTC
Could you try the first testcase again with master?
Comment 7 Alexandre Faria 2014-02-26 16:10:06 UTC
Sure, as soon as I can, but not before tomorrow.
Comment 8 Alexandre Faria 2014-02-27 10:33:49 UTC
I tried the first one (regular for loop, single-threaded) again, but it still crashed on the same iteration with the same error and stack trace.
Comment 9 Alexandre Faria 2014-02-27 15:22:08 UTC
Now the parallel version also crashes, I guess from the same problem.
By that point it had apparently created and unloaded roughly the same number of AppDomains.

Iteration 70687
mono: mini-amd64.c:492: amd64_patch: Assertion `0' failed.
Stacktrace:

  at <unknown> <0xffffffff>
  at System.Collections.Generic.Dictionary`2.Init (int,System.Collections.Generic.IEqualityComparer`1<TKey>) <0x00053>
  at System.Collections.Generic.Dictionary`2..ctor () <0x00017>
  at System.Security.Cryptography.CryptoConfig/CryptoHandler..ctor (System.Collections.Generic.IDictionary`2<string, System.Type>,System.Collections.Generic.IDictionary`2<string, string>) <0x0009b>
  at System.Security.Cryptography.CryptoConfig.LoadConfig (string,System.Collections.Generic.IDictionary`2<string, System.Type>,System.Collections.Generic.IDictionary`2<string, string>) <0x00087>
  at System.Security.Cryptography.CryptoConfig.Initialize () <0x01877>
  at System.Security.Cryptography.CryptoConfig.CreateFromName (string,object[]) <0x0007f>
  at System.Security.Cryptography.CryptoConfig.CreateFromName (string) <0x00013>
  at System.Security.Cryptography.RandomNumberGenerator.Create (string) <0x00013>
  at System.Security.Cryptography.RandomNumberGenerator.Create () <0x00013>
  at System.Guid.NewGuid () <0x0007b>
  at System.Runtime.Remoting.RemotingServices.NewUri () <0x00083>
  at System.Runtime.Remoting.RemotingServices.Marshal (System.MarshalByRefObject,string,System.Type) <0x0027b>
  at System.AppDomain.GetMarshalledDomainObjRef () <0x0001f>
  at (wrapper runtime-invoke) <Module>.runtime_invoke_object__this__ (object,intptr,intptr,intptr) <0xffffffff>
  at <unknown> <0xffffffff>
  at (wrapper managed-to-native) System.Reflection.MonoMethod.InternalInvoke (System.Reflection.MonoMethod,object,object[],System.Exception&) <0xffffffff>
  at System.AppDomain.InvokeInDomain (System.AppDomain,System.Reflection.MethodInfo,object,object[]) <0x0009b>
  at System.Runtime.Remoting.RemotingServices.GetDomainProxy (System.AppDomain) <0x0006b>
  at System.AppDomain.CreateDomain (string,System.Security.Policy.Evidence,System.AppDomainSetup) <0x0029b>
  at System.AppDomain.CreateDomain (string) <0x00017>
  at Example.<Main>m__0 (int) <0x00047>
  at System.Threading.Tasks.Parallel/<For>c__AnonStorey0.<>m__0 (int,System.Threading.Tasks.ParallelLoopState) <0x00024>
  at System.Threading.Tasks.Parallel/<For>c__AnonStorey1.<>m__0 (int,System.Threading.Tasks.ParallelLoopState,object) <0x0002e>
  at System.Threading.Tasks.Parallel/<For>c__AnonStorey2`1.<>m__0 () <0x002ca>
  at System.Threading.Tasks.TaskActionInvoker/ActionInvoke.Invoke (System.Threading.Tasks.Task,object,System.Threading.Tasks.Task) <0x0001a>
  at System.Threading.Tasks.Task.InnerInvoke () <0x0006c>
  at System.Threading.Tasks.Task.ThreadStart () <0x0028f>
  at System.Threading.Tasks.Task.Execute () <0x00013>
  at System.Threading.Tasks.TpScheduler.<QueueTask>m__0 (object) <0x0003f>
  at System.Threading.Thread.StartInternal () <0x0009d>
  at (wrapper runtime-invoke) object.runtime_invoke_void__this__ (object,intptr,intptr,intptr) <0xffffffff>

Native stacktrace:



Iteration 8183
	mono() [0x4bb318]
	/lib/x86_64-linux-gnu/libpthread.so.0(+0xfbb0) [0x7f052adbabb0]
	/lib/x86_64-linux-gnu/libc.so.6(gsignal+0x37) [0x7f052aa19f77]
	/lib/x86_64-linux-gnu/libc.so.6(abort+0x148) [0x7f052aa1d5e8]
	/lib/x86_64-linux-gnu/libc.so.6(+0x2fd43) [0x7f052aa12d43]
	/lib/x86_64-linux-gnu/libc.so.6(+0x2fdf2) [0x7f052aa12df2]
	mono() [0x4ef77f]
	mono() [0x5182a5]
	mono() [0x4be4ce]
	mono(mono_resolve_patch_target+0x6b6) [0x420b46]
	mono() [0x4f3b84]
	mono() [0x42159c]
	mono() [0x424918]
	mono() [0x426acf]
	mono() [0x427a42]
	mono(mono_runtime_invoke+0x3d) [0x5b9afd]
	mono() [0x5ba08c]
	[0x40bff86e]
Comment 10 Paolo Molaro 2014-03-11 09:41:14 UTC
There is indeed a 16-bit limit for domain IDs, but IDs are reused, so in theory it shouldn't be an issue. (When running in debug mode we don't really free/unload the domain data, but that should not be the issue here.)
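To illustrate why a 16-bit ID space need not limit a create/unload loop, here is a minimal C sketch of an ID pool that hands freed IDs out again. It is not Mono's implementation; `id_pool`, `id_pool_acquire`, `id_pool_release` and `run_create_unload` are hypothetical names made up for this example.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical sketch, NOT Mono's code: IDs are limited to 16 bits,
   and an ID becomes available again once it has been released. */
#define MAX_DOMAIN_IDS 65536u

typedef struct {
    uint8_t  in_use[MAX_DOMAIN_IDS]; /* 1 while the ID is assigned */
    uint32_t next_hint;              /* where the next search starts */
} id_pool;

void id_pool_init(id_pool *p)
{
    memset(p, 0, sizeof *p);
}

/* Returns a free ID in [0, 65535], or -1 if every ID is taken. */
int32_t id_pool_acquire(id_pool *p)
{
    for (uint32_t n = 0; n < MAX_DOMAIN_IDS; n++) {
        uint32_t id = (p->next_hint + n) % MAX_DOMAIN_IDS;
        if (!p->in_use[id]) {
            p->in_use[id] = 1;
            p->next_hint = (id + 1) % MAX_DOMAIN_IDS;
            return (int32_t)id;
        }
    }
    return -1;
}

/* Makes a previously acquired ID available for reuse. */
void id_pool_release(id_pool *p, int32_t id)
{
    assert(id >= 0 && (uint32_t)id < MAX_DOMAIN_IDS && p->in_use[id]);
    p->in_use[id] = 0;
}

/* Mimics the create/unload loop of the test case. Returns how many
   iterations obtained an ID before the pool ran dry. */
uint32_t run_create_unload(id_pool *p, uint32_t iterations, int release)
{
    uint32_t ok = 0;
    for (uint32_t i = 0; i < iterations; i++) {
        int32_t id = id_pool_acquire(p);
        if (id < 0)
            break;
        ok++;
        if (release)
            id_pool_release(p, id);
    }
    return ok;
}
```

In this sketch, `run_create_unload(&pool, 100000, 1)` completes all 100000 iterations because each ID is released before the next one is requested, while passing `release = 0` stops after 65536. If IDs really are recycled in the runtime, a crash near 2^16 would have to come from something other than ID exhaustion.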
Comment 11 Aleksey Sotnikov 2014-05-15 09:01:00 UTC
I have the same problem, reproduced on 3.4.0. It looks very much like a memory leak when creating and deleting domains.
PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
21174 user  20   0  841m 238m 7980 S 91.3 24.0 130:17.30 mono
21174 user  20   0  842m 240m 7980 S 124.9 24.1 130:30.84 mono
21174 user  20   0  843m 241m 7980 S 129.4 24.2 130:44.01 mono
21174 user  20   0  842m 239m 7980 S 133.2 24.1 130:58.55 mono
...
21174 user  20   0  875m 413m 7980 S 101.6 41.5 328:12.18 mono
21174 user  20   0  874m 412m 7980 S 101.5 41.4 328:22.59 mono
21174 user  20   0  875m 412m 7980 S 99.3 41.4 328:32.92 mono
21174 user  20   0  875m 412m 7980 S 107.3 41.4 328:43.08 mono
21174 user  20   0  875m 413m 7980 S 97.5 41.5 328:52.47 mono
21174 user  20   0  876m 414m 7980 S 113.3 41.6 329:02.96 mono
21174 user  20   0  874m 411m 7980 S 87.5 41.3 329:13.25 mono

Right after starting, mono used only 2.6% of memory (see the row below); just before the crash it was using 41.3%. I was running 2 copies of the test.
 8451 user  20   0  676m  26m 7904 S 153.7  2.6   0:18.95 mono
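As a rough way to quantify the growth in the `top` samples above, the following C sketch computes the change in RES per minute of CPU time (the TIME+ column is CPU time, not wall-clock time). The helper names are hypothetical, and the figure is only a back-of-the-envelope estimate from two rows.

```c
#include <assert.h>

/* Converts a top TIME+ value given as minutes and seconds
   (e.g. 130:17.30) into fractional CPU minutes. */
double cpu_minutes(int minutes, double seconds)
{
    return minutes + seconds / 60.0;
}

/* Average RES growth in MB per CPU minute between two samples. */
double res_growth_per_cpu_minute(double res0_mb, double t0_min,
                                 double res1_mb, double t1_min)
{
    assert(t1_min > t0_min); /* samples must be in chronological order */
    return (res1_mb - res0_mb) / (t1_min - t0_min);
}
```

Feeding in the first sampled row (238m at 130:17.30) and the last one (411m at 329:13.25) gives a little under 0.9 MB of resident memory gained per CPU minute for this process.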
Comment 12 Vladimir Kargov 2014-09-04 21:42:23 UTC
I can confirm that the original test case is still failing on current trunk (mono 4.8.1, master/4b2ddc7). For me it crashes after iteration 43682 with the same stack trace:

mono: mini-amd64.c:482: amd64_patch: Assertion `0' failed.

Stacktrace:
  at <unknown> <0xffffffff>
  at System.Collections.Generic.Dictionary`2.Init (int,System.Collections.Generic.IEqualityComparer`1<TKey>) <0x0004b>
  at System.Collections.Generic.Dictionary`2..ctor () <0x0001b>
  at System.Security.Cryptography.CryptoConfig/CryptoHandler..ctor (System.Collections.Generic.IDictionary`2<string, System.Type>,System.Collections.Generic.IDictionary`2<string, string>) <0x000a1>
  at System.Security.Cryptography.CryptoConfig.LoadConfig (string,System.Collections.Generic.IDictionary`2<string, System.Type>,System.Collections.Generic.IDictionary`2<string, string>) <0x00093>
  at System.Security.Cryptography.CryptoConfig.Initialize () <0x01a5f>
  at System.Security.Cryptography.CryptoConfig.CreateFromName (string,object[]) <0x0009a>
  at System.Security.Cryptography.CryptoConfig.CreateFromName (string) <0x00016>
  at System.Security.Cryptography.RandomNumberGenerator.Create (string) <0x00017>
  at System.Security.Cryptography.RandomNumberGenerator.Create () <0x0001a>
  at System.Guid.NewGuid () <0x00089>
  at System.Runtime.Remoting.RemotingServices.NewUri () <0x00081>
  at System.Runtime.Remoting.RemotingServices.Marshal (System.MarshalByRefObject,string,System.Type) <0x002b7>
  at System.AppDomain.GetMarshalledDomainObjRef () <0x0002c>
  at (wrapper runtime-invoke) <Module>.runtime_invoke_object__this__ (object,intptr,intptr,intptr) <0xffffffff>
  at <unknown> <0xffffffff>
  at (wrapper managed-to-native) System.Reflection.MonoMethod.InternalInvoke (System.Reflection.MonoMethod,object,object[],System.Exception&) <0xffffffff>
  at System.AppDomain.InvokeInDomain (System.AppDomain,System.Reflection.MethodInfo,object,object[]) <0x000a2>
  at System.Runtime.Remoting.RemotingServices.GetDomainProxy (System.AppDomain) <0x00055>
  at System.AppDomain.CreateDomain (string,System.Security.Policy.Evidence,System.AppDomainSetup) <0x00205>
  at System.AppDomain.CreateDomain (string) <0x00010>
  at Example.Main () <0x0006e>
  at (wrapper runtime-invoke) object.runtime_invoke_void (object,intptr,intptr,intptr) <0xffffffff>

Native mono stacktrace:
#0  0x00007ffff711abb9 in raise () from /lib/x86_64-linux-gnu/libc.so.6
#1  0x00007ffff711dfc8 in abort () from /lib/x86_64-linux-gnu/libc.so.6
#2  0x00007ffff7113a76 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#3  0x00007ffff7113b22 in __assert_fail () from /lib/x86_64-linux-gnu/libc.so.6
#4  0x000000000054b6f7 in amd64_patch (code=0x42d536b7 "t\377H\213", target=0x42d5373b)
    at mini-amd64.c:482
#5  0x000000000054b730 in mono_amd64_patch (code=0x42d536b7 "t\377H\213", target=0x42d5373b)
    at mini-amd64.c:488
#6  0x00000000005aaf92 in mono_arch_create_rgctx_lazy_fetch_trampoline (slot=131053,
    info=0x7fffffffbd08, aot=0) at tramp-amd64.c:869
#7  0x000000000050d617 in mono_create_rgctx_lazy_fetch_trampoline (offset=131053)
    at mini-trampolines.c:1608
#8  0x000000000041b478 in mono_resolve_patch_target (method=0xc5d600, domain=0xe93e6d0,
    code=0x42d68080 "H\203\354\070L\211<$L\211T$\bH\213|$\b\220\350Xpx\361L\213\370I\213\377H\276\230\330\364\367\377\177", patch_info=0xeac5d70, run_cctors=1) at mini.c:3646
#9  0x0000000000592f00 in mono_arch_patch_code (method=0xc5d600, domain=0xe93e6d0,
    code=0x42d68080 "H\203\354\070L\211<$L\211T$\bH\213|$\b\220\350Xpx\361L\213\370I\213\377H\276\230\330\364\367\377\177", ji=0xeac5f28, dyn_code_mp=0x0, run_cctors=1) at mini-amd64.c:6430
#10 0x000000000041cd16 in mono_codegen (cfg=0xe8983d0) at mini.c:4243
#11 0x000000000042114f in mini_method_compile (method=0xb3f520, opts=370239999, domain=0xe93e6d0,
    flags=JIT_FLAG_RUN_CCTORS, parts=0) at mini.c:5712
#12 0x0000000000421ec9 in mono_jit_compile_method_inner (method=0xb3f520, target_domain=0xe93e6d0,
    opt=370239999, jit_ex=0x7fffffffc398) at mini.c:6027
#13 0x0000000000422b73 in mono_jit_compile_method_with_opt (method=0xb3f520, opt=370239999,
    ex=0x7fffffffc398) at mini.c:6299
#14 0x0000000000423559 in mono_jit_runtime_invoke (method=0xb3f520, obj=0x0, params=0x0,
    exc=0x7fffffffc5a0) at mini.c:6574
#15 0x000000000068dfd9 in mono_runtime_invoke (method=0xb3f520, obj=0x0, params=0x0,
    exc=0x7fffffffc5a0) at object.c:2831
#16 0x0000000000688c35 in mono_runtime_class_init_full (vtable=0xe96a078, raise_exception=1)
    at object.c:376
#17 0x00000000006888af in mono_runtime_class_init (vtable=0xe96a078) at object.c:261
#18 0x000000000050bebb in mono_generic_class_init_trampoline (regs=0x7fffffffc848,
    code=0x426d0f6c "H\213D$ H\213\070ff\220\350\204;\223\375L\213\320I\273:\020mB",
    vtable=0xe96a078, tramp=0x400151b2 "\350i\325\376\377\004") at mini-trampolines.c:924
...
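
One possible reading of frames #4 to #6: the bytes at `code` start with `t\377`, i.e. 0x74 0xFF, and 0x74 is the x86 opcode for a short `je` with an 8-bit displacement. The C sketch below checks whether the distance from `code` to `target` fits in such a field. This is an interpretation of the backtrace, not code taken from mini-amd64.c, and the helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Displacement that a 2-byte short branch (opcode + rel8) located at
   `code` must encode to land on `target`. On x86 the displacement is
   relative to the address of the next instruction, i.e. code + 2. */
int64_t rel8_displacement(uint64_t code, uint64_t target)
{
    assert(code <= UINT64_MAX - 2); /* code + 2 must not wrap around */
    return (int64_t)target - (int64_t)(code + 2);
}

/* A rel8 field holds a signed 8-bit value: -128..127. */
int fits_in_rel8(int64_t disp)
{
    return disp >= INT8_MIN && disp <= INT8_MAX;
}

/* Applies the check to the addresses printed in frame #4 above. */
int frame4_branch_is_encodable(void)
{
    const uint64_t code   = 0x42d536b7u;
    const uint64_t target = 0x42d5373bu;
    return fits_in_rel8(rel8_displacement(code, target));
}
```

With the addresses from frame #4 the displacement comes out as 130, just above the rel8 maximum of 127. That would be consistent with the assertion firing when the lazy fetch trampoline for a large slot number (131053 here) becomes too long for a short branch to span, though whether the patcher fails for exactly this reason is an assumption.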
Comment 13 Vladimir Kargov 2017-11-08 08:51:50 UTC
This bug reflects a broader issue in Mono: various parts of the runtime leak memory during domain creation and destruction.

I fixed most of those issues, but more remain. Here is an incomplete list of the relevant fixes:
https://github.com/mono/mono/commit/1009c92386c68c721f6fe508a530af521ada722d
https://github.com/mono/mono/commit/f2dc087948dbcd319f276b6800c5496073d9d0a0
https://github.com/mono/mono/commit/d24633b39c15ae89e5b0e67e425438c3d59751de
https://github.com/mono/mono/commit/7b3f134caaebee3850edfbc69abbcffabad7aebd
https://github.com/mono/mono/commit/7dc414e181237ce02e54e9def1fe75991b0c5a79

Until the remaining leaks are resolved, Mono will leak memory whenever a new domain is created, which will eventually lead to a crash at some point.

The full list of currently known potential leaks is here:
https://github.com/vkargov/monolog/blob/master/dom.md
Some of those may be false positives and some may share a root cause, so the actual list is likely shorter.