Bug 30722 - Application crashes during full GC (GREF leak?)
Status: CONFIRMED
Alias: None
Product: Android
Classification: Xamarin
Component: Mono runtime / AOT Compiler
Version: 5.1
Hardware: PC Windows
Importance: Normal normal
Target Milestone: ---
Assignee: Alex Rønne Petersen
URL:
Duplicates: 30913
Depends on:
Blocks:
 
Reported: 2015-06-03 10:11 UTC by Jeremy Kolb
Modified: 2017-08-15 15:34 UTC

See Also:
Tags:
Is this bug a regression?: ---
Last known good build:


Attachments
taabt.txt (17.69 KB, text/plain)
2015-07-10 12:55 UTC, Jonathan Pryor

Description Jeremy Kolb 2015-06-03 10:11:41 UTC
A number of people have hit an issue with the latest stable where the GC isn't collecting.

See: http://forums.xamarin.com/discussion/42460/is-gc-broken-xamarin-visualstudio-3-11-44

Note that I haven't upgraded yet so haven't reproduced this.
Comment 1 Jonathan Pryor 2015-06-03 11:31:40 UTC
[Input. More Input!](https://www.youtube.com/watch?v=Pj-qBUWOYfE&t=9).

Could there be a GC bug? Yes. Have *I* seen any evidence of one? Not yet.

There is not enough information here to say anything at all.

Refresher: a GREF is used for every instance of a `Java.Lang.Object` subclass. For example, this block of code will require 2000 GREFs:

    var strings = new List<Java.Lang.String>();
    for (int i = 0; i < 2000; ++i) {
        strings.Add (new Java.Lang.String (i.ToString ()));
    }

That will ~immediately crash on an emulator, as emulators have a limit of 2000 GREFs. (*BOOM*)

Note, however, that the above code does *not* exhibit a GREF leak: all of those instances *are* alive, are being referenced, and *cannot* be collected by the GC.

What we need is the [GREF log](http://developer.xamarin.com/guides/android/troubleshooting/troubleshooting/#Global_Reference_Messages):

    adb shell setprop debug.mono.log gref

If running on Xamarin.Android 5.1 or later with a Debug build, this will generate a [`files/.__override__/grefs.txt`][grefs] log file.

[grefs]: http://developer.xamarin.com/releases/android/xamarin.android_5/xamarin.android_5.1/#GREF_Logging_Changes

Please provide sample code which causes the crash and the corresponding GREF log, which will let us know how many GREFs have been created and where they were created from. (We need the source to ascertain if it's an actual GC bug/leak or a "bug" in the code, e.g. needing to reference "too many" instances at the same time.)
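
As a rough aid for reading such a capture, here is a minimal sketch that reports the highest `grefc` and `gwrefc` counters seen in logcat output. It assumes lines shaped like the `monodroid-gref` logcat entries quoted in this thread (`... grefc 13 gwrefc 1 ...`); `gref_peak` is a hypothetical helper name, not part of any Xamarin tooling.

```shell
# Hypothetical helper: print the peak grefc/gwrefc values found in logcat text.
# Reads the files given as arguments, or stdin when none are given.
gref_peak() {
    awk '
        /monodroid-gref/ {
            # Each counter name is followed by its current value.
            for (i = 1; i < NF; i++) {
                if ($i == "grefc"  && $(i + 1) + 0 > g) g = $(i + 1) + 0
                if ($i == "gwrefc" && $(i + 1) + 0 > w) w = $(i + 1) + 0
            }
        }
        END { printf "peak grefc=%d gwrefc=%d\n", g, w }
    ' "$@"
}
```

With GREF logging enabled as above, something like `adb logcat -d | gref_peak` should give a quick read on whether the count is anywhere near the emulator's limit of 2000 before digging into the full log.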
Comment 2 Brad Crandall 2015-06-07 10:30:17 UTC
Here is how to reproduce the problem I am seeing:
1. Create a new project in Visual Studio using the Visual C#->Android->Blank App (Android) template.
2. Edit MainActivity.cs and add GC.Collect(); to the button click handler.
3. Deploy the app and run it, then click the button and BOOM.

Tested on the Android Emulator API 15.

public class MainActivity : Activity {
    int count = 1;  // click counter, as declared in the Blank App template

    protected override void OnCreate(Bundle bundle) {
        base.OnCreate(bundle);

        // Set our view from the "main" layout resource
        SetContentView(Resource.Layout.Main);

        // Get our button from the layout resource,
        // and attach an event to it
        Button button = FindViewById<Button>(Resource.Id.MyButton);

        button.Click += delegate {
            button.Text = string.Format("called GC.Collect {0} times", count++);
            Android.Util.Log.Info("TestGC", "Calling GC.Collect()");
            GC.Collect();  // force a full collection (the trigger for the crash)
            Android.Util.Log.Info("TestGC", "Completed calling GC.Collect()");
        };
    }
}

06-04 21:59:35.061 I/TestGC ( 2407): Calling GC.Collect()
 06-04 21:59:35.061 I/monodroid-lref( 2407): -l- lrefc 1 handle 0xb3950a00/L from thread '(null)'(1)
 06-04 21:59:35.061 I/monodroid-lref( 2407): -l- lrefc 0 handle 0xb3950a40/L from thread '(null)'(1)
 06-04 21:59:35.070 I/monodroid-gref( 2407): +w+ grefc 13 gwrefc 1 obj-handle 0x1d3001f2/G -> new-handle 0x1d200003/W from thread 'finalizer'(2407)
 06-04 21:59:35.070 I/monodroid-gref( 2407): -g- grefc 13 gwrefc 1 handle 0x1d3001f2/G from thread 'finalizer'(2407)
 06-04 21:59:35.070 I/monodroid-gref( 2407): +w+ grefc 12 gwrefc 2 obj-handle 0x1d2001ea/G -> new-handle 0x1d200007/W from thread 'finalizer'(2407)
 06-04 21:59:35.070 I/monodroid-gref( 2407): -g- grefc 12 gwrefc 2 handle 0x1d2001ea/G from thread 'finalizer'(2407)
 06-04 21:59:35.070 D/dalvikvm( 2407): GC_EXPLICIT freed 109K, 2% free 9105K/9283K, paused 0ms+1ms
 06-04 21:59:35.070 I/monodroid-gref( 2407): +g+ grefc 12 gwrefc 2 obj-handle 0x1d200003/W -> new-handle 0x1d4001f2/G from thread 'finalizer'(2407)
 06-04 21:59:35.070 I/monodroid-gref( 2407): -w- grefc 12 gwrefc 1 handle 0x1d200003/W from thread 'finalizer'(2407)
 06-04 21:59:35.070 I/monodroid-gref( 2407): +g+ grefc 13 gwrefc 1 obj-handle 0x1d200007/W -> new-handle 0x1d3001ea/G from thread 'finalizer'(2407)
 06-04 21:59:35.070 I/monodroid-gref( 2407): -w- grefc 13 gwrefc 0 handle 0x1d200007/W from thread 'finalizer'(2407)
 06-04 21:59:35.070 I/monodroid-gc( 2407): GC cleanup summary: 2 objects tested - resurrecting 2.
 06-04 21:59:35.080 D/Mono ( 2407): GC_OLD_BRIDGE num-objects 2 num_hash_entries 2 sccs size 2 init 0.00ms df1 0.00ms sort 0.04ms dfs2 0.06ms setup-cb 0.04ms free-data 0.05ms links 1/1/1/1 dfs passes 5/3
 06-04 21:59:35.080 D/Mono ( 2407): GC_MAJOR: (user request) pause 1.37ms, total 2.38ms, bridge 11.05ms major 800K/320K los 8K/56K
 06-04 21:59:35.080 E/mono-rt ( 2407): Stacktrace:
 06-04 21:59:35.080 E/mono-rt ( 2407): 
 06-04 21:59:35.080 E/mono-rt ( 2407): 
 06-04 21:59:35.080 E/mono-rt ( 2407): Attempting native Android stacktrace:
 06-04 21:59:35.080 E/mono-rt ( 2407): 
 06-04 21:59:35.080 I/monodroid-assembly( 2407): Trying to load library '/data/data/TestGC.TestGC/lib/libunwind.so'
 06-04 21:59:35.080 I/monodroid-assembly( 2407): Trying to load library '/data/data/TestGC.TestGC/lib/libcorkscrew.so'
 06-04 21:59:35.080 E/mono-rt ( 2407): Could not unwind with libunwind.so: Cannot load library: load_library[1091]: Library '/data/data/TestGC.TestGC/lib/libunwind.so' not found
 06-04 21:59:35.080 E/mono-rt ( 2407): Could not unwind with libcorkscrew.so: Cannot load library: load_library[1091]: Library '/data/data/TestGC.TestGC/lib/libcorkscrew.so' not found
 06-04 21:59:35.080 E/mono-rt ( 2407): 
 06-04 21:59:35.080 E/mono-rt ( 2407): No options left to get a native stacktrace :-(
 06-04 21:59:35.080 E/mono-rt ( 2407): 
 06-04 21:59:35.080 E/mono-rt ( 2407): =================================================================
 06-04 21:59:35.080 E/mono-rt ( 2407): Got a SIGSEGV while executing native code. This usually indicates
 06-04 21:59:35.080 E/mono-rt ( 2407): a fatal error in the mono runtime or one of the native libraries 
 06-04 21:59:35.080 E/mono-rt ( 2407): used by your application.
 06-04 21:59:35.080 E/mono-rt ( 2407): =================================================================
 06-04 21:59:35.080 E/mono-rt ( 2407): 
 06-04 21:59:35.080 F/libc ( 2407): Fatal signal 11 (SIGSEGV) at 0x00000000 (code=128)
 06-04 21:59:35.080 I/monodroid-lref( 2407): +l+ lrefc 1 handle 0xb394b308/L from thread '(null)'(1)
 06-04 21:59:35.080 I/monodroid-lref( 2407): +l+ lrefc 2 handle 0xb394a3b8/L from thread '(null)'(1)
 06-04 21:59:35.080 I/TestGC ( 2407): Completed calling GC.Collect()
 06-04 21:59:35.080 I/monodroid-lref( 2407): -l- lrefc 1 handle 0xb394b308/L from thread '(null)'(1)
 06-04 21:59:35.080 I/monodroid-lref( 2407): -l- lrefc 0 handle 0xb394a3b8/L from thread '(null)'(1)
 06-04 21:59:35.610 I/DEBUG ( 774): *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
 06-04 21:59:35.610 I/DEBUG ( 774): Build fingerprint: 'generic_x86/sdk_x86/generic_x86:4.0.4/IMM76D/eng.juntian.20120418.185032:eng/test-keys'
 06-04 21:59:35.610 I/DEBUG ( 774): pid: 2407, tid: 2420 >>> TestGC.TestGC <<<
 06-04 21:59:35.610 I/DEBUG ( 774): signal 11 (SIGSEGV), code 128 (?), fault addr 00000000
 06-04 21:59:36.490 D/Zygote ( 777): Process 2407 terminated by signal (11)
Comment 3 Brad Crandall 2015-06-30 21:25:07 UTC
I notice the status for this bug is NEEDINFO. What more info do you need that has not been provided?

Today I tried the latest release Xamarin.VisualStudio_3.11.666. The issue is still there and can easily be reproduced by calling GC.Collect().

Is anyone looking at this issue? I would think this would be top priority, since it prevents anyone from using any release newer than 3.9.547.
Comment 4 Jonathan Pryor 2015-07-10 12:47:18 UTC
> Tested on the Android Emulator API 15.

This happens on the *x86* emulator, not the armeabi-v7a emulator. It also doesn't happen on ARM hardware (e.g. Nexus 5) -- at least not for me. I don't know about x86 hardware.
Comment 5 Jonathan Pryor 2015-07-10 12:55:57 UTC
Created attachment 11978
taabt.txt

Output of gdb's `t a a bt` (`thread apply all bt`) after running the app within gdb.

Note that the thread that triggered the SIGSEGV has no backtrace info.
Comment 6 Mark Probst 2015-07-10 13:42:07 UTC
This should be trivial to bisect.
Comment 7 Jonathan Pryor 2015-08-01 09:17:07 UTC
*** Bug 30913 has been marked as a duplicate of this bug. ***
Comment 8 Marek Habersack 2015-08-20 09:49:40 UTC
Tested with the current master. 

I was unable to reproduce this issue on:

 * Nexus 9
 * Nexus 5
 * Dell Venue 7 - this one is x86
 * API 15 ARM Google emulator
 * API 18 x86 Google emulator
 * API 19 x86 Google emulator

I was able to get the crash on:

 * API 15 x86 Google emulator

@Peter, could QA test it on a wider selection of devices/emulators? 

Currently it appears that the issue is limited to the Google x86 emulator image for API 15, so I've reduced the severity to normal.

@Brad, were you able to reproduce this issue on *anything* other than the API 15 x86 emulator?
Comment 9 Brad Crandall 2015-08-20 13:48:33 UTC
@Marek,

I ran my test app again using the latest release, Xamarin.VisualStudio_3.11.836. I am only seeing the crash on the API 15 x86 emulator. The others seem fine.

Here is what I tested:

-------------
GC Breaks on
-------------
Android 4.0.3, API 15, Intel Atom (x86) Rev 2


--------------
Seems fine on
--------------
Android 4.4.2, API 19, Intel Atom (x86) Rev 3
Android 2.3.3, API 10, ARM (armeabi)
Android 5.0.1, API 21, Intel Atom (x86) Rev 3
Android 5.1.1, API 22, Intel Atom (x86) Rev 1
Google API 4.4.2 API 19, Intel Atom (x86) Rev 15
Comment 10 Peter Collins 2015-08-20 18:32:42 UTC
@Marek I ran this against 223 devices on test cloud and it passed on 222. The one failure case was not related to this bug, and appeared to be caused by a performance hiccup in the UI Test script.

I was able to reproduce the crash on an x86 API 15 Google emu using XA 5.1.5. I was _not_ able to reproduce the failure on the same device after building the app against XA 4.20-series, so I can confirm that this is a regression introduced in 5.x.
Comment 11 Marek Habersack 2015-08-27 11:14:56 UTC
Caught the crash in gdb (finally):

#0  0xabd370eb in resize_spill_info (bank=<optimized out>, cfg=<optimized out>) at /home/grendel/vc/xamarin/mono/mono/mini/mini-codegen.c:302
#1  mono_spillvar_offset (bank=<optimized out>, spillvar=<optimized out>, cfg=<optimized out>) at /home/grendel/vc/xamarin/mono/mono/mini/mini-codegen.c:320
#2  spill_vreg (cfg=<optimized out>, bb=bb@entry=0xabfcbc14 <_GLOBAL_OFFSET_TABLE_>, last=0xa915f82c, reg=30, bank=<optimized out>, ins=<optimized out>) at /home/grendel/vc/xamarin/mono/mono/mini/mini-codegen.c:815
#3  0xabd3742c in free_up_hreg (cfg=cfg@entry=0x9ad8a30, bb=0xabfcbc14 <_GLOBAL_OFFSET_TABLE_>, bb@entry=0x9adc2f8, last=<optimized out>, last@entry=0xa915f82c, hreg=<optimized out>, hreg@entry=0, bank=<optimized out>, bank@entry=0, ins=<optimized out>)
    at /home/grendel/vc/xamarin/mono/mono/mini/mini-codegen.c:933
#4  0xabd396da in mono_local_regalloc (cfg=cfg@entry=0x9ad8a30, bb=bb@entry=0x9adc2f8) at /home/grendel/vc/xamarin/mono/mono/mini/mini-codegen.c:1913
#5  0xabca511c in mono_codegen (cfg=cfg@entry=0x9ad8a30) at /home/grendel/vc/xamarin/mono/mono/mini/mini.c:2391
#6  0xabca6dee in mini_method_compile (method=method@entry=0x9ade6f8, opts=opts@entry=378628607, domain=domain@entry=0x9862800, flags=flags@entry=JIT_FLAG_RUN_CCTORS, parts=parts@entry=0, aot_method_index=aot_method_index@entry=-1)
    at /home/grendel/vc/xamarin/mono/mono/mini/mini.c:3859
#7  0xabca7fc4 in mono_jit_compile_method_inner (method=method@entry=0x9ade6f8, target_domain=target_domain@entry=0x9862800, opt=opt@entry=378628607, jit_ex=jit_ex@entry=0xa915fbf0) at /home/grendel/vc/xamarin/mono/mono/mini/mini.c:4063
#8  0xabcacf5f in mono_jit_compile_method_with_opt (method=method@entry=0x9ade6f8, opt=378628607, ex=ex@entry=0xa915fbf0) at /home/grendel/vc/xamarin/mono/mono/mini/mini-runtime.c:1894
#9  0xabcad63f in mono_jit_compile_method (method=0x9ade6f8) at /home/grendel/vc/xamarin/mono/mono/mini/mini-runtime.c:1931
#10 0xabe33303 in mono_compile_method (method=0x9ade6f8) at /home/grendel/vc/xamarin/mono/mono/metadata/object.c:577
#11 0xabe2d781 in mono_gc_run_finalize (obj=obj@entry=0xa9184390, data=data@entry=0x0) at /home/grendel/vc/xamarin/mono/mono/metadata/gc.c:223
#12 0xabe649b4 in sgen_client_run_finalize (obj=obj@entry=0xa9184390) at /home/grendel/vc/xamarin/mono/mono/metadata/sgen-mono.c:486
#13 0xabe76ab7 in sgen_gc_invoke_finalizers () at /home/grendel/vc/xamarin/mono/mono/sgen/sgen-gc.c:2572
#14 0xabe649d5 in mono_gc_invoke_finalizers () at /home/grendel/vc/xamarin/mono/mono/metadata/sgen-mono.c:492
#15 0xabe2dffe in finalizer_thread (unused=0x0) at /home/grendel/vc/xamarin/mono/mono/metadata/gc.c:1127
#16 0xabe1b202 in start_wrapper_internal (data=0x98f2168) at /home/grendel/vc/xamarin/mono/mono/metadata/threads.c:723
#17 start_wrapper (data=0x98f2168) at /home/grendel/vc/xamarin/mono/mono/metadata/threads.c:770
#18 0xabeb9b85 in inner_start_thread (arg=0xbfe606a4) at /home/grendel/vc/xamarin/mono/mono/utils/mono-threads-posix.c:97
#19 0xb7fe3ba2 in __thread_entry () from /home/grendel/Projects/xamarin/TestApp/gdb-symbols/libc.so
#20 0x00000000 in ?? ()
Comment 12 Marek Habersack 2015-08-27 11:17:09 UTC
The crash is with XA/master and Mono 4.2.0
Comment 13 Rodrigo Kumpera 2016-06-28 00:13:05 UTC
Hey Alex,

Can you take a look at this one?
Comment 15 Jeremy Kolb 2016-07-15 13:27:16 UTC
Why are you removing the milestone?  I really hope this gets fixed.
