Created attachment 8056 [details]
Basic error message information is as follows, with more detailed log attached. The error does not exist if the libraries are loaded early in the constructor.
E/Surface (14724): dequeueBuffer_DEPRECATED: Fence::wait returned an error:
W/Adreno-EGLSUB(14724): <DequeueBuffer:585>: dequeue native buffer fail:
Interrupted system call, buffer=0x0, handle=0x0
W/Adreno-ES20(14724): <gl2_surface_swap:43>: GL_OUT_OF_MEMORY
W/Adreno-EGL(14724): <qeglDrvAPI_eglSwapBuffers:3590>: EGL_BAD_SURFACE
Could you please provide us the direction/suggestions to check this issue?
Are you running under the debugger when this situation occurs? I have had issues with this on Qualcomm-based chipsets with Xamarin under the debugger for a long time. Basically, Qualcomm has bugs in their graphics driver where an interrupted system call is not handled correctly. Either the graphics driver does not free the memory it was using when the system call is interrupted, or it is just misinterpreting the return code.
It's quite aggravating as it causes hangs or crashes while trying to debug apps on a device. I suspect that this error can also be triggered by the garbage collector signals interrupting the graphics driver system calls.
Aside from working with Qualcomm to address this issue (which won't help existing devices for a long time, if ever), I think Xamarin can solve this issue by introducing the SA_RESTART flag on the signal handlers that it registers. This would make it so that system calls interrupted by the SIGSEGV exceptions used for the debugger and the SIGxxx signals used for GC are restarted, rather than causing other Android subsystems to get unexpected errors.
I also see another weird manifestation of this signal handler behavior when I am doing network requests under the debugger. Often the requests will fail noting an EINTR. Again, the GC likely causes these as well. Normal Android apps generally run with no signals occurring, so the extra signals in Xamarin/mono often seem to cause unexpected behaviors...
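For reference, the usual caller-side defense against the EINTR failures described above is to retry the interrupted call. This is a minimal sketch in C, not code taken from Mono or Android; `read_retry` is a hypothetical helper name.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: retry read() when a signal interrupts it before any
 * data was transferred (read() returns -1 with errno set to EINTR). */
ssize_t read_retry(int fd, void *buf, size_t len)
{
    ssize_t n;

    assert(buf != NULL);
    do {
        n = read(fd, buf, len);
    } while (n == -1 && errno == EINTR);
    return n;
}
```

Code that is written this way keeps working whether or not the interrupting handler was installed with SA_RESTART. Code that treats any -1 as fatal does not.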
Yes. This only happens when the debugger is connected.
Created attachment 8287 [details]
I managed to come up with a reproduction. Launch the game with the debugger, tap the screen and wait a few seconds. The bug occurred on a Galaxy S4. It didn't occur on an LG L65.
It's been three weeks without a sign. Anyone looking at these tickets?
Created attachment 8475 [details]
I was able to reproduce this on a physical Samsung Galaxy S4, however the reproduction did not occur on every attempt. I also noticed that this issue doesn't happen on Samsung Galaxy S devices other than the S4. I've attached logs/build output/etc. to confirm that I was able to replicate the issue.
I'm rather confused by this.
The crash appears to have nothing to do with lazy-loading of assemblies; the crash appears to be an Out Of Memory condition within OpenGL, and things falling to pieces from there:
> [Adreno-EGLSUB] <DequeueBuffer:593>: dequeue native buffer fail: Interrupted system call, buffer=0x0, handle=0x0
> [Adreno-ES20] <gl2_surface_swap:43>: GL_OUT_OF_MEMORY
> [Adreno-EGL] <qeglDrvAPI_eglSwapBuffers:3661>: EGL_BAD_SURFACE
> Thread finished: <Thread Pool> #14
> [Error in swap buffers] OpenTK.Platform.Android.EglException: EglSwapBuffers failed with error 12301 (0x300d)
> [Error in swap buffers] at OpenTK.Platform.Android.AndroidGraphicsContext.Swap () [0x00090] in /Users/builder/data/lanes/monodroid-mlion-monodroid-4.18-series/3b7ef0a7/source/monodroid/src/OpenGLES/Android/AndroidGraphicsContext.cs:146
> [Error in swap buffers] at OpenTK.Platform.Android.AndroidGameView.SwapBuffers () [0x0001f] in /Users/builder/data/lanes/monodroid-mlion-monodroid-4.18-series/3b7ef0a7/source/monodroid/src/OpenGLES/Android/AndroidGameView.cs:228
> [Error in swap buffers] at Microsoft.Xna.Framework.AndroidGamePlatform.Present () [0x00025] in /Users/dominicnahous/Downloads/Repro/MonoGame/MonoGame.Framework/Android/AndroidGamePlatform.cs:171
The "Repro Case" in Comment #4 appears to be additional log output. I am further confused.
@Adam's URL in Comment #5 is taking an eternity to download; I haven't been able to view it yet. It's been stuck at 99% for several minutes.
Comment #2 is somewhat enlightening -- it's a hardware/driver issue! yay! -- though the advice to use SA_RESTART is further confusing, as Mono itself *does* use SA_RESTART in numerous places, so I'm not sure what would be missing.
The GC signals do use SA_RESTART, so I am wrong about the GC causing the same behavior, but it appears that none of the other posix signal handlers do... e.g. my good friend SIGSEGV.
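Whether a given handler was registered with SA_RESTART can be checked at runtime by querying the current disposition with sigaction(). A small sketch in C follows; the helper names `handler_has_sa_restart` and `install_noop_handler` are made up for illustration and are not part of Mono.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>

/* Returns 1 if the handler currently installed for signo has SA_RESTART set,
 * 0 if it does not, and -1 if sigaction() itself fails. */
int handler_has_sa_restart(int signo)
{
    struct sigaction current;

    /* Passing NULL as the new action only queries the existing one. */
    if (sigaction(signo, NULL, &current) != 0)
        return -1;
    return (current.sa_flags & SA_RESTART) ? 1 : 0;
}

static void noop_handler(int signo)
{
    (void)signo;
}

/* Installs a do-nothing handler with the given flags (illustration only). */
int install_noop_handler(int signo, int flags)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = noop_handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = flags;
    return sigaction(signo, &sa, NULL);
}
```

Calling `handler_has_sa_restart(SIGSEGV)` from native code inside the app, after the runtime has started, would show what flags the SIGSEGV handler really has.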
The only way an EINTR is getting returned, AFAIK, is if a signal interrupted the call. For example, the debugger is interacting with the running app in some way, e.g. pausing the app via signals, a breakpoint is triggered, a conditional breakpoint is evaluated, the UI tries to read some state and sends some signal unbeknownst to the debugger monkey operating it :)
IIRC debug code from mono is generated with a read @ magic page address in between statements so that the magic address can be changed to invalid to cause program execution to pause after each single step event. For me personally, single stepping under the debugger (even in non GUI code) would trigger this issue.
SIGSEGV needs restartable?! The one with a default action that terminates the process?
That said, Mono *does* have support to "delegate" the SIGSEGV chain to the native (dalvik/ART) SIGSEGV handler (added in Xamarin.Android 4.16+) which will cause Android to field the SIGSEGV, invoking debuggerd for additional logcat logging...
But I do not understand why SIGSEGV of all signals should be SA_RESTARTable...
When many signals are delivered, they pause all threads in the process. SIGSEGV is an example. SA_RESTART controls what happens to the other threads' pending syscalls when the signal handler returns. It specifically controls whether a pending syscall results in an error return (EINTR) or whether the kernel restarts the pending system call (e.g. a wait type operation). If a signal type is not targeted to a specific thread, then all threads will have their system calls interrupted.
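The EINTR-versus-restart difference can be seen in a single thread on Linux with a blocked read() on an empty pipe. This is only a sketch under stated assumptions: `blocked_read_result` and `wake_fd` are hypothetical names, SIGALRM stands in for whatever signal the runtime uses, and it says nothing about whether a signal on one thread interrupts syscalls on another.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/time.h>
#include <unistd.h>

static int wake_fd = -1;   /* write end of the pipe, used by the handler */

/* Handler: make one byte available so that a restarted read() can finish.
 * write() is async-signal-safe. */
static void on_alarm(int signo)
{
    char byte = 'x';
    ssize_t ignored;

    (void)signo;
    ignored = write(wake_fd, &byte, 1);
    (void)ignored;
}

/* Blocks in read() on an empty pipe and lets SIGALRM interrupt it.
 * Returns 1 if the kernel restarted read() and it then consumed the byte
 * written by the handler, or -EINTR if read() failed with EINTR.
 * Assumes the process reaches read() before the 100 ms timer fires. */
int blocked_read_result(int sa_flags)
{
    int fds[2];
    struct sigaction sa;
    struct itimerval timer;
    char byte;
    ssize_t n;
    int result;

    if (pipe(fds) != 0)
        return -errno;
    wake_fd = fds[1];

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_alarm;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = sa_flags;          /* 0 or SA_RESTART */
    if (sigaction(SIGALRM, &sa, NULL) != 0)
        return -errno;

    memset(&timer, 0, sizeof timer);
    timer.it_value.tv_usec = 100000; /* one shot, 100 ms from now */
    if (setitimer(ITIMER_REAL, &timer, NULL) != 0)
        return -errno;

    n = read(fds[0], &byte, 1);      /* blocks until the signal arrives */
    result = (n == -1) ? -errno : (int)n;

    close(fds[0]);
    close(fds[1]);
    return result;
}
```

With `sa_flags` set to 0 the caller sees -EINTR, which is the same kind of failure the driver logs as "Interrupted system call". With SA_RESTART the caller never notices the signal.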
The Xamarin/mono debug system for single stepping (breakpoints?) works by essentially rewriting code to include a potential SIGSEGV between lines. For example.
int foo = 1;
foo *= 2;
would logically become
int foo = 1;
*magic_page;  // read from the magic page address
foo *= 2;
The magic_page pointer is changed by the soft debugger to be inaccessible when you are in single stepping mode. This causes the application to generate a SIGSEGV between every statement.
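The mechanism can be sketched in plain C on Linux with mmap(), mprotect() and a SIGSEGV handler. This is an illustration of the idea as described in this comment, not Mono's actual implementation. All function names are hypothetical, and the handler simply re-enables the page where a real debugger agent would suspend the thread and talk to the debugger.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <signal.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

static volatile char *magic_page;          /* page read between statements */
static size_t page_size;
static volatile sig_atomic_t step_events;  /* number of faulting reads seen */

/* SIGSEGV handler: count the "step event", then make the page readable so
 * the faulting read is retried and succeeds once the handler returns.
 * mprotect() is not on the POSIX async-signal-safe list; calling it here is
 * a common Linux technique and an assumption of this sketch. */
static void on_segv(int signo, siginfo_t *info, void *context)
{
    (void)signo;
    (void)context;
    if (info->si_addr != (void *)magic_page)
        _exit(1);                          /* unrelated crash: give up */
    step_events++;
    mprotect((void *)magic_page, page_size, PROT_READ);
}

/* Maps the magic page (readable) and installs the handler. */
int magic_page_init(void)
{
    struct sigaction sa;
    void *p;

    page_size = (size_t)sysconf(_SC_PAGESIZE);
    p = mmap(NULL, page_size, PROT_READ, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;
    magic_page = p;

    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_segv;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_SIGINFO | SA_RESTART;
    return sigaction(SIGSEGV, &sa, NULL);
}

/* What the debugger would do to switch single stepping on. */
int single_step_enable(void)
{
    return mprotect((void *)magic_page, page_size, PROT_NONE);
}

/* The "instrumented" statements. Returns how many faults were taken. */
int run_instrumented(void)
{
    int before = step_events;
    int foo = 1;
    char probe = *magic_page;              /* faults only when PROT_NONE */

    foo *= 2;
    (void)probe;
    (void)foo;
    return step_events - before;
}
```

After `magic_page_init()`, `run_instrumented()` takes no fault until `single_step_enable()` is called. The next run then takes exactly one SIGSEGV at the read between the two statements.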
Now consider I have two threads running. One is doing some graphics work (UI thread/render thread/etc), the other is in some code I am trying to debug.
T1: code in the Qualcomm driver
A: result = wait_for_graphics_operation_to_complete();
B: if (result != OK)
T2: my code that needs to be debugged
D: int foo = 1;
E: *magic_page;  // read from the magic page address
F: foo *= 2;
I am in single-stepping mode. Thread 1 pauses at line A waiting for a result. Meanwhile, Thread 2 reaches line E and the memory read generates a SIGSEGV. **If** the SIGSEGV causes the Android kernel to "interrupt" all threads, then this would result in result != OK and the driver would crap all over the graphics state.
In theory, a SIGSEGV could be properly targeted to a thread, but sometimes these things are surprising. For example, someone may have decided it would be sensible that if a program thread fails for a SIGSEGV all threads should be stopped so the most consistent handling can be performed.
If SA_RESTART was enabled for SIGSEGV, I don't think there would be any behavior change for the thread that triggered the SIGSEGV. The kernel checks all the buffers for validity up front, so a waiting syscall could never SIGSEGV.
The other signal handlers related to debugging likely also have an impact.
Glancing at the signal handler registration code without having properly looked into all of the handlers' actual usage, the SIGINT handler stands out to me. It is only enabled in debug mode, and sounds like it might be used to send a request from the debugger UI to pause the process to query state/change configuration/etc.
SIGINT also seems less likely to be a thread-targeted signal. Using a kill(xyz) to deliver a signal doesn't appear to have a way to target a specific thread (Linux PID/TID overlap), so I think the API must always deliver them at the process level.
@Zoltan: This looks like a debugger-related issue, as per Comment #3.
Repro is Comment #13; repro steps in Comment #4.
I can't reproduce this with the testcase provided. I don't think SIGSEGV signals cause other threads to be interrupted, but the debugger does use signals to suspend/resume threads. Will look into the SA_RESTART flag.
I think that this EINTR problem is also the cause of some Lollipop issues
Android's mutex implementation can be interrupted by signals (see lock which
returns an error code)
Android's Looper implementation (used by RenderThread) does not check the
result of this lock operation, so it proceeds to
(see line 229)
Although I originally saw that Qualcomm's driver had this issue where adding
new signals to the normal lifetime of Android apps caused unexpected behavior,
it now seems that Android itself has this problem.
The Looper is not new code in Android and it seems to have never expected to be
interrupted. I suspect there are more instances of this across the entire
Android code base, so it's critical to work around with SA_RESTART for any new
signals that Xamarin introduces.
RenderThread also uses AutoMutex which doesn't check the result of mLock.lock() so there are new places in Lollipop that cause random crashes due to native corruption from a false acquisition of a lock due to EINTR.
Actually, pthread_mutex_lock apparently never returns EINTR, so this proposed explanation is wrong, at least for the exact mechanism. :( Maybe there is some other Android code in the render thread that doesn't handle EINTR well though.
This will hopefully be fixed in xamarin.android 4.20.
What was the problem?
I have checked this issue and observed that the application deploys successfully on an Android device (Samsung Galaxy S4, Android 4.4.2).
I have also observed that when we tap on the screen, it throws an 'Unhandled Exception' after a few seconds and the application crashes.
Hence reopening this issue.
ADB Logcat: https://gist.github.com/Parmendrak/a59c496b03e5f4cb87ef
Microsoft Visual Studio Professional 2013
Version 12.0.30723.00 Update 3
Microsoft .NET Framework
Installed Version: Professional
@radek: Please review the OutputLog in Comment #23. Any ideas?
> E/mono (11640): OpenTK.Platform.Android.EglException: MakeCurrent failed with error 12299 (0x300b)
> 11-13 23:24:42.000 E/Surface (11640): dequeueBuffer_DEPRECATED: Fence::wait returned an error: -4
That is -EINTR .. so more rogue signals are afoot...
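To double-check the mapping: on Linux, EINTR is errno 4, and its message text is the same "Interrupted system call" string that appears in the Adreno log lines. A tiny C sketch, where `status_message` is a hypothetical helper that assumes the status is a negated errno value:

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/* Hypothetical helper: turn a negative errno-style status, such as the "-4"
 * in the logcat line, into its message text. */
const char *status_message(int status)
{
    return strerror(status < 0 ? -status : status);
}
```

On a Linux system, `status_message(-4)` yields the text for EINTR.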
(In reply to comment #24)
> @radek: Please review the OutputLog in Comment #23. Any ideas?
> > E/mono (11640): OpenTK.Platform.Android.EglException: MakeCurrent failed with error 12299 (0x300b)
Will take a look. I saw such an error once when MakeCurrent was called from the wrong thread.
I tried to reproduce on Galaxy S4 and don't see the exception there, so hopefully it is fixed. (tested with XA 18.104.22.168)
I have checked this issue on a Galaxy S4 and now it's working fine.
Microsoft Visual Studio Professional 2013
Version 12.0.30723.00 Update 3
Microsoft .NET Framework