Bug 15261 - SGEN: Assertion at sgen-alloc.c:425, condition `*p == NULL' not met
Summary: SGEN: Assertion at sgen-alloc.c:425, condition `*p == NULL' not met
Status: NEEDINFO
Alias: None
Product: Runtime
Classification: Mono
Component: GC
Version: 3.2.x
Hardware: Other Linux
Importance: --- normal
Target Milestone: ---
Assignee: Bugzilla
URL:
Depends on:
Blocks:
 
Reported: 2013-10-08 04:54 UTC by Bassam
Modified: 2017-10-14 00:12 UTC

See Also:
Tags:
Is this bug a regression?: ---
Last known good build:


Attachments
test case (1.94 KB, application/octet-stream)
2013-10-08 04:54 UTC, Bassam
gdb output (16.67 KB, application/octet-stream)
2013-10-08 04:55 UTC, Bassam
gdb output from crash with/ boehm (19.50 KB, text/plain)
2013-10-11 13:43 UTC, Bassam

Description Bassam 2013-10-08 04:54:39 UTC
Created attachment 5084
test case

We frequently see a runtime crash on ARMV5TEL machines running the latest Mono master. I've narrowed it down to the attached test case. It looks like a race of some sort, where the memory is unmapped before memcpy is called, leading to a SIGSEGV.

Environment:
============
Linux syn-bamboo-arm 2.6.32.12 #3776 Sat Aug 17 11:31:02 CST 2013 armv5tel GNU/Linux synology_88f6282_212+

Mono Runtime Engine version 3.2.3 ((no/f4ada63 Sun Oct  5 20:36:39 PDT 2013)
Copyright (C) 2002-2013 Novell, Inc, Xamarin Inc and Contributors. www.mono-project.com
        TLS:           __thread
        SIGSEGV:       normal
        Notifications: epoll
        Architecture:  armel,vfp+fallback
        Disabled:      none
        Misc:          softdebug
        LLVM:          supported, not enabled.
        GC:            sgen

There are three variations of this crash. 

Variation 1:
============
syn-bamboo-arm>TMP=/volume1/@tests/tmp mono-sgen SgenBug.exe
Starting Test
* Assertion at sgen-alloc.c:425, condition `*p == NULL' not met

Stacktrace:


Native stacktrace:


Debug info from gdb:

* Assertion at sgen-alloc.c:425, condition `*p == NULL' not met

Unable to attach: program terminated with signal SIGABRT, Aborted.
No threads.
Aborted (core dumped)

Variation 2:
============
syn-bamboo-arm> TMP=/volume1/@tests/tmp ./thirdparty/mono/bin/mono-sgen SgenBug.exe
Starting Test
Stacktrace:

  at <unknown> <0xffffffff>
  at (wrapper managed-to-native) System.Buffer.BlockCopyInternal (System.Array,int,System.Array,int,int) <0xffffffff>
  at System.IO.FileStream.ReadSegment (byte[],int,int) <0x00073>
  at System.IO.FileStream.ReadInternal (byte[],int,int) <0x00237>
  at System.IO.FileStream.Read (byte[],int,int) <0x00107>
  at System.Security.Cryptography.HashAlgorithm.ComputeHash (System.IO.Stream) <0x0008b>
  at SgenBug.MainClass.ChecksumFile (object) <0x00123>
  at System.Threading.Thread.StartInternal () <0x000cb>
  at (wrapper runtime-invoke) object.runtime_invoke_void__this__ (object,intptr,intptr,intptr) <0xffffffff>

Native stacktrace:


Debug info from gdb:


Unhandled Exception:
System.NullReferenceException: Object reference not set to an instance of an object
  at (wrapper managed-to-native) System.Buffer:BlockCopyInternal (System.Array,int,System.Array,int,int)
  at System.IO.FileStream.ReadSegment (System.Byte[] dest, Int32 dest_offset, Int32 count) [0x00000] in <filename unknown>:0
  at System.IO.FileStream.ReadInternal (System.Byte[] dest, Int32 offset, Int32 count) [0x00000] in <filename unknown>:0
  at System.IO.FileStream.Read (System.Byte[] array, Int32 offset, Int32 count) [0x00000] in <filename unknown>:0
  at System.Security.Cryptography.HashAlgorithm.ComputeHash (System.IO.Stream inputStream) [0x00000] in <filename unknown>:0
  at SgenBug.MainClass.ChecksumFile (System.Object cfg) [0x00000] in <filename unknown>:0
  at System.Threading.Thread.StartInternal () [0x00000] in <filename unknown>:0
[ERROR] FATAL UNHANDLED EXCEPTION: System.NullReferenceException: Object reference not set to an instance of an object
  at (wrapper managed-to-native) System.Buffer:BlockCopyInternal (System.Array,int,System.Array,int,int)
  at System.IO.FileStream.ReadSegment (System.Byte[] dest, Int32 dest_offset, Int32 count) [0x00000] in <filename unknown>:0
  at System.IO.FileStream.ReadInternal (System.Byte[] dest, Int32 offset, Int32 count) [0x00000] in <filename unknown>:0
  at System.IO.FileStream.Read (System.Byte[] array, Int32 offset, Int32 count) [0x00000] in <filename unknown>:0
  at System.Security.Cryptography.HashAlgorithm.ComputeHash (System.IO.Stream inputStream) [0x00000] in <filename unknown>:0
  at SgenBug.MainClass.ChecksumFile (System.Object cfg) [0x00000] in <filename unknown>:0
  at System.Threading.Thread.StartInternal () [0x00000] in <filename unknown>:0
warning: Unable to fetch general register.
PC register is not available
  Id   Target Id         Frame
* 1    process 15221     PC not available
Quitting: ptrace: No such process.


Variation 3:
============
syn-bamboo-arm>TMP=/volume1/@tests/tmp mono-sgen SgenBug.exe
Starting Test

Stacktrace:

  at <unknown> <0xffffffff>
  at (wrapper managed-to-native) System.Buffer.BlockCopyInternal (System.Array,int,System.Array,int,int) <0xffffffff>
  at System.IO.FileStream.ReadSegment (byte[],int,int) <0x00073>
  at System.IO.FileStream.ReadInternal (byte[],int,int) <0x00237>
  at System.IO.FileStream.Read (byte[],int,int) <0x00107>
  at System.Security.Cryptography.HashAlgorithm.ComputeHash (System.IO.Stream) <0x0008b>
  at SgenBug.MainClass.ChecksumFile (object) <0x00123>
  at System.Threading.Thread.StartInternal () <0x000cb>
  at (wrapper runtime-invoke) object.runtime_invoke_void__this__ (object,intptr,intptr,intptr) <0xffffffff>

Native stacktrace:

Debug info from gdb:
(too long. attached)

Observations / Questions:
=========================
 * This only happens with mono-sgen. mono-boehm seems to be OK.
 * I've only seen this on the ARM platform.
 * If I use a MemoryStream instead of a FileStream it seems to work fine.
 * What keeps the managed buffer pinned while FileStream.ReadInternal does an icall to MonoIO.Read? Could the GC be moving the buffer underneath MonoIO.Read?

The test case fails every time for me on this ARM device. If you are unable to reproduce it, I can arrange access to the device. If you want more debugging information, let me know.
Comment 1 Bassam 2013-10-08 04:55:59 UTC
Created attachment 5085
gdb output
Comment 2 Bassam 2013-10-10 14:54:41 UTC
I was able to reproduce this issue with boehm as well, so this is not specific to sgen as I had initially thought.
Comment 3 Mark Probst 2013-10-10 14:56:16 UTC
How does it manifest with Boehm?
Comment 4 Bassam 2013-10-11 13:42:32 UTC
It's harder to reproduce with the isolated test case attached, but I did get it to happen in our application with Boehm. It manifests in a similar way:

Stacktrace:

at <unknown> <0xffffffff>
at (wrapper managed-to-native) System.Buffer.BlockCopyInternal (System.Array,int,System.Array,int,int) <IL 0x00026, 0xffffffff>
at System.IO.FileStream.ReadSegment (byte[],int,int) [0x0001c] in /root/build-thirdparty/source/mono-mcs/mcs/class/corlib/System.IO/FileStream.cs:978
at System.IO.FileStream.ReadInternal (byte[],int,int) [0x00058] in /root/build-thirdparty/source/mono-mcs/mcs/class/corlib/System.IO/FileStream.cs:547
at System.IO.FileStream.Read (byte[],int,int) [0x000be] in /root/build-thirdparty/source/mono-mcs/mcs/class/corlib/System.IO/FileStream.cs:520
at System.Security.Cryptography.HashAlgorithm.ComputeHash (System.IO.Stream) [0x0003d] in /root/build-thirdparty/source/mono-mcs/mcs/class/corlib/System.Security.Cryptography/HashAlgorithm.cs:106
at Symform.Core.Crypto.CryptoUtil.GetChecksumSHA1 (System.IO.Stream,long) [0x00009] in c:\symform.trunk\source\core\Symform.Core\Crypto\CryptoUtil.cs:386
at UnitTests.Core.Crypto.CryptoUtilTests.ChecksumFile (object) [0x00023] in c:\symform.trunk\source\core\UnitTests.Core\Crypto\CryptoUtilTests.cs:569
at System.Threading.Thread.StartInternal () [0x0002b] in /root/build-thirdparty/source/mono-mcs/mcs/class/corlib/System.Threading/Thread.cs:682
at (wrapper runtime-invoke) object.runtime_invoke_void__this__ (object,intptr,intptr,intptr) <IL 0x0004e, 0xffffffff>

Native stacktrace:


Debug info from gdb:

(attached)
Comment 5 Bassam 2013-10-11 13:43:27 UTC
Created attachment 5131
gdb output from crash with/ boehm
Comment 6 Mark Probst 2013-10-11 18:03:08 UTC
Is it possible that you're using the same FileStream in multiple threads?  It looks like this is some buffer/array overflow.
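For readers following the overflow theory: the assertion in the title appears to check that the first word of freshly handed-out nursery memory is still zero. Assuming that reading is right, a write that runs past the end of one object could make a later allocation trip the check. The sketch below is a toy bump-pointer allocator in C that models only this idea. `ToyNursery`, `toy_alloc`, and `toy_demo_overflow` are hypothetical names invented for illustration; none of this is Mono's actual code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy model only: NOT Mono's real nursery or allocator. */
#define NURSERY_WORDS 64

typedef void **ToyObject;              /* points at an object's header word */

typedef struct {
    void  *words[NURSERY_WORDS];       /* free space is expected to be zero */
    size_t next;                       /* bump pointer, counted in words */
} ToyNursery;

typedef enum { TOY_OK, TOY_OUT_OF_SPACE, TOY_NOT_ZEROED } ToyStatus;

static void toy_nursery_init(ToyNursery *n)
{
    assert(n != NULL);
    memset(n->words, 0, sizeof n->words);
    n->next = 0;
}

/* Bump-allocate `nwords` words and store `header` in the first one.
 * TOY_NOT_ZEROED stands in for a failed `*p == NULL` check: the memory
 * about to be handed out was already written by someone else. */
static ToyStatus toy_alloc(ToyNursery *n, size_t nwords, void *header,
                           ToyObject *out)
{
    assert(n != NULL && out != NULL);
    if (nwords == 0 || nwords > NURSERY_WORDS - n->next)
        return TOY_OUT_OF_SPACE;

    void **p = &n->words[n->next];
    if (*p != NULL)
        return TOY_NOT_ZEROED;

    n->next += nwords;
    *p = header;
    *out = p;
    return TOY_OK;
}

/* Allocates a 4-word object, writes one word past its end, then
 * allocates again and returns the status of that second allocation. */
static ToyStatus toy_demo_overflow(void)
{
    static int header_tag;             /* stand-in for a vtable pointer */
    ToyNursery n;
    ToyObject a = NULL, b = NULL;

    toy_nursery_init(&n);
    if (toy_alloc(&n, 4, &header_tag, &a) != TOY_OK)
        return TOY_OUT_OF_SPACE;

    /* The modelled bug: index 4 is the first word AFTER the 4-word object.
     * It is still inside n.words, so the write itself goes unnoticed. */
    a[4] = &header_tag;

    return toy_alloc(&n, 4, &header_tag, &b);
}
```

In this model `toy_demo_overflow()` returns `TOY_NOT_ZEROED`: the overflowing write is silent, and the failure only shows up at the next allocation, possibly on a different code path than the one that did the damage. That would be consistent with the same underlying problem surfacing both as the assertion and as crashes elsewhere, though the sketch does not establish that this is what happens here.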
Comment 7 Bassam 2013-10-11 18:16:28 UTC
With the attached test case I don't think that's possible. A separate stream is used for each thread. Note, however, that all streams (across all threads) are opened on the same file.
Comment 8 Bassam 2013-10-14 15:47:32 UTC
I'm all out of ideas for how to work around this issue. Any pointers would be appreciated. I tried debugging this by walking the nursery, as Rodrigo suggested, but did not make much progress.
Comment 9 Bassam 2013-11-12 16:08:15 UTC
Hey guys, any update on this one? Does the test case work for you? Anything else I can provide?
Comment 12 Daniel Nauck 2013-11-14 08:04:12 UTC
I have a similar issue with the same error message on SLES 11 SP2 x64 with Mono version 3.2.5 (master/79e0856f Thu Nov  7 21:58:17 CET 2013):



* Assertion at sgen-alloc.c:425, condition `*p == NULL' not met

Stacktrace:

  at <unknown> <0xffffffff>
  at System.Exception.ToString () <0x00169>
  at ServiceStack.Logging.NLogger.NLogLogger.Error (object) <0x00041>
  at ServiceStack.OrmLite.OrmLiteWriteExtensions.PopulateWithSqlReader<T> (T,System.Data.IDataReader,ServiceStack.OrmLite.FieldDefinition[],System.Collections.Generic.Dictionary`2<string, int>) <0x001d4>
  at ServiceStack.OrmLite.OrmLiteUtilExtensions.ConvertToList<T> (System.Data.IDataReader) <0x000df>
  at ServiceStack.OrmLite.OrmLiteReadExtensions.Select<T> (System.Data.IDbCommand,string,object[]) <0x0008b>
  at ServiceStack.OrmLite.OrmLiteReadConnectionExtensions/<>c__DisplayClass2`1.<Select>b__1 (System.Data.IDbCommand) <0x0004b>
  at ServiceStack.OrmLite.ReadConnectionExtensions.Exec<T> (System.Data.IDbConnection,System.Func`2<System.Data.IDbCommand, T>) <0x0014f>
  at ServiceStack.OrmLite.OrmLiteReadConnectionExtensions.Select<T> (System.Data.IDbConnection,string,object[]) <0x00143>
  at Fii.Core.Inventory.Import.ReadSystemFromVdbOperation/<Execute>d__0.MoveNext () <0x000fb>
  at Rhino.Etl.Core.Enumerables.SingleRowEventRaisingEnumerator.MoveNext () <0x00022>
  at Rhino.Etl.Core.Enumerables.EventRaisingEnumerator.MoveNext () <0x00017>
  at Rhino.Etl.Core.Pipelines.ThreadPoolPipelineExecuter/<>c__DisplayClass1.<DecorateEnumerableForExecution>b__0 (object) <0x0010e>
  at (wrapper runtime-invoke) <Module>.runtime_invoke_void__this___object (object,intptr,intptr,intptr) <0xffffffff>

Native stacktrace:

    /usd/as67154a/soft/mono/bin//mono() [0x4aaf2f]
    /lib64/libpthread.so.0(+0xf7c0) [0x7ffff754a7c0]
    /lib64/libc.so.6(gsignal+0x35) [0x7ffff71f6b35]
    /lib64/libc.so.6(abort+0x181) [0x7ffff71f8111]
    /usd/as67154a/soft/mono/bin//mono() [0x635b45]
    /usd/as67154a/soft/mono/bin//mono() [0x635bf7]
    /usd/as67154a/soft/mono/bin//mono() [0x5f1c33]
    /usd/as67154a/soft/mono/bin//mono() [0x5f20cd]
    /usd/as67154a/soft/mono/bin//mono(mono_string_new_size+0x4b) [0x5b1f9b]
    /usd/as67154a/soft/mono/bin//mono(mono_string_new_utf16+0x1f) [0x5b201f]
    /usd/as67154a/soft/mono/bin//mono() [0x5b20f1]
    /usd/as67154a/soft/mono/bin//mono() [0x456b86]
    /usd/as67154a/soft/mono/bin//mono() [0x41ef0a]
    /usd/as67154a/soft/mono/bin//mono() [0x420e23]
    /usd/as67154a/soft/mono/bin//mono() [0x42167b]
    /usd/as67154a/soft/mono/bin//mono() [0x4aed7c]
    /usd/as67154a/soft/mono/bin//mono() [0x4af738]
    [0x40003e06]

Debug info from gdb:


=================================================================
Got a SIGABRT while executing native code. This usually indicates
a fatal error in the mono runtime or one of the native libraries 
used by your application.
=================================================================

Abort
Comment 13 Bassam 2014-01-16 17:57:56 UTC
Any update on this? We still see this in our product and are not sure what to do about it. It looks like others are seeing this too. Does the test case I submitted help?
Comment 14 Rodrigo Kumpera 2017-10-14 00:12:43 UTC
Thank you for your report!

It appears you are running a very old version of Mono. Could you please update to a recent version and try to reproduce the issue again?

If the issue persists, please include the version information and change the bug status to NEW.
