Bug 32122 - SIGSEGV when using multiple threads and BlockingCollection<T> with ConcurrentQueue<T> as data store
Status: CONFIRMED
Alias: None
Product: Runtime
Classification: Mono
Component: JIT
Version: 4.0.0
Hardware: PC Linux
Importance: --- normal
Target Milestone: ---
Assignee: Bugzilla
 
Reported: 2015-07-19 18:26 UTC by Jay
Modified: 2016-08-17 01:26 UTC

Description Jay 2015-07-19 18:26:19 UTC
I'm running a multithreaded producer-consumer system that uses BlockingCollection<StrongBox<String>> with ConcurrentQueue<StrongBox<String>> as its backing store. The system rapidly increases its thread count on launch and fails within a minute.

The same code base works on Windows without any errors. I also compiled the code against Mono to check for incompatibilities, and it compiles cleanly.

Here's the native stack trace:

        mono() [0x4b20bc]
        mono() [0x5086ee]
        mono() [0x428f7d]
        /lib/x86_64-linux-gnu/libpthread.so.0(+0x10340) [0x7fbb630df340]
        mono() [0x5ed74b]
        mono() [0x5ed649]
        mono() [0x582efd]
        mono() [0x5ca536]
        mono() [0x5d00db]
        mono() [0x5d066a]
        mono() [0x5e6d60]
        mono() [0x5e6f4b]
        [0x40e8606a]

In some runs, I see the following exception in the stack trace.

System.NullReferenceException: Object reference not set to an instance of an object
  at (wrapper stelemref) object:virt_stelemref_class (intptr,object)
  at System.Collections.Concurrent.ConcurrentQueue`1+Segment[System.Runtime.CompilerServices.StrongBox`1[System.String]].TryAppend (System.Runtime.CompilerServices.StrongBox`1 value) [0x00000] in <filename unknown>:0
  at System.Collections.Concurrent.ConcurrentQueue`1[System.Runtime.CompilerServices.StrongBox`1[System.String]].Enqueue (System.Runtime.CompilerServices.StrongBox`1 item) [0x00000] in <filename unknown>:0
  at System.Collections.Concurrent.ConcurrentQueue`1[System.Runtime.CompilerServices.StrongBox`1[System.String]].System.Collections.Concurrent.IProducerConsumerCollection<T>.TryAdd (System.Runtime.CompilerServices.StrongBox`1 item) [0x00000] in <filename unknown>:0
  at System.Collections.Concurrent.BlockingCollection`1[System.Runtime.CompilerServices.StrongBox`1[System.String]].TryAddWithNoTimeValidation (System.Runtime.CompilerServices.StrongBox`1 item, Int32 millisecondsTimeout, CancellationToken cancellationToken) [0x00000] in <filename unknown>:0
Comment 1 Jay 2015-07-19 18:53:42 UTC
Ran in debug mode with some global error logging.

System.NullReferenceException: Object reference not set to an instance of an object
  at System.Threading.Tasks.Task.FinishContinuations () [0x00000] in <filename unknown>:0
  at System.Threading.Tasks.Task.FinishStageThree () [0x00000] in <filename unknown>:0
  at System.Threading.Tasks.Task.FinishStageTwo () [0x00000] in <filename unknown>:0
  at System.Threading.Tasks.Task.Finish (Boolean bUserDelegateExecuted) [0x00000] in <filename unknown>:0
  at System.Threading.Tasks.Task.ExecuteWithThreadLocal (System.Threading.Tasks.Task& currentTaskSlot) [0x00000] in <filename unknown>:0
  at System.Threading.Tasks.Task.ExecuteEntry (Boolean bPreventDoubleExecution) [0x00000] in <filename unknown>:0
  at System.Threading.Tasks.Task.System.Threading.IThreadPoolWorkItem.ExecuteWorkItem () [0x00000] in <filename unknown>:0
  at System.Threading.ThreadPool.<UnsafeQueueCustomWorkItem>m__0 (System.Object obj) [0x00000] in <filename unknown>:0
Stacktrace:


Native stacktrace:

        mono() [0x4b20bc]
        mono() [0x5086ee]
        mono() [0x428f7d]
        /lib/x86_64-linux-gnu/libpthread.so.0(+0x10340) [0x7f99cecb2340]
        mono() [0x5ed74b]
        mono() [0x5ed649]
        mono() [0x582efd]
        mono() [0x5ca536]
        mono() [0x5d00db]
        mono() [0x5d066a]
        mono() [0x5e6c98]
        mono() [0x5e6e4b]
        [0x40bfae6a]

Debug info from gdb:


=================================================================
Got a SIGSEGV while executing native code. This usually indicates
a fatal error in the mono runtime or one of the native libraries
used by your application.
=================================================================
Comment 2 Marek Safar 2015-07-20 04:24:12 UTC
Could you attach the code that causes the crash?
Comment 3 Jay 2015-07-20 07:32:43 UTC
I've recreated the customer code in a simpler form. Here's the entire program:

using System;

namespace MonoTest
{
    class Program
    {
        static System.Collections.Concurrent.ConcurrentQueue<System.Runtime.CompilerServices.StrongBox<String>> dataStore;
        static System.Collections.Concurrent.BlockingCollection<System.Runtime.CompilerServices.StrongBox<String>> collection;

        static void Main(string[] args)
        {
            dataStore = new System.Collections.Concurrent.ConcurrentQueue<System.Runtime.CompilerServices.StrongBox<String>>();
            collection = new System.Collections.Concurrent.BlockingCollection<System.Runtime.CompilerServices.StrongBox<String>>(dataStore);

            Int64 totalItems = 0;

            var c1 = System.Threading.Tasks.Task.Factory.StartNew(() =>
            {
                Console.WriteLine("Starting Serial Consumer");
                StartSerialConsumer();
            });

            var p1 = System.Threading.Tasks.Task.Factory.StartNew(() =>
            {
                Console.WriteLine("Starting Producer 1");
                var items = StartProducer();
                System.Threading.Interlocked.Add(ref totalItems, items);
            });

            var p2 = System.Threading.Tasks.Task.Factory.StartNew(() =>
            {
                Console.WriteLine("Starting Producer 2");
                var items = StartProducer();
                System.Threading.Interlocked.Add(ref totalItems, items);
            });

            // doing some work on this thread
            System.Threading.Tasks.Task.Delay(1000).Wait();

            Console.WriteLine("Waiting for producers to finish enqueue");
            System.Threading.Tasks.Task.WaitAll(p1, p2);

            Console.WriteLine("Total items added to queue: {0}", totalItems);
            collection.CompleteAdding();

            Console.WriteLine("Waiting for serial consumer to exit");
            Console.WriteLine("Total items currently in queue: {0}", collection.Count);
            c1.Wait();

            StartParallelConsumer();
        }

        static void StartSerialConsumer()
        {
            do
            {
                Get10Items();
            } while (!collection.IsAddingCompleted);
        }

        static void Get10Items()
        {
            System.Collections.Generic.List<String> items = new System.Collections.Generic.List<String>();
            for (int i = 0; i < 10; i++)
            {
                System.Runtime.CompilerServices.StrongBox<String> item = default(System.Runtime.CompilerServices.StrongBox<String>);
                if (collection.TryTake(out item))
                {
                    if (item != null)
                    {
                        items.Add(item.Value);
                        item.Value = null;
                        item = null;
                    }
                }
            }

            if (items.Count > 0)
            {
                // Mimic web call
                System.Threading.Tasks.Task.Delay(1000).Wait();
            }
            else
            {
                Console.WriteLine("No data. Sleeping for 10 seconds");
                System.Threading.Tasks.Task.Delay(TimeSpan.FromSeconds(10)).Wait();
            }
        }

        static void StartParallelConsumer()
        {
            System.Threading.Tasks.Parallel.For(0, Environment.ProcessorCount, (i, loopState) =>
            {
                Console.WriteLine("Starting parallel consumer for core " + (i + 1).ToString());
                do
                {
                    Get10Items();
                } while (!collection.IsCompleted);
            });
        }

        static Int32 StartProducer()
        {
            Int32 iteration = 10000;
            for (int i = 0; i < iteration; i++)
            {
                // Mimic web calls
                System.Threading.Tasks.Task.Delay(100).Wait();

                var obj = new
                {
                    ID = System.IO.Path.GetFileNameWithoutExtension(System.IO.Path.GetRandomFileName()),
                    VALUE = Guid.NewGuid().ToString(),
                    SOURCE = new Random(i).Next(2000000).ToString(),
                    LOCATION = i
                };
                var data = Newtonsoft.Json.JsonConvert.SerializeObject(obj);
                var wrap = new System.Runtime.CompilerServices.StrongBox<String>(data);

                Boolean isAdded = false;
                Int32 attempts = 0;
                do
                {
                    // Count every attempt, not only successful ones, so the loop
                    // cannot spin forever if TryAdd keeps failing.
                    attempts++;
                    isAdded = collection.TryAdd(wrap);
                } while (!isAdded && attempts < 5);
            }
            return iteration;
        }

    }
}




This is the log from the execution...

ubuntu@ip-10-179-207-167:/opt/mt$ sudo mono --server MonoTest.exe
Starting Serial Consumer
Starting Producer 1
No data. Sleeping for 10 seconds
Starting Producer 2
Waiting for producers to finish enqueue
Stacktrace:


Native stacktrace:

        mono() [0x4b20bc]
        mono() [0x5086ee]
        mono() [0x428f7d]
        /lib/x86_64-linux-gnu/libpthread.so.0(+0x10340) [0x7fa342207340]

Debug info from gdb:


=================================================================
Got a SIGSEGV while executing native code. This usually indicates
a fatal error in the mono runtime or one of the native libraries
used by your application.
=================================================================
Comment 4 Jay 2015-07-20 07:36:21 UTC
For small data, the code executes without any errors. I tried 10, 100, and 1000 iterations, and all of them worked. At higher iteration counts, the code fails.

On .NET Framework 4.5.1 on Windows, it works without any errors even at a million iterations.

It looks like a memory allocation and teardown issue. I also tried the Boehm GC and got the same segmentation fault.
Comment 5 Jay 2015-07-20 12:29:48 UTC
I've verified the issue with the submitted code. It's a basic multithreaded producer-consumer pattern using BlockingCollection.
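
For anyone trying to narrow this down, the pattern can be reduced to roughly the sketch below. It keeps only the pieces named in this report (BlockingCollection<StrongBox<String>> over a ConcurrentQueue, two producers, one consumer) and drops the delays and the Json.NET dependency. The class name MinimalRepro and the itemsPerProducer count are hypothetical, chosen for illustration, and this reduced version has not been confirmed to trigger the crash.

```csharp
using System;
using System.Collections.Concurrent;
using System.Runtime.CompilerServices;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical reduced version of the program in comment 3.
static class MinimalRepro
{
    static void Main()
    {
        var queue = new ConcurrentQueue<StrongBox<string>>();
        using (var collection = new BlockingCollection<StrongBox<string>>(queue))
        {
            const int producers = 2;
            const int itemsPerProducer = 100000; // assumed value, not from the original report
            long taken = 0;

            // Single consumer: GetConsumingEnumerable blocks until items arrive
            // and ends once CompleteAdding has been called and the queue is empty.
            var consumer = Task.Factory.StartNew(() =>
            {
                foreach (var box in collection.GetConsumingEnumerable())
                {
                    if (box != null)
                        Interlocked.Increment(ref taken);
                }
            }, TaskCreationOptions.LongRunning);

            // Producers enqueue concurrently into the same collection.
            var producerTasks = new Task[producers];
            for (int p = 0; p < producers; p++)
            {
                producerTasks[p] = Task.Factory.StartNew(() =>
                {
                    for (int i = 0; i < itemsPerProducer; i++)
                        collection.Add(new StrongBox<string>(Guid.NewGuid().ToString()));
                }, TaskCreationOptions.LongRunning);
            }

            Task.WaitAll(producerTasks);
            collection.CompleteAdding();
            consumer.Wait();

            Console.WriteLine("Added {0}, taken {1}",
                (long)producers * itemsPerProducer, Interlocked.Read(ref taken));
        }
    }
}
```

If the two counts match and the process exits cleanly, this reduced shape is probably not enough on its own, and the Task.Delay calls or the TryTake polling from the full program may be needed to hit the failure.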
Comment 6 Alexander Kyte 2015-07-22 19:53:05 UTC
So I get a hang, but not a segfault.

$ mono Repro.exe
Starting Producer 1
Starting Producer 2
Starting Serial Consumer
No data. Sleeping for 10 seconds
Waiting for producers to finish enqueue


^C

$ mono --version
Mono JIT compiler version 4.3.0 (complex_step/528a30b Wed Jul 22 19:30:28 EDT 2015)
Copyright (C) 2002-2014 Novell, Inc, Xamarin Inc and Contributors. www.mono-project.com
	TLS:           normal
	SIGSEGV:       altstack
	Notification:  kqueue
	Architecture:  amd64
	Disabled:      none
	Misc:          softdebug
	LLVM:          supported, not enabled.
	GC:            sgen
