Bug 54093 - SecureChannelFailure nearly all the time when restoring nugets
Summary: SecureChannelFailure nearly all the time when restoring nugets
Status: VERIFIED FIXED
Alias: None
Product: Runtime
Classification: Mono
Component: io-layer
Version: master
Hardware: PC Mac OS
Importance: High critical
Target Milestone: 15.3
Assignee: Ludovic Henry
URL:
Depends on:
Blocks:
 
Reported: 2017-03-28 09:17 UTC by Alan McGovern
Modified: 2017-08-02 10:03 UTC
CC List: 8 users

See Also:
Tags: 2017-05
Is this bug a regression?: ---
Last known good build:



Description Alan McGovern 2017-03-28 09:17:01 UTC
The Xamarin Studio build is failing nearly all the time with this error:
https://gist.github.com/alanmcgovern/8bb1b78ebbf70a37415249ba386a00d8

It can be seen in the top few builds on the 'Sierra' group in this lane: https://wrench.internalx.com/Wrench/index.aspx?lane=monodevelop-mdaddins-decriptor-msbuild

The latest build on that lane is using Mono JIT compiler version 5.0.0.14 (2017-02/d15985c Mon Mar 27 13:52:01 EDT 2017)
Comment 1 Alan McGovern 2017-03-28 09:20:26 UTC
The 'Sierra' group is macos sierra bots, so this is a sierra failure.

We can't get results on El Capitan because the bots are broken again.
Comment 2 Martin Baulig 2017-03-28 18:22:33 UTC
There is nothing I can do about this - it's some obscure error on some bot.

And the lane that you linked doesn't even show the problem anywhere.

How do I reproduce this?

Looks like I have the exact same Mono from the 2017-02 lane and my nuget works fine.

This is my Mono:

$ mono --version
Mono JIT compiler version 5.0.0.14 (2017-02/d15985c Mon Mar 27 13:52:01 EDT 2017)
Copyright (C) 2002-2014 Novell, Inc, Xamarin Inc and Contributors. www.mono-project.com
	TLS:           normal
	SIGSEGV:       altstack
	Notification:  kqueue
	Architecture:  x86
	Disabled:      none
	Misc:          softdebug 
	LLVM:          yes(3.6.0svn-mono-master/8b1520c)
	GC:            sgen (concurrent by default)
Comment 3 Martin Baulig 2017-03-28 18:24:14 UTC
If anyone needs a nuget restore to try:

$ git clone git@github.com:xamarin/web-tests.git web-tests-stable
$ cd web-tests-stable
$ git checkout -f jenkins
$ nuget restore Xamarin.WebTests.iOS.sln

(the master branch should work as well)
Comment 4 Martin Baulig 2017-03-28 18:24:45 UTC
Can we try that on that bot after installing that Mono and see what happens?
Comment 5 Martin Baulig 2017-03-28 18:52:05 UTC
My Mac is macOS Sierra 10.12.3.
Comment 6 Andi McClure 2017-03-28 20:42:22 UTC
In discussions with akoeplinger and Jo Shields, I'm told this is a recurring error with parallel nuget restore when nuget 3 runs on Mono. It occurs with all three of legacytls, btls and appletls, and it is not obviously a TLS problem at all (that is, it could be a bug in our threadpool or socket implementation). Jo just tested and reports that he is able to reproduce this on 4.8.

The solution is to invoke nuget with -DisableParallelProcessing, which works around the issue. We should schedule time to fix the underlying issue, but not for 15.2.
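
For anyone applying the workaround in a build script, here is a minimal sketch. The function name restore_serial and the NUGET override variable are hypothetical and only for illustration; the restore command and the -DisableParallelProcessing flag are the ones named above.

```shell
# Hypothetical wrapper: restore a solution with parallel processing disabled.
# NUGET is an assumed override for the nuget executable (defaults to "nuget").
restore_serial() {
    "${NUGET:-nuget}" restore "$1" -DisableParallelProcessing
}
```

It would be called as `restore_serial Xamarin.WebTests.iOS.sln`. This only avoids the parallel code path; it does not fix the underlying Mono issue.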
Comment 7 Andi McClure 2017-03-28 21:31:43 UTC
Assigning to David to fix on the XS build script side
Comment 8 Andi McClure 2017-03-28 21:32:08 UTC
[status change]
Comment 9 Lluis Sanchez 2017-03-28 22:09:21 UTC
Reassigning since this bug needs to be fixed in Mono, independently of the workaround we can use in XS. Filed bug #54154 to keep track of the application of the workaround.
Comment 10 Andi McClure 2017-03-31 19:08:35 UTC
The underlying issue here is not a 15.2 bug and *probably* is not a Martin bug
Comment 11 Andi McClure 2017-03-31 20:36:59 UTC
Ludovic, can you look at this one? I think this is most likely the threadpool or sockets acting up. If you are able to establish that TLS is at fault, please assign it back to Martin.
Comment 12 Rodrigo Kumpera 2017-06-01 16:05:55 UTC
Parallel nuget restore was fixed, marking this one too.
Comment 13 Aman Dharwal 2017-06-30 07:14:22 UTC
Greetings Rodrigo,

Bug #54154 is in the verified state. Should we mark this defect as Verified as well, as per the above comment?
Or do we need to verify this defect in some other build?

Kindly advise on how we should proceed on this.

Thanks!!!
