Bug 8320

Summary: Runtime metadata bug with RX.
Product: [Mono] Runtime Reporter: Atsushi Eno <atsushi>
Component: JITAssignee: Zoltan Varga <vargaz>
Severity: normal CC: lupus, miguel, mono-bugs+mono, mono-bugs+runtime, rolf, sebastien, vargaz
Priority: ---    
Version: unspecified   
Target Milestone: ---   
Hardware: PC   
OS: Linux   
Tags: Is this bug a regression?: ---
Last known good build:
Attachments: repro file set
PEVerify results for all those assemblies.
test assembly

Description Atsushi Eno 2012-11-09 13:17:24 UTC
Created attachment 2893 [details]
repro file set

This is not well analyzed bug that reproduces for MS rx and nunit tests. nunit results in 

* Steps to repro

The repro steps are tied to mono/mcs build. On mono topdir, do as follows (there is docs/reactive-extension-bundle.txt which described the same):

- check out Rx sources from http://rx.codeplex.com:

	$ cd external
	$ git clone https://git01.codeplex.com/rx
	$ cd ..

  Note that it will *fail* on Linux! codeplex.codeplex.com/workitem/26133

- expand the attached rx-mono-changes.tar.bz2:

	$ tar jxvf rx-mono-changes.tar.bz2

- Apply changes to MS rx repo. It is very small:

	$ cd external/rx/Rx.NET
	$ patch -i mono.patch -p2
	$ cd Tests.System.Reactive
	$ csharp replacer.sh
	$ cd ../../..

- Apply changes to mcs/class/Makefile:

	$ cd mcs/class
	$ patch -i add-rx-libs.patch
	$ cd ../..


now, go to mcs/class/Mono.Reactive.Testing and run "make RUNTIME_FLAGS=--verify-all run-test" which runs nunit-console.exe against Mono.Reactive.Testing_test.dll.

* Actual result

NUnit version 2.4.8
Copyright (C) 2002-2007 Charlie Poole.
Copyright (C) 2002-2004 James W. Newkirk, Michael C. Two, Alexei A. Vorontsov.
Copyright (C) 2000-2002 Philip Craig.
All Rights Reserved.

Runtime Environment - 
   OS Version: Unix
  CLR Version: 4.0.30319.17020 ( 3.0.1 (master/7b1d9f1 2012年 11月  8日 木曜日 22:07:48 JST) )

Could not load image /svn/mono/mcs/class/lib/net_4_5/System.Reactive.Linq.dll due to Tables data require 358144 bytes but the only 341040 are available in the #~ stream
Could not load image /svn/mono/mcs/class/lib/net_4_5/System.Reactive.Linq.dll due to Tables data require 358144 bytes but the only 341040 are available in the #~ stream
Could not load image /svn/mono/mcs/class/lib/net_4_5/System.Reactive.Linq.dll due to Tables data require 358144 bytes but the only 341040 are available in the #~ stream
Could not load file or assembly 'Mono.Reactive.Testing_test_net_4_5, Version=, Culture=neutral, PublicKeyToken=null' or one of its dependencies. The system cannot find the file specified.

* Expected result:

no runtime crash (I have no idea how tests will pass/fail after all).

This is assumed as compiler or ikvm reflection issue.
Comment 1 Atsushi Eno 2012-11-09 13:25:39 UTC
I put a compiled binaries at https://dl.dropbox.com/u/493047/tmp/rx-mono-assemblies.tar.bz2 (may remove once it gets fixed)
Comment 2 Atsushi Eno 2012-11-09 14:03:23 UTC
Created attachment 2894 [details]
PEVerify results for all those assemblies.
Comment 3 Marek Safar 2012-11-11 05:48:58 UTC
This is runtime bug. Following patch fixes the issue for me

diff --git a/mono/metadata/metadata.c b/mono/metadata/metadata.c
index bd07b1f..b8f2965 100644
--- a/mono/metadata/metadata.c
+++ b/mono/metadata/metadata.c
@@ -757,7 +757,7 @@ mono_metadata_compute_size (MonoImage *meta, int tableindex, guint32 *result_bit
                        n = MAX (n, meta->tables [MONO_TABLE_METHOD].rows);
                        n = MAX (n, meta->tables [MONO_TABLE_MODULEREF].rows);
                        n = MAX (n, meta->tables [MONO_TABLE_TYPESPEC].rows);
-                       n = MAX (n, meta->tables [MONO_TABLE_MEMBERREF].rows);
+                       //n = MAX (n, meta->tables [MONO_TABLE_MEMBERREF].rows);
                        /* 3 bits to encode */
                        field_size = rtsize (n, 16 - 3);
Comment 4 Marek Safar 2012-11-11 05:51:04 UTC
Created attachment 2898 [details]
test assembly
Comment 5 Paolo Molaro 2012-11-13 06:23:00 UTC
Confirmed the change from Marek is the correct fix and pushed the change to master.
Comment 6 Miguel de Icaza [MSFT] 2012-11-13 09:25:12 UTC
Since we are still shipping 2.10, can we ge tthis into 2-10 and mobile-master?
Comment 7 Sebastien Pouliot 2012-11-15 08:18:06 UTC
The MT bots have been experiencing some random problems since that patch got applied to mobile-master.

[19:44:08] <spouliot> doing a git clean, maybe somethings was stale after make clean
[19:44:49] <vargaz> spouliot: try reverting  3c1a338fdd23196e39fee55f8b72eabed37be2e3
[19:45:20] <spouliot> vargaz: could you dupe the issue ?
[19:45:31] <vargaz> spouliot: dupe ?
[19:45:37] <spouliot> duplicate
[19:46:22] <vargaz> no, but that looks like the only commit from that time interval which could cause these problems. if you can repro it, you can try reverting that and see if that fixes it.
[19:46:56] <vargaz> that commit might cause mcs to create broken assemblies.

The bots had signing issues, between last Friday and Tuesday, so they could not do any AOT compilation. Those revisions are still being rebuilt to pinpoint the commit that caused this. The random problems occurs on most (but not all builds) since that commit was applied. No failure is (yet) found earlier than this one. 

I'll re-open the bug once the few last revisions are rebuilt *and* we try reverted this fix.
Comment 8 Sebastien Pouliot 2012-11-15 09:56:15 UTC
Two other (pre-fix) revisions were rebuilt by the bots without any issues (2 other are yet to be rebuilt). So this fix still mark the known start of the random crashes.

Also most (but not all) of the random crashes occurs on the dontlink.app - which is an app that the linker did NOT process, so the metadata was not re-written by Cecil (the original metadta, by the runtime, is being used by the AOT compiler).

Any useful data to get out of this before I trying reverting it from mobile-master ?
Comment 9 Sebastien Pouliot 2012-11-15 10:54:06 UTC
reverted in mobile-master - ios6 branch updated to confirm the "fix"
Comment 10 Miguel de Icaza [MSFT] 2012-11-15 11:36:50 UTC
Perhaps we should revert from master as well, it could be breaking code there as well.
Comment 11 Sebastien Pouliot 2012-11-15 13:03:19 UTC
A newer build (ios6) still has this issue after the revert (so I won't revert this from other branches). Looking at the next candidate...