Bug 18880 - Xamarin.iOS startup performance issue
Summary: Xamarin.iOS startup performance issue
Alias: None
Product: iOS
Classification: Xamarin
Component: XI runtime ()
Version: 7.2.0
Hardware: PC Windows
Importance: Normal normal
Target Milestone: 7.2.6
Assignee: Bugzilla
Depends on:
Reported: 2014-04-08 21:29 UTC by Jerome Laban
Modified: 2014-07-04 11:50 UTC
CC List: 7 users

Is this bug a regression?: ---
Last known good build:

A synthetic benchmark (47.41 KB, application/x-zip-compressed)
2014-04-08 21:29 UTC, Jerome Laban

Description Jerome Laban 2014-04-08 21:29:32 UTC
Created attachment 6534
A synthetic benchmark

The cold-start of an application takes a significant amount of time on lower-end devices, like the iPhone 4 and iPad 2.

The profiling sessions show that the time is spent in generic_trampoline_delegate, a method that will build the trampolines for generic method calls.

I’ve attached a synthetic benchmark that reproduces the issue. It contains thousands of generic methods, used in a way similar to how our application uses them.
On an iPad 2, the first call of the chain takes about 1150ms, whereas the second call takes about 4ms (four). The caching mechanism of the mini runtime is working properly, as the time drops significantly.
However, during the first calls, the time taken to resolve the methods is significant, and seems to increase linearly with the number of generic types present in the application domain.
As a tentative performance improvement, parallelizing does not seem to help: when calling the same code on two threads, the first call takes 2330ms, whereas on the second call both threads take 4ms (four).
Note that the ratio between cold and warm time is *very* different on the latest Apple devices, like the 5S, where the cold duration drops by a factor of 4.
This has a great impact on the perceived performance of the app for the consumer, even though the performance is great once the app is warmed up.
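
The cold/warm pattern described above can be sketched in a few lines. This is only an analogy written in Python, not Mono's trampoline code and not the attached benchmark: the `LazyResolver` class, the `run_chain` helper, and the `Method{i}` names are all hypothetical. It counts slow-path resolutions instead of measuring time, so the result does not depend on the hardware.

```python
import threading


class LazyResolver:
    """Resolves each 'method' once on first use, then serves it from a cache."""

    def __init__(self):
        self._cache = {}
        self._lock = threading.Lock()
        self.resolutions = 0  # how many times the slow path ran

    def _resolve(self, name):
        # Stand-in for the expensive one-time setup work (hypothetical).
        self.resolutions += 1
        return lambda arg: (name, arg)

    def call(self, name, arg):
        # The whole lookup-or-resolve step happens under a single lock.
        with self._lock:
            target = self._cache.get(name)
            if target is None:
                target = self._resolve(name)
                self._cache[name] = target
        return target(arg)


def run_chain(resolver, count):
    """Call `count` distinct methods, mimicking a chain of generic calls."""
    for i in range(count):
        resolver.call(f"Method{i}", i)


resolver = LazyResolver()
run_chain(resolver, 1000)   # cold pass: every call takes the slow path
cold = resolver.resolutions
run_chain(resolver, 1000)   # warm pass: every call is a cache hit
warm = resolver.resolutions - cold
print(cold, warm)
```

In this sketch the cold pass pays one resolution per distinct method, so its cost grows with the number of methods, while the warm pass pays none. Because the resolution runs while the lock is held, a second thread calling at the same time would have to wait for it, which is one possible reason adding threads might not shorten a cold pass.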
Comment 1 Zoltan Varga 2014-04-08 22:00:55 UTC
Thanks for the testcase. 

Checked in a fix to mono master 078dc0321d53f9e161957656550fd10cc41db618/mono-3.4.0 0081c27e0d6473a83cc856abf67c4a42dc21b53d.

It improves the first run of the benchmark from 1.1s to 0.4s for me.
Comment 2 Jerome Laban 2014-04-09 11:10:33 UTC
Thanks Zoltan, that's quite an improvement :)

Would you know if that also improves the performance in multi-threaded scenarios?

Thank you,
Comment 3 Zoltan Varga 2014-04-09 12:27:19 UTC
It probably does.
Comment 4 Jerome Laban 2014-04-09 14:56:28 UTC
I'm asking because of this: 


Where there is contention when resolving the generic methods. The work being done inside the lock is pretty significant...
Comment 5 Miguel de Icaza [MSFT] 2014-04-09 15:54:25 UTC
This patch introduced a regression, Mono no longer bootstraps, see:

[1] https://github.com/mono/mono/commit/078dc0321d53f9e161957656550fd10cc41db618#commitcomment-5953321
Comment 6 Zoltan Varga 2014-04-09 15:59:28 UTC
The changes were reverted from master/3.4.0 for now.

@Jerome: Will look at reducing the work done inside the lock.
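
For illustration only, one common way to reduce the work done inside a lock is to run the expensive step outside it and hold the lock just long enough to look up or publish the result. The sketch below is in Python and is an assumption about the general technique, not a description of the actual Mono change; `NarrowLockResolver`, `resolve`, and `worker` are hypothetical names.

```python
import threading


class NarrowLockResolver:
    """Cache that holds its lock only for lookups and publication."""

    def __init__(self, resolve):
        self._resolve = resolve      # expensive function, must be idempotent
        self._cache = {}
        self._lock = threading.Lock()

    def get(self, key):
        with self._lock:             # short critical section: lookup only
            hit = self._cache.get(key)
        if hit is not None:
            return hit
        candidate = self._resolve(key)   # expensive work, done without the lock
        with self._lock:             # short critical section: publish
            # setdefault keeps the first published value if another thread won.
            return self._cache.setdefault(key, candidate)


def resolve(key):
    # Stand-in for the expensive resolution step (hypothetical).
    return {"key": key}


resolver = NarrowLockResolver(resolve)
results = [[] for _ in range(4)]


def worker(out):
    for k in range(100):
        out.append(resolver.get(k))


threads = [threading.Thread(target=worker, args=(out,)) for out in results]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Every thread must observe the same published object for each key.
consistent = all(results[0][k] is out[k] for out in results for k in range(100))
print(consistent)
```

The trade-off in this sketch is that two threads racing on the same key may both run `resolve`, and one result is discarded. That is only acceptable when resolution is idempotent and the duplicated work costs less than the waiting it avoids.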
Comment 7 Zoltan Varga 2014-04-11 23:17:10 UTC
Committed a fixed fix to mono master ea490c5486af6e1ce6ce8b1a117f1d99cf988df0. It will be in a future mt version after some testing.
Comment 8 Zoltan Varga 2014-04-17 12:28:14 UTC
The corresponding change on the 3.4.0 branch is 28145e01f42317e685ad1020a47ba746f164c28b.
Comment 9 Jerome Laban 2014-04-17 14:41:55 UTC
Using the same PoC, the run time is down from 1150ms to 268ms, same hardware.

Great improvement Zoltan, thanks!

Note that the behavior for multi-thread is vastly better, but still slower than the single-cpu test (2330ms down to 380ms).
Comment 10 Sebastien Pouliot 2014-06-18 21:14:04 UTC
This fix is part of the 7.2.6 release (in the alpha channel right now).
Comment 11 Mohit Kheterpal 2014-07-04 11:50:10 UTC
As per comment 9, this issue is fixed now, i.e. the run time is down from 1150ms to 268ms on the same hardware.

Hence closing this issue.