Created attachment 18455
On iOS, the first invocation of a method on a generic interface is extremely slow.
Using the attached sample, on an iPhone 5 (ARMv7), the first invocation is 1060 times slower than the second. On an iPhone 7 (ARM64), it is 340 times slower.
The code in the sample is a repetition of:
where XXX counts to 5000.
This impacts the startup of a large application pretty severely.
This happens because we lazily resolve interface methods; it's an expected cost.
We're constantly working to optimize startup, and given how fast interface dispatch is, such a huge difference is expected.
Do you have a specific app where interface method resolution is disproportionately affecting your startup time?
I understand this is an expected cost, yet it's very high, and it's even worse when multithreading is involved (because of mono_loader_lock).
I picked this sample specifically, but the same happens when using async methods, with the AsyncTaskMethodBuilder.
Most apps we create exhibit this specific issue, in the same way most F# apps which rely heavily on generics should be impacted.
It is possible to do something about this, but instantiating thousands of types is never going to be fast. The runtime is designed to do the same thing many times, not a lot of things once.
Keeping the bug open.
@zoltan thanks for the update. That's the thing though, the startup of an application is all about the first use of a lot of types and methods, either from the BCL or the app itself. Capturing (and, to a certain extent, non-capturing) C# lambdas in generic types, as well as async methods, are also all about using new types for display classes.
I've also noticed that on an iPhone 7, with large apps, most leaves of the stack traces shown by Instruments are mutex locks or unlocks. Could some cases benefit from rwlocks instead of mutexes? Maybe not the loader lock, which is used in a very large number of unrelated locations, but ones like the domain lock or image lock.
I got some news, not great, but good. We're in the process of landing some loading scalability improvements to the runtime.
This won't reduce the fixed initialization cost, but it will behave much better on multi-threaded setups like yours.
Funny story, I was just informed that work I did for the current cycle might actually help here. I did change how we handle interfaces on arrays to be lazier.
This should reduce the type loading cost somewhat.
We must re-evaluate this issue once a Xamarin.iOS release with the above changes ships.
@kumpera a few reasons this is still slow:
- we create the full vtables for the array classes even if only one method is required; this
requires the inflation of 25 generic methods per class.
- the inflated methods are kept in one hash which grows to 125k entries when using the testcase.
- due to the creation of all these interfaces, we allocate 80 MB of memory to hold