Bug 26384

Summary: Assembly.GetType() fails to parse type names containing commas
Product: [Mono] Runtime
Reporter: Eirik Tsarpalis <eirik>
Component: Reflection
Assignee: Aleksey Kliger <aleksey>
Severity: normal
CC: aleksey, mono-bugs+runtime, vargaz
Priority: ---
Version: unspecified
Target Milestone: ---
Hardware: All
OS: All
Tags:
Is this bug a regression?: ---
Last known good build:

Description Eirik Tsarpalis 2015-01-24 17:36:49 UTC
Consider the following F# type definition:

type ``Comma, Separated`` () = class end

This will generate a type whose name contains a comma character. However, when I try to do some reflection on it:

let t = typeof<``Comma, Separated``>

t.Assembly.GetType(t.FullName, true)

The latter fails with the following error:

System.ArgumentException: Type names passed to Assembly.GetType() must not specify an assembly.
  at (wrapper managed-to-native) System.Reflection.Assembly:InternalGetType (System.Reflection.Assembly,System.Reflection.Module,string,bool,bool)
  at System.Reflection.Emit.AssemblyBuilder.GetType (System.String name, Boolean throwOnError, Boolean ignoreCase) [0x00000] in <filename unknown>:0 
  at System.Reflection.Assembly.GetType (System.String name, Boolean throwOnError) [0x00000] in <filename unknown>:0 
  at <StartupCode$FSI_0020>.$FSI_0020.main@ () [0x00000] in <filename unknown>:0 
  at (wrapper managed-to-native) System.Reflection.MonoMethod:InternalInvoke (System.Reflection.MonoMethod,object,object[],System.Exception&)
  at System.Reflection.MonoMethod.Invoke (System.Object obj, BindingFlags invokeAttr, System.Reflection.Binder binder, System.Object[] parameters, System.Globalization.CultureInfo culture) [0x00000] in <filename unknown>:0
Comment 1 Zoltan Varga 2015-01-24 19:23:16 UTC
Fixed in mono master b21fe11d6c18c8d0dc26f8959f439ebc354be9bf.
Comment 2 Zoltan Varga 2015-01-24 19:23:34 UTC
Wrong bug.
Comment 3 Aleksey Kliger 2015-10-08 14:38:14 UTC
t.FullName on .NET has the comma escaped, not so on Mono.

Additionally, Type.GetType(String) needs to handle escaped characters properly so that we can round-trip.
Comment 4 Aleksey Kliger 2015-10-08 16:05:49 UTC
Self-contained reproduction in C#:

using System;
using System.Reflection;
using System.Reflection.Emit;

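class Program
{
    public static void Main(String[] args)
    {
        var nm = new AssemblyName("h");
        var ab = AssemblyBuilder.DefineDynamicAssembly(nm, AssemblyBuilderAccess.Run);
        var mb = ab.DefineDynamicModule("h", true);
        // The internal name is Foo\, Bar (one backslash, one comma).
        var tb = mb.DefineType("Foo\\, Bar", TypeAttributes.Class | TypeAttributes.Public);

        var ty = tb.CreateType();

        // Displayed names: the backslash and the comma should each be escaped.
        Console.WriteLine(ty.FullName);
        Console.WriteLine(ty.AssemblyQualifiedName);

        var a = (System.Reflection.Assembly)ab;

        // Displayed name Foo\, Bar denotes the internal name "Foo, Bar", which does not exist.
        var tyNoEsc = a.GetType("Foo\\, Bar");
        Console.WriteLine(tyNoEsc != null);

        // Displayed name Foo\\\, Bar denotes the internal name Foo\, Bar, which should be found.
        var tyEsc = a.GetType("Foo\\\\\\, Bar");
        Console.WriteLine(tyEsc != null);
    }
}
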
// Expected output:
// Foo\\\, Bar
// Foo\\\, Bar, h, Version=, Culture=neutral, PublicKeyToken=null
// False
// True
// Actual output:
// Foo\, Bar
// Foo\, Bar, h, Version=, Culture=neutral, PublicKeyToken=null
// True
// False
Comment 5 Aleksey Kliger 2015-10-14 18:08:55 UTC
It's more than just Type.GetType().

The components of a type name (namespace, outer class name, nested type names, generic type arguments) are _allowed_ to contain the characters ,+[]&*\ provided that they're preceded by a backslash. Names where this escaping has been done can be unambiguously parsed into their components. These are called the "displayed" names. Name _components_ stored without the escaping are "internal" names.[1]

It would be grand if we could just work with the displayed names everywhere and be done with it. But we can't, because the System.Reflection.Emit API operates on internal names: you get to call
   var tb = ModuleBuilder.DefineType("Comma, Separated");
and you get back a type with displayed name "Comma\, Separated" which is what tb.FullName happens to be, and what Type.GetType(String) wants to consume.

So in the guts of the reflection API we have to keep a pretty clear distinction whether a particular string argument is a displayed name or an internal name.  Fun fun.
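
To make the displayed/internal distinction concrete, here is a minimal sketch of the conversion for a single name component. The class TypeNameEscaping and its methods are hypothetical helpers written for illustration; they are not the Mono or .NET implementation, and the only thing assumed is the escaping rule described above.

```csharp
using System;
using System.Text;

// Hypothetical helper, not part of Mono or .NET: converts ONE type-name
// component between its internal and displayed forms.
static class TypeNameEscaping
{
    // The characters that must be preceded by a backslash in a displayed name.
    const string Special = ",+[]&*\\";

    // Internal -> displayed: prefix every special character with a backslash.
    public static string Escape(string internalName)
    {
        var sb = new StringBuilder(internalName.Length);
        foreach (char c in internalName)
        {
            if (Special.IndexOf(c) >= 0)
                sb.Append('\\');
            sb.Append(c);
        }
        return sb.ToString();
    }

    // Displayed -> internal: drop each escaping backslash and keep the
    // character that follows it. A trailing lone backslash is kept as-is.
    public static string Unescape(string displayedName)
    {
        var sb = new StringBuilder(displayedName.Length);
        for (int i = 0; i < displayedName.Length; i++)
        {
            char c = displayedName[i];
            if (c == '\\' && i + 1 < displayedName.Length)
                c = displayedName[++i];
            sb.Append(c);
        }
        return sb.ToString();
    }
}
```

With this sketch, Escape("Comma, Separated") gives Comma\, Separated, and Escape applied to the internal name Foo\, Bar from comment 4 gives Foo\\\, Bar, which is the expected FullName there. Unescape reverses both.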

[1]: The reason it's components is that if you take the displayed name "Outer\+Name+InnerName" and break it up into components, you get "Outer\+Name" as the outer class and "InnerName" as the nested type name. If you first unescape and then break it up, you get an "Outer" class containing a nested "Name" containing an "InnerName".
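
The footnote's point about ordering can be sketched as follows: split the displayed name first, at each '+' that is not escaped, and only then unescape the pieces. DisplayedName.SplitNested is a hypothetical helper for illustration, not the runtime's parser, and it deliberately ignores generic argument lists, assembly names, and array/pointer suffixes.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of Mono or .NET.
static class DisplayedName
{
    // Splits a displayed name into still-escaped components at each '+'
    // that is not preceded by an escaping backslash.
    public static List<string> SplitNested(string displayedName)
    {
        var parts = new List<string>();
        int start = 0;
        for (int i = 0; i < displayedName.Length; i++)
        {
            if (displayedName[i] == '\\')
            {
                i++; // skip the escaped character, whatever it is
            }
            else if (displayedName[i] == '+')
            {
                parts.Add(displayedName.Substring(start, i - start));
                start = i + 1;
            }
        }
        parts.Add(displayedName.Substring(start));
        return parts;
    }
}
```

For the displayed name Outer\+Name+InnerName this returns two components, Outer\+Name and InnerName. Unescaping the whole string before splitting would instead produce three pieces, which is the ambiguity the footnote describes.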
Comment 6 Aleksey Kliger 2015-10-21 16:55:08 UTC
Fixed in master 5fe98cbad47087a10e1df7fc74d6997bb0e22b66