
Where in Roslyn is a C# 'class' statement emitted as IL?

Tags:

c#

roslyn

I'm trying to modify the Microsoft Roslyn compiler to do some weird things that are beyond the scope of the provided API. But I'm totally new to Roslyn, and Roslyn is huge (4.6 million lines of code), and I'm finding it very difficult to find my way around in it.

Specifically, I'd like to find where the ".class" IL statement is emitted and, more generally, where the final compilation of a C# class is performed, i.e., the outer structure of the class. (I have already found where inner parts such as methods and expressions are emitted.)

EDIT:

@nejcs is correct, there is no such thing as a ".class" IL statement in the emitted code. I was mistakenly thinking of what you see when you use .Net Reflector or dotPeek.

I'll try to explain in more detail what I'm trying to do and what I'm looking for, in the hope that this gets me closer to a solution.

Consider a simple C# class like this:

   public static class Yacks00020001
   {
      public static readonly string s;

      static Yacks00020001()
      {
         s = YacksCore.M0002("Hello world!", 42);
      }
   }

What I want to do is to "create" and emit a large number of these static classes (with different names and different strings, obviously) on the fly during the emitter processing. I hope to do this by creating enough simulated data to "fool" the emitter methods that emit class declarations, methods and statements, and then calling them with this simulated input.

I think I've found the object that describes a C# class declaration during compilation. It's here:

https://github.com/dotnet/roslyn/blob/45f6e9bc6dd457a5279f0f1b380a70ca8ac0a59d/src/Compilers/CSharp/Portable/Declarations/SingleTypeDeclaration.cs

and it's created here:

https://github.com/dotnet/roslyn/blob/45f6e9bc6dd457a5279f0f1b380a70ca8ac0a59d/src/Compilers/CSharp/Portable/Declarations/DeclarationTreeBuilder.cs#L343

But I'm not 100% sure of this.

And despite the great information provided by @nejcs in his answer, I still can't find where in the emitter processing the class declarations get emitted.

RenniePet asked Dec 06 '25 13:12



1 Answer

There is no .class IL statement. What you are referring to is the IL assembler (ilasm) directive for a class declaration. The actual declaration of a class in an assembly is stored in a specific section (the metadata) and is not identified by any keyword. For a detailed explanation of the CLR sections, I would suggest reading part1 and part2.
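To see that a class is a metadata row rather than a keyword in the binary, a minimal sketch along these lines reads the type definition table of a compiled assembly using System.Reflection.Metadata. The names `TypeDefDump` and `ListTypeNames` are made up for illustration. The sketch assumes a runtime where System.Reflection.Metadata is available and that the assembly exists as a file on disk.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Reflection.Metadata;
using System.Reflection.PortableExecutable;

public static class TypeDefDump
{
    // Hypothetical helper: returns "Namespace.Name" for every type definition
    // row found in the metadata of the assembly at assemblyPath.
    public static List<string> ListTypeNames(string assemblyPath)
    {
        var names = new List<string>();
        using (var stream = File.OpenRead(assemblyPath))
        using (var pe = new PEReader(stream))
        {
            MetadataReader md = pe.GetMetadataReader();
            foreach (TypeDefinitionHandle handle in md.TypeDefinitions)
            {
                TypeDefinition type = md.GetTypeDefinition(handle);
                string ns = md.GetString(type.Namespace);
                string name = md.GetString(type.Name);
                // Types without a namespace are listed by bare name.
                names.Add(ns.Length == 0 ? name : ns + "." + name);
            }
        }
        return names;
    }
}
```

Calling `TypeDefDump.ListTypeNames(typeof(object).Assembly.Location)` should list the core library's types by name. No ".class" token is stored anywhere; ilasm and decompilers print that directive when they render these rows as text.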

Since classes are just declarations, there is also no compilation of classes as such. Only when you emit to a PE stream do you have to retrieve all declared types, methods and other objects and write them to the correct location inside the stream. So compilation goes something like this:

  1. Let's start with the CompileAndEmit method. This method performs compilation and emits the final assembly.
  2. If there are no parse errors, a module builder is created. For C#, the overridden CreateModuleBuilder is called, which creates a PEAssemblyBuilder holding a reference to the compilation's source assembly symbol, which in turn contains all other symbols, including classes.
  3. CompileAndEmit invokes compilation of the method bodies and stores the result in the module builder.
  4. If everything is successful, CompileAndEmit serialises the module builder to a PE stream. After some setup, SerializePeToStream is called; it creates an EmitContext holding a reference to the module builder and passes it to the metadata writer. The writer uses the module builder to pull out information about classes and other objects and creates the appropriate indices for final storage.

In short, the Compilation class (CSharpCompilation for C# code) provides the source assembly symbol to the assembly builder, which is then passed around and can be queried for various objects. I don't know what you want to achieve, but you would probably want to modify what is stored inside the Compilation itself, since there is not much logic further down the pipeline. I might have missed something, though.

EDIT

Based on the question edit, I will describe in a little more detail how the metadata writer pulls out information about classes. To simplify, I will focus on writing full metadata and ignore differential metadata.

  1. Continuing from the SerializePeToStream method, a full metadata writer is created and BuildMetadataAndIL is called.
  2. This method calls CreateIndices, which is responsible for creating the internal structures that will be serialised to the PE file. We are interested in the CreateIndicesForModule method.
  3. CreateIndicesForModule retrieves the top-level types by eventually calling GetTopLevelType on CommonPEModuleBuilder. You can see that various kinds of types are retrieved, but we are interested in the GetTopLevelTypesCore method.
  4. GetTopLevelTypesCore walks all namespace symbols and returns their child types as the top-level types. At this point you can see that types are retrieved directly from the compilation's symbols while the metadata writer is setting up its internal structures for serialisation.
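The namespace walk described in the last step can be pictured with a small stand-alone sketch. This is not Roslyn's code: `NamespaceNode` and `TopLevelTypeWalk` are hypothetical stand-ins for namespace symbols and the module builder. The sketch only illustrates the traversal pattern of visiting every namespace and collecting the types declared directly in it.

```csharp
using System.Collections.Generic;

// Hypothetical stand-in for a namespace symbol; not a Roslyn type.
public sealed class NamespaceNode
{
    public string Name { get; }
    public List<string> TypeNames { get; } = new List<string>();
    public List<NamespaceNode> Children { get; } = new List<NamespaceNode>();

    public NamespaceNode(string name) { Name = name; }
}

public static class TopLevelTypeWalk
{
    // Visits the global namespace and every nested namespace, yielding the
    // types declared directly in each one (nested types are not modelled).
    public static IEnumerable<string> GetTopLevelTypes(NamespaceNode global)
    {
        var pending = new Stack<NamespaceNode>();
        pending.Push(global);
        while (pending.Count > 0)
        {
            NamespaceNode ns = pending.Pop();
            foreach (string type in ns.TypeNames)
            {
                yield return type;
            }
            foreach (NamespaceNode child in ns.Children)
            {
                pending.Push(child);
            }
        }
    }
}
```

The point of the sketch is that nothing is "compiled" at this stage: the writer only enumerates what the symbol tree already contains, so any type that should reach the PE file has to be present in that tree beforehand.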

As for your concrete problem, I still think it would be better to generate a valid compilation object (with correct symbols) and leave the emit phase as it is. Otherwise you must be very careful to keep the data in a consistent state, or you will get exceptions or an invalid PE file.
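One way to follow this advice, sketched here under assumptions, is to generate ordinary source text for each holder class and feed it to the compilation before emit. The generator below is hypothetical (`YacksSourceGenerator` and `BuildClassSource` are illustrative names). It only builds the text, reusing the class shape and the `YacksCore.M0002` call from the question.

```csharp
using System;
using System.Globalization;

public static class YacksSourceGenerator
{
    // Hypothetical helper: builds the source of one static holder class in the
    // shape shown in the question. className must be a valid C# identifier.
    public static string BuildClassSource(string className, string text, int key)
    {
        // Escape characters that would break a regular C# string literal.
        string literal = text
            .Replace("\\", "\\\\")
            .Replace("\"", "\\\"")
            .Replace("\r", "\\r")
            .Replace("\n", "\\n");
        string keyText = key.ToString(CultureInfo.InvariantCulture);

        return
            "public static class " + className + "\n" +
            "{\n" +
            "   public static readonly string s;\n" +
            "\n" +
            "   static " + className + "()\n" +
            "   {\n" +
            "      s = YacksCore.M0002(\"" + literal + "\", " + keyText + ");\n" +
            "   }\n" +
            "}\n";
    }
}
```

Each generated string could then be parsed with CSharpSyntaxTree.ParseText and added to the compilation with AddSyntaxTrees before Emit is called, so the emitter sees ordinary symbols and needs no modification. Whether this fits depends on when the names and strings become known during your processing.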

nejcs answered Dec 12 '25 00:12



