Wednesday, October 17, 2007

Unmanaged Structures, Padding and C#, Part 1

One of the nice things about the Spooler API is that it can send you notification events when something happens on a printer, such as a job being printed. Using the FindNextPrinterChangeNotification function you get access to a PRINTER_NOTIFY_INFO structure and an array of PRINTER_NOTIFY_INFO_DATA structures, shown below.


struct PRINTER_NOTIFY_INFO
{
    DWORD Version;
    DWORD Flags;
    DWORD Count;
    PRINTER_NOTIFY_INFO_DATA aData[1];
};

struct PRINTER_NOTIFY_INFO_DATA
{
    WORD Type;
    WORD Field;
    DWORD Reserved;
    DWORD Id;
    union
    {
        DWORD adwData[2];
        struct
        {
            DWORD cbBuf;
            LPVOID pBuf;
        } Data;
    } NotifyData;
};

Now this is all you need if you are planning on writing your application in C++. But let's say you want to write your application in C# or some other .NET language: all you are going to get back from your interop version of FindNextPrinterChangeNotification is an IntPtr. How do we get to the data?

Well, fortunately there is a handy method in the Marshal class, which is part of the System.Runtime.InteropServices namespace. The method is PtrToStructure, which takes two parameters: the IntPtr to convert and the type of the structure that you want to map to. It returns the newly created structure as an object, which you then cast to the structure type.
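
As a rough illustration of that call, here is a minimal sketch. The SampleHeader structure and the hand-filled buffer are hypothetical stand-ins for whatever a real interop call would return; only the Marshal methods are real API.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical two-field structure, used only for this illustration.
[StructLayout(LayoutKind.Sequential)]
public struct SampleHeader
{
    public uint Version;
    public uint Count;
}

public static class PtrToStructureDemo
{
    public static SampleHeader ReadHeader()
    {
        // Stand-in for the pointer an interop call would hand back.
        IntPtr buffer = Marshal.AllocHGlobal(Marshal.SizeOf(typeof(SampleHeader)));
        try
        {
            Marshal.WriteInt32(buffer, 0, 2);   // bytes 0..3 -> Version
            Marshal.WriteInt32(buffer, 4, 5);   // bytes 4..7 -> Count

            // Copy the unmanaged bytes into a new managed structure.
            return (SampleHeader)Marshal.PtrToStructure(buffer, typeof(SampleHeader));
        }
        finally
        {
            // The buffer was allocated here, so it is released here.
            Marshal.FreeHGlobal(buffer);
        }
    }
}
```

Calling PtrToStructureDemo.ReadHeader() gives back a structure whose fields mirror the bytes written into the buffer. Note that the result is a copy, so the unmanaged memory can be released afterwards.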

So it sounds fairly straightforward then: all we need to do is define our structures and we are away. Looking at the documentation for the data being returned to us, we see that we have a single info structure followed by an array of data structures. Starting out with the PRINTER_NOTIFY_INFO_DATA structure, we might come up with the following on the first pass.

public struct PRINTER_NOTIFY_INFO_DATA
{
    public ushort Type;
    public ushort Field;
    public uint Reserved;
    public uint Id;
}


Two problems come up here: the union, and the fixed-length adwData array. How do you do this in C#? There is no union keyword in C#, and a fixed-length array cannot be declared inline in an ordinary struct (a C# array field is just a reference). The answer is to give the structure an explicit layout via attributes, which solves the union problem, and to use the MarshalAs attribute to define a fixed-length array.

Defining a structure with an explicit layout, rather than the default sequential layout, gives you more flexibility in deciding where each field is located in memory. You specify the offset in bytes of each field from the start of the structure, which allows more than one field to share the same location in memory. (Note: once a structure is declared as having an explicit layout, every field must be given an explicit offset, so you should only use this type of layout if you actually need to, e.g. when you have a union in your structure.)
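
To make the overlapping idea concrete, here is a small sketch of a union built with explicit layout. The UIntOrFloat structure is a made-up example and has nothing to do with the Spooler API; it just shows two fields sharing the same four bytes.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical union: the same four bytes viewed as a uint or as a float.
[StructLayout(LayoutKind.Explicit)]
public struct UIntOrFloat
{
    [FieldOffset(0)] public uint AsUInt;
    [FieldOffset(0)] public float AsFloat;
}

public static class UnionDemo
{
    // Returns the raw bit pattern of a float by reading the shared bytes.
    public static uint BitsOf(float value)
    {
        UIntOrFloat u = new UIntOrFloat();
        u.AsFloat = value;   // write through one view of the memory
        return u.AsUInt;     // read the same bytes through the other view
    }
}
```

Because both fields sit at offset 0, the whole structure is only four bytes long, and UnionDemo.BitsOf(1.0f) returns the IEEE 754 bit pattern 0x3F800000.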

Fixed-size arrays can be defined using the MarshalAs attribute, setting the type to UnmanagedType.ByValArray and SizeConst to the number of items in the array.
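
A minimal sketch of that attribute in use, assuming a hypothetical SamplePair structure (one DWORD followed by an inline array of two DWORDs) and a hand-filled buffer in place of real API output:

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical structure: a uint followed by two uints stored inline.
[StructLayout(LayoutKind.Sequential)]
public struct SamplePair
{
    public uint Id;

    [MarshalAs(UnmanagedType.ByValArray, SizeConst = 2)]
    public uint[] Values;
}

public static class ByValArrayDemo
{
    public static SamplePair Read()
    {
        // 4 bytes for Id plus 2 x 4 bytes for the inline array.
        int size = Marshal.SizeOf(typeof(SamplePair));
        IntPtr buffer = Marshal.AllocHGlobal(size);
        try
        {
            Marshal.WriteInt32(buffer, 0, 7);    // Id
            Marshal.WriteInt32(buffer, 4, 10);   // Values[0]
            Marshal.WriteInt32(buffer, 8, 20);   // Values[1]

            // The marshaller allocates a managed uint[2] and fills it.
            return (SamplePair)Marshal.PtrToStructure(buffer, typeof(SamplePair));
        }
        finally
        {
            Marshal.FreeHGlobal(buffer);
        }
    }
}
```

The managed field is still an ordinary array reference; it is only the marshaller that treats it as inline storage of SizeConst elements when copying to or from unmanaged memory.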

Putting this into practice you would come up with the following.

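// Needs: using System; using System.Runtime.InteropServices;
[StructLayout(LayoutKind.Explicit)]
public struct PRINTER_NOTIFY_INFO_DATA
{
    [FieldOffset(0)]
    public ushort Type;

    [FieldOffset(2)]
    public ushort Field;

    [FieldOffset(4)]
    public uint Reserved;

    [FieldOffset(8)]
    public uint Id;

    // First view of the union: adwData[2]. The runtime refuses to load a
    // type in which an array field (an object reference) overlaps value
    // fields, so the two DWORDs are declared separately instead of as a
    // ByValArray.
    [FieldOffset(12)]
    public uint adwData0;
    [FieldOffset(16)]
    public uint adwData1;

    // Second view of the union: the Data struct (cbBuf, pBuf).
    [FieldOffset(12)]
    public uint DataLength;
    [FieldOffset(16)]
    public IntPtr DataBuffer;
}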

Using the MarshalAs technique, the PRINTER_NOTIFY_INFO structure can also be written.

[StructLayout(LayoutKind.Sequential)]
public struct PRINTER_NOTIFY_INFO
{
    public uint Version;
    public uint Flags;
    public uint Count;

    // aData[1] in the C definition: one element stored inline.
    [MarshalAs(UnmanagedType.ByValArray, SizeConst = 1)]
    public PRINTER_NOTIFY_INFO_DATA[] aData;
}

While it might look like we are getting closer, unfortunately this will still not work in all cases, and it is down to the way that memory is laid out for different types. Basic data types must appear on byte boundaries that are multiples of their size: a char, which is only 1 byte long in C, can appear at any byte in memory, a short (2 bytes) at every second byte, and an integer (4 bytes) at every fourth byte. If this leaves gaps, they are simply unused padding bytes. In addition to this rule, the size of a structure must be a multiple of the size of its largest basic data type.
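
These rules can be observed from managed code by asking the marshaller where it places each field. The sketch below uses a hypothetical AlignmentSample structure; Marshal.OffsetOf and Marshal.SizeOf are the real calls, and the offsets in the comments assume the default packing.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical structure mixing 1, 4 and 2 byte fields.
[StructLayout(LayoutKind.Sequential)]
public struct AlignmentSample
{
    public byte Small;     // 1 byte, sits at offset 0
    public int Wide;       // 4 bytes, pushed up to offset 4 (3 bytes of padding)
    public short Medium;   // 2 bytes, sits at offset 8
}

public static class AlignmentDemo
{
    // Unmanaged offset of the named field, in bytes.
    public static int OffsetOf(string fieldName)
    {
        return Marshal.OffsetOf(typeof(AlignmentSample), fieldName).ToInt32();
    }

    // Unmanaged size: 10 bytes of fields and padding, rounded up to 12
    // so that it is a multiple of the largest member (4 bytes).
    public static int Size()
    {
        return Marshal.SizeOf(typeof(AlignmentSample));
    }
}
```

Only 7 bytes of the structure carry data; the other 5 are padding inserted to satisfy the two rules.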

The structure definitions above will work on 32-bit systems. Here memory pointers are 32 bits (duh), or 4 bytes, so if we apply the above rules to an info structure with 2 data structures we would get the following in memory.

0x0000 PRINTER_NOTIFY_INFO.Version
0x0004 PRINTER_NOTIFY_INFO.Flags
0x0008 PRINTER_NOTIFY_INFO.Count
-- The first data structure can start here, as an IntPtr is 4 bytes.
0x000c PRINTER_NOTIFY_INFO_DATA[0].Type
0x000e PRINTER_NOTIFY_INFO_DATA[0].Field
0x0010 PRINTER_NOTIFY_INFO_DATA[0].Reserved
0x0014 PRINTER_NOTIFY_INFO_DATA[0].Id
0x0018 PRINTER_NOTIFY_INFO_DATA[0].DataLength
0x001c PRINTER_NOTIFY_INFO_DATA[0].DataBuffer
-- The second item can start here, as it is still on a 4-byte boundary.
0x0020 PRINTER_NOTIFY_INFO_DATA[1].Type
0x0022 PRINTER_NOTIFY_INFO_DATA[1].Field
0x0024 PRINTER_NOTIFY_INFO_DATA[1].Reserved
0x0028 PRINTER_NOTIFY_INFO_DATA[1].Id
0x002c PRINTER_NOTIFY_INFO_DATA[1].DataLength
0x0030 PRINTER_NOTIFY_INFO_DATA[1].DataBuffer

Now imagine the same data on a 64-bit system. In this case a pointer is 8 bytes long.

0x0000 PRINTER_NOTIFY_INFO.Version
0x0004 PRINTER_NOTIFY_INFO.Flags
0x0008 PRINTER_NOTIFY_INFO.Count
0x000c Padding
-- Because the data structure has an 8-byte pointer, it must start on a multiple of 8.
0x0010 PRINTER_NOTIFY_INFO_DATA[0].Type
0x0012 PRINTER_NOTIFY_INFO_DATA[0].Field
0x0014 PRINTER_NOTIFY_INFO_DATA[0].Reserved
0x0018 PRINTER_NOTIFY_INFO_DATA[0].Id
0x001c PRINTER_NOTIFY_INFO_DATA[0].DataLength
0x0020 PRINTER_NOTIFY_INFO_DATA[0].DataBuffer
0x0028 PRINTER_NOTIFY_INFO_DATA[1].Type
0x002a PRINTER_NOTIFY_INFO_DATA[1].Field
0x002c PRINTER_NOTIFY_INFO_DATA[1].Reserved
0x0030 PRINTER_NOTIFY_INFO_DATA[1].Id
0x0034 PRINTER_NOTIFY_INFO_DATA[1].DataLength
0x0038 PRINTER_NOTIFY_INFO_DATA[1].DataBuffer
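
If you want to check how the managed definition behaves on the machine you are running on, something along these lines should do. NotifyDataManaged is a trimmed, hypothetical copy of the managed structure that keeps only the DataLength/DataBuffer view of the union, and the ArrayStart arithmetic is my own reading of the padding rule rather than anything the marshaller exposes.

```csharp
using System;
using System.Runtime.InteropServices;

// Trimmed copy of the managed definition, for illustration only.
[StructLayout(LayoutKind.Explicit)]
public struct NotifyDataManaged
{
    [FieldOffset(0)]  public ushort Type;
    [FieldOffset(2)]  public ushort Field;
    [FieldOffset(4)]  public uint Reserved;
    [FieldOffset(8)]  public uint Id;
    [FieldOffset(12)] public uint DataLength;
    [FieldOffset(16)] public IntPtr DataBuffer;
}

public static class ManagedLayoutDemo
{
    // Size of one element: 16 bytes of fixed fields plus one pointer,
    // i.e. 20 in a 32-bit process and 24 in a 64-bit process.
    public static int ElementSize()
    {
        return Marshal.SizeOf(typeof(NotifyDataManaged));
    }

    // Where the array starts: the three DWORDs of the info structure
    // (12 bytes), rounded up to a multiple of the pointer size.
    public static int ArrayStart()
    {
        return (12 + IntPtr.Size - 1) / IntPtr.Size * IntPtr.Size;
    }

    // Offset of element [index] from the start of the info structure.
    public static int ElementOffset(int index)
    {
        return ArrayStart() + index * ElementSize();
    }
}
```

Printing ManagedLayoutDemo.ElementOffset(1) in 32-bit and 64-bit builds is a quick way to see the start of the second element move as the pointer size changes.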

Unfortunately this is still not correct. Looking right back at the original C structure definition, we can see that the union contains a nested structure of its own. This nested structure too must be subject to the rules defining the size and alignment of structures. This also causes another problem: if the layout of this nested structure changes with the OS's pointer size, how do we explicitly define the offsets of the fields in the union?
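
To see why the fixed offsets go wrong, the alignment rules can be applied by hand to the C definition. The helper below is only a sketch of that arithmetic, assuming the compiler's default natural alignment; NativeLayout and its method names are hypothetical.

```csharp
using System;

public static class NativeLayout
{
    // Round offset up to the next multiple of alignment.
    public static int AlignUp(int offset, int alignment)
    {
        return (offset + alignment - 1) / alignment * alignment;
    }

    // Offset of the union inside PRINTER_NOTIFY_INFO_DATA.
    public static int UnionOffset(int pointerSize)
    {
        // Type (2) + Field (2) + Reserved (4) + Id (4) = 12 bytes come first.
        // The union is aligned to its widest member: the pointer, or a
        // DWORD if the pointer is no wider than that.
        return AlignUp(12, Math.Max(4, pointerSize));
    }

    // Offset of pBuf: cbBuf takes 4 bytes, then pBuf is pointer aligned.
    public static int BufferOffset(int pointerSize)
    {
        return UnionOffset(pointerSize) + AlignUp(4, pointerSize);
    }

    // Total size, rounded up to the structure's alignment.
    public static int StructSize(int pointerSize)
    {
        int end = BufferOffset(pointerSize) + pointerSize;
        return AlignUp(end, Math.Max(4, pointerSize));
    }
}
```

With a 4-byte pointer this gives a union at offset 12, pBuf at 16 and a 20-byte structure, which matches the FieldOffset values used earlier. With an 8-byte pointer the union moves to offset 16, pBuf to 24 and the structure grows to 32 bytes, so the hard-coded offsets of 12 and 16 no longer line up.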

The solution to the problem comes tomorrow in Part 2.
