-
I have attached the source code for the Web Graphics presentation to this blog post.
I have corrected the problem with IE9 and Chrome.
If you have questions, contact me at jons@magenic.com.
-
I ran out of time in my presentation and did not get to the SOS segment. I promised to write a blog post on the topic and here it is.
First, we have to define what a "memory leak" in .NET means. In an unmanaged language/environment, a memory leak occurs when the logic asks for a chunk of memory to be allocated and forgets to give it back when it is no longer needed. Very often, the chunk of memory is lost when the last pointer to it goes out of scope without the memory being deallocated. At that point there is no way to reference the chunk and it just sits there. If the logic that was careless with such things is iterative (and it usually is), the amount of "lost" memory grows and grows and grows. If the logic is contained within a long-running program such as might be running on a server, eventually the program (and possibly the server) must be restarted to reclaim the memory.
In a managed memory approach, the garbage collection process would look at all of the objects in memory and detect that these "lost" chunks had no references (either directly or indirectly) from the running code. At this point, the garbage collector would reclaim the chunk and it would no longer be lost.
How can you have a memory leak in a managed memory system? Here is one way it is done. An object, which we will call the parent, creates one or more child objects. The parent exposes an event, Bam; each child attaches an event handler to the Bam event of the parent. Sometime later, the parent gets rid of some number of child objects (but does not unhook the event handlers). The simplistic view at this point is that the parent no longer has a reference to the child and therefore the child is ready for garbage collection. T'ain't true. To understand why, we have to take a deeper look at the event handler mechanism.
In the parent, the event actually points at an object, a multicast delegate. This object holds an invocation list, and each entry in that list points to an object that subscribed to the event, along with the handler method to call on it. When the parent creates a child and that child subscribes to the event exposed by the parent, there are two references to the child: the obvious one in the child collection maintained by the parent and a second, not so obvious one, in the invocation list of the event exposed by the parent. When the parent removes the reference to the child from the obvious collection, it is only taking care of the obvious reference and is ignoring the non-obvious reference. Be warned: just because you do not see it does not mean that it will not come back and bite you in a tender spot.
You may think that the object is gone but the garbage collector knows better. The parent (that is in the active set of objects) holds a reference to the event delegate and the event delegate holds a reference to the "no longer loved" child. In the eyes of the garbage collector, the child is still active. The logic cannot get to the child (at least not easily) and the garbage collection cannot collect it: the child is a memory leak, pure and simple.
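Here is a minimal sketch of that scenario. Only the event name Bam comes from the description above; the Parent and Child classes, the collection, and the counting properties are hypothetical names made up for illustration. It does not try to prove anything about the garbage collector itself; it only shows that the handler is still sitting in the invocation list after the parent has dropped the child from its collection.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical parent: owns a collection of children and exposes the Bam event.
public class Parent
{
    public event EventHandler Bam;

    private readonly List<Child> children = new List<Child>();

    public Child AddChild()
    {
        var child = new Child(this);
        children.Add(child);
        return child;
    }

    // Drops the obvious reference only; the handler stays attached to Bam.
    public void RemoveChild(Child child)
    {
        children.Remove(child);
    }

    public int ChildCount
    {
        get { return children.Count; }
    }

    // Number of handlers still held in the event's invocation list.
    public int SubscriberCount
    {
        get { return Bam == null ? 0 : Bam.GetInvocationList().Length; }
    }
}

// Hypothetical child: subscribes to the parent's event and never unsubscribes.
public class Child
{
    public Child(Parent parent)
    {
        parent.Bam += OnBam;   // the second, less obvious reference to this child
    }

    private void OnBam(object sender, EventArgs e) { }
}

public static class LeakDemo
{
    public static void Main()
    {
        var parent = new Parent();
        var child = parent.AddChild();
        parent.RemoveChild(child);

        Console.WriteLine("Children in collection: " + parent.ChildCount);
        Console.WriteLine("Handlers on Bam: " + parent.SubscriberCount);
    }
}
```

After RemoveChild, the collection is empty but the invocation list still holds one handler, and that handler's target is the "removed" child. As long as the parent is reachable, so is the child.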
My guess is that the vast majority of programs have "low volume" memory leaks that no one ever notices. It is only when the leaking logic is iterative and there are many iterations during the lifetime of the program (or the child is a very heavy child) that anyone will notice.
There are other ways to "lose" memory but all of them involve a reference that does not go away. The vast majority of memory leaks can be traced to the above scenario.
Second, now that you know what to look for, how do you track it down? There are two ways: inside-out and outside-in. Inside-out means that you look at each instance in the code where a class adds an event handler to an event, checking to make sure that the event handler is removed properly. This is a good thing to do but you might have hundreds of such events and looking at each one might not be the most efficient way of handling the problem. The outside-in approach is a debugging approach using the SOS debugger.
Here is a top-level description of what we are going to do. At a breakpoint, we are going to fire up SOS and ask it for a population of object types. We will get a list of object types along with the count of and memory allocated to each of those types. Chances are, we will see one or more types with counts that vastly exceed our expectations. We pick one of these objects and ask the debugger to tell us who owns this object. The debugger will trace references to the object up to a top-level object. If we see a delegate type in that list, we can proceed under the assumption that an event handler is involved (as described above).
At this point we may have all of the data that we need to track down the problem. We can get the name of the event from the debugger output and we can revert to the inside-out approach. In fact, we can go back to one of the debugging windows (Locals, Watch, QuickWatch) and expand the event object fully to see the invocation list. The "lost" children were all in there, if only we had known to look. The solution is to remove the event handler from the parent's event as a part of the disposal process. In fact, it might be necessary to implement IDisposable and call the Dispose method as a part of getting rid of the child to ensure that the child can be properly collected.
Third, now that we know the how and why, let's take a look at the mechanics:
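A sketch of that fix might look like the following. The class names (Parent, DisposableChild) and the SubscriberCount property are hypothetical; the only idea taken from the text is that the child removes its handler from the parent's event when it is disposed, and that the parent calls Dispose when it gets rid of the child.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical parent that disposes a child when it removes it.
public class Parent
{
    public event EventHandler Bam;

    private readonly List<DisposableChild> children = new List<DisposableChild>();

    public DisposableChild AddChild()
    {
        var child = new DisposableChild(this);
        children.Add(child);
        return child;
    }

    // Removes the child from the collection and has it detach its handler.
    public void RemoveChild(DisposableChild child)
    {
        if (children.Remove(child))
        {
            child.Dispose();
        }
    }

    // Number of handlers still held in the event's invocation list.
    public int SubscriberCount
    {
        get { return Bam == null ? 0 : Bam.GetInvocationList().Length; }
    }
}

// Hypothetical child that unhooks its handler as part of disposal.
public sealed class DisposableChild : IDisposable
{
    private Parent parent;

    public DisposableChild(Parent parent)
    {
        this.parent = parent;
        this.parent.Bam += OnBam;
    }

    private void OnBam(object sender, EventArgs e) { }

    public void Dispose()
    {
        if (parent != null)
        {
            parent.Bam -= OnBam;   // drop the reference held by the invocation list
            parent = null;         // makes a second Dispose call harmless
        }
    }
}

public static class FixDemo
{
    public static void Main()
    {
        var parent = new Parent();
        var child = parent.AddChild();
        parent.RemoveChild(child);

        Console.WriteLine("Handlers on Bam: " + parent.SubscriberCount);
    }
}
```

With the handler removed, nothing in the parent points at the child any more, so the child becomes eligible for collection once the caller lets go of its own reference.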
- At a breakpoint, bring up the Immediate window (Ctrl-Alt-I).
- Type ".load sos" to load sos.dll and get access to it.
- Type ">log <your log file name here>" to capture all of the Immediate output to a file.
- Type "!dumpheap -stat" to get the statistics for types in the heap.
- Say "Holy #$%^%$, that's a lot of type <insert type that has way too many instances>".
- Type "!dumpheap -type <insert full name of the type that has way too many instances>" [see note below].
- Pick one of the entries (the earlier in the list, the better -- these are typically the oldest items that should have been collected).
- Type "!gcroot <insert address of an instance of a type that has way too many instances>".
- For each object in the reference trace, examine the contents of the object by typing "!do <address of the object>", or "!da <address of the object>" if the object is a list or array.
If you have a bad memory leak, it is possible that there are many, many instances of the lost child. The "!dumpheap -type" command will dump all of them. There may be a way to redirect that output to a file but I do not know what it is. There are -min and -max parameters that establish memory address limits but you would have to guess, at least the first few times that you used the command. Note, however, that the name of the type alone may be sufficient to pinpoint the problem. It is also quite possible that the output of !gcroot will point you at the problem immediately without the need to look at the whole reference stack.
If you need more information do a search for ".net sos".
-
Here is the sample code that I used in my session with Carl Franklin of Dot Net Rocks.
-
See the attached file. It is a zip file to keep my blogging server happy. Source code was just posted in a post earlier today.
-
I have attached the source code from my Code Camp talk as a zip file. You will have to download and install the Reactive Extensions from the following web page:
http://msdn.microsoft.com/en-us/devlabs/ee794896.aspx
There are different versions based upon the version of .NET that you are using. The code was developed against .NET 3.5 SP1.
I will attach the PowerPoint slides in a subsequent post.
-
I am working on a project that uses MSTest for unit (and integration) testing. In the main project (Win Forms) there is a subdirectory that holds some data files that are needed by the program and several of the unit tests. Remember that MSTest creates a “shadow subdirectory” for each test run that contains everything that the test needs to run and reports that path as the base application path. For our tests to run, we need to create a Data subdirectory that holds copies of these data files. To make these data files available to the unit tests, we added some lines to the solution testrunconfig file:
<Deployment>
  <DeploymentItem filename="DeploymentMain\Data\File1.txt" outputDirectory="Data" />
</Deployment>
The intent here was that File1.txt (plus several other similar files) would be copied to the Data subdirectory under the Out directory for the test run. It did not happen. MSTest copied the file directly to the Out directory and did not create a Data subdirectory. All of the unit tests that depended upon the existence of a fully stocked Data subdirectory failed.
I looked for answers on the web with no luck. To make a long morning’s work short, I created a small test solution and tried all of the variations that I could think of. I finally stumbled on to the following “trick”: you cannot copy a file to a subdirectory but you can copy a subdirectory to a subdirectory. The following deployment lines accomplish just that:
<Deployment>
  <DeploymentItem filename="DeploymentMain\Data\" outputDirectory="Data" />
</Deployment>
Note that the backslash at the end of the filename attribute value is essential, as is the “outputDirectory” attribute, for this to work.
UPDATE: An additional point: it seems that the directory lines need to be first in the <Deployment> section. I had a situation where the line was not first and it was as if the “outputDirectory” attribute was not there.
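One way to check that the deployment did what you wanted is to build the expected path in code and look for the file. The helper below is only a sketch: the TestData class and the PathFor method are hypothetical names, and it assumes that the “base application path” mentioned above is what AppDomain.CurrentDomain.BaseDirectory returns during the test run. The "Data" folder and File1.txt come from the example above.

```csharp
using System;
using System.IO;

// Hypothetical helper for locating deployed data files during a test run.
public static class TestData
{
    // Builds <base directory>\Data\<fileName>.
    // Assumption: the base directory is the folder the test run executes from.
    public static string PathFor(string fileName)
    {
        string baseDir = AppDomain.CurrentDomain.BaseDirectory;
        return Path.Combine(Path.Combine(baseDir, "Data"), fileName);
    }

    public static void Main()
    {
        string path = PathFor("File1.txt");
        Console.WriteLine(path + " exists: " + File.Exists(path));
    }
}
```

A test that depends on the data files could call File.Exists on the result first and fail with a clear message if the Data subdirectory was not deployed.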
And now you know.
-
I do a lot of development using VirtualPC images. It keeps my base laptop relatively clean and allows my VPC images to join client domains. Recently, I rolled on to a project that had a very large VHD file. The file was too large to host on my laptop, even after compressing the VHD. Initially, I created a VPC with the VHD hosted on a USB external drive but the performance was simply not acceptable. The project did have a spare machine running Windows Server 2008 R2 Enterprise, with the Hyper-V role installed. Luckily, the VHD format is common to VPC and Hyper-V.
I created a Hyper-V virtual machine based upon the VHD. Everything worked except for the mouse. After a lot of banging around, I came to the realization that the VPC extensions were not compatible with the Hyper-V setup. In the end, I had to take the VHD image back to VPC, un-install the VPC extensions, return the VHD to Hyper-V, and install the Hyper-V extensions to get the mouse to work properly. Now, everything is goodness.
-
See above.
-
More than a bit late but this has been a very busy time for me: this is the zip file containing the actual code as a VS2008 Solution.
-
More than a bit late but this has been a very busy time for me: this is the zip file containing the presentation.
-
Yesterday, I was able to install one of the packages that was trapped in an endless “disk resources” loop using the following sequence:
- Un-install Virtual Machine Additions.
- Install package 1.
- Re-install Virtual Machine Additions.
- Realize that I forgot to install package 2.
- Slap forehead!
- Install Package 2.
This would suggest that the issue with the Virtual Machine Additions is somehow related to the usage of the additions. In the above case, I was working with a fresh boot of the virtual machine and, as far as I can remember, did not go outside of the virtual machine environment, meaning that I did not move my mouse to the physical machine to check email during this process. (Email was running on the physical laptop during this entire time.)
-
I am working at a client that requires the consultants to work on a PC that is a member of their domain. I and the other consultants do not want to join our physical laptops to their domain. Enter Virtual PC 2007 SP1. Most of us have installed Windows Server 2008 on the Virtual PC image and then joined that virtual image to the client's domain. This is a very workable solution, with one minor problem.
The problem comes in trying to install certain packages. The installer gets to the point where it is “determining disk requirements”. It puts up a dialog box with the message "Please wait while the installer finishes determining your disk space requirements." In normal circumstances, this goes up and comes down in less than the blink of an eye. In my situation (and several others), the dialog box comes up and stays, and stays for a very loooooooooooong time.
Searching the web found others with the same problem. One suggestion was to run the install package using the following command line:
msiexec /i filename.msi
That did not help me. One of the packages that I was trying to install was the indispensable TestDriven.Net. It is an exe file and it just hung on the “please wait” message.
OK, I have made you wait long enough for an answer. I stumbled on this by accident. I un-installed Virtual Machine Additions, tried the install of TestDriven.Net and it worked. I re-installed the Additions and it failed in exactly the same way. I do not know if this will solve the problem for every install but it does provide a ray of hope.
-
In the play and subsequent movie, A Streetcar Named Desire, the playwright Tennessee Williams has the character Blanche DuBois say, “I've always depended on the kindness of strangers.” That is also quite applicable to the technical community. I asked a question on the MSDN forums and Oleg V. Sych at http://www.olegsych.com/ provided the answer:
----------------------------------------------------------------------------
Jon,
I think that changing your host class to inherit from MarshalByRefObject should resolve the serialization issue and allow you to access the host from a "hostspecific" template running in a new AppDomain.
Oleg
----------------------------------------------------------------------------
I tried it and it worked. Thanks, Oleg.
This is kind of stupid on my part. As soon as I read the above answer, I realized what was going on. The error message kept saying that the T4 engine class was not serializable. Intellectually, I knew that I must be doing something wrong but I kept thinking that Microsoft must have screwed up. It turns out that the engine class was serializable but it was holding a reference to something, my custom host class, that was not capable of moving across the remoting boundary to the other (newly created) AppDomain. In my defense, I have not done any serious remoting for several years. Oleg’s answer brought a lot of that experience flooding back.
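The shape of the change is small. The sketch below is not my actual host: ITemplateHost is a hypothetical stand-in for ITextTemplatingEngineHost (which needs the Visual Studio SDK), and CustomHost and its ResolvePath method are made up for illustration. It also leaves out the second AppDomain entirely, since creating one is a .NET Framework feature; it only shows the base class that Oleg suggested.

```csharp
using System;
using System.IO;

// Hypothetical stand-in for ITextTemplatingEngineHost (Visual Studio SDK).
public interface ITemplateHost
{
    string ResolvePath(string fileName);
}

// Deriving from MarshalByRefObject means code in another AppDomain receives
// a proxy to this instance instead of needing a serialized copy of it.
public class CustomHost : MarshalByRefObject, ITemplateHost
{
    private readonly string templateDirectory;

    public CustomHost(string templateDirectory)
    {
        this.templateDirectory = templateDirectory;
    }

    // Resolves a file name relative to the directory that holds the template.
    public string ResolvePath(string fileName)
    {
        return Path.Combine(templateDirectory, fileName);
    }
}

public static class HostDemo
{
    public static void Main()
    {
        var host = new CustomHost("Templates");
        Console.WriteLine(host is MarshalByRefObject);
        Console.WriteLine(host.ResolvePath("control.xml"));
    }
}
```

The alternative would have been to mark the host as serializable, but then the template would be working with a copy of the host in the new AppDomain rather than calling back into the original.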
What I have been doing is thrashing, trying to find an answer. What I have to do now is to drop back into a unit testing mode to get all of the piece parts working. I suppose that I should add one or more unit tests on the use of AppDomains. This has been an interesting distraction, but still a distraction.
-
I have been tracking down why I could not get my custom host for T4 to work. I now know the specific condition that causes the failure and that knowledge leads to a short-term workaround. This is the question that I posted on the MSDN forums on Visual Studio extensibility:
--------------------------------------------------------------------------------------
As I noted in an earlier post, Source Code for a Working Implementation of ITextTemplatingEngineHost, I am trying to build a custom host for T4. The answer to my query about available source code for a working custom host led to the code in the MSDN documentation. That code worked. Using that code as a baseline, I was able to isolate the cause of my problem.
My situation is that I want to expose the custom host in the template file. Following the example in MSDN, I wrote the following code for the ProvideTemplatingAppDomain method:
    public AppDomain ProvideTemplatingAppDomain(string content)
    {
        // This host will provide a new application domain each time the
        // engine processes a text template.
        // -------------------------------------------------------------
        return AppDomain.CreateDomain("Generation App Domain");
    }
This works fine if I supply the current AppDomain and include the "hostspecific=true" parameter on the template directive. It also works just fine if I supply a new AppDomain but do not include the "hostspecific=true" parameter. It DOES NOT WORK if I supply a new AppDomain and also include the "hostspecific=true" parameter. The issue is that the Engine class is not serializable. Apparently, it must be serializable to move across the AppDomain boundary.
I have to specify the "hostspecific=true" parameter to make the functionality that I am trying to provide work. In the short term, I can work around this by re-using the AppDomain, but I wonder how that will affect Visual Studio performance. Am I right in thinking that the custom tool will run in a separate process space? If it does, the AppDomains will only live for the duration of the code generation process; there would be no accumulation in memory. But the author of the MSDN sample took some effort to mention that creating a separate AppDomain was a good idea. Is there a way to get a new AppDomain and "host specific" at the same time?
Jon Stonecash
--------------------------------------------------------------------------------------
I am going to proceed on the basis of re-using the current AppDomain, keeping an eye out for any performance hits. Maybe someone will take notice of my question and provide a longer-term approach.
-
In my last post, T4 Parameter Passing: You Can’t There from Here, I complained that the TextTransformation class did not have direct knowledge of the host. I tried a couple of ways to get around this. Before I describe them, I need to list some definitions:
- Grandparent class: This is the TextTransformation class in the class library that all templates must inherit from.
- Parent class: In this situation, this is the class that inherits from the Grandparent class and is the parent class to the templates that we intend to create. This is the class that provides some standard services such as locating the control.xml file.
- Child class: Again, in this situation, this is the class created by the custom tool. This is the class that actually contains the statements to generate the application code from some kind of model data. To make use of the services in parent class, this class must inherit from the parent class by specifying the “inherits” parameter with a value that is the full name of the parent class. Note that the template must also contain an assembly directive that points to the assembly that contains the parent class and the project that contains the template must also contain a reference to that same assembly.
First, in the parent class, I added some code to hold the reference to the host. This is a duplicate of the code that would have been added to the generated child class if the “host specific” parameter was added to the template directive.
    private Microsoft.VisualStudio.TextTemplating.ITextTemplatingEngineHost hostValue;
    public virtual Microsoft.VisualStudio.TextTemplating.ITextTemplatingEngineHost Host {
        get {
            return this.hostValue;
        }
        set {
            this.hostValue = value;
        }
    }
The hope here was that Visual Studio would somehow “see” the property and store the reference. The problem is that it appears that Visual Studio does not store the reference unless the “host specific” parameter is present on the template directive and when that parameter is present, it adds a duplicate of the above code to the generated child class overriding the above code in the intermediate base class. You can’t get there from here. Alright, to be fair, I could not figure out a way to make this work. Your mileage may vary, etc.
Second, since I could not figure out how to store the host reference directly in the parent class, I decided to use reflection in the parent class to find the Host property in the generated child class:
    // Requires: using System; using System.Reflection;
    //           using Microsoft.VisualStudio.TextTemplating;
    private ITextTemplatingEngineHost m_Host = null;
    private ITextTemplatingEngineHost GetHost()
    {
        if (m_Host == null)
        {
            // GetType() returns the generated child type, not this parent type.
            Type effectiveType = this.GetType();
            PropertyInfo info = effectiveType.GetProperty("Host", typeof(ITextTemplatingEngineHost));
            if (info == null)
            {
                throw new InvalidOperationException("Could not find Host property");
            }
            m_Host = info.GetValue(this, null) as ITextTemplatingEngineHost;
            if (m_Host == null)
            {
                // The property exists but was never set by the engine.
                throw new InvalidOperationException("Host property does not have a value");
            }
        }
        return m_Host;
    }
This actually works just fine. This is a little weird, but I have done worse things. This assumes that the template directive for the template has the “host specific” parameter set to true; if it is not set, the above code will throw an exception. I also have some logic in the parent class to find the Control.XML file, given the reference to the host file. In this case, it is finding the file in the same sub-directory that holds the template. Obviously, I could make this logic more complicated and search up and down the file hierarchy, or look up a registry entry, or make a web service call, or yada yada yada.
I placed the DLL file holding the parent class into the GAC, but you could also use an absolute or relative path to get to the assembly. I used the GAC because the sample code that I based my logic on did that, and in the interests of making the logic in the template as simple and un-coupled as possible I think that the GAC solution is the way to go. Again, your mileage may vary.