8
Vote

Out-Of-Process service does NOT detect schema update

description

Once SLAB out-of-process service is started and an EventSource starts to emit events, this service does not detect event schema update of EventSource even if service, process of EventSource and Windows are restarted.

I explain this concretely.
  1. Define an EventSource and a process emits EventEntries with this EventSource.
  2. Configure SLAB out-of-service package and starts to subscribe above EventSource.
  3. Change method signature of above EventSource and rebuild and restart process (e,g, updated EventSchema). Then the process emit the updated EventEntries.
  4. HOWEVER out-of-service recognize received EventEntries have old EventSchema.
I found cache files are generated at $SYSTEMDRIVE%\Windows\temp{Provider-GUID} folder, and out-of-process CAN detect EventSchema update after I delete these files.
And I also find out-of-process service CAN detec EventSchema update if it runs on Console when it starts exe file with "-c" option.

comments

traviscogdill wrote Nov 19, 2014 at 4:35 PM

I'm also encountering this, and it is quite painful during development. Every time i update my EventSource I have to clear the SLAB schema cache, unless i'm not understanding. This was not a problem in some versions past, I just updated to the latest version today and ran into multiple breaking changes including this one.

Another workaround is to change the name of the event source each time (for instance, version your EventSource name using the EventSource attribute, which is fine for production but not so fun during development).

Contrary to the OP remark, my schema updates are not detected even when running in console with "-c".

I am using the built in .NET 4.5.2 EventSource implementation, and the latest slab service:

C:\dev\slab service 2>SemanticLogging-svc.exe -c
Enterprise Library Semantic Logging Service v2.0.1406.1

traviscogdill wrote Nov 19, 2014 at 5:01 PM

Hmm I spoke too soon ... I can't find the cache that the OP mentions. My current work-around has been renaming the EventSource as I mentioned. So far that is the only way I know of for changes to my EventSource to be "detected' by the SLAB service (running in console). Stopping, starting the service and/or application seems to have no affect so I feel there must be a file system cache somewhere but I can't find it.

traviscogdill wrote Nov 19, 2014 at 7:47 PM

Did find the path to the manifest cache ...

private static readonly string ManifestsPath = Path.Combine(Path.GetTempPath(), "7D2611AE-6432-4639-8B91-3E46EB56CADF");

At some point this was modified in TraceEventServiceWorker to correspond to a change in DynamicTraceEventParser
        // listen to new manifests
        this.source.Dynamic.DynamicProviderAdded += m => this.OnManifestReceived(m);
but this event doesn't seem to fire at any point when my app is generating events ... maybe there is some other way to listen to manifest changes ...

traviscogdill wrote Nov 19, 2014 at 10:04 PM

this.source.Dynamic.DynamicProviderAdded is called and correctly gives the manifest if the provider is not previously configured with a manifest. It does not appear to be called if the manifest is changed or re-sent ...

These observations seem to directly contradict the metadata for DynamicTraceEventParser which i copy here:
    // Summary:
    //     This event, will be fired any time a new Provider is added to the table of
    //     ETW providers known to this DynamicTraceEventParser. This includes when the
    //     EventSource manifest events are encountered as well as any explicit calls
    //     to AddDynamicProvider. (including ReadAllManifests).  The Parser will filter
    //     out duplicate manifest events, however if an old version of a provider's
    //     manifest is encountered, and later a newer version is encountered, you can
    //     receive this event more than once for a single provider.
    public event Action<ProviderManifest> DynamicProviderAdded;
Where they say "The Parser will filter out duplicate manifest events" ... i think they may be filtering out NON duplicate manifest events as well ... not sure how they detect a "Duplicate" manifest but it is not working properly when versioning an eventsource by increasing an event id on a particular event, for instance (which should be supported behavior). Adding a new event to the evensource with a new ID did cause it to refresh ... so I think it is just a bit too happy to call a new manifest a "duplicate".

traviscogdill wrote Nov 20, 2014 at 3:57 PM

OK got a response from Vance regarding TraceEvent ability to detect changes to EventSource ... basically we need to follow certain Versioning practices when modifying an EventSource, and then TraceEvent will be able to parse these out properly. These best practices are laid out in his TraceEvent Programmers Guide ... the highlights are:
  1. Events can never remove or change the meaning of existing fields. If you need to so this, you need to make new events that are logically independent from the old events.
  2. If you wish to add new data to an existing event make sure you add it to the END of the existing arguments.
  3. When you add data in this way, you should increment the version number for the event. You can do this for an EventSource by adding an attribute like [Event(ID, Version=1)]. When you don’t provide a version number EventSource defaults it to 0, so you can start with Version=1 for the first update. You can have at most 255 versions (so don’t version unless you need to).

randylevy wrote Nov 20, 2014 at 6:50 PM

Thanks for posting. For future visitors the TraceEvent Programmers Guide is included in the Microsoft TraceEvent Library NuGet Package.

chrduf wrote Dec 12, 2015 at 1:32 PM

While I agree with Vance's stance about best practices for production. In development, especially when starting to define your logging strategy, it's often not practical to follow these best practices. To clear the cache you can set a post build event to clear the manifests.

capcom923 wrote Dec 13, 2015 at 6:15 PM

Great answer to this question. We was driven mad by this issue in recent few month since we have to always update the EventSource Name to let the SLAB detect our changes.

The solution that removing the %SYSTEMDRIVE%\Windows\temp\7D2611AE-6432-4639-8B91-3E46EB56CADF folder is a best way I think. And in my test, the SLAB service will create this cache once it handle the first event. And fortunately, this GUID is the same for all the machine. So, I just made a post build event which is "rd /s /q %SYSTEMDRIVE%\Windows\temp\7D2611AE-6432-4639-8B91-3E46EB56CADF" to remove this folder silently.

BTW, this solution does not always take effect while using -c cmd to start this service in my case. However, this solution is good enough.

Thanks for all of you!!