Alexa Skills - A Little Easier

I delayed writing this post because I can't remember the article I was reading that triggered the thought about the new library I've written. I know I was reading a post about Alexa skill development (I know, shocker!) and although it used Alexa.NET I realised that for a beginner there is a lot of wiring up in order to get an initial response from your skill.

Actually, maybe that's not quite right. It's not that there's a lot of actual wiring - but there's a lot of understanding about how all the pieces fit together in order to just get your shiny new Alexa device to say something you wrote.

And the thought wouldn't go away. If you're a new developer who just wants to try it out - you want to be able to go from zero to functional as fast as possible. Reducing that barrier to entry, making the "how" as small as possible, is what is going to allow developers to get excited about it.

TL;DR

I used source code generators to develop Alexa.NET.Annotations, a library that allows you to build skills using class and method attributes.

Maybe it's as easy as...

So my first thought was that this had already been done, kinda. I remembered a blog post about generating AWS Lambdas using attributes. I found the article by Norm Johanson and started having a look at the Lambda annotations library. The thought was that if this was an extensible model in some way I could create new attributes that said "this is an Alexa skill".

I really like the new library, but actually the power behind it is what made me think it wasn't a fit for what I was trying to do. Lambda has to be incredibly flexible and reducing code without screwing up that flexibility is a tough ask, and it has to work in a large range of scenarios. The more I looked at the library the more I felt I was going down a rabbit hole I didn't need to go down for my task, and if I'm struggling to fit the pieces together - surely I'm more likely to fail in my task of helping others?

It's worth noting: I still feel that there was a route where I could have made this work. It all felt right - but I think my understanding of how the library was generating everything and piecing it together got confused with what I needed for the Alexa skill wiring. If I wasn't trying to do this on my own maybe? If I'd talked it through with someone else maybe there was a route where there was a combination of existing and new to make this work, but I digress.

Imitation is the sincerest form of flattery

Okay - so the existing annotations library wasn't the path I wanted to take. But I REALLY liked the idea. Attributes on a class definitely felt like the right approach. And I'd worked on Roslyn Analyzers already - so I knew enough that I thought source code generators wouldn't be a huge learning curve.

So I started a new project, and began by defining an example of what I felt should work. This was just to prove my theory - so what would "good" look like as a first milestone?

Well I already have a helper library within the Alexa.NET suite that helps you build Alexa skill pipelines - Alexa.NET.RequestHandlers helps you encapsulate skill functionality into classes (when it launches, when a particular intent is called etc.) and boils all that down to a single method. There's no point re-inventing all that, it works well (even if I do say so myself) and it means I'm producing less code in my generator. So that's my output.

So what about my input? Well I want each method to be able to say "I want this in the pipeline", but moving up and down the Roslyn model to figure out how to group them seems a bit painful - so a class attribute to say "Hey, this is the pipeline". Cool - and if there's one thing I know about source code generators it's that I need a partial class. So my starting point is that I want this to work:

[AlexaSkill]
public partial class RockPaperScissors
{
    [Launch]
    public SkillResponse Launch(LaunchRequest intent)
    {
        return ResponseBuilder.Ask("What's your move? Rock, Paper or scissors?", new("What's your move?"));
    }

    [Intent("MakeMyMove")]
    public async Task<SkillResponse> PlayAGame(IntentRequest intentRequest)
    {
        return ResponseBuilder.Tell("You Win", null);
    }
}

This will create a pipeline, and then two handlers in the pipeline - one for launch and one for the intent. Super.
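For anyone who hasn't written custom attributes before, here's a minimal sketch of what the marker attributes in that example could look like. These declarations are my illustration only - the names match the example above, but the AttributeUsage settings and the IntentName property are assumptions, not the actual Alexa.NET.Annotations source.

```csharp
using System;

// Hypothetical declarations for illustration; the real library's attributes may differ.

// Marks the partial class that the generator turns into a request pipeline.
[AttributeUsage(AttributeTargets.Class, AllowMultiple = false)]
public sealed class AlexaSkillAttribute : Attribute
{
}

// Marks the method that should handle the skill's launch request.
[AttributeUsage(AttributeTargets.Method, AllowMultiple = false)]
public sealed class LaunchAttribute : Attribute
{
}

// Marks a method that should handle one named intent.
[AttributeUsage(AttributeTargets.Method, AllowMultiple = false)]
public sealed class IntentAttribute : Attribute
{
    public IntentAttribute(string intentName) => IntentName = intentName;

    // The intent name as configured for the skill, e.g. "MakeMyMove".
    public string IntentName { get; }
}
```

The attributes carry no behaviour of their own. They exist purely so the generator has something to look for at build time.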

Source code generators, and Roslyn

Straight off I want to say this: source code generators are awesome! Build code from code in a way that helps everyone else. Dramatically reduce the runtime cost of logic that we can calculate at build time - they're great.

Second thing I want to say? The documentation is...erm...well...I found it to be either end of the spectrum. Either it was super high level and told me what I'd been able to figure out from talks, or it threw me into code samples where suddenly I felt I had to know a lot more than I already did. Thank goodness for the awesome .NET community.

I spent a lot of time reading, and re-reading, this Andrew Lock series of posts and used the information he'd put together as a guide to try and figure out the bits I didn't know. Incremental generators were new to me, but the idea made sense.

Then when I was ready to throw my laptop out the window, because I was constantly restarting Visual Studio, I found out he'd also produced this amazing series on testing them which introduced me to snapshot testing with Verify and that unlocked a much faster speed of learning and testing and building.

Now most of my learning was try, produce output, see output was wrong, repeat. I mean a LOT of it was doing that. But there's one thing I want to mention because it's something I think a lot of developers will disagree with. The way you produce the output.

If you look at most examples out there on source code generators you'll see they have the same setup, then you dump a string of C# as a new output and add it to the compilation. I mean that's what you HAVE to do. But when I say a string of C# - I mean that there's a string.Format in there somewhere or a StringBuilder or (as is the case in the Lambda Annotations library) some template files with the logic in them.
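To make that common pattern concrete, here's a rough sketch of a string-based incremental generator. It is not the Alexa.NET.Annotations code: HelloGenerator and the Describe method are made-up names, the attribute is matched purely by its written name, and namespaces are ignored to keep it short. It assumes a reference to the Microsoft.CodeAnalysis.CSharp package.

```csharp
using System.Linq;
using System.Text;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp.Syntax;
using Microsoft.CodeAnalysis.Text;

// Hypothetical generator, for illustration only.
[Generator]
public sealed class HelloGenerator : IIncrementalGenerator
{
    public void Initialize(IncrementalGeneratorInitializationContext context)
    {
        var skillNames = context.SyntaxProvider
            .CreateSyntaxProvider(
                // Cheap syntactic filter: only classes that carry at least one attribute.
                predicate: (node, _) => node is ClassDeclarationSyntax cls && cls.AttributeLists.Count > 0,
                transform: (ctx, _) => (ClassDeclarationSyntax)ctx.Node)
            // Simplification: match the attribute by its written name, not its symbol.
            .Where(cls => cls.AttributeLists
                .SelectMany(list => list.Attributes)
                .Any(attr => attr.Name.ToString() == "AlexaSkill"
                          || attr.Name.ToString() == "AlexaSkillAttribute"))
            .Select((cls, _) => cls.Identifier.Text);

        context.RegisterSourceOutput(skillNames, (spc, name) =>
        {
            // The "string of C#" approach: the output is built as text.
            // Simplification: the enclosing namespace is ignored.
            var source = $@"public partial class {name}
{{
    public string Describe() => ""Generated for {name}"";
}}";
            spc.AddSource($"{name}.g.cs", SourceText.From(source, Encoding.UTF8));
        });
    }
}
```

Even in something this small you can see the doubled braces and doubled quotes creeping in, and that's with a three line output.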

I appreciate that way of working - but honestly? I just find it incredibly unpleasant to work with. Very much a personal thing. But when I'm trying to figure out how my output is going to change - strings in strings in strings just gets complicated quickly, and (maybe this is where it's a me thing) I love to refactor my code - and then I miss a bracket in a string somewhere and I've got failing tests and it just...yeah, I don't like it.

So what have I done with my Annotations? I've used Roslyn. It's WAY more verbose - and I spent a lot of time at http://roslynquoter.azurewebsites.net/ figuring out how to build things. But it meant that I could easily refactor, I could clearly see how I was building up complex calls, and I just felt like I was a bit more in control of everything.

It does mean I have code like this (where SF is an alias for SyntaxFactory)

var pipelineInvocation = SF.InvocationExpression(
        SF.MemberAccessExpression(SyntaxKind.SimpleMemberAccessExpression,
            SF.IdentifierName("LambdaHelper"),
            SF.GenericName(SF.Identifier("RunLambda"),
                SF.TypeArgumentList(
                    SF.SingletonSeparatedList<TypeSyntax>(SF.IdentifierName(cls.Identifier.Text))))))
    .WithArgumentList(SF.ArgumentList());

but after a while that became second nature to me. The final code still needs tidying up and moving into separate files for clarity, but I know I can do that with zero change to the output. And yes it still boils down to a C# output thanks to this little method

internal static string ToCodeString(this SyntaxNode token)
{
    var sb = new StringBuilder();
    using var writer = new StringWriter(sb);
    token.NormalizeWhitespace().WriteTo(writer);
    return sb.ToString();
}
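Putting those two pieces together, here's a small self-contained sketch of the round trip from SyntaxFactory calls to C# text. SyntaxSketch, BuildRunLambdaCall and Render are names I've made up for the example, and Render uses ToFullString rather than the StringWriter above, which I'd expect to give the same text. It assumes a reference to the Microsoft.CodeAnalysis.CSharp package.

```csharp
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;
using SF = Microsoft.CodeAnalysis.CSharp.SyntaxFactory;

// Hypothetical helper class, for illustration only.
public static class SyntaxSketch
{
    // Builds the expression LambdaHelper.RunLambda<skillClassName>() as a syntax tree.
    public static InvocationExpressionSyntax BuildRunLambdaCall(string skillClassName) =>
        SF.InvocationExpression(
            SF.MemberAccessExpression(
                SyntaxKind.SimpleMemberAccessExpression,
                SF.IdentifierName("LambdaHelper"),
                SF.GenericName(
                    SF.Identifier("RunLambda"),
                    SF.TypeArgumentList(
                        SF.SingletonSeparatedList<TypeSyntax>(
                            SF.IdentifierName(skillClassName))))));

    // Renders any syntax node as formatted C# text.
    public static string Render(SyntaxNode node) =>
        node.NormalizeWhitespace().ToFullString();
}
```

Calling SyntaxSketch.Render(SyntaxSketch.BuildRunLambdaCall("RockPaperScissors")) gives back the call as text, ready to be dropped into a generated file. Because the builder is an ordinary method, renaming or extracting parts of it is a normal refactor rather than string surgery.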

So after a while (I don't know how long it took me as most of my side project work is done half an hour a day in a local coffee shop before I start work, so it's slow but steady progress) I finally ended up with my example class generating this kind of output

public partial class RockPaperScissors
{
  private AlexaRequestPipeline _pipeline;
  public virtual Task<SkillResponse> Execute(SkillRequest skillRequest) => _pipeline.Process(skillRequest);
  public void Initialize()
  {
      _pipeline = new AlexaRequestPipeline(new IAlexaRequestHandler<SkillRequest>[]{new LaunchHandler(this), new PlayAGameHandler(this)});
  }

  private class LaunchHandler : LaunchRequestHandler
  {
      private RockPaperScissors Wrapper { get; }

      internal LaunchHandler(RockPaperScissors wrapper)
      {
          Wrapper = wrapper;
      }

      public override Task<SkillResponse> Handle(AlexaRequestInformation<SkillRequest> information) => Task.FromResult(Wrapper.Launch((LaunchRequest)information.SkillRequest.Request));
  }

  private class PlayAGameHandler : IntentNameRequestHandler
  {
      private RockPaperScissors Wrapper { get; }

      internal PlayAGameHandler(RockPaperScissors wrapper) : base("MakeMyMove")
      {
          Wrapper = wrapper;
      }

      public override Task<SkillResponse> Handle(AlexaRequestInformation<SkillRequest> information) => Wrapper.PlayAGame((IntentRequest)information.SkillRequest.Request);
  }
}

So here you can see that each of the methods has become its own handler class (relying on the base classes I've already written as part of the Alexa.NET.RequestHandlers library) and they wrap calls to the methods that have been tagged. I did this as it means that the class constructor can do whatever service wiring is required, and I'm not trying to replicate that or guess how it interacts - I'm just making the right call at the right time as far as the method is concerned.

There's an Initialize method which sets it all up, and that's isolated as a method so the skill developer can call it at the right time for them (if they're not using AWS Lambda, I've no idea when they're going to want this called).

And then there's the Execute method - so this is the main entry point into the class. Takes a SkillRequest, returns a SkillResponse, single method to handle everything that's been built. Result!

It's Alive! So I used it

I was incredibly happy at this point - I had compiling code that worked the way I wanted to. Now I had to see what the experience would be like to try and create your first skill. So I opened Visual Studio - created a new AWS Lambda project (still the fastest and most likely path for a new dev trying this stuff out) and went to town!

I added my newly published Alexa.NET.Annotations NuGet package. Then I removed the assembly attribute the Lambda project adds by default. Then I added the new attribute, oh and I added the NuGet package that attribute needs because it's different. Then I changed the skill function to take a SkillRequest...and then I stopped.

I have a single line of code that I need to call to make this work. I've added the annotations library. Why have I got all these other steps? I just want to run the skill! I'm a new developer, I've got a new Alexa device sat there and I've created a new skill in the developer console. Why is this "makes this easier" library, well, not easier?

Thank you AWS .NET Team

I clearly wasn't finished. I'd reduced the barrier a little, but this still feels like developers quitting before they started.

There's a gap - it's between File->New->AWS Lambda and a working skill. But it's all plumbing. Source code generators can DO plumbing!

So I went hunting again. And again, the AWS .NET team have made this super easy for me.

Norm and his team had made some great improvements with the .NET 6 runtime for AWS Lambda. One of those improvements was Support for Top Level Statements, but it's not the top level statements I wanted as much as the full example they gave on how to wire up a lambda function using a Main method.

This showed me that with a single extra NuGet reference and some code that called LambdaBootstrapBuilder - I could get an AWS Lambda to run whatever I needed. What's more - I could do it so that all it required was a change to the handler in the Publish to AWS Lambda wizard in Visual Studio (make your handler the Assembly, not the Method, and it knows it's an executable). This was awesome!

Initially I wanted to make this the default, but I liked the fact that you could generate a pipeline without making any assumptions about how to call said pipeline. So I added an extra attribute - AlexaLambda. This would only work with something tagged as AlexaSkill, and would generate an extra file with the Lambda executable in it.

{
    static Task Main(string[] args) => LambdaHelper.RunLambda<RockPaperScissors>();
}

and inside the LambdaHelper class (a static file I pushed in via the source code generator) this is what RunLambda looks like

public static Task RunLambda<T>() where T : ISkillLambda, new()
{
    var skillClass = new T();
    skillClass.Initialize();

    //https://docs.aws.amazon.com/lambda/latest/dg/csharp-handler.html

    return LambdaBootstrapBuilder
        .Create<SkillRequest, SkillResponse>(skillClass.Execute,
            new Amazon.Lambda.Serialization.Json.JsonSerializer()).Build().RunAsync();
}

ISkillLambda is an interface that I push in via the generator if you want to use AWS Lambda, and the source code generator ensures the partial class has it as a base type - so there's no extra wiring from the developer's point of view.
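As a rough guide, the interface only needs the two members that RunLambda calls. This is a sketch inferred from the RunLambda code above rather than a copy of the generated file, so treat the exact signatures as assumptions. It assumes the Alexa.NET package for SkillRequest and SkillResponse.

```csharp
using System.Threading.Tasks;
using Alexa.NET.Request;
using Alexa.NET.Response;

// Assumed shape, inferred from how RunLambda<T> uses its type parameter.
public interface ISkillLambda
{
    // Builds the request pipeline; RunLambda calls this once before handling requests.
    void Initialize();

    // Single entry point: takes the incoming request, returns the skill's response.
    Task<SkillResponse> Execute(SkillRequest skillRequest);
}
```

Both members line up with the Initialize and Execute methods in the generated RockPaperScissors class shown earlier, which is why the generator can add the interface to the partial class without the developer writing anything extra.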

So NOW it works?

Yes, now it works. I mean this is really early days - proper Alpha - but I got it working! (link to see the tweet I sent with proof)

The code needs tidying up, I need to add Roslyn Analyzers to help developers get the method signatures right, and really the signatures themselves are more complicated than needed (I can wire up slot intents directly to named arguments rather than pass the request in, and if you return a string - surely I can build the ResponseBuilder.Ask statement? Maybe an [End] or [Tell] attribute to show I don't want a response?) but it's a start.

For me I've just loved the journey - I had no idea how to write this or IF I could write this. It's taken information and articles from all over the place and not a small amount of morning caffeine to produce it. Yes it's opinionated, no it doesn't handle everything, but that's the point! I don't want it to.

If you're a .NET developer and you have a hugely complex skill with a ton of extra functionality, harness the Alexa.NET libraries that already exist and get the best out of your AWS Lambda or API Gateway or whatever else you're using - do the wonderful creation work you already do.

But if you're a new dev, who just wants to hear their Alexa say "Hello" without a bunch of work? Maybe put some simple interactions in and get the bug for working with conversations? Maybe Alexa Annotations is something you get interested in and want to talk about?