Writing a Shader Effect Language Part 1

Overview

Data Driven Rendering Series:

  1. https://jorenjoestar.github.io/post/writing_shader_effect_language_1/
  2. https://jorenjoestar.github.io/post/writing_shader_effect_language_2/
  3. https://jorenjoestar.github.io/post/writing_shader_effect_language_3/
  4. https://jorenjoestar.github.io/post/data_driven_rendering_pipeline/

In this article we will create a simple language that can encapsulate shader code (called code fragments) and output different files for each fragment. This is the initial step to switch from an engine that loads single files for each shader stage (vertex, fragment, compute, …) to one that uses an effect file that contains more than one shader.

We will start with the motivation, then define the language itself (it is very simple), then look at the Parser and finally at the Code Generator.

Have a good read!

Motivation

In the incredible quest for data-driven rendering, after we defeated the dragon of code generation, another multi-headed dragon arises: a hydra! We have different options here: be the brave warrior in shiny armor who tries to cut off all the heads of the hydra, build some machines that can fight for us and send them in, or both build the machines AND fight.

Our code is invaluable, like our energy when fighting the hydra. We need to balance both carefully and see how we can put them to the BEST use.

Writing code manually is good, and it is what is generally done, but it is slow and error prone. Going data-driven can be fast, but can give you a sense of losing control (not to me personally, but I have heard a few people say that). Relying only on generated code can quickly become a recipe for disaster: so many particular use cases need attention that the code could become a different kind of mess.

We will try to go down the route of code generation mixed with data-driven. As I wrote in my previous articles, it is a fine line, and it is good to know when to go in which direction!

I will divide the article into 2 parts. The first part (this one) will contain the new Shader Code Generator to generate shader permutations and add include support to GLSL. The second will require a low-level rendering library and will show Code Generation of more CPU areas of Rendering, the real goal of all these articles!

The code is available here:

https://github.com/JorenJoestar/DataDrivenRendering

Effect file structure

Looking at effects, the first thing to do is to define a file that will represent our shaders. My choice is to create a simple language to embed shader code and generate the CPU code necessary to render it.

Why not use Json?

While it is an amazing data format, I still want a greater degree of control over what to parse and what to generate. The decision is based on the fact that by writing a parser for the language, I can automate some code generation that would be more intricate with Json. Also, this series itself is a personal exploration of the topic, so using Json was not an option for this level of complexity.

The HFX Format

HFX (Hydra Effects) is a new language we will define to write our shaders. The first iteration will be bare-bones (it will simply be a shader permutation generator), but it will be the foundation for extensions that will allow us to generate the CPU rendering code that we want to automate.

The format defines only a few keywords, and the general architecture makes it straightforward to copy-paste shader code fragments from any language into an HFX file. We will use the following keywords (and concepts).

Shader

The root of a shader effect. It will contain everything we are writing.

Glsl/Hlsl

These define the actual shader code, enclosed in fragments. Fragments can be composed and reused. For Glsl in particular, the code of each stage needs to be wrapped in a per-stage define. More on that later.

Pass, Technique, Variant

This is the central part for the effects to work. I’ve researched a bit, looking at Microsoft effects, Unity effects, Godot and Bungie: the concepts are very similar, but they differ a little, and of course each implementation becomes very engine-specific. The presentation by Bungie is amazing and their system is by far the most extensive and complex; we will work on a much simpler shader effect system.

Let’s define a pass as a combination of shader code for at least one stage of the shader pipeline. For example a single compute shader, or a vertex-fragment shader pair.

Variants and techniques are loose concepts that help separate shader paths. For example a variant could be a different post-process shader, like different implementations of SSAO.

A technique could be a whole set of passes that target a specific platform.

Not having made up my mind on those yet, I will omit them for now, as they are concepts that are less central than the code generation and can be very subjective. Possibly I’ll get to them in part 2.

Properties

Final piece of the puzzle. This will define the resources used by the shader effect on a per-effect level. Keeping an eye on the newer rendering APIs (DX12 and Vulkan), this also defines the layout of the resources and how they are used. It is possibly the part with the most potential for automation (and thus code generation). We will define this in part 2 of this article.

High level workflow

From a high level perspective, everything that will happen is captured by this code:

text = ReadEntireFileIntoMemory( "..\\data\\SimpleFullscreen.hfx", nullptr );
initLexer( &lexer, (char*)text );

hfx::Parser effect_parser;
hfx::initParser( &effect_parser, &lexer );
hfx::generateAST( &effect_parser );

hfx::CodeGenerator hfx_code_generator;
hfx::initCodeGenerator( &hfx_code_generator, &effect_parser, 4096 );
hfx::generateShaderPermutations( &hfx_code_generator, "..\\data\\" );

We separated the Lexer from the Parser so that the lexer can be reused; here we reuse the one from the previous example (parsing the HydraDataFormat files). Then we initialize the Parser and generate the AST. This will save all the passes and code fragments we defined in the HFX file. Finally we take the parsing information and give it to the code generator, which will write out the files for each pass and stage.

Let’s dig into the example!

Parser: welcome HFX!

In most rendering APIs (OpenGL, Vulkan, Direct3D12, …) shaders are compiled one stage at a time (vertex, fragment, compute, geometry, …), and in some APIs (especially the newer ones) the compiled stages are then grouped into a Shader State.

As a first step of this shader language, the shader generation method in our code will create a single shader file for each stage.

We will define a simple fullscreen HFX with code fragments and passes.

First, we define the root shader (SimpleFullscreen.hfx, under folder ‘data’):

shader SimpleFullscreen {

This is simply the container for all the code and passes that will define the shader effect.

Now we need some actual code, so we can define a shader fragment. The keyword used in our language is glsl followed by a name and an open brace:

glsl ToScreen {

This will define a code fragment named ToScreen, that can be referenced from the passes. Next we use a glsl trick to signal our parser to use includes:

#pragma include "Platform.h"

This #pragma is actually ignored by the compiler, but will be used by the parser to actually add the include! BEWARE: this code will be included in BOTH the vertex and the fragment program! Anything outside of the VERTEX/FRAGMENT/COMPUTE defines will be, and this is done on purpose: it is useful for defining an interpolator struct only once, or for common includes.

Next we define the vertex program. BEWARE: vertex-only code must be enclosed in the VERTEX define!

#if defined VERTEX

out vec4 vTexCoord;

void main() {

   vTexCoord.xy = vec2((gl_VertexID << 1) & 2, gl_VertexID & 2);
   vTexCoord.zw = vTexCoord.xy;
   gl_Position = vec4(vTexCoord.xy * 2.0f + -1.0f, 0.0f, 1.0f);
}

#endif // VERTEX

This code is a simple fullscreen triangle that does not require any vertex buffer, but uses the vertex id to draw. Nothing fancy.
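To make the bit tricks less magic, here is a small CPU-side sketch of the same arithmetic in C++. The struct and function names are made up for this illustration and are not part of the HFX code; the formulas are copied from the vertex shader above.

```cpp
#include <cassert>
#include <cstdint>

// Plain struct standing in for a GLSL vec2 (hypothetical helper type).
struct Vec2 {
    float x;
    float y;
};

// Same expression as the shader: vertex ids 0, 1, 2 produce the
// texture coordinates (0,0), (2,0) and (0,2).
inline Vec2 fullscreenTexCoord( int32_t vertex_id ) {
    assert( vertex_id >= 0 && vertex_id < 3 );
    return { float( ( vertex_id << 1 ) & 2 ), float( vertex_id & 2 ) };
}

// Remap the [0,2] texture coordinates to clip space with x * 2 - 1:
// the result is (-1,-1), (3,-1) and (-1,3).
inline Vec2 fullscreenPosition( int32_t vertex_id ) {
    const Vec2 uv = fullscreenTexCoord( vertex_id );
    return { uv.x * 2.0f - 1.0f, uv.y * 2.0f - 1.0f };
}
```

The three positions form a triangle that is larger than the screen: the visible [-1,1] square is fully covered and the parts outside of it are clipped, and inside the square the texture coordinates go from 0 to 1.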

Next is the fragment program, again enclosed in the FRAGMENT define:

#if defined FRAGMENT

in vec4 vTexCoord;

out vec4 outColor;

layout(binding=0) uniform sampler2D input_texture;

void main() {

    vec3 color = texture2D(input_texture, vTexCoord.xy).xyz;
    outColor = vec4(color, 1);
}

#endif // FRAGMENT

} // glsl ToScreen

This code simply reads a texture and outputs it to the screen.

We defined the code fragment ToScreen, containing both a vertex and a fragment program, and now we can actually generate the permutation that we need. The code for this in our effect file is:

pass ToScreen {
   vertex = ToScreen
   fragment = ToScreen
}

We are simply defining a pass with the vertex and fragment program defined in the ToScreen code fragment (yes, I don’t like this term either).

Running the code generator on this simple effect file will generate the two files ToScreen.vert and ToScreen.frag.

These can be read directly into your favourite OpenGL renderer and used as is!

The Parser

Now that we have defined the effect and we know what the outcome of generating code from the effect file is, let’s look into the different components of the parser and code generator.

By design, we chose the Lexer to know nothing about the language, so that we can reuse it across different languages. The entry point to parse the effect is the method generateAST:

void generateAST( Parser* parser ) {

    // Read source text until the end.
    // The main body can be a list of declarations.
    bool parsing = true;

    while ( parsing ) {

        Token token;
        nextToken( parser->lexer, token );

        switch ( token.type ) {

            case Token::Token_Identifier:
            {
                identifier( parser, token );
                break;
            }

            case Token::Type::Token_EndOfStream:
            {
                parsing = false;
                break;
            }
        }
    }
}

This code simply processes the file, using the lexer, until the end, and handles only identifiers. It is the same as the parser in the previous article. What changes drastically is the identifier method! We will have 3 different sets of identifiers, usable in different parts of the HFX file:

  1. Main identifiers, ‘shader’, ‘glsl’, ‘pass’
  2. Pass identifiers, ‘compute’, ‘vertex’, ‘fragment’
  3. Directive identifiers, ‘if defined’, ‘pragma include’, ‘endif’

Let’s have a look at the code for parsing the main identifiers:

inline void identifier( Parser* parser, const Token& token ) {

    // Scan the name to know which declaration to parse
    for ( uint32_t i = 0; i < token.text.length; ++i ) {
        char c = *(token.text.text + i);

        switch ( c ) {
            case 's':
            {
                if ( expectKeyword( token.text, 6, "shader" ) ) {
                    declarationShader( parser );
                    return;
                }

                break;
            }

            case 'g':
            {
                if ( expectKeyword( token.text, 4, "glsl" ) ) {
                    declarationGlsl( parser );
                    return;
                }
                break;
            }

            case 'p':
            {
                if ( expectKeyword( token.text, 4, "pass" ) ) {
                    declarationPass( parser );
                    return;
                }
                break;
            }

        }
    }
}

This code simply defers the parsing of a particular identifier to the corresponding declaration method. We will look at each method in detail.
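The helper expectKeyword is used everywhere but never shown in this article. As a rough idea of what such a check can look like, here is a minimal sketch; the StringRef layout (a pointer plus a length, not null-terminated) and the body of the function are my assumptions for illustration, not the code from the repository.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for the StringRef used by the lexer:
// a non-owning view into the source text.
struct StringRef {
    const char* text;
    uint32_t    length;
};

// Returns true only if the token text is exactly the keyword:
// same length and same characters.
inline bool expectKeyword( const StringRef& text, uint32_t length, const char* expected_keyword ) {
    assert( expected_keyword != nullptr );

    if ( text.length != length ) {
        return false;
    }
    return std::memcmp( text.text, expected_keyword, length ) == 0;
}
```

Comparing the length first matters because the token is a slice of the whole file: without it, an identifier that merely starts with a keyword would be accepted.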

Parsing ‘shader’

We are now parsing the following part of the HFX file:

// HFX

shader SimpleFullscreen {

This is the entry point of the effect itself. What should the parser do here? Simply iterate through the main identifiers, ‘glsl’ and ‘pass’. Technically I could have separated the methods, with one parsing only ‘shader’ and another parsing ‘glsl’ and ‘pass’, but I did not want to complicate the code further.

Let’s look at how we parse the identifier ‘shader’:

// C++

inline void declarationShader( Parser* parser ) {
    // Parse name
    Token token;
    if ( !expectToken( parser->lexer, token, Token::Token_Identifier ) ) {
        return;
    }

    // Cache name string
    StringRef name = token.text;

    if ( !expectToken( parser->lexer, token, Token::Token_OpenBrace ) ) {
        return;
    }

    while ( !equalToken( parser->lexer, token, Token::Token_CloseBrace ) ) {

        identifier( parser, token );
    }
}

As in the previous article’s code, this gets the tokens from the lexer and generates data if the syntax is correct. When we enter the method the Lexer is just at the beginning of the name (SimpleFullscreen), so the code parses the name, then the open brace, and then everything else until it encounters the close brace.

The method identifier will also parse the identifiers ‘glsl’ and ‘pass’.

Parsing ‘glsl’

This is the most complex parsing in the code. I will put both the HFX part and C++ code so hopefully it will be clearer what the parser is doing and why.

As a refresher and reference, this is the code fragment ToScreen defined in SimpleFullscreen.hfx:

// HFX

glsl ToScreen {

    #pragma include "Platform.h"

    #if defined VERTEX
    out vec4 vTexCoord;

    void main() {

        vTexCoord.xy = vec2((gl_VertexID << 1) & 2, gl_VertexID & 2);
        vTexCoord.zw = vTexCoord.xy;

        gl_Position = vec4(vTexCoord.xy * 2.0f + -1.0f, 0.0f, 1.0f);
    }
    #endif // VERTEX

    #if defined FRAGMENT

    in vec4 vTexCoord;

    out vec4 outColor;

    layout(binding=0) uniform sampler2D input_texture;

    void main() {
        vec3 color = texture2D(input_texture, vTexCoord.xy).xyz;
        outColor = vec4(1, 1, 0, 1);
        outColor = vec4(color, 1);
    }
    #endif // FRAGMENT
}

Let’s start from the beginning. When the parser finds the ‘glsl’ keyword in the identifier method:

// C++

case 'g':
{
    if ( expectKeyword( token.text, 4, "glsl" ) ) {
        declarationGlsl( parser );
        return;
    }
    break;
}

It calls the method void declarationGlsl( Parser* parser ).

The lexer reading the HFX is after the glsl keyword when entering the method, just before the ToScreen identifier:

// HFX

glsl (Here!)ToScreen {

Let’s see the C++ code step by step. First parsing the name ‘ToScreen’:

// C++

inline void declarationGlsl( Parser* parser ) {

    // Parse name
    Token token;
    if ( !expectToken( parser->lexer, token, Token::Token_Identifier ) ) {
        return;
    }

as seen in other methods as well. We are defining a new code fragment, thus we need to initialize it. There is tracking of the #ifdef depths to manage when some code must be included in a code fragment and when not:

    CodeFragment code_fragment = {};
    // Cache name string
    code_fragment.name = token.text;

    for ( size_t i = 0; i < CodeFragment::Count; i++ ) {
        code_fragment.stage_ifdef_depth[i] = 0xffffffff;
    }

    if ( !expectToken( parser->lexer, token, Token::Token_OpenBrace ) ) {
        return;
    }

Next is simply arriving at the first token that contains all the glsl code:

    // Advance token and cache the starting point of the code.
    nextToken( parser->lexer, token );
    code_fragment.code = token.text;

And now some more parsing craftsmanship. We can no longer use the simple check that ends parsing at the first close brace, because the shader code can contain its own braces (structs, function bodies) that would break that mechanism. Instead we track the number of open braces, and when we close the last one we consider the parsing of the code fragment finished!

    uint32_t open_braces = 1;

    // Scan until close brace token
    while ( open_braces ) {

        if ( token.type == Token::Token_OpenBrace )
            ++open_braces;
        else if ( token.type == Token::Token_CloseBrace )
            --open_braces;

The only token that we care about inside the code fragment is the hash, signalling either an include or a define, the latter used for separating per-stage code. The parsing of the hash token will be done inside the directiveIdentifier method:

        // Parse hash for includes and defines
        if ( token.type == Token::Token_Hash ) {
            // Get next token and check which directive it is
            nextToken( parser->lexer, token );

            directiveIdentifier( parser, token, code_fragment );
        }

Before diving deep into the directive identifiers, let’s finish the main parsing routine. We advance to the next token until we close all the braces, and then save the text length of the whole code fragment:

        nextToken( parser->lexer, token );
    }
    
    // Calculate code string length
    code_fragment.code.length = token.text.text - code_fragment.code.text;

Final step is to save the newly parsed code fragment into the parser data:

    parser->code_fragments.emplace_back( code_fragment );
}
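The brace counting can be looked at in isolation. The following is a simplified sketch that works on raw characters of a std::string instead of lexer tokens, so it is not the parser code above; the function name is made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Returns the number of characters between an already consumed '{' and
// its matching '}', or std::string::npos if the braces never balance.
// 'start' is the index of the first character after the opening brace.
inline size_t findBlockLength( const std::string& source, size_t start ) {
    assert( start <= source.size() );

    size_t open_braces = 1;

    for ( size_t i = start; i < source.size(); ++i ) {
        if ( source[i] == '{' ) {
            ++open_braces;
        }
        else if ( source[i] == '}' ) {
            --open_braces;
            // The last open brace was closed: the block ends here.
            if ( open_braces == 0 ) {
                return i - start;
            }
        }
    }
    return std::string::npos;
}
```

Because this sketch looks at characters and not tokens, a brace inside a comment or a string would also be counted; working on lexer tokens, as the parser does, is one way to avoid that.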

We can now dive deep into the parsing of directives, namely #if defined, #pragma include and #endif.

Parsing ‘#if defined’

When we encounter the Hash token within the glsl part, we need to parse further to understand the other keywords. #if defined is the most important directive for us, because it tells the parser which shader stage we are currently parsing and thus where to direct the text! The parser starts from a common/shared stage, for shared code, and when it encounters an #if defined it can switch to stage-specific code.

Namely when parsing the following line in HFX:

// HFX

#(Here!)if defined VERTEX

The parser needs to check 2 other identifiers. Remember that the parser is currently AFTER the Hash token, as beautifully written in the previous snippet! Let’s look at the code:

// C++

inline void directiveIdentifier( Parser* parser, const Token& token, CodeFragment& code_fragment ) {
    
    Token new_token;
    for ( uint32_t i = 0; i < token.text.length; ++i ) {
        char c = *(token.text.text + i);

        switch ( c ) {
            case 'i':
            {
                // Search for the pattern 'if defined'
                if ( expectKeyword( token.text, 2, "if" ) ) {
                    nextToken( parser->lexer, new_token );

                    if ( expectKeyword( new_token.text, 7, "defined" ) ) {
                        nextToken( parser->lexer, new_token );

                        // Use 0 as not set value for the ifdef depth.
                        ++code_fragment.ifdef_depth;

                        if ( expectKeyword( new_token.text, 6, "VERTEX" ) ) {

                            code_fragment.stage_ifdef_depth[CodeFragment::Vertex] = code_fragment.ifdef_depth;
                            code_fragment.current_stage = CodeFragment::Vertex;
                        }
                        else if ( expectKeyword( new_token.text, 8, "FRAGMENT" ) ) {

                            code_fragment.stage_ifdef_depth[CodeFragment::Fragment] = code_fragment.ifdef_depth;
                            code_fragment.current_stage = CodeFragment::Fragment;
                        }
                        else if ( expectKeyword( new_token.text, 7, "COMPUTE" ) ) {

                            code_fragment.stage_ifdef_depth[CodeFragment::Compute] = code_fragment.ifdef_depth;
                            code_fragment.current_stage = CodeFragment::Compute;
                        }
                    }

                    return;
                }
                break;
            }

Let’s dissect this code!

Starting from the current token, just after the #(Hash), we need to check the correct composition of the keywords. We expect ‘if’, and then if found we go to the next token:

if ( expectKeyword( token.text, 2, "if" ) ) {
    nextToken( parser->lexer, new_token );

We search for the ‘defined’ identifier and if found we go to the next identifier:

if ( expectKeyword( new_token.text, 7, "defined" ) ) {
    nextToken( parser->lexer, new_token );

The parser is currently here:

#if defined (Here!)VERTEX

And thus the last step is to check which shader stage is currently starting. This is done here:

if ( expectKeyword( new_token.text, 6, "VERTEX" ) ) {

    code_fragment.stage_ifdef_depth[CodeFragment::Vertex] = code_fragment.ifdef_depth;
    code_fragment.current_stage = CodeFragment::Vertex;
}

In this central piece of code, we set the current stage to Vertex (because we found the keyword ‘VERTEX’) and we save the current ifdef depth. Why? Because when we parse #endif, we want to do what we did with the open/close brace depth in the main glsl parser: be sure that the defines are paired correctly, so that we save the per-stage code in the correct way! This will become clearer when we see the #endif parsing.

Moving on, we will do the same for all the other keywords (‘FRAGMENT’ and ‘COMPUTE’ for now):

else if ( expectKeyword( new_token.text, 8, "FRAGMENT" ) ) {

    code_fragment.stage_ifdef_depth[CodeFragment::Fragment] = code_fragment.ifdef_depth;
    code_fragment.current_stage = CodeFragment::Fragment;
}
else if ( expectKeyword( new_token.text, 7, "COMPUTE" ) ) {

    code_fragment.stage_ifdef_depth[CodeFragment::Compute] = code_fragment.ifdef_depth;
    code_fragment.current_stage = CodeFragment::Compute;
}

And the parsing of #if defined is over!

Parsing ‘#pragma include’

In HFX we are parsing the following:

// HFX

#pragma include "Platform.h"

With the following code (inside directiveIdentifier method):

// C++

case 'p':
{
    if ( expectKeyword( token.text, 6, "pragma" ) ) {
        nextToken( parser->lexer, new_token );

        if ( expectKeyword( new_token.text, 7, "include" ) ) {
            nextToken( parser->lexer, new_token );

            code_fragment.includes.emplace_back( new_token.text );
            code_fragment.includes_stage.emplace_back( code_fragment.current_stage );
        }

        return;
    }
    break;
}

This simply saves the filename after the include, which is classified as a string because it is surrounded by quotes (""), and uses the current stage to know which stage should include that file!
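The two arrays are kept in parallel: entry i of includes_stage is the stage that was active when entry i of includes was found. A small sketch of how such a pair can be filled and later queried per stage is below. It uses std::string instead of StringRef, the order of the Stage enum is an assumption, and "Lighting.h" in the usage is a hypothetical file name.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Assumed stage order, with Common last before Count.
enum Stage { Vertex = 0, Fragment, Compute, Common, Count };

struct IncludeList {
    std::vector<std::string> includes;        // file names, in source order
    std::vector<Stage>       includes_stage;  // stage active at each include

    void add( const std::string& file, Stage current_stage ) {
        includes.push_back( file );
        includes_stage.push_back( current_stage );
    }

    // Includes visible to a stage: its own ones plus the Common ones.
    std::vector<std::string> forStage( Stage stage ) const {
        assert( includes.size() == includes_stage.size() );

        std::vector<std::string> result;
        for ( size_t i = 0; i < includes.size(); ++i ) {
            if ( includes_stage[i] == stage || includes_stage[i] == Common ) {
                result.push_back( includes[i] );
            }
        }
        return result;
    }
};
```

For example, after add( "Platform.h", Common ) and add( "Lighting.h", Fragment ), the vertex stage sees only Platform.h while the fragment stage sees both files.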

Parsing ‘#endif’

Final part is the #endif identifier:

case 'e':
{
    if ( expectKeyword( token.text, 5, "endif" ) ) {

        if ( code_fragment.stage_ifdef_depth[CodeFragment::Vertex] == code_fragment.ifdef_depth ) {
            
            code_fragment.stage_ifdef_depth[CodeFragment::Vertex] = 0xffffffff;
            code_fragment.current_stage = CodeFragment::Common;
        }
        else if ( code_fragment.stage_ifdef_depth[CodeFragment::Fragment] == code_fragment.ifdef_depth ) {

            code_fragment.stage_ifdef_depth[CodeFragment::Fragment] = 0xffffffff;
            code_fragment.current_stage = CodeFragment::Common;
        }
        else if ( code_fragment.stage_ifdef_depth[CodeFragment::Compute] == code_fragment.ifdef_depth ) {

            code_fragment.stage_ifdef_depth[CodeFragment::Compute] = 0xffffffff;
            code_fragment.current_stage = CodeFragment::Common;
        }

        --code_fragment.ifdef_depth;

        return;
    }
    break;
}

This mirrors the #if defined parsing: it sets the current stage back to common/shared and resets the per-stage ifdef depth.
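To see why the depth is stored per stage, it helps to run the two handlers on a nested example. The sketch below condenses the #if defined and #endif logic into a tiny standalone struct; the struct and method names are mine, and the Stage enum order is an assumption.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Assumed stage order, with Common last before Count.
enum Stage { Vertex = 0, Fragment, Compute, Common, Count };

constexpr uint32_t k_not_set = 0xffffffff;

struct StageTracker {
    uint32_t ifdef_depth = 0;
    uint32_t stage_ifdef_depth[Count] = { k_not_set, k_not_set, k_not_set, k_not_set };
    Stage    current_stage = Common;

    // Called for every '#if defined NAME'. Names that are not a stage
    // still increase the depth, but do not change the current stage.
    void onIfDefined( const std::string& name ) {
        ++ifdef_depth;
        if ( name == "VERTEX" )        enter( Vertex );
        else if ( name == "FRAGMENT" ) enter( Fragment );
        else if ( name == "COMPUTE" )  enter( Compute );
    }

    // Called for every '#endif'. Only the endif at the depth saved for a
    // stage switches back to Common.
    void onEndif() {
        assert( ifdef_depth > 0 );

        for ( int stage = Vertex; stage < Common; ++stage ) {
            if ( stage_ifdef_depth[stage] == ifdef_depth ) {
                stage_ifdef_depth[stage] = k_not_set;
                current_stage = Common;
                break;
            }
        }
        --ifdef_depth;
    }

private:
    void enter( Stage stage ) {
        stage_ifdef_depth[stage] = ifdef_depth;
        current_stage = stage;
    }
};
```

With a nested block such as #if defined VERTEX followed by #if defined USE_FOG (a hypothetical define), the first #endif finds no stage saved at depth 2 and leaves the stage untouched; only the second one, at depth 1, goes back to Common.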

We can now proceed to the final part of the parsing, the passes! This is the glue to generate the different files from the code fragments.

Parsing ‘pass’

Reading the HFX file, we are now in the final part of the file:

// HFX

pass ToScreen {
   vertex = ToScreen
   fragment = ToScreen
}

A pass is simply a collection of code fragments associated with each shader stage (vertex, fragment, compute). When we parsed the fragments, we saved them in the parser to be retrieved.

To refresh our memory, this is the actual Pass struct in C++:

// C++

struct Pass {

    StringRef                   name;

    const CodeFragment*         vs                  = nullptr;
    const CodeFragment*         fs                  = nullptr;
    const CodeFragment*         cs                  = nullptr;

}; // struct Pass

Going back to the main identifier method, we call the declarationPass method when we encounter the ‘pass’ identifier. We will parse the following line:

// HFX

pass ToScreen {

With the following code (similar to everything else, it should be easier to read now):

// C++

inline void declarationPass( Parser* parser ) {

    Token token;
    if ( !expectToken( parser->lexer, token, Token::Token_Identifier ) ) {
        return;
    }

    Pass pass = {};
    // Cache name string
    pass.name = token.text;

    if ( !expectToken( parser->lexer, token, Token::Token_OpenBrace ) ) {
        return;
    }

After we saved the pass name we can start reading the individual stages using the passIdentifier method:

    while ( !equalToken( parser->lexer, token, Token::Token_CloseBrace ) ) {
        passIdentifier( parser, token, pass );
    }

And then save the newly parsed pass.

    parser->passes.emplace_back( pass );
}

For each identifier now, we will check which stage we are parsing. Currently we are here, after the open brace and all the whitespace:

// HFX

pass ToScreen {
   (Here!)vertex = ToScreen
   fragment = ToScreen
}

What comes next is thus checking the identifier and filling the corresponding shader stage of the pass. I will post all the code of the method, because it is similar to most of the code we have seen and should be straightforward:

// C++

inline void passIdentifier( Parser* parser, const Token& token, Pass& pass ) {
    // Scan the name to know which stage we are parsing    
    for ( uint32_t i = 0; i < token.text.length; ++i ) {
        char c = *(token.text.text + i);

        switch ( c ) {
            
            case 'c':
            {
                if ( expectKeyword( token.text, 7, "compute") ) {
                    declarationShaderStage( parser, &pass.cs );
                    return;
                }
                break;
            }

            case 'v':
            {
                if ( expectKeyword( token.text, 6, "vertex" ) ) {
                    declarationShaderStage( parser, &pass.vs );
                    return;
                }
                break;
            }

            case 'f':
            {
                if ( expectKeyword( token.text, 8, "fragment" ) ) {
                    declarationShaderStage( parser, &pass.fs );
                    return;
                }
                break;
            }
        }
    }
}

The real ‘magic’ here is the ‘declarationShaderStage’ method. This method parses the rest of the sequence ‘identifier’ ‘=’ ‘identifier’ (the equals sign and the fragment name), and searches for the code fragment with that name:

inline void declarationShaderStage( Parser* parser, const CodeFragment** out_fragment ) {

    Token token;
    if ( !expectToken( parser->lexer, token, Token::Token_Equals ) ) {
        return;
    }

    if ( !expectToken( parser->lexer, token, Token::Token_Identifier ) ) {
        return;
    }

    *out_fragment = findCodeFragment( parser, token.text );
}
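The lookup itself, findCodeFragment, is not listed here. A plausible minimal version is a linear search over the fragments saved by the parser; the sketch below uses reduced stand-ins for StringRef, CodeFragment and Parser, so take the types and the body as assumptions rather than the repository code.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <vector>

// Reduced, hypothetical versions of the parser types.
struct StringRef {
    const char* text;
    uint32_t    length;
};

struct CodeFragment {
    StringRef name;
};

struct Parser {
    std::vector<CodeFragment> code_fragments;
};

inline bool equals( const StringRef& a, const StringRef& b ) {
    return a.length == b.length && std::memcmp( a.text, b.text, a.length ) == 0;
}

// Linear search by name; returns nullptr when no fragment matches.
inline const CodeFragment* findCodeFragment( const Parser* parser, const StringRef& name ) {
    assert( parser != nullptr );

    for ( const CodeFragment& fragment : parser->code_fragments ) {
        if ( equals( fragment.name, name ) ) {
            return &fragment;
        }
    }
    return nullptr;
}
```

Two things are worth keeping in mind with this approach: a pass can only find fragments that were parsed before it, and a pointer into a std::vector stays valid only until the vector reallocates, so in this sketch all fragments should be added before any pass stores a pointer to them.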

After all the stages of the current pass are parsed, we save the pass and finish parsing the file!

Shader Permutation Generation

The final step of this amazing journey is the simplest: actually generating the single files we need. In our case another specific class, CodeGenerator, will generate the different files from the parsed HFX file.

Once we are done with the parsing, we can call the generateShaderPermutations method that will generate files for each shader stage in each pass:

void generateShaderPermutations( CodeGenerator* code_generator, const char* path ) {

    code_generator->string_buffer_0.clear();
    code_generator->string_buffer_1.clear();
    code_generator->string_buffer_2.clear();

    // For each pass and for each stage, generate a permutation file.
    const uint32_t pass_count = (uint32_t)code_generator->parser->passes.size();
    for ( uint32_t i = 0; i < pass_count; i++ ) {

        // Create one file for each code fragment
        const Pass& pass = code_generator->parser->passes[i];
        
        if ( pass.cs ) {
            outputCodeFragment( code_generator, path, CodeFragment::Compute, pass.cs );
        }

        if ( pass.fs ) {
            outputCodeFragment( code_generator, path, CodeFragment::Fragment, pass.fs );
        }

        if ( pass.vs ) {
            outputCodeFragment( code_generator, path, CodeFragment::Vertex, pass.vs );
        }
    }
}

The code should be straightforward, and the real action happens in the outputCodeFragment method. Let’s have a look at the code.

First we define some data, like the file extensions for each shader stage or the defines to compile the code:

// Additional data to be added to output shaders.
static const char*              s_shader_file_extension[CodeFragment::Count] = { ".vert", ".frag", ".compute", ".h" };
static const char*              s_shader_stage_defines[CodeFragment::Count] = { "#define VERTEX\r\n", "#define FRAGMENT\r\n", "#define COMPUTE\r\n", "" };

Then we start to write the file. We will use the string_buffer_0 to dynamically generate the path of the file without allocating memory:

void outputCodeFragment( CodeGenerator* code_generator, const char* path, CodeFragment::Stage stage, const CodeFragment* code_fragment ) {
    // Create file
    FILE* output_file;

    code_generator->string_buffer_0.clear();
    code_generator->string_buffer_0.append( path );
    code_generator->string_buffer_0.append( code_fragment->name );
    code_generator->string_buffer_0.append( s_shader_file_extension[stage] );
    fopen_s( &output_file, code_generator->string_buffer_0.data, "wb" );

    if ( !output_file ) {
        printf( "Error opening file. Aborting. \n" );
        return;
    }
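The string buffers are not shown in this article. To give an idea of how clear and append can work without allocating, here is a sketch of a fixed-capacity buffer. The type name, the template parameter and the truncation behaviour are assumptions made for this example; the real buffer is created with a size passed to initCodeGenerator.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical fixed-capacity buffer: storage is part of the object,
// so clear() and append() never allocate.
template <uint32_t Capacity>
struct FixedStringBuffer {
    static_assert( Capacity > 0, "The buffer needs room for the terminator." );

    char     data[Capacity] = { 0 };
    uint32_t size = 0;

    void clear() {
        size = 0;
        data[0] = 0;
    }

    // Appends as much of 'text' as fits, always keeping a null terminator.
    // Returns false if the text had to be truncated.
    bool append( const char* text ) {
        assert( text != nullptr );

        const uint32_t available = Capacity - 1 - size;
        const size_t   length = std::strlen( text );
        const uint32_t copied = length < available ? uint32_t( length ) : available;

        std::memcpy( data + size, text, copied );
        size += copied;
        data[size] = 0;
        return copied == length;
    }
};
```

Appending the path, the fragment name and the extension in sequence gives a string like ..\data\ToScreen.vert, and calling clear makes the same memory reusable for the next file.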

We then use string_buffer_1 to build the actual code that goes into the file. First, and most important, we add all the includes for this particular stage by opening each file, reading it into memory and appending it to the final code buffer.

We will still use string_buffer_0 to generate the path of the file:

    code_generator->string_buffer_1.clear();

    // Append includes for the current stage.
    for ( size_t i = 0; i < code_fragment->includes.size(); i++ ) {
        if ( code_fragment->includes_stage[i] != stage && code_fragment->includes_stage[i] != CodeFragment::Common ) {
            continue;
        }

        // Open and read file
        code_generator->string_buffer_0.clear();
        code_generator->string_buffer_0.append( path );
        code_generator->string_buffer_0.append( code_fragment->includes[i] );
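        char* include_code = ReadEntireFileIntoMemory( code_generator->string_buffer_0.data, nullptr );
        // Assuming ReadEntireFileIntoMemory returns nullptr on failure:
        // skip the include instead of appending a null pointer.
        if ( !include_code ) {
            printf( "Error reading include file. Skipping. \n" );
            continue;
        }

        code_generator->string_buffer_1.append( include_code );
        code_generator->string_buffer_1.append( "\r\n" );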
    }

After that is done we can copy the define needed for the current shader stage:

    code_generator->string_buffer_1.append( "\t\t" );
    code_generator->string_buffer_1.append( s_shader_stage_defines[stage] );

And finally the actual code:

    code_generator->string_buffer_1.append( "\r\n\t\t" );
    code_generator->string_buffer_1.append( code_fragment->code );

Write to file and close it and we are done!

    fprintf( output_file, "%s", code_generator->string_buffer_1.data );

    fclose( output_file );
}

And this will generate, for each pass, a single file per shader stage, using the usual GLSL conventions for file extensions.
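Putting the pieces together, the content of each generated file is just a concatenation. The sketch below rebuilds that layout with std::string; the function name is made up, the Stage enum order is an assumption, and the include text used in the usage note is a hypothetical content for Platform.h.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Assumed stage order, matching the order of the define table.
enum Stage { Vertex = 0, Fragment, Compute, Common, Count };

static const char* s_stage_defines[Count] = { "#define VERTEX\r\n", "#define FRAGMENT\r\n", "#define COMPUTE\r\n", "" };

// Builds the text of one generated file: include contents first,
// then the stage define, then the unmodified fragment code.
inline std::string composeStageSource( const std::vector<std::string>& include_contents, Stage stage, const std::string& fragment_code ) {
    assert( stage >= Vertex && stage < Count );

    std::string output;
    for ( const std::string& include_text : include_contents ) {
        output += include_text;
        output += "\r\n";
    }
    output += "\t\t";
    output += s_stage_defines[stage];
    output += "\r\n\t\t";
    output += fragment_code;
    return output;
}
```

Note that the whole code fragment is written to every stage file. With an include containing, say, "#version 450", the vertex file starts with that line, then defines VERTEX, then contains the full ToScreen code; it is the GLSL preprocessor that keeps only the #if defined VERTEX block when the file is compiled.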

Conclusions and next part

We parsed our simple shader language to enhance and embed glsl code fragments into our codebase by generating single files that can be used in any OpenGL-based renderer. We also laid the foundation for a more powerful tool, namely code generation, even though there are some intermediate steps to be taken to get there. First of all, we will need a target rendering library (something like the amazing Sokol), so we can specialize our CPU rendering code. I already wrote something like Sokol but with a more Vulkan/D3D12-like interface in mind, and I will use that. I am still unsure if I will write a specific post on that.

In the next article we will add support for the new graphics library and develop the language more to generate code that will manage Constant buffers, automatically creating a CPU-side class, adding UI to edit it in realtime and possibly load/save the values.

Of course, if you have any feedback/improvements/suggestions on anything related here (article, code, etc), please let me know.

Stay tuned! Gabriel

Gabriel Sassone
Principal Rendering/Engine Programmer