<HTML>

<head>
<TITLE>Shading Language Support</TITLE>
<link rel="stylesheet" type="text/css" href="mesa.css">
</head>

<BODY>

<H1>Shading Language Support</H1>

<p>
This page describes the features and status of Mesa's support for the
<a href="http://opengl.org/documentation/glsl/" target="_parent">
OpenGL Shading Language</a>.
</p>

<p>
Last updated on 15 December 2008.
</p>

<p>
Contents
</p>
<ul>
<li><a href="#envvars">Environment variables</a>
<li><a href="#120">GLSL 1.20 support</a>
<li><a href="#unsup">Unsupported Features</a>
<li><a href="#notes">Implementation Notes</a>
<li><a href="#hints">Programming Hints</a>
<li><a href="#standalone">Stand-alone GLSL Compiler</a>
<li><a href="#implementation">Compiler Implementation</a>
<li><a href="#validation">Compiler Validation</a>
</ul>


<a name="envvars">
<h2>Environment Variables</h2>

<p>
The <b>MESA_GLSL</b> environment variable can be set to a comma-separated
list of keywords to control some aspects of the GLSL compiler and shader
execution. These are generally used for debugging.
</p>
<ul>
<li>dump - print GLSL shader code to stdout at link time
<li>log - log all GLSL shaders to files.
    The filenames will be "shader_X.vert" or "shader_X.frag" where X
    is the shader ID.
<li>nopt - disable compiler optimizations
<li>opt - force compiler optimizations
<li>uniform - print a message to stdout when glUniform is called
<li>nopvert - force vertex shaders to be a simple shader that just transforms
    the vertex position with ftransform() and passes through the color and
    texcoord[0] attributes.
<li>nopfrag - force fragment shaders to be a simple shader that passes
    through the color attribute.
<li>useprog - log glUseProgram calls to stderr
</ul>
<p>
Example: export MESA_GLSL=dump,nopt
</p>


<a name="120">
<h2>GLSL 1.20 support</h2>

<p>
GLSL version 1.20 is supported in Mesa 7.3 and later.
Among the features/differences of GLSL 1.20 are:
</p>
<ul>
<li><code>mat2x3, mat2x4</code>, etc.
types and functions
<li><code>transpose(), outerProduct(), matrixCompMult()</code> functions
(but untested)
<li>precision qualifiers (lowp, mediump, highp)
<li><code>invariant</code> qualifier
<li><code>array.length()</code> method
<li><code>float[5] a;</code> array syntax
<li><code>centroid</code> qualifier
<li>unsized array constructors
<li>initializers for uniforms
<li>const initializers calling built-in functions
</ul>


<a name="unsup">
<h2>Unsupported Features</h2>

<p>
The following features of the shading language are not yet fully supported
in Mesa:
</p>

<ul>
<li>Linking of multiple shaders does not always work. Currently, linking
    is implemented by concatenating the shaders and recompiling, which can
    fail because of some #pragma and preprocessor issues.
<li>gl_ClipVertex
<li>The gl_Color and gl_SecondaryColor varying variables are interpolated
    without perspective correction
</ul>

<p>
All other major features of the shading language should function.
</p>


<a name="notes">
<h2>Implementation Notes</h2>

<ul>
<li>Shading language programs are compiled into low-level programs
    very similar to those of GL_ARB_vertex/fragment_program.
<li>All vector types (vec2, vec3, vec4, bvec2, etc.) currently occupy full
    float[4] registers.
<li>Float constants and variables are packed so that up to four floats
    can occupy one program parameter/register.
<li>All function calls are inlined.
<li>Shaders which use too many registers will not compile.
<li>The quality of generated code is pretty good; register usage is fair.
<li>Shader error detection and reporting (InfoLog) are not
    very good yet.
<li>The ftransform() function doesn't necessarily match the results of
    fixed-function transformation.
</ul>

<p>
These issues will be addressed/resolved in the future.
</p>


<a name="hints">
<h2>Programming Hints</h2>

<ul>
<li>Declare <em>in</em> function parameters as <em>const</em> whenever possible.
This improves the efficiency of function inlining.
</li>
<br>
<li>To reduce register usage, declare variables within smaller scopes.
For example, the following code:
<pre>
    void main()
    {
       vec4 a1, a2, b1, b2;
       gl_Position = expression using a1, a2;
       gl_FrontColor = expression using b1, b2;
    }
</pre>
can be rewritten as follows to use half as many registers:
<pre>
    void main()
    {
       {
          vec4 a1, a2;
          gl_Position = expression using a1, a2;
       }
       {
          vec4 b1, b2;
          gl_FrontColor = expression using b1, b2;
       }
    }
</pre>
Alternatively, rather than using several float variables, use
a vec4 instead. Use swizzling and writemasks to access the
components of the vec4 as floats.
</li>
<br>
<li>Use the built-in library functions whenever possible.
For example, instead of writing this:
<pre>
    float x = 1.0 / sqrt(y);
</pre>
Write this:
<pre>
    float x = inversesqrt(y);
</pre>
</li>
<li>
   Use ++i when possible, as it is more efficient than i++.
</li>
</ul>


<a name="standalone">
<h2>Stand-alone GLSL Compiler</h2>

<p>
A stand-alone GLSL compiler driver has been added to Mesa.
</p>

<p>
The stand-alone compiler (like a conventional command-line compiler)
is a tool that accepts Shading Language programs and emits low-level
GPU programs.
</p>

<p>
This tool is useful for:
</p>
<ul>
<li>Inspecting GPU code to gain insight into compilation
<li>Generating initial GPU code for subsequent hand-tuning
<li>Debugging the GLSL compiler itself
</ul>

<p>
After building Mesa, the glslcompiler can be built by manually running:
</p>
<pre>
    make realclean
    make linux
    cd src/mesa/drivers/glslcompiler
    make
</pre>

<p>
Here's an example of using the compiler to compile a fragment shader and
emit GL_ARB_fragment_program-style instructions:
</p>
<pre>
    bin/glslcompiler --debug --numbers --fs progs/glsl/CH06-brick.frag.txt
</pre>
<p>
results in:
</p>
<pre>
# Fragment Program/Shader
  0: RCP TEMP[4].x, UNIFORM[2].xxxx;
  1: RCP TEMP[4].y, UNIFORM[2].yyyy;
  2: MUL TEMP[3].xy, VARYING[0], TEMP[4];
  3: MOV TEMP[1], TEMP[3];
  4: MUL TEMP[0].w, TEMP[1].yyyy, CONST[4].xxxx;
  5: FRC TEMP[1].z, TEMP[0].wwww;
  6: SGT.C TEMP[0].w, TEMP[1].zzzz, CONST[4].xxxx;
  7: IF (NE.wwww); # (if false, goto 9);
  8:    ADD TEMP[1].x, TEMP[1].xxxx, CONST[4].xxxx;
  9: ENDIF;
 10: FRC TEMP[1].xy, TEMP[1];
 11: SGT TEMP[2].xy, UNIFORM[3], TEMP[1];
 12: MUL TEMP[1].z, TEMP[2].xxxx, TEMP[2].yyyy;
 13: LRP TEMP[0], TEMP[1].zzzz, UNIFORM[0], UNIFORM[1];
 14: MUL TEMP[0].xyz, TEMP[0], VARYING[1].xxxx;
 15: MOV OUTPUT[0].xyz, TEMP[0];
 16: MOV OUTPUT[0].w, CONST[4].yyyy;
 17: END
</pre>

<p>
Note that some shading language constructs (such as uniform and varying
variables) aren't expressible in ARB or NV-style programs.
Therefore, the resulting output is not always legal under the definitions
of those program languages.
</p>
<p>
Also note that this compiler driver is still under development.
Over time, the correctness of the GPU programs, with respect to the ARB
and NV languages, should improve.
</p>


<a name="implementation">
<h2>Compiler Implementation</h2>

<p>
The source code for Mesa's shading language compiler is in the
<code>src/mesa/shader/slang/</code> directory.
</p>

<p>
The compiler follows a fairly standard design and basically works as follows:
</p>
<ul>
<li>The input string is tokenized (see grammar.c) and parsed
(see slang_compile_*.c) to produce an Abstract Syntax Tree (AST).
The nodes in this tree are slang_operation structures
(see slang_compile_operation.h).
The nodes are decorated with symbol table, scoping and datatype information.
<li>The AST is converted into an Intermediate Representation (IR) tree
(see the slang_codegen.c file).
The IR nodes represent basic GPU instructions, like add, dot product,
move, etc.
The IR tree is mostly a binary tree, but a few nodes have three or four
children.
In principle, the IR tree could be executed by doing an in-order traversal.
<li>The IR tree is traversed in-order to emit code (see slang_emit.c).
This is also when registers are allocated to store variables and temps.
<li>In the future, a pattern-matching code generator-generator may be
used for code generation.
Programs such as L-BURG (Bottom-Up Rewrite Generator) and Twig look for
patterns in IR trees, compute weights for subtrees and use the weights
to select the best instructions to represent the sub-tree.
<li>The emitted GPU instructions (see prog_instruction.h) are stored in a
gl_program object (see mtypes.h).
<li>When a fragment shader and vertex shader are linked (see slang_link.c)
the varying variables are matched up, uniforms are merged, and vertex
attributes are resolved (rewriting instructions as needed).
</ul>

<p>
The final vertex and fragment programs may be interpreted in software
(see prog_execute.c) or translated for a specific hardware architecture
(see drivers/dri/i915/i915_fragprog.c for example).
</p>

<h3>Code Generation Options</h3>

<p>
Internally, there are several options that control the compiler's code
generation and instruction selection.
These options are found in the gl_shader_state struct and may be set
by the device driver to indicate its preferences:
</p>
<pre>
struct gl_shader_state
{
   ...
   /** Driver-selectable options: */
   GLboolean EmitHighLevelInstructions;
   GLboolean EmitCondCodes;
   GLboolean EmitComments;
};
</pre>

<ul>
<li>EmitHighLevelInstructions
<br>
This option controls instruction selection for loops and conditionals.
If the option is set, high-level IF/ELSE/ENDIF, LOOP/ENDLOOP, CONT/BRK
instructions will be emitted.
Otherwise, those constructs will be implemented with BRA instructions.
</li>

<li>EmitCondCodes
<br>
If set, condition codes (as in GL_NV_fragment_program) will be used for
branching and looping.
Otherwise, ordinary registers will be used (the IF instruction will
examine the first operand's X component and do the if-part if non-zero).
This option is only relevant if EmitHighLevelInstructions is set.
</li>

<li>EmitComments
<br>
If set, instructions will be annotated with comments to help with debugging.
Extra NOP instructions will also be inserted.
</li>
</ul>


<a name="validation">
<h2>Compiler Validation</h2>

<p>
A <a href="http://glean.sf.net" target="_parent">Glean</a> test has
been created to exercise the GLSL compiler.
</p>
<p>
The <em>glsl1</em> test runs over 170 sub-tests to check that the language
features and built-in functions work properly.
This test should be run frequently while working on the compiler to catch
regressions.
</p>
<p>
The test coverage is reasonably broad and complete, but additional tests
should be added.
</p>


</BODY>
</HTML>