Age | Commit message | Author |
|
This small set of changes repairs several different stenciling problems;
now redbook/stencil also runs correctly (and maybe others - I haven't
checked everything yet).
- The number of instructions that had been allocated for fragment ops
used to be 64 (in cell/common.h). With complicated stencil use, we
managed to get up to 93, which caused a segfault before we noticed
we'd overrun our memory buffer. It's now been bumped to 128,
which should be enough for even complicated stencil and fragment op
usage.
- The status of cell surfaces never changed beyond the initial
PIPE_SURFACE_STATUS_UNDEFINED. When a user called glClear()
to clear just the Z buffer (but not the stencil buffer), this caused
the check_clear_depth_with_quad() function to return false (because
the surface status was believed to be undefined), and so the device
was instructed to clear the whole buffer (including the stencil buffer),
instead of correctly using a quad to clear just the depth, leaving the
stencil alone.
This has been fixed similarly to the way the i915 driver handles
the surface status: during cell_clear_surface(), the status is
set to PIPE_SURFACE_STATUS_DEFINED. Then a partial buffer clear is
handled with a quad, as expected. Note that we are *not* using
PIPE_SURFACE_STATUS_CLEAR (also similar to the i915); technically,
we should be setting the surface status to CLEAR on a clear, and
to DEFINED when we actually draw something (say on cell_vbuf_draw()),
but it's difficult to figure out exactly which surfaces are affected
by a cell_vbuf_draw(), so for now we're doing the easy thing.
- The fragment ops handling was very clever about only pulling out the
parts of the Z/stencil buffer that it needed for calculations;
but this failed when only part of the buffer was written, because
the part that was never pulled out was inadvertently cleared.
Now all the data from the combined Z/stencil buffer is pulled out,
just so the proper values can be recombined later and written back
to the buffer correctly. As a bonus, the fragment op code generation
is simplified.
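
The third fix above (pull out all of the combined Z/stencil data, then
recombine it) can be sketched in C roughly as follows. This is only an
illustration, not the generated SPU code: the bit layout (stencil in the
top 8 bits, depth in the low 24) and all helper names are assumptions
made for the example.

```c
#include <stdint.h>

/* ASSUMED layout, not confirmed by the commit message:
 * stencil in bits 31..24, depth in bits 23..0 of each 32-bit word. */
#define Z_MASK        0x00ffffffu
#define STENCIL_SHIFT 24

/* Split a combined word into its depth and stencil parts. */
static void unpack_zs(uint32_t zs, uint32_t *z, uint8_t *stencil)
{
   *z = zs & Z_MASK;
   *stencil = (uint8_t)(zs >> STENCIL_SHIFT);
}

/* Recombine both parts into one word for writing back. */
static uint32_t pack_zs(uint32_t z, uint8_t stencil)
{
   return ((uint32_t)stencil << STENCIL_SHIFT) | (z & Z_MASK);
}

/* Hypothetical helper: update only the depth. Because the stencil part
 * was pulled out too, it is carried through instead of being zeroed. */
static uint32_t write_depth_only(uint32_t old_zs, uint32_t new_z)
{
   uint32_t old_z;
   uint8_t stencil;

   unpack_zs(old_zs, &old_z, &stencil);
   (void)old_z;   /* the old depth is simply replaced */
   return pack_zs(new_z, stencil);
}
```

Unpacking both halves unconditionally costs a little extra work, but it
means the write-back never depends on which half the fragment ops touched.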
|
Zero-out the index for disabled execution channels to avoid using potential
garbage values (thus avoiding bad array indexing).
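
A minimal C sketch of the idea, assuming four execution channels and an
execution mask with one bit per channel. The function name and the mask
encoding are hypothetical and only illustrate the technique:

```c
#include <stdint.h>

#define NUM_CHANNELS 4

/* exec_mask: bit i set means channel i is enabled (assumed encoding).
 * Disabled channels may hold garbage in index[], so force them to 0,
 * which is always a valid array index. */
static void sanitize_indices(int32_t index[NUM_CHANNELS], unsigned exec_mask)
{
   for (int i = 0; i < NUM_CHANNELS; i++) {
      if (!(exec_mask & (1u << i)))
         index[i] = 0;
   }
}
```

After this step a fetch can be done for all four channels without
branching; the values loaded for disabled channels are simply discarded.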
|
|
Fixes progs/vp/arl.txt test.
|
Conflicts:
src/gallium/auxiliary/rtasm/rtasm_execmem.c
src/mesa/shader/slang/slang_emit.c
src/mesa/shader/slang/slang_log.c
src/mesa/state_tracker/st_atom_framebuffer.c
|
|
This prevents vertex shaders from referencing invalid memory locations when
the shader is operating on fewer than four vertices or fragments.
|
Mainly for debugging purposes for now.
|
The Cell stencil tests were completely ignoring the stencil value mask.
Now the original code paths are still used if the stencil value mask
is all 1s; but code to use the mask for the stencil value and reference
value comparisons is now emitted if the mask is not all 1s.
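
In scalar C terms, the masked comparison follows the usual OpenGL rule of
ANDing both the reference value and the stored stencil value with the
value mask before comparing. The sketch below is not the emitted SPU
code; the enum and function names are made up for the example, and only
three comparison functions are shown.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical subset of the comparison functions. */
enum cmp_func { CMP_EQUAL, CMP_LESS, CMP_GREATER };

static bool stencil_test(enum cmp_func func, uint8_t ref,
                         uint8_t stencil, uint8_t value_mask)
{
   /* Fast path: an all-1s mask changes nothing, so skip the ANDs. */
   if (value_mask != 0xff) {
      ref &= value_mask;
      stencil &= value_mask;
   }

   switch (func) {
   case CMP_EQUAL:   return ref == stencil;
   case CMP_LESS:    return ref < stencil;
   case CMP_GREATER: return ref > stencil;
   }
   return false;
}
```

For example, with a mask of 0x0f the values 0x13 and 0x23 compare equal,
while with a mask of 0xff they do not.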
|
|
temps. This avoids useless writes of temporary results.
|
Two definite bugs in stenciling were fixed.
The first, reversed registers in the generated Select Bytes (selb)
instruction, caused the stenciling INCR and DECR operations to
fail dramatically, putting new values in where old values were
supposed to be and vice versa.
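
To see why the operand order matters, here is a scalar C model of a
bitwise select: where a mask bit is 0 the result bit comes from the
first operand, and where it is 1 it comes from the second. The function
below is an illustration only, not the code generator itself.

```c
#include <stdint.h>

/* Bitwise select: mask bit 0 picks from a, mask bit 1 picks from b. */
static uint32_t selb(uint32_t a, uint32_t b, uint32_t mask)
{
   return (a & ~mask) | (b & mask);
}
```

With old = 0x11111111, new = 0x22222222 and mask = 0x0000ffff,
selb(old, new, mask) yields 0x11112222, whereas the reversed call
selb(new, old, mask) yields 0x22221111: the new values land exactly where
the old ones should have been kept.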
The second prevented the SPUs from reading stencil tiles from, and
writing them to, main memory. A per-SPU flag, spu.read_depth, was used
to indicate whether the SPU should be reading depth tiles, and was set
only when depth was enabled. A second flag, spu.read_stencil, was
set when stenciling was enabled, but never referenced.
As stenciling and depth are in the same tiles on the Cell, and there
is no corresponding TAG_WRITE_TILE_STENCIL to complement
TAG_WRITE_TILE_COLOR and TAG_WRITE_TILE_Z, I fixed this by
eliminating the unused "spu.read_stencil", renaming "spu.read_depth"
to "spu.read_depth_stencil", and setting it if either stenciling or
depth is enabled.
I also added an optimization to the fragment ops generation code
that avoids calculating stencil values and/or stencil writemask
when the stencil operations are all KEEP.
|
|
blocks during st_readpixels due to a flush wait not happening in order to allow any previous rendering to complete.
|
Was 32, now 5. The param is expressed as a power of two exponent.
The net effect is that the alignment was a no-op on x86 but on PPC we
always got the same memory address every time rtasm_exec_malloc() was called.
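
For illustration, an alignment expressed as a power-of-two exponent is
typically applied as in the sketch below. The helper name is
hypothetical and this is not the allocator's actual code. With an
exponent of 5 the alignment is 1 << 5 = 32 bytes; passing 32 instead
asks for a shift by 32, which C leaves undefined for a 32-bit type, and
that would be consistent with the different behavior seen on the two
architectures.

```c
#include <stdint.h>

/* Round addr up to a multiple of (1 << align2).
 * align2 is the exponent, e.g. 5 for 32-byte alignment.
 * The caller must keep align2 below the bit width of uintptr_t. */
static uintptr_t align_up_pow2(uintptr_t addr, unsigned align2)
{
   const uintptr_t align = (uintptr_t)1 << align2;
   return (addr + align - 1) & ~(align - 1);
}
```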
|
|
|
functions in mesa/main/mm.c
|
|
|
|
Scalar calls only use the X component of the src regs and smear the
result across the dest register's X/Y/Z/W.
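
A rough C model of this behavior, with a hypothetical helper name and a
plain function pointer standing in for the scalar call (the real code
works on SIMD registers rather than float arrays):

```c
typedef float (*scalar_fn)(float);

/* Apply a scalar function to the X component only and smear the
 * result across all four destination components. */
static void exec_scalar(scalar_fn fn, const float src[4], float dst[4])
{
   const float result = fn(src[0]);   /* Y/Z/W of src are ignored */

   for (int i = 0; i < 4; i++)
      dst[i] = result;
}

/* Example scalar operation, in the spirit of an RCP-style opcode. */
static float reciprocal(float x)
{
   return 1.0f / x;
}
```

Calling exec_scalar(reciprocal, src, dst) with src = {4, 100, 200, 300}
leaves 0.25 in every component of dst.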
|
|
|
|
That's the last of the ARB_v_p opcodes, except for ARL.
|