Age | Commit message (Collapse) | Author |
|
|
|
This moves construction of the mapping between VP outputs
and FP inputs into validation.
The map also contains slots for special outputs like clip
distance and point size, so we need to at least merge the
VP related and FP related parts on validation if we want
to support those.
Now we match every single FP input component with results
from the VP and leave those not read out of the map, or
replace those not written by 0 (xyz) or 1 (w).
The bitmap indicating linear interpolants is also filled,
and flat FP inputs are mapped in only after non-flat ones,
as is required.
Furthermore, we can save some space by only fetching VP
attrs we actually use, and avoid wasting any output regs
because of TGSI using less than 4 components.
|
|
Make use of tgsi_shader_info to determine how many nv50_regs we
need to allocate, whether program uses KIL, or writes DEPR.
|
|
|
|
|
|
|
|
|
|
Makes some opcode cases nicer and might reduce the total
nr of TEMPs required, or save some MOVs.
|
|
|
|
We're going to try to reorder the scalar ops of a vector instr
to accomodate swizzles that would otherwise require us to emit
to an additional TEMP first (like MOV R0.xy, R0.zx).
|
|
Extend its usage to avoiding e.g. emission of negation
instructions in tx_insn for sources we don't need.
|
|
|
|
|
|
|
|
1D tile span support for depth/stencil/color/textures
Z and stencil buffers are always tiled, so this fixes
sw access to Z and stencil buffers. color and textures
are currently linear, but this adds span support when we
implement 1D tiling.
This fixes the text in progs/demos/engine and progs/tests/z*
|
|
Noticed by rnoland on IRC.
|
|
|
|
This fixes the compilation errors reported in bug 23945 but someone more
familiar with the code should review for correctness and close the bug
report.
|
|
|
|
It is missing in some Microsoft DDKs.
|
|
|
|
Commit 36dd53a3cded9d003ec418732b7fc93c1476aa9b caused a few regressions
because the glReadBuffer() buffer wasn't getting mapped when GL_READ_BUFFER
!= GL_DRAW_BUFFER.
|
|
These default swrast functions are already installed by
_mesa_init_driver_functions().
|
|
|
|
Should be easier to read and work with than the older ways of emitting
TGSI tokens.
Also, emit simpler TGSI than previously:
- translate away source and dest extended modifiers
- translate away the SWZ opcode
|
|
|
|
Union not worth the hassle of violating C99 or adding a name to
the structure.
|
|
|
|
|
|
Src/Dst aliasing (aka SOA dependencies) requires some care to ensure
intermediate results do not overwrite yet-to-be read source registers.
This change ensures that MOV/SWZ handle this correctly, which is poor but
no worse than the current tgsi_exec.c path. Remove the fallback as there
is nothing to be gained correctness-wise between the two implementations now.
Fixing this properly looks like a bit of work in this code, but might be
easily achieved by sending destination writes to temporary storage.
|
|
|
|
|
|
|
|
registers.
How I'm thankful for regular expressions -- just a couple of them were
all that was needed to do this otherwise tiresome and bug prone change.
|
|
Basically cover all low hanging fruit, and mark the still missing opcodes
as "fixme" or deprecated.
|
|
We are relying on SSE4.1 for round/trunc/ceil/floor. We'll need to
eventually find alternatives for the rest of the world.
|
|
meaning.
|
|
Fix recent performance regression.
|
|
|
|
|
|
Previously ureg would always call the driver's create-shader function. This
allows the caller the opportunity to hold onto the tokens if it needs to
reuse them, eg. to create an internal draw shader.
|
|
Couldn't previously emit these except by calling the opcode-specific helper.
|
|
Avoid the need to emit all constant declarations in order. Makes
referring to a specific constant in the constant buffer much easier.
|
|
Never set in mesa. Remove from tgsi translation as well.
|
|
Fix ureg_DECL_vs_input to reflect this and fix up all callers.
|
|
|
|
|
|
|
|
Individual texture images have a stride, but textures as a whole do not.
There are still pieces of code which are confused about this, but the core of
the confusion is hopefully gone.
Signed-off-by: Nicolai Hähnle <nhaehnle@gmail.com>
|
|
Previously, it was trying to mess around with the varying's
WM setup data to produce a result. Along with not actually working when
passed a varying, this wouldn't work if you did dFd[xy]() on a temporary.
Instead, just calculate the derivative using the neighbors in the subspan.
|