Age | Commit message (Collapse) | Author |
|
Saves ~480 bytes of code.
|
|
|
|
This didn't work for quad/quadstrips at all, and for all other primitive types
it only worked when they were unclipped.
Fix up the former in gs stage (could probably do without these changes and
instead set QuadsFollowProvokingVertexConvention to false), and the rest in
clip stage.
|
|
This keeps the individual state files from having to export their
structures for brw_state_cache initialization.
|
|
|
|
|
|
|
|
1. new PCI ids
2. fix some 3D commands on new chipset
3. fix send instruction on new chipset
4. new VUE vertex header
5. ff_sync message (added by Zou Nan Hai <nanhai.zou@intel.com>)
6. the offset in JMPI is in unit of 64bits on new chipset
7. new cube map layout
|
|
Noticed while debugging a weird 1D FBO testcase that left its existing
viewport and projection matrix in place when switching drawbuffers. Didn't
fix the testcase, though.
|
|
need to clamp point size to user set min/max values, even for constant
point size. Fixes glean pointAtten test.
|
|
The FBO pixel coordinate system, with (0,0) as the
upper-left pixel, is inverted in Y compared to the
normal OpenGL pixel coordinate system, which has
(0,0) as its lower-left pixel.
Viewport and polygon stipple are sensitive to this
inversion; so is point rasterization. The basic
fix is simple: when rendering to an FBO, instead
of the normal RASTRULE_UPPER_RIGHT that's
appropriate for OpenGL windows, use the Y inversion
RASTRULE_LOWER_RIGHT.
Unfortunately, current Intel documentation has this
value listed as "Reserved, but not seen as useful".
It does work on at least some i965-class devices,
though; and the worst that could happen if an
older device didn't support it would be incorrect
point rasterization to FBOs, which is what happens
already, so this fix is at least no worse than what
happens presently, and is better for some (and possibly
all) i965-class devices.
|
|
|
|
|
|
|
|
|
|
Anytime we're not rendering to the default/window FBO, need to invert
rendering, not just when rendering to a texture. Otherwise, if a FBO
consists of a mix of textures and renderbuffers the up/down orientation
was inconsistant.
Fixes shadowtex.c bad rendering.
|
|
|
|
Since we use an inverted viewport transformation for render to texture, that
inverts front/back polygon orientation.
Now glCullFace(GL_FRONT / GL_BACK) works correctly.
|
|
|
|
We were dividing the number of URB entries by two to get number of threads,
which looks suspiciously like a copy'n'paste-o from brw_vs_state.c. Also, the
maximum number of threads is 24, not 12.
|
|
Makefile.template
|
|
|
|
This reverts commit 7c81124d7c4a4d1da9f48cbf7e82ab1a3a970a7a.
|
|
This reverts commit 53675e5c05c0598b7ea206d5c27dbcae786a2c03.
Conflicts:
src/mesa/drivers/dri/i965/brw_wm_surface_state.c
|
|
To do this, I had to clean up some of 965 state upload stuff. We may end
up over-emitting state in the aperture overflow case, but that should be rare,
and I'd rather have the simplification of state management.
|
|
This is an API breakage only.
|
|
|
|
The GEM flags are much more descriptive for what we need. Since this makes
bufmgr_fake rather device-specific, move it to the intel common directory.
We've wanted to do device-specific stuff to it before.
|
|
Makes state emission into a 2 phase, prepare sets things up and accounts
the size of all referenced buffer objects. The emit stage then actually
does the batchbuffer touching for emitting the objects.
There is an assert in dri_emit_reloc if a reloc occurs for a buffer
that hasn't been accounted yet.
|
|
|
|
|
|
These fields are no longer indexed by shader output. Now, we just have
a simple array of renderbuffer pointers.
If the shader writes to gl_FragData[i], send those colors to the N
_ColorDrawBuffers. Otherwise, replicate the single gl_FragColor (or
the fixed-function color) to the N _ColorDrawBuffers.
A few more changes and simplifications can follow from this...
|
|
We have two consumers of relocations. One is static state buffers, which
want the same relocation every time. The other is the batchbuffer, which gets
thrown out immediately after submit. This lets us reduce repeated computation
for static state buffers, and clean up the code by moving relocations nearer
to where the state buffer is computed.
|
|
|
|
To do so, merge the remainnig necessary code from the buffers, blit, span, and
screen code to shared, and replace it with those.
|
|
|
|
|
|
The user-space suballocator that was used avoided relocation computations by
using the general and surface state base registers and allocating those types
of buffers out of pools built on top of single buffer objects. It also
avoided calls into the buffer manager for these small state allocations, since
only one buffer object was being used.
However, the buffer allocation cost appears to be low, and with relocation
caching, computing relocations for buffers is essentially free. Additionally,
implementing the suballocator required a don't-fence-subdata flag to disable
waiting on buffer maps so that writing new data didn't block on rendering using
old data, and careful handling when mapping to update old data (which we need
to do for unavoidable relocations with FBOs). More importantly, when the
suballocator filled, it had no replacement algorithm and just threw out all
of the contents and forced them to be recomputed, which is a significant cost.
This is the first step, which just changes the buffer type, but doesn't yet
improve the hash table to not result in full recompute on overflow. Because
the buffers are all allocated out of the general buffer allocator, we can
no longer use the general/surface state bases to avoid relocations, and they
are set to 0 instead.
|
|
In the process, fix some alignment issues:
- Scratch space allocation was aligned into units of 1KB, while the allocation
wanted units of bytes, so we never allocated enough space for scratch.
- GRF register count was programmed as ALIGN(val - 1, 16) / 16 instead of
ALIGN(val, 16) / 16 - 1, which overcounted for val != 16n+1.
|
|
glPolygonMode with point sprite on i965
|
|
|
|
The clamping for these values depends on whether we're drawing AA or non-AA
points, lines. Defer clamping until drawing time. Drivers could compute and
keep clamped AA and clamped non-AA values if desired.
|
|
This driver comes from Tungsten Graphics, with a few further modifications by
Intel.
|