|
Otherwise, glDrawBuffer(GL_NONE); glDrawPixels() results in a segfault when
we try to emit the color buffer state during setup.
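For reference, the triggering sequence is just the following (a minimal
reproduction based on the description above; width, height, and pixels stand
for whatever the application supplies):

    glDrawBuffer(GL_NONE);                      /* select no color buffer */
    glDrawPixels(width, height, GL_RGBA,        /* setup then tries to emit */
                 GL_UNSIGNED_BYTE, pixels);     /* color buffer state -> crash */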
|
|
The IS_945 case was left to fall through to the 830 case, along with the
not-recognized-at-all case, making the 945 path dead code.
|
|
Currently only implemented for Intel hardware.
|
|
It wasn't being initialized at screen setup, so we were getting stub
entrypoints even though the extension was exposed as enabled. Fixes the
arbocclude Mesa demo.
|
|
The 965 path wasn't setting pClipRects for batch submission, since it
previously didn't want kernel cliprect handling. The 915 path also grew the
INTEL_NO_HW=1 option for testing just driver overhead.
|
|
It's not worth the extra effort to avoid a free/malloc, and we'd rather
auto-size the reloc data buffer at some point so we don't need to have
max_relocs.
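The auto-sizing idea amounts to a growable array, along these lines
(illustrative code, not the driver's; error handling elided):

    #include <stdlib.h>
    #include <string.h>

    struct reloc_array {
        void  *data;
        size_t count;
        size_t capacity;
    };

    static void reloc_array_append(struct reloc_array *a,
                                   const void *entry, size_t entry_size)
    {
        if (a->count == a->capacity) {
            /* Double instead of capping at a fixed max_relocs. */
            a->capacity = a->capacity ? a->capacity * 2 : 64;
            a->data = realloc(a->data, a->capacity * entry_size);
        }
        memcpy((char *)a->data + a->count * entry_size, entry, entry_size);
        a->count++;
    }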
|
|
Harmless since we don't yet have any bits above 31 for flags.
|
|
The failure mode that was available was:
reloc 1 -> target_buf
exec: PRESUMED_OFFSET wrong, buffer migrates, r1 entry updated.
reloc 2 -> target_buf
exec: suppose buffer migrates again. PRESUMED_OFFSET wrong. r2 entry updated.
reloc 1 -> target_buf
exec: suppose buffer doesn't migrate. PRESUMED_OFFSET right. no relocations
performed. r1 has stale pointer at original location.
Failures were reported with OGLconform's VBO test and SPECviewperf90, though
I haven't confirmed that this fixes it.
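Schematically, the unsafe skip looks like this (a simplified model with
hypothetical names, not the driver's actual relocation code):

    #include <stdint.h>
    #include <stddef.h>

    struct target { uint64_t offset; };   /* current location of target_buf */

    struct reloc {
        size_t         where;             /* dword index in the relocatee */
        uint64_t       presumed;          /* offset this entry last assumed */
        struct target *t;
    };

    /* Buggy pass: trusts 'presumed' to describe what is actually in memory.
     * When entries for the same target get updated at different times (as
     * in the trace above), the skip leaves a stale dword behind. */
    static void apply_reloc(uint32_t *buf, struct reloc *r)
    {
        if (r->presumed == r->t->offset)
            return;                       /* skipped: buf may still be stale */
        buf[r->where] = (uint32_t)r->t->offset;
        r->presumed   = r->t->offset;
    }

The skip is only safe if the dword actually written in the buffer is known to
match the cached presumed offset; updating every entry for a moved target
restores that invariant.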
|
|
In addition to potentially binding when it was about to be mapped anyway,
failure to use CACHED_MAPPED means eating a full wbinvd on validate. Thanks to
airlied for catching this.
|
|
This reverts commit e2cb905bc6e23eaafaeeb2abdc9480e70959ee3f.
The reverted commit was a removal of an optimization disguised as something
else: it left pre_target_buf_handle always NULL, so the optimization was never
enabled, rather than fixing the important optimization (the loss of which cost
25-50% in performance).
|
|
This fixes a bad-texture issue in some games (UT and Quake).
|
|
This fixes the case where one target buffer is used for a relocatee in a
former relocation pass, and then another target buffer is used for that
relocatee at the same offset in the current relocation pass.
|
|
The screen-wide info such as pitch and cpp is obsoleted by the FBO
changes, so clean up the last few references to it, except for
setting up the legacy screen regions.
|
|
Around 10% of a CPU was being wasted creating the linked list, which we
threw away immediately after passing it to the kernel.
|
|
The fix for pageflipping with cliprects ended up causing a batch flush at
an inopportune time, which is fixed by moving it up.
Additionally, the recovery code for handling batch wraps at bad times is
replaced by just checking for the space up front, and using a no_batch_wrap
assert like on 965 to make sure we weren't wrong about how much space was
needed.
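The check-up-front pattern amounts to something like this (a self-contained
sketch with illustrative names; the real code works on the driver's
batchbuffer structures):

    #include <assert.h>
    #include <stdbool.h>
    #include <stddef.h>

    struct batchbuffer {
        size_t size, used;
        bool   no_batch_wrap;  /* set while an atomic emit is in progress */
    };

    static size_t batch_space(const struct batchbuffer *b)
    {
        return b->size - b->used;
    }

    static void batch_flush(struct batchbuffer *b)
    {
        /* Wrapping (flushing) mid-emit is the bug we're guarding against. */
        assert(!b->no_batch_wrap);
        b->used = 0;               /* submit and reset, elided */
    }

    static void emit_dword(struct batchbuffer *b, unsigned dw)
    {
        if (batch_space(b) < 4)
            batch_flush(b);
        (void)dw;
        b->used += 4;
    }

    /* Check the worst case up front, then mark the emit non-wrappable. */
    static void emit_state(struct batchbuffer *b, size_t max_dwords)
    {
        if (batch_space(b) < max_dwords * 4)
            batch_flush(b);        /* flush now, while it is still safe */

        b->no_batch_wrap = true;
        for (size_t i = 0; i < max_dwords; i++)
            emit_dword(b, 0);      /* cannot trigger a flush mid-emit now */
        b->no_batch_wrap = false;
    }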
|
|
Workaround for recursive batchbuffer flushing: If the window is
moved, we can get into a case where we try to flush during a
flush. What happens is that when we try to grab the lock for
the first flush, we detect that the window moved which then
causes another flush (from the intel_draw_buffer() call in
intelUpdatePageFlipping()). To work around this we reset the
batchbuffer tail pointer before trying to get the lock. This
prevents the nested buffer flush, but a better fix would be to
avoid that in the first place.
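A stripped-down sketch of the workaround (hypothetical names; the real lock
and submit paths are the DRI lock and DRM ioctls):

    static void lock_hardware(void)   { /* may detect a window move and
                                         * call back into batch_flush() */ }
    static void unlock_hardware(void) {}
    static void submit(const unsigned char *start, unsigned used)
    {
        (void)start; (void)used;      /* hand the bytes to the kernel */
    }

    struct batch {
        unsigned char *map;           /* start of the batch buffer */
        unsigned char *ptr;           /* current tail pointer */
    };

    static void batch_flush(struct batch *b)
    {
        unsigned used = (unsigned)(b->ptr - b->map);

        /* Reset the tail BEFORE taking the lock: if locking detects a
         * window move and recurses into batch_flush(), the nested call
         * sees an empty batch and returns instead of flushing again. */
        b->ptr = b->map;

        lock_hardware();
        if (used)
            submit(b->map, used);
        unlock_hardware();
    }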
|
|
Good for a 10-15% improvement to OpenArena.
|
|
Increases OpenArena performance by about 3%.
|
|
The previous code would reference freed memory on window moves.
|
|
The previous change gave us only two modes, one which looped over the batch
per cliprect (3d drawing) and one that didn't (state updates).
However, we really want 4:
- Batch doesn't care about cliprects (state updates)
- Batch needs DRAWING_RECTANGLE looping per cliprect (3d drawing)
- Batch needs to be executed just once (region fills, copies, etc.)
- Batch already includes cliprect handling, and must be flushed by unlock time
(copybuffers, clears).
All callers should now be fixed to use one of these states for any batchbuffer
emits. Thanks to Keith Whitwell for pointing out the failure.
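The four states might be captured as a cliprect-mode enum along these lines
(a sketch following the list above; the names are illustrative):

    enum cliprect_mode {
        IGNORE_CLIPRECTS,      /* state updates: cliprects are irrelevant */
        LOOP_CLIPRECTS,        /* 3d drawing: replay the batch per cliprect,
                                * updating DRAWING_RECTANGLE each time */
        NO_LOOP_CLIPRECTS,     /* region fills/copies: execute exactly once */
        REFERENCES_CLIPRECTS   /* cliprects baked into the batch; must be
                                * flushed before the lock is dropped */
    };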
|
|
Drop a bunch of unused arguments from intel_create_renderbuffer() and
introduce intel_renderbuffer_set_region() to set the region for
a renderbuffer.
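The resulting split plausibly looks like this (signatures are a sketch based
on the description, not copied from the driver):

    #include <GL/gl.h>

    struct intel_renderbuffer;   /* driver-internal types, elided */
    struct intel_region;

    /* Creation now takes just the format; the unused arguments are gone. */
    struct intel_renderbuffer *intel_create_renderbuffer(GLenum intFormat);

    /* The backing region is attached (or swapped) separately, e.g. on
     * window resize. */
    void intel_renderbuffer_set_region(struct intel_renderbuffer *irb,
                                       struct intel_region *region);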
|
|
This may allow better concurrency (currently a no-op for OpenArena
performance), but is also important for the previous commit -- otherwise, we
could end up with BufferData, draw_prims, BufferData, where the draw_prims
would use the new VBO data instead of the old. This could still occur with
user-supplied VBOs and poor use of MapBuffer without BufferData.
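In application terms, the hazardous sequence is something like this (a
hypothetical reproduction, not taken from a bug report):

    #define GL_GLEXT_PROTOTYPES
    #include <GL/gl.h>

    static void upload_and_draw(const void *frame1, const void *frame2,
                                GLsizeiptr size, GLsizei count)
    {
        glBufferData(GL_ARRAY_BUFFER, size, frame1, GL_STREAM_DRAW);
        glDrawArrays(GL_TRIANGLES, 0, count);  /* driver may only queue this */
        glBufferData(GL_ARRAY_BUFFER, size, frame2, GL_STREAM_DRAW);
        /* If the second BufferData overwrites the same storage in place,
         * the queued draw can read frame2's vertices; swapping in fresh
         * storage keeps it pointing at frame1's data. */
    }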
|
|
Both drivers have ended up relying on lost_hardware being called after each
batch buffer, so update the name. This removes one of the calls on 965 which
was outside of the batchbuffer handling code and just duplicated what had
already happened through batchbuffer handling.
|
|
In particular, batch buffers are no longer flushed when switching from
CLIPRECTS to NO_CLIPRECTS or vice versa, and 965 just uses DRM cliprect
handling for primitives instead of trying to sneak in its own to avoid the
DRM stuff. The disadvantage is that we will re-execute state updates per
cliprect, but the advantage is that we will be able to accumulate larger
batch buffers, which were proving to be a major overhead.
|
|
Each array element is now a BUFFER_x token rather than a BUFFER_BIT_x bitmask.
The number of active color buffers is specified by _NumColorDrawBuffers.
This builds on the previous DrawBuffer changes and will help with drivers
implementing GL_ARB_draw_buffers.
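Roughly, the new layout looks like this (a sketch around the two names
mentioned above; the struct and loop are illustrative):

    #include <GL/gl.h>

    #define MAX_DRAW_BUFFERS 8

    struct fb_state {
        GLenum DrawBuffers[MAX_DRAW_BUFFERS]; /* BUFFER_x tokens, not
                                               * BUFFER_BIT_x bitmasks */
        GLuint _NumColorDrawBuffers;          /* active color buffer count */
    };

    static void update_color_regions(const struct fb_state *fb)
    {
        /* A driver walks the active buffers directly, no bit testing. */
        for (GLuint i = 0; i < fb->_NumColorDrawBuffers; i++) {
            GLenum buf = fb->DrawBuffers[i];
            (void)buf;                        /* ... point hardware at buf */
        }
    }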
|
|
These fields are no longer indexed by shader output. Now, we just have
a simple array of renderbuffer pointers.
If the shader writes to gl_FragData[i], send those colors to the N
_ColorDrawBuffers. Otherwise, replicate the single gl_FragColor (or
the fixed-function color) to the N _ColorDrawBuffers.
A few more changes and simplifications can follow from this...
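A compact model of the routing rule, assuming per-span color arrays
(everything here is illustrative, not core Mesa's actual span code):

    struct span_colors { float rgba[4]; };

    /* Send gl_FragData[i] to draw buffer i, or replicate the single
     * gl_FragColor / fixed-function color to every active draw buffer. */
    static void route_outputs(struct span_colors *dst,
                              unsigned num_draw_buffers,
                              const struct span_colors *frag_data,
                              int writes_frag_data,
                              const struct span_colors *frag_color)
    {
        for (unsigned i = 0; i < num_draw_buffers; i++)
            dst[i] = writes_frag_data ? frag_data[i] : *frag_color;
    }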
|
|
By avoiding the repeated relocation buffer creation/map/unmap/destroy for each
new batch buffer, this improves OpenArena framerates by 30%. Caching batch
buffers themselves doesn't appear to be a significant performance win over
this change.
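One way to get that effect is a simple free list of retired relocation
buffers, so the create/map/unmap/destroy cycle only happens on first use
(a sketch; real buffers would be kernel-managed objects, not malloc'd):

    #include <stdlib.h>

    struct reloc_buf {
        void             *map;   /* persistent mapping, reused across batches */
        struct reloc_buf *next;
    };

    static struct reloc_buf *free_relocs; /* cache of retired reloc buffers */

    static struct reloc_buf *reloc_buf_get(size_t size)
    {
        struct reloc_buf *rb = free_relocs;
        if (rb) {                 /* reuse: skip create/map entirely */
            free_relocs = rb->next;
            return rb;
        }
        rb = calloc(1, sizeof(*rb));
        rb->map = malloc(size);   /* stands in for create+map of a buffer */
        return rb;
    }

    static void reloc_buf_put(struct reloc_buf *rb)
    {
        rb->next = free_relocs;   /* retire to the cache instead of destroy */
        free_relocs = rb;
    }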
|