Age | Commit message (Collapse) | Author |
|
|
|
|
|
As part of the DRI driver interface rewrite I merged __DRIscreenPrivate
and __DRIscreen, and likewise for __DRIdrawablePrivate and
__DRIcontextPrivate. I left typedefs in place though, to avoid renaming
all the *Private use internal to the driver. That was probably a
mistake, and it turns out a one-line find+sed combo can do the mass
rename. Better late than never.
|
|
Saves another 600 bytes or so of code.
|
|
Saves ~2KB of code.
|
|
Saves ~480 bytes of code.
|
|
Fixing this is a prereq for avoiding flagging all state at new
batch time. Eliminating that still causes problems, though (notably
glean logicOp fails on my GM965).
|
|
|
|
1. new PCI ids
2. fix some 3D commands on new chipset
3. fix send instruction on new chipset
4. new VUE vertex header
5. ff_sync message (added by Zou Nan Hai <nanhai.zou@intel.com>)
6. the offset in JMPI is in unit of 64bits on new chipset
7. new cube map layout
|
|
Fixes shadowtex.c. And an assert is added to catch this sooner next time.
|
|
Also, only create VS surface state if there's a VS constant buffer to be
uploaded, and set the contents of the buffer at the same time as creation.
|
|
Hook up a constant buffer, binding table, etc for the VS unit.
This will allow using large constant buffers with vertex shaders.
The new code is disabled at this time (use_const_buffer=FALSE).
|
|
The polygon stipple pattern, like the viewport and the
polygon face orientation, must be inverted on the i965
when rendering to a FBO (which itself has an inverted pixel
coordinate system compared to raw Mesa).
In addition, the polygon stipple offset, which orients
the stipple to the window system, disappears when rendering
to an FBO (because the window system offset doesn't apply,
and there's no associated FBO offset).
With these fixes, the conform triangle and polygon stipple
tests pass when rendering to texture.
|
|
|
|
|
|
|
|
The mobile and desktop chipsets are the same, and having them separate is
more typing and more chances to screw up.
|
|
Previously, since my check_aperture API change, we would check each piece of
state against the batchbuffer individually, but not all the state against the
batchbuffer at once. In addition to not being terribly useful in assuring
success, it probably also increased CPU load by calling check_aperture many
times per primitive.
|
|
This avoids issues with dereferencing stale cliprects around intel_draw_buffer
time. Additionally, take advantage of cliprects staying constant for FBOs and
DRI2, and emit cliprects in the batchbuffer instead of having to flush batch
each time they change.
|
|
This ensures there is an unfilled batchbuffer used for emitting states again. Partial fix for #17964.
|
|
This reverts commit 7c81124d7c4a4d1da9f48cbf7e82ab1a3a970a7a.
|
|
This reverts commit 53675e5c05c0598b7ea206d5c27dbcae786a2c03.
Conflicts:
src/mesa/drivers/dri/i965/brw_wm_surface_state.c
|
|
To do this, I had to clean up some of 965 state upload stuff. We may end
up over-emitting state in the aperture overflow case, but that should be rare,
and I'd rather have the simplification of state management.
|
|
|
|
Conflicts:
src/mesa/drivers/dri/common/dri_bufmgr.c
src/mesa/drivers/dri/i965/brw_wm_surface_state.c
|
|
|
|
|
|
This is an API breakage only.
|
|
This existed to get the icache flushed. However, GEM handles this for us
now for sure, and we had disabled it prematurely anyway.
|
|
The GEM flags are much more descriptive for what we need. Since this makes
bufmgr_fake rather device-specific, move it to the intel common directory.
We've wanted to do device-specific stuff to it before.
|
|
|
|
Makes state emission into a 2 phase, prepare sets things up and accounts
the size of all referenced buffer objects. The emit stage then actually
does the batchbuffer touching for emitting the objects.
There is an assert in dri_emit_reloc if a reloc occurs for a buffer
that hasn't been accounted yet.
|
|
|
|
This is required since our buffer manager may now move our
instruction-containing buffers at any batchbuffer emit.
|
|
|
|
The previous change gave us only two modes, one which looped over the batch
per cliprect (3d drawing) and one that didn't (state updeast).
However, we really want 4:
- Batch doesn't care about cliprects (state updates)
- Batch needs DRAWING_RECTANGLE looping per cliprect (3d drawing)
- Batch needs to be executed just once (region fills, copies, etc.)
- Batch already includes cliprect handling, and must be flushed by unlock time
(copybuffers, clears).
All callers should now be fixed to use one of these states for any batchbuffer
emits. Thanks to Keith Whitwell for pointing out the failure.
|
|
This allows us to avoid re-emitting some state when validate_state happens
multiple times per batchbuffer. Even though we flush batch per primitive
currently, that may still happen already if the primitive changed (this should
probably be fixed as well).
|
|
|
|
To do so, merge the remainnig necessary code from the buffers, blit, span, and
screen code to shared, and replace it with those.
|
|
With FBOs, we end up wanting to do 3D metaops against one or the other without
having to find the other one to fill in if we're not going to draw to it.
|
|
The user-space suballocator that was used avoided relocation computations by
using the general and surface state base registers and allocating those types
of buffers out of pools built on top of single buffer objects. It also
avoided calls into the buffer manager for these small state allocations, since
only one buffer object was being used.
However, the buffer allocation cost appears to be low, and with relocation
caching, computing relocations for buffers is essentially free. Additionally,
implementing the suballocator required a don't-fence-subdata flag to disable
waiting on buffer maps so that writing new data didn't block on rendering using
old data, and careful handling when mapping to update old data (which we need
to do for unavoidable relocations with FBOs). More importantly, when the
suballocator filled, it had no replacement algorithm and just threw out all
of the contents and forced them to be recomputed, which is a significant cost.
This is the first step, which just changes the buffer type, but doesn't yet
improve the hash table to not result in full recompute on overflow. Because
the buffers are all allocated out of the general buffer allocator, we can
no longer use the general/surface state bases to avoid relocations, and they
are set to 0 instead.
|
|
This is currently believed to work but be a significant performance loss.
Performance recovery should be soon to follow.
The dri_bo_fake_disable_backing_store() call was added to allow backing store
disable like bufmgr_fake.c did, which is a significant performance win (though
it's missing the no-fence-subdata part).
This commit is a squash merge of the 965-ttm branch, which had some history
I wanted to avoid pulling due to noisiness and brokenness at many points
for git-bisecting.
|
|
|
|
passing them to the kernel. This works because all drawing commands
in the 965 driver are emitted with the lock held and the batchbuffer
is always flushed prior to releasing the lock. This allows multiple
cliprects to be dealt with, without replaying entire batchbuffers and
redundantly re-emitting state.
|
|
we render. Currenly requires that some state be re-examined after
every LOCK_HARDWARE().
|
|
This driver comes from Tungsten Graphics, with a few further modifications by
Intel.
|