Topic: Writing a softsynth for 4k intro
| Hi!
I'm currently writing a sound synthesizer in asm to be used in a 4k intro for Linux. There is one detail that is bugging me.
In the track rendering loop a "playlist" is looped through. The playlist contains pointers to the instrument functions, and the track renderer "calls" the functions sequentially.
This looping is done for every frame (frame = 2 x 16-bit samples).
So when an instrument gets called, it renders one frame. An instrument also "inherits" the sample generated by the previous instruments. This allows me to use some of the instruments as effects. Note that the playlist is updated once in a while by other functions.
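A minimal C sketch of the loop described above, not the actual asm. The names `frame_t`, `instrument_fn`, `square`, `attenuate`, `playlist` and `render` are all hypothetical, as is the choice of a NULL-terminated playlist and of passing the frame counter to each instrument.

```c
#include <stddef.h>
#include <stdint.h>

/* One stereo frame: two 16-bit samples, as in the post. */
typedef struct { int16_t left, right; } frame_t;

/* Hypothetical instrument signature: receives the frame produced so far
   and modifies it in place, so later entries can act as effects. */
typedef void (*instrument_fn)(frame_t *frame, uint32_t t);

/* Hypothetical generator: a crude square wave with a period of 64 frames. */
static void square(frame_t *f, uint32_t t) {
    int16_t s = (t & 32) ? 8000 : -8000;
    f->left  = (int16_t)(f->left + s);
    f->right = (int16_t)(f->right + s);
}

/* Hypothetical effect: halves whatever the previous instruments produced. */
static void attenuate(frame_t *f, uint32_t t) {
    (void)t;
    f->left  /= 2;
    f->right /= 2;
}

/* NULL-terminated playlist; other code could rewrite it between frames. */
static instrument_fn playlist[] = { square, attenuate, NULL };

/* Render n frames: one indirect call per playlist entry per frame. */
static void render(frame_t *out, size_t n) {
    for (size_t i = 0; i < n; i++) {
        frame_t f = { 0, 0 };
        for (instrument_fn *p = playlist; *p != NULL; p++)
            (*p)(&f, (uint32_t)i);
        out[i] = f;
    }
}
```

The inner loop is where the per-sample indirect calls happen. Since the playlist only changes once in a while, each call site keeps hitting the same short sequence of targets between updates.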
Now, do you guys find this approach sensible, considering that the code must be as small as possible? Does the indirect calling of functions for every sample create overhead? My main concern is prefetching / branch prediction.
Is there a way to avoid using indirect calls/jmps so extensively and still keep the small code size and flexibility? Self-modifying code would be one solution, I guess, but does it really make a difference compared to indirect jumps? Intel docs specifically instruct not to use self-modifying code and to avoid indirect calls... | | |
| Self-modifying code is to be avoided. You don't gain ANYTHING by it (especially in 4k intros), as the logic required to flush the CPU pipeline after the modification is in most cases already more complicated than doing without selfmod altogether. Also, it requires giving your code section write access rights under NT-based OSes, which costs one additional import and quite a bit of code. So, to put it short: Don't.
Indirect jumps, on the other hand, aren't that bad. First, we're talking 4k intros here, so better forget about performance; people are used to 5fps and still like those small things. If you're too concerned, pre-render the sound before sending it to PlayFile(); that should do it.
If you don't like indirect jumps, you can build a "fake jump table" by chaining 'loop' instructions and putting the index into cx. This is way shorter than jump tables (2 bytes for a loop instead of 4 for a hardcoded offset) and makes the music data more packable (1 byte with small values instead of quasi-random dwords). The only downside is that the loop instruction is, for whatever reason, dead slow on P4 as well as on AMD CPUs.
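To make the control flow of the chained 'loop' trick easier to follow, here is a rough C emulation. It is only a sketch: the real thing would be x86 asm, where `loop label` decrements cx and jumps if the result is not zero. The `LOOP` macro, `dispatch` and the handler numbering are hypothetical stand-ins.

```c
#include <stdint.h>

/* Emulates x86 `loop label`: decrement the counter, jump if it is not zero. */
#define LOOP(cx, label) do { if (--(cx) != 0) goto label; } while (0)

/* Returns which handler ran; a real synth would run the instrument code
   where the `return` statements are. Handler ids are hypothetical. */
static int dispatch(uint16_t cx) {
    LOOP(cx, next1);
    return 1;            /* reached when cx was 1 */
next1:
    LOOP(cx, next2);
    return 2;            /* reached when cx was 2 */
next2:
    LOOP(cx, next3);
    return 3;            /* reached when cx was 3 */
next3:
    return 0;            /* any other index falls out of the chain */
}
```

An index of 1 selects the first handler, 2 the second, and so on. In asm each handler would need to jump or return past the rest of the chain when it finishes, otherwise it falls into the next `loop`.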
kb/fr | | |
| Hmm, not all true: self-modifying code under at least XP doesn't require any extra function calls. Just set the Write flag in the exe header (use PE Explorer or something to do it easily). The flag is set to read/execute unless you tell the linker something else. It is kind of a nice trick for 1k demos to overwrite already used data with pointers that some other function returned (strings like "EDIT",0 that were used to create a window, etc.).
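Assuming the "Write flag" meant here is the `IMAGE_SCN_MEM_WRITE` bit in a section header's Characteristics field, the change amounts to one OR on that 32-bit value. The flag values below are the ones defined by the PE/COFF format; the helpers `make_writable` and `is_writable` are hypothetical names for illustration only.

```c
#include <stdint.h>

/* Section characteristic flags from the PE/COFF format. */
#define IMAGE_SCN_CNT_CODE    0x00000020u
#define IMAGE_SCN_MEM_EXECUTE 0x20000000u
#define IMAGE_SCN_MEM_READ    0x40000000u
#define IMAGE_SCN_MEM_WRITE   0x80000000u

/* Hypothetical helper: returns the characteristics with write access added. */
static uint32_t make_writable(uint32_t characteristics) {
    return characteristics | IMAGE_SCN_MEM_WRITE;
}

/* Hypothetical helper: nonzero if the section is mapped writable. */
static int is_writable(uint32_t characteristics) {
    return (characteristics & IMAGE_SCN_MEM_WRITE) != 0;
}
```

A typical read/execute code section carries 0x60000020; with the write bit added it becomes 0xE0000020.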
About the synth stuff, i think i better shut up and listen to what kb said ;) | | |
| | Does your cpu flush its code cache and pipeline upon writing to a memory address? | | |